The present invention relates to compositions and methods for identifying, selecting and producing enhanced disease and/or pathogen resistant plants using novel resistance genes.
A Sequence Listing in ASCII text format, submitted under 37 C.F.R. § 1.821, entitled “82250PCT_ST25.txt”, approximately 9.07 MB in size, generated on Jan. 7, 2022 and filed via EFS-Web is provided in lieu of a paper copy. This Sequence Listing is hereby incorporated by reference into the specification in its entirety.
Plant pathogens are known to cause considerable damage to important crops, resulting in significant agricultural losses with widespread consequences for both the food supply and other industries that rely on plant materials. As such, applicant desires to reduce the incidence and/or impact of agricultural pathogens on crop production.
Several pathogens have been associated with damage to soybeans, which individually and collectively have the potential to cause significant yield losses in the United States and throughout the world. Exemplary pathogens include, but are not limited to, fungi (e.g., genus Phytophthora and Asian Soybean rust Phakopsora pachyrhizi), nematodes (e.g., genus Meloidogyne, particularly, Meloidogyne javanica), and soybean stem canker. Given the significant threat to global food supplies that these pathogens present as well as the time and expense associated with treating soybean crops to prevent yield loss, new methods for producing pathogen resistant soybean cultivars are needed. What are needed are novel resistance genes (herein, “R-Genes”) that can be introduced into commercial soybean plants to control soybean pathogens.
This summary lists several embodiments of the presently disclosed subject matter, and in many cases lists variations of these embodiments. Thus, in embodiments, it is an object of the presently disclosed subject matter to provide methods for conveying pathogen resistance into non-resistant plants and germplasm. Further, the presently disclosed subject matter provides novel Glycine max lines, each comprising in its genome a chromosome interval, locus, and/or gene that is derived from Glycine tomentella and confers pathogen resistance in said novel Glycine max line, which object is achieved in whole or in part by the presently disclosed subject matter.
In embodiments of the present invention, the plant or germplasm is a soybean plant or germplasm, and the pathogen is a soybean pathogen, particularly a soybean fungal pathogen such as Asian Soybean Rust (e.g., Phakopsora pachyrhizi; herein “ASR”).
The present invention provides for chromosomal intervals derived from Glycine tomentella, particularly accession line PI505267, that when introduced into a plant (e.g., a soybean plant such as Glycine max strain Williams 82 or an elite Glycine max line) are sufficient to confer increased rust resistance, such as increased Asian soybean rust (“ASR”) resistance, as compared to a control plant not comprising said chromosomal interval.
The invention also provides for novel proteins and nucleic acids derived from the chromosomal interval of Glycine tomentella accession line PI505267 that confer rust resistance. In embodiments of the present invention, the novel protein is the protein of SEQ ID NO: 5 or functional variants thereof, such as variants that are substantially similar (e.g., having at least 85%, at least 90% or at least 95% sequence identity) and that confer increased ASR resistance on a plant in which the protein is expressed. In further embodiments, nucleic acids comprising and/or encoding novel R-genes derived from the chromosomal interval of Glycine tomentella accession line PI505267 are provided that, when expressed in a plant, confer rust resistance. In embodiments, the nucleic acid comprises a nucleotide sequence encoding the novel protein of SEQ ID NO: 5, or functional variants thereof. In other embodiments, the nucleic acid comprises the nucleotide sequence of any of SEQ ID NOS: 2-4 and 11-12; or a sequence that is substantially similar and capable of conferring ASR resistance (such as sequences that have at least 85%, at least 90% or at least 95% sequence identity to any one of SEQ ID NOs: 2-4 and 11-12).
The present invention also provides for expression cassettes, vectors, and DNA constructs comprising the novel R-gene and/or encoding the novel protein of the invention. In embodiments, the expression cassette allows for transgenic expression of the novel nucleic acid and/or protein via a promoter operably connected thereto. In other embodiments, the DNA constructs allow for gene editing of the novel nucleic acid and/or protein.
The present invention also encompasses novel plants that have stably incorporated into their genome a novel nucleic acid sequence derived from the chromosomal interval of Glycine tomentella accession line PI505267 (such as a nucleic acid encoding the novel protein of SEQ ID NO: 5, or a substantially similar polypeptide) that confers on the novel plant increased pathogen resistance as compared to control plants not comprising the nucleic acid. In embodiments, the plants are novel Glycine max plants and/or novel elite Glycine max plants that have increased ASR resistance as compared to the control plants. In embodiments, the novel nucleic acid sequence is introduced into the plant through transgenic expression, through introgression, through known breeding methods, or through gene editing.
The present invention also provides for methods of producing plants having increased ASR resistance through the introduction of a nucleic acid sequence encoding the novel R-gene and/or protein of the invention. In particular embodiments, a method of producing a transgenic plant with improved resistance against ASR is provided by introducing a nucleic acid molecule comprising a nucleotide sequence encoding the novel protein (e.g., encoding the protein of SEQ ID NO: 5 or a substantially similar sequence, or comprising the nucleotide sequence of any of SEQ ID NOS: 2-4, 11-12, or a substantially similar sequence). In embodiments, the nucleic acid is introduced through an expression cassette comprising the nucleic acid sequence, the expression cassette introduced to a recipient plant to obtain a transgenic plant, wherein the transgenic plant has increased resistance against ASR compared to the recipient plant.
Compositions and methods for identifying, selecting and producing Glycine plants (including wild Glycines, elite Glycine lines and Glycine max lines) with enhanced pathogen (e.g., rust) resistance are provided. Pathogen resistant plants and germplasm (e.g., soybean plants and germplasms) are also provided. In some embodiments, methods of producing an ASR resistant soybean plant are provided.
In some embodiments, methods of identifying a rust resistant soybean plant or germplasm are provided. Such methods may comprise detecting, in the soybean plant or germplasm, a genetic locus or molecular marker (e.g., a SNP or a Quantitative Trait Locus (QTL)) associated with enhanced disease resistance, in particular ASR resistance. In some embodiments, the genetic locus or molecular marker associates with the presence of a chromosomal interval comprising the nucleotide sequence of SEQ ID NO: 1, 2-4, 11, or 12, or a portion thereof, wherein the portion thereof associates with ASR resistance.
The foregoing and other objects and aspects of the present invention are explained in detail in the drawings and specification set forth below.
SEQ ID NO: 1 is a chromosomal interval derived from Glycine tomentella line accession number PI505267, herein also referred to as “Contig 0133”. Contig 0133 has been mapped to G. tomentella (genotype D3) at an interval of approximately 9.28 MB-16.48 MB on chromosome 3 (the contig was approximately 33.8 Mbp in size). Genetic population mapping studies for PI505267 indicate that Glycine tomentella Chromosome 3 contains chromosomal intervals highly associated with ASR resistance (e.g., as corresponding to SEQ ID NO: 1). This chromosomal interval or portions thereof may be introduced (e.g., transgenically, through gene editing, and/or introgressed through use of embryo rescue and marker assisted breeding (MAB)) into Glycine max lines to create Glycine max lines resistant to various diseases such as ASR. In embodiments, “Contig 0133” is on Chromosome 3 in the span from position 3004342 to position 36810588 of the reference genome.
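As a quick arithmetic check on the coordinates quoted above, the following minimal sketch computes the size of each span. The helper name `span_mbp` is hypothetical, and treating the quoted positions as 1-based inclusive coordinates is an assumption made only for illustration.

```python
def span_mbp(start_bp: int, end_bp: int) -> float:
    """Size in Mbp of a span, assuming 1-based inclusive coordinates."""
    return (end_bp - start_bp + 1) / 1_000_000


# Reference-genome span quoted for Contig 0133 on Chromosome 3.
contig_mbp = span_mbp(3_004_342, 36_810_588)
print(f"Contig span: {contig_mbp:.1f} Mbp")

# Width of the approximate mapped interval (9.28 MB to 16.48 MB).
mapped_mbp = 16.48 - 9.28
print(f"Mapped interval width: {mapped_mbp:.2f} Mbp")
```

Under these assumptions, the reference-genome span works out to about 33.8 Mbp and the mapped interval to about 7.2 Mbp.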
Further research into the interval led to the discovery of a plurality of putative causative genes potentially associated with an increased pathogen resistance trait (herein also referred to as R-genes). The putative genes from the above interval are located on or corresponding to Glycine tomentella Chromosome 3. Each of the causative genes was identified and isolated, its function validated, and its efficacy in resisting soybean pathogens assessed. As elaborated in the Examples below, the chromosomal interval of SEQ ID NO: 1, derived from Glycine tomentella accession PI505267, can be used as a source for the R-genes corresponding to SEQ ID NOs: 2-5 and 7-8.
SEQ ID NO: 2 is a genomic DNA sequence of a soy rust resistance candidate gene (herein referred to as “GtoRG30”) from PI505267 encoding a protein containing Toll/Interleukin-1 receptor (TIR), nucleotide-binding site (NBS), and leucine rich-repeat (LRR) domains (herein, a “TNL” R-gene motif). The gene is syntenic to Glyma.05G165800 (Soy_william82_v2). The genomic DNA fragment has been mapped to an interval of approximately 11.44 MB-11.46 MB on chromosome 3 of G. tomentella. The genomic DNA sequence includes the gene with its native 5′UTR and 3′UTR and native introns.
SEQ ID NO: 3 is a genomic DNA sequence for soy rust resistance candidate gene GtoRG30 with its native 5′UTR and 3′UTR and with the first native intron replaced with Arabidopsis intron, iAtBAF60-01 (SEQ ID NO: 21).
SEQ ID NO: 4 is a DNA coding sequence for soy rust resistance candidate gene GtoRG30 that encodes a protein containing Toll/Interleukin-1 receptor (TIR), nucleotide-binding site (NBS), and leucine rich repeat (LRR) domains. The first native intron was replaced with Arabidopsis intron, iAtBAF60-01 (SEQ ID NO: 21).
SEQ ID NO: 5 is the amino acid sequence of the protein encoded by soy rust resistance candidate gene GtoRG30. The protein of SEQ ID NO: 5 is encoded by any of the nucleic acid sequences of SEQ ID NOS: 2-4 and 11-12.
SEQ ID NO: 6 is a genomic DNA sequence for a soy rust resistance candidate gene from PI505267 that encodes a protein comprising coiled-coil (CC), nucleotide-binding site (NBS), and leucine rich repeat (LRR) domains (herein, a “CNL” R-gene motif).
SEQ ID NO: 7 is a genomic DNA sequence for another soy rust resistance candidate gene from PI505267 that encodes a protein comprising coiled-coil (CC), nucleotide-binding site (NBS), and leucine rich repeat (LRR) domains. The genes of SEQ ID NOS: 6-7 are syntenic to Glyma.05G165600 (Soy_william82_v2).
SEQ ID NOS: 8-9 are the DNA sequences for a primer pair that can be used for generating an amplicon (such as via PCR) comprising the soy rust resistance candidate gene GtoRG30.
SEQ ID NO: 10 is the DNA sequence for a probe that can be used for detecting a polynucleotide, such as an amplicon, comprising soy rust resistance gene GtoRG30.
SEQ ID NO: 11 is a genomic DNA sequence (gGtoRG30-01) for soy rust resistance candidate gene GtoRG30 with its native promoter (SEQ ID NO: 15) and terminator (SEQ ID NO: 18) and with the first native intron replaced with Arabidopsis intron, iAtBAF60-01 (SEQ ID NO: 21).
SEQ ID NO: 12 is the cDNA sequence for soy rust resistance candidate gene GtoRG30.
SEQ ID NO: 13 is the DNA sequence of a promoter from the Medicago truncatula gene Mt12344. The promoter is active in plant cells and can be used to drive the expression of a heterologous nucleic acid sequence, such as any R-gene.
SEQ ID NO: 14 is the DNA sequence of a promoter from the Medicago truncatula gene Mt51186. The promoter is active in plant cells and can be used to drive the expression of a heterologous nucleic acid sequence, such as any R-gene.
SEQ ID NO: 15 is the DNA sequence of an endogenous promoter (“RG30_promoter”) from the soy rust resistance candidate gene herein referred to as “GtoRG30”. It consists of the 5′-non-transcribed sequence and the 5′ UTR (SEQ ID NO: 19) of the candidate gene.
SEQ ID NO: 16 is the DNA sequence of a terminator from the Medicago truncatula gene Mt12344.
SEQ ID NO: 17 is the DNA sequence of a terminator from the Medicago truncatula gene Mt51186.
SEQ ID NO: 18 is the DNA sequence of an endogenous terminator (“RG30_terminator”) from the soy rust resistance candidate gene herein referred to as “GtoRG30”. It consists of the 3′-non-transcribed sequence and the 3′ UTR (SEQ ID NO: 20) of the candidate gene.
SEQ ID NO: 19 is the DNA sequence of the 5′ UTR from the soy rust resistance candidate gene herein referred to as “GtoRG30”.
SEQ ID NO: 20 is the DNA sequence of the 3′ UTR from the soy rust resistance candidate gene herein referred to as “GtoRG30”.
SEQ ID NO: 21 is the DNA sequence of intron iAtBAF60-01 of Arabidopsis thaliana BAF60 homolog.
ASR resistance can be introduced into G. max plants using a nucleic acid having a sequence encoding a polypeptide that is substantially identical to SEQ ID NO: 5 (e.g., having at least 70% sequence identity to SEQ ID NO: 5). For example, ASR resistance can be introduced through the introduction of a nucleic acid comprising any one of SEQ ID NOs: 2-4, 6-7, 11-12 or a nucleic acid sequence substantially identical to any one of SEQ ID NOS: 2-4, 6-7, and 11-12 (e.g., having at least 70% sequence identity).
SEQ ID NOS: 22-237, listed at Table 6, describe the DNA sequence of example assay components, including primers and probes, that can be used to detect and differentiate between favorable and unfavorable alleles associated with a given SNP position within the chromosomal interval of SEQ ID NO: 1.
The presently disclosed subject matter relates to compositions and methods for introducing novel resistance genes (herein “R-Genes”) encoding novel proteins for pathogen resistance into commercial plants to control plant pathogens. The methods involve transforming organisms with nucleic acid molecules having nucleotide sequences encoding the novel proteins for pathogen resistance of the invention. The nucleotide sequences of the invention are useful for generating plants, particularly soybean plants, that show increased resistance to plant pathogens, particularly fungal pathogens such as Asian Soybean Rust (herein, “ASR”). Thus, transformed plants, plant cells, plant tissues and seeds are provided. Compositions include nucleic acids and proteins relating to pathogen resistant plants as well as transformed plants, plant tissues and seeds. Nucleotide sequences of nucleic acids comprising the novel R-genes and/or encoding the amino acid sequence of the novel resistance proteins are disclosed. The sequences find use in the construction of vectors and expression cassettes for subsequent transformation into plants of interest, as probes and/or primers for the detection and isolation of the R-genes, and the like. In particular embodiments, the compositions and methods are used to introduce novel R-genes into soybean plants to control soybean pathogens, such as fungal pathogens (e.g., ASR) and/or nematodes.
This description is not intended to be a detailed catalog of all the different ways in which the invention may be implemented, or all the features that may be added to the instant invention. For example, features illustrated with respect to one embodiment may be incorporated into other embodiments, and features illustrated with respect to a particular embodiment may be deleted from that embodiment. Thus, the invention contemplates that in some embodiments of the invention, any feature or combination of features set forth herein can be excluded or omitted. In addition, numerous variations and additions to the various embodiments suggested herein will be apparent to those skilled in the art in light of the instant disclosure, which do not depart from the instant invention. Hence, the following descriptions are intended to illustrate some particular embodiments of the invention, and not to exhaustively specify all permutations, combinations and variations thereof.
All references listed below, as well as all references cited in the instant disclosure, including but not limited to all patents, patent applications and publications thereof, scientific journal articles, and database entries (e.g., GENBANK® database entries and all annotations available therein) are incorporated herein by reference in their entireties to the extent that they supplement, explain, provide a background for, or teach methodology, techniques, and/or compositions employed herein.
Nucleotide sequences provided herein are presented in the 5′ to 3′ direction, from left to right and are presented using the standard code for representing nucleotide bases as set forth in 37 CFR §§ 1.821-1.825 and the World Intellectual Property Organization (WIPO) Standard ST.25, for example: adenine (A), cytosine (C), thymine (T), and guanine (G).
Amino acids are likewise indicated using the WIPO Standard ST.25, for example: alanine (Ala; A), arginine (Arg; R), asparagine (Asn; N), aspartic acid (Asp; D), cysteine (Cys; C), glutamine (Gln; Q), glutamic acid (Glu; E), glycine (Gly; G), histidine (His; H), isoleucine (Ile; I), leucine (Leu; L), lysine (Lys; K), methionine (Met; M), phenylalanine (Phe; F), proline (Pro; P), serine (Ser; S), threonine (Thr; T), tryptophan (Trp; W), tyrosine (Tyr; Y), and valine (Val; V).
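By way of illustration only, the standard three-letter and one-letter amino acid codes can be held in a simple lookup table. The following is a minimal sketch; the function name `to_one_letter` and the example peptide are hypothetical and are not taken from the Sequence Listing.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter_seq: str) -> str:
    """Convert a run of three-letter codes (e.g. 'MetAla') to one-letter codes."""
    codes = [three_letter_seq[i:i + 3] for i in range(0, len(three_letter_seq), 3)]
    return "".join(THREE_TO_ONE[code] for code in codes)


# Hypothetical four-residue peptide, for illustration only.
example = to_one_letter("MetAlaIleGly")
print(example)  # MAIG
```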
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood to one of ordinary skill in the art to which the presently disclosed subject matter belongs.
Although the following terms are believed to be well understood by one of ordinary skill in the art, the following definitions are set forth to facilitate understanding of the presently disclosed subject matter.
As used in the description of the invention and the appended claims, the singular forms “a,” “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise.
As used herein, “and/or” refers to and encompasses any and all possible combinations of one or more of the associated listed items, as well as the lack of combinations when interpreted in the alternative (“or”).
The term “about,” as used herein when referring to a measurable value such as a dosage or time period and the like, is meant to encompass variations of ±20%, ±10%, ±5%, ±1%, ±0.5%, or even ±0.1% of the specified amount. As used herein, phrases such as “between about X and Y” mean “between about X and about Y” and phrases such as “from about X to Y” mean “from about X to about Y.”
As used herein, phrases such as “between about X and Y”, “between about X and about Y”, “from X to Y” and “from about X to about Y” (and similar phrases) should be interpreted to include X and Y, unless the context indicates otherwise.
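The definitions of “about” and of inclusive ranges given above can be illustrated with a short sketch. The function names `about_range` and `within_inclusive` are hypothetical, and the choice of ±10% in the example is merely one of the recited variations.

```python
from typing import Tuple


def about_range(value: float, percent: float) -> Tuple[float, float]:
    """Bounds covered by 'about value' for a given +/- percent variation."""
    delta = abs(value) * percent / 100
    return (value - delta, value + delta)


def within_inclusive(x: float, low: float, high: float) -> bool:
    """'Between X and Y' is interpreted to include both X and Y."""
    return low <= x <= high


# "About 50" at a variation of +/-10% spans 45 to 55.
low, high = about_range(50, 10)
print(low, high)  # 45.0 55.0

# Both endpoints fall within the range.
print(within_inclusive(45.0, low, high), within_inclusive(55.0, low, high))  # True True
```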
As used herein, a “coding sequence” or “CDS” is a nucleic acid sequence that is transcribed into RNA such as mRNA, rRNA, tRNA, snRNA, sense RNA or antisense RNA. In embodiments, the RNA is then translated to produce a protein. In example embodiments, the CDS is derived from a cDNA sequence and includes the sequence of spliced exons of a transcript in DNA notation and does not include any intron or 5′ or 3′-untranslated regions (UTRs). In other example embodiments, the CDS is derived from a genomic DNA sequence and includes the sequence of spliced exons of a transcript in DNA notation as well as one or more introns, and 5′ and/or 3′-untranslated regions (UTRs).
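The relationship between a genomic sequence, its spliced exons, and the encoded protein can be sketched as follows. The toy genomic sequence, the exon coordinates, and the function names are hypothetical and serve only as an illustration; the codon table is the standard genetic code.

```python
from typing import List, Tuple

# Standard genetic code, indexed in T, C, A, G order at each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def splice_exons(genomic: str, exons: List[Tuple[int, int]]) -> str:
    """Join exon segments given as 0-based, end-exclusive coordinates."""
    return "".join(genomic[start:end] for start, end in exons)


def translate(cds: str) -> str:
    """Translate a CDS in DNA notation, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical toy gene: exon 1, a 10-nt intron, then exon 2.
genomic = "ATGGCC" + "GTAAGTTTAG" + "ATTTAA"
cds = splice_exons(genomic, [(0, 6), (16, 22)])
print(cds)             # ATGGCCATTTAA
print(translate(cds))  # MAI
```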
As used herein, a “codon optimized” nucleotide sequence means a nucleotide sequence of a recombinant, transgenic, or synthetic polynucleotide wherein the codons are chosen to reflect the particular codon bias that a host cell or organism may have. This is typically done in such a way so as to preserve the amino acid sequence of the polypeptide encoded by the codon optimized nucleotide sequence. In certain embodiments, a nucleotide sequence is codon optimized for the cell (e.g., an animal, plant, fungal or bacterial cell) in which the construct is to be expressed. For example, a construct to be expressed in a plant cell can have all or parts of its sequence codon optimized for expression in a plant. See, for example, U.S. Pat. No. 6,121,014. In embodiments, the polynucleotides of the invention are codon-optimized for expression in a plant cell (e.g., a dicot cell or a monocot cell) or bacterial cell.
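A minimal sketch of codon optimization follows, in which each codon is replaced by a preferred synonymous codon so that the encoded amino acid sequence is preserved. The preferred-codon table and the input sequence are hypothetical and do not reflect the measured codon bias of any particular host; an actual optimization would use host-specific codon usage data.

```python
from typing import Dict

# Standard genetic code, indexed in T, C, A, G order at each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Hypothetical preferred codons (illustrative only, not real host data).
PREFERRED = {"M": "ATG", "A": "GCC", "I": "ATC", "*": "TGA"}


def translate_all(cds: str) -> str:
    """Translate every codon, keeping '*' for stop codons."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds) - 2, 3))


def codon_optimize(cds: str, preferred: Dict[str, str]) -> str:
    """Swap each codon for the preferred synonymous codon, if one is listed."""
    optimized = []
    for i in range(0, len(cds) - 2, 3):
        codon = cds[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        new_codon = preferred.get(amino_acid, codon)
        # Guard: the replacement must encode the same amino acid.
        assert CODON_TABLE[new_codon] == amino_acid
        optimized.append(new_codon)
    return "".join(optimized)


original = "ATGGCAATATAA"
optimized = codon_optimize(original, PREFERRED)
print(optimized)  # ATGGCCATCTGA
print(translate_all(original) == translate_all(optimized))  # True
```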
The term “comprise”, “comprises” or “comprising,” when used in this specification, indicates the presence of the stated features, integers, steps, operations, elements, or components, but does not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
As used herein, the transitional phrase “consisting essentially of” (and grammatical variants) means that the scope of a claim is to be interpreted to encompass the specified materials or steps recited in the claim “and those that do not materially alter the basic and novel characteristic(s)” of the claimed invention. Thus, the term “consisting essentially of” when used in a claim of this invention is not intended to be interpreted to be equivalent to “comprising.”
The term “corresponding to” in the context of nucleic acid sequences or protein sequences means that when the nucleic acid sequences or amino acid sequences of certain sequences are aligned with each other, the nucleic acids or amino acids that “correspond to” certain enumerated positions in the present invention are those that align with these positions in a reference sequence, but that are not necessarily in these exact numerical positions relative to a particular nucleic acid sequence of the invention. Optimal alignment of sequences for comparison can be conducted by computerized implementations of known algorithms or by visual inspection. Readily available sequence comparison and multiple sequence alignment algorithms are, respectively, the Basic Local Alignment Search Tool (BLAST) and ClustalW/ClustalW2/Clustal Omega programs available on the Internet (e.g., the website of the EMBL-EBI). Other suitable programs include, but are not limited to, GAP, BestFit, Plot Similarity, and FASTA, which are part of the Accelrys GCG Package available from Accelrys, Inc. of San Diego, Calif., United States of America. See also Smith & Waterman, 1981; Needleman & Wunsch, 1970; Pearson & Lipman, 1988; Ausubel et al., 1988; and Sambrook & Russell, 2001.
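The notion of “corresponding” positions can be illustrated with a pair of already-aligned sequences. The following is a minimal sketch; the function name `corresponding_positions`, the use of “-” as the gap character, and the toy sequences are assumptions made for illustration.

```python
from typing import Dict, Optional


def corresponding_positions(aligned_ref: str, aligned_query: str) -> Dict[int, Optional[int]]:
    """Map 1-based reference positions to 1-based query positions.

    Both inputs are rows of one pairwise alignment, with '-' marking gaps.
    A reference position aligned to a gap maps to None.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    mapping: Dict[int, Optional[int]] = {}
    ref_pos = 0
    query_pos = 0
    for ref_char, query_char in zip(aligned_ref, aligned_query):
        if ref_char != "-":
            ref_pos += 1
        if query_char != "-":
            query_pos += 1
        if ref_char != "-":
            mapping[ref_pos] = query_pos if query_char != "-" else None
    return mapping


# Toy alignment: reference position 3 corresponds to query position 2.
mapping = corresponding_positions("ACG-TA", "A-GCTA")
print(mapping)  # {1: 1, 2: None, 3: 2, 4: 4, 5: 5}
```

As the example shows, a corresponding position need not share the same numerical index in both sequences.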
Unless otherwise stated, identity and similarity will be calculated by the Needleman-Wunsch global alignment and scoring algorithms (Needleman and Wunsch (1970) J. Mol. Biol. 48(3):443-453) as implemented by the “needle” program, distributed as part of the EMBOSS software package (Rice, P., Longden, I., and Bleasby, A., EMBOSS: The European Molecular Biology Open Software Suite, 2000, Trends in Genetics 16(6), pp. 276-277, version 6.3.1 available from EMBnet at embnet.org/resource/emboss and emboss.sourceforge.net, among other sources) using default gap penalties and scoring matrices (EBLOSUM62 for protein and EDNAFULL for DNA). Equivalent programs may also be used. By “equivalent program” is intended any sequence comparison program that, for any two sequences in question, generates an alignment having identical nucleotide residue matches and an identical percent sequence identity when compared to the corresponding alignment generated by needle from EMBOSS version 6.3.1.
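For illustration, a simplified Needleman-Wunsch global alignment with a percent identity calculation is sketched below. This is not the EMBOSS “needle” implementation: it assumes a flat match/mismatch score and a linear gap penalty in place of the EDNAFULL or EBLOSUM62 matrices and the default gap penalties of needle, so its results may differ from those of needle. The function names and scoring values are hypothetical.

```python
from typing import Tuple


def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1,
                     gap: int = -2) -> Tuple[str, str]:
    """Globally align a and b; return the two aligned rows ('-' marks gaps)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right cell to recover one optimal alignment.
    row_a, row_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            row_a.append(a[i - 1]); row_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            row_a.append(a[i - 1]); row_b.append("-"); i -= 1
        else:
            row_a.append("-"); row_b.append(b[j - 1]); j -= 1
    return "".join(reversed(row_a)), "".join(reversed(row_b))


def percent_identity(row_a: str, row_b: str) -> float:
    """Identical aligned residues as a percentage of the alignment length."""
    matches = sum(1 for x, y in zip(row_a, row_b) if x == y and x != "-")
    return 100.0 * matches / len(row_a)


aligned = needleman_wunsch("ACGT", "AGT")
print(aligned)                     # ('ACGT', 'A-GT')
print(percent_identity(*aligned))  # 75.0
```

In this sketch, percent identity is taken over the full alignment length, including gapped columns.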
Additional mathematical algorithms are known in the art and can be utilized for the comparison of two sequences. See, for example, the algorithm of Karlin and Altschul (1990) Proc. Natl. Acad. Sci. USA 87:2264, modified as in Karlin and Altschul (1993) Proc. Natl. Acad. Sci. USA 90:5873-5877. Such an algorithm is incorporated into the BLAST programs of Altschul et al. (1990) J. Mol. Biol. 215:403. BLAST nucleotide searches can be performed with the BLASTN program (nucleotide query searched against nucleotide sequences) to obtain nucleotide sequences homologous to nucleic acid molecules of the invention, or with the BLASTX program (translated nucleotide query searched against protein sequences) to obtain protein sequences homologous to nucleic acid molecules of the invention. BLAST protein searches can be performed with the BLASTP program (protein query searched against protein sequences) to obtain amino acid sequences homologous to protein molecules of the invention, or with the TBLASTN program (protein query searched against translated nucleotide sequences) to obtain nucleotide sequences homologous to protein molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST (in BLAST 2.0) can be utilized as described in Altschul et al. (1997) Nucleic Acids Res. 25:3389. Alternatively, PSI-Blast can be used to perform an iterated search that detects distant relationships between molecules. See Altschul et al. (1997) supra. When utilizing BLAST, Gapped BLAST, and PSI-Blast programs, the default parameters of the respective programs (e.g., BLASTX and BLASTN) can be used. Alignment may also be performed manually by inspection.
“Expression cassette” as used herein means a nucleic acid molecule capable of directing expression of at least one polynucleotide of interest, such as a nucleic acid comprising the sequence of an R-gene polynucleotide that encodes a protein of the invention, the protein conferring increased pathogen resistance when expressed in an appropriate host cell, the expression cassette comprising a promoter operably linked to the polynucleotide of interest which is operably linked to a termination signal. An “expression cassette” also typically comprises additional polynucleotides to facilitate proper translation of the polynucleotide of interest. The expression cassette may also comprise other polynucleotides not related to the expression of a polynucleotide of interest but which are present due to convenient restriction sites for removal of the cassette from an expression vector. In embodiments, at least one of the components in the expression cassette may be heterologous (i.e., foreign) with respect to at least one of the other components (e.g., a heterologous promoter operatively associated with a polynucleotide of interest). The expression cassette may also be one that is naturally occurring but has been obtained in a recombinant form useful for heterologous expression. Typically, however, the expression cassette is heterologous with respect to the host, i.e., the expression cassette (or even the polynucleotide of interest) does not occur naturally in the host cell and has been introduced into the host cell or an ancestor cell thereof by a transformation process or a breeding process. The expression of the polynucleotide(s) of interest in the expression cassette is generally under the control of a promoter. The promoter may be a heterologous promoter or an endogenous (or native) promoter derived from the same source as the nucleic acid of interest. 
In the case of a multicellular organism, such as a plant, the promoter can also be specific or preferential to a particular tissue, or organ, or stage of development (as described in more detail herein). An expression cassette, or fragment thereof, can also be referred to as “inserted polynucleotide” or “insertion polynucleotide” when transformed into a plant.
The term “introduced” as used herein, in connection to a plant, means accomplished by any manner including, but not limited to: introgression, transgenesis, Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) modification, Transcription activator-like effector nucleases (TALENs) (Feng et al. 2013, Joung & Sander 2013), meganucleases, or zinc finger nucleases (ZFNs).
As used herein, the term “wild glycine” refers to a perennial Glycine plant, for example any one of G. canescens, G. argyrea, G. clandestina, G. latrobeana, G. albicans, G. aphyonota, G. arenaria, G. curvata, G. cyrtoloba, G. dolichocarpa, G. falcata, G. gracei, G. hirticaulis, G. lactovirens, G. latifolia, G. microphylla, G. montis-douglas, G. peratosa, G. pescadrensis, G. pindanica, G. pullenii, G. rubiginosa, G. stenophita, G. syndetika, or G. tomentella.
As used herein, the term “allele” refers to one of two or more different nucleotides or nucleotide sequences that occur at a specific locus.
A marker is “associated with” a trait when it is linked to it and when the presence of the marker is an indicator of whether and/or to what extent the desired trait or trait form will occur in a plant/germplasm comprising the marker. Similarly, a marker is “associated with” an allele when it is linked to it and when the presence of the marker is an indicator of whether the allele is present in a plant/germplasm comprising the marker. For example, “a marker associated with enhanced pathogen resistance” refers to a marker whose presence or absence can be used to predict whether and/or to what extent a plant will display a pathogen resistant phenotype. In example embodiments of the present invention, a nucleic acid (e.g., a chromosomal interval) comprising the R-gene of interest and capable of conferring enhanced pathogen resistance may be detected, identified or selected based on the presence of a “favorable” marker, such as any of the favorable markers of Table 1 and/or 2.
A marker may be, but is not limited to, an allele, a gene, a haplotype, a restriction fragment length polymorphism (RFLP), a simple sequence repeat (SSR), random amplified polymorphic DNA (RAPD), cleaved amplified polymorphic sequences (CAPS) (Rafalski and Tingey, Trends in Genetics 9: 275 (1993)), an amplified fragment length polymorphism (AFLP) (Vos et al., Nucleic Acids Res. 23: 4407 (1995)), a single nucleotide polymorphism (SNP) (Brookes, Gene 234: 177 (1999)), a sequence-characterized amplified region (SCAR) (Paran and Michelmore, Theor. Appl. Genet. 85: 985 (1993)), a sequence-tagged site (STS) (Onozaki et al., Euphytica 138: 255 (2004)), a single-stranded conformation polymorphism (SSCP) (Orita et al., Proc. Natl. Acad. Sci. USA 86: 2766 (1989)), an inter-simple sequence repeat (ISSR) (Blair et al., Theor. Appl. Genet. 98: 780 (1999)), an inter-retrotransposon amplified polymorphism (IRAP), a retrotransposon-microsatellite amplified polymorphism (REMAP) (Kalendar et al., Theor. Appl. Genet. 98: 704 (1999)), a chromosome interval, or an RNA cleavage product (such as a Lynx tag). A marker may be present in genomic or expressed nucleic acids (e.g., ESTs). The term marker may also refer to nucleic acids used as probes or primers (e.g., primer pairs) for use in amplifying, hybridizing to and/or detecting nucleic acid molecules according to methods well known in the art (e.g., using PCR). A large number of soybean molecular markers are known in the art, and are published or available from various sources, such as the SoyBase internet resource.
Markers corresponding to genetic polymorphisms between members of a population can be detected by methods well-established in the art. These include, e.g., nucleic acid sequencing, hybridization methods, amplification methods (e.g., PCR-based sequence specific amplification methods), detection of restriction fragment length polymorphisms (RFLP), detection of isozyme markers, detection of polynucleotide polymorphisms by allele specific hybridization (ASH), detection of amplified variable sequences of the plant genome, detection of self-sustained sequence replication, detection of simple sequence repeats (SSRs), detection of single nucleotide polymorphisms (SNPs), and/or detection of amplified fragment length polymorphisms (AFLPs). Well established methods are also known for the detection of expressed sequence tags (ESTs) and SSR markers derived from EST sequences and randomly amplified polymorphic DNA (RAPD).
A “marker allele” also described as an “allele of a marker locus” can refer to one of a plurality of polymorphic nucleotide sequences found at a marker locus in a population that is polymorphic for the marker locus.
“Marker-assisted selection” (MAS) is a process by which phenotypes are selected based on marker genotypes. In some embodiments, marker genotypes are used to identify plants that will be selected for a breeding program or for planting. In some embodiments, marker genotypes are used to identify plants that will not be selected for a breeding program or for planting (i.e., counter-selected plants), allowing them to be removed from the breeding/planting population.
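By way of illustration only, the selection and counter-selection steps described above can be sketched in Python as follows. The plant identifiers, the marker name, the genotype calls, and the rule requiring at least one copy of the favorable allele are hypothetical assumptions introduced for this sketch and are not taken from the examples herein.

```python
from typing import Dict, List, Tuple


def select_by_marker(
    genotypes: Dict[str, Dict[str, str]],
    marker: str,
    favorable_allele: str,
    min_copies: int = 1,
) -> Tuple[List[str], List[str]]:
    """Split plants into selected and counter-selected groups.

    genotypes maps a plant identifier to its genotype calls, where each call
    is a string of alleles at one marker locus (e.g., "GT" for a diploid).
    """
    selected: List[str] = []
    counter_selected: List[str] = []
    for plant_id, calls in genotypes.items():
        # A missing call is treated as zero copies of the favorable allele.
        copies = calls.get(marker, "").count(favorable_allele)
        if copies >= min_copies:
            selected.append(plant_id)
        else:
            counter_selected.append(plant_id)
    return selected, counter_selected


# Hypothetical genotype calls at a hypothetical marker locus.
example_genotypes = {
    "plant_1": {"marker_A": "GG"},
    "plant_2": {"marker_A": "GT"},
    "plant_3": {"marker_A": "TT"},
}
selected, counter_selected = select_by_marker(example_genotypes, "marker_A", "G")
```

Setting min_copies to 2 in this sketch would retain only plants homozygous for the favorable allele, which may be preferred when a fixed line is desired.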
As used herein, the terms “marker locus” and “marker loci” refer to a specific chromosome location or locations in the genome of an organism where a specific marker or markers can be found. A marker locus can be used to track the presence of a second linked locus, e.g., a linked locus that encodes or contributes to expression of a phenotypic trait. For example, a marker locus can be used to monitor segregation of alleles at a locus, such as a QTL or single gene, that are genetically or physically linked to the marker locus.
As used herein, the terms “marker probe” and “probe” refer to a nucleotide sequence or nucleic acid molecule that can be used to detect the presence of one or more particular alleles within a marker locus (e.g., a nucleic acid probe that is complementary to all of or a portion of the marker or marker locus, through nucleic acid hybridization). Marker probes comprising about 8, 10, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100 or more contiguous nucleotides may be used for nucleic acid hybridization. Alternatively, in some aspects, a marker probe refers to a probe of any type that is able to distinguish (i.e., genotype) the particular allele that is present at a marker locus.
As used herein, the terms “molecular marker” or “genetic marker” may be used to refer to a genetic marker, as defined above, or an encoded product thereof (e.g., a protein) used as a point of reference when identifying a linked locus. A molecular marker can be derived from genomic nucleotide sequences or from expressed nucleotide sequences (e.g., from a spliced RNA, a cDNA, etc.). The term also refers to nucleotide sequences complementary to or flanking the marker sequences, such as nucleotide sequences used as probes and/or primers capable of amplifying the marker sequence. Nucleotide sequences are “complementary” when they specifically hybridize in solution, e.g., according to Watson-Crick base pairing rules. Some of the markers described herein are also referred to as hybridization markers when located on an indel region. Thus, the marker need only indicate whether the indel region is present or absent. Any suitable marker detection technology may be used to identify such a hybridization marker, e.g., SNP technology is used in the examples provided herein.
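As a minimal sketch of the Watson-Crick complementarity referred to above, the following Python fragment computes the reverse complement of a DNA sequence and checks whether a probe is perfectly complementary to a stretch of a target strand. The function names and sequences are hypothetical; the sketch assumes unambiguous A/C/G/T bases and exact matching, whereas actual hybridization may tolerate mismatches depending on stringency.

```python
# Translation table for Watson-Crick base pairing (A-T, G-C).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def probe_matches(probe: str, target: str) -> bool:
    """Return True if the probe is perfectly complementary to part of target."""
    return reverse_complement(probe).upper() in target.upper()
```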
As used herein, the terms “backcross” and “backcrossing” refer to the process whereby a progeny plant is repeatedly crossed back to one of its parents. In a backcrossing scheme, the “donor” parent refers to the parental plant with the desired gene or locus to be introgressed. The “recipient” parent (used one or more times) or “recurrent” parent (used two or more times) refers to the parental plant into which the gene or locus is being introgressed. For example, see Ragot, M. et al. Marker-assisted Backcrossing: A Practical Example, in T
A centimorgan (“cM”) is a unit of measure of recombination frequency. One cM is equal to a 1% chance that a marker at one genetic locus will be separated from a marker at a second locus due to crossing over in a single generation.
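For illustration, the relationship between observed recombinants and map distance under this definition can be sketched in Python as shown below. The function name and the progeny counts are hypothetical. The direct proportionality used here is an assumption that is generally reasonable only for closely linked loci, because multiple crossovers between more distant loci cause the observed recombination frequency to underestimate the true map distance.

```python
def map_distance_cm(recombinant: int, total: int) -> float:
    """Estimate map distance in cM as the percentage of recombinant progeny."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= recombinant <= total:
        raise ValueError("recombinant must be between 0 and total")
    return 100.0 * recombinant / total


# Hypothetical counts: 12 recombinant progeny out of 200 scored.
distance = map_distance_cm(12, 200)
```

In this hypothetical case the two markers would be estimated to lie about 6 cM apart.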
As used herein, the term “chromosomal interval defined by and including,” used in reference to particular loci and/or alleles, refers to a chromosomal interval delimited by and encompassing the stated loci/alleles.
As used herein, the terms “cross” or “crossed” refer to the fusion of gametes via pollination to produce progeny (e.g., cells, seeds or plants). The term encompasses both sexual crosses (the pollination of one plant by another) and selfing (self-pollination, e.g., when the pollen and ovule are from the same plant). The term “crossing” refers to the act of fusing gametes via pollination to produce progeny.
As used herein, the terms “cultivar” and “variety” refer to a group of similar plants that by structural or genetic features and/or performance can be distinguished from other varieties within the same species.
As used herein, the terms “desired allele”, “favorable allele” and “allele of interest” are used interchangeably to refer to an allele associated with a desired trait (e.g., ASR resistance). In example embodiments, the desired allele may be detected or identified via marker-based assays, such as using a SNP marker assay.
As used herein, the terms “enhanced pathogen resistance”, “enhanced disease resistance”, and “conferring or enhancing resistance to a pathogen” refer to an improvement, enhancement, or increase in a plant's ability to endure and/or thrive despite being infected with a pathogen or disease (e.g., Asian soybean rust) as compared to one or more control plants (e.g., one or both of the parents, or a plant lacking a nucleic acid comprising an R-gene or marker associated with enhanced pathogen resistance to the respective pathogen/disease). An enhanced plant pathogen resistance comprises any statistically significant increase in resistance to the plant pathogen, including, for example, an increase of at least 1%, 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100% or higher. The control plants may be fully susceptible to the pathogen or have limited resistance to the pathogen. Enhanced disease resistance includes any mechanism (other than whole-plant immunity or resistance) that reduces the expression of symptoms indicative of infection for a respective disease such as Asian soybean rust, soybean cyst nematode, Phytophthora, etc. Conferring or enhancing of resistance may include a reduction (partial reduction or complete reduction) in symptoms or phenotypic characteristics associated with susceptibility to the pathogen and/or an increase in phenotypic characteristics associated with resistance to the pathogen. In example embodiments, conferring or increasing of resistance to Asian Soy Rust can include a reduction in the number, size, and/or density of lesions, change in the color of lesions (such as from a tan coloration to a reddish brown coloration), reduction in number and density of pustule formation, reduction in sporulation, or any combination thereof.
In embodiments, the nucleic acid of the present invention, encoding a protein conferring enhanced pathogen resistance when expressed in a plant cell, herein also referred to as a resistance gene or R-gene, can be used to enhance pathogen resistance to a fungal pathogen and/or a nematode. As non-limiting examples, the R-gene of the present invention can be used to enhance resistance to: soy cyst nematode, bacterial pustule, root knot nematode, frog eye leaf spot, Phytophthora, brown stem rot, nematode, Asian Soybean Rust, smut, Golovinomyces cichoracearum, Erysiphe cichoracearum, Blumeria graminis, Podosphaera xanthii, Sphaerotheca fuliginea, Pythium ultimum, Uncinula necator, Mycosphaerella pinodes, Magnaporthe grisea, Bipolaris oryzae, Rhizoctonia solani, Phytophthora sojae, Schizaphis graminum, Bemisia tabaci, Rhopalosiphum maidis, Deroceras reticulatum, Diatraea saccharalis, Myzus persicae, Sclerotinia sclerotiorum, Macrophomina phaseolina, or Fusarium virguliforme.
An “elite line” or “elite strain” is an agronomically superior line that has resulted from many cycles of breeding and selection for superior agronomic performance. Numerous elite lines are available and known to those of skill in the art of soybean breeding. An “elite population” is an assortment of elite individuals or lines that can be used to represent the state of the art in terms of agronomically superior genotypes of a given crop species, such as soybean. Similarly, an “elite germplasm” or elite strain of germplasm is an agronomically superior germplasm, typically derived from and/or capable of giving rise to a plant with superior agronomic performance, such as an existing or newly developed elite line of soybean.
An “elite” plant is any plant from an elite line, such that an elite plant is a representative plant from an elite variety. Non-limiting examples of elite soybean varieties that are commercially available to farmers or soybean breeders include: AG00802, A0868, AG0902, A1923, AG2403, A2824, A3704, A4324, A5404, AG5903, AG6202, AG0934; AG1435; AG2031; AG2035; AG2433; AG2733; AG2933; AG3334; AG3832; AG4135; AG4632; AG4934; AG5831; AG6534; and AG7231 (Asgrow Seeds, Des Moines, Iowa, USA); BPR0144RR, BPR 4077NRR and BPR 4390NRR (Bio Plant Research, Camp Point, Ill., USA); DKB17-51 and DKB37-51 (DeKalb Genetics, DeKalb, Ill., USA); DP 4546 RR, and DP 7870 RR (Delta & Pine Land Company, Lubbock, Tex., USA); JG 03R501, JG 32R606C ADD and JG 55R503C (JGL Inc., Greencastle, Ind., USA); NKS 13-K2 (NK Division of Syngenta Seeds, Golden Valley, Minnesota, USA); 90M01, 91M30, 92M33, 93M11, 94M30, 95M30, 97B52, P008T22R2; P16T17R2; P22T69R; P25T51R; P34T07R2; P35T58R; P39T67R; P47T36R; P46T21R; and P56T03R2 (Pioneer Hi-Bred International, Johnston, Iowa, USA); SG4771NRR and SG5161NRR/STS (Soygenetics, LLC, Lafayette, Ind., USA); S00-K5, S11-L2, S28-Y2, S43-B1, S53-A1, S76-L9, S78-G6, S0009-M2; S007-Y4; S04-D3; S14-A6; S20-T6; S21-M7; S26-P3; S28-N6; S30-V6; S35-C3; S36-Y6; S39-C4; S47-K5; S48-D9; S52-Y2; S58-Z4; S67-R6; S73-S8; and S78-G6 (Syngenta Seeds, Henderson, Ky., USA); Richer (Northstar Seed Ltd. Alberta, CA); 14RD62 (Stine Seed Co. Ia., USA); or Armor 4744 (Armor Seed, LLC, Ar., USA).
The term “agronomically elite,” as used herein, means a genotype that has a culmination of many distinguishable traits, such as emergence, vigor, vegetative vigor, disease resistance, seed set, standability, yield and threshability, which allows a producer to harvest a product of commercial significance.
A “native” or “wild type” nucleic acid, nucleotide sequence, polypeptide or amino acid sequence refers to a naturally occurring or endogenous nucleic acid, nucleotide sequence, polypeptide or amino acid sequence. Thus, for example, a “wild type mRNA” is an mRNA that is naturally occurring in, or endogenous to, the organism.
The terms “nucleic acid,” “nucleic acid molecule,” “nucleotide sequence,” “oligonucleotide”, “polynucleic acids” and “polynucleotide” are used interchangeably herein, unless the context indicates otherwise, and refer to a heteropolymer of nucleotides. These terms include without limitation DNA and RNA molecules, including cDNA, genomic DNA, synthetic (e.g., chemically synthesized) DNA and RNA, plasmid DNA, mRNA, anti-sense RNA, and RNA/DNA hybrids, any of which can be linear or branched, single stranded or double stranded, or a combination thereof. When dsRNA is produced synthetically, less common bases, such as inosine, 5-methylcytosine, 6-methyladenine, hypoxanthine and others can also be used for antisense, dsRNA, and ribozyme pairing. For example, polynucleotides that contain C-5 propyne analogues of uridine and cytidine have been shown to bind RNA with high affinity and to be potent antisense inhibitors of gene expression. Other modifications, such as modification to the phosphodiester backbone, or the 2′-hydroxy in the ribose sugar group of the RNA can also be made. In embodiments, the terms “nucleic acid,” “nucleic acid molecule,” “nucleotide sequence,” “oligonucleotide” or “polynucleotide” refer to DNA.
By “operably linked” or “operably associated” as used herein, it is meant that the indicated elements are functionally related to each other and are also generally physically related. Thus, the term “operably linked” or “operably associated” as used herein, refers to nucleotide sequences on a single nucleic acid molecule that are functionally associated. Thus, a first nucleotide sequence that is operably linked to a second nucleotide sequence, means a situation when the first nucleotide sequence is placed in a functional relationship with the second nucleotide sequence. For instance, a promoter is operably associated with a nucleotide sequence if the promoter effects the transcription or expression of said nucleotide sequence. Those skilled in the art will appreciate that the control sequences (e.g., promoter) need not be contiguous with the nucleotide sequence to which it is operably associated, as long as the control sequences function to direct the expression thereof. Thus, for example, intervening untranslated, yet transcribed, sequences can be present between a promoter and a nucleotide sequence, and the promoter can still be considered “operably linked” to or “operatively associated” with the nucleotide sequence.
As used herein, the terms “disease tolerance” and “disease resistant” refer to a plant's ability to endure and/or thrive despite being infected with a respective disease. When used in reference to germplasm, the terms refer to the ability of a plant that arises from that germplasm to endure and/or thrive despite being infected with a respective disease. In some embodiments, infected disease resistant soybean plants may yield as well (or nearly as well) as uninfected soybean plants. In general, a plant or germplasm is labeled as “Disease resistant” if it displays “enhanced pathogen resistance.”
As used herein, the term “endogenous” or “native” refers to materials originating from within an organism or cell. In contrast, “heterogenous” or “heterologous” or “exogenous” refers to materials not originating naturally from within the organism or cell, due to modifications being artificially introduced to their endogenous state. This typically applies to nucleic acid molecules used in producing transformed or transgenic host cells and plants. For example, a nucleic acid molecule comprising the R-gene of the present invention is an exogenous nucleic acid used to confer or enhance pathogen resistance in a plant cell transformed with the nucleic acid molecule.
As used herein, the terms “exotic,” “exotic line” and “exotic germplasm” refer to any plant, line or germplasm that is not elite. In general, exotic plants/germplasms are not derived from any known elite plant or germplasm, but rather are selected to introduce one or more desired genetic elements into a breeding program (e.g., to introduce novel alleles into a breeding program).
As used herein, a “genetic map” is a description of genetic linkage relationships among loci on one or more chromosomes within a given species, generally depicted in a diagrammatic or tabular form. For each genetic map, distances between loci are measured by the recombination frequencies between them. Recombinations between loci can be detected using a variety of markers. A genetic map is a product of the mapping population, types of markers used, and the polymorphic potential of each marker between different populations. The order and genetic distances between loci can differ from one genetic map to another.
As used herein, the term “genome” as it applies to plant cells encompasses not only chromosomal DNA found within the nucleus, but also organelle DNA found within subcellular components of the cell.
The term “gene” or “genomic sequence” means a nucleic acid that comprises chromosomal DNA, genomic DNA, plasmid DNA, cDNA, an artificial DNA polynucleotide, or other DNA encoding a polypeptide of interest. In particular embodiments, the nucleic acid sequence of the gene encodes a protein that, when expressed, is responsible, at least in part, for a particular characteristic or trait. In embodiments, the gene may be native, modified (e.g., by directed recombination or site-specific mutation), or synthetic. In example embodiments, the gene is transcribed into an RNA molecule (e.g., an mRNA) in a cell wherein the RNA may encode a peptide, polypeptide, or protein of interest, and in some examples may also encode genetic elements flanking the coding sequence that are involved in the regulation of expression of the mRNA or polypeptide of the present invention. A gene may thus comprise several operably linked sequences, such as a promoter sequence, a 5′ leader sequence comprising, for example, sequences involved in translation initiation, a (protein) coding region (comprising cDNA or genomic DNA), a 3′ non-translated sequence comprising, for example, transcription termination sequence sites, introns (e.g., one or more native, foreign, or modified introns). In example embodiments, the nucleic acid sequence of the isolated gene may include introns, exons, 5′ or 3′-untranslated regions (UTRs), and native regulatory elements (such as native promoters). In other example embodiments, the gene comprises a coding sequence for a polypeptide of interest without including any regulatory elements (e.g., without any native or foreign introns, with some native introns replaced with foreign or modified introns, without any untranslated sequences, or with native regulatory elements replaced with foreign, heterologous or modified regulatory elements).
A “fragment” of a gene or nucleic acid is a portion of a full-length nucleic acid molecule that is of at least a minimum length capable of transcription into an RNA, translation into a peptide, or useful as a probe or primer in a DNA detection method. A “functional fragment” of a gene or nucleic acid is a portion of the full-length nucleic acid molecule that is capable of performing the same function as the full-length nucleic acid molecule. In embodiments, a functional fragment of a chromosomal interval that confers increased pathogen resistance includes a gene derived from the chromosomal interval.
The terms “nucleic acid,” “nucleic acid molecule,” and “polynucleotide” are used interchangeably herein. In embodiments, the gene is a segment of single-stranded, double-stranded or partially double-stranded DNA or RNA, or a hybrid thereof, that can be isolated or synthesized from any source. In the context of the present disclosure, the gene is typically a segment of DNA. In some embodiments, the gene of the disclosure includes isolated nucleic acid molecules. In some embodiments, the gene of the disclosure is comprised within a vector, expression cassette, a plant, or a plant cell.
As used herein, in particular embodiments, “R-gene” or “Resistance gene” refers to a nucleic acid having a nucleotide sequence (e.g., DNA sequence) encoding a polypeptide of interest, R-protein, or Resistance protein, that, when expressed in a plant cell, confers on the plant cell, and/or the plant comprising the plant cell, increased resistance to one or more plant pathogens. For example, in embodiments, the R-gene(s) of the present disclosure encode polypeptides or R-proteins that, when expressed in a soybean plant cell, confer on the soybean plant resistance to at least Asian Soybean Rust. In embodiments, the R-gene may comprise one or more motifs that correlate with one or more domains of the corresponding R-protein. For example, embodiments of the R-gene may comprise a TNL motif comprising a Toll/Interleukin-1 receptor (TIR) motif, a nucleotide-binding site (NBS), and a leucine rich-repeat (LRR) motif. When expressed, the TNL motif encodes a TNL motif in the R-protein comprising a Toll/Interleukin-1 receptor (TIR) domain, a nucleotide-binding site (NBS) domain, and a leucine rich-repeat (LRR) domain. In other embodiments, the R-gene may comprise a CNL motif comprising a coiled coil (CC) motif, a nucleotide-binding site (NBS), and a leucine rich-repeat (LRR) motif. When expressed, the CNL motif encodes a CNL motif in the R-protein comprising a coiled coil (CC) domain, a nucleotide-binding site (NBS) domain, and a leucine rich-repeat (LRR) domain. The R-gene may comprise still other domains and motifs, such as a WRKY motif. In embodiments, the nucleic acid sequence of the R-gene is derived from a wild plant exhibiting increased resistance to the pathogen and includes at least a coding sequence encoding the R-protein.
The nucleic acid sequence of the R-gene may further comprise nucleic acid sequences corresponding to one or more native regulatory elements (such as native introns, native promoters, native UTRs), one or more heterologous regulatory elements (such as a heterologous promoter and introns), and combinations thereof. Insertion of the R-gene into a plant that has decreased resistance to the pathogen (e.g., no resistance or partially or fully susceptible), at a chromosomal location (e.g., stably integrated into the plant genome) or extra-chromosomal location (e.g., on a vector or plasmid) results in conferring of the wild plant-derived pathogen resistance to the recipient plant. For example, in representative embodiments, an R-gene of the present invention is derived from Glycine tomentella and can be inserted into Glycine max plants to confer or enhance resistance of G. max plants to Asian Soy Rust.
As used herein, the term “genotype” refers to the genetic constitution of an individual (or group of individuals) at one or more genetic loci, as contrasted with the observable and/or detectable and/or manifested trait (the phenotype). Genotype is defined by the allele(s) of one or more known loci that the individual has inherited from its parents. The term genotype can be used to refer to an individual's genetic constitution at a single locus, at multiple loci, or more generally, the term genotype can be used to refer to an individual's genetic make-up for all the genes in its genome. Genotypes can be indirectly characterized, e.g., using markers and/or directly characterized by nucleic acid sequencing.
As used herein, the term “germplasm” refers to genetic material of or from an individual (e.g., a plant), a group of individuals (e.g., a plant line, variety or family), or a clone derived from a line, variety, species, or culture. The germplasm can be part of an organism or cell or can be separate from the organism or cell. In general, germplasm provides genetic material with a specific molecular makeup that provides a physical foundation for some or all of the hereditary qualities of an organism or cell culture. As used herein, germplasm may refer to seeds, cells (including protoplasts and calli) or tissues from which new plants may be grown, as well as plant parts that can be cultured into a whole plant (e.g., stems, buds, roots, leaves, etc.).
As used herein, a “Heterologous DNA” sequence refers to a polynucleotide sequence that originates from a foreign source or species or, if from the same source, is modified from its original form.
As used herein, a “Homologous DNA” refers to DNA from the same source as that of the recipient cell.
As used herein, the term “hybrid” refers to a seed and/or plant produced when at least two genetically dissimilar parents are crossed.
As used herein, the term “inbred” refers to a substantially homozygous plant or variety. The term may refer to a plant or variety that is substantially homozygous throughout the entire genome or that is substantially homozygous with respect to a portion of the genome that is of particular interest.
As used herein, the term “indel” refers to an insertion or deletion in a pair of nucleotide sequences, wherein a first sequence may be referred to as having an insertion relative to a second sequence or the second sequence may be referred to as having a deletion relative to the first sequence.
As used herein, the terms “introgression,” “introgressing” and “introgressed” refer to both the natural and artificial transmission of a desired allele or combination of desired alleles of a genetic locus or genetic loci from one genetic background to another. For example, a desired allele at a specified locus can be transmitted to at least one progeny via a sexual cross between two parents of the same species, where at least one of the parents has the desired allele in its genome. Alternatively, for example, transmission of an allele can occur by recombination between two donor genomes, e.g., in a fused protoplast, where at least one of the donor protoplasts has the desired allele in its genome. The desired allele may be a selected allele of a marker, a QTL, a transgene, or the like. Offspring comprising the desired allele can be repeatedly backcrossed to a line having a desired genetic background and selected for the desired allele, with the result being that the desired allele becomes fixed in the desired genetic background. For example, an R-gene or marker associated with enhanced ASR tolerance or resistance may be introgressed from a donor into a recurrent parent that is not disease resistant. The resulting offspring could then be repeatedly backcrossed and selected until the progeny possess the ASR tolerance allele(s) in the recurrent parent background.
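As a rough numerical illustration of repeated backcrossing, the following Python sketch computes the expected average proportion of the recurrent parent genome after a given number of backcross generations. The formula 1 - (1/2)^(n+1) is a standard breeding expectation rather than a result taken from this disclosure, and it assumes no selection for background markers and ignores linkage drag around the introgressed locus. The function name is hypothetical.

```python
def expected_recurrent_fraction(n_backcrosses: int) -> float:
    """Expected average fraction of recurrent parent genome.

    n_backcrosses = 0 corresponds to the F1 (donor x recurrent parent);
    each subsequent backcross halves the expected donor contribution.
    """
    if n_backcrosses < 0:
        raise ValueError("n_backcrosses must be non-negative")
    return 1.0 - 0.5 ** (n_backcrosses + 1)


# Expected recovery for the F1 and the first four backcross generations.
recovery = [expected_recurrent_fraction(n) for n in range(5)]
```

Under these assumptions, the expected recovery rises from 50% in the F1 to 75% in the first backcross generation and to about 96.9% after four backcrosses, while selection maintains the desired allele at the target locus.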
As used herein, an “isolated” nucleic acid molecule or gene is substantially separated away from other nucleic acid or gene sequences with which the nucleic acid is normally associated, such as, from the chromosomal or extrachromosomal DNA of a cell in which the nucleic acid or gene naturally occurs. A nucleic acid molecule is an isolated nucleic acid molecule when it comprises a transgene or part of a transgene present in the genome of another organism. The term also embraces nucleic acids that are biochemically purified to substantially remove contaminating nucleic acids and other cellular components.
A polypeptide is “isolated” if it has been separated from the cellular components (nucleic acids, lipids, carbohydrates, and other polypeptides) that naturally accompany it or that is chemically synthesized or recombinant. A polypeptide molecule is an isolated polypeptide molecule when it is expressed from a transgene in another organism. A monomeric polypeptide is isolated when at least 60% by weight of a sample is composed of the polypeptide, preferably 90% or more, more preferably 95% or more, and most preferably more than 99%. Protein purity or homogeneity is indicated, for example, by polyacrylamide gel electrophoresis of a protein sample, followed by visualization of a single polypeptide band upon staining the polyacrylamide gel; high pressure liquid chromatography; or other conventional methods. Proteins can be purified by any of the means known in the art, for example as described in Guide to Protein Purification, ed. Deutscher, Meth. Enzymol. 185, Academic Press, San Diego, 1990; and Scopes, Protein Purification: Principles and Practice, Springer Verlag, New York, 1982.
Using well-known methods, the skilled artisan can readily produce nucleotide and amino acid sequence variants of genes and proteins that provide a modified gene product. Chemical synthesis of nucleic acids can be performed, for example, on automated oligonucleotide synthesizers. Such variants preferably do not change the reading frame of the protein-coding region of the nucleic acid. The present invention also encompasses fragments of a protein that lack at least one residue of a full-length protein, but that substantially maintain activity of the protein.
A “locus” is a position on a chromosome where a gene or marker or allele is located. In some embodiments, a locus may encompass one or more nucleotides.
A “non-naturally occurring variety of soybean” is any variety of soybean that does not exist in nature. A “non-naturally occurring variety of soybean” may be produced by any method known in the art, including, but not limited to, transforming a soybean plant or germplasm, transfecting a soybean plant or germplasm and crossing a naturally occurring variety of soybean with a non-naturally occurring variety of soybean. In some embodiments, a “non-naturally occurring variety of soybean” may comprise one or more heterologous nucleotide sequences. In some embodiments, a “non-naturally occurring variety of soybean” may comprise one or more non-naturally occurring copies of a naturally occurring nucleotide sequence (i.e., extraneous copies of a gene that naturally occurs in soybean). In some embodiments, a “non-naturally occurring variety of soybean” may comprise a non-natural combination of two or more naturally occurring nucleotide sequences (i.e., two or more naturally occurring genes that do not naturally occur in the same soybean, for instance genes not found in Glycine max lines such as polynucleotides from wild Glycine species).
As used herein, the terms “phenotype,” “phenotypic trait” or “trait” refer to one or more traits and/or manifestations of an organism. The phenotype can be a manifestation that is observable to the naked eye, or by any other means of evaluation known in the art, e.g., microscopy, biochemical analysis, or an electromechanical assay. In some cases, a phenotype or trait is directly controlled by a single gene or genetic locus, i.e., a “single gene trait.” In other cases, a phenotype or trait is the result of several genes. It is noted that, as used herein, the term “pathogen resistant phenotype” or “disease resistant phenotype” takes into account environmental conditions that might affect the respective pathogen or disease such that the effect is real and reproducible.
As used herein, the term “plant” may refer to a whole plant, any part thereof, or a cell or tissue culture derived from a plant. Thus, the term “plant” can refer to any of: whole plants, plant components or organs (e.g., roots, stems, leaves, buds, flowers, pods, etc.), plant tissues, seeds and/or plant cells. A plant cell is a cell of a plant, taken from a plant, or derived through culture from a cell taken from a plant. Thus, the term “soybean plant” may refer to a whole soybean plant, one or more parts of a soybean plant (e.g., roots, root tips, stems, leaves, buds, flowers, pods, seeds, cotyledons, etc.), soybean plant cells, soybean plant protoplasts and/or soybean plant calli.
A “plant cell” is a structural and physiological unit of a plant, comprising a protoplast and a cell wall. The plant cell may be in the form of an isolated single cell or a cultured cell, or as a part of a higher organized unit such as, for example, plant tissue, a plant organ, or a whole plant. In embodiments, the plant cell is non-propagating and/or cannot regenerate a whole plant.
A “plant cell culture” means a culture of plant units such as, for example, protoplasts, cell culture cells, cells in plant tissues, pollen, pollen tubes, ovules, embryo sacs, zygotes and embryos at various stages of development.
“Plant material” refers to leaves, stems, roots, flowers or flower parts, fruits, pollen, egg cells, zygotes, seeds, cuttings, cell or tissue cultures, or any other part or product of a plant.
A “plant organ” is a distinct and visibly structured and differentiated part of a plant such as a root, stem, leaf, flower bud, or embryo.
As used herein, the term “plant part” includes but is not limited to embryos, pollen, ovules, seeds, leaves, flowers, branches, fruit, stalks, roots, root tips, anthers, and/or plant cells including plant cells that are intact in plants and/or parts of plants, plant protoplasts, plant tissues, plant cell tissue cultures, plant calli, plant clumps, and the like.
“Plant tissue” as used herein means a group of plant cells organized into a structural and functional unit. Any tissue of a plant in planta or in culture is included. This term includes, but is not limited to, whole plants, plant organs, plant seeds, tissue culture and any groups of plant cells organized into structural or functional units. The use of this term in conjunction with, or in the absence of, any specific type of plant tissue as listed above or otherwise embraced by this definition is not intended to be exclusive of any other type of plant tissue.
“Polyadenylation signal” or “polyA signal” refers to a nucleic acid sequence located 3′ to a coding region that causes the addition of adenylate nucleotides to the 3′ end of the mRNA transcribed from the coding region.
“Polymerase chain reaction (PCR)” refers to a DNA amplification method that uses an enzymatic technique to create multiple copies of one sequence of nucleic acid (amplicon). Copies of a DNA molecule are prepared by shuttling a DNA polymerase between two amplimers. The basis of this amplification method is multiple cycles of temperature changes to denature, then re-anneal amplimers (DNA primer molecules), followed by extension to synthesize new DNA strands in the region located between the flanking amplimers. Nucleic-acid amplification can be accomplished by any of the various nucleic-acid amplification methods known in the art, including the polymerase chain reaction (PCR). A variety of amplification methods are known in the art and are described, inter alia, in U.S. Pat. Nos. 4,683,195 and 4,683,202 and in PCR Protocols: A Guide to Methods and Applications, ed. Innis et al., Academic Press, San Diego, 1990. PCR amplification methods have been developed to amplify up to 22 kb of genomic DNA and up to 42 kb of bacteriophage DNA (Cheng et al., Proc. Natl. Acad. Sci. USA 91:5695-5699, 1994). These methods as well as other methods known in the art of DNA amplification may be used in the practice of the present invention.
As used herein, the term “primer” refers to an oligonucleotide which is capable of annealing to a nucleic acid target and serving as a point of initiation of DNA synthesis when placed under conditions in which synthesis of a primer extension product is induced (e.g., in the presence of nucleotides and an agent for polymerization such as DNA polymerase and at a suitable temperature and pH). A primer (in some embodiments an extension primer and in some embodiments an amplification primer) is in some embodiments single stranded for maximum efficiency in extension and/or amplification. In some embodiments, the primer is an oligodeoxyribonucleotide. A primer is typically sufficiently long to prime the synthesis of extension and/or amplification products in the presence of the agent for polymerization. The minimum length of the primer can depend on many factors, including, but not limited to temperature and composition (A/T vs. G/C content) of the primer. In the context of amplification primers, these are typically provided as a pair of bi-directional primers consisting of one forward and one reverse primer or provided as a pair of forward primers as commonly used in the art of DNA amplification such as in PCR amplification. As such, it will be understood that the term “primer,” as used herein, can refer to more than one primer, particularly in the case where there is some ambiguity in the information regarding the terminal sequence(s) of the target region to be amplified. Hence, a “primer” can include a collection of primer oligonucleotides containing sequences representing the possible variations in the sequence or includes nucleotides which allow a typical base pairing. Primers can be prepared by any suitable method known in the art. Methods for preparing oligonucleotides of specific sequence are known in the art, and include, for example, cloning and restriction of appropriate sequences and direct chemical synthesis. 
Chemical synthesis methods can include, for example, the phospho di- or tri-ester method, the diethylphosphoramidate method and the solid support method disclosed in U.S. Pat. No. 4,458,066. Primers can be labeled, if desired, by incorporating moieties detectable by, for instance, spectroscopic, fluorescence, photochemical, biochemical, immunochemical, or chemical means. Primers diagnostic (i.e., able to identify or select based on the presence of ASR resistant alleles) for ASR resistance can be created to any favorable SNP as described in any one of Tables 1-5. The PCR method is well described in handbooks and known to the skilled person. After amplification by PCR, target polynucleotides can be detected by hybridization with a probe polynucleotide, which forms a stable hybrid with the target sequence under stringent to moderately stringent hybridization and wash conditions. If it is expected that the probes are essentially completely complementary (i.e., about 99% or greater) to the target sequence, stringent conditions can be used. If some mismatching is expected, for example if variant strains are expected with the result that the probe will not be completely complementary, the stringency of hybridization can be reduced. In some embodiments, conditions are chosen to rule out non-specific/adventitious binding. Conditions that affect hybridization, and that select against non-specific binding are known in the art, and are described in, for example, Sambrook & Russell (2001). Molecular Cloning: A Laboratory Manual, Third Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, United States of America. Generally, lower salt concentration and higher temperature hybridization and/or washes increase the stringency of hybridization conditions.
As used herein, the terms “progeny” and “progeny plant” refer to a plant generated from a vegetative or sexual reproduction from one or more parent plants. A progeny plant may be obtained by cloning or selfing a single parent plant, or by crossing two parental plants.
As used herein, “promoter” refers to a polynucleotide, usually upstream (5′) of its coding polynucleotide, which controls the expression of the coding polynucleotide by providing the recognition site for RNA polymerase and other factors required for proper transcription. In example embodiments of the present invention, proteins or polynucleotides are provided that, when expressed in a plant or plant cell, confer on the plant enhanced resistance to a plant pathogen.
The term “promoter” or “promoter region” refers to a polynucleic acid molecule that functions as a regulatory element, usually found upstream (5′) to a coding sequence, that controls expression of the coding sequence by controlling production of messenger RNA (mRNA) by providing the recognition site for RNA polymerase and/or other factors necessary for start of transcription at the correct site. As contemplated herein, a promoter or promoter region includes variations of promoters derived by means of ligation to various regulatory sequences, random or controlled mutagenesis, and addition or duplication of enhancer sequences. The promoter region disclosed herein, and biologically functional equivalents thereof, are responsible for driving the transcription of coding sequences under their control when introduced into a host as part of a suitable recombinant DNA construct, as demonstrated by its ability to produce mRNA. In some embodiments, such as in the vector constructs and expression cassettes disclosed herein, the promoter may be heterologous to the coding sequence whose expression the promoter controls, such as when the promoter and the coding sequence are derived from different sources, such as different organisms (e.g., in embodiments, the vectors described herein comprise an R-gene sequence derived from Glycine tomentella and a promoter sequence derived from Medicago truncatula). In other examples, the promoter may be endogenous or native to the coding sequence whose expression the promoter controls, such as when the promoter and the coding sequence are derived from a common source, such as a common organism. A number of promoters can be used in an expression cassette, including a combination of the native promoter of an R gene encoding an R protein and one or more heterologous promoters.
Alternatively, promoters can be selected based upon a desired outcome. Such promoters include, but are not limited to, “constitutive promoters” (where expression of a polynucleotide sequence operably linked to the promoter is unregulated and therefore continuous), “inducible promoters” (where expression of a polynucleotide sequence operably linked to the promoter is induced by an analyte, cofactor, regulatory protein, etc.), “repressible promoters” (where expression of a polynucleotide sequence operably linked to the promoter is repressed by an analyte, cofactor, regulatory protein, etc.), and “tissue-preferred promoters” (where expression of a polynucleotide sequence operably linked to the promoter is higher in a preferred tissue relative to other tissues, such as higher in leaf tissue relative to other plant tissues).
As used herein, “plant promoter” means a promoter that drives expression in a plant such as a constitutive, inducible (e.g., chemical-, environmental-, pathogen- or wound-inducible), repressible, tissue-preferred or other promoter for use in plants.
Example promoters are set forth in WO 99/43838 and in U.S. Pat. Nos. 8,575,425; 7,790,846; 8,147,856; 8,586,832; 7,772,369; 7,534,939; 6,072,050; 5,659,026; 5,608,149; 5,608,144; 5,604,121; 5,569,597; 5,466,785; 5,399,680; 5,268,463; 5,608,142; and 6,177,611; herein incorporated by reference. Example constitutive promoters include CaMV 35S promoter (Odell et al. (1985) Nature 313:810-812); rice actin (McElroy et al. (1990) Plant Cell 2: 163-171); ubiquitin (Christensen et al. (1989) Plant Mol. Biol. 12:619-632 and Christensen et al. (1992) Plant Mol. Biol. 18:675-689); pEMU (Last et al. (1991) Theor. Appl. Genet. 81: 581-588); MAS (Velten et al. (1984) EMBO J. 3:2723-2730). Example inducible promoters include those that drive expression of pathogenesis-related proteins (PR proteins), which are induced following infection by a pathogen. See, for example, Redolfi et al. (1983) Neth. J. Plant Pathol. 89:245-254; Uknes et al. (1992) Plant Cell 4:645-656; and Van Loon (1985) Plant Mol. Virol. 4: 111-116; and WO 99/43819, herein incorporated by reference. Promoters that are expressed locally at or near the site of pathogen infection may also be used (Marineau et al. (1987) Plant Mol. Biol. 9:335-342; Matton et al. (1989) Molecular Plant-Microbe Interactions 2: 325-331; Somssich et al. (1986) Proc. Natl. Acad. Sci. USA 83:2427-2430; Somssich et al. (1988) Mol. Gen. Genet. 2:93-98; and Yang (1996) Proc. Natl. Acad. Sci. USA 93: 14972-14977; Chen et al. (1996) Plant J. 10:955-966; Zhang et al. (1994) Proc. Natl. Acad. Sci. USA 91:2507-2511; Warner et al. (1993) Plant J. 3: 191-201; Siebertz et al. (1989) Plant Cell 1:961-968; Cordero et al. (1992) Physiol. Mol. Plant Path. 41: 189-200; U.S. Pat. No. 5,750,386 (nematode-inducible); and the references cited therein).
Wound-inducible promoters include pin II promoter (Ryan (1990) Ann. Rev. Phytopath. 28:425-449; Duan et al. (1996) Nature Biotechnology 14:494-498); wun1 and wun2 (U.S. Pat. No. 5,428,148); win1 and win2 (Stanford et al. (1989) Mol. Gen. Genet. 215:200-208); systemin (McGurl et al. (1992) Science 225: 1570-1573); WIP1 (Rohmeier et al. (1993) Plant Mol. Biol. 22:783-792; Eckelkamp et al. (1993) FEBS Letters 323:73-76); MPI gene (Cordero et al. (1994) Plant J. 6(2): 141-150); and the like, herein incorporated by reference.
Tissue-preferred promoters for use in the invention include those set forth in Yamamoto et al. (1997) Plant J. 12(2):255-265; Kawamata et al. (1997) Plant Cell Physiol. 38(7):792-803; Hansen et al. (1997) Mol. Gen. Genet. 254(3):337-343; Russell et al. (1997) Transgenic Res. 6(2): 157-168; Rinehart et al. (1996) Plant Physiol. 112(3): 1331-1341; Van Camp et al. (1996) Plant Physiol. 112(2):525-535; Canevascini et al. (1996) Plant Physiol. 112(2):513-524; Yamamoto et al. (1994) Plant Cell Physiol. 35(5):773-778; Lam (1994) Results Probl. Cell Differ. 20: 181-196; Orozco et al. (1993) Plant Mol. Biol. 23(6): 1129-1138; Matsuoka et al. (1993) Proc. Natl. Acad. Sci. USA 90(20):9586-9590; and Guevara-Garcia et al. (1993) Plant J. 4(3):495-505.
Leaf-preferred promoters include those set forth in Yamamoto et al. (1997) Plant J. 12(2):255-265; Kwon et al. (1994) Plant Physiol. 105:357-67; Yamamoto et al. (1994) Plant Cell Physiol. 35(5):773-778; Gotor et al. (1993) Plant J. 3:509-18; Orozco et al. (1993) Plant Mol. Biol. 23(6): 1129-1138; and Matsuoka et al. (1993) Proc. Natl. Acad. Sci. USA 90(20):9586-9590.
Root-preferred promoters are known and include those in Hire et al. (1992) Plant Mol. Biol. 20(2):207-218 (soybean root-specific glutamine synthetase gene); Keller and Baumgartner (1991) Plant Cell 3(10): 1051-1061 (root-specific control element); Sanger et al. (1990) Plant Mol. Biol. 14(3):433-443 (mannopine synthase (MAS) gene of Agrobacterium tumefaciens); and Miao et al. (1991) Plant Cell 3(1): 11-22 (cytosolic glutamine synthetase (GS)); Bogusz et al. (1990) Plant Cell 2(7):633-641; Leach and Aoyagi (1991) Plant Science (Limerick) 79(1):69-76 (rolC and rolD); Teeri et al. (1989) EMBO J. 8(2):343-350; Kuster et al. (1995) Plant Mol. Biol. 29(4):759-772 (the VfENOD-GRP3 gene promoter); and, Capana et al. (1994) Plant Mol. Biol. 25(4):681-691 (rolB promoter). See also U.S. Pat. Nos. 5,837,876; 5,750,386; 5,633,363; 5,459,252; 5,401,836; 5,110,732; and 5,023,179.
As used herein, “recombinant” refers to a form of nucleic acid (e.g., DNA or RNA) and/or protein and/or an organism that would not normally be found in nature and as such was created by human intervention. Such human intervention may produce a recombinant nucleic acid molecule and/or a recombinant plant. As used herein, a “recombinant DNA molecule” is a DNA molecule comprising a combination of DNA molecules that would not naturally occur together and is the result of human intervention, e.g., a DNA molecule that is comprised of a combination of at least two DNA molecules heterologous to each other, and/or a DNA molecule that is artificially synthesized and comprises a polynucleotide that deviates from the polynucleotide that would normally exist in nature, and/or a DNA molecule that is artificially incorporated into a host cell's genomic DNA and the associated flanking DNA of the host cell's genome. An example of a recombinant DNA molecule is a DNA molecule resulting from the insertion of a transgene or a genome modification (i.e., a gene edit) into a plant's genomic DNA, which may ultimately result in the expression of a recombinant RNA and/or protein molecule in that organism. As used herein, a “recombinant plant” is a plant that would not normally exist in nature, is the result of human intervention, and contains a transgene and/or heterologous DNA molecule and/or a genome modification (i.e., a gene edit) incorporated into its genome. As a result of such genomic alteration, the recombinant plant is distinctly different from the related wildtype plant. The term “recombinant DNA construct” refers to any agent such as a plasmid, cosmid, virus, autonomously replicating sequence, phage, or linear or circular single-stranded or double-stranded DNA or RNA nucleotide sequence, derived from any source, capable of genomic integration or autonomous replication, comprising a DNA molecule in which one or more DNA sequences have been linked in a functionally operative manner.
Recombinant DNA constructs may be constructed to be capable of expressing antisense RNAs or stabilized double stranded antisense RNAs.
The phrase “substantially identical,” in the context of two nucleic acids or two amino acid sequences, refers to two or more sequences or subsequences that have at least about 50% nucleotide or amino acid residue identity when compared and aligned for maximum correspondence as measured using a sequence comparison algorithm or by visual inspection. In certain embodiments, substantially identical sequences have at least about 60%, 65%, 70%, 75%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more nucleotide or amino acid residue identity. In certain embodiments, substantial identity exists over a region of the sequences that is at least about 50 amino acid residues, 100 amino acid residues, 150 amino acid residues, 200 amino acid residues, 250 amino acid residues, 300 amino acid residues, 350 amino acid residues, 400 amino acid residues, 450 amino acid residues, 500 amino acid residues, 525 amino acid residues, 526 amino acid residues, 527 amino acid residues, 528 amino acid residues, 529 amino acid residues, 530 amino acid residues, 531 amino acid residues, 532 amino acid residues, 533 amino acid residues, 534 amino acid residues, 535 amino acid residues, 536 amino acid residues or more with respect to the protein sequence or the nucleotide sequence encoding the same. In further embodiments, the sequences are substantially identical when they are identical over the entire length of the coding regions.
The term “identity”, “sequence identity”, “homology”, “similarity”, “sequence similarity” or “identical” in the context of two nucleic acid or amino acid sequences, refers to the percentage of identical nucleotides or amino acids in a linear polynucleotide or amino acid sequence of a reference (“query”) sequence (or its complementary strand) as compared to a test (“subject”) sequence when the two sequences are globally aligned. Unless otherwise stated, sequence identity as used herein refers to the value obtained using the Needleman and Wunsch algorithm ((1970) J. Mol. Biol. 48:443-453) implemented in the EMBOSS Needle alignment tool using default matrix files EBLOSUM62 for protein with default parameters (Gap Open=10, Gap Extend=0.5, End Gap Penalty=False, End Gap Open=10, End Gap Extend=0.5) or DNAfull for nucleic acids with default parameters (Gap Open=10, Gap Extend=0.5, End Gap Penalty=False, End Gap Open=10, End Gap Extend=0.5); or any equivalent program thereof. EMBOSS Needle is available, e.g., from EMBL-EBI such as at the following website: ebi.ac.uk/Tools/psa/emboss_needle/ and as described in the following publication: “The EMBL-EBI search and sequence analysis tools APIs in 2019.” Madeira et al. Nucleic Acids Research, June 2019, 47(W1):W636-W641. The term “equivalent program” as used herein refers to any sequence comparison program that, for any two sequences in question, generates an alignment having identical nucleotide or amino acid residue matches and an identical percent sequence identity when compared to the corresponding alignment generated by EMBOSS Needle. In a preferred embodiment, substantially identical nucleic acid or amino acid sequences may perform substantially the same function. In embodiments, sequences that are substantially identical have at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, or at least 95% sequence identity to each other.
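By way of non-limiting illustration, the percent identity of two globally aligned sequences can be sketched as follows. The sketch assumes that the alignment has already been produced (e.g., by EMBOSS Needle) and is supplied as two equal-length gapped strings, and it further assumes that identity is counted as matched positions divided by the total alignment length, gaps included. The function names percent_identity and is_substantially_identical and the default 60% threshold are hypothetical and chosen for illustration only; they are not part of any alignment tool.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned, equal-length gapped sequences.

    Assumption: identity = matched positions / alignment length (gaps included).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    a, b = aligned_a.upper(), aligned_b.upper()
    # A position counts as a match only if both residues agree and neither is a gap.
    matches = sum(1 for x, y in zip(a, b) if x == y and x != gap)
    return 100.0 * matches / len(a)


def is_substantially_identical(aligned_a, aligned_b, threshold=60.0):
    """Compare percent identity against an illustrative threshold (hypothetical default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Six aligned columns, four of which match.
    print(round(percent_identity("ACGT-A", "ACCTTA"), 2))
```

A threshold other than 60% (for example 80%, 90% or 95%, as recited above) can be supplied through the threshold argument.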
The terms “homology”, “sequence similarity”, or “sequence identity” in reference to nucleotide or amino acid sequences mean a degree of identity or similarity of two or more sequences and may be determined conventionally by using known software or computer programs such as the Best-Fit or Gap pairwise comparison programs (GCG Wisconsin Package, Genetics Computer Group, 575 Science Drive, Madison, Wis. 53711). BestFit uses the local homology algorithm of Smith and Waterman, Advances in Applied Mathematics 2:482-489 (1981), to find the best segment of identity or similarity between two sequences. Sequence comparison between two or more polynucleotides or polypeptides is generally performed by comparing portions of the two sequences over a comparison window to identify and compare local regions of sequence similarity. The comparison window is generally from about 20 to 200 contiguous nucleotides. Gap performs global alignments: all of one sequence with all of another similar sequence using the method of Needleman and Wunsch, J. Mol. Biol. 48:443-453 (1970). When using a sequence alignment program such as BestFit to determine the degree of DNA sequence homology, similarity or identity, the default setting may be used, or an appropriate scoring matrix may be selected to optimize identity, similarity or homology scores. Similarly, when using a program such as BestFit to determine sequence identity, similarity or homology between two different amino acid sequences, the default settings may be used, or an appropriate scoring matrix, such as blosum45 or blosum80, may be selected to optimize identity, similarity or homology scores.
Two nucleotide sequences can also be considered to be substantially identical when the two sequences hybridize to each other under stringent conditions. In representative embodiments, two nucleotide sequences considered to be substantially identical hybridize to each other under highly stringent conditions.
The terms “stringent conditions” or “stringent hybridization conditions” include reference to conditions under which a nucleic acid will selectively hybridize to a target sequence to a detectably greater degree than other sequences (e.g., at least 2-fold over a non-target sequence), and optionally may substantially exclude binding to non-target sequences. Stringent conditions are sequence-dependent and will vary under different circumstances. By controlling the stringency of the hybridization and/or washing conditions, target sequences can be identified that can be up to 100% complementary to the reference nucleotide sequence. Alternatively, conditions of moderate or even low stringency can be used to allow some mismatching in sequences so that lower degrees of sequence similarity are detected. For example, those skilled in the art will appreciate that to function as a primer or probe, a nucleic acid sequence only needs to be sufficiently complementary to the target sequence to substantially bind thereto so as to form a stable double-stranded structure under the conditions employed. Thus, primers or probes can be used under conditions of high, moderate or even low stringency. Likewise, conditions of low or moderate stringency can be advantageous to detect homolog, ortholog and/or paralog sequences having lower degrees of sequence identity than would be identified under highly stringent conditions.
The terms “complementary” or “complementarity” (and similar terms), as used herein, refer to the natural binding of polynucleotides under permissive salt and temperature conditions by base-pairing. For example, the sequence “A-G-T” binds to the complementary sequence “T-C-A.” Complementarity between two single-stranded molecules may be partial, in which only some of the nucleotides bind, or it may be complete when total complementarity exists between the single stranded molecules. The degree of complementarity between nucleic acid strands has significant effects on the efficiency and strength of hybridization between the molecules. As used herein, the term “substantially complementary” (and similar terms) means that two nucleic acid sequences are at least about 50%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or more complementary. Alternatively, the term “substantially complementary” (and similar terms) can mean that two nucleic acid sequences can hybridize together under high stringency conditions (as described herein).
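By way of non-limiting illustration, the degree of complementarity between two single-stranded DNA sequences can be sketched as follows. The sketch assumes Watson-Crick pairing only (A with T, G with C) and follows the notation of the example above, in which each position of the first sequence is compared with the same position of the second sequence as written. The function names complement and percent_complementary are hypothetical.

```python
# Watson-Crick base pairs for DNA (assumption: no degenerate or modified bases).
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq):
    """Return the position-wise complement of a DNA sequence."""
    return "".join(PAIRS[base] for base in seq.upper())


def percent_complementary(seq_a, seq_b):
    """Percentage of positions at which seq_a and seq_b form a Watson-Crick pair."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum(1 for x, y in zip(seq_a.upper(), seq_b.upper()) if PAIRS.get(x) == y)
    return 100.0 * paired / len(seq_a)


if __name__ == "__main__":
    print(complement("AGT"))                      # complement of the example sequence
    print(percent_complementary("AGT", "TCA"))    # complete complementarity
    print(percent_complementary("AGTA", "TCAA"))  # partial complementarity
```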
As used herein, “specifically” or “selectively” hybridizing (and similar terms) refers to the binding, duplexing, or hybridizing of a molecule to a particular nucleic acid target sequence under stringent conditions when that sequence is present in a complex mixture (e.g., total cellular DNA or RNA) to the substantial exclusion of non-target nucleic acids, or even with no detectable binding, duplexing or hybridizing to non-target sequences. Specifically or selectively hybridizing sequences typically are at least about 40% complementary and are optionally substantially complementary or even completely complementary (i.e., 100% identical).
For DNA-DNA hybrids, the Tm can be approximated from the equation of Meinkoth and Wahl, Anal. Biochem., 138:267-84 (1984): Tm=81.5° C.+16.6 (log M)+0.41 (% GC)−0.61 (% formamide)−500/L; where M is the molarity of monovalent cations, % GC is the percentage of guanosine and cytosine nucleotides in the DNA, % formamide is the percentage of formamide in the hybridization solution, and L is the length of the hybrid in base pairs. The Tm is the temperature (under defined ionic strength and pH) at which 50% of a complementary target sequence hybridizes to a perfectly matched probe. Tm is reduced by about 1° C. for each 1% of mismatching; thus, Tm, hybridization and/or wash conditions can be adjusted to hybridize to sequences of the desired degree of identity. For example, if sequences with >90% identity are sought, the Tm can be decreased 10° C. Generally, stringent conditions are selected to be about 5° C. lower than the thermal melting point (Tm) for the specific sequence and its complement at a defined ionic strength and pH. However, highly stringent conditions can utilize a hybridization and/or wash at the thermal melting point (Tm) or 1, 2, 3 or 4° C. lower than the thermal melting point (Tm); moderately stringent conditions can utilize a hybridization and/or wash at 6, 7, 8, 9 or 10° C. lower than the thermal melting point (Tm); low stringency conditions can utilize a hybridization and/or wash at 11, 12, 13, 14, 15 or 20° C. lower than the thermal melting point (Tm). If the desired degree of mismatching results in a Tm of less than 45° C. (aqueous solution) or 32° C. (formamide solution), optionally the SSC concentration can be increased so that a higher temperature can be used. 
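By way of non-limiting illustration, the Tm approximation and the stringency offsets described above can be sketched as follows. The sketch assumes that “log” in the equation denotes the base-10 logarithm; the function names melting_temperature, adjusted_tm and wash_temperature_range are hypothetical and are used for illustration only.

```python
import math


def melting_temperature(monovalent_molar, percent_gc, percent_formamide, length_bp):
    """Approximate Tm (deg C) of a DNA-DNA hybrid using the equation quoted above.

    Assumption: 'log M' is the base-10 logarithm of the monovalent cation molarity.
    """
    return (81.5
            + 16.6 * math.log10(monovalent_molar)
            + 0.41 * percent_gc
            - 0.61 * percent_formamide
            - 500.0 / length_bp)


def adjusted_tm(tm, percent_mismatch):
    """Reduce Tm by about 1 deg C for each 1% of mismatching."""
    return tm - 1.0 * percent_mismatch


def wash_temperature_range(tm, stringency):
    """Return (lowest, highest) hybridization/wash temperature for a stringency level."""
    # Offsets below Tm in deg C, as recited above for each stringency level.
    offsets = {"high": (0, 4), "moderate": (6, 10), "low": (11, 20)}
    smallest, largest = offsets[stringency]
    return (tm - largest, tm - smallest)


if __name__ == "__main__":
    tm = melting_temperature(1.0, 50.0, 0.0, 500)
    print(round(tm, 1))
    print(wash_temperature_range(tm, "moderate"))
```

For example, with 1 M monovalent cation, 50% GC content, no formamide and a 500 base pair hybrid, the terms are 81.5 + 0 + 20.5 − 0 − 1, giving a Tm of about 101° C.; seeking sequences with 90% identity (10% mismatching) would lower this by about 10° C.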
An extensive guide to the hybridization of nucleic acids is found in Tijssen, Laboratory Techniques in Biochemistry and Molecular Biology-Hybridization with Nucleic Acid Probes, part I, chapter 2, “Overview of principles of hybridization and the strategy of nucleic acid probe assays,” Elsevier, New York (1993); Current Protocols in Molecular Biology, chapter 2, Ausubel, et al., eds, Greene Publishing and Wiley-Interscience, New York (1995); and Green & Sambrook, In: Molecular Cloning, A Laboratory Manual, 4th Edition, Cold Spring Harbor Press, Cold Spring Harbor, N.Y. (2012).
Typically, stringent conditions are those in which the salt concentration is less than about 1.5 M Na ion, typically about 0.01 to 1.0 M Na ion concentration (or other salts) at about pH 7.0 to pH 8.3 and the temperature is at least about 30° C. for short probes (e.g., 10 to 50 nucleotides) and at least about 60° C. for longer probes (e.g., greater than 50 nucleotides). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide or Denhardt's (5 g Ficoll, 5 g polyvinylpyrrolidone, 5 g bovine serum albumin in 500 ml of water). Exemplary low stringency conditions include hybridization with a buffer solution of 30% to 35% formamide, 1 M NaCl, 1% SDS (sodium dodecyl sulfate) at 37° C. and a wash in 1× to 2×SSC (20×SSC=3.0 M NaCl/0.3 M trisodium citrate) at 50° C. to 55° C. Exemplary moderate stringency conditions include hybridization in 40% to 45% formamide, 1 M NaCl, 1% SDS at 37° C. and a wash in 0.5× to 1×SSC at 55° C. to 60° C. Exemplary high stringency conditions include hybridization in 50% formamide, 1 M NaCl, 1% SDS at 37° C. and a wash in 0.1×SSC at 60° C. to 65° C. A further non-limiting example of high stringency conditions includes hybridization in 4×SSC, 5×Denhardt's, 0.1 mg/ml boiled salmon sperm DNA, and 25 mM Na phosphate at 65° C. and a wash in 0.1×SSC, 0.1% SDS at 65° C. Another illustration of high stringency hybridization conditions includes hybridization in 7% SDS, 0.5 M NaPO4, 1 mM EDTA at 50° C. with washing in 2×SSC, 0.1% SDS at 50° C., alternatively with washing in 1×SSC, 0.1% SDS at 50° C., alternatively with washing in 0.5×SSC, 0.1% SDS at 50° C., or alternatively with washing in 0.1×SSC, 0.1% SDS at 50° C., or even with washing in 0.1×SSC, 0.1% SDS at 65° C. Those skilled in the art will appreciate that specificity is typically a function of post-hybridization washes, the relevant factors being the ionic strength and temperature of the final wash solution.
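By way of non-limiting illustration, the salt concentrations of the SSC wash solutions recited above can be derived from the stated composition of 20×SSC (3.0 M NaCl/0.3 M trisodium citrate) by simple proportional dilution. The function name ssc_molarity and the dictionary keys are hypothetical.

```python
def ssc_molarity(fold):
    """Molar NaCl and trisodium citrate in an N-fold SSC solution.

    Scaled linearly from 20x SSC = 3.0 M NaCl / 0.3 M trisodium citrate.
    """
    return {
        "NaCl_M": 3.0 * fold / 20.0,
        "trisodium_citrate_M": 0.3 * fold / 20.0,
    }


if __name__ == "__main__":
    # Wash strengths mentioned above, from low to high stringency.
    for fold in (2.0, 1.0, 0.5, 0.1):
        print(fold, ssc_molarity(fold))
```

On this basis, a 0.1×SSC wash contains about 0.015 M NaCl, consistent with the principle that lower ionic strength in the final wash corresponds to higher stringency.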
Nucleic acids that do not hybridize to each other under stringent conditions are still substantially identical if the proteins that they encode are substantially identical (e.g., due to the degeneracy of the genetic code).
A further indication that two nucleic acids or proteins are substantially identical is that the protein encoded by the first nucleic acid is immunologically cross reactive with the protein encoded by the second nucleic acid. Thus, a protein is typically substantially identical to a second protein, for example, where the two proteins differ only by conservative substitutions.
The term “vector” refers to a composition for transferring, delivering or introducing a nucleic acid (or nucleic acids) into a cell. A vector comprises a nucleic acid molecule comprising the nucleotide sequence(s) to be transferred, delivered or introduced.
Provided herein are plants expressing polypeptides that increase the plant's pathogen resistance as compared to a control plant that does not express the polypeptide. The polypeptide includes SEQ ID NO: 5 and functional fragments and variants thereof. Various means of introducing a nucleic acid sequence into the soybean plant are also disclosed, which include transgenic means, gene editing, and breeding. Markers for identifying the presence of these nucleic acid sequences in the plant are also disclosed.
In some embodiments, the plants provided herein are a non-naturally occurring variety of soybean having the desired trait. In specific embodiments, the non-naturally occurring variety of soybean is an elite soybean variety. A “non-naturally occurring variety of soybean” is any variety of soybean that does not naturally exist in nature. A “non-naturally occurring variety of soybean” may be produced by any method known in the art, including, but not limited to, transforming a soybean plant or germplasm, transfecting a soybean plant or germplasm and crossing a naturally occurring variety of soybean with a non-naturally occurring variety of soybean. In some embodiments, a “non-naturally occurring variety of soybean” may comprise one or more heterologous nucleotide sequences. In some embodiments, a “non-naturally occurring variety of soybean” may comprise one or more non-naturally occurring copies of a naturally occurring nucleotide sequence (i.e., extraneous copies of a gene that naturally occurs in soybean). In some embodiments, a “non-naturally occurring variety of soybean” may comprise a non-natural combination of two or more naturally occurring nucleotide sequences (i.e., two or more naturally occurring genes that do not naturally occur in the same soybean, for instance genes not found in Glycine max lines).
Methods and compositions are provided that increase pathogen resistance in a plant, a plant part, or a seed. In specific embodiments, various methods and compositions are provided that produce an increase in resistance to Asian Soy Rust in the plant, plant part or seed. An increase in pathogen resistance includes any statistically significant increase in the plant's ability to resist infection by the pathogen when compared to an appropriate control plant or plant part.
A “subject plant or plant cell” is one in which genetic alteration, such as transformation, has been effected as to a polynucleotide of interest, or is a plant or plant cell which is descended from a plant or cell so altered and which comprises the alteration. A “control” or “control plant” or “control plant cell” provides a reference point for measuring changes in phenotype of the subject plant or plant cell. A control plant or plant cell may comprise, for example: (a) a wild-type plant or cell, i.e., of the same genotype as the starting material for the genetic alteration which resulted in the subject plant or cell; (b) a plant or plant cell of the same genotype as the starting material but which has been transformed with a null construct (i.e., with a construct which has no known effect on the trait of interest, such as a construct comprising a marker gene); (c) a plant or plant cell which is a non-transformed segregant among progeny of a subject plant or plant cell; (d) a plant or plant cell genetically identical to the subject plant or plant cell but which is not exposed to conditions or stimuli that would induce expression of the gene of interest; or (e) the subject plant or plant cell itself, under conditions in which the gene of interest is not expressed.
Expression of Polynucleotides and Polypeptides that Confer Increased Pathogen Resistance
Compositions and methods for conferring increased pathogen resistance are provided. Polypeptides, polynucleotides and functional fragments and variants thereof that confer increased pathogen resistance are provided. In some embodiments, the polypeptide is SEQ ID NO: 5 or a fragment or variant of SEQ ID NO: 5. In some embodiments, the polynucleotide is any one of SEQ ID NOS: 1, 2-4, 11 and 12 or a polynucleotide encoding a polypeptide having the sequence of SEQ ID NO: 5, or a fragment or variant of any one thereof. In various embodiments, the polypeptides and polynucleotides, or variants and fragments thereof, confer increased resistance to Asian Soy Rust.
Fragments of the polypeptides that increase pathogen resistance when expressed in a plant, plant part, or seed include those that are shorter than the full-length sequences, either due to the use of an alternate downstream start site, or due to processing that produces a shorter protein having the activity. Such biologically active portions can be prepared by recombinant techniques and evaluated for activity of being able to confer increased pathogen resistance.
Variants disclosed herein include polypeptides having an amino acid sequence that has at least 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 86%, about 87%, about 88%, about 89%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98% or about 99% identity to the amino acid sequence of SEQ ID NO: 5. Similarly, variants disclosed herein include polynucleotides having a nucleotide sequence that has at least 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 86%, about 87%, about 88%, about 89%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98% or about 99% identity to the nucleotide sequence of any of SEQ ID NOS: 1, 2-4 and 11-12. Such variants will increase pathogen resistance when expressed in a plant, plant part or seed. In some embodiments, a variant polynucleotide/polypeptide comprises a deletion and/or addition of one or more nucleotides/amino acids at one or more internal sites within the native polynucleotide/polypeptide and/or a substitution of one or more nucleotides/amino acids at one or more sites in the native polynucleotide/polypeptide.
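By way of non-limiting illustration, a percent identity value of the kind recited above can be computed from two sequences that have already been aligned. The sketch below is an assumption-laden example: the disclosure at this point does not specify an alignment algorithm or how gap positions are counted, so the function names `percent_identity` and `meets_threshold`, the use of `-` as the gap character, and the choice of alignment columns as the denominator are all hypothetical. The toy sequences are not taken from the Sequence Listing.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the columns of a pre-computed pairwise alignment.

    A column counts as identical only when both sequences carry the same
    residue. Columns that are gaps in both sequences are ignored; a gap
    opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / columns


def meets_threshold(aligned_a, aligned_b, threshold):
    """True when the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy aligned peptides, invented for illustration only.
reference = "MKTAYIAK"
candidate = "MKTA-IAR"
print(percent_identity(reference, candidate))
print(meets_threshold(reference, candidate, 60.0))
```

Other conventions (for example, dividing by the length of the shorter sequence) would give different values for the same pair, which is why the denominator is stated explicitly here.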
In some embodiments, the polypeptides disclosed herein may comprise a heterologous amino acid sequence attached thereto. For example, a polypeptide may have a polypeptide tag or additional protein domain attached thereto. The heterologous amino acid sequence can be attached to the N terminus, the C terminus, or internally within the polypeptide. In some instances, the polypeptide may have one or more polypeptide tags and/or additional protein domains attached thereto at one or more positions of the polypeptide.
In some embodiments, the nucleic acid sequence encoding the polypeptides disclosed herein may comprise a heterologous nucleic acid sequence attached thereto. For example, the heterologous nucleic acid sequence may encode a polypeptide tag or additional protein domain that will be attached to the encoded polypeptide. As another example, the heterologous nucleic acid sequence may encode a regulatory element such as an intron, an enhancer, a promoter, a terminator, etc. The heterologous nucleic acid sequence can be positioned at the 5′ end, the 3′ end, or in-frame within the coding sequence of the polypeptide. In some instances, the nucleic acid sequence encoding the polypeptides disclosed herein may have one or more heterologous nucleic acid sequences attached thereto at one or more positions of the nucleic acid sequence. In still other embodiments, nucleotide sequences disclosed herein further comprise one or more native regulatory elements, including, for example, the native promoter sequence, the native 5′UTR, the native 3′UTR and/or the native terminator, or any combination thereof.
Polynucleotides encoding the polypeptides provided herein can be provided in expression cassettes (herein also referred to as “DNA constructs”) for expression in an organism of interest. The cassette will include 5′ and 3′ regulatory sequences operably linked to a polynucleotide encoding a polypeptide provided herein that allows for expression of the polynucleotide comprising an R-gene in plants, thereby imparting pathogen resistance to the plants in which they are expressed. The cassette may additionally contain at least one additional gene or genetic element to be co-transformed into the organism. Where additional genes or elements are included, the components are operably linked. Alternatively, the additional gene(s) or element(s) can be provided on multiple expression cassettes. Such an expression cassette is provided with a plurality of restriction sites and/or recombination sites for insertion of the polynucleotides to be under the transcriptional regulation of the regulatory elements or regions. The expression cassette may additionally contain a selectable marker gene.
“DNA construct” refers to the genetic elements operably linked to each other making up a recombinant DNA molecule and may comprise elements that provide expression of a DNA polynucleotide molecule in a host cell and elements that provide maintenance of the construct in the host cell. The various genetic elements within the DNA construct can be native to the polynucleotide encoding the polypeptide or heterologous to the native polynucleotide encoding the polypeptide. A plant expression cassette comprises the operable linkage of genetic elements that when transferred into a plant cell provides expression of a desirable gene product. “Plant expression cassette” refers to chimeric DNA segments comprising the regulatory elements that are operably linked to provide the expression of a transgene product in plants. Promoters, leaders, introns, transit peptide encoding polynucleic acids, and 3′ transcriptional termination regions are all genetic elements that may be operably linked by those skilled in the art of plant molecular biology to provide a desirable level of expression or functionality to an R-gene of the present invention. A DNA construct can contain one or more plant expression cassettes expressing the DNA molecules of the present invention or other DNA molecules useful in the genetic engineering of crop plants. One example of a DNA construct that may be used for expressing the R-gene of the present invention is a vector with a nucleic acid sequence comprising the R-gene operably linked to one or more heterologous regulatory elements (e.g., a heterologous promoter and/or intron) or native regulatory elements (e.g., native promoter, introns, or terminator regions).
The expression cassette will include, in the 5′-3′ direction of transcription, a transcriptional and translational initiation region (i.e., a promoter), a polynucleotide of the invention, and a transcriptional and translational termination region (i.e., termination region) functional in the organism of interest, i.e., a plant or a bacterium. The promoters of the invention are capable of directing or driving transcription and expression of a coding sequence in a host cell. The regulatory regions (i.e., promoters, transcriptional regulatory regions, and translational termination regions) may be endogenous or heterologous to the host cell or to each other. As used herein, a chimeric gene or a chimeric nucleic acid molecule comprises a coding sequence operably linked to a transcription initiation region that is heterologous to the coding sequence.
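By way of non-limiting illustration, the 5′-3′ arrangement of an expression cassette (initiation region, polynucleotide, termination region) can be modeled as an ordered list of labeled elements. The sketch below is hypothetical throughout: the names `Element`, `KIND_ORDER` and `assemble_cassette`, the optional UTR slots, and the short placeholder sequences are assumptions made for illustration and do not correspond to any sequence of the Sequence Listing.

```python
from dataclasses import dataclass

# Element kinds allowed in this sketch, listed in 5'->3' order of transcription.
KIND_ORDER = ("promoter", "5utr", "cds", "3utr", "terminator")
REQUIRED = ("promoter", "cds", "terminator")


@dataclass(frozen=True)
class Element:
    name: str      # hypothetical label, not an identifier from the listing
    kind: str      # one of KIND_ORDER
    sequence: str  # placeholder nucleotide string


def assemble_cassette(elements):
    """Concatenate elements after checking their kinds and 5'->3' order."""
    kinds = [e.kind for e in elements]
    for kind in kinds:
        if kind not in KIND_ORDER:
            raise ValueError(f"unknown element kind: {kind}")
    for kind in REQUIRED:
        if kinds.count(kind) != 1:
            raise ValueError(f"cassette needs exactly one {kind}")
    ranks = [KIND_ORDER.index(k) for k in kinds]
    if ranks != sorted(ranks):
        raise ValueError("elements are not in 5'->3' order")
    return "".join(e.sequence for e in elements)


# Placeholder sequences invented for illustration only.
cassette = assemble_cassette([
    Element("example promoter", "promoter", "TTGACA"),
    Element("example coding sequence", "cds", "ATGGCCTAA"),
    Element("example terminator", "terminator", "AATAAA"),
])
print(cassette)
```

Because each element carries only a kind and a sequence, the same check applies whether a given regulatory element is native or heterologous to the coding sequence.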
A variety of transcriptional terminators are available for use in expression cassettes. These are responsible for the termination of transcription beyond the transgene and correct mRNA polyadenylation. The termination region may be native with the transcriptional initiation region, may be native with the operably linked DNA sequence of interest, may be native with the plant host, or may be derived from another source (i.e., foreign or heterologous to the promoter, the DNA sequence of interest, the plant host, or any combination thereof). Appropriate transcriptional terminators are those that are known to function in plants and include the CaMV 35S terminator, the tml terminator, the nopaline synthase terminator and the pea rbcS E9 terminator. These can be used in both monocotyledons and dicotyledons. Still other terminators that can be used include heterologous terminators derived from Arabidopsis genes, such as the terminators of SEQ ID NOS: 16-17. In addition, a gene's native transcription terminator may be used. In one example embodiment, the native transcription terminator of the nucleic acid (R-gene) encoding a polypeptide conferring increased pathogen resistance is used, such as the native terminator of SEQ ID NO: 18. Termination regions used in the expression cassettes can also be obtained from, e.g., the Ti-plasmid of A. tumefaciens, such as the octopine synthase and nopaline synthase termination regions. See also Guerineau et al. (1991) Mol. Gen. Genet. 262: 141-144; Proudfoot (1991) Cell 64:671-674; Sanfacon et al. (1991) Genes Dev. 5: 141-149; Mogen et al. (1990) Plant Cell 2: 1261-1272; Munroe et al. (1990) Gene 91: 151-158; Ballas et al. (1989) Nucleic Acids Res. 17:7891-7903; and Joshi et al. (1987) Nucleic Acids Res. 15:9627-9639.
Additional regulatory signals include, but are not limited to, transcriptional initiation start sites, operators, activators, enhancers, other regulatory elements, ribosomal binding sites, an initiation codon, termination signals, and the like. See, for example, U.S. Pat. Nos. 5,039,523 and 4,853,331; EPO 0480762A2; Sambrook et al. (1992) Molecular Cloning: A Laboratory Manual, ed. Maniatis et al. (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.), hereinafter “Sambrook 11”; Davis et al, eds. (1980).
In preparing the expression cassette, the various DNA fragments may be manipulated, so as to provide for the DNA sequences in the proper orientation and, as appropriate, in the proper reading frame. Toward this end, adapters or linkers may be employed to join the DNA fragments or other manipulations may be involved to provide for convenient restriction sites, removal of superfluous DNA, removal of restriction sites, or the like. For this purpose, in vitro mutagenesis, primer repair, restriction, annealing, resubstitutions, e.g., transitions and transversions, may be involved.
A variety of promoters that are constitutively active or specifically active in vegetative tissues, such as leaves, stems, roots and tubers, can be used to express the nucleic acid or R-gene of the present invention. The promoters can be selected based on the desired outcome. The nucleic acids can be combined with constitutive, inducible, tissue-preferred, or other promoters for expression in the organism of interest. See, for example, promoters set forth in WO 99/43838 and in U.S. Pat. Nos. 8,575,425; 7,790,846; 8,147,856; 8,586,832; 7,772,369; 7,534,939; 6,072,050; 5,659,026; 5,608,149; 5,608,144; 5,604,121; 5,569,597; 5,466,785; 5,399,680; 5,268,463; 5,608,142; and 6,177,611; herein incorporated by reference. In some embodiments, the promoter used to drive the expression of the polynucleotides provided herein comprises an exogenous promoter not found in plants in nature, for example, a synthetic promoter. In some embodiments, the promoter used to drive the expression of the polynucleotides provided herein comprises a heterologous promoter sourced from an organism that is different from the organism from which the R-gene is sourced. In example embodiments, the nucleic acid comprising the R-gene is expressed in soybean plants wherein the expression is driven by a plant promoter derived from Medicago truncatula, such as the promoters of SEQ ID NOS: 13-14.
In embodiments, the promoter may also optionally comprise an intron. In some embodiments, a promoter comprises or consists of the about 2 kb region upstream (5′) of the translation start site of a known or predicted coding sequence. In other embodiments, the promoter is a minimal or core promoter comprising only those elements that are required to initiate transcription. For example, a minimal promoter may consist of a transcription start site (TSS), a binding site for RNA polymerase, and a transcription factor binding site (such as a TATA box or B recognition element). Such minimal promoter may not comprise any introns or splice sites.
In some embodiments, the promoter used herein to drive the expression of the polynucleotides provided herein comprises a native promoter or an active variant or fragment thereof. For purpose of this disclosure, the term “native promoter,” used interchangeably with the term “endogenous promoter,” refers to a promoter that is found in plants in nature. An active variant or fragment of a native promoter refers to a promoter sequence that has one or more nucleotide substitutions, deletions, or insertions and that can drive expression of an operably-linked polynucleotide sequence under conditions similar to those under which the native promoter is active. Such active variants or fragments may be created by site-directed mutagenesis, induced mutation, or may occur as allelic variants (polymorphisms). In some embodiments, the native promoter comprises a polynucleotide having the sequence of SEQ ID NO: 15. In some embodiments, disclosed herein is a construct comprising a native promoter (e.g., a native promoter comprising SEQ ID NO: 15) or its active variant or fragment operably linked to a polynucleotide encoding a polypeptide having the sequence of SEQ ID NO: 5, or a fragment or variant of SEQ ID NO: 5 (e.g., having at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to the native promoter, wherein the variant or fragment thereof retains the ability to direct expression of a sequence of interest); and when introduced into a plant, the construct confers increased pathogen resistance. In some embodiments, the native promoter is a heterologous promoter to the polynucleotide.
The translation leader sequence means a DNA molecule located between the promoter of a gene and the coding sequence. The translation leader sequence is present in the fully processed mRNA upstream of the translation start sequence. The translation leader sequence may affect processing of the primary transcript to mRNA, mRNA stability or translation efficiency. Examples of translation leader sequences include maize and petunia heat shock protein leaders, plant virus coat protein leaders, plant rubisco gene leaders among others (Turner and Foster, Molecular Biotechnology 3:225, 1995).
The “3′ non-translated sequences” (or 3′ untranslated sequences or 3′-UTR) means DNA sequences located downstream of a structural polynucleotide sequence and include sequences encoding polyadenylation and other regulatory signals capable of affecting mRNA processing or gene expression. The polyadenylation signal functions in plants to cause the addition of multiple adenylate nucleotides to the 3′ end of the mRNA precursor. The polyadenylation sequence can be derived from the natural gene, from a variety of plant genes, or from T-DNA. An example of the polyadenylation sequence is the nopaline synthase 3′ sequence (nos 3′; Fraley et al., Proc. Natl. Acad. Sci. USA 80: 4803-4807, 1983). The use of different 3′ non-translated sequences is exemplified by Ingelbrecht et al., Plant Cell 1:671-680, 1989. In embodiments of the present invention, the nucleic acid encoding a polypeptide conferring increased pathogen resistance comprises the native or endogenous 3′-UTR of the R-gene, such as the 3′-UTR of SEQ ID NO: 20.
The “5′ non-translated sequences” (or 5′ untranslated sequences or 5′-UTR) means DNA sequences located upstream of an initiation codon of structural polynucleotide sequence and include sequences capable of affecting translation of an mRNA sequence. The 5′-UTR sequence is also referred to as a leader sequence. In different organisms, the 5′-UTR may remain untranslated, and form complex secondary structures to regulate translation of the downstream sequence. The leader sequence can be derived from the natural gene or from a variety of plant genes. In embodiments of the present invention, the nucleic acid encoding a polypeptide conferring increased pathogen resistance comprises the native or endogenous 5′-UTR of the R-gene, such as the 5′-UTR of SEQ ID NO: 19.
As used herein, the term “intron” refers to a nucleotide sequence provided within a gene (that is, in an intragenic region) and that is removed by splicing during maturation of a final RNA product. Thus, introns are non-coding regions of an RNA transcript, or the DNA encoding it. Introns may have regulatory function, such as due to the presence of transcriptional enhancer or repressor sequences embedded therein. Example introns include introns derived from Arabidopsis genes, such as the intron of SEQ ID NO: 21. Introns separate exons such that splicing results in removal of introns and joining of exons. Introns are marked by the presence of conserved sequences known as splice sites at 5′ and 3′ ends. Typically, the splice site at the 5′ end includes a GU sequence and the splice site at the 3′ end includes an AG sequence. Splicing of the introns is catalyzed by a spliceosome comprising RNA and proteins. Promoter sequences can, in some embodiments (e.g., for larger promoter sequences such as those for proximal or distal promoters), include an intron. In other embodiments, as described herein, introns may be optionally coupled to a minimal or core promoter sequence, as well as to a nucleic acid encoding a protein of interest, to improve expression of the protein.
Thus, in embodiments, novel regulatory elements are disclosed for the expression of polynucleotides and polypeptides in a plant cell. In particular embodiments, the novel regulatory elements are used for the expression of polynucleotides and polypeptides that when expressed in a plant cell confer the plant with increased pathogen resistance, such as to ASR. The novel regulatory elements include native promoters comprising the nucleotide sequence of SEQ ID NO: 15, or an active variant or fragment of SEQ ID NO: 15 (e.g., having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO: 15 and retaining the ability to drive expression of an operably linked polynucleotide of interest). In other embodiments, the promoter set forth in SEQ ID NO: 15 is operably linked to a polynucleotide encoding the polypeptide of interest. In specific embodiments, the polynucleotide sequence of interest encodes a polypeptide that upon expression in a plant cell increases resistance of the plant cell to a plant pathogen, such as ASR. Such polypeptides include, but are not limited to, the polypeptide of SEQ ID NO: 5 or an active variant or fragment thereof or polypeptides encoded by R-genes as set forth in any one of US Patent Publication No. US 20200354739 and PCT Publication Nos. WO2019103918, WO2021154632A1, WO2021022022, WO2021022026, WO2021022101, WO2021260673, and WO2021263249, the contents of each of which are herein incorporated by reference in their entirety.
The novel regulatory elements further include native terminators comprising the nucleotide sequence of SEQ ID NO: 18, or a sequence that is substantially identical to SEQ ID NO: 18 (e.g., having at least 90% or at least 95% sequence identity). The novel regulatory elements further include native 5′- and 3′-UTRs comprising the nucleotide sequence of SEQ ID NO: 19 and/or 20, or a sequence that is substantially identical to SEQ ID NOS: 19 and/or 20 (e.g., having at least 90% or at least 95% sequence identity).
The polynucleotide of the invention comprises any coding sequence that can express the novel R-gene and encode a polypeptide that confers increased pathogen resistance. In particular embodiments, the coding sequence comprises any polynucleotide that encodes a polypeptide having the amino acid sequence of SEQ ID NO: 5, or an amino acid sequence having at least 90% or at least 95% sequence identity to SEQ ID NO: 5 and conferring increased pathogen resistance when expressed. In example embodiments, the coding sequence comprises a nucleic acid having the nucleotide sequence of any of SEQ ID NOS: 1, 2-4 and 11-12, or any nucleic acid having at least 90% or at least 95% sequence identity to any of SEQ ID NOS: 1, 2-4 and 11-12.
In example embodiments, the coding sequence is a cDNA derived nucleotide sequence that comprises the exons of the R-gene spliced together and without the sequence of intervening introns, or upstream and downstream UTRs (for example, the R-gene coding sequence of SEQ ID NO: 12). In other example embodiments, the coding sequence is a genomic DNA derived nucleotide sequence that comprises the exons with any combination of intervening native introns (e.g., one or more or all intervening introns), native 5′ and 3′ UTRs, native promoters and native terminators. One of skill in the art may be able to use standard procedures in recombinant DNA technology to combine any of the coding sequence options with native or exogenous terminators, native or exogenous promoters, and any combination of native or exogenous regulatory elements in the expression cassette.
In one embodiment, the genomic DNA derived coding sequence of the R-gene of the present invention comprises all the native introns and exons of the gene in addition to the native 5′ and 3′ UTRs and native promoters and terminators (for example, the R-gene coding sequence of SEQ ID NO: 2). In other example embodiments, one or more of the native introns are replaced with a non-native intron, such as with an intron known to enhance transformation, transcriptional, or translational activity in a host cell (for example, the R-gene coding sequences of SEQ ID NOS: 3-4 and 11-12 wherein the first native intron is replaced with the intron of SEQ ID NO: 21). In still further embodiments, the coding sequence comprises the native promoter and terminator as well as native UTRs allowing for the gene expression to be driven by the native promoter (for example, the R-gene coding sequence of SEQ ID NO: 11). In other embodiments, the coding sequence comprises the native UTRs but not the native promoter or terminator to allow for gene expression to be driven by a heterologous promoter (for example, the R-gene coding sequences of SEQ ID NOS: 3-4).
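By way of non-limiting illustration, the two coding sequence options described above (exons joined without intervening introns, and a genomic sequence in which one native intron is replaced with a non-native intron) can be sketched as simple string operations on exon coordinates. The function names `cdna_like` and `replace_intron`, the 0-based half-open coordinate convention, and the toy sequence are assumptions made for illustration; the toy sequence is invented and does not represent any sequence of the Sequence Listing.

```python
def _check_exons(genomic, exons):
    """Require sorted, non-overlapping (start, end) exon spans within range."""
    if not exons:
        raise ValueError("at least one exon is required")
    previous_end = 0
    for start, end in exons:
        if not (previous_end <= start < end <= len(genomic)):
            raise ValueError("exons must be sorted, non-overlapping and in range")
        previous_end = end


def cdna_like(genomic, exons):
    """Join the exons in order, dropping introns and flanking sequence."""
    _check_exons(genomic, exons)
    return "".join(genomic[start:end] for start, end in exons)


def replace_intron(genomic, exons, intron_index, new_intron):
    """Swap the intron between exon i and exon i+1 for a different intron.

    Everything outside the replaced intron, including flanking sequence,
    is kept unchanged.
    """
    _check_exons(genomic, exons)
    if not 0 <= intron_index < len(exons) - 1:
        raise IndexError("no such intron")
    intron_start = exons[intron_index][1]
    intron_end = exons[intron_index + 1][0]
    return genomic[:intron_start] + new_intron + genomic[intron_end:]


# Toy gene: upper case marks exons, lower case marks introns (placeholders).
toy_gene = "ATGGCC" + "nnnnnn" + "TTTGGA" + "nnnnnn" + "CCCTAA"
toy_exons = [(0, 6), (12, 18), (24, 30)]

print(cdna_like(toy_gene, toy_exons))
print(replace_intron(toy_gene, toy_exons, 0, "xxxx"))
```

In this sketch, replacing the intron at index 0 corresponds to replacing the first native intron, while the remaining introns and exons are left as they were.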
The laboratory procedures in recombinant DNA technology used herein are those well-known and commonly employed in the art. Standard techniques are used for cloning, DNA and RNA isolation, amplification and purification. Generally enzymatic reactions involving DNA ligase, DNA polymerase, restriction endonucleases and the like are performed according to the manufacturer's specifications. These techniques and various other techniques are generally performed according to Sambrook et al. (1989).
The nucleic acids, polynucleotides, nucleotide sequences, R-genes, vectors and DNA constructs of the present invention may be introduced into the genome of a desired plant host by a variety of conventional transformation techniques that are well known to those skilled in the art. “Transformation” refers to a process of stably introducing an exogenous nucleic acid molecule (for example, a DNA construct, a vector, an expression cassette, or a recombinant polynucleic acid molecule) into a cell or protoplast and that exogenous nucleic acid molecule is stably incorporated into a host cell genome or an organelle genome (for example, chloroplast or mitochondria) or is capable of autonomous replication. “Transformed” or “transgenic” refers to a cell, tissue, organ, or organism into which a foreign polynucleic acid, such as a DNA vector or recombinant polynucleic acid molecule, is incorporated and maintained. Further, once stably transformed into a cell, the foreign polynucleic acid can be passed on to a progeny of the cell. A “transgenic”, “transformed”, or “stably transformed” cell or organism also includes progeny of the cell or organism and progeny produced from a breeding program employing such a “transgenic” plant as a parent in a cross and exhibiting an altered phenotype resulting from the presence of the foreign polynucleic acid molecule.
Methods of transformation of plant cells or tissues include but are not limited to the Agrobacterium-mediated transformation method and the Biolistics or particle-gun mediated transformation method. Suitable plant transformation vectors for the purpose of Agrobacterium-mediated transformation include those elements derived from a tumor inducing (Ti) plasmid of Agrobacterium tumefaciens, for example, right border (RB) regions and left border (LB) regions, and others disclosed by Herrera-Estrella et al., Nature 303:209 (1983); Bevan, Nucleic Acids Res. 12:8711-8721 (1984); Klee et al., Bio/Technology 3(7):637-642 (1985). In addition to plant transformation vectors derived from the Ti or root-inducing (Ri) plasmids of Agrobacterium, alternative methods can be used to insert the DNA constructs of this invention into plant cells. Such methods may involve, but are not limited to, for example, the use of liposomes, electroporation, chemicals that increase free DNA uptake, free DNA delivery via microprojectile bombardment, and transformation using viruses or pollen.
DNA constructs, vectors, and expression cassettes can be prepared that incorporate the R-gene coding sequences of the present invention for use in directing the expression of the sequences directly from the host plant cell plastid. Examples of constructs suitable for this purpose, and methods of using them, are known in the art and are generally described, for example, in Svab et al., Proc. Natl. Acad. Sci. USA 87:8526-8530 (1990) and Svab et al., Proc. Natl. Acad. Sci. USA 90:913-917 (1993) and in U.S. Pat. No. 5,693,507.
When adequate numbers of cells containing the exogenous nucleic acid molecule encoding polypeptides from the present invention are obtained, the cells can be cultured, then regenerated into whole plants. “Regeneration” refers to the process of growing a plant from a plant cell (for example, plant protoplast or explant). Such regeneration techniques rely on manipulation of certain phytohormones in a tissue culture growth medium, typically relying on a biocide and/or herbicide marker that has been introduced together with the desired nucleotide sequences. Choice of methodology for the regeneration step is not critical. See, for example, Ammirato et al., Handbook of Plant Cell Culture-Crop Species. Macmillan Publ. Co. (1984); Shimamoto et al., Nature 338:274-276 (1989); Fromm, UCLA Symposium on Molecular Strategies for Crop Improvement, Apr. 16-22, 1990. Keystone, Colo. (1990); Vasil et al., Bio/Technology 8:429-434 (1990); Vasil et al., Bio/Technology 10:667-674 (1992); Hayashimoto, Plant Physiol. 93:857-863 (1990); and Datta et al., Bio/Technology 8:736-740 (1990). Such regeneration techniques are described generally in Klee et al., Ann. Rev. Plant Phys. 38:467-486 (1987).
The development or regeneration of transgenic plants containing the exogenous polynucleic acid molecule that encodes a polypeptide of interest is well known in the art. Preferably, the regenerated plants are self-pollinated to provide homozygous transgenic plants, as discussed above. Otherwise, pollen obtained from the regenerated plants is crossed to seed-grown plants of agronomically important lines. Conversely, pollen from plants of these important lines is used to pollinate regenerated plants.
In certain embodiments, the polynucleotides of the invention encoding a protein conferring enhanced pathogen resistance can be stacked with any combination of polynucleotide sequences of interest in order to create plants with a desired trait. A trait, as used herein, refers to the phenotype derived from a particular sequence or groups of sequences. For example, the polynucleotides encoding the novel R gene may be stacked with any other polynucleotides encoding polypeptides that confer a desirable trait, including but not limited to resistance to diseases, insects, and herbicides, tolerance to heat and drought, reduced time to crop maturity, improved industrial processing, such as for the conversion of starch or biomass to fermentable sugars, and improved agronomic quality, such as high oil content and high protein content.
In a particular embodiment of the invention, polynucleotides may be stacked (or, alternatively, multiple expression cassettes may be stacked on a single polynucleotide) so as to express more than one R-gene within a plant. This is a particular advantage where, for example, one R-gene is particularly suitable for providing resistance to one class of plant pathogens (e.g., a first rust isolate) while the other provides resistance to a different class of plant pathogens (or a different rust isolate). In alternate embodiments, a first R-gene is provided that provides resistance via a first mode of action against a plant pathogen (e.g., against ASR) while the other provides resistance to the same plant pathogen via a second, different mode of action. Herein, the synergistic effect of the different modes of action of the different R-genes can provide a higher increase in overall pathogen resistance than either R-gene by itself. Stacking polypeptides encoded by different R-genes is also an advantage where one polypeptide expresses inherent pathogen-resistance but is somewhat labile.
Exemplary polynucleotides encoding proteins that confer increased pathogen resistance that may be stacked with polynucleotides of the invention include polynucleotides encoding proteins that confer increased ASR resistance as described in US Patent Publication No. US 20200354739 and PCT Publication Nos. WO2019103918, WO2021154632A1, WO2021022022, WO2021022026, WO2021022101, WO2021260673, and WO2021263249, each of which is incorporated by reference in its entirety.
Exemplary polynucleotides that may be stacked with polynucleotides of the invention encoding a novel R-gene include polynucleotides encoding polypeptides conferring resistance to pests/pathogens such as viruses, nematodes, insects or fungi, and the like. Exemplary polynucleotides that may be stacked with polynucleotides of the invention include polynucleotides encoding: polypeptides having pesticidal and/or insecticidal activity, such as other Bacillus thuringiensis toxic proteins (described in U.S. Pat. Nos. 5,366,892; 5,747,450; 5,737,514; 5,723,756; 5,593,881; and Geiser et al. (1986) Gene 48:109), lectins (Van Damme et al. (1994) Plant Mol. Biol. 24:825), pentin (described in U.S. Pat. No. 5,981,722), and the like; traits desirable for disease or herbicide resistance (e.g., fumonisin detoxification genes (U.S. Pat. No. 5,792,931); avirulence and disease resistance genes (Jones et al. (1994) Science 266:789; Martin et al. (1993) Science 262:1432; Mindrinos et al. (1994) Cell 78:1089); a gene encoding an aryloxyalkanoate dioxygenase conferring resistance to certain classes of auxin and acetyl-CoA carboxylase herbicides (e.g., in PCT Publication Nos. WO 2008/141154, WO 2007/053482 or a tfdA gene giving resistance to 2,4-D in U.S. Pat. No. 6,153,401); a gene encoding a dicamba monooxygenase (Behrens et al. (2007) Science, 316, 1185) conferring resistance to dicamba; a gene encoding a homogentisate solanesyltransferase (HST) conferring resistance to HST-inhibiting herbicides (PCT Publication No. WO 2010/029311); a gene encoding a nitrilase conferring resistance to a nitrile-containing herbicide (e.g., the bxnA bromoxynil nitrilase); acetolactate synthase (ALS) mutants that lead to herbicide resistance such as the S4 and/or Hra mutations; glyphosate resistance (e.g., 5-enolpyruvylshikimate-3-phosphate synthase (EPSPS) gene, described in U.S. Pat. Nos. 4,940,935 and 5,188,642; or the glyphosate N-acetyltransferase (GAT) gene, described in Castle et al. (2004) Science, 304:1151-1154; and in U.S. Patent Application Publication Nos. 20070004912, 20050246798, and 20050060767)); glufosinate resistance (e.g., phosphinothricin acetyl transferase genes PAT and BAR, described in U.S. Pat. Nos. 5,561,236 and 5,276,268); a cytochrome P450 or variant thereof that confers herbicide resistance or tolerance to, inter alia, HPPD herbicides (U.S. Patent Application Publication No. 20090011936; U.S. Pat. Nos. 6,380,465; 6,121,512; 5,349,127; 6,649,814; and 6,300,544; and PCT Publication No. WO 2007/000077); and traits desirable for processing or process products such as high oil (e.g., U.S. Pat. No. 6,232,529); modified oils (e.g., fatty acid desaturase genes (U.S. Pat. No. 5,952,544; PCT Publication No. WO 94/11516)); modified starches (e.g., ADPG pyrophosphorylases (AGPase), starch synthases (SS), starch branching enzymes (SBE), and starch debranching enzymes (SDBE)); and polymers or bioplastics (e.g., U.S. Pat. No. 5,602,321; beta-ketothiolase, polyhydroxybutyrate synthase, and acetoacetyl-CoA reductase (Schubert et al. (1988) J. Bacteriol. 170:5837-5847) facilitate expression of polyhydroxyalkanoates (PHAs)).
These stacked combinations can be created by any method including, but not limited to, cross-breeding plants by any conventional or TopCross methodology, or genetic transformation. If the sequences are stacked by genetically transforming the plants, the polynucleotide sequences of interest can be combined at any time and in any order. For example, a transgenic plant comprising one or more desired traits can be used as the target to introduce further traits by subsequent transformation. The traits can be introduced simultaneously in a co-transformation protocol with the polynucleotides of interest provided by any combination of transformation cassettes. For example, if two sequences will be introduced, the two sequences can be contained in separate transformation cassettes (trans) or contained on the same transformation cassette (cis). Expression of the sequences can be driven by the same promoter or by different promoters. In certain cases, it may be desirable to introduce a transformation cassette that will suppress the expression of the polynucleotide of interest. This may be combined with any combination of other suppression cassettes or overexpression cassettes to generate the desired combination of traits in the plant. It is further recognized that polynucleotide sequences can be stacked at a desired genomic location using a site-specific recombination system. See, for example, PCT Publication Nos. WO 99/25821, WO 99/25854, WO 99/25840, WO 99/25855, and WO 99/25853.
In the plants provided herein, the polynucleotide as described earlier in this disclosure is a heterologous nucleic acid sequence in the genome of the plant. As used herein, the term “heterologous” in the context of a chromosomal segment refers to one or more DNA sequences (e.g., genetic loci) in a configuration in which they are not found in nature, for example as a result of a recombination event between homologous chromosomes during meiosis, or for example as a result of introduction of a transgenic sequence, or for example as a result of modification through gene editing.
Although soybean plants are used to exemplify the composition and methods throughout the application, a polynucleotide as provided herein may be introduced to any plant species, including, but not limited to, monocots and dicots. Examples of plants of interest include, but are not limited to, corn (maize), sorghum, wheat, sunflower, tomato, crucifers, peppers, potato, cotton, rice, soybean, sugarbeet, sugarcane, tobacco, barley, and oilseed rape, Brassica sp., alfalfa, rye, millet, safflower, peanuts, sweet potato, cassava, coffee, coconut, pineapple, citrus trees, cocoa, tea, banana, avocado, fig, guava, mango, olive, papaya, cashew, macadamia, almond, oats, vegetables, ornamentals, and conifers.
Glycine (soybean or soya bean) is a genus in the bean family Fabaceae. The Glycine plants can be Glycine arenaria, Glycine argyrea, Glycine cyrtoloba, Glycine canescens, Glycine clandestina, Glycine curvata, Glycine falcata, Glycine latifolia, Glycine microphylla, Glycine pescadrensis, Glycine stenophita, Glycine syndetica, Glycine soja Sieb. et Zucc., Glycine max (L.) Merrill, Glycine tabacina, or Glycine tomentella.
In some embodiments, the plants provided herein are elite plants or derived from an elite line.
As used herein, an “elite line” is an agronomically superior line that has resulted from many cycles of breeding and selection for superior agronomic performance. Numerous elite lines are available and known to those of skill in the art of soybean breeding. An “elite population,” is an assortment of elite individuals or lines that can be used to represent the state of the art in terms of agronomically superior genotypes of a given crop species, such as soybean. Similarly, an “elite germplasm” or elite strain of germplasm is an agronomically superior germplasm, typically derived from, and/or can give rise to, a plant with superior agronomic performance, such as an existing or newly developed elite line of soybean.
An “elite” plant is any plant from an elite line, such that an elite plant is a representative plant from an elite variety. In some embodiments, the soybean plant comprising a polynucleotide encoding any one of the polypeptides disclosed herein is an elite soybean plant. Non-limiting examples of elite soybean varieties that are commercially available to farmers or soybean breeders include: AG00802, A0868, AG0902, A1923, AG2403, A2824, A3704, A4324, A5404, AG5903, AG6202 AG0934; AG1435; AG2031; AG2035; AG2433; AG2733; AG2933; AG3334; AG3832; AG4135; AG4632; AG4934; AG5831; AG6534; and AG7231 (Asgrow Seeds, Des Moines, Iowa, USA); BPR0144RR, BPR 4077NRR and BPR 4390NRR (Bio Plant Research, Camp Point, Ill., USA); DKB 17-51 and DKB37-51 (DeKalb Genetics, DeKalb, Ill., USA); DP 4546 RR, and DP 7870 RR (Delta & Pine Land Company, Lubbock, Tex., USA); JG 03R501, JG 32R606C ADD and JG 55R503C (JGL Inc., Greencastle, Ind., USA); NKS 13-K2 (NK Division of Syngenta Seeds, Golden Valley, Minnesota, USA); 90M01, 91M30, 92M33, 93M11, 94M30, 95M30, 97B52, P008T22R2; P16T17R2; P22T69R; P25T51R; P34T07R2; P35T58R; P39T67R; P47T36R; P46T21R; and P56T03R2 (Pioneer Hi-Bred International, Johnston, Iowa, USA); SG4771NRR and SG5161NRR/STS (Soygenetics, LLC, Lafayette, Ind., USA); S00-K5, S11-L2, S28-Y2, S43-B1, S53-A1, S76-L9, S78-G6, S0009-M2; S007-Y4; S04-D3; S14-A6; S20-T6; S21-M7; S26-P3; S28-N6; S30-V6; S35-C3; S36-Y6; S39-C4; S47-K5; S48-D9; S52-Y2; S58-Z4; S67-R6; S73-S8; and S78-G6 (Syngenta Seeds, Henderson, Ky., USA); Richer (Northstar Seed Ltd. Alberta, CA); 14RD62 (Stine Seed Co. Ia., USA); or Armor 4744 (Armor Seed, LLC, Ar., USA).
A disease resistant soybean plant or germplasm of the present invention may be produced by any method whereby an R-gene of the present invention is introduced into the soybean plant or germplasm, including, but not limited to, transformation, protoplast transformation or fusion, a double haploid technique, embryo rescue, gene editing, conventional breeding, and/or by any other nucleic acid transfer system.
In some embodiments, the soybean plant or germplasm comprises a non-naturally occurring variety of soybean. In some embodiments, the soybean plant or germplasm is at least about 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identical to that of an elite variety of soybean.
The disease resistant soybean plant or germplasm may be the progeny of a cross between an elite variety of soybean and a variety of soybean that comprises an R-gene for enhanced disease tolerance or resistance (e.g., ASR) wherein the R-gene is a novel gene encoding a protein that confers increased pathogen resistance; an R-gene that is substantially identical to any of SEQ ID NOs: 1, 2-4, 11-12; or an R-gene encoding a polypeptide that is substantially identical to SEQ ID NO: 5 while conferring or enhancing pathogen resistance (e.g., ASR resistance) in the plant. In many examples, alternative embodiments of the R-gene will have at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% homology to any of SEQ ID NOs: 1, 2-4, and 11-12. In many examples, alternative embodiments of the R-gene will have at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% homology to a nucleic acid molecule encoding the polypeptide of SEQ ID NO: 5, or a nucleic acid molecule encoding a polypeptide having at least 90% homology to SEQ ID NO: 5, while providing ASR resistance in the plant. In many examples, the polypeptide encoded by the R-gene will have at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% homology to SEQ ID NO: 5.
In particular embodiments, the disease resistant soybean plant or germplasm may be the progeny of a cross between an elite variety of soybean and a variety of soybean that comprises an R-gene for enhanced disease tolerance (e.g., ASR) wherein the R-gene is a novel gene encoding a protein conferring enhanced pathogen resistance, wherein the R-gene is substantially identical to any of SEQ ID NOs: 1, 2-4, 11-12, or an R-gene encoding a polypeptide that is substantially identical to SEQ ID NO: 5 while conferring ASR resistance in the plant. In many examples, alternative embodiments of the R-gene will have at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% homology to any of SEQ ID NOs: 1, 2-4, 11-12. In many examples, alternative embodiments of the R-gene will have at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% homology to a nucleotide sequence encoding the polypeptide of SEQ ID NO: 5. In many examples, the polypeptide encoded by the R-gene of SEQ ID NOs: 1, 2-4, and 11-12 will have at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% homology to SEQ ID NO: 5 while conferring ASR resistance.
The disease resistant soybean plant or germplasm may be the progeny of an introgression wherein the recurrent parent is an elite variety of soybean and the donor comprises an R-gene associated with enhanced disease tolerance and/or resistance wherein the donor carries an R-gene having substantial identity to any of SEQ ID NOs: 1, 2-4, and 11-12 or an R-gene encoding a polypeptide having substantial identity to SEQ ID NO: 5. In some embodiments, the plant will comprise an R-gene having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% homology to any of SEQ ID NOs: 1, 2-4, 11-12 and increased tolerance to ASR as compared to a plant not comprising the R-gene. In some embodiments, the plant will comprise an R-gene encoding a protein having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% homology to SEQ ID NO: 5 and increased tolerance to ASR as compared to a plant not comprising the R-gene.
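By way of illustration only, the percentage thresholds recited above can be checked computationally once two sequences have been aligned. The following is a minimal sketch, assuming the two sequences have already been aligned to equal length by a separate alignment tool (with "-" marking gaps); the function names and the 90% default are illustrative assumptions and are not part of the disclosure.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Assumption: both inputs are already aligned (equal length, '-' for gaps).
    Columns where both sequences have a gap are ignored; a gap opposite a
    residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # uninformative column
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 90.0) -> bool:
    """True if the aligned pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, two aligned ten-base sequences differing at a single position would be scored at 90% and would satisfy a 90% threshold but not a 95% threshold.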
The disease resistant soybean plant or germplasm may be the progeny of a cross between a first elite variety of soybean (e.g., a tester line) and the progeny of a cross between a second elite variety of soybean (e.g., a recurrent parent) and a variety of soybean that comprises an R-gene.
A disease resistant soybean plant and germplasm of the present invention may comprise one or more R-genes of the present invention (e.g., any of SEQ ID NOs: 1, 2, 3, 4, 6, 7, 11 and 12).
In some embodiments, the plants provided herein can comprise one or more additional polynucleotides that encode an additional polypeptide that can confer a phenotype of increased pathogen resistance.
In specific embodiments, the plants, plant parts or seeds having the heterologous polynucleotide or polypeptide disclosed herein or active variants and fragments thereof can have a modified level of expression of the polynucleotide or polypeptide (i.e., an increase or a decrease in expression level). In other embodiments, the plants, plant parts or seeds having the heterologous polynucleotide or polypeptide disclosed herein or active variants and fragments thereof can have a modified level of activity of the polypeptide (i.e., an increase or a decrease in activity level). Methods to generate such modified levels of expression or activity are disclosed elsewhere herein and include, but are not limited to, breeding, gene editing, and transgenic techniques.
Plants produced as described above can be propagated to produce progeny plants, and the progeny plants that have stably incorporated into their genomes a polynucleotide conferring increased pathogen resistance can be selected and can be further propagated if desired. The term “progeny” refers to the descendant(s) of a particular cross. Typically, progeny result from breeding of two individuals, although some species (particularly some plants and hermaphroditic animals) can be selfed (i.e., the same plant acts as the donor of both male and female gametes). The descendant(s) can be, for example, of the F1, the F2, or any subsequent generation.
In addition to the phenotypic traits, the genetic characteristic of the plant as represented by its genetic marker profile can be used to select plants of desired traits. The term “marker-based selection” refers to the use of genetic markers to detect one or more nucleic acids from the plant, where the nucleic acid is associated with a desired trait to identify plants that carry genes for desirable (or undesirable) traits. Markers include but are not limited to Restriction Fragment Length Polymorphisms (RFLPs), Randomly Amplified Polymorphic DNAs (RAPDs), Arbitrarily Primed Polymerase Chain Reaction (AP-PCR), DNA Amplification Fingerprinting (DAF), Sequence Characterized Amplified Regions (SCARs), Amplified Fragment Length Polymorphisms (AFLPs), Simple Sequence Repeats (SSRs) which are also referred to as Microsatellites, and Single Nucleotide Polymorphisms (SNPs). There are known sets of public markers that are being examined by ASTA and other industry groups for their applicability in standardizing determinations of what constitutes an essentially derived variety under the US Plant Variety Protection Act. However, these standard markers do not limit the type of marker and marker profile which can be employed in breeding or developing backcross conversions, or in distinguishing varieties or plant parts or plant cells or verify a progeny pedigree. Primers and PCR protocols for assaying these and other markers are disclosed in the Soybase (sponsored by the USDA Agricultural Research Service and Iowa State University) located at the world wide web at 129.186.26.94/SSR.html.
In one embodiment, the markers used to identify the plants comprising the polynucleotides disclosed herein are SNPs. Non-limiting examples of SNP genotyping methods include hybridization, primer extension, oligonucleotide ligation, nuclease cleavage, minisequencing and coded spheres. Such methods are well known and disclosed in e.g., Gut, I. G., Hum. Mutat. 17: 475-492 (2001); Shi, Clin. Chem. 47(2): 164-172 (2001); Kwok, Pharmacogenomics 1(1): 95-100 (2000); and Bhattramakki and Rafalski, Discovery and application of single nucleotide polymorphism markers in plants, in PLANT GENOTYPING: THE DNA FINGERPRINTING OF PLANTS, CABI Publishing, Wallingford (2001). A wide range of commercially available technologies utilize these and other methods to interrogate SNPs, including Masscode™ (Qiagen, Germantown, MD), (Hologic, Madison, WI), (Applied Biosystems, Foster City, CA), (Applied Biosystems, Foster City, CA) and Beadarrays™ (Illumina, San Diego, CA).
In some embodiments, an assay (e.g., generally a two-step allelic discrimination assay or similar), a KASP™ assay (generally a one-step allelic discrimination assay defined below or similar), or both can be employed to identify the SNPs that associate with increased pathogen resistance. In an exemplary two-step assay, a forward primer, a reverse primer, and two assay probes that recognize two different alleles at the SNP site (or hybridization oligos) are employed. The forward and reverse primers are employed to amplify genetic loci that comprise SNPs that are associated with increased pathogen resistance. The particular nucleotides that are present at the SNP positions are then assayed using the probes. In some embodiments, the assay probes and the reaction conditions are designed such that an assay probe will only hybridize to the reverse complement of a 100% perfectly matched sequence, thereby permitting identification of which allele(s) are present based upon detection of hybridizations. In some embodiments, the probes are differentially labeled with, for example, fluorophores to permit distinguishing between the two assay probes in a single reaction. Exemplary methods of amplifying include employing a polymerase chain reaction (PCR) or ligase chain reaction (LCR) using a nucleic acid isolated from a soybean plant or germplasm as a template in the PCR or LCR.
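As a non-limiting illustration of how signals from two differentially labeled probes might be converted into a genotype call, the following minimal sketch assigns a call from two normalized fluorescence values. The function name, allele labels, and both cutoff values are hypothetical assumptions chosen for illustration; actual instruments and software apply their own clustering and calling rules.

```python
def call_genotype(signal_1: float, signal_2: float,
                  allele_1: str = "T", allele_2: str = "C",
                  min_total: float = 0.2, het_band: float = 0.35) -> str:
    """Call a biallelic SNP genotype from two probe signals.

    signal_1 / signal_2: normalized fluorescence for the probes matching
    allele_1 / allele_2. min_total and het_band are arbitrary illustrative
    cutoffs (assumptions), not values taken from any assay specification.
    """
    total = signal_1 + signal_2
    if total < min_total:
        return "no call"  # amplification too weak to interpret
    fraction_1 = signal_1 / total
    if fraction_1 >= 1.0 - het_band:
        return f"{allele_1}/{allele_1}"  # only probe 1 hybridized appreciably
    if fraction_1 <= het_band:
        return f"{allele_2}/{allele_2}"  # only probe 2 hybridized appreciably
    return f"{allele_1}/{allele_2}"      # both probes hybridized
```

Under these assumed cutoffs, a sample with a strong signal from only one probe is called homozygous for the corresponding allele, and a sample with comparable signals from both probes is called heterozygous.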
In some embodiments, a number of SNP alleles together within a sequence, or across linked sequences, can be used to describe a haplotype for any particular genotype. Ching et al., BMC Genet. 3: 19 (2002) (14 pages); Gupta et al., (2001) Curr Sci. 80:524-535, Rafalski, Plant Sci. 162: 329-333 (2002). In some cases, haplotypes can be more informative than single SNPs and can be more descriptive of any particular genotype. For example, a single SNP may be allele “T” for a specific disease resistant line or variety, but the allele “T” might also occur in the soybean breeding population being utilized for recurrent parents. In this case, a combination of alleles at linked SNPs may be more informative. Once a unique haplotype has been assigned to a donor chromosomal region, that haplotype can be used in that population or any subset thereof to determine whether an individual has a particular gene. The use of automated high throughput marker detection platforms known to those of ordinary skill in the art makes this process highly efficient and effective.
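The point that a haplotype can be more informative than a single SNP can be sketched as follows. In this minimal example, the marker names and alleles are invented for illustration and do not correspond to any marker of the disclosure; genotype calls are assumed to be homozygous single-allele calls.

```python
from typing import Mapping

# Hypothetical donor haplotype across three linked SNPs (illustrative only).
DONOR_HAPLOTYPE = {"snp_1": "T", "snp_2": "G", "snp_3": "A"}


def carries_haplotype(calls: Mapping[str, str],
                      haplotype: Mapping[str, str]) -> bool:
    """True only if every marker in the haplotype has the expected allele."""
    return all(calls.get(marker) == allele
               for marker, allele in haplotype.items())


# A donor-derived individual matches at all three linked markers.
donor_like = {"snp_1": "T", "snp_2": "G", "snp_3": "A"}
# A recurrent-parent individual shares allele "T" at snp_1 by chance,
# but differs at a linked marker, so the haplotype test excludes it.
recurrent_like = {"snp_1": "T", "snp_2": "C", "snp_3": "A"}
```

Here the single marker snp_1 cannot distinguish the two individuals, whereas the combination of alleles at the linked markers can.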
These SNP markers can be used in a marker assisted breeding program to move traits, such as native traits or traits conferred by transgenes or traits conferred by genome editing, into a desired plant background. As used herein, the term “native trait” refers to a trait already existing in germplasm, including wild relatives of crop species, or that can be produced by recombination of existing traits. For example, progeny plants from a cross between a donor soybean plant comprising in its genome a nucleic acid sequence encoding SEQ ID NO: 1, 2-4, 11 or 12, and a recipient soybean plant not comprising said nucleic acid sequence can be screened to detect the presence of the markers associated with an increased pathogen resistance profile. Plants comprising said markers can be selected and verified for increased pathogen resistance as compared to control plants. In embodiments of the invention, plants comprising in their genome a nucleic acid sequence encoding SEQ ID NO: 1, 2-4, 11 or 12 can be identified, detected, or selected using any combination of the “favorable” SNP markers of Table 1 and/or 2.
Also provided herein are the kits and primers that can be used to introduce a polynucleotide sequence as described in this disclosure into a recipient plant or to detect a polynucleotide sequence as described in this disclosure in a plant.
In some embodiments, the kit may also comprise one or more probes having a sequence corresponding to or complementary to a sequence having 80% to 100% sequence identity with a specific region of the transgenic event or gene editing event. In some embodiments, the kit may comprise any reagent and material required to perform the assay or detection method.
In embodiments, molecular marker-based assays can be used to select parent lines for propagation and also to select progeny plants. In example embodiments, such marker-based assays may use any of the SNP markers of Table 1 and/or 2 to identify a plant with a “favorable” allele associated with the increased pathogen resistance trait. In embodiments, primer-based assays may be used to detect for the presence of an amplicon comprising the novel R-gene of the present invention. In example embodiments, such primer-based assays may use the primer pair of SEQ ID NOS: 8-9 or any of the primer pairs listed in Table 6. Further, a probe may be used to detect for the presence of the amplicon, such as using any probe listed in Table 6 or the probe of SEQ ID NO: 10.
In some embodiments, a plant cell, seed, or plant part or harvest product can be obtained from the plant produced as above and the plant cell, seed, or plant part can be screened using methods disclosed above for the evidence of stable incorporation of the polynucleotide. The term “stable incorporation” refers to the integration of a nucleic acid sequence into the genome of a plant and said nucleic acid sequence is capable of being inherited by the progeny thereof. As used herein, the term “plant part” indicates a part of a plant, including single cells and cell tissues such as plant cells that are intact in plants, cell clumps and tissue cultures from which plants can be regenerated. Examples of plant parts include, but are not limited to, single cells and tissues from pollen, ovules, zygotes, leaves, embryos, roots, root tips, anthers, flowers, flower parts, fruits, stems, shoots, cuttings, and seeds; as well as pollen, ovules, egg cells, zygotes, leaves, embryos, roots, root tips, anthers, flowers, flower parts, fruits, stems, shoots, cuttings, scions, rootstocks, seeds, protoplasts, calli, and the like.
In some embodiments, plant products can be harvested from the plant disclosed above and processed to produce processed products, such as flour, soy meal, oil, starch, and the like. These processed products are also within the scope of this invention provided that they comprise a polynucleotide or polypeptide or variant thereof disclosed herein. Other soybean plant products include but are not limited to protein concentrate, protein isolate, soybean hulls, meal, flour, oil and the whole soybean itself.
The present invention provides disease resistant soybean seeds. As discussed above, the methods of the present invention may be utilized to identify, produce and/or select a disease resistant soybean seed. In addition to the methods described above, a disease resistant soybean seed may be produced by any method whereby an R-gene is introduced into the soybean seed, including, but not limited to, transformation, protoplast transformation or fusion, a double haploid technique, embryo rescue, genetic editing (e.g. CRISPR or TALEN or MegaNucleases) and/or by any other nucleic acid transfer system.
In some embodiments, the disease resistant soybean seed comprises a non-naturally occurring variety of soybean. In some embodiments, the soybean seed is at least about 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99% identical to that of an elite variety of soybean.
The disease resistant soybean seed may be produced by a disease resistant soybean plant identified, produced or selected by the methods of the present invention.
A disease resistant soybean seed of the present invention may comprise, be selected by or produced by use of one or more novel R-genes of the present invention.
Provided herein are methods of producing a plant that has increased pathogen resistance by introducing a nucleic acid sequence encoding a polypeptide as provided herein. A nucleic acid sequence may be introduced to a plant cell by various ways, for example, by transformation, by genome modification techniques (such as by genome editing), or by breeding. In one aspect, the plant can be produced by transforming the nucleic acid sequence encoding a polypeptide disclosed above into a recipient plant. In one aspect, the method can comprise editing the genome of the recipient plant so that the resulting plant comprises a polynucleotide encoding a polypeptide disclosed above. In yet another aspect, the method can comprise increasing the expression level and/or activity of the above-mentioned proteins in a recipient plant, for example, by enhancing promoter activity or replacing the endogenous promoter with a stronger promoter. In another aspect, the method can comprise breeding a donor plant comprising a polynucleotide as described above with a recipient plant and selecting for incorporation of the polynucleotide into the recipient plant genome.
In some embodiments, the method comprises transforming a polynucleotide disclosed herein or an active variant or fragment thereof into a recipient plant to obtain a transgenic plant, and said transgenic plant has increased pathogen resistance. Expression cassettes comprising polynucleotides encoding the polypeptides as described above can be used to transform plants of interest.
As used herein, the term “transgenic” and grammatical variations thereof refer to a plant, including any part derived from the plant, such as a cell, tissue or organ, in which a heterologous nucleic acid is integrated into the genome. In specific embodiments, the heterologous nucleic acid is a recombinant construct, vector or expression cassette comprising one or more nucleic acids. In other embodiments, a transgenic plant is produced by a genetic engineering method, such as Agrobacterium transformation. Through gene technology, the heterologous nucleic acid is stably integrated into chromosomes, so that the next generation can also be transgenic. As used herein, “transgenic” and grammatical variations thereof also encompass biological treatments, which include plant hybridization and/or natural recombination.
Transformation results in a transformed plant, including whole plants, as well as plant organs (e.g., leaves, stems, roots, etc.), seeds, plant cells, propagules, embryos and progeny of the same. Plant cells can be differentiated or undifferentiated (e.g., callus, suspension culture cells, protoplasts, leaf cells, root cells, phloem cells, pollen). Transformation may result in stable or transient incorporation of the nucleic acid into the cell. “Stable transformation” is intended to mean that the nucleotide construct introduced into a host cell integrates into the genome of the host cell and is capable of being inherited by the progeny thereof. “Transient transformation” is intended to mean that a polynucleotide is introduced into the host cell and does not integrate into the genome of the host cell.
Methods for transformation typically involve introducing a nucleotide construct into a plant. In some embodiments, the transformation method is an Agrobacterium-mediated transformation. In some embodiments, the transformation method is a biolistic-mediated transformation. Transformation may also be performed by infection, transfection, microinjection, electroporation, microprojection, biolistics or particle bombardment, silica/carbon fibers, ultrasound mediated, PEG mediated, calcium phosphate co-precipitation, polycation DMSO technique, DEAE dextran procedure, Agrobacterium and viral mediated (e.g., Caulimoviruses, Geminiviruses, RNA plant viruses), liposome mediated and the like.
Transformation protocols as well as protocols for introducing polypeptides or polynucleotide sequences into plants may vary depending on the type of plant or plant cell, i.e., monocot or dicot, targeted for transformation. Methods for transformation are known in the art and include those set forth in U.S. Pat. Nos. 8,575,425; 7,692,068; 8,802,934; and 7,541,517; each of which is herein incorporated by reference. See, also, Rakoczy-Trojanowska, M. (2002) Cell Mol Biol Lett. 7:849-858; Jones et al. (2005) Plant Methods, Vol. 1, Article 5; Rivera et al. (2012) Physics of Life Reviews 9:308-345; Bartlett et al. (2008) Plant Methods 4: 1-12; Bates, G. W. (1999) Methods in Molecular Biology 111:359-366; Binns and Thomashow (1988) Annual Reviews in Microbiology 42:575-606; Christou, P. (1992) The Plant Journal 2:275-281; Christou, P. (1995) Euphytica 85: 13-27; Tzfira et al. (2004) TRENDS in Genetics 20:375-383; Yao et al. (2006) Journal of Experimental Botany 57:3737-3746; Zupan and Zambryski (1995) Plant Physiology 107:
Methods for transformation of chloroplasts are known in the art. See, for example, Svab et al. (1990) Proc. Natl. Acad. Sci. USA 87(21):8526-8530; Svab and Maliga (1993) Proc. Natl. Acad. Sci. USA 90(3):913-917; Staub and Maliga (1993) EMBO J. 12(2):601-606. The method relies on particle gun delivery of DNA containing a selectable marker and targeting of the DNA to the plastid genome through homologous recombination. Additionally, plastid transformation can be accomplished by transactivation of a silent plastid-borne transgene by tissue-preferred expression of a nuclear-encoded and plastid-directed RNA polymerase. Such a system has been reported in McBride et al. (1994) Proc. Natl. Acad. Sci. USA 91(15):7301-7305.
The cells that have been transformed may be grown into plants in accordance with conventional ways. See, for example, McCormick et al. (1986) Plant Cell Reports 5:81-84. These plants may then be grown, and either pollinated with the same transformed strain or different strains, and the resulting hybrid having constitutive expression of the desired phenotypic characteristic identified. Two or more generations may be grown to ensure that expression of the desired phenotypic characteristic is stably maintained and inherited and then seeds harvested to ensure expression of the desired phenotypic characteristic has been achieved. In this manner, the present invention provides transformed seed (also referred to as “transgenic seed”) having a nucleotide construct of the invention, for example, an expression cassette of the invention, stably incorporated into their genome.
In some embodiments, the method comprises crossing a donor plant comprising a polynucleotide encoding a polypeptide disclosed herein with a recipient plant, and the polypeptide is able to confer increased pathogen resistance in the recipient plant. As used herein, the terms “crossing” and “breeding” refer to the fusion of gametes to produce progeny (e.g., by fertilization, such as to produce seed by pollination in plants). In some embodiments, a “cross,” “breeding,” or “cross-fertilization” is fertilization of one individual by another (e.g., cross-pollination in plants). The plant disclosed herein may be a whole plant, or may be a plant cell, seed, or tissue, or a plant part such as leaf, stem, pollen, or cell that can be cultivated into a whole plant.
In some embodiments, a progeny plant created by the crossing or breeding process is repeatedly crossed back to one of its parents through a process referred to herein as “backcrossing”. In a backcrossing scheme, the “donor” parent refers to the parental plant with the desired gene or locus to be introgressed. The “recipient” parent (used one or more times) or “recurrent” parent (used two or more times) refers to the parental plant into which the gene or locus is being introgressed. For example, see Ragot, M. et al. Marker-assisted Backcrossing: A Practical Example, in Techniques et Utilisations des Marqueurs Moleculaires Les Colloques, Vol. 72, pp. 45-56 (1995); and Openshaw et al., Marker-assisted Selection in Backcross Breeding, in Proceedings of the Symposium “Analysis of Molecular Marker Data,” pp. 41-43 (1994). The initial cross gives rise to the F1 generation. The term “BC1” refers to the second use of the recurrent parent, “BC2” refers to the third use of the recurrent parent, and so on.
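The effect of repeated backcrossing on the genetic background can be approximated with a simple expectation: absent selection, each cross to the recurrent parent halves the remaining donor contribution on average. The following minimal sketch computes that expected fraction; it is an idealized average that ignores linkage around the introgressed locus and any marker-based background selection, and the function name is an illustrative assumption.

```python
def expected_recurrent_fraction(n_backcrosses: int) -> float:
    """Expected recurrent-parent genome fraction after n backcrosses.

    n_backcrosses = 0 corresponds to the F1 (initial cross), 1 to BC1,
    2 to BC2, and so on. Idealized expectation: 1 - (1/2)**(n + 1).
    """
    if n_backcrosses < 0:
        raise ValueError("n_backcrosses must be non-negative")
    return 1.0 - 0.5 ** (n_backcrosses + 1)


if __name__ == "__main__":
    for n in range(0, 5):
        label = "F1" if n == 0 else f"BC{n}"
        print(f"{label}: {expected_recurrent_fraction(n):.4f}")
```

Under this expectation the F1 carries on average one half of the recurrent-parent genome, BC1 three quarters, and BC2 seven eighths, which is why marker-based selection is often used to recover the recurrent background in fewer generations.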
In some embodiments, the donor soybean plant is a Glycine max plant. In some embodiments, the donor soybean plant is a Glycine soja plant. In some embodiments, the recipient soybean plant is an elite Glycine max plant or an elite Glycine soja plant.
In some embodiments, the polynucleotide sequences provided herein can be targeted to specific sites within the genome of a recipient plant cell. Such methods include, but are not limited to, meganucleases designed against the plant genomic sequence of interest; CRISPR-Cas9, TALENs, and other technologies for precise editing of genomes (Feng, et al. Cell Research 23: 1229-1232, 2013, WO 2013/026740); Cre-lox site-specific recombination; FLP-FRT recombination (Li et al. (2009) Plant Physiol 151:1087-1095); Bxb1-mediated integration (Yau et al. Plant J (2011) 701: 147-166); zinc-finger mediated integration (Wright et al. (2005) Plant J 44:693-705; Cai et al. (2009) Plant Mol Biol 69:699-709); homologous recombination (Lieberman-Lazarovich and Levy (2011) Methods Mol Biol: 51-65); prime editing and transposases (Anzalone, A. et al., Nat Biotechnol. 2020 July; 38(7):824-844); translocation; and inversion.
Various embodiments of the methods described herein use gene editing. In some embodiments, gene editing is used to mutagenize the genome of a plant to produce plants having one or more of the polypeptides that are able to confer increased pathogen resistance.
In some embodiments, provided herein are plants transformed with and expressing gene-editing machinery as described above, which, when crossed with a target plant, result in gene editing in the target plant. In general, gene editing may involve transient, inducible, or constitutive expression of the gene editing components or systems. Gene editing may involve genomic integration or episomal presence of the gene editing components or systems.
Gene editing generally refers to the use of a site-directed nuclease (including but not limited to CRISPR/Cas, zinc fingers, meganucleases, and the like) to cut a nucleotide sequence at a desired location. This may be to cause an insertion/deletion (“indel”) mutation (i.e., “SDN1”), a base edit (i.e., “SDN2”), or allele insertion or replacement (i.e., “SDN3”). SDN2 or SDN3 gene editing may comprise the provision of one or more recombination templates (e.g., in a vector) comprising a gene sequence of interest that can be used for homology directed repair (HDR) within the plant (i.e., to be introduced into the plant genome). In some embodiments, the gene or allele of interest is one that is able to confer to the plant an improved trait, e.g., increased pathogen resistance, increased ASR resistance, etc. The recombination template can be introduced into the plant to be edited either through transformation or through breeding with a donor plant comprising the recombination template. Breaks in the plant genome may be introduced within, upstream, and/or downstream of a target sequence. In some embodiments, a double strand DNA break is made within or near the target sequence locus. In some embodiments, breaks are made upstream and downstream of the target sequence locus, which may lead to its excision from the genome. In some embodiments, one or more single strand DNA breaks (nicks) are made within, upstream, and/or downstream of the target sequence (e.g., using a nickase Cas9 variant). Any of these DNA breaks, as well as those introduced via other methods known to one of skill in the art, may induce HDR. Through HDR, the target sequence is replaced by the sequence of the provided recombination template comprising a polynucleotide of interest. For example, any one of SEQ ID NOs: 2-4, 11, and 12, or a polynucleotide encoding a polypeptide having the sequence of SEQ ID NO: 5, may be provided on or as a template.
By designing the system such that one or more single strand or double strand breaks are introduced within, upstream, and/or downstream of the corresponding region in the genome of a plant not comprising the gene sequence of interest, this region can be replaced with the template. In some embodiments, the polynucleotide of interest is operably linked to a promoter, and the expression of the polynucleotide of interest controlled by the promoter confers increased pathogen resistance to the plant. In some embodiments, the promoter is a native promoter or an active variant or fragment thereof as described above. In some embodiments, the native promoter comprises SEQ ID NO: 15.
In some embodiments, mutations in the genes of interest described herein may be generated without the use of a recombination template via targeted introduction of DNA double strand breaks. Such breaks may be repaired through the process of non-homologous end joining (NHEJ), which can result in the generation of small insertions or deletions (indels) at the repair site. Such indels may lead to frameshift mutations causing premature stop codons or other types of loss-of-function mutations in the targeted genes.
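The effect of a small indel on a reading frame can be illustrated with a short sketch. An indel whose length is not a multiple of three shifts the frame, and the shifted frame may then contain an earlier stop codon. The coding sequence below is a made-up toy example and is not taken from the Sequence Listing; the function names are likewise illustrative.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_frameshift(reference: str, edited: str) -> bool:
    """Return True if the net indel length is not a multiple of three."""
    return (len(edited) - len(reference)) % 3 != 0


def first_stop_codon(cds: str):
    """Return the 0-based codon index of the first in-frame stop codon, or None."""
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3].upper() in STOP_CODONS:
            return i // 3
    return None


# Hypothetical toy coding sequence: ATG GCA TTG ACC GGT TAA
reference = "ATGGCATTGACCGGTTAA"
# Simulate a 1-bp deletion (an NHEJ-style indel) at position 3.
edited = reference[:3] + reference[4:]

if __name__ == "__main__":
    print("frameshift:", is_frameshift(reference, edited))
    print("first stop in reference (codon index):", first_stop_codon(reference))
    print("first stop in edited (codon index):", first_stop_codon(edited))
```

Here the single-base deletion moves the first in-frame stop codon from the end of the toy sequence to its third codon, which is the kind of premature termination described above.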
In some embodiments, gene editing may involve transient, inducible, or constitutive expression of the gene editing components or systems in the target plant. Gene editing may also involve genomic integration or episomal presence of the gene editing components or systems in the target plant.
In certain embodiments, the nucleic acid modification or mutation is effected by a (modified) zinc-finger nuclease (ZFN) system. The ZFN system uses artificial restriction enzymes generated by fusing a zinc finger DNA-binding domain to a DNA-cleavage domain that can be engineered to target desired DNA sequences. Exemplary methods of genome editing using ZFNs can be found for example in U.S. Pat. Nos. 6,534,261; 6,607,882; 6,746,838; 6,794,136; 6,824,978; 6,866,997; 6,933,113; and 6,979,539.
In certain embodiments, the nucleic acid modification is effected by a (modified) meganuclease. Meganucleases are endodeoxyribonucleases characterized by a large recognition site (double-stranded DNA sequences of 12 to 40 base pairs). Exemplary methods for using meganucleases can be found in U.S. Pat. Nos. 8,163,514; 8,133,697; 8,021,867; 8,119,361; 8,119,381; 8,124,369; and 8,129,134, which are specifically incorporated by reference.
In certain embodiments, the nucleic acid modification is effected by a (modified) CRISPR/Cas complex or system. In certain embodiments, the CRISPR/Cas system or complex is a class 2 CRISPR/Cas system. In certain embodiments, said CRISPR/Cas system or complex is a type II, type V, or type VI CRISPR/Cas system or complex. The CRISPR/Cas system does not require the generation of customized proteins to target specific sequences but rather a single Cas protein can be programmed by an RNA guide (gRNA) to recognize a specific nucleic acid target, in other words the Cas enzyme protein can be recruited to a specific nucleic acid target locus (which may comprise or consist of RNA and/or DNA) of interest using said short RNA guide.
In general, the CRISPR/Cas or CRISPR system, as used herein and in the foregoing documents, refers collectively to transcripts and other elements involved in the expression of or directing the activity of CRISPR-associated (“Cas”) genes, including sequences encoding a Cas gene and one or more of a tracr (trans-activating CRISPR) sequence (e.g., tracrRNA or an active partial tracrRNA), a tracr-mate sequence (encompassing a “direct repeat” and a tracrRNA-processed partial direct repeat in the context of an endogenous CRISPR system), a guide sequence (also referred to as a “spacer” in the context of an endogenous CRISPR system), or “RNA(s)” as that term is herein used (e.g., RNA(s) to guide Cas, such as Cas9, e.g., CRISPR RNA and, where applicable, transactivating (tracr) RNA or a single guide RNA (sgRNA) (chimeric RNA)) or other sequences and transcripts from a CRISPR locus. In general, a CRISPR system is characterized by elements that promote the formation of a CRISPR complex at the site of a target sequence (also referred to as a protospacer in the context of an endogenous CRISPR system). In the context of formation of a CRISPR complex, “target sequence” refers to a sequence to which a guide sequence is designed to have complementarity, where hybridization between a target sequence and a guide sequence promotes the formation of a CRISPR complex. A target sequence may comprise any polynucleotide, such as DNA or RNA polynucleotides.
In certain embodiments, the gRNA is a chimeric guide RNA or single guide RNA (sgRNA). In certain embodiments, the gRNA comprises a guide sequence and a tracr mate sequence (or direct repeat). In certain embodiments, the gRNA comprises a guide sequence, a tracr mate sequence (or direct repeat), and a tracr sequence. In certain embodiments, the CRISPR/Cas system or complex as described herein does not comprise and/or does not rely on the presence of a tracr sequence (e.g. if the Cas protein is Cas12a).
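To make the notion of a guide sequence and its target sequence concrete, the following sketch scans a DNA string for candidate SpCas9 target sites. It relies on one fact not stated above: SpCas9 typically requires an NGG protospacer adjacent motif (PAM) immediately 3' of the protospacer. The 20-nucleotide guide length, the forward-strand-only search, the function name, and the demo sequence are all simplifying assumptions made for illustration. Other nucleases (e.g., Cas12a) have different PAM requirements and are not covered here.

```python
import re


def find_spcas9_sites(seq: str, guide_len: int = 20):
    """List candidate SpCas9 target sites on the forward strand.

    Returns (start, protospacer, pam) tuples where the protospacer of
    `guide_len` nucleotides is immediately followed by an NGG PAM.
    Assumptions: forward strand only; no off-target or efficiency scoring.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{guide_len}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# Hypothetical demo sequence (not from the Sequence Listing).
demo = "ACGT" * 5 + "TGG" + "AAAA"

if __name__ == "__main__":
    for start, protospacer, pam in find_spcas9_sites(demo):
        print(start, protospacer, pam)
```

A fuller design workflow would also scan the reverse complement and screen candidate guides against the rest of the genome. Both steps are omitted from this sketch.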
The Cas protein as referred to herein, such as but not limited to Cas9, Cas12a (formerly referred to as Cpf1), Cas12b (formerly referred to as C2c1), Cas13a (formerly referred to as C2c2), C2c3, or Cas13b protein, may originate from any suitable source, and hence may include different orthologues, originating from a variety of (prokaryotic) organisms, as is well documented in the art. In certain embodiments, the Cas protein is (modified) Cas9, preferably (modified) Staphylococcus aureus Cas9 (SaCas9) or (modified) Streptococcus pyogenes Cas9 (SpCas9). In certain embodiments, the Cas protein is Cas12a, optionally from Acidaminococcus sp., such as Acidaminococcus sp. BV3L6 Cpf1 (AsCas12a) or Lachnospiraceae bacterium Cas12a, such as Lachnospiraceae bacterium MA2020 or Lachnospiraceae bacterium MD2006 (LbCas12a). See U.S. Pat. No. 10,669,540, incorporated herein by reference in its entirety. Alternatively, the Cas12a protein may be from Moraxella bovoculi AAX08_00205 [Mb2Cas12a] or Moraxella bovoculi AAX11_00205 [Mb3Cas12a]. See WO 2017/189308, incorporated herein by reference in its entirety. In certain embodiments, the Cas protein is (modified) C2c2, preferably Leptotrichia wadei C2c2 (LwC2c2) or Listeria newyorkensis FSL M6-0635 C2c2 (LbFSLC2c2). In certain embodiments, the (modified) Cas protein is C2c1. In certain embodiments, the (modified) Cas protein is C2c3. In certain embodiments, the (modified) Cas protein is Cas13b. Other Cas enzymes are available to a person skilled in the art.
Gene editing methods and compositions are also disclosed in U.S. Pat. Nos. 10,519,456 and 10,285,348 B2, the entire contents of which are herein incorporated by reference.
The gene-editing machinery (e.g., the DNA modifying enzyme) introduced into the plants can be controlled by any promoter that can drive recombinant gene expression in plants. In some embodiments, the promoter is a constitutive promoter. In some embodiments, the promoter is a tissue-specific promoter, e.g., a pollen-specific promoter or a sperm cell specific promoter, a zygote specific promoter, or a promoter that is highly expressed in sperm, eggs and zygotes (e.g., prOsActin1). Suitable promoters are disclosed in U.S. Pat. No. 10,519,456, the entire content of which is herein incorporated by reference.
In another aspect, provided herein is a method of editing plant genomic DNA. In some embodiments, the method comprises using a first soybean plant expressing a DNA modification enzyme and at least one optional guide nucleic acid as described above to pollinate a target plant comprising genomic DNA to be edited.
The various polynucleotides and variants thereof provided herein can be stacked with one or more polynucleotides encoding a desirable trait such as a polynucleotide that confers, for example, insect, disease or herbicide resistance or other desirable agronomic traits of interest including, but not limited to, traits associated with high oil content; high protein content; increased digestibility; balanced amino acid content; and high energy content. Such traits may refer to properties of both seed and non-seed plant tissues, or to food or feed prepared from plants or seeds having such traits.
As used herein, gene or trait “stacking” is combining desired genes or traits into one transgenic plant line. As one approach, plant breeders stack transgenic traits by making crosses between parents that each have a desired trait and then identifying offspring that have both of these desired traits (so-called “breeding stacks”). Another way to stack genes is by transferring two or more genes into the cell nucleus of a plant at the same time during transformation. Another way to stack genes is by re-transforming a transgenic plant with another gene of interest. For example, gene stacking can be used to combine two different insect resistance traits, an insect resistance trait and a disease resistance trait, or a herbicide resistance trait (such as, for example, Bt11). The use of a selectable marker in addition to a gene of interest would also be considered gene stacking.
In some embodiments, a nucleic acid molecule or vector of the disclosure can include an additional coding sequence for one or more polypeptides or double stranded RNA molecules (dsRNA) of interest for agronomic traits that primarily are of benefit to a seed company, grower or grain processor. A polypeptide of interest can be any polypeptide encoded by a nucleotide sequence of interest. Non-limiting examples of polypeptides of interest that are suitable for production in plants include those resulting in agronomically important traits such as herbicide resistance (also sometimes referred to as “herbicide tolerance”), virus resistance, bacterial pathogen resistance, insect resistance, nematode resistance, or fungal resistance. See, e.g., U.S. Pat. Nos. 5,569,823; 5,304,730; 5,495,071; 6,329,504; and 6,337,431. The polypeptide also can be one that increases plant vigor or yield (including traits that allow a plant to grow at different temperatures, soil conditions and levels of sunlight and precipitation), or one that allows identification of a plant exhibiting a trait of interest (e.g., a selectable marker, seed coat color, relative maturity group, etc.). Various polypeptides of interest, as well as methods for introducing these polypeptides into a plant, are described, for example, in U.S. Pat. Nos. 4,761,373; 4,769,061; 4,810,648; 4,940,835; 4,975,374; 5,013,659; 5,162,602; 5,276,268; 5,304,730; 5,495,071; 5,554,798; 5,561,236; 5,569,823; 5,767,366; 5,879,903, 5,928,937; 6,084,155; 6,329,504 and 6,337,431; as well as US Patent Publication No. 2001/0016956.
Polynucleotides conferring resistance/tolerance to an herbicide that inhibits the growing point or meristem, such as an imidazolinone or a sulfonylurea, can also be suitable in some embodiments. Exemplary polynucleotides in this category code for mutant ALS and AHAS enzymes as described, e.g., in U.S. Pat. Nos. 5,767,366 and 5,928,937. U.S. Pat. Nos. 4,761,373 and 5,013,659 are directed to plants resistant to various imidazolinone or sulfonamide herbicides. U.S. Pat. No. 4,975,374 relates to plant cells and plants containing a nucleic acid encoding a mutant glutamine synthetase (GS) resistant to inhibition by herbicides that are known to inhibit GS, e.g., phosphinothricin and methionine sulfoximine. U.S. Pat. No. 5,162,602 discloses plants resistant to inhibition by cyclohexanedione and aryloxyphenoxypropanoic acid herbicides. The resistance is conferred by an altered acetyl coenzyme A carboxylase (ACCase).
Polypeptides encoded by nucleotide sequences conferring resistance to glyphosate are also suitable for the disclosure. See, e.g., U.S. Pat. Nos. 4,940,835 and 4,769,061. U.S. Pat. No. 5,554,798 discloses transgenic glyphosate resistant maize plants, which resistance is conferred by an altered 5-enolpyruvyl-3-phosphoshikimate (EPSP) synthase gene.
Polynucleotides coding for resistance to phosphono compounds such as glufosinate ammonium or phosphinothricin, and pyridinoxy or phenoxy propionic acids and cyclohexones are also suitable. See, European Patent Application No. 0 242 246. See also, U.S. Pat. Nos. 5,879,903, 5,276,268, and 5,561,236.
Other suitable polynucleotides include those coding for resistance to herbicides that inhibit photosynthesis, such as a triazine and a benzonitrile (nitrilase). See U.S. Pat. No. 4,810,648. Additional suitable polynucleotides coding for herbicide resistance include those coding for resistance to 2,2-dichloropropionic acid, sethoxydim, haloxyfop, imidazolinone herbicides, sulfonylurea herbicides, triazolopyrimidine herbicides, s-triazine herbicides and bromoxynil. Also suitable are polynucleotides conferring resistance to a protox enzyme, or that provide enhanced resistance to plant diseases; enhanced tolerance of adverse environmental conditions (abiotic stresses) including but not limited to drought, excessive cold, excessive heat, or excessive soil salinity or extreme acidity or alkalinity; and alterations in plant architecture or development, including changes in developmental timing. See, e.g., U.S. Patent Publication No. 2001/0016956 and U.S. Pat. No. 6,084,155.
Additional suitable polynucleotides include those coding for insecticidal polypeptides. These polypeptides may be produced in amounts sufficient to control, for example, insect pests (i.e., insect controlling amounts). It is recognized that the amount of production of an insecticidal polypeptide in a plant necessary to control insects or other pests may vary depending upon the cultivar, type of pest, environmental factors and the like. Polynucleotides useful for additional insect or pest resistance include, for example, those that encode toxins identified in Bacillus organisms. Polynucleotides comprising nucleotide sequences encoding Bacillus thuringiensis (Bt) Cry proteins from several subspecies have been cloned and recombinant clones have been found to be toxic to lepidopteran, dipteran and/or coleopteran insect larvae. Examples of such Bt insecticidal proteins include the Cry proteins such as Cry1Aa, Cry1Ab, Cry1Ac, Cry1B, Cry1C, Cry1D, Cry1Ea, Cry1Fa, Cry3A, Cry9A, Cry9B, Cry9C, and the like, as well as vegetative insecticidal proteins such as Vip1, Vip2, Vip3, and the like. A full list of Bt-derived proteins can be found on the worldwide web at Bacillus thuringiensis Toxin Nomenclature Database maintained by the University of Sussex (see also, Crickmore et al. (1998) Microbiol. Mol. Biol. Rev. 62:807-813).
In embodiments, an additional polypeptide is an insecticidal polypeptide derived from a non-Bt source, including without limitation, an alpha-amylase, a peroxidase, a cholesterol oxidase, a patatin, a protease, a protease inhibitor, a urease, an alpha-amylase inhibitor, a pore-forming protein, a chitinase, a lectin, an engineered antibody or antibody fragment, a Bacillus cereus insecticidal protein, a Xenorhabdus spp. (such as X. nematophila or X. bovienii) insecticidal protein, a Photorhabdus spp. (such as P. luminescens or P. asymbiotica) insecticidal protein, a Brevibacillus spp. (such as B. laterosporus) insecticidal protein, a Lysinibacillus spp. (such as L. sphaericus) insecticidal protein, a Chromobacterium spp. (such as C. subtsugae or C. piscinae) insecticidal protein, a Yersinia spp. (such as Y. entomophaga) insecticidal protein, a Paenibacillus spp. (such as P. propylaea) insecticidal protein, a Clostridium spp. (such as C. bifermentans) insecticidal protein, a Pseudomonas spp. (such as P. fluorescens) insecticidal protein, and a lignin.
Polypeptides that are suitable for production in plants further include those that improve or otherwise facilitate the conversion of harvested plants or plant parts into a commercially useful product, including, for example, increased or altered carbohydrate content or distribution, improved fermentation properties, increased oil content, increased protein content, modified oil profile, improved digestibility, and increased nutraceutical content, e.g., increased phytosterol content, increased tocopherol content, increased stanol content or increased vitamin content. Polypeptides of interest also include, for example, those resulting in or contributing to a reduced content of an unwanted component in a harvested crop, e.g., phytic acid, or sugar degrading enzymes. By “resulting in” or “contributing to” is intended that the polypeptide of interest can directly or indirectly contribute to the existence of a trait of interest (e.g., increasing cellulose degradation by the use of a heterologous cellulase enzyme).
In some embodiments, the polypeptide contributes to improved digestibility for food or feed. Xylanases are hemicellulolytic enzymes that improve the breakdown of plant cell walls, which leads to better utilization of the plant nutrients by an animal. This leads to improved growth rate and feed conversion. Also, the viscosity of the feeds containing xylan can be reduced. Heterologous production of xylanases in plant cells also can facilitate lignocellulosic conversion to fermentable sugars in industrial processing.
Numerous xylanases from fungal and bacterial microorganisms have been identified and characterized (see, e.g., U.S. Pat. No. 5,437,992; Coughlin et al. (1993) “Proceedings of the Second TRICEL Symposium on Trichoderma reesei Cellulases and Other Hydrolases” Espoo; Suominen and Reinikainen, eds. (1993) Foundation for Biotechnical and Industrial Fermentation Research 8:125-135; U.S. Patent Publication No. 2005/0208178; and PCT Publication No. WO 03/16654). In particular, three specific xylanases (XYL-I, XYL-II, and XYL-III) have been identified in T. reesei (Tenkanen et al. (1992) Enzyme Microb. Technol. 14:566; Torronen et al. (1992) Bio/Technology 10:1461; and Xu et al. (1998) Appl. Microbiol. Biotechnol. 49:718).
In other embodiments, a polypeptide useful for the disclosure can be a polysaccharide degrading enzyme. Plants of this disclosure producing such an enzyme may be useful for generating, for example, fermentation feedstocks for bioprocessing. In some embodiments, enzymes useful for a fermentation process include alpha amylases, proteases, pullulanases, isoamylases, cellulases, hemicellulases, xylanases, cyclodextrin glycotransferases, lipases, phytases, laccases, oxidases, esterases, cutinases, granular starch hydrolyzing enzyme and other glucoamylases.
Polysaccharide-degrading enzymes include: a) starch degrading enzymes such as α-amylases (EC 3.2.1.1), glucuronidases (E.C. 3.2.1.131), exo-1,4-α-D glucanases such as amyloglucosidases and glucoamylase (EC 3.2.1.3), β-amylases (EC 3.2.1.2), α-glucosidases (EC 3.2.1.20), and other exo-amylases, and starch debranching enzymes such as isoamylase (EC 3.2.1.68), pullulanase (EC 3.2.1.41), and the like; b) cellulases such as exo-1,4-β-cellobiohydrolase (EC 3.2.1.91), exo-1,3-β-D-glucanase (EC 3.2.1.39), β-glucosidase (EC 3.2.1.21); c) L-arabinases, such as endo-1,5-α-L-arabinase (EC 3.2.1.99), α-arabinosidases (EC 3.2.1.55) and the like; d) galactanases such as endo-1,4-β-D-galactanase (EC 3.2.1.89), endo-1,3-β-D-galactanase (EC 3.2.1.90), α-galactosidase (EC 3.2.1.22), β-galactosidase (EC 3.2.1.23) and the like; e) mannanases, such as endo-1,4-β-D-mannanase (EC 3.2.1.78), β-mannosidase (EC 3.2.1.25), α-mannosidase (EC 3.2.1.24) and the like; f) xylanases, such as endo-1,4-β-xylanase (EC 3.2.1.8), β-D-xylosidase (EC 3.2.1.37), 1,3-β-D-xylanase, and the like; and g) other enzymes such as α-L-fucosidase (EC 3.2.1.51), α-L-rhamnosidase (EC 3.2.1.40), levanase (EC 3.2.1.65), inulinase (EC 3.2.1.7), and the like. In one embodiment, the α-amylase is the synthetic α-amylase, Amy797E, described in U.S. Pat. No. 8,093,453, herein incorporated by reference in its entirety.
Further enzymes which may be used with the disclosure include proteases, such as fungal and bacterial proteases. Fungal proteases include, but are not limited to, those obtained from Aspergillus, Trichoderma, Mucor and Rhizopus, such as A. niger, A. awamori, A. oryzae and M. miehei. In some embodiments, the polypeptides of this disclosure can be cellobiohydrolase (CBH) enzymes (EC 3.2.1.91). In one embodiment, the cellobiohydrolase enzyme can be CBH1 or CBH2.
Other enzymes useful with the disclosure include, but are not limited to, hemicellulases, such as mannases and arabinofuranosidases (EC 3.2.1.55); ligninases; lipases (e.g., E.C. 3.1.1.3), glucose oxidases, pectinases, xylanases, transglucosidases, alpha 1,6 glucosidases (e.g., E.C. 3.2.1.20); esterases such as ferulic acid esterase (EC 3.1.1.73) and acetyl xylan esterases (EC 3.1.1.72); and cutinases (e.g., E.C. 3.1.1.74).
Double stranded RNA molecules useful with the disclosure include but are not limited to those that suppress target insect genes. As used herein the words “gene suppression”, when taken together, are intended to refer to any of the well-known methods for reducing the levels of protein produced as a result of gene transcription to mRNA and subsequent translation of the mRNA. Gene suppression is also intended to mean the reduction of protein expression from a gene or a coding sequence including posttranscriptional gene suppression and transcriptional suppression. Posttranscriptional gene suppression is mediated by the homology between all or a part of an mRNA transcribed from a gene or coding sequence targeted for suppression and the corresponding double stranded RNA used for suppression, and refers to the substantial and measurable reduction of the amount of mRNA available in the cell for binding by ribosomes. The transcribed RNA can be in the sense orientation to effect what is called co-suppression, in the anti-sense orientation to effect what is called anti-sense suppression, or in both orientations producing a dsRNA to effect what is called RNA interference (RNAi). Transcriptional suppression is mediated by the presence in the cell of a dsRNA, a gene suppression agent, exhibiting substantial sequence identity to a promoter DNA sequence or the complement thereof to effect what is referred to as promoter trans suppression. Gene suppression may be effective against a native plant gene associated with a trait, e.g., to provide plants with reduced levels of a protein encoded by the native gene or with enhanced or reduced levels of an affected metabolite. Gene suppression can also be effective against target genes in plant pests that may ingest or contact plant material containing gene suppression agents, specifically designed to inhibit or suppress the expression of one or more homologous or complementary sequences in the cells of the pest.
Such genes targeted for suppression can encode an essential protein, the predicted function of which is selected from the group consisting of muscle formation, juvenile hormone formation, juvenile hormone regulation, ion regulation and transport, digestive enzyme synthesis, maintenance of cell membrane potential, amino acid biosynthesis, amino acid degradation, sperm formation, pheromone synthesis, pheromone sensing, antennae formation, wing formation, leg formation, development and differentiation, egg formation, larval maturation, digestive enzyme formation, hemolymph synthesis, hemolymph maintenance, neurotransmission, cell division, energy metabolism, respiration, and apoptosis.
Non-limiting embodiments of the invention include proteins and nucleic acids that confer increased pathogen resistance when expressed. In embodiments, a polypeptide is selected from: (a) a polypeptide having the amino acid sequence shown in SEQ ID NO: 5, or any portion thereof, and having a heterologous amino acid sequence attached thereto, wherein expression of the polypeptide or portion thereof confers increased pathogen resistance on a plant; (b) a polypeptide comprising the amino acid sequence of SEQ ID NO: 5, and having substitution and/or deletion and/or addition of one or more amino acid residues, wherein expression of the polypeptide confers increased pathogen resistance on the plant; (c) a polypeptide having more than 99%, 98%, 97%, 96%, 95%, 94%, 93%, 92%, 91%, 90%, 85%, or more than 80% sequence identity with the amino acid sequence of SEQ ID NO: 5, wherein the polypeptide when expressed in a plant confers increased pathogen resistance on the plant; or (d) a fusion polypeptide comprising the amino acid sequence of SEQ ID NO: 5, or the polypeptide as defined in any one of (a) to (c). In embodiments, a nucleic acid molecule comprises (a) a nucleotide sequence encoding a protein having an amino acid sequence sharing at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to SEQ ID NO: 5, wherein said nucleotide sequence comprises a heterologous nucleic acid sequence attached thereto and expression of the nucleic acid molecule confers increased pathogen resistance on the plant; (b) a nucleotide sequence encoding the aforementioned polypeptide; (c) the nucleotide sequence of part (a) comprising a sequence of any one of SEQ ID NOS: 2-4 and 11-12; or (d) the nucleotide sequence of part (a) having more than 99%, 98%, 97%, 96%, 95%, 94%, 93%, 92%, 91%, 90%, 85%, or more than 80% sequence identity to any one of SEQ ID NOs: 2-4 and 11-12.
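The percent sequence identity thresholds recited above can be illustrated with a short calculation over two sequences that have already been aligned. The sketch assumes one common convention: identical, non-gap positions divided by the alignment length, with gap columns counted as mismatches. Other conventions use a different denominator and can give different values. The function names and peptide strings are hypothetical and are not taken from the Sequence Listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: identity = matching non-gap columns / alignment length * 100,
    so any column containing a gap ('-') counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def meets_threshold(identity: float, threshold: float) -> bool:
    """True if identity is at least the given threshold (e.g., 'at least 90%')."""
    return identity >= threshold


if __name__ == "__main__":
    # Hypothetical 10-residue peptides differing at one position.
    identity = percent_identity("MKTAYIAKQR", "MKTAYLAKQR")
    print(f"{identity:.1f}% identity; at least 90%: {meets_threshold(identity, 90.0)}")
```

Producing the alignment itself (for example, with a global or local alignment tool) is a separate step that this sketch does not perform.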
Non-limiting embodiments of the invention include expression cassettes, vectors and DNA constructs comprising the aforementioned nucleic acid molecules and/or expressing the aforementioned polypeptides that confer increased pathogen resistance. In embodiments, an expression cassette comprises the aforementioned nucleic acid molecule of the invention or encodes the aforementioned polypeptide of the invention. In embodiments of the expression cassette, the nucleic acid molecule is operably linked to a promoter capable of directing expression in a plant cell. In some embodiments, the promoter is an endogenous promoter. In other embodiments, the promoter is an exogenous promoter. In particular embodiments, the promoter comprises any one of SEQ ID NOS: 13-15. In embodiments, a vector comprises the aforementioned nucleic acid molecule or the expression cassette. In embodiments, a transgenic cell comprises the nucleic acid molecule or the expression cassette of the invention.
Non-limiting embodiments include transgenic plants that have increased pathogen resistance. In embodiments, a plant is provided having stably incorporated into its genome a nucleic acid sequence operably linked to a promoter active in the plant, wherein the nucleic acid sequence encodes a polypeptide having: an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO: 5; or an amino acid sequence set forth in SEQ ID NO: 5, wherein said nucleic acid sequence is heterologous to the plant, and wherein the plant has increased pathogen resistance as compared to a control plant not comprising the nucleic acid sequence. In embodiments of the plant, the nucleic acid sequence comprises at least 85% identity, at least 90% identity, or at least 95% identity to any one of SEQ ID NOs: 2-4 and 11-12; or the nucleic acid sequence is SEQ ID NO: 2, 3, 4, 11 or 12. In embodiments of the plant, the nucleic acid sequence is introduced into the genome by transgenic expression. In embodiments of the plant, the promoter is an endogenous promoter. In particular embodiments, the endogenous promoter comprises at least 90%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to SEQ ID NO: 15. In embodiments of the plant, the promoter is a heterologous promoter comprising at least 90%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to SEQ ID NO: 13 or 14. In embodiments of the plant, the promoter is a constitutive promoter, inducible promoter, or a tissue-specific promoter. In particular embodiments, the plant is a dicot plant, such as a soybean plant or an elite soybean plant. In other embodiments, the plant is a monocot plant, such as a monocot plant selected from the group consisting of rice, wheat, maize, and sugar cane.
In embodiments, the plant is an agronomically elite plant having a commercially significant yield and/or commercially susceptible vigor, seed set, standability, threshability, abiotic/biotic resistance, or herbicide tolerance. In any of the aforementioned plant embodiments, the plant has increased resistance to any one of the following pathogens: soy cyst nematode, bacterial pustule, root knot nematode, frog eye leaf spot, Phytophthora, brown stem rot, nematode, Asian Soybean Rust, smut, Golovinomyces cichoracearum, Erysiphe cichoracearum, Blumeria graminis, Podosphaera xanthii, Sphaerotheca fuliginea, Pythium ultimum, Uncinula necator, Mycosphaerella pinodes, Magnaporthe grisea, Bipolaris oryzae, Rhizoctonia solani, Phytophthora sojae, Schizaphis graminum, Bemisia tabaci, Rhopalosiphum maidis, Deroceras reticulatum, Diatraea saccharalis, Myzus persicae, Sclerotinia sclerotiorum, Macrophomina phaseolina, or Fusarium virguliforme. In particular embodiments, the plant is a soybean plant that has increased resistance to ASR as compared to the control plant.
Non-limiting embodiments of the invention further include gene edited plants with increased pathogen resistance. Embodiments of gene edited plants include a plant, the genome of which has been edited to comprise a nucleic acid sequence encoding at least one polypeptide having at least 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to SEQ ID NO: 5, wherein said polypeptide confers increased pathogen resistance relative to a control plant, wherein the plant does not comprise said nucleic acid sequence before the genome editing. In embodiments of the plant, the nucleic acid sequence is introduced into said plant genome by genome editing of the nucleic acid sequence set forth in any one of SEQ ID NOS: 1, 2, 3, 4, 11 and/or 12. In embodiments, the genome editing comprises duplication, inversion, promoter modification, terminator modification and/or splicing modification of the nucleic acid sequence. In particular embodiments, the genome editing is accomplished through CRISPR, TALEN, meganucleases, or through modification of genomic nucleic acids. In embodiments, the gene edited plant is an agronomically elite plant having a commercially significant yield and/or commercially susceptible vigor, seed set, standability, threshability, abiotic/biotic resistance, or herbicide tolerance. In embodiments, the nucleic acid sequence is operably linked to a heterologous promoter, wherein the heterologous promoter is active in the plant. In particular embodiments, the heterologous promoter active in the plant has at least 95% sequence identity to one of SEQ ID NOS: 13 and 14. In other embodiments, the heterologous promoter is a native promoter or active variant or fragment thereof, wherein optionally the native promoter has at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to SEQ ID NO: 15. In particular embodiments, the plant is a soybean plant having increased resistance to Asian Soy Rust relative to the control plant.
Non-limiting embodiments further include Glycine max plants with increased pathogen resistance. In embodiments, an elite Glycine max plant is provided having in its genome a nucleic acid sequence from a donor Glycine plant, wherein the donor Glycine plant is a different strain from the elite Glycine max plant, and wherein the nucleic acid sequence encodes at least one polypeptide having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to SEQ ID NO: 5, wherein said polypeptide confers increased pathogen resistance on the elite Glycine max plant as compared to a control plant not comprising said nucleic acid sequence. In embodiments, the donor Glycine plant is a Glycine tomentella plant, or a progeny thereof. In particular embodiments, the Glycine tomentella plant is a plant of Glycine tomentella accession line PI505267 or a progeny thereof. In further embodiments, the nucleic acid sequence has at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to any one of SEQ ID NOs: 2-4 and 11-12. In still other embodiments, the nucleic acid sequence has at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identity to SEQ ID NO: 1, or a functional fragment thereof, wherein the functional fragment comprises at least 10%; at least 15%; at least 20%; at least 25%; at least 30%; at least 35%; at least 40%; at least 45%; at least 50%; at least 55%; at least 60%; at least 65%; at least 70%; at least 75%; at least 80%; at least 85%; at least 90%; at least 95%; 96%, 97%, 98%, or 99% of SEQ ID NO: 1 and confers increased pathogen resistance. In embodiments, the nucleic acid sequence comprises a SNP marker associated with increased ASR resistance, wherein said SNP marker is any of the favorable markers of Table 1 and/or 2. In particular embodiments, the nucleic acid sequence from the donor Glycine plant is inserted into chromosome 3 of the plant.
In embodiments, said nucleic acid sequence is introduced into said plant genome by genome editing of genomic sequences corresponding to and comprising any one of SEQ ID NOs: 1, 2-4, and 11-12, wherein the genome editing confers the elite plant with enhanced resistance to the pathogen, and wherein the gene editing is by CRISPR, TALEN, meganucleases, or through modification of genomic nucleic acids. In other embodiments, said nucleic acid sequence is introduced into said plant genome by transgenic expression of: (a) a nucleic acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identity to any one of SEQ ID NOS: 2-4 and 11-12, (b) a nucleic acid sequence encoding a polypeptide having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identity to SEQ ID NO: 5; or (c) a nucleic acid sequence encoding a polypeptide having the sequence of SEQ ID NO: 5; wherein said polypeptide confers enhanced pathogen resistance on the elite Glycine max plant. In particular embodiments, said nucleic acid sequence is introgressed into the genome of said plant through the use of one or more of: (a) chemically induced chromosome doubling; and (b) doubling of an elite Glycine max line to obtain a doubled Glycine max plant before crossing the doubled plant with a Glycine tomentella plant derived from accession line PI505267 or a progeny thereof, as described in Example 3, the Glycine tomentella plant comprising said nucleic acid sequence. 
In embodiments, the plant has increased resistance to any one or more of the following pathogens: soy cyst nematode, bacterial pustule, root knot nematode, frog eye leaf spot, phytophthora, brown stem rot, nematode, Asian Soybean Rust, smut, Golovinomyces cichoracearum, Erysiphe cichoracearum, Blumeria graminis, Podosphaera xanthii, Sphaerotheca fuliginea, Pythium ultimum, Uncinula necator, Mycosphaerella pinodes, Magnaporthe grisea, Bipolaris oryzae, Magnaporthe grisea, Rhizoctonia solani, Phytophthora sojae, Schizaphis graminum, Bemisia tabaci, Rhopalosiphum maidis, Deroceras reticulatum, Diatraea saccharalis, Schizaphis graminum, Myzus persicae, Sclerotinia sclerotiorum, Macrophomina phaseolina, or Fusarium virguliforme. In particular embodiments, the plant has increased resistance to Asian Soybean Rust. In embodiments, elite Glycine max plant is an agronomically elite Glycine max plant having a commercially significant yield and/or commercially susceptible vigor, seed set, standability, threshability, abiotic/biotic resistance, or herbicide tolerance.
Non-limiting embodiments of the invention include plants, plant parts and products with increased pathogen resistance. In embodiments, a progeny plant from any of the aforementioned plants is provided, wherein the progeny plant has stably incorporated into its genome the nucleic acid sequence of the invention. In embodiments, a plant cell, seed, or plant part is provided that is derived from any of the aforementioned plants, wherein said plant cell, seed or plant part has stably incorporated into its genome the nucleic acid sequence.
Non-limiting embodiments of the invention include methods of producing transgenic plants. In embodiments, use of the aforementioned polypeptide or nucleic acid molecule or expression cassette or vector or transgenic cell of the invention is provided in conferring increased resistance to Asian Soy Rust (ASR). In particular embodiments, the method comprises use of the expression cassette of the invention in a cell, wherein the expression level and/or activity of the polypeptide in the cell is increased, and the resistance of the cell to Asian Soy Rust is enhanced. Embodiments include a method for improving the resistance of a plant against ASR, comprising increasing the expression level and/or activity of the polypeptide of the invention in the plant. In embodiments, the increasing comprises increasing the expression level and/or activity of the nucleic acid molecule of the invention in the plant. In embodiments, the increasing the expression level and/or activity in the plant is realized by transgenic means or by breeding. Embodiments are provided for a method for producing a transgenic plant with improved resistance against ASR, comprising: introducing the nucleic acid molecule or the expression cassette of the invention to a recipient plant to obtain a transgenic plant, wherein the transgenic plant has increased resistance against ASR compared to the recipient plant.
Non-limiting embodiments include methods of producing plants with increased pathogen resistance including by breeding methods. In embodiments, a method of producing a soybean plant having increased pathogen resistance comprises the steps of: a) providing a donor soybean plant comprising in its genome a nucleic acid sequence encoding at least one polypeptide having at least 90% identity or 95% identity to SEQ ID NO: 5, wherein said nucleic acid sequence confers to said donor soybean plant increased pathogen resistance as compared to another donor soybean plant not comprising said nucleic acid sequence in its genome; b) crossing the donor soybean plant of a) with a recipient soybean plant not comprising said nucleic acid sequence; and c) selecting a progeny plant from the cross of b) by detecting the presence of the nucleic acid sequence, or the presence of one or more molecular markers associated with the nucleic acid sequence in the progeny plant, thereby producing a soybean plant having increased pathogen resistance. In embodiments of the method, the molecular marker is a single nucleotide polymorphism (SNP), a quantitative trait locus (QTL), an amplified fragment length polymorphism (AFLP), randomly amplified polymorphic DNA (RAPD), a restriction fragment length polymorphism (RFLP) or a microsatellite. In particular embodiments, the molecular marker is at least one favorable SNP marker selected from Table 1 and/or Table 2, or a molecular marker located within 20 cM, 10 cM, 5 cM, 1 cM, or 0.5 cM of a favorable SNP marker selected from Table 1 or Table 2. In embodiments, one or more of the donor soybean plant and the recipient soybean plant is an elite Glycine max plant. 
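The distance windows recited above (20 cM, 10 cM, 5 cM, 1 cM, or 0.5 cM from a favorable SNP marker) can be illustrated with a minimal sketch. The marker names and map positions below are hypothetical placeholders and are not taken from Table 1 or Table 2; the sketch only shows how candidate markers on the same chromosome might be filtered by genetic distance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Marker:
    name: str            # hypothetical marker identifier
    chromosome: str      # chromosome or linkage group label
    position_cm: float   # genetic map position in centimorgans


def markers_within(markers, anchor, max_cm):
    """Return markers on the anchor's chromosome lying within max_cm of it."""
    return [
        m for m in markers
        if m != anchor
        and m.chromosome == anchor.chromosome
        and abs(m.position_cm - anchor.position_cm) <= max_cm
    ]


# Hypothetical example data (not from the specification).
favorable_snp = Marker("snp_favorable", "3", 50.0)
candidates = [
    Marker("m1", "3", 50.4),
    Marker("m2", "3", 58.0),
    Marker("m3", "3", 75.0),
    Marker("m4", "5", 50.1),  # different chromosome, never linked here
]

for window in (20, 10, 5, 1, 0.5):
    names = [m.name for m in markers_within(candidates, favorable_snp, window)]
    print(f"within {window} cM: {names}")
```

Narrower windows retain fewer markers but those retained are more tightly linked to the favorable SNP.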
Embodiments include a method for producing a Glycine max plant having increased resistance to ASR, the method comprising the steps of: a) providing a Glycine tomentella plant line, or progeny thereof, comprising a nucleic acid sequence encoding at least one polypeptide having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identity to SEQ ID NO: 5; b) carrying out the embryo rescue method essentially as described in U.S. Pat. No. 7,842,850 or transgenically; c) collecting the seeds resulting from the method of b); and d) regenerating the seeds of c) into plants. In particular embodiments, the Glycine tomentella plant line is accession line PI505267, or a progeny thereof. In embodiments, the nucleic acid sequence is: a nucleic acid sequence comprising at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identity to any one of SEQ ID NOs: 2-4 and 11-12; or the nucleic acid sequence of SEQ ID NO: 2, 3, 4, 11 or 12.
Embodiments include a method of producing a Glycine max plant with increased resistance to Asian Soy Rust (ASR), the method comprising the steps of: a) isolating a nucleic acid from a Glycine max plant; b) detecting in the nucleic acid of a) at least one molecular marker associated with a nucleic acid sequence comprising any one of SEQ ID NO: 2-4, wherein said nucleic acid sequence confers to the Glycine max plant increased ASR resistance; c) selecting a Glycine max plant based on the presence of the molecular marker detected in b); and d) producing a Glycine max progeny plant from the plant of c) identified as having said molecular marker associated with increased ASR resistance. In embodiments, the molecular marker is a favorable SNP marker selected from Table 1 or Table 2, or a molecular marker located within 20 cM, 10 cM, 5 cM, 1 cM, or 0.5 cM of a favorable SNP marker selected from Table 1 and/or Table 2. In embodiments, the detecting comprises amplifying a molecular marker locus or a portion of the molecular marker locus and detecting the resulting amplified molecular marker amplicon. In embodiments, the amplifying comprises employing a polymerase chain reaction (PCR) or ligase chain reaction (LCR) using a nucleic acid isolated from a soybean plant or germplasm as a template in the PCR or LCR. In particular embodiments, the amplifying further comprises employing a primer pair selected from the group comprising: the primer pair of SEQ ID NOS: 8-9; and a primer pair from the primers of Table 6. In particular embodiments, the detecting further comprises employing a nucleic acid probe selected from the group comprising: the probe of SEQ ID NO: 10 and a probe from the probes of Table 6. In embodiments, the nucleic acid is DNA or RNA. In embodiments, a plant is provided produced by any of the aforementioned methods.
Non-limiting embodiments include a method of conferring increased ASR resistance to a plant, comprising: a) introducing into the genome of the plant a nucleic acid molecule operably linked to a promoter active in the plant, wherein the nucleic acid sequence is stably incorporated into the genome, wherein the nucleic acid sequence encodes a polypeptide having (i) an amino acid sequence comprising at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 5, or (ii) an amino acid sequence as set forth in SEQ ID NO: 5, wherein said nucleic acid sequence is heterologous to the plant, and wherein expression of said nucleic acid sequence increases ASR resistance compared to a control plant not expressing said nucleic acid sequence. In embodiments, the nucleic acid sequence is introduced into the genome of the plant by transformation. In other embodiments, the nucleic acid sequence is introduced into the genome of the plant by crossing a donor plant comprising the nucleic acid sequence with the plant to produce a progeny plant having increased ASR resistance. In particular embodiments, the nucleic acid sequence is inserted into chromosome 3. In particular embodiments, the promoter is an exogenous promoter, and wherein optionally the exogenous promoter comprises SEQ ID NO: 13 or 14. In other embodiments, the promoter is an endogenous promoter, and wherein optionally the endogenous promoter comprises SEQ ID NO: 15. In embodiments, the method further comprises screening for the introduced nucleic acid sequence with PCR and/or sequencing. In particular embodiments, the plant is a dicot plant, and wherein the dicot plant is a soybean plant. In other embodiments, the plant is a monocot plant selected from the group consisting of rice, wheat, maize, and sugar cane. In embodiments, a plant is produced by any of the aforementioned methods.
In non-limiting embodiments, a primer pair is provided for amplifying the nucleic acid molecule of the invention. In particular embodiments, the primer pair is the primer pair of SEQ ID NOS: 8-9 or a primer pair selected from the primers of Table 6. In embodiments, a primer diagnostic for ASR resistance is provided, wherein said primer can be used in a PCR reaction to indicate the presence of an allele associated with ASR resistance, wherein said allele is any favorable allele as described in Table 1 and/or Table 2 and wherein said primer is any primer selected from the primers of Table 6.
The following examples are not intended to be a detailed catalog of all the different ways in which the present invention may be implemented or of all the features that may be added to the present invention. Persons skilled in the art will appreciate that numerous variations and additions to the various embodiments may be made without departing from the present invention. Hence, the following descriptions are intended to illustrate some particular embodiments of the invention, and not to exhaustively specify all permutations, combinations and variations thereof.
Wild glycine lines were evaluated for ASR resistance against sixteen rust strains collected across a diverse range of environments. The rust data were generated using single pustule derived isolates from USDA-ARS (FL Q09, FL Q12, LABR13, FLQ11) and field populations (FL Q15, NC06, Vero, GLC15, UBL, BR south and BR central). The screening was carried out in contained facilities. Each wild glycine line was evaluated over a multiple day course of infection and rated at various time points. The rating and evaluation were performed using methods well known in the art, based upon Burdon and Speer (Euphytica, 33: 891-896, 1984; also TAG, 1984). Each accession line of interest was screened >2 times with ˜4 plants each time in North & South America using the large diverse panel of rust isolates. Based on the analysis of the rust data, wild Glycine tomentella accession line PI505267 was determined to be an ASR resistant wild glycine line of interest.
Chromosome discovery for causal loci in the tetraploid soybean population, PI505267, was carried out using Data2Bio's Genomic Bulked Segregant Analysis (gBSA) technology (Ames, IA). Data2Bio generated several libraries from DNA samples extracted from two susceptible tissue pools and one resistant tissue pool and sequenced these in eight (8) Illumina HiSeq2000 2×100 bp Paired-End (PE) lanes (San Diego, CA). Processing of raw data, including quality trimming, alignment, SNP discovery and SNP impact, was performed. After various filtering steps, a plurality of informative SNPs were identified in the PI505267 genome that were significantly associated with ASR resistance. A Bayesian approach was then used to calculate trait-associated probabilities. Next, a physical map of trait-associated SNPs (probability cutoff 0.01) for the top contigs was created. Clustering of the SNPs indicated that the ASR resistance locus is located on or near one particular scaffold, Contig 0133 (SEQ ID NO: 1), which was identified and mapped to chromosome 3 of Glycine tomentella accession line PI505267. The context sequences associated with the SNPs from this scaffold were aligned to the public G. max genome to create a chromosome-level understanding of the mapping interval. Data indicated that the causative R-gene(s) may map within or near the interval from 9.28 to 16.48 MB of Contig 0133 on chromosome 3. Genes from this interval were expected to encode polypeptide(s) that may be transgenically expressed or genetically modified (i.e., gene edited via TALEN or CRISPR) in plants to confer disease resistance (e.g., Asian Soy Rust (ASR) resistance).
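The step of ranking contigs by their trait-associated SNPs can be sketched as follows. This is an illustration only and is not Data2Bio's gBSA pipeline: the contig names, positions and probabilities are hypothetical, and the sketch assumes that a SNP is retained when its trait-associated probability is at or above the cutoff, a direction the text does not state.

```python
from collections import Counter


def rank_contigs(snps, cutoff=0.01):
    """Count retained SNPs per contig and rank contigs by that count.

    snps: iterable of (contig, position, probability) tuples.
    Assumption: a SNP is kept when probability >= cutoff.
    """
    counts = Counter(
        contig for contig, _position, probability in snps
        if probability >= cutoff
    )
    return counts.most_common()


# Hypothetical input (not data from the specification).
example_snps = [
    ("contig_A", 100, 0.90),
    ("contig_A", 250, 0.50),
    ("contig_B", 40, 0.20),
    ("contig_A", 900, 0.001),   # below cutoff, dropped
    ("contig_C", 10, 0.0001),   # below cutoff, dropped
]

for contig, n in rank_contigs(example_snps):
    print(contig, n)
```

A contig that accumulates many retained SNPs relative to the others would be the natural candidate for the physical map described above.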
A listing of single nucleotide polymorphisms (SNPs) within SEQ ID NO: 1 that associate with ASR resistance is provided below at Tables 1 and 2. SNPs were identified and verified by crossing the ASR resistant line PI505267 with two different susceptible female lines. All alleles for the identified SNPs were determined to be significantly linked with resistance or susceptibility (p<0.05). Detection of the presence of a molecular marker, such as any of the favorable markers of Tables 1 and/or 2, in the nucleic acid isolated from a plant, can be used to identify or select a plant as having the ASR resistance allele derived from the ASR resistant Glycine tomentella line, such as due to the introgression of the chromosomal interval of SEQ ID NO: 1, or a functional fragment thereof (such as functional fragment of the chromosomal interval of SEQ ID NO: 1 comprising the causative R-gene).
Known methods of introgression from wild glycine species involve doubled F1 plants (F1D). However, such methods tend to be inefficient, with a low number of infertile hybrids being produced. Further, few hybrids survive the subsequent chromosome doubling process, wherein the chromosomes of the infertile hybrid are doubled by a chemical agent (typically colchicine) to make it fertile.
In another method of introgression from wild glycine species, as described below and illustrated at
Doubled soy lines (tetraploid soy) were generated from two ASR susceptible G. max elite lines, herein referred to as Female 1 and Female 2 (two Syngenta proprietary lines). The susceptible G. max line had 2n=40 chromosomes (G1G1 genome) and, after doubling, was in a tetraploid state with 4n=80. In comparison, the G. tomentella resistant parent has 2n=78 or 40 chromosomes (D3E1 or D genome, respectively). For the doubling, immature soybean embryos of the G. max lines in tissue culture medium were treated with approximately 0.25-1.0 mg/ml colchicine for 3-4 days at 25° C. Regenerated plants were transferred to soil, and leaf samples were taken for ploidy analysis to confirm chromosome doubling. Tetraploid plants were allowed to self, and ploidy analysis was performed on embryos to confirm doubling. An unlimited seed supply was produced by allowing the tetraploid soy to self.
Flower buds prior to anthesis were prepared for the doubled female lines by gently removing sepals and petals to expose the mature stigma. Pollen from freshly opened flowers of G. tomentella (2n=78 or 40) was obtained by gently removing the petals to expose the mature anthers and dusting the pollen onto the soybean stigma. The crosses are shown at Table 3.
Table 3. Crosses: (1) G. tomentella; G. max, 2n = 40. (2) G. tomentella; G. max, 2n = 40.
Dicamba, a synthetic auxin herbicide (FeXapan, Corteva Agriscience, Wilmington, DE), was sprayed on the tetraploid x Glycine crosses to promote pod and embryo formation. Dicamba was sprayed at a concentration of 3 to 20 mg/L. A spray bottle or atomizer was used to achieve good saturation of the pollinated gynoecia and the node to which each was attached.
Multiple B1 embryos and a B1 plant were generated by crossing the doubled susceptible G. max parent with the resistant G. tomentella parent (these plants would have been the F1D plants if standard introgression was used). The cross was then verified via TaqMan assays (Applied Biosystems, Waltham, MA). In one particular example, the TaqMan assays depicted at Table 4 were used to confirm the presence of the chromosomal interval associated with ASR resistance (that is, the interval comprising SEQ ID NO: 1) in the hybrid plants.
It is well known in the art that given the sequence and the SNP allele associated with a given trait (e.g. ASR resistance), one having ordinary skill in the art could develop oligonucleotide primers and use said primers to identify plants carrying the chromosomal interval depicted in SEQ ID NO: 1, or a functional fragment thereof comprising the causative gene. A TAQMAN® assay (e.g. generally a two-step allelic discrimination assay or similar), a KASP™ assay (generally a one-step allelic discrimination assay defined below or similar), or both can be employed to assay the SNPs as disclosed herein. In an exemplary two-step assay, a forward primer, a reverse primer, and two assay probes (or hybridization oligos; herein also referred to as assay primers or assay probes) are employed (see SEQ ID NOs: 22-237 detailed at Tables 5-6). The forward and reverse primers are employed to amplify genetic loci that comprise SNPs that are associated with ASR resistance loci. The particular nucleotides that are present at the SNP positions are then assayed using the assay primers, which in each pair differ from each other with respect to the nucleotides that are present at the SNP position (although it is noted that in any given pair, the primers can differ in their 5′ or 3′ ends without impacting their abilities to differentiate between nucleotides present at the corresponding SNP positions). In some embodiments, each pair of assay primers is differentially labeled with, for example, fluorophores to permit distinguishing between the two assay probes in a single reaction. In some embodiments, the assay primers and the reaction conditions are designed such that an assay primer will only hybridize to the reverse complement of a 100% perfectly matched sequence, thereby permitting identification of which allele(s) is/are present based upon detection of hybridizations.
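The logic of a two-probe allelic discrimination readout can be sketched as below. This is a simplified illustration, not the analysis performed by any particular instrument software: the fixed signal threshold and the normalized signal values are hypothetical, and in practice genotype calls are commonly made by clustering samples on a two-dimensional fluorescence plot rather than by a single cutoff.

```python
def call_genotype(fam_signal, tet_signal, threshold=1.0):
    """Classify a sample from two probe signals (hypothetical fixed threshold).

    fam_signal / tet_signal: normalized end-point fluorescence for the
    FAM-labeled and TET-labeled assay probes, respectively.
    """
    fam_on = fam_signal >= threshold
    tet_on = tet_signal >= threshold
    if fam_on and tet_on:
        return "heterozygous"          # both alleles detected
    if fam_on:
        return "homozygous FAM allele"
    if tet_on:
        return "homozygous TET allele"
    return "no call"                   # neither probe hybridized detectably


# Hypothetical readings for four samples.
for fam, tet in [(2.4, 0.1), (0.2, 2.1), (1.8, 1.7), (0.1, 0.2)]:
    print(fam, tet, "->", call_genotype(fam, tet))
```

Which fluorophore reports the favorable allele is assay-specific, so the sketch reports the call by probe label rather than as favorable or unfavorable.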
Table 5 provides a list of example assay IDs, wherein each assay ID corresponds to a particular SNP position within the chromosomal interval represented by SEQ ID NO: 1. The assays are designed to differentiate between favorable and unfavorable alleles associated with a given SNP position, as indicated.
Table 6 provides a list and sequence of the assay components used in each of the assays listed in Table 5. Particularly, Table 6 lists the sequences of the specific forward and reverse primers as well as the sequence and combination of fluorophores used for each of the assays. In the listing of the assay components, the assay component ID indicates the associated assay ID (Table 5) and the nature of the component (whether it is a probe or a primer). The suffix F2 indicates that the corresponding sequence is for a forward primer, the suffix R1 indicates that the corresponding sequence is for a reverse primer, the suffix FM indicates that the corresponding sequence is for an assay probe having the FAM fluorophore, and the suffix TT indicates that the corresponding sequence is for an assay probe having the TET fluorophore. For example, “S21399A1FM”, “S21399A1TT”, “S21399F2” and “S21399R1” refer, respectively, to the FAM probe, TET probe, forward primer, and reverse primer for Assay ID S21399 used for identification of the allele corresponding to the SNP at position 10832017.
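The naming convention for assay component IDs can be illustrated with a small parser. It reads R1 as the reverse primer, consistent with the example IDs given for Assay ID S21399. The regular expression is an assumption inferred from those four example IDs only (an "S" followed by digits, an optional "A" plus digits infix on probe IDs, then the suffix); other entries of Table 6 may follow a different pattern.

```python
import re

# Suffix meanings as described for Table 6.
SUFFIX_ROLES = {
    "F2": "forward primer",
    "R1": "reverse primer",
    "FM": "FAM probe",
    "TT": "TET probe",
}

# Assumed ID layout, inferred from the S21399 examples only.
COMPONENT_ID = re.compile(r"^(S\d+)(?:A\d+)?(F2|R1|FM|TT)$")


def parse_component_id(component_id):
    """Split an assay component ID into (assay ID, component role)."""
    match = COMPONENT_ID.match(component_id)
    if match is None:
        raise ValueError(f"unrecognized component ID: {component_id!r}")
    assay_id, suffix = match.groups()
    return assay_id, SUFFIX_ROLES[suffix]


for cid in ("S21399A1FM", "S21399A1TT", "S21399F2", "S21399R1"):
    print(cid, "->", parse_component_id(cid))
```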
Introgression of the R gene intervals into G. max can alternatively be achieved using embryo rescue and chemical doubling. Therein, first an infertile hybrid of G. max and G. tomentella must be produced, which is an inefficient process resulting in low numbers of infertile hybrids. Next, embryo rescue must be performed and chemical treatment applied in order to generate amphidiploid shoots. If the amphidiploid plants are fertile, they are used to backcross with G. max for several generations in order to gradually eliminate the perennial Glycine chromosomes.
In one example, a wide cross is performed wherein Elite Syngenta soybean lines (RM 3.7 to 4.8) are used as the females (pollen recipients) and the recited accession line of Glycine tomentella is used as the male or pollen donor. Next, flowers containing anthers at the proper developmental stage are selected from the Glycine plant. New, fully opened, brightly colored flowers hold anthers with mature pollen. The pollen appears as loose, yellow dust. These flowers are removed from the Glycine plant and taken to the soybean plant for pollination. Pollen from the Glycine plants is generally used within 30 minutes of flower removal. Soybean flower buds that are ready for pollination are identified and selected. A soybean flower bud is generally ready when it is larger in size when compared to an immature bud. The sepals of the soybean blossoms are lighter in color and the petals are just beginning to appear. First, a pair of fine-tipped tweezers is used to carefully detach the sepals from the flower bud to expose the outer set of petals. Then, by gently grasping and removing the petals (5 in total) from the flower, the ring of stamens surrounding the pistil is exposed. Since the stigma is receptive to pollen 1 day before the anthers begin shedding pollen, it is important to recognize the developmental stage of “female ready, male not ready”. When pollinating soybean flowers at this developmental stage it is not necessary to emasculate the female flower. Next, the stigma is located on the soybean flower. Then, using 1 male flower, the petals are carefully peeled off to expose the anthers and the pollen grains are gently dusted onto the stigma of the soybean flower. Care is taken not to damage the stigma at any time during this process. Starting the day after pollination, a hormone mixture is sprayed onto the pollinated flower and eventual developing F1 pod one time every day until harvest.
The pollinated flower or pod is saturated with a light mist of the hormone mixture, taking care not to cause the flower/pod to prematurely detach from the plant. The mixture contains 100 mg GA3, 25 mg NAA and 5 mg kinetin/L distilled water. These hormones aid in the retention of the developing pod and in increased pod growth.
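The recited hormone mixture (100 mg GA3, 25 mg NAA and 5 mg kinetin per liter of distilled water) can be scaled to a smaller spray volume with simple proportional arithmetic, sketched below. The 250 mL batch size is a hypothetical example.

```python
# Per-liter amounts as recited for the hormone mixture.
HORMONES_MG_PER_L = {"GA3": 100.0, "NAA": 25.0, "kinetin": 5.0}


def hormone_masses_mg(volume_ml):
    """Return the mass (mg) of each hormone needed for volume_ml of mixture."""
    if volume_ml <= 0:
        raise ValueError("volume must be positive")
    return {
        name: mg_per_l * volume_ml / 1000.0
        for name, mg_per_l in HORMONES_MG_PER_L.items()
    }


# Hypothetical batch size: a 250 mL spray bottle.
for name, mg in hormone_masses_mg(250).items():
    print(f"{name}: {mg} mg")
```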
Pods from wide crosses are harvested at approximately 14 to 16 days post pollination. Before selecting an individual pod to harvest, it is verified that the sepals were removed (to indicate a wide cross attempt) and that the seed size is as expected for a wide cross. Pods are collected and counted according to wide cross combination to determine crossing success. The average crossing success may be approximately 40%. The wide cross pods can contain 1 to 3 seeds but generally 2 seeds are found in each F1 pod.
Harvested pods are collected and brought back to the lab to be sterilized. The pods are first rinsed with 70% EtOH for 2 to 3 minutes and then placed in 10% Clorox bleach for an additional 30 minutes on a platform shaker at approximately 130 RPM. Finally, the pods are rinsed multiple times with sterile water to remove any residual bleach. Embryo isolation can begin immediately following pod sterilization or pods can be stored at 4° C. for up to 24 hours prior to embryo isolation. The sterilized pods are next taken to a laminar flow hood where the embryos can be rescued. Individual pods are placed in a sterile petri dish and opened using a scalpel and forceps. An incision is made along the length of the wide cross pod away from the seed. The pod can then be easily opened to expose the seed. Alternatively, two pair of forceps can be used to separate the pod shell. The seed is carefully removed from the pod and placed in a sterile petri dish under the dissection microscope. Very fine forceps are used to isolate the embryo from the seed. With forceps in one hand, the side of the seed away from the embryo is gently held, with hilum facing up. Using another pair of forceps in the other hand, the seed coat is removed from the side of the seed containing the embryo. The membrane surrounding the embryo is peeled off and the embryo is pushed up from the bottom side. Embryos should be past the globular developmental stage and preferably past the early heart developmental stage (middle to late heart stage, cotyledon stage and early maturation stage embryos are desired). Isolated embryos are transferred to embryo rescue medium. Embryos can be treated to induce chromosome doubling at this time. (See below for chromosome doubling details.) Isolated embryos remain on embryo rescue medium for 21 to 30 days at 24° C. 
Embryos may remain in the dark for the entire incubation on the embryo rescue medium, may begin the incubation in the dark and complete it in the light, or may spend the entire incubation in the light. There is not a callus induction stage in this protocol. Shoots are developed directly from the embryos.
Either colchicine or trifluralin (both, Sigma-Aldrich, St. Louis, MO) can be used to induce chromosome doubling. Ideally, late heart stage wide cross embryos (or larger) are chemically treated to induce chromosome doubling at any time from immediately following isolation up to 1 week post isolation. The doubling agent can be mixed in either solid or liquid medium and applied for several hours or up to a few days. Trifluralin is used at a concentration of 10-40 uM in either solid or liquid media. Additionally, colchicine is used at a concentration of 0.4-1 mg/ml in either solid or liquid media. Following the chemical treatment, the embryos are transferred to fresh embryo rescue medium.
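Because the two doubling agents are recited in different units (micromolar for trifluralin, mg/ml for colchicine), a unit conversion can help when comparing or preparing stocks. The sketch below converts a micromolar concentration to mg/L. The molar mass of trifluralin (about 335.28 g/mol, C13H16F3N3O4) is not stated in the text and is supplied here as an outside value that should be checked against the reagent's certificate of analysis.

```python
# Outside value, not from the specification; verify against the reagent label.
TRIFLURALIN_G_PER_MOL = 335.28


def micromolar_to_mg_per_l(conc_um, molar_mass_g_per_mol):
    """Convert a concentration in micromolar (uM) to mg/L.

    uM * g/mol gives ug/L; dividing by 1000 gives mg/L.
    """
    return conc_um * molar_mass_g_per_mol / 1000.0


low = micromolar_to_mg_per_l(10, TRIFLURALIN_G_PER_MOL)
high = micromolar_to_mg_per_l(40, TRIFLURALIN_G_PER_MOL)
print(f"trifluralin 10-40 uM is roughly {low:.2f}-{high:.2f} mg/L")
```

On this conversion the recited trifluralin range is a few mg/L to a little over 13 mg/L, far below the colchicine range of 0.4-1 mg/ml (400-1000 mg/L) by mass.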
Developing embryos are transferred from rescue medium to germination medium such as Soy ER GSMv2 (i.e. 3.1 g B5 basal salt, Gamborg's, 1 ml B5 vitamins 1000×, 40 g sucrose [C12H22O11], 0.25 g casein hydrolysate, 0.25 ml BAP, 0.75 g MgCl2*6H2O, 20 ml glutamine 25 mg/ml, 0.1 g serine [C3H7NO3], 4 ml Asparagine 25 mg/ml and 0.05 ml of IBA 1 mg/ml) for approximately 3 to 5 weeks in the light at 24° C. Alternatively, developing embryos may be transferred from rescue medium to elongation medium such as Soy E1 0 No TCV (i.e., 4.3 g MS Basal salt Mixture [MSPO1], 5 ml MS iron 200×, 30 g Sucrose [C12H22O11], 1 g MES [C6H13NO4S], 8 g purified agar, 1 ml B5 vitamins 100×, 2 ml glutamine 25 mg/ml, 0.50 ml zeatin riboside, trans isomers 1 mg/ml, 0.1 ml IAA 1 mg/ml, 0.2 ml GA3 5 mg/ml, 1.5 ml timentin 100 mg/ml, 0.3 ml cefotaxime 250 mg/ml, 0.5 ml vancomycin 100 mg/ml) for approximately 3 to 5 weeks in the light at 24° C. Developing shoots may be transferred from media plates to Phytocons (PhytoTechnology Laboratories, Lenexa, KS) containing either germination or elongation medium for further shoot development. Established shoots are moved to soil. Initial plant care is critical for survival of these shoots.
Ploidy analysis is conducted using a flow cytometer. Leaf tissue for ploidy analysis is collected from small shoots either in culture or after establishment in soil. Tissue is collected on dry ice and stored at −80° C. until analysis or collected on wet ice and analyzed the same day. A sample size of 0.5 cm² is sufficient. Samples are prepared according to standard techniques. Each sample set contains an untreated F1 plant (not treated to induce chromosome doubling) as a control.
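One way to interpret such flow cytometry data is to compare the position of a sample's main fluorescence peak with that of the untreated F1 control: a ratio near 2 suggests a doubled chromosome complement, and a ratio near 1 suggests no doubling. The Python sketch below illustrates that comparison. The function name, the peak values, and the tolerance of 0.25 are hypothetical choices made for illustration and are not taken from this disclosure.

```python
def classify_ploidy(sample_peak: float, control_peak: float, tolerance: float = 0.25) -> str:
    """Classify a sample by the ratio of its main peak to the control peak.

    sample_peak and control_peak are fluorescence intensities (arbitrary
    units) of the main nuclear DNA peak. The tolerance is an assumed,
    illustrative cutoff and would need calibration on real instruments.
    """
    if control_peak <= 0:
        raise ValueError("control_peak must be positive")
    ratio = sample_peak / control_peak
    if abs(ratio - 2.0) <= tolerance:
        return "doubled"
    if abs(ratio - 1.0) <= tolerance:
        return "not doubled"
    # Intermediate or extreme ratios may reflect chimeras, debris, or
    # instrument drift and are left for manual review.
    return "ambiguous"


if __name__ == "__main__":
    control = 200.0  # hypothetical peak position of the untreated F1 control
    for peak in (205.0, 410.0, 300.0):
        print(peak, classify_ploidy(peak, control))
```

Running every sample set against its own untreated control, as described above, keeps the comparison independent of day-to-day variation in staining and instrument settings.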
The method described above results in a significantly higher wide cross success rate than is reported in the literature. Further, no emasculation of female flowers is required, which saves time and reduces the risk of damage to the stigma.
Further genotyping of the G. tomentella chromosomal interval (SEQ ID NO: 1) led to the discovery of three potential causative genes for ASR resistance (herein also referred to as the candidate R-genes) located on chromosome 3 within the disclosed interval. Associations between each of the candidate genes and ASR resistance were validated, and the efficacy of each of the genes in conferring ASR resistance was assessed.
One of the genes, herein also referred to as “GtoRG30”, encodes an R-gene with a TNL motif (SEQ ID NOS: 2-4). The sequence of the polypeptide encoded by the R-gene is depicted at SEQ ID NO: 5. The native promoter for this gene is provided at SEQ ID NO: 15. Details regarding the validation and efficacy of the R-gene are provided at Examples 6, 8 and
Additionally, two other R-genes encoding a CNL motif were identified, as depicted at SEQ ID NOs: 6 and 7. Both of these R-genes were validated.
It is contemplated that the R-genes of the present invention, or variants, fragments, homologs or orthologs thereof, can be employed in a transgenic, gene editing, or breeding method utilizing embryo rescue, tetraploid soy, or other introgression methods, as described above, to generate plants having increased resistance to fungal pathogens including ASR. Additionally, or optionally, nucleic acid molecules comprising the R-gene coding sequence of the present invention (e.g., any of SEQ ID NOS: 2-4, 11-12), or nucleic acid molecules with a nucleotide sequence substantially identical to the R-gene coding sequence of the present invention (e.g., SEQ ID NOs: 2-4, 11-12), or nucleic acid molecules comprising a nucleotide sequence encoding a polypeptide having the amino acid sequence of SEQ ID NO: 5, or a polypeptide having an amino acid sequence substantially identical to SEQ ID NO: 5, can be employed in a transgenic, gene editing, or breeding method utilizing embryo rescue, tetraploid soy, or other introgression methods, as described above to generate pathogen resistant (e.g., ASR resistant) plants.
Given the sequence of the R-gene and the associated trait (e.g., ASR resistance), oligonucleotide primers can be developed and used to identify plants carrying the gene with the nucleotide sequence depicted in SEQ ID NOs: 2-4, 11-12. A TAQMAN® assay (e.g., generally a two-step allelic discrimination assay or similar), a KASP™ assay (generally a one-step allelic discrimination assay defined below or similar), or both can be employed to assay the genes. In an exemplary two-step assay, a primer pair comprising a forward primer and a reverse primer is employed to amplify the gene, or a functional part thereof, associated with conferring ASR resistance. Further, an assay probe (or hybridization oligo) can be employed with the primer pair to detect a target sequence present in the amplified gene. The probe may be labeled with, for example, fluorophores to permit easy detection. Further, the probes may include a minor groove binder (MGB) moiety at the 3′ end that increases the melting temperature (Tm) of the probe and stabilizes probe-target hybrids. This allows the length of the probe to be shortened while still providing sequence discrimination and flexibility to accommodate the target. Further, the probe can include a nonfluorescent quencher (NFQ) to absorb (quench) signal from the fluorescent dye label at the other end of the probe, reducing background noise and improving sensitivity of the probe. In some embodiments, the assay primers and the reaction conditions are designed such that an assay primer will only hybridize to the reverse complement of a 100% perfectly matched sequence, thereby permitting detection of the gene based upon detection of hybridizations.
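The perfect-match requirement described above can be illustrated with a small in-silico check. The Python sketch below looks for an exact forward-primer site and an exact reverse-primer site (as the reverse complement of the reverse primer) on a template strand, returns the amplicon between them, and tests whether a probe matches within that amplicon. All sequences shown are short toy strings invented for illustration; they are not the primers, probes, or genes of any SEQ ID NO in this disclosure, and the function names are hypothetical.

```python
from typing import Optional

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def find_amplicon(template: str, forward: str, reverse: str) -> Optional[str]:
    """Return the amplicon if both primers match the template perfectly, else None."""
    template = template.upper()
    forward = forward.upper()
    fwd_pos = template.find(forward)
    if fwd_pos == -1:
        return None
    # The reverse primer anneals to the top strand, so its binding site
    # reads as the primer's reverse complement on the template as written.
    rev_site = reverse_complement(reverse)
    rev_pos = template.find(rev_site, fwd_pos + len(forward))
    if rev_pos == -1:
        return None
    return template[fwd_pos:rev_pos + len(rev_site)]


def probe_hybridizes(amplicon: str, probe: str) -> bool:
    """True if the probe matches either strand of the amplicon exactly."""
    probe = probe.upper()
    return probe in amplicon or reverse_complement(probe) in amplicon


# Toy sequences for illustration only (not from the Sequence Listing).
TEMPLATE = "TTGACCATGGCTAGCTAGGATCCGATTACAGGCTTAAGCCGTA"
FORWARD = "ACCATGGC"
REVERSE = "CTTAAGCC"
PROBE = "GGATCCGA"

if __name__ == "__main__":
    amplicon = find_amplicon(TEMPLATE, FORWARD, REVERSE)
    print(amplicon)
    print(amplicon is not None and probe_hybridizes(amplicon, PROBE))
```

Exact string matching is a deliberate simplification: real assay design also weighs melting temperature, secondary structure, and reaction conditions, none of which are modeled here.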
As a non-limiting example, presence of a nucleic acid molecule comprising the R-gene sequence of any of SEQ ID NOs: 2-4, 11-12 or a sequence having at least 90% sequence identity, at least 95% sequence identity, or at least 99% sequence identity to any of SEQ ID NO: 2-4, 11-12; or a nucleotide sequence encoding the protein of SEQ ID NO: 5 or encoding a protein having at least 90% sequence identity, at least 95% sequence identity, or at least 99% sequence identity to SEQ ID NO: 5, can be detected by generating an amplicon using the primer pair of SEQ ID NOs: 8-9 and/or detected using the probe sequence of SEQ ID NO: 10, the probe comprising a FAM fluorophore at the 5′-end and an MGB and NFQ moiety at the 3′-end. In one example, presence of the R-gene, or a functional part thereof, in a disease resistant plant (e.g., ASR resistant soybean plant) may be identified by isolating nucleic acid molecules from said plant and generating an amplicon comprising at least a portion of the R-gene using the above-mentioned primers and/or probes.
In still further examples, presence of a nucleic acid molecule comprising the R-gene sequence of any of SEQ ID NOs: 2-4, 11-12 or a sequence having at least 90% sequence identity, at least 95% sequence identity, or at least 99% sequence identity to any of SEQ ID NO: 2-4, 11-12 or a nucleotide sequence encoding the protein of SEQ ID NO: 5 or encoding a protein having at least 90% sequence identity, at least 95% sequence identity, or at least 99% sequence identity to SEQ ID NO: 5, can be determined by detecting for the presence of the R-protein encoded by the R-gene. For example, presence of the R-gene, or a functional part thereof, in a disease resistant plant (e.g., ASR resistant soybean plant) can be determined by isolating proteins from said plant and detecting the presence of a protein encoded by the R-gene (such as a protein having the polypeptide sequence of SEQ ID NO: 5, or at least 90% sequence identity to SEQ ID NO: 5) using commonly known protein detection assays (e.g., Western blot, ELISA, radioimmunoassay, etc.).
In still further examples, presence of a nucleic acid molecule comprising the R-gene sequence of any of SEQ ID NOs: 2-4, 11-12 or a sequence having at least 90% sequence identity, at least 95% sequence identity, or at least 99% sequence identity to any of SEQ ID NO: 2-4, 11-12; or a nucleotide sequence encoding the protein of SEQ ID NO: 5 or encoding a protein having at least 90% sequence identity, at least 95% sequence identity, or at least 99% sequence identity to SEQ ID NO: 5, can be detected by generating an amplicon using a primer pair comprising a forward primer and a reverse primer selected from the primers of Tables 5-6, and detected using a probe selected from the probes of Tables 5-6.
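The sequence identity thresholds recited above (at least 90%, 95%, or 99%) can be made concrete with a short calculation. The Python sketch below computes percent identity over two sequences that have already been aligned, with `-` marking gaps, and reports which thresholds are met. It assumes identity is counted as matching columns divided by alignment length, which is only one of several conventions in use; the alignment step itself is not shown, and the function names and example strings are hypothetical.

```python
from typing import List, Sequence


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over two pre-aligned, equal-length sequences.

    Gap characters ('-') never count as matches. The denominator is the
    full alignment length (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def thresholds_met(identity: float, thresholds: Sequence[float] = (90.0, 95.0, 99.0)) -> List[float]:
    """Return the identity thresholds that the given value satisfies."""
    return [t for t in thresholds if identity >= t]


if __name__ == "__main__":
    reference = "ATGGCTAGCTAGGATCCGAT"  # toy sequence, not from the Sequence Listing
    variant = "ATGGCTAGCTAGGATCCGAA"    # differs at the final position
    identity = percent_identity(reference, variant)
    print(identity, thresholds_met(identity))
```

For full-length genes or proteins, the aligned inputs would normally come from an established alignment tool rather than being prepared by hand.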
DNA constructs, particularly vectors, were generated comprising the R-gene operably coupled to a heterologous promoter. The nucleotide sequence of the R-gene used in the vectors comprised genomic DNA sequences or coding sequences for the R-gene, herein also referred to as “GtoRG30” and previously described in Example 4 (SEQ ID NOS: 2-4, 11-12). The DNA constructs comprise the R-gene coding sequence operably linked to a heterologous promoter capable of enabling expression of the R-gene in a plant cell. In a first set of constructs, transcription of R-gene GtoRG30 was driven by Medicago truncatula promoter prMt12344. In a second set of constructs, transcription of R-gene GtoRG30 was driven by Medicago truncatula promoter prMt51186. In a third set of constructs, transcription of R-gene GtoRG30 was driven by a native promoter (RG30_promoter). Binary vectors created comprising R-gene GtoRG30 are listed in Table 7.
a) 25845 Vector Construction for R-Gene Comprising SEQ ID NO: 3
Functional description: A binary vector for soybean transformation with ALS selection, harboring a soy rust resistance candidate gene, gGtoRG30-02, from chr3b of PI_505267 (G. tomentella), which encodes a protein containing toll/interleukin receptor-1 (TIR), nucleotide-binding site (NBS), and leucine rich repeat (LRR) domains. This resistance gene includes its native 5′UTR and 3′UTR and the coding sequence cGtoRG30-01. The first native intron was replaced with an Arabidopsis intron. This resistance gene is driven by the Medicago truncatula promoter, prMt12344-02, and the corresponding terminator, tMt12344-01. The vector also contains the ALS selection cassette prGmEF-05/cNtALS-01/tGmEPSPS-04.
Features:
oVS1. Start: 19359 End: 19763. Origin of replication in Agrobacterium tumefaciens host.
cRepA. Start: 18243 End: 19316. cRepA-01 with A to G at nt735.
cVirG. Start: 17488 End: 18213. virG (putative) from pAD1289 with TTG start codon. virGN54D came from pAD1289 described in Hansen et al. 1994, PNAS 91:7603-7607.
prVirG. Start: 17283 End: 17413. virG promoter (Winans J. Bact. 172:2433-38 (1990)) composed of two promoter elements, one responsive to acetosyringone and phosphate-starvation (bp 45 to 83) and another to medium acidification (86 to 128).
cSpec. Start: 16400 End: 17188. Also called aadA; gene encoding the enzyme aminoglycoside 3′adenyltransferase that confers resistance to spectinomycin and streptomycin for maintenance of the vector in E. coli and Agrobacterium.
bNLB. Start: 16026 End: 16050. 25 bp Left border repeat region of T-DNA of Agrobacterium tumefaciens nopaline ti-plasmid.
bNLB. Start: 15991 End: 16120. Left border region of T-DNA of Agrobacterium tumefaciens nopaline ti-plasmid.
xTAG. Start: 15943 End: 15982. 40 bp site for plant insert intactness testing and to stop readthrough ORFs.
xSTOPS Start: 15931 End: 15942. 6-frame stop to minimize unintended ORF read-through.
xSTOPS Start: 15855 End: 15866. 6-frame stop to minimize unintended ORF read-through.
tGmEPSPS Start: 15057 End: 15854. An EPSPS terminator from Glycine max.
cNtALS Start: 13056 End: 15050. The NtALS DNA fragment encodes an Acetolactate synthase (ALS) double mutant (P191A, W568L) from Nicotiana tabacum. It was codon-optimized for soybean expression.
u5GmEF Start: 13037 End: 13047. Second 5′ UTR of the soybean elongation factor (EF) gene.
iGmEF Start: 12128 End: 13036. The first intron of the soybean elongation factor (EF) gene. iGmEF-01 with an internal BamHI site and a 3′ end unintended ORF removed.
u5GmEF Start: 12065 End: 12127. First 5′ UTR of the soybean elongation factor (EF) gene.
start Start: 12065 End: 12065. Transcription start site.
prGmEF Start: 10982 End: 13047. Translation elongation factor EF-1 alpha/Tu promoter, including the first intron and neighboring UTR, from soybean (williams 82).
xSTOPS Start: 10970 End: 10981. 6-frame stop to minimize unintended ORF read-through.
tMt12344 Start: 9956 End: 10962. The terminator based on the Medicago truncatula gene. It consists of the 3′-UTR and 3′-non-transcribed sequence.
gGtoRG30-02 Start: 2231 End: 9955. A genomic fragment containing a soy R-gene (SEQ ID NO: 3) along with its native 5′UTR and 3′UTR that encodes a protein containing toll/interleukin receptor-1 (TIR), nucleotide-binding site (NBS), and leucine rich repeat (LRR) domains. The first native intron was replaced with the Arabidopsis intron, iAtBAF60-01. The genomic fragment comprises the following components: RG30_5′UTR Start: 2231; End: 2730; RG30_3′UTR Start: 9318 End: 9955; intron4 Start: 7094 End: 8693; intron3 Start: 5401 End: 6262; intron2 Start: 4877 End: 5124; iAtBAF60-01 Start: 3357 End: 3765; Intron of Arabidopsis thaliana BAF60 homolog (CHC1 by the Chromatin database) inserted in GUS coding sequence to prevent bacterial expression; cGtoRG30-01 Start: 2731 End: 9317; the CDS from the R-gene. The coding sequence comprises (with reference to SEQ ID NO: 3):
start Start: 2043 End: 2043 The transcription start site based on cDNA/gDNA alignment.
prMt12344 Start: 217 End: 2218 The promoter from the Medicago truncatula gene. It consists of 5′-non-transcribed sequence and the 5′ UTR.
xSTOPS Start: 184 End: 195 6-frame stop to minimize unintended ORF read-through
xTAG Start: 144 End: 183 40 bp site for plant insert intactness testing and to stop readthrough ORFs.
bNRB-01 Start: 101 End: 125 Right Border Repeat
bNRB-04 Start: 4 End: 143 Right border region of T-DNA of Agrobacterium tumefaciens nopaline ti-plasmid.
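Feature coordinates such as those listed above for vector 25845 can be checked mechanically. The Python sketch below parses lines of the form `Name Start: X End: Y`, computes feature lengths, and verifies that the 5′UTR, coding sequence, and 3′UTR tile the genomic fragment gGtoRG30-02 without gaps or overlaps. The regular expression, class, and function names are hypothetical helpers written for this illustration; only the coordinates are taken from the list above.

```python
import re
from typing import List, NamedTuple, Optional

# Tolerates an optional period after the name, a missing space before
# "Start:", a semicolon after the start value, and thousands separators.
FEATURE_RE = re.compile(
    r"^\s*(?P<name>.+?)\.?\s*Start:\s*(?P<start>[\d,]+);?\s*End:\s*(?P<end>[\d,]+)"
)


class Feature(NamedTuple):
    name: str
    start: int
    end: int

    @property
    def length(self) -> int:
        """Inclusive length; abs() handles features listed end-first."""
        return abs(self.end - self.start) + 1


def _to_int(text: str) -> int:
    return int(text.replace(",", ""))


def parse_feature(line: str) -> Optional[Feature]:
    """Parse one feature line; return None if a coordinate is missing."""
    match = FEATURE_RE.match(line)
    if match is None:
        return None
    return Feature(match["name"].strip(), _to_int(match["start"]), _to_int(match["end"]))


def tiles_exactly(parent: Feature, parts: List[Feature]) -> bool:
    """True if parts cover parent end to end with no gaps or overlaps."""
    ordered = sorted(parts, key=lambda f: f.start)
    if ordered[0].start != parent.start or ordered[-1].end != parent.end:
        return False
    return all(a.end + 1 == b.start for a, b in zip(ordered, ordered[1:]))


# Coordinates copied from the vector 25845 feature list above.
LINES = [
    "gGtoRG30-02 Start: 2231 End: 9955.",
    "RG30_5′UTR Start: 2231; End: 2730",
    "cGtoRG30-01 Start: 2731 End: 9317",
    "RG30_3′UTR Start: 9318 End: 9955",
]

if __name__ == "__main__":
    parent, *parts = [parse_feature(line) for line in LINES]
    print(parent.length, tiles_exactly(parent, parts))
```

The same parser can be pointed at the other feature lists in this example to flag entries with a missing or reversed coordinate for manual review.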
b) 25899 Vector Construction for R-Gene Comprising SEQ ID NO: 3
oCOLE Start: 20,955 End: 21,761 The ColE1 origin of replication functional in E. coli derived from pUC19.
oVS1 Start: 19,873 End: origin of replication in Agrobacterium tumefaciens host.
cRepA Start: 18,757 End: 19,830 cRepA with A to G at nt735.
cVirG Start: 18,002 End: 18,727 virG (putative) from pAD1289 with TTG start codon. virGN54D came from pAD1289 described in Hansen et al. 1994, PNAS 91:7603-7607.
prVirG Start: 17,797 End: 17,927 virG promoter (Winans J. Bact. 172:2433-38 (1990)) composed of two promoter elements, one responsive to acetosyringone and phosphate-starvation (bp 45 to 83) and another to medium acidification (86 to 128).
cSpec Start: 16,914 End: 17,702 Also called aadA; gene encoding the enzyme aminoglycoside 3′adenyltransferase that confers resistance to spectinomycin and streptomycin for maintenance of the vector in E. coli and Agrobacterium.
bNLB Start: 16,540 End: 16,564 25 bp Left border repeat region of T-DNA of Agrobacterium tumefaciens nopaline ti-plasmid.
bNLB Start: 16,505 End: 16,634 Left border region of T-DNA of Agrobacterium tumefaciens nopaline ti-plasmid.
xTAG Start: 16,457 End: 16,496 40 bp site for plant insert intactness testing and to stop readthrough ORFs.
xSTOPS Start: 16,445 End: 16,456 6-frame stop to minimize unintended ORF read-through.
xSTOPS Start: 16,369 End: 16,380 6-frame stop to minimize unintended ORF read-through.
tGmEPSPS Start: 15,571 End: 16,368 An EPSPS terminator from Glycine max. Removal of 6 frame stops at 3′ end of terminator.
cNtALS Start: 13,570 End: 15,564 The NtALS DNA fragment encodes an Acetolactate synthase (ALS) double mutant (P191A, W568L) from Nicotiana tabacum. It was codon-optimized for soybean expression and synthesized by GeneArt.
u5GmEF Start: 13,551 End: 13,561 Second 5′ UTR of the soybean elongation factor (EF) gene.
iGmEF Start: 12,642 End: 13,550 The first intron of the soybean elongation factor (EF) gene.
u5GmEF Start: 12,579 End: 12,641 First 5′ UTR of the soybean elongation factor (EF) gene.
start Start: 12,579 End: 12,579 transcription start site.
prGmEF Start: 11,496 End: 13,561 Translation elongation factor EF-1 alpha/Tu promoter, including the first intron and neighboring UTR, from soybean (williams 82).
xSTOPS Start: 11,484 End: 11,495 6-frame stop to minimize unintended ORF read-through.
xSTOPS Start: 11,465 End: 11,476 6-frame stop to minimize unintended ORF read-through.
tMt51186 Start: 10,465 End: 11,464 Modified terminator of Medicago truncatula gene.
gGtoRG30-02 (SEQ ID NO: 3) Start: 2,734 End: 10,458; A genomic fragment containing a soy R-gene along with its native 5′UTR and 3′UTR that encodes a protein containing toll/interleukin receptor-1 (TIR), nucleotide-binding site (NBS), and leucine rich repeat (LRR) domains. The first native intron was replaced with the Arabidopsis intron, iAtBAF60-01. The genomic fragment comprises the following components: RG30_5′UTR Start: 2,734 End: 3,233; RG30_3′UTR Start: 9,821 End: 10,458; intron4 Start: 7,597 End: 9,196; intron3 Start: 5,904 End: 6,765; intron2 Start: 5,380 End: 5,627; iAtBAF60-01 Start: 3,860 End: 4,268; intron of Arabidopsis thaliana BAF60 homolog (CHC1 by the Chromatin database) inserted in GUS coding sequence to prevent bacterial expression; cGtoRG30-01 Start: 3,234 End: 9,820; the CDS of the R-gene.
iMt51186 Start: 2,450 End: 2,700 The first intron of the Medicago truncatula gene.
start Start: 2,313 End: 2,313 Transcription start based on cDNA/gDNA alignment.
prMt51186 Start: 217 End: 2,721 The promoter from the Medicago truncatula gene.
xSTOPS Start: 184 End: 195 6-frame stop to minimize unintended ORF read-through.
xTAG Start: 144 End: 183 40 bp site for plant insert intactness testing and to stop readthrough ORFs.
bNRB Start: 101 End: 125 Right Border Repeat
bNRB-04 Start: 4 End: 143 Right border region of T-DNA of Agrobacterium tumefaciens nopaline ti-plasmid. Differs from bNRB-03 by 20 bp at 5′ end.
c) 25950 Vector Construction for R-Gene Comprising SEQ ID NO: 11
Vector type: Binary Vector
Construct Size (bp): 20,738
Functional description: A binary vector for soybean transformation with the ALS selection harboring a soy rust resistance candidate gene, gGtoRG30-01 encoding a protein containing toll/interleukin receptor-1 (TIR), nucleotide-binding site (NBS), and leucine rich repeat (LRR) domains. The first native intron was replaced with Arabidopsis intron. Vector also contains the ALS selection cassette prGmEF-05/cNtALS-01/tGmEPSPS-04.
Features:
xTAG Start: 144 End: 183 40 bp site for plant insert intactness testing and to stop readthrough ORFs.
xTAG Start: 15,422 End: 15,461 40 bp site for plant insert intactness testing and to stop readthrough ORFs.
xSTOPS Start: 15,410 End: 15,421 6-frame stop to minimize unintended ORF read-through
xSTOPS Start: 15,334 End: 15,345 6-frame stop to minimize unintended ORF read-through
xSTOPS Start: 10,449 End: 10,460 6-frame stop to minimize unintended ORF read-through
xSTOPS Start: 184 End: 195 6-frame stop to minimize unintended ORF read-through
prGmEF Start: 12,516 End: 12,526 Translation elongation factor EF-1 alpha/Tu promoter, including the first intron and neighboring UTR, from soybean (williams 82).
prGmEF Start: 11,544 End: 11,606 Translation elongation factor EF-1 alpha/Tu promoter, including the first intron and neighboring UTR, from soybean (williams 82).
tGmEPSPS Start: 15,536 End: 15,333 An EPSPS terminator from Glycine max.
gGtoRG30-01 Start: 8,804 End: 10,441 A genomic fragment containing a soy rust resistance candidate gene that encodes a protein containing toll/interleukin receptor-1 (TIR), nucleotide-binding site (NBS), and leucine rich repeat (LRR) domains. The first native intron was replaced with Arabidopsis intron, iAtBAF60-01. The coding sequence comprises (with reference to SEQ ID NO: 11):
prVirG Start: 16,762 End: 16,892 virG promoter (Winans J. Bact. 172:2433-38 (1990)) composed of two promoter elements, one responsive to acetosyringone and phosphate-starvation (bp 45 to 83) and another to medium acidification (86 to 128)
prGmEF Start: 10,461 End: 12,526 Translation elongation factor EF-1 alpha/Tu promoter, including the first intron and neighboring UTR, from soybean (williams 82).
oVS1 Start: 18,838 End: 19,242 origin of replication in Agrobacterium tumefaciens host
oCOLE Start: 20,726 End: 19,920 The ColE1 origin of replication functional in E. coli
prGmEF-05 Start: 11,607 End: 12,515 Translation elongation factor EF-1 alpha/Tu promoter, including the first intron and neighboring UTR, from soybean (williams 82).
iAtBAF60 Start: 2,843 End: 3,251 intron of Arabidopsis thaliana BAF60 homolog (CHC1 by the Chromatin database) inserted to prevent bacterial expression
cVirG Start: 16,967 End: 17,692 virG from pAD1289 with TTG start codon described in Hansen et al. 1994, PNAS 91:7603-7607
cSpec Start: 15,879 End: 16,667 gene encoding the enzyme aminoglycoside 3′adenyltransferase that confers resistance to spectinomycin and streptomycin for maintenance of the vector in E. coli and Agrobacterium.
cRepA Start: 17,722 End: 18,795 cRepA-01 with A to G at nt735
cNtALS Start: 12,535 End: 14,529 encodes an Acetolactate synthase double mutant from Nicotiana tabacum that was codon-optimized for soybean expression
bNRB Start: 125 End: 101 Right border region of T-DNA of Agrobacterium tumefaciens nopaline ti-plasmid
bNLB Start: 15,529 End: 15,505 Left border repeat region of T-DNA of Agrobacterium tumefaciens nopaline ti-plasmid
Each of the constructs was transformed into soybean cells using known methods of plant transformation (e.g., via Agrobacterium mediated transformation) to create primary soybean events.
In the phenotyping work, both symptom evaluation and molecular assays for rust resistance or susceptibility ratings have been used. The symptom evaluation is a modified version of a rust rating scale from Burdon and Speer (Euphytica, 33: 891-896, 1984; also T A G 1984). The molecular assay is based on a fungal housekeeping gene, β-tubulin, wherein the probe for β-tubulin targets a specific region in soy rust but not in other pathogens or plant species. Further, the molecular assay was validated by coupling with phenotypic symptomatic observations, as shown in
Leaves from primary events were placed in a petri dish on moist paper towel and then inoculated with a soybean rust spore suspension from three different rust populations (RTP1; BRO1—Brazil; BRO3—Brazil). Leaves from plants that had the same genetic background but did not have the transgene served as negative controls. After 14 days, leaves from both events and the control were evaluated for resistance to soybean rust. As shown in
The results in
Quantitative measurements taken using fungal β-tubulin transcripts are consistent with these phenotypic observations, as shown in the graph of
In this validation, constructs 25845 and 25950 conferred strong (>90%), broad-spectrum resistance against all rust populations tested.
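A quantitative readout of this kind is often expressed as the reduction in fungal signal in an event relative to the non-transgenic control. The Python sketch below shows one common way to do so, the comparative Ct (2^-ΔΔCt) calculation, normalizing fungal β-tubulin to a plant reference gene. This disclosure does not state how the resistance percentage was computed, so the method, the use of a plant reference gene, the assumption of 100% amplification efficiency, and all Ct values below are assumptions made for illustration only.

```python
def relative_fungal_load(
    ct_fungal_event: float,
    ct_ref_event: float,
    ct_fungal_control: float,
    ct_ref_control: float,
) -> float:
    """Fungal signal in an event relative to the control (2^-ddCt).

    Assumes 100% amplification efficiency for both assays and a plant
    reference gene for normalization; both are illustrative assumptions.
    """
    delta_event = ct_fungal_event - ct_ref_event
    delta_control = ct_fungal_control - ct_ref_control
    return 2.0 ** -(delta_event - delta_control)


def percent_reduction(relative_load: float) -> float:
    """Percent reduction in fungal signal compared with the control."""
    return (1.0 - relative_load) * 100.0


if __name__ == "__main__":
    # Hypothetical Ct values, not measurements from this disclosure.
    load = relative_fungal_load(
        ct_fungal_event=30.0, ct_ref_event=20.0,
        ct_fungal_control=25.0, ct_ref_control=20.0,
    )
    print(load, percent_reduction(load))
```

In this hypothetical case the event's fungal signal appears five cycles later than the control's after normalization, which corresponds to roughly a 32-fold reduction.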
Additional vector constructs are also provided by way of example.
d) 25992 Vector Construction for R-Gene Comprising SEQ ID NO: 4
oCOLE Start: 19,303 End: 20,109. The ColE1 origin of replication functional in E. coli.
oVS1 Start: 18,221 End: 18,625. Origin of replication in Agrobacterium tumefaciens host.
cRepA Start: 17,105 End: 18,178. cRepA-01 with A to G at nt735.
cVirG Start: 16,350 End: 17,075. virG (putative) from pAD1289 with TTG start codon. virGN54D came from pAD1289 described in Hansen et al. 1994, PNAS 91:7603-7607.
prVirG Start: 16,145 End: 16,275. virG promoter (Winans J. Bact. 172:2433-38 (1990)) composed of two promoter elements, one responsive to acetosyringone and phosphate-starvation (bp 45 to 83) and another to medium acidification (86 to 128).
cSpec Start: 15,262 End: 16,050. Also called aadA; gene encoding the enzyme aminoglycoside 3′adenyltransferase that confers resistance to spectinomycin and streptomycin for maintenance of the vector in E. coli and Agrobacterium.
bNLB Start: 14,888 End: 14,912 25 bp. Left border repeat region of T-DNA of Agrobacterium tumefaciens nopaline ti-plasmid.
bNLB Start: 14,853 End: 14,982. Left border region of T-DNA of Agrobacterium tumefaciens nopaline ti-plasmid.
xTAG Start: 14,805 End: 14,844. 40 bp site for plant insert intactness testing and to stop readthrough ORFs.
xSTOPS Start: 14,793 End: 14,804. 6-frame stop to minimize unintended ORF read-through.
xSTOPS Start: 14,717 End: 14,728. 6-frame stop to minimize unintended ORF read-through.
tGmEPSPS Start: 13,919 End: 14,716. Modified version of tGMEPS-02; an EPSPS terminator from Glycine max.
cNtALS Start: 11,918 End: 13,912. The NtALS DNA fragment encodes an Acetolactate synthase double mutant (P191A, W568L) from Nicotiana tabacum. It was codon-optimized for soybean expression.
u5GmEF Start: 11,899 End: 11,909. Second 5′ UTR of the soybean elongation factor (EF) gene.
iGmEF Start: 10,990 End: 11,898. The first intron of the soybean elongation factor (EF) gene.
u5GmEF Start: 10,927 End: 10,989. First 5′ UTR of the soybean elongation factor (EF) gene.
start Start: 10,927 End: 10,927. Transcription start site
prGmEF Start: 9,844 End: 11,909. Translation elongation factor EF-1 alpha/Tu promoter, including the first intron and neighboring UTR, from soybean (williams 82).
xSTOPS Start: 9,832 End: 9,843. 6-frame stop to minimize unintended ORF read-through.
tMt12344 Start: 8,818 End: 9,824. The terminator based on the Medicago truncatula gene. It consists of the 3′-UTR and 3′-non-transcribed sequence.
iAtBAF60 Start: 2,857 End: 3,265. Intron of Arabidopsis thaliana BAF60 homolog inserted in GUS coding sequence to prevent bacterial expression.
cGtoRG30 (SEQ ID NO: 4) Start: 2,231 End: 8,817. A CDS of a soy R-gene that encodes a protein containing toll/interleukin receptor-1 (TIR), nucleotide-binding site (NBS), and leucine rich repeat (LRR) domains. This CDS is from an R-gene on Chromosome 3 of G. tomentella PI_505267.
start Start: 2,043 End: 2,043. The transcription start site based on cDNA/gDNA alignment.
prMt12344 Start: 217 End: 2,218. The promoter from the Medicago truncatula gene. It consists of 5′-non-transcribed sequence and the 5′ UTR.
xSTOPS Start: 184 End: 195. 6-frame stop to minimize unintended ORF read-through.
xTAG Start: 144 End: 183. 40 bp site for plant insert intactness testing and to stop readthrough ORFs. Typically, by agro RB. 1 bp different than −01.
bNRB Start: 101 End: 125. Right Border Repeat.
bNRB Start: 4 End: 143. Right border region of T-DNA of Agrobacterium tumefaciens nopaline ti-plasmid.
e) 26015 Vector Construction for R-Gene Comprising SEQ ID NO: 4
oCOLE Start: 19,817 End: 20,623. The ColE1 origin of replication functional in E. coli.
oVS1 Start: 18,735 End: 19,139. Origin of replication in Agrobacterium tumefaciens host.
cRepA Start: 17,619 End: 18,692. cRepA-01 with A to G at nt735.
cVirG Start: 16,864 End: 17,589. virG (putative) from pAD1289 with TTG start codon. virGN54D came from pAD1289 described in Hansen et al. 1994, PNAS 91:7603-7607.
prVirG Start: 16,659 End: 16,789. virG promoter (Winans J. Bact. 172:2433-38 (1990)) composed of two promoter elements, one responsive to acetosyringone and phosphate-starvation (bp 45 to 83) and another to medium acidification (86 to 128).
cSpec Start: 15,776 End: 16,564. Also called aadA; gene encoding the enzyme aminoglycoside 3′adenyltransferase that confers resistance to spectinomycin and streptomycin for maintenance of the vector in E. coli and Agrobacterium.
bNLB Start: 15,402 End: 15,426 25 bp Left border repeat region of T-DNA of Agrobacterium tumefaciens nopaline ti-plasmid
bNLB Start: 15,367 End: 15,496 Left border region of T-DNA of Agrobacterium tumefaciens nopaline ti-plasmid.
xTAG Start: 15,319 End: 15,358. 40 bp site for plant insert intactness testing and to stop readthrough ORFs.
xSTOPS-01 Start: 15,307 End: 15,318. 6-frame stop to minimize unintended ORF read-through.
xSTOPS Start: 15,231 End: 15,242. 6-frame stop to minimize unintended ORF read-through
tGmEPSPS Start: 14,433 End: 15,230. EPSPS terminator from Glycine max.
cNtALS Start: 12,432 End: 14,426. The NtALS DNA fragment encodes an Acetolactate synthase double mutant (P191A, W568L) from Nicotiana tabacum. It was codon-optimized for soybean expression.
u5GmEF Start: 12,413 End: 12,423. Second 5′ UTR of the soybean elongation factor (EF) gene.
iGmEF Start: 11,504 End: 12,412. The first intron of the soybean elongation factor (EF) gene.
u5GmEF Start: 11,441 End: 11,503. First 5′ UTR of the soybean elongation factor (EF) gene.
start Start: 11,441 End: 11,441. Transcription start site
prGmEF Start: 10,358 End: 12,423. Translation elongation factor EF-1 alpha/Tu promoter, including the first intron and neighboring UTR, from soybean (williams 82).
xSTOPS Start: 10,346 End: 10,357. 6-frame stop to minimize unintended ORF read-through.
xSTOPS Start: 10,327 End: 10,338. 6-frame stop to minimize unintended ORF read-through.
tMt51186 Start: 9,327 End: 10,326. Modified terminator of Medicago truncatula gene.
iAtBAF60 Start: 3,360 End: 3,768. Intron of Arabidopsis thaliana BAF60 homolog (CHC1 by the Chromatin database) inserted in GUS coding sequence to prevent bacterial expression.
cGtoRG30-01 (SEQ ID NO: 4) Start: 2,734 End: 9,320. A CDS of a soy R-gene that encodes a protein containing toll/interleukin receptor-1 (TIR), nucleotide-binding site (NBS), and leucine rich repeat (LRR) domains. This CDS is from an R-gene in Chromosome 3 of G. tomentella PI_505267.
iMt51186 Start: 2,450 End: 2,700 The first intron of the Medicago truncatula gene.
iMt51186 Start: 2,450 End: 2,574 Truncated version of the first intron of the Medicago truncatula gene.
start Start: 2,313 End: 2,313 Transcription start based on cDNA/gDNA alignment.
prMt51186 Start: 217 End: 2,721 The promoter from the Medicago truncatula gene.
xSTOPS Start: 184 End: 195 6-frame stop to minimize unintended ORF read-through
xTAG Start: 144 End: 183 40 bp site for plant insert intactness testing and to stop readthrough ORFs.
bNRB Start: 101 End: 125 Right Border Repeat
bNRB Start: 4 End: 143 Right border region of T-DNA of Agrobacterium tumefaciens nopaline ti-plasmid.
The above examples clearly illustrate the advantages of various embodiments of the invention. Although the present invention has been described with reference to specific details of certain embodiments thereof, it is not intended that such details should be regarded as limitations upon the scope of the invention except as and to the extent that they are included in the accompanying claims.
Throughout this application, various patents, patent publications and non-patent publications are referenced. The disclosures of these patents, patent publications and non-patent publications in their entireties are incorporated by reference herein into this application in order to more fully describe the state of the art to which this invention pertains.
This application claims priority to US Provisional Patent Application Nos. 63/147,849 filed 10 Feb. 2021 and 63/209,005 filed 10 Jun. 2021, the contents of which are incorporated by reference herein in their entirety.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US22/15172 | 2/4/2022 | WO |
Number | Date | Country | |
---|---|---|---|
63147849 | Feb 2021 | US | |
63209005 | Jun 2021 | US |