The content of the electronically submitted Sequence Listing in ASCII text file (Name: 47004_207976_SEQ LIST_ST25.txt; Size: 13,391 bytes; and Date of Creation: Apr. 7, 2021) filed with the application is incorporated herein by reference in its entirety.
Sorghum bicolor (L.) Moench, a C4 crop native to Africa and known for its drought and heat tolerance, is a promising bioenergy crop due to its ability to grow in marginal environments and its high potential for biomass yield (Brenton, Z. W., et al. (2016) Genetics, 204(1): 21-33). Early season planting, which extends the growing season of bioenergy sorghum and potentially increases exposure to rainfall, is a strategy to maximize biomass yield (Yu, J., Tuinstra, M. R. (2001) Crop Science, 41(5): 1438-1443; Upadhyaya, H. D., et al. (2015) Genome, 59(2): 137-145). Moreover, cultivation in northern regions and at higher elevations could increase the available land for sorghum bioenergy production, without using land areas needed for food or feed crop production.
However, early planting and cultivation in temperate areas will not lead to higher biomass yields if plant development stalls. Due to its cold-sensitivity, planting sorghum in temperatures below 12-15° C. will diminish the yield (Burow, G., et al. (2011) Molecular Breeding, 28(3): 391-402; Salas Fernandez, M. G., Schoenbaum, G. R., Goggi, A. S. (2014) Crop Science, 54(6): 2631-2638; Chopra, R., et al. (2015) BMC Genomics, 16(1): 1040). Therefore, identification and development of sorghum accessions with enhanced tolerance to cold is vital to lengthen the growing season, expand growing regions, and achieve higher total biomass yields.
Increased cold tolerance is only one of several traits necessary for improved biomass yield. Identification of ‘ideotype-positive’ and ‘ideotype-negative’ accessions can be used to select accessions with multiple positive characteristics that contribute to increased yield as well as other valuable traits for crop production. For example, accessions with high biomass, reduced height, and high water use efficiency (WUE) may be more desirable for breeding bioenergy sorghum. Alternatively, accessions that are tall but have low biomass and low WUE may be the least desirable for multiple reasons including a higher propensity for lodging (Esechie, H. A., Maranville, J. W., Ross, W. M. (1977) Crop Science, 17(4): 609-612). Crosses of these different ideotypes can, for example, also be used to create a nested association mapping (NAM) population to identify variation in traits and genes associated with multiple beneficial characteristics that contribute to the desired ideotype.
To identify accessions that can tolerate the lower temperatures of early spring and colder climates, the genetic basis of sorghum's response to cold stress and potential for cold adaptability must be better understood. Cold tolerance is a complex quantitative trait, and there is phenotypic variability for the degree of cold tolerance among sorghum accessions (Chopra, R., et al. (2017) BMC Plant Biology, 17(1): 12; Yu, J., et al. (2004) Field Crops Research, 85(1): 21-30). Natural genetic variation in sorghum's response to cold stress has also been identified (Ortiz, D., Hu, J., Salas Fernandez, M. G. (2017) Journal of Experimental Botany, 68(16): 4545-4557). Genome-wide association mapping of cold sensitivity traits in diverse germplasm is a promising approach to identify allelic variation that may be harnessed to improve the cold tolerance of sorghum.
Provided for herein is a method for obtaining a sorghum plant comprising in its genome at least one cold-stress tolerance genetic locus, the method comprising the steps of: a. genotyping a plurality of sorghum plants with respect to at least one genetic locus comprising a plastocyanin gene Sobic.007G033300 (SEQ ID NO: 1); and b. selecting a sorghum plant comprising in its genome at least one genetic locus comprising a genotype associated with cold-stress tolerance, based on said genotyping.
Also provided for herein is a method for producing a sorghum plant comprising in its genome at least one introgressed cold-stress tolerance genetic locus, the method comprising the steps of: a. crossing a first sorghum plant with a genotype associated with cold-stress tolerance in a genetic locus comprising a plastocyanin gene Sobic.007G033300 with a second sorghum plant comprising a genotype not associated with cold-stress tolerance in a genetic locus comprising a plastocyanin gene Sobic.007G033300 and at least one second polymorphic locus that is linked to the genetic locus comprising Sobic.007G033300 and that is not present in said first sorghum plant to obtain a population segregating for the cold-stress tolerance loci and said linked polymorphic locus; b. genotyping for the presence of at least two polymorphic nucleic acids in at least one sorghum plant from said population, wherein a first polymorphic nucleic acid is located in said genetic locus comprising Sobic.007G033300 and wherein a second polymorphic nucleic acid is a linked polymorphic locus not present in said first sorghum plant; and c. selecting a sorghum plant comprising a genotype associated with cold-stress tolerance and at least one linked marker found in said second sorghum plant comprising a non-cold-stress tolerance locus but not found in said first sorghum plant, thereby obtaining a sorghum plant comprising in its genome an introgressed cold-stress tolerance locus.
Also provided for herein is a method of identifying a sorghum plant that comprises a genotype associated with cold-stress tolerance, comprising: genotyping a sorghum plant for the presence of an allele in at least one genetic locus associated with cold-stress tolerance, wherein the genetic locus comprises a plastocyanin gene Sobic.007G033300, and denoting based on the genotyping that said sorghum plant comprises a genotype associated with cold-stress tolerance.
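By way of a non-limiting illustration, the genotyping and denoting steps described above can be sketched in Python as follows. The function name `denote_tolerant`, the plant identifiers, and the allele labels are hypothetical placeholders; the sketch assumes that genotype calls at a single marker within the locus are already available and that the presence of at least one copy of the associated allele is the criterion of interest.

```python
def denote_tolerant(genotype_calls, associated_allele):
    """Record, for each plant, whether the associated allele is present.

    genotype_calls: dict mapping a plant identifier to the alleles called
        at one marker in the locus (hypothetical data layout).
    associated_allele: the allele label assumed to be associated with
        cold-stress tolerance (hypothetical; not taken from the disclosure).
    """
    records = {}
    for plant_id, alleles in genotype_calls.items():
        # "Denoting" here is simply an entry in an in-memory record.
        records[plant_id] = associated_allele in alleles
    return records


def select_plants(records):
    """Return the identifiers of plants denoted as carrying the allele."""
    return sorted(plant_id for plant_id, present in records.items() if present)
```

For example, with the hypothetical calls `{"plant_1": ("T", "T"), "plant_2": ("C", "T"), "plant_3": ("C", "C")}` and associated allele `"T"`, `select_plants(denote_tolerant(...))` returns `["plant_1", "plant_2"]`.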
Also provided for herein is a sorghum plant made by the methods disclosed herein, wherein said sorghum plant comprises an introgressed cold-stress tolerance locus. Also provided for herein is a sorghum plant comprising an introgressed cold-stress tolerance genetic locus comprising a genotype associated with cold-stress tolerance in a genomic region comprising a plastocyanin gene Sobic.007G033300, wherein at least one marker linked to the introgressed cold-stress tolerance genetic locus found in said sorghum plant is characteristic of germplasm comprising a non-cold-stress tolerance genetic locus but is not associated with germplasm comprising the cold-stress tolerance genetic locus.
Also provided for herein is a method of producing a commercial crop seed lot of sorghum seeds comprising in their genomes at least one introgressed cold-stress tolerance genetic locus, the method comprising the steps of: a. producing a population of sorghum plants from the sorghum plant selected in step (c) of any one of claims 21-33 or the sorghum plant of any one of claims 46 to 56 comprising a genotype associated with cold-stress tolerance and at least one linked marker found in said second sorghum plant comprising a non-cold-stress tolerance locus but not found in said first sorghum plant; and b. harvesting a commercial seed lot, wherein the harvested crop seed lot comprises a plurality of seeds that comprise in their genomes at least one introgressed cold-stress tolerance locus.
Also provided for herein is a commercial crop seed lot of sorghum seeds made by the methods disclosed herein. Also provided for herein is a commercial crop seed lot of sorghum seeds comprising a plurality of seeds that comprise in their genomes at least one introgressed cold-stress tolerance genetic locus comprising a genotype associated with cold-stress tolerance.
Also provided for herein is a method of growing a sorghum plant comprising in its genome at least one cold-stress tolerance genetic locus of any one of the preceding claims, the method comprising growing the plant under cold conditions sufficient to cause a deleterious effect in a non-cold-stress tolerant variety.
Provided for herein is an edited plastocyanin gene. In certain embodiments, the edited gene comprises (i) a variant polynucleotide encoding a plastocyanin protein variant or fragment thereof, wherein the variant polynucleotide comprises at least one nucleotide insertion, deletion, and/or substitution in comparison to the corresponding unedited wild-type polynucleotide sequence, and wherein the variant polynucleotide does not encode a wild-type plastocyanin protein. In certain embodiments, the variant polynucleotide is operably linked to a polynucleotide comprising a promoter. In certain embodiments, the edited gene comprises (ii) a variant polynucleotide comprising a plastocyanin gene 3′ UTR, wherein the variant polynucleotide comprises at least one nucleotide insertion, deletion, and/or substitution in the 3′ UTR in comparison to the corresponding unedited wild-type polynucleotide sequence. In certain embodiments, the variant polynucleotide is operably linked to a polynucleotide comprising a promoter and/or a plastocyanin protein coding region. In certain embodiments, the edited gene comprises (iii) a variant polynucleotide comprising a plastocyanin gene 5′ UTR, wherein the variant polynucleotide comprises at least one nucleotide insertion, deletion, and/or substitution in the 5′ UTR in comparison to the corresponding unedited wild-type polynucleotide sequence. In certain embodiments, the variant polynucleotide is operably linked to a polynucleotide comprising a promoter and/or a plastocyanin protein coding region. In certain embodiments, the edited gene comprises (iv) a variant polynucleotide comprising a plastocyanin gene promoter, wherein the variant polynucleotide comprises at least one nucleotide insertion, deletion, and/or substitution in the promoter in comparison to the corresponding unedited wild-type polynucleotide sequence. In certain embodiments, the variant polynucleotide is operably linked to a polynucleotide comprising a plastocyanin protein coding region.
In certain embodiments, the edited gene comprises (v) a variant polynucleotide comprising a plastocyanin gene intron, wherein the variant polynucleotide comprises at least one nucleotide insertion, deletion, and/or substitution in the intron in comparison to the corresponding unedited wild-type polynucleotide sequence. In certain embodiments, the variant polynucleotide is operably linked to a polynucleotide comprising at least one plastocyanin gene exon. In certain embodiments, the edited gene comprises (vi) a variant polynucleotide encoding a plastocyanin (a) transit peptide, vacuolar targeting peptide, and/or endoplasmic reticulum targeting peptide; (b) plastid targeting peptide; and/or (c) polyadenylation or transcriptional termination signal, wherein the variant polynucleotide comprises at least one nucleotide insertion, deletion, and/or substitution in (a), (b), and/or (c) in comparison to the corresponding unedited wild-type polynucleotide sequence. In certain embodiments, the polynucleotides of (a), (b), and/or (c) are operably linked to a polynucleotide encoding a plastocyanin protein.
Certain embodiments provide for a plant nuclear or plastid genome comprising the edited plastocyanin gene of the disclosure, for example, as described above. Certain embodiments provide for a cell comprising the edited gene or nuclear or plastid genome. And, certain embodiments provide for a plant comprising the edited gene or nuclear or plastid genome and a plant part of the plant, wherein the plant part comprises the edited gene or nuclear or plastid genome.
Provided for herein is a method for obtaining a plant comprising the edited gene or plant nuclear or plastid genome of this disclosure, wherein the plant is cold-stress tolerant. In certain embodiments, the method comprises the steps of: (i) introducing the edited gene, the variant polynucleotide encoding the plastocyanin protein, the polynucleotide comprising the promoter, a fragment of said polynucleotides, or a combination of said polynucleotides into a plant cell, tissue, plant part, or whole plant; (ii) obtaining a plant cell, tissue, part, or whole plant wherein the edited gene, the variant polynucleotide encoding the plastocyanin protein, the polynucleotide comprising the promoter, a fragment of said polynucleotides, or a combination of said polynucleotides is integrated into the plant nuclear or plastid genome; and (iii) selecting a plant obtained from the plant cell, tissue, part or whole plant of step (ii) for expression of a variant plastocyanin protein, thereby obtaining a plant that is cold-stress tolerant.
Provided for herein is a method for obtaining a plant comprising the edited gene or plant nuclear or plastid genome of this disclosure, wherein the plant is cold-stress tolerant. In certain embodiments, the method comprises introducing into a plant cell one or more gene editing molecules that target an endogenous plastocyanin gene to introduce at least one nucleotide insertion, deletion, and/or substitution into the endogenous plastocyanin gene. In certain embodiments, the method further comprises selecting a plant comprising a gene edited plastocyanin gene that expresses a variant plastocyanin protein; optionally, further comprising selecting a plant that is cold-stress tolerant.
Provided for herein is a method for obtaining a plant comprising the edited gene or nuclear or plastid genome of this disclosure, wherein the plant is cold-stress tolerant. In certain embodiments, the method comprises the steps of: (i) providing to a plant cell, tissue, part, or whole plant an endonuclease or an endonuclease and at least one guide RNA, wherein the endonuclease or guide RNA and endonuclease can form a complex that can introduce a double strand break at a target site in a nuclear or plastid genome of the plant cell, tissue, part, or whole plant; (ii) obtaining a plant cell, tissue, part, or whole plant wherein at least one nucleotide insertion, deletion, and/or substitution has been introduced into the corresponding wild-type polynucleotide sequence; and (iii) selecting a plant obtained from the plant cell, tissue, part or whole plant of step (ii) comprising the edited gene for expression of a plant plastocyanin protein variant or fragment, thereby obtaining a plant that is cold-stress tolerant.
Certain embodiments provide for a method of producing/breeding a cold-stress tolerant plant, the method comprising crossing a cold-stress tolerant plant of this disclosure with one or more other plants to produce a population of progeny plants and in certain embodiments further screening the population of progeny plants to identify cold-stress tolerant plants.
Also provided for herein is a plant or a seed of said plant for use in a method of plant breeding, crop production, or for making a processed plant product.
Sorghum bicolor is a promising cellulosic feedstock crop for bioenergy because of its potential for high biomass yields. However, in its early growth phases, sorghum is sensitive to cold stress, preventing early planting in temperate environments. Cold temperature adaptability is vital for the successful cultivation of both bioenergy and grain sorghum at higher latitudes and elevations, and for early season planting or to extend the growing season. Identification of genes and alleles that enhance biomass accumulation of sorghum grown under early cold stress would enable development of improved bioenergy sorghum through breeding or genetic engineering.
A genome-wide association study of bioenergy sorghum accessions phenotyped under early season cold stress revealed transient QTLs for highly heritable biomass and growth-related traits that appeared as the temperature increased and plants developed. Sorghum accessions clustered into multiple groups for each heritable trait with distinct growth profiles. GWAS identified candidate genes associated with growth traits and cold stress responses. The top-performing accessions with the highest growth-related trait values over time and temperature shifts may be useful for further genetic analysis and breeding or engineering efforts directed at biomass yield enhancements.
A high throughput imaging-based system was used to collect daily phenotypic measurements from a set of 369 diverse accessions from the sorghum Bioenergy Association Panel (BAP) genotyped with 232,303 high-quality single nucleotide polymorphism (SNP) markers (S1). Daily phenotypic measurements are more beneficial than single endpoint measurements because the response to cold stress can change over time and as plants develop. Therefore, daily imaging and phenotyping allowed for the comparison of the cold stress response in the BAP at different developmental stages.
High throughput image-based phenotyping was performed to track growth over time and under early cold stress in this sorghum panel. GWAS on the daily phenotypic data revealed genetic variation underlying the response to early-season cold stress and identified transient QTLs related to biomass, height, hull area, water use efficiency (WUE) and relative growth rate (RGR). Candidate genes associated with these phenotypes were identified for crop improvement through breeding or targeted genetic modifications.
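The formula used to compute relative growth rate (RGR) from the daily measurements is not stated here. As a minimal sketch only, and assuming the classical definition of RGR as the change in the natural logarithm of plant size per unit time, a per-interval calculation could look like the following. The function name and the use of an image-derived size proxy are assumptions made for illustration.

```python
import math


def relative_growth_rate(size_start, size_end, days):
    """Classical RGR: (ln(size_end) - ln(size_start)) / elapsed days.

    size_start, size_end: positive size measurements (for example, an
        image-derived biomass proxy; the choice of proxy is an assumption).
    days: elapsed time between the two measurements, in days.
    """
    if size_start <= 0 or size_end <= 0:
        raise ValueError("sizes must be positive")
    if days <= 0:
        raise ValueError("days must be positive")
    return (math.log(size_end) - math.log(size_start)) / days
```

Under this definition a plant whose size does not change over an interval has an RGR of zero, and a plant whose size doubles in one day has an RGR of ln 2 per day.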
The analysis identified a priori and novel candidate genes associated with growth-related traits and the temporal response to cold stress.
The term “and/or” where used herein is to be taken as specific disclosure of each of the two specified features or components with or without the other. Thus, the term “and/or” as used in a phrase such as “A and/or B” herein is intended to include “A and B,” “A or B,” “A” (alone), and “B” (alone). Likewise, the term “and/or” as used in a phrase such as “A, B, and/or C” is intended to encompass each of the following embodiments: A, B, and C; A, B, or C; A or C; A or B; B or C; A and C; A and B; B and C; A (alone); B (alone); and C (alone).
The term “comprising” as used herein is to be construed as at least having the features to which it refers while not excluding any additional unspecified features. However, in embodiments provided herein where the term “comprising” is used, other embodiments where the phrases “consisting of” and/or “consisting essentially of” are substituted for the term “comprising” are also provided.
As used herein, the terms “include,” “includes,” and “including” are to be construed as at least having the features to which they refer while not excluding any additional unspecified features.
Where a term is provided in the singular, other embodiments described by the plural of that term are also provided. For example, the term “a” or “an” entity refers to one or more of that entity; “an allele,” is understood to represent “one or more alleles.” As such, the terms “a” (or “an”), “one or more,” and “at least one” can be used interchangeably herein.
Unless defined otherwise, technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure is related.
Numeric ranges are inclusive of the numbers defining the range. Even when not explicitly identified by “and any range in between,” or the like, where a list of values is recited, e.g., 1, 2, 3, or 4, the disclosure specifically includes any range in between the values, e.g., 1 to 3, 1 to 4, 2 to 4, etc.
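For illustration only, the ranges implied by a recited list of values can be enumerated with the short Python sketch below. The function name `ranges_between` is hypothetical, and each returned pair is to be read as an inclusive range from the lower value to the upper value.

```python
from itertools import combinations


def ranges_between(values):
    """List every (low, high) pair that can be formed from the recited values.

    Duplicates are ignored and the values are sorted first, so each pair
    is returned with its lower bound first.
    """
    ordered = sorted(set(values))
    return list(combinations(ordered, 2))
```

For the recited values 1, 2, 3, or 4, `ranges_between([1, 2, 3, 4])` returns `[(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]`, which includes the ranges 1 to 3, 1 to 4, and 2 to 4 given as examples above.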
The headings provided herein are solely for ease of reference and are not limitations of the various aspects or embodiments of this disclosure, which can be had by reference to the specification as a whole.
As used herein, an “allele” refers to one of two or more alternative forms of a genomic sequence at a given locus on a chromosome. When all the alleles present at a given locus on a chromosome are the same, that plant is homozygous at that locus. If the alleles present at a given locus on a chromosome differ, that plant is heterozygous at that locus.
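As a minimal illustration of the homozygous and heterozygous states defined above, the following Python sketch classifies a locus from the alleles observed at it. The function name `zygosity` and the single-letter allele labels are hypothetical and used only for illustration.

```python
def zygosity(alleles):
    """Classify a locus as homozygous or heterozygous.

    alleles: the allele labels observed at one locus in one plant,
        e.g. ("A", "A") for a diploid plant carrying the same allele twice.
    """
    if len(alleles) < 2:
        raise ValueError("need at least two alleles to classify a locus")
    # All alleles identical -> homozygous; any difference -> heterozygous.
    return "homozygous" if len(set(alleles)) == 1 else "heterozygous"
```

Here `zygosity(("A", "A"))` gives `"homozygous"` and `zygosity(("A", "G"))` gives `"heterozygous"`.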
As used herein, the term “denoting” when used in reference to a plant genotype refers to any method whereby a plant is indicated to have a certain genotype. Such indications of a certain genotype include, but are not limited to, any method where a plant is physically marked or tagged. Physical markings or tags that can be used include, but are not limited to, a barcode, a radio-frequency identification (RFID) tag, a label, or the like. Indications of a certain genotype also include, but are not limited to, any entry into any type of written or electronic database whereby the plant's genotype is provided.
A “genetic locus” or “locus” is a position on a genomic sequence that is usually found by a point of reference; e.g., a short DNA sequence that is a gene, or part of a gene or intergenic region. A locus may refer to a nucleotide position at a reference point on a chromosome, such as a position from the end of the chromosome. While the genetic locus may be identified by a particular reference sequence, e.g., the Sorghum plastocyanin gene Sobic.007G033300 (SEQ ID NO: 1), it is understood that the locus can comprise various allelic forms or variants and still be considered the same locus or marker.
As used herein, “linkage” refers to the relative frequency at which types of gametes are produced in a cross. For example, if locus A has genes “A” or “a” and locus B has genes “B” or “b”, a cross between parent I with AABB and parent II with aabb will produce four possible gametes where the genes are segregated into AB, Ab, aB and ab. The null expectation is that there will be independent equal segregation into each of the four possible genotypes, i.e., with no linkage ¼ of the gametes will be of each genotype. Segregation of gametes into genotypes at frequencies differing from ¼ is attributed to linkage.
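The comparison against the ¼ expectation can be sketched in Python as follows. This is an illustrative calculation only: the function names and gamete counts are hypothetical, and the recombination fraction (the share of gametes that are not of a parental type) is a standard summary statistic introduced here for illustration rather than a quantity defined in this disclosure.

```python
from collections import Counter

GAMETE_TYPES = ("AB", "Ab", "aB", "ab")


def gamete_frequencies(gametes):
    """Return the observed frequency of each of the four gamete types."""
    if not gametes:
        raise ValueError("no gametes supplied")
    counts = Counter(gametes)
    total = len(gametes)
    return {g: counts.get(g, 0) / total for g in GAMETE_TYPES}


def recombination_fraction(gametes, parental=("AB", "ab")):
    """Share of gametes that are not one of the two parental types.

    With independent segregation this is expected to be about 0.5;
    values below 0.5 suggest linkage between the two loci.
    """
    if not gametes:
        raise ValueError("no gametes supplied")
    recombinants = sum(1 for g in gametes if g not in parental)
    return recombinants / len(gametes)
```

With hypothetical counts of 40 AB, 40 ab, 10 Ab, and 10 aB gametes, the observed frequencies are 0.4, 0.4, 0.1, and 0.1 rather than ¼ each, and the recombination fraction is 0.2, which is consistent with linkage between the two loci.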
As used herein, the term “linked”, when used in the context of markers and/or genomic regions, means that the markers and/or genomic regions are located on the same linkage group or chromosome.
As used herein, “polymorphism” means the presence of one or more variations of a nucleic acid sequence at one or more loci in a population of at least two members. The variation can comprise but is not limited to one or more nucleotide base substitutions, the insertion of one or more nucleotides, a nucleotide sequence inversion, and/or the deletion of one or more nucleotides.
As used herein, the term “single nucleotide polymorphism,” also referred to by the abbreviation “SNP,” means a polymorphism at a single site wherein the polymorphism constitutes any or all of a single base pair change, an insertion of one or more base pairs, and/or a deletion of one or more base pairs.
As used herein, “marker” means a detectable characteristic that can be used to discriminate between organisms. Examples of such characteristics include, but are not limited to, genetic markers, biochemical markers, morphological characteristics, and agronomic characteristics.
As used herein, “marker assay” means a method for detecting a polymorphism at a particular locus using a particular method. Marker assays thus include, but are not limited to, measurement of at least one phenotype (such as seed color, flower color, or other visually detectable trait as well as any biochemical trait), or genotyping such as by restriction fragment length polymorphism (RFLP), single base extension, electrophoresis, sequence alignment, allelic specific oligonucleotide hybridization (ASO), random amplified polymorphic DNA (RAPD), microarray-based polymorphism detection technologies, and the like.
As used herein, “genotype” means the genetic component of the phenotype and it can be indirectly characterized using markers or directly characterized by nucleic acid sequencing.
As used herein, “phenotype” means the detectable characteristics of a cell or organism which can be influenced by gene expression.
As used herein, the term “introgressed”, when used in reference to a genetic locus, refers to a genetic locus that has been introduced into a new genetic background. Introgression of a genetic locus can thus be achieved either through plant breeding methods or by molecular genetic methods. Such molecular genetic methods include, but are not limited to, various plant transformation techniques and/or methods that provide for homologous recombination, non-homologous recombination, site-specific recombination, and/or genomic modifications that provide for locus substitution or locus conversion. In certain embodiments, introgression could thus be achieved by substitution of a cold-stress sensitive locus with a corresponding cold-stress tolerance locus or by conversion of a locus from a cold-stress sensitive to a cold-stress tolerance genotype.
As used herein, “quantitative trait locus (QTL)” means a locus that controls to some degree numerically representable traits that are usually continuously distributed.
As used herein, the term “event”, when used in the context of describing a transgenic plant, refers to a particular transformed plant line. In a typical transgenic breeding program, a transformation construct responsible for a trait is introduced into the genome via a transformation method. Numerous independent transformants (events) are usually generated for each construct. These events are evaluated to select those with superior performance.
As used herein, the term “sorghum” means Sorghum bicolor and includes all plant varieties that can be bred with sorghum, including wild sorghum species. In certain embodiments, sorghum plants from the species Sorghum bicolor and subspecies can be genotyped using the compositions and methods of the present disclosure.
As used herein, the term “bulk” refers to a method of managing a segregating population during inbreeding that involves growing the population in a bulk plot, harvesting the self-pollinated seed of plants in bulk, and using a sample of the bulk to plant the next generation.
As used herein, a polynucleotide is said to be “endogenous” to a given cell when it is found in a naturally occurring form and genomic location in the cell.
As used herein, the phrase “consensus sequence” refers to an amino acid, DNA or RNA sequence created by aligning two or more homologous sequences and deriving a new sequence having either the conserved or set of alternative amino acid, deoxyribonucleic acid, or ribonucleic acid residues of the homologous sequences at each position in the created sequence.
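One simple way to derive such a sequence from already-aligned homologous sequences is sketched below in Python. The majority rule and the use of a placeholder symbol where no single residue predominates are assumptions chosen for illustration; other rules, such as reporting the full set of alternative residues at a position, fall equally within the definition above. The function name `consensus` is hypothetical.

```python
from collections import Counter


def consensus(aligned, ambiguous="N"):
    """Derive a majority-rule consensus from equal-length aligned sequences.

    aligned: list of sequences that have already been aligned, so that
        position i in every sequence refers to the same column.
    ambiguous: symbol emitted when two or more residues tie for most common.
    """
    if not aligned:
        raise ValueError("no sequences supplied")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    result = []
    for column in zip(*aligned):
        ranked = Counter(column).most_common()
        top_residue, top_count = ranked[0]
        tied = len(ranked) > 1 and ranked[1][1] == top_count
        result.append(ambiguous if tied else top_residue)
    return "".join(result)
```

For the hypothetical aligned sequences `ACGT`, `ACGA`, and `ACTT`, the sketch returns `ACGT`; for `AC` and `AG` it returns `AN`, because the second column has no majority residue.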
As used herein, the terms “edit,” “editing,” “edited” and the like refer to processes or products where insertions, deletions, and/or nucleotide substitutions are introduced into a genome. Such processes include methods of inducing homology directed repair and/or non-homologous end joining of one or more sites in the genome.
The phrases “genetically edited plant,” “edited plant,” and the like are used herein to refer to a plant comprising one or more nucleotide insertions, deletions, substitutions, or any combination thereof in the genomic DNA of the plant. Such genetically edited plants can be constructed by techniques including CRISPR/Cas endonuclease-mediated editing, meganuclease-mediated editing, engineered zinc finger endonuclease-mediated editing, and the like.
The term “heterologous,” as used herein in the context of a second polynucleotide that is operably linked to a first polynucleotide, refers to: (i) a second polynucleotide that is derived from a source distinct from the source of the first polynucleotide; (ii) a second polynucleotide derived from the same source as the first polynucleotide, where the first, second, or both polynucleotide sequence(s) is/are modified from its/their original form; (iii) a second polynucleotide arranged in an order and/or orientation or in a genomic position or environment with respect to the first polynucleotide that is different than the order and/or orientation in or genomic position or environment of the first and second polynucleotides in a naturally occurring cell; or (iv) the second polynucleotide does not occur in a naturally occurring cell that contains the first polynucleotide. Heterologous polynucleotides include polynucleotides that promote transcription (e.g., promoters and enhancer elements), transcript abundance (e.g., introns, 5′ UTR, and 3′ UTR), translation, or a combination thereof as well as polynucleotides encoding peptides or proteins, spacer peptides, or localization peptides. In certain embodiments, a nuclear or plastid genome can comprise the first polynucleotide, where the second polynucleotide is heterologous to the nuclear or plastid genome. A “heterologous” polynucleotide that promotes transcription, transcript abundance, translation, or a combination thereof as well as polynucleotides encoding peptides, spacer peptides, or localization peptides can be autologous to the cell but arranged in an order and/or orientation or in a genomic position or environment that is different than the order and/or orientation in or genomic position or environment in a naturally occurring cell.
A polynucleotide that promotes transcription, transcript abundance, translation, or a combination thereof as well as polynucleotides encoding peptides, spacer peptides, or localization peptides can be heterologous to another polynucleotide when the polynucleotides are not operably linked to one another in a naturally occurring cell. Heterologous peptides or proteins include peptides or proteins that are not found in a cell or organism as the cell or organism occurs in nature. As such, heterologous peptides or proteins include peptides or proteins that are localized in a subcellular location, extracellular location, or expressed in a tissue that is distinct from the subcellular location, extracellular location, or tissue where the peptide or protein is found in a cell or organism as it occurs in nature. Heterologous polynucleotides include polynucleotides that are not found in a cell or organism as the cell or organism occurs in nature.
The term “homolog” as used herein refers to a gene related to a second gene by identity of either the DNA sequences or the encoded protein sequences. Genes that are homologs can be genes separated by the event of speciation (see “ortholog”). Genes that are homologs can also be genes separated by the event of genetic duplication (see “paralog”). Homologs can be from the same or a different organism and can in certain embodiments perform the same biological function in either the same or a different organism.
The phrase “operably linked” as used herein refers to the joining of nucleic acid or amino acid sequences such that one sequence can provide a function to a linked sequence. In the context of a promoter, “operably linked” means that the promoter is connected to a sequence of interest such that the transcription of that sequence of interest is controlled and regulated by that promoter. When the sequence of interest encodes a protein that is to be expressed, “operably linked” means that the promoter is linked to the sequence in such a way that the resulting transcript will be efficiently translated. If the linkage of the promoter to the coding sequence is a transcriptional fusion that is to be expressed, the linkage is made so that the first translational initiation codon in the resulting transcript is the initiation codon of the coding sequence. Alternatively, if the linkage of the promoter to the coding sequence is a translational fusion and the encoded protein is to be expressed, the linkage is made so that the first translational initiation codon contained in the 5′ untranslated sequence associated with the promoter and the coding sequence is linked such that the resulting translation product is in frame with the translational open reading frame that encodes the protein. 
Nucleic acid sequences that can be operably linked include sequences that provide gene expression functions (e.g., gene expression elements such as promoters, 5′ untranslated regions, introns, protein coding regions, 3′ untranslated regions, polyadenylation sites, and/or transcriptional terminators), sequences that provide DNA transfer and/or integration functions (e.g., T-DNA border sequences, site specific recombinase recognition sites, integrase recognition sites), sequences that provide for selective functions (e.g., antibiotic resistance markers, biosynthetic genes), sequences that provide scoreable marker functions (e.g., reporter genes), sequences that facilitate in vitro or in vivo manipulations of the sequences (e.g., polylinker sequences, site specific recombination sequences) and sequences that provide replication functions (e.g., bacterial origins of replication, autonomous replication sequences, centromeric sequences). In the context of an amino acid sequence encoding a localization, spacer, linker, or other peptide, “operably linked” means that the peptide is connected to the polyprotein sequence(s) of interest such that it provides a function. Functions of a localization peptide include localization of a protein or peptide of interest to, e.g., an extracellular space or subcellular compartment. Functions of a spacer peptide include linkage of two peptides of interest such that the peptides will be expressed as a single protein.
As used herein, the term “polypeptide” is intended to encompass a singular “polypeptide” as well as plural “polypeptides,” and comprises any chain or chains of two or more amino acids. Thus, as used herein, a “peptide,” an “oligopeptide,” a “dipeptide,” a “tripeptide,” a “protein,” an “amino acid chain,” an “amino acid sequence,” “a peptide subunit,” or any other term used to refer to a chain or chains of two or more amino acids, are included in the definition of a “polypeptide,” (even though each of these terms can have a more specific meaning) and the term “polypeptide” can be used instead of, or interchangeably with any of these terms. The term further includes polypeptides which have undergone post-translational modifications, for example, glycosylation, acetylation, phosphorylation, amidation, derivatization by known protecting/blocking groups, proteolytic cleavage, or modification by non-naturally occurring amino acids.
The phrases “percent identity” or “sequence identity” as used herein refer to the number of elements (i.e., amino acids or nucleotides) in a sequence that are identical within a defined length of two DNA, RNA or protein segments (e.g., across the entire length of a reference sequence) in an alignment resulting in the maximal number of identical elements, and is calculated by dividing the number of identical elements by the total number of elements in the defined length of the aligned segments and multiplying by 100.
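The calculation described above can be sketched as follows. This is a minimal illustration only: the function name `percent_identity`, the use of `-` as a gap character, and the choice to count gap columns in the defined length are assumptions made for the example, and the two segments are assumed to be already aligned to equal length.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the defined length of two pre-aligned segments.

    Assumptions (not from the disclosure): '-' marks a gap, gap columns
    count toward the defined length, and comparison is case-insensitive.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned segments must have the same length")
    if not aligned_a:
        raise ValueError("aligned segments must not be empty")
    # Count columns where both segments carry the same non-gap element.
    identical = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    # Identical elements divided by total elements, multiplied by 100.
    return 100.0 * identical / len(aligned_a)
```

For example, two aligned eight-nucleotide segments that differ at one position have 7 identical elements out of 8.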
The phrase “transgenic” refers to an organism or progeny thereof wherein the organism's or progeny organism's DNA of the nuclear or organellar genome contains an inserted exogenous DNA molecule of 10 or more nucleotides in length. The phrase “transgenic plant” refers to a plant or progeny thereof wherein the plant's or progeny plant's DNA of the nuclear or plastid genome contains an introduced exogenous DNA molecule of 10 or more nucleotides in length. Such introduced exogenous DNA molecules can be naturally occurring, non-naturally occurring (e.g., synthetic and/or chimeric), from a heterologous source, or from an autologous source.
As used herein, the phrases “cold-stress sensitive,” “cold-stress sensitivity,” “exhibits cold-stress sensitivity,” and the like, refer to undesirable phenotypic traits observed in certain plants, such as certain sorghum germplasms, after exposure to cold-stress. Such undesirable phenotypic traits include, but are not limited to, leaf chlorosis, leaf necrosis, and plant death.
As used herein, the phrases “cold-stress tolerant,” “cold-stress tolerance,” “exhibits cold-stress tolerance,” and the like refer to either the absence or reduction of undesirable phenotypic traits observed after exposure to cold-stress in “cold-stress sensitive” sorghum germplasms.
As used herein, the term “alters” means to change a characteristic. For example, an altered protein activity could mean a decrease in activity, but it could also mean an increase or a change in the specificity of the activity.
To the extent to which any of the preceding definitions is inconsistent with definitions provided in any patent or non-patent reference incorporated herein by reference, any patent or non-patent reference cited herein, or in any patent or non-patent reference found elsewhere, it is understood that the preceding definition will be used herein.
Sorghum is attractive as a bioenergy feedstock crop because it is heat and drought tolerant and can thrive in marginal environments. It is an ideal target for accelerated improvement through breeding and engineering because it has extensive genetic and phenotypic diversity but has not yet benefited from genomics or genetic modification like some other crops such as maize. Early season planting of sorghum provides the opportunity for an extended growing season with greater potential accumulation of biomass. However, as a tropical crop, sorghum is sensitive to cold stress. Identification of accessions that exhibit the highest WUE, RGR, height, hull area, and biomass under early cold stress conditions could facilitate genetic improvement of bioenergy sorghum for early-season planting and cultivation in colder temperatures.
Top performing BAP accessions for bioenergy-related traits under early cold stress were identified. The accessions with the highest and lowest rankings for each trait and multiple bioenergy-related traits were identified. Accession PI329299 was among the top accessions for hull area and height, PI452619 was a top accession for hull area and WUE, PI329403 was a top accession for hull area and WUE, and PI585461 was among the top accessions for both biomass and WUE. These top performing accessions, particularly the accessions that possess multiple advantageous traits, are promising candidates for sorghum bioenergy breeding programs and the development of additional genetic resources such as mapping populations (e.g., NAMs).
Ideotype-positive and ideotype-negative accessions were identified as accessions that ranked high or low, respectively, for multiple traits or exhibited beneficial trait combinations. The ideotype-positive phenotype is the combination of high WUE with a low height-to-biomass ratio, i.e., a plant that has high biomass but does not grow tall. This ideotype may be desirable for bioenergy sorghum because the required biomass accumulation may be attained with reduced water use and without the increased risk of lodging associated with taller plants. Five accessions were ideotype-positive in the early phase of the experiment when the temperature was 15° C. These accessions may be beneficial in breeding bioenergy sorghum that can be planted early in the season or grown in colder environments. Accession PI329517, which attained the ideotype-positive phenotype early under cold stress, and maintained it after the temperature increased, may be beneficial for early season planting, for long growth periods, or under conditions of reduced water availability. On the other hand, ideotype-negative accessions that have relatively low biomass accumulation and ranked low for WUE, but are tall, may be undesirable as bioenergy sorghum because more water would be required for the plant to grow tall and thin, with a higher risk of lodging and decreased biomass accumulation. Even though these accessions may not be desirable for breeding purposes, identification of these extreme ideotypes may be useful for the development of structured populations for further genetic analysis.
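One possible way to express the ideotype definitions above is sketched below. The rank-based cutoff (`top_fraction`), the `Traits` container, and the accession labels used in any call are hypothetical; the disclosure does not state the ranking thresholds that were used, so this is an illustration of the idea rather than the procedure actually applied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Traits:
    wue: float      # water use efficiency
    height: float   # plant height
    biomass: float  # biomass; assumed to be greater than zero


def classify_ideotypes(traits, top_fraction=0.25):
    """Label accessions by WUE rank and height-to-biomass ratio rank.

    traits: dict mapping an accession label to a Traits record.
    top_fraction: hypothetical cutoff defining the "high" and "low" tails.
    """
    k = max(1, int(len(traits) * top_fraction))
    by_wue = sorted(traits, key=lambda a: traits[a].wue)
    by_ratio = sorted(traits, key=lambda a: traits[a].height / traits[a].biomass)
    low_wue, high_wue = set(by_wue[:k]), set(by_wue[-k:])
    low_ratio, high_ratio = set(by_ratio[:k]), set(by_ratio[-k:])
    labels = {}
    for acc in traits:
        if acc in high_wue and acc in low_ratio:
            labels[acc] = "ideotype-positive"   # high WUE, high biomass, not tall
        elif acc in low_wue and acc in high_ratio:
            labels[acc] = "ideotype-negative"   # low WUE, tall with low biomass
        else:
            labels[acc] = "intermediate"
    return labels
```

Ranking within the panel, rather than using absolute trait values, keeps the labels relative to the population being screened, which matches the way accessions are described as ranking high or low.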
The profiles of the traits over time and in response to temperature changes could be discerned, ranked, and clustered using the phenotypic data collected with single-day resolution. The BAP clustered into 6-8 distinct temporal profiles for each trait, identifying accessions that perform best under early cold stress, as well as at different temperatures and development stages. For example, accessions in WUE clusters two, six, and seven have the highest WUE at 15° C. Similarly, cluster six not only attains high WUE under 15° C. but is the cluster with the highest WUE attained under increased temperatures and therefore may perform well under drought conditions. Accession PI455221 was among the top-performing accessions for WUE at both 15° C. and 24° C., and PI329403 was among the top-performing accessions for WUE at 15° C. and 32° C., demonstrating that high WUE can be maintained as the plants develop and as the temperature increases.
GWAS revealed 2,305 highly significant SNPs associated with biomass, hull area, WUE, RGR, and height phenotypes. Significant SNPs that colocalized for multiple traits were identified, suggesting that these polymorphisms may be near tightly linked genes or genes with pleiotropic effects. By using the daily values for each phenotype in the GWAS, the temporal correlations of SNPs to traits were determined. With this approach, transient QTL that “turn on/off” at a specific developmental stage or with a change in temperature could be identified.
The highly significant SNPs identified by GWAS revealed 72 candidate genes with putative functions related to the bioenergy-relevant and cold-stress-responsive traits of interest. A transient QTL for RGR at 33 DAP mapped to the gene Sobic.006G057866, which encodes PRR37/Ma1, an a priori candidate gene that represses flowering in long days (Murphy, R. L., et al. (2011) Proceedings of the National Academy of Sciences, 108(39): 16469-16474). The observation that this transient QTL for RGR “turns on” at 33 DAP, a critical time for the transition to flowering in sorghum, and immediately after the temperature increased from 15° C., is consistent with a role for PRR37/Ma1 in growth and biomass accumulation through regulation of the transition from vegetative to reproductive development. Another transient QTL associated with hull area was significant from 27-30 DAP and again from 34-35 DAP. This transient QTL mapped to the gene Sobic.007G033300, which encodes a putative plastocyanin that may function in photosynthetic electron transfer. Additional sequence variation in this gene correlated with a significantly larger hull area from 27-30 DAP. These observations suggest that Sobic.007G033300 is important for biomass accumulation, consistent with known deleterious effects of cold on photosynthesis, in particular, the capacity for electron transport and photosystem protection (Liu, X., et al. (2018) Frontiers in Plant Science, 9: 1715). Studies have also implicated plastocyanins in affecting yield in crops such as rice (Zhang, J. P., et al. (2017) Plant Physiology, 175: 1175-1185).
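A transient QTL of the kind described above can be summarized as the contiguous runs of days on which a marker-trait association is significant. The sketch below assumes that the per-day significance calls have already been made elsewhere; the function name `contiguous_windows` is hypothetical, and no GWAS statistics are computed here.

```python
def contiguous_windows(significant_days):
    """Group days after planting (DAP) into contiguous (first, last) windows."""
    windows = []
    for day in sorted(set(significant_days)):
        if windows and day == windows[-1][1] + 1:
            # Extend the current window by one day.
            windows[-1] = (windows[-1][0], day)
        else:
            # Start a new window at this day.
            windows.append((day, day))
    return windows
```

Applied to the days reported for the hull area QTL (27, 28, 29, 30, 34, and 35 DAP), the function returns the two windows 27-30 and 34-35.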
In accordance with the present disclosure, Applicants have discovered genomic regions, associated markers, and associated methods for identifying and associating genotypes that affect the cold-stress tolerance of plants. In certain embodiments, the plant is sorghum. For example, in one embodiment, a method of the disclosure comprises screening a plurality of germplasm entries displaying a heritable variation for at least one cold-stress tolerance trait wherein the heritable variation is linked to at least one genotype; and associating at least one genotype from the germplasm entries to at least one cold-stress tolerance trait. The methods of the present disclosure can be used with traditional breeding techniques as described below to more efficiently screen and identify genotypes affecting a cold-stress tolerance trait.
The use of markers to infer a phenotype of interest results in the economization of a breeding program by substituting costly, time-intensive phenotyping assays with genotyping assays. Further, breeding programs can be designed to explicitly drive the frequency of specific, favorable phenotypes by targeting particular genotypes (U.S. Pat. No. 6,399,855). Fidelity of these associations may be monitored continuously to ensure maintained predictive ability and, thus, informed breeding decisions (US Patent Application 2005/0015827). In this case, costly, time-intensive phenotyping assays required for determining if a plant or plants contains a genomic region associated with a “cold-stress tolerance” or “cold-stress sensitivity” phenotype can be supplanted by genotypic assays that provide for identification of a plant or plants that contain the desired genomic region that confers cold-stress tolerance.
Genomic Region, Locus, and Genotype Associated with a Cold-Stress Tolerance Phenotype
Provided in this disclosure is a genomic region that is shown herein to be associated with a cold-stress tolerance phenotype when present in certain allelic forms (cold-stress tolerant genetic locus of this disclosure). Also provided herein is a cold-stress tolerance genetic locus as well as genotypes, alleles, and allelic states associated with cold-stress tolerance. It is understood that for purposes of this disclosure, a variant plastocyanin gene of this disclosure is a genotype associated with cold-stress tolerance and a genetic locus comprising a variant plastocyanin gene of this disclosure is a cold-stress tolerance genetic locus.
A sorghum genomic region that is associated with a cold-stress tolerance phenotype when present in certain allelic forms comprises a genetic locus comprising a plastocyanin gene Sobic.007G033300 (SEQ ID NO: 1). In certain embodiments, the genetic locus is a genomic region between any of 1 Mb, 500 kb, 100 kb, 50 kb, or 10 kb telomere proximal and any of 1 Mb, 500 kb, 100 kb, 50 kb, or 10 kb centromere proximal of a plastocyanin gene Sobic.007G033300. In certain embodiments, the genetic locus consists of a plastocyanin gene Sobic.007G033300. A series of markers useful in practicing the methods of this disclosure are provided herewith in Table 1. In certain embodiments, the genotype associated with cold-stress tolerance comprises at least one polymorphic allele associated with cold-stress tolerance of at least one marker in the gene Sobic.007G033300 selected from the group consisting of Chr07:2934702, Chr07:2934028, Chr07:2934031, Chr07:2934040, Chr07:2934099, Chr07:2934128, Chr07:2934129, Chr07:2934187, and Chr07:2934291. In certain embodiments, the genotype associated with cold-stress tolerance comprises at least two, at least three, at least four, at least five, at least six, at least seven, at least eight, or at least nine of these polymorphic alleles. In certain embodiments, the genotype associated with cold-stress tolerance comprises at least one polymorphic allele associated with cold-stress tolerance of at least the marker Chr07:2934702 in the gene Sobic.007G033300. In certain embodiments, the genotype associated with cold-stress tolerance comprises at least one polymorphic allele associated with cold-stress tolerance of at least the marker Chr07:2934702 and also comprises at least two, at least three, at least four, at least five, at least six, at least seven, at least eight, or at least nine of the aforementioned polymorphic alleles.
In certain embodiments, the genotype associated with cold-stress tolerance comprises an allelic state with at least one single nucleotide polymorphism (SNP) associated with cold-stress tolerance of at least one marker in Sobic.007G033300, wherein the SNP is selected from the group consisting of a G allele of SNP 7:2934702, a C allele of SNP 7:2934028, a G allele of SNP 7:2934031, a T allele of SNP 7:2934040, a C allele of SNP 7:2934099, an A allele of SNP 7:2934128, a G allele of SNP 7:2934129, a T allele of SNP 7:2934187, and a C allele replacing the nucleotides CCCG beginning at position 7:2934291. In certain embodiments, the genotype associated with cold-stress tolerance comprises at least two, at least three, at least four, at least five, at least six, at least seven, at least eight, or at least nine of said SNPs. In certain embodiments, the genotype associated with cold-stress tolerance comprises the allelic state of a G allele of SNP 7:2934702. In certain embodiments, the genotype associated with cold-stress tolerance comprises the allelic state of a G allele of SNP 7:2934702 and also comprises at least two, at least three, at least four, at least five, at least six, at least seven, at least eight, or at least nine of the aforementioned SNPs.
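The allelic states listed above lend themselves to a simple lookup. The sketch below is illustrative only: the marker keys follow the Chr07:position naming used for the markers of this disclosure, while the data layout (a dict mapping each marker to a tuple of the alleles called in one plant) and the function names are assumptions made for the example.

```python
# Tolerance-associated alleles as listed in the disclosure.
TOLERANCE_ALLELES = {
    "Chr07:2934702": "G",
    "Chr07:2934028": "C",
    "Chr07:2934031": "G",
    "Chr07:2934040": "T",
    "Chr07:2934099": "C",
    "Chr07:2934128": "A",
    "Chr07:2934129": "G",
    "Chr07:2934187": "T",
    "Chr07:2934291": "C",  # C replacing the nucleotides CCCG (indel)
}
LEAD_MARKER = "Chr07:2934702"


def markers_with_tolerance_allele(calls):
    """Return markers at which the plant carries the tolerance allele.

    calls: dict mapping marker -> tuple of called alleles, e.g. ("G", "A").
    Tuples (not strings) are assumed so that "C" is not matched inside "CCCG".
    """
    return [
        marker
        for marker, allele in TOLERANCE_ALLELES.items()
        if allele in calls.get(marker, ())
    ]


def has_tolerance_genotype(calls, min_markers=1, require_lead=False):
    """Check an 'at least N markers' embodiment, optionally requiring the lead marker."""
    hits = markers_with_tolerance_allele(calls)
    if require_lead and LEAD_MARKER not in hits:
        return False
    return len(hits) >= min_markers
```

Setting `min_markers` between 1 and 9 corresponds to the "at least one" through "at least nine" embodiments, and `require_lead=True` corresponds to the embodiments that require the G allele at 7:2934702.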
Without being bound by theory, certain alleles disclosed herein result in non-synonymous amino acid substitutions within the coding sequence of Sobic.007G033300. Certain alleles are associated with significantly improved (p≤0.019) growth and biomass accumulation following cold stress in comparison to the reference alleles.
Additional genetic markers can be used either in conjunction with the markers provided in Table 1 or independently of the markers provided in Table 1 to practice the methods of this disclosure. Publicly available sorghum marker databases from which useful markers can be obtained are known to those of ordinary skill in the art. Given the provision herein of a genomic region associated with sorghum cold-stress tolerance, as well as an assortment of sorghum germplasms exhibiting either a “cold-stress sensitive” or “cold-stress tolerant” phenotype, additional markers located either within or near this genomic region that are associated with these phenotypes can be obtained by merely typing the new markers in the various known sorghum germplasms.
Provided for herein is a method for obtaining a sorghum plant comprising in its genome at least one cold-stress tolerance genetic locus. In certain embodiments, the cold-stress tolerance genetic locus is a genetic locus comprising a plastocyanin gene Sobic.007G033300 and any of its embodiments as described in detail elsewhere herein. In certain embodiments, the method comprises the steps of (a) genotyping a plurality of sorghum plants with respect to at least one genetic locus comprising a plastocyanin gene Sobic.007G033300; and (b) selecting a sorghum plant comprising in its genome at least one genetic locus comprising a genotype associated with cold-stress tolerance, based on said genotyping. In certain embodiments, the genotype, polymorphic allele, allelic state, single nucleotide polymorphism, genetically edited variant, etc., associated with cold-stress tolerance is selected from any of the embodiments as described in detail elsewhere herein. In certain embodiments, the selected sorghum plant exhibits cold-stress tolerance in comparison to a sorghum plant that is not considered cold-stress tolerant. In certain embodiments, a progeny of the selected sorghum plant exhibits cold-stress tolerance in comparison to a sorghum plant that is not considered cold-stress tolerant.
Once selected, a sorghum plant comprising in its genome a cold-stress tolerance genetic locus can be used for breeding and/or in a breeding program to produce progeny comprising in their genomes the cold-stress tolerance locus. In certain embodiments, it is useful to use a sorghum plant comprising in its genome a cold-stress tolerance genetic locus to introduce the locus into germplasm that does not comprise the cold-stress tolerance genetic locus. Thus in certain embodiments, the selected sorghum plant having in its genome a cold-stress tolerance locus is crossed with a cold-stress sensitive sorghum plant to produce a progeny sorghum plant comprising in its genome at least one cold-stress tolerance genetic locus. In certain embodiments, the progeny plant exhibits cold-stress tolerance. Because sorghum is planted as a commercial crop, it is desirable to produce a population of sorghum plants comprising a plurality of sorghum plants comprising in their genomes the cold-stress tolerance genetic locus. Thus in certain embodiments, a sorghum plant having in its genome a cold-stress tolerance genetic locus is crossed with a cold-stress sensitive sorghum plant to produce a population of sorghum plants comprising in their genomes at least one cold-stress tolerance genetic locus. The production of such a population of plants includes the use of sorghum plants having in their genomes an introgressed cold-stress tolerance genetic locus as described elsewhere herein and thus any of these embodiments apply to introgressed plants. In certain embodiments, the population of sorghum plants produced comprises a plurality of sorghum plants that exhibit cold-stress tolerance. It is understood that not all of the sorghum plants in a population, depending on how the population is produced and/or determined, will have inherited the cold-stress tolerance locus and/or will be cold-stress tolerant.
In certain embodiments, at least 50%, at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% of the sorghum plants in the population of sorghum plants produced exhibit cold-stress tolerance.
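The population-level percentages recited above can be checked with integer arithmetic, as in the following sketch. The function name and the representation of the population as a list of booleans (one per plant, true if the plant was scored cold-stress tolerant) are assumptions made for illustration; how a plant is scored as tolerant is outside the scope of the example.

```python
# Percentage thresholds recited in the disclosure.
THRESHOLDS_PERCENT = (50, 60, 70, 75, 80, 85, 90, 95, 96, 97, 98, 99, 100)


def highest_threshold_met(is_tolerant):
    """Return the highest recited percentage met by the population, or None."""
    total = len(is_tolerant)
    if total == 0:
        raise ValueError("population must contain at least one plant")
    tolerant = sum(1 for flag in is_tolerant if flag)
    # Integer comparison avoids floating-point rounding at exact boundaries.
    met = [t for t in THRESHOLDS_PERCENT if tolerant * 100 >= t * total]
    return max(met) if met else None
```

A population in which 9 of 10 plants are scored tolerant meets the "at least 90%" embodiment but not the "at least 95%" embodiment.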
The starting sorghum plants to be used in the method for obtaining a sorghum plant comprising in its genome at least one cold-stress tolerance genetic locus can come from numerous sources disclosed elsewhere herein and/or known to those of ordinary skill in the art. In certain embodiments, the plurality of sorghum plants initially genotyped in the method comprises a population that is obtained by crossing a parent plant comprising at least one cold-stress tolerance locus with a parent plant comprising at least one non-cold-stress tolerance locus. In certain embodiments, the plurality of sorghum plants initially genotyped in the method comprises a population that is obtained by obtaining seed or progeny from a parental plant segregating for at least one cold-stress tolerance locus.
While the detection of a single allele having an allelic state associated with a cold-stress tolerance locus can be predictive of a cold-stress tolerance phenotype, in some cases it is useful to determine the allelic state of additional genetic markers, such as to strengthen the prediction. In any embodiments herein involving the genotyping of an allele, the embodiments also provide for genotyping for/determining, and optionally selecting/crossing/breeding a plant based thereon, the presence of a haplotype associated with cold-stress tolerance. Further, in some cases, it may be useful to determine whether a cold-stress tolerance allele and/or locus is present in a germplasm that naturally contains the allele and/or locus or whether the allele and/or locus has been artificially introduced (as to produce a new, non-naturally occurring plant), such as by introgression, into the germplasm. Thus in certain embodiments, the method further comprises genotyping for the presence of at least one additional marker. In some embodiments, the additional marker is associated with cold-tolerance. In some embodiments, the additional marker is linked to the genomic locus associated with cold-tolerance disclosed herein. In certain embodiments, the additional marker is not linked to the genomic locus associated with cold-tolerance.
As noted, not all sorghum plants comprising in their genomes at least one cold-stress tolerance genetic locus will exhibit cold-stress tolerance. Therefore in certain embodiments, the method further comprises exposing the selected sorghum plant or progeny thereof comprising the cold-stress tolerance locus to cold conditions sufficient to cause a deleterious effect in a non-cold-stress tolerant variety. In certain embodiments, this can be done in an experimental setting such as in a greenhouse or other climate-controlled environment. In certain embodiments, the plants can be planted in an open field wherein they are exposed to cold weather conditions, such as in the geographic regions described elsewhere herein.
Introgression of a Genomic Region Associated with Cold-Stress Tolerance
Also provided herewith is unique germplasm, such as sorghum germplasms, comprising an introgressed genomic region that is associated with cold-stress tolerance and methods of obtaining the same. Marker-assisted introgression involves the transfer of a chromosomal region, defined by one or more markers, from one germplasm to a second germplasm. Offspring of a cross that contain the introgressed genomic region can be identified by the combination of markers characteristic of the desired introgressed genomic region from a first germplasm (e.g., a cold-stress tolerant germplasm) and both linked and unlinked markers characteristic of the desired genetic background of a second germplasm (e.g., a cold-stress sensitive germplasm).
Certain embodiments of this disclosure provide for a method for producing a sorghum plant comprising in its genome at least one introgressed cold-stress tolerance genetic locus. In certain embodiments, the method comprises the steps of:
In certain embodiments, the second linked polymorphic locus is detected with a marker that is located within about 1000, 500, 100, 40, 20, 10, or 5 kilobases (Kb) of the cold-stress tolerance locus.
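The distance criterion above can be sketched as a simple coordinate check. The locus boundaries passed to these functions are placeholders supplied by the caller; the disclosure does not give gene start and end coordinates in this passage, and "about" is treated here as an exact cutoff purely for illustration. Both positions are assumed to lie on the same chromosome.

```python
# Window sizes recited in the disclosure, in kilobases, smallest first.
WINDOWS_KB = (5, 10, 20, 40, 100, 500, 1000)


def distance_to_locus_bp(marker_pos, locus_start, locus_end):
    """Distance in base pairs from a marker to a locus interval (0 if inside)."""
    if locus_start > locus_end:
        raise ValueError("locus_start must not exceed locus_end")
    if marker_pos < locus_start:
        return locus_start - marker_pos
    if marker_pos > locus_end:
        return marker_pos - locus_end
    return 0


def smallest_window_kb(marker_pos, locus_start, locus_end):
    """Return the smallest recited window (kb) containing the marker, or None."""
    distance = distance_to_locus_bp(marker_pos, locus_start, locus_end)
    for kb in WINDOWS_KB:
        if distance <= kb * 1000:
            return kb
    return None
```

As a usage illustration, taking the span of the listed Chr07 markers (2934028 to 2934702) as a stand-in interval, a marker at position 2950000 lies 15,298 bp away and therefore falls within the 20 kb window but not the 10 kb window.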
In certain embodiments, the method further comprises exposing the selected sorghum plant or progeny thereof comprising the introgressed cold-stress tolerance genetic locus to cold conditions sufficient to cause a deleterious effect in a non-cold-stress tolerant variety. In certain embodiments, the sorghum plant comprising in its genome the introgressed cold-stress tolerance locus exhibits cold-stress tolerance.
Identification of Plants Comprising a “Cold-Stress Sensitivity” or “Cold-Stress Tolerance” Associated Genotype
Provided herein is a method of identifying a sorghum plant that comprises or does not comprise a genotype associated with cold-stress tolerance of this disclosure. In certain embodiments, the method comprises genotyping a sorghum plant for the presence or absence of an allele in at least one genetic locus associated with cold-stress tolerance as disclosed in detail elsewhere herein. In certain embodiments, the method comprises denoting based on the genotyping that said sorghum plant comprises or does not comprise a genotype associated with cold-stress tolerance. In certain embodiments, the method further comprises the step of selecting a denoted plant either comprising or not comprising a genotype associated with cold-stress tolerance from a population of plants.
In certain embodiments, the method comprises genotyping a sorghum plant for the presence of an allele in at least one genetic locus associated with cold-stress tolerance as disclosed elsewhere herein and denoting based on the genotyping that said sorghum plant comprises a genotype associated with cold-stress tolerance. In certain embodiments, the identified and/or selected sorghum plant comprises in its genome at least one introgressed cold-stress tolerance genetic locus and exhibits cold-stress tolerance. Methods of identifying an introgressed locus are provided elsewhere herein.
Further, to observe the presence or absence of the cold-stress sensitive or cold-stress tolerant phenotypes, sorghum plants can be exposed to cold-stress, such as conditions sufficient to cause a deleterious effect in a non-cold-stress tolerant variety. In certain embodiments, an identified sorghum plant comprising a genotype associated with cold-stress tolerance or a progeny thereof is cold-stress tolerant.
Sorghum Plants Comprising a Genomic Region Associated with Cold-Stress Tolerance
Provided herein are sorghum plants comprising a genomic region of this disclosure associated with cold-stress tolerance. In certain embodiments, the plant is a naturally occurring sorghum variety comprising a genomic region associated with cold-stress tolerance, such as can be identified by the methods disclosed herein. In certain embodiments, the sorghum plant is made, such as by introgressing a cold-stress tolerance locus into a germplasm that comprises a cold-stress sensitive locus, or by genetic editing as disclosed elsewhere herein.
In certain embodiments, a sorghum plant comprises an introgressed cold-stress tolerance locus comprising a genotype associated with cold-stress tolerance as disclosed elsewhere herein, wherein at least one marker linked to the introgressed cold-stress tolerance locus found in said sorghum plant is characteristic of germplasm comprising a non-cold-stress tolerance locus but is not associated with germplasm comprising the cold-stress tolerance locus. In certain embodiments, the sorghum plant or a progeny thereof exhibits cold-stress tolerance.
It is contemplated that in any of the embodiments disclosed herein, a sorghum plant whether or not comprising in its genome a cold-stress tolerance genetic locus, can be non-transgenic and resistant to a herbicide. It is contemplated that in any of the embodiments disclosed herein, a sorghum plant whether or not comprising in its genome a cold-stress tolerance genetic locus, can contain a transgene that confers resistance to a herbicide. It is contemplated that in any of the embodiments disclosed herein, a sorghum plant whether or not comprising in its genome a cold-stress tolerance genetic locus, can contain a transgene that confers resistance to a herbicide and be resistant to an additional herbicide to which the plant does not contain a transgene conferring resistance thereto. In certain embodiments, the herbicide can be for example glyphosate, dicamba, imidazoline, metribuzin, 2,4-D, glufosinate, and/or bromoxynil. Thus, in certain embodiments, a method of this disclosure includes exposing a sorghum plant comprising in its genome a cold-stress tolerance genetic locus to an amount of herbicide sufficient to cause a deleterious effect in a sorghum plant that is not resistant to the herbicide.
For purposes of this disclosure, the presence and/or use of herbicide tolerance and herbicide application can apply to single plants or a population of plants such as used in a breeding program or grown from a commercial seed lot as disclosed elsewhere herein. For purposes of this disclosure herbicide tolerance, whether from a transgene or not, includes seeds that can be grown into herbicide tolerant plants.
Genetic markers that can be used in the practice of the instant disclosure include, but are not limited to, Restriction Fragment Length Polymorphisms (RFLP), Amplified Fragment Length Polymorphisms (AFLP), Simple Sequence Repeats (SSR), Single Nucleotide Polymorphisms (SNP), Insertion/Deletion Polymorphisms (Indels), Variable Number Tandem Repeats (VNTR), Random Amplified Polymorphic DNA (RAPD), and others known to those skilled in the art. Marker discovery and development in crops provides the initial framework for applications to marker-assisted breeding activities (US Patent Applications 2005/0204780, 2005/0216545, 2005/0218305, and 2006/00504538). The resulting “genetic map” is the representation of the relative position of characterized loci (DNA markers or any other locus for which alleles can be identified) along the chromosomes. The measure of distance on this map is relative to the frequency of crossover events between sister chromatids at meiosis.
As a set, polymorphic markers serve as a useful tool for fingerprinting plants to inform the degree of identity of lines or varieties (U.S. Pat. No. 6,207,367). These markers form the basis for determining associations with phenotype and can be used to drive genetic gain. The implementation of marker-assisted selection is dependent on the ability to detect underlying genetic differences between individuals.
Certain genetic markers for use in the present disclosure include “dominant” or “codominant” markers. “Codominant markers” reveal the presence of two or more alleles (two per diploid individual). “Dominant markers” reveal the presence of only a single allele. The presence of the dominant marker phenotype (e.g., a band of DNA) is an indication that one allele is present in either the homozygous or heterozygous condition. The absence of the dominant marker phenotype (e.g., absence of a DNA band) is merely evidence that “some other” undefined allele is present. In the case of populations where individuals are predominantly homozygous and loci are predominantly dimorphic, dominant and codominant markers can be equally valuable. As populations become more heterozygous and multiallelic, codominant markers often become more informative of the genotype than dominant markers.
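The difference in information content between the two marker classes can be illustrated with a short sketch. The function names and the representation of a diploid genotype as a pair of allele labels are assumptions for the example; real marker scoring depends on the assay used.

```python
def score_codominant(genotype):
    """A codominant marker reveals every allele present in the individual."""
    return tuple(sorted(set(genotype)))


def score_dominant(genotype, detected_allele):
    """A dominant marker reports only whether the detected allele is present
    (e.g., a DNA band), regardless of homozygous or heterozygous condition."""
    return detected_allele in genotype
```

With alleles labeled `A` and `a`, dominant scoring for `A` gives the same result for the genotypes `("A", "A")` and `("A", "a")`, whereas codominant scoring distinguishes them. This is why codominant markers tend to be more informative as populations become more heterozygous.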
In another embodiment, markers that include, but are not limited to, simple sequence repeat markers (SSR), AFLP markers, RFLP markers, RAPD markers, phenotypic markers, isozyme markers, single nucleotide polymorphisms (SNPs), insertions or deletions (Indels), single feature polymorphisms (SFPs, for example, as described in Borevitz et al. 2003 Gen. Res. 13:513-523), microarray transcription profiles, DNA-derived sequences, and RNA-derived sequences that are genetically linked to or correlated with cold-stress tolerance loci, regions flanking cold-stress tolerance loci, regions linked to cold-stress tolerance loci, and/or regions that are unlinked to cold-stress tolerance loci can be used in certain embodiments of the instant disclosure.
In one embodiment, nucleic acid-based analyses for determining the presence or absence of the genetic polymorphism (i.e., for genotyping) can be used for the selection of seeds in a breeding population. A wide variety of genetic markers for the analysis of genetic polymorphisms are available and known to those of skill in the art. The analysis may be used to select for genes, portions of genes, QTL, alleles, or genomic regions (genotypes) that comprise or are linked to a genetic marker that is linked to or correlated with cold-stress tolerance loci, regions flanking cold-stress tolerance loci, regions linked to cold-stress tolerance loci, and/or regions that are unlinked to cold-stress tolerance loci.
Nucleic acid analysis methods (e.g., genotyping) provided herein include, but are not limited to, PCR-based detection methods (for example, TaqMan assays), microarray methods, mass spectrometry-based methods and/or nucleic acid sequencing methods. In one embodiment, the detection of polymorphic sites in a sample of DNA, RNA, or cDNA may be facilitated through the use of nucleic acid amplification methods. Such methods specifically increase the concentration of polynucleotides that span the polymorphic site, or include that site and sequences located either distal or proximal to it. Such amplified molecules can be readily detected by gel electrophoresis, fluorescence detection methods, or other means.
A method of achieving such amplification employs the polymerase chain reaction (PCR) (Mullis et al. 1986 Cold Spring Harbor Symp. Quant. Biol. 51:263-273; European Patent 50,424; European Patent 84,796; European Patent 258,017; European Patent 237,362; European Patent 201,184; U.S. Pat. Nos. 4,683,202; 4,582,788; and 4,683,194), using primer pairs that are capable of hybridizing to the proximal sequences that define a polymorphism in its double-stranded form.
Methods for typing DNA based on mass spectrometry can also be used. Such methods are disclosed in U.S. Pat. Nos. 6,613,509 and 6,503,710, and references found therein.
Polymorphisms in DNA sequences can be detected or typed by a variety of effective methods well known in the art including, but not limited to, those disclosed in U.S. Pat. Nos. 5,468,613; 5,217,863; 5,210,015; 5,876,930; 6,030,787; 6,004,744; 6,013,431; 5,595,890; 5,762,876; 5,945,283; 6,090,558; 5,800,944; 5,616,464; 7,312,039; 7,238,476; 7,297,485; 7,282,355; 7,270,981; and 7,250,252, all of which are incorporated herein by reference in their entireties. However, the compositions and methods of the present disclosure can be used in conjunction with any polymorphism typing method to type polymorphisms in genomic DNA samples. The genomic DNA samples used include, but are not limited to, genomic DNA isolated directly from a plant, cloned genomic DNA, or amplified genomic DNA.
For instance, polymorphisms in DNA sequences can be detected by hybridization to allele-specific oligonucleotide (ASO) probes as disclosed in U.S. Pat. Nos. 5,468,613 and 5,217,863. U.S. Pat. No. 5,468,613 discloses allele specific oligonucleotide hybridizations where single or multiple nucleotide variations in nucleic acid sequence can be detected in nucleic acids by a process in which the sequence containing the nucleotide variation is amplified, spotted on a membrane and treated with a labeled sequence-specific oligonucleotide probe.
Target nucleic acid sequence can also be detected by probe ligation methods as disclosed in U.S. Pat. No. 5,800,944, where a sequence of interest is amplified and hybridized to probes followed by ligation to detect a labeled part of the probe.
Microarrays can also be used for polymorphism detection, wherein oligonucleotide probe sets are assembled in an overlapping fashion to represent a single sequence such that a difference in the target sequence at one point would result in partial probe hybridization (Borevitz et al., Genome Res. 13:513-523 (2003); Cui et al., Bioinformatics 21:3852-3858 (2005)). On any one microarray, it is expected there will be a plurality of target sequences, which may represent genes and/or noncoding regions, wherein each target sequence is represented by a series of overlapping oligonucleotides rather than by a single probe. This platform provides for high-throughput screening of a plurality of polymorphisms. A single-feature polymorphism (SFP) is a polymorphism detected by a single probe in an oligonucleotide array, wherein a feature is a probe in the array. Typing of target sequences by microarray-based methods is disclosed in U.S. Pat. Nos. 6,799,122; 6,913,879; and 6,996,476.
Target nucleic acid sequence can also be detected by probe linking methods as disclosed in U.S. Pat. No. 5,616,464, employing at least one pair of probes having sequences homologous to adjacent portions of the target nucleic acid sequence and having side chains which non-covalently bind to form a stem upon base pairing of the probes to the target nucleic acid sequence. At least one of the side chains has a photoactivatable group which can form a covalent cross-link with the other side chain member of the stem.
Other methods for detecting SNPs and Indels include single base extension (SBE) methods. Examples of SBE methods include, but are not limited to, those disclosed in U.S. Pat. Nos. 6,004,744; 6,013,431; 5,595,890; 5,762,876; and 5,945,283. SBE methods are based on extension of a nucleotide primer that is adjacent to a polymorphism to incorporate a detectable nucleotide residue upon extension of the primer. In certain embodiments, the SBE method uses three synthetic oligonucleotides. Two of the oligonucleotides serve as PCR primers and are complementary to sequence of the locus of genomic DNA which flanks a region containing the polymorphism to be assayed. Following amplification of the region of the genome containing the polymorphism, the PCR product is mixed with the third oligonucleotide (called an extension primer), which is designed to hybridize to the amplified DNA adjacent to the polymorphism, in the presence of DNA polymerase and two differentially labeled dideoxynucleoside triphosphates. If the polymorphism is present on the template, one of the labeled dideoxynucleoside triphosphates can be added to the primer in a single base chain extension. The allele present is then inferred by determining which of the two differential labels was added to the extension primer. Homozygous samples will result in only one of the two labeled bases being incorporated, and thus only one of the two labels will be detected. Heterozygous samples have both alleles present and will thus direct incorporation of both labels (into different molecules of the extension primer), and thus both labels will be detected.
In another method for detecting polymorphisms, SNPs and Indels can be detected by methods disclosed in U.S. Pat. Nos. 5,210,015; 5,876,930; and 6,030,787, in which an oligonucleotide probe has a 5′ fluorescent reporter dye and a 3′ quencher dye covalently linked to the 5′ and 3′ ends of the probe. When the probe is intact, the proximity of the reporter dye to the quencher dye results in the suppression of the reporter dye fluorescence, e.g., by Förster-type energy transfer. During PCR, forward and reverse primers hybridize to a specific sequence of the target DNA flanking a polymorphism while the hybridization probe hybridizes to polymorphism-containing sequence within the amplified PCR product. In the subsequent PCR cycle, DNA polymerase with 5′→3′ exonuclease activity cleaves the probe and separates the reporter dye from the quencher dye, resulting in increased fluorescence of the reporter.
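As a non-limiting illustration of the inference step described above, the following Python sketch calls a genotype from the signals of the two differential labels. The allele names, the normalized signal scale, and the detection threshold of 0.2 are hypothetical values chosen only for this example; an actual assay would calibrate its own threshold.

```python
# Illustrative sketch only. The alleles ("A", "G") and the threshold are
# hypothetical; signals are assumed to be normalized to the range 0..1.

def call_sbe_genotype(signal_1, signal_2, alleles=("A", "G"), threshold=0.2):
    """Infer a diploid genotype from the two label signals of an SBE assay."""
    detected_1 = signal_1 >= threshold
    detected_2 = signal_2 >= threshold
    if detected_1 and detected_2:
        # Both labels incorporated: both alleles are present (heterozygous).
        return alleles[0] + "/" + alleles[1]
    if detected_1:
        # Only the first label incorporated: homozygous for the first allele.
        return alleles[0] + "/" + alleles[0]
    if detected_2:
        # Only the second label incorporated: homozygous for the second allele.
        return alleles[1] + "/" + alleles[1]
    return None  # neither label detected: no call


if __name__ == "__main__":
    for sample, signals in {"s1": (0.9, 0.0), "s2": (0.5, 0.6), "s3": (0.0, 0.8)}.items():
        print(sample, call_sbe_genotype(*signals))
```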
In another embodiment, the locus or loci of interest can be directly sequenced using nucleic acid sequencing technologies. Methods for nucleic acid sequencing are known in the art and include technologies provided by 454 Life Sciences (Branford, CT), Agencourt Bioscience (Beverly, MA), Applied Biosystems (Foster City, CA), LI-COR Biosciences (Lincoln, NE), NimbleGen Systems (Madison, WI), Illumina (San Diego, CA), and VisiGen Biotechnologies (Houston, TX). Such nucleic acid sequencing technologies comprise formats such as parallel bead arrays, sequencing by ligation, capillary electrophoresis, electronic microchips, “biochips,” microarrays, parallel microchips, and single-molecule arrays, as reviewed by R. F. Service Science 2006 311:1544-1546.
The markers to be used in the methods of the present disclosure should preferably be diagnostic of origin in order for inferences to be made about subsequent populations. Experience to date suggests that SNP markers may be ideal for mapping because the likelihood that a particular SNP allele is derived from independent origins in the extant populations of a particular species is very low. As such, SNP markers appear to be useful for tracking and assisting introgression of QTLs, particularly in the case of genotypes.
Provided for herein are methods for producing a plant having a variant plastocyanin gene that confers cold-stress tolerance to plants. Thus, it is understood that for purposes of this disclosure, a variant plastocyanin gene of this disclosure is a genotype associated with cold-stress tolerance. In certain embodiments, the plant is a non-transgenic plant. These methods include, but are not limited to, gene editing tools such as CRISPR/Cas endonuclease-mediated editing, meganuclease-mediated editing, engineered zinc finger endonuclease-mediated editing, and traditional mutagenesis. For example, certain embodiments comprise precise gene editing in plant cells, callus, and/or germplasm explants (e.g., Sorghum) using a CRISPR/Cas system mediated by homology-directed repair (HDR). In certain embodiments, the modifications can confer cold-stress tolerance to plants which are regenerated and selected using an in vitro culture approach.
In certain embodiments, a genetically-edited plant comprising a plastocyanin variant gene and/or protein can be obtained by using techniques that provide for genome editing in the plant. In certain embodiments, a plant comprising an endogenous plastocyanin gene can be subjected to a genome editing technique wherein at least one nucleotide insertion, deletion, and/or substitution, in comparison to the corresponding unedited wild-type polynucleotide sequence, is introduced, resulting in cold-stress tolerance. Examples of endogenous plant plastocyanin genes that can be edited to a cold-stress tolerant variant include, but are not limited to, the Sorghum gene Sobic.007G033300 (SEQ ID NO: 1) and the Setaria viridis gene Sevir.6G069300 (SEQ ID NO: 2). Other representative genes identified herein that can be edited to a cold-stress tolerant variant include those listed in
The at least one nucleotide insertion, deletion, and/or substitution can be made anywhere in the gene including, for example, in the promoter region, the exons, the introns, and the untranslated regions (5′ and 3′ UTRs). The insertion, deletion, and/or substitution can be minimal, for example, a one nucleotide insertion, deletion, or substitution, or it can be more extensive, e.g., a deletion or insertion of up to 2, 3, 4, 5, 10, 15, 20, 25, 50, or more nucleotides such as between any of about 1, 2, 3, 4, 5, 10, 15, 20, 25, or 50 and any of about 2, 3, 4, 5, 10, 15, 20, 25, 50 or 100 nucleotides, or multiple nucleotide substitutions (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 25, 50, or 100, or any integer in between). Examples of methods for plant genome editing with clustered regularly interspaced short palindromic repeats (CRISPR)-associated (Cas)-polynucleotide modification template technology and a Cas endonuclease are at least disclosed by Bortesi and Fischer, 2015; Svitashev et al., 2015; Kumar and Jain, 2015; and in US Patent Appl. Pub. Nos. 20150082478, 20150059010, 20190352655, and 2020157554, which are specifically incorporated herein by reference in their entireties. Examples of methods involving cytosine base editors and adenine base editors are at least disclosed by Kim, Nature Plants, 2018 March, 4(3):148-151; Komor et al. (2016) Nature, 533:420-424; Komor et al., Sci Adv. 2017 August; 3(8):eaao4774; and Gaudelli et al., (2017) Nature 551(7681):464-471 and in US Patent Appl. Pub. Nos. 20180362590 and 20180312828, which are specifically incorporated herein by reference in their entireties.
Gene editing molecules for inducing a genetic modification in the plant cell or plant protoplast of the systems, methods, and compositions provided herein include, but are not limited to: (i) a polynucleotide selected from the group consisting of an RNA guide for an RNA-guided nuclease and a DNA encoding an RNA guide for an RNA-guided nuclease; (ii) a nuclease selected from the group consisting of an RNA-guided nuclease, an RNA-guided DNA endonuclease, a type II Cas nuclease, a Cas9, a type V Cas nuclease, a Cpf1, a CasY, a CasX, a C2c1, a C2c3, an engineered nuclease, a codon-optimized nuclease, a zinc-finger nuclease (ZFN), a transcription activator-like effector nuclease (TAL-effector nuclease), Argonaute, a meganuclease or engineered meganuclease; (iii) a polynucleotide encoding one or more nucleases capable of effecting site-specific modification of a target nucleotide sequence; and/or (iv) a donor template polynucleotide. In certain embodiments, at least one delivery agent is selected from the group consisting of solvents, fluorocarbons, glycols or polyols, surfactants; primary, secondary, or tertiary amines and quaternary ammonium salts; organosilicone surfactants; lipids, lipoproteins, lipopolysaccharides; acids, bases, caustic agents; peptides, proteins, or enzymes; cell-penetrating peptides; RNase inhibitors; cationic branched or linear polymers; dendrimers; counter-ions, amines or polyamines, osmolytes, buffers, and salts; polynucleotides; transfection agents; antibiotics; chelating agents such as ammonium oxalate, EDTA, EGTA, or cyclohexane diamine tetraacetate, non-specific DNA double-strand-break-inducing agents; and antioxidants; particles or nanoparticles, magnetic particles or nanoparticles, abrasive or scarifying agents, needles or microneedles, matrices, and grids.
Thus, provided for herein is an edited plastocyanin gene. The edited plastocyanin gene can comprise an edit (e.g., at least one nucleotide insertion, deletion, and/or substitution) in any sequence/region of the gene, for example, in the promoter sequence, 5′ untranslated region (5′ UTR), exons, introns, 3′ UTR, etc., as long as the edit alters the expression and/or a characteristic of a gene product and/or activity of the plastocyanin protein. The at least one nucleotide insertion, deletion, and/or substitution can be short, e.g., one nucleotide, or it can be longer, e.g., comprising 10, 20, 30, 40, 50, 75, or more nucleotides that have been inserted, deleted, and/or substituted. In certain embodiments, the plastocyanin gene is disrupted or knocked-out. For example, a polynucleotide variant of a promoter or untranslated region could cause a decrease in expression of the plastocyanin protein and thus an overall decrease in plastocyanin activity in a cell, even if the plastocyanin protein itself or its characteristics are unaltered. An edit in a protein coding region (e.g., exons) can result in a variant plastocyanin protein amino acid sequence. The polynucleotide variant can include a frameshift, missense, and/or nonsense mutation. Such a polynucleotide variant can result in a protein variant with at least one amino acid insertion, deletion, substitution, and/or truncation of the protein. The protein variant can have a conservative or a non-conservative substitution. In certain embodiments, the edit alters the activity of the variant plastocyanin protein. In certain embodiments, the activity of plastocyanin protein is reduced or abolished in comparison to the wild-type protein. In certain embodiments, when present in a plant and/or plant chromosome, the variant polynucleotide results in cold-stress tolerance.
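By way of non-limiting illustration, the distinction among frameshift, missense, and nonsense mutations in a protein coding region can be sketched in Python as follows. The sketch uses the standard genetic code and short toy coding sequences that are hypothetical and are not any of SEQ ID NOs: 1-11; the function names and the simplified classification rules (for example, any same-length edit that shortens the translated protein is reported as nonsense) are assumptions made for this example only.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(coding_sequence):
    """Translate in frame 0, stopping after the first stop codon ('*')."""
    protein = []
    for i in range(0, len(coding_sequence) - len(coding_sequence) % 3, 3):
        residue = CODON_TABLE[coding_sequence[i:i + 3]]
        protein.append(residue)
        if residue == "*":
            break
    return "".join(protein)


def classify_coding_edit(wild_type, edited):
    """Simplified classification of an edit within a coding sequence."""
    if wild_type == edited:
        return "no edit"
    if len(wild_type) != len(edited):
        # An insertion or deletion shifts the frame unless its net length
        # is a multiple of three.
        if (len(edited) - len(wild_type)) % 3 != 0:
            return "frameshift"
        return "in-frame insertion/deletion"
    wt_protein, ed_protein = translate(wild_type), translate(edited)
    if wt_protein == ed_protein:
        return "silent"
    if len(ed_protein) < len(wt_protein):
        return "nonsense"   # premature stop codon truncates the protein
    return "missense"       # amino acid substitution


if __name__ == "__main__":
    toy_wild_type = "ATGGCTTGGAAATAA"   # hypothetical sequence encoding M-A-W-K-stop
    for toy_edit in ("ATGGTTTGGAAATAA", "ATGGCTTGAAAATAA", "ATGGCTGGAAATAA"):
        print(toy_edit, classify_coding_edit(toy_wild_type, toy_edit))
```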
In certain embodiments, the edited plastocyanin gene comprises (i) a variant polynucleotide encoding a plastocyanin protein variant or fragment thereof. The variant polynucleotide comprises at least one nucleotide insertion, deletion, and/or substitution in comparison to the corresponding unedited wild-type polynucleotide sequence and does not encode a wild-type plastocyanin protein. In certain embodiments, the variant polynucleotide is operably linked to a polynucleotide comprising a promoter for expression of the plastocyanin protein. In certain embodiments, a nucleotide insertion, deletion, and/or substitution is in Exon 1 or Exon 2 of the edited plastocyanin gene. For example, SEQ ID NO: 3 is the wild-type S. viridis plastocyanin Exon 1 sequence (similarly, SEQ ID NO: 5 is the wild-type Sorghum 5′ UTR/Exon 1 sequence). The polynucleotides of SEQ ID NO: 8 and SEQ ID NO: 10 represent edited variants of S. viridis Exon 1 and code for the plastocyanin protein/fragments of SEQ ID NO: 7 and SEQ ID NO: 9, respectively. In certain embodiments, the variant polynucleotide encodes a plastocyanin protein variant or fragment thereof having altered/reduced activity in comparison to the corresponding wild-type plastocyanin protein and/or the variant polynucleotide alters/decreases expression, transcription, intron splicing, and/or translation; alters the post-translational processing; and/or alters the sub-cellular localization of the plastocyanin gene, mRNA, and/or protein. In certain embodiments, when present in a plant and/or plant chromosome, the variant polynucleotide results in cold-stress tolerance.
In certain embodiments, the edited plastocyanin gene comprises (ii) a variant polynucleotide comprising a plastocyanin gene 3′ UTR. The variant polynucleotide comprises at least one nucleotide insertion, deletion, and/or substitution in the 3′ UTR in comparison to the corresponding unedited wild-type polynucleotide sequence. In certain embodiments, the variant polynucleotide is operably linked to a polynucleotide comprising a promoter and/or a plastocyanin protein coding region. For example, SEQ ID NO: 4 is the wild-type S. viridis plastocyanin 3′ UTR sequence (similarly, SEQ ID NO: 6 is the wild-type Sorghum 3′ UTR sequence). The polynucleotide of SEQ ID NO: 11 represents an edited variant of the S. viridis 3′ UTR. In certain embodiments, the variant polynucleotide alters/decreases expression, transcription, intron splicing, and/or translation; alters the post-translational processing; and/or alters the sub-cellular localization of the plastocyanin gene, mRNA, and/or protein in comparison to an otherwise identical plastocyanin gene having a wild-type 3′ UTR. In certain embodiments, the insertion, deletion, and/or substitution in the 3′ UTR results in altered/reduced plastocyanin activity in comparison to an otherwise identical plastocyanin gene having a wild-type 3′ UTR. In certain embodiments, when present in a plant and/or plant chromosome, the variant polynucleotide results in cold-stress tolerance.
In certain embodiments, the edited plastocyanin gene comprises (iii) a variant polynucleotide comprising a plastocyanin gene 5′ UTR. The variant polynucleotide comprises at least one nucleotide insertion, deletion, and/or substitution in the 5′ UTR in comparison to the corresponding unedited wild-type polynucleotide sequence. In certain embodiments, the variant polynucleotide is operably linked to a polynucleotide comprising a promoter and/or a plastocyanin protein coding region. For example, SEQ ID NO: 5 is the wild-type Sorghum 5′ UTR/Exon 1 sequence. In certain embodiments, the variant polynucleotide alters/decreases expression, transcription, intron splicing, and/or translation; alters the post-translational processing; and/or alters the sub-cellular localization of the plastocyanin gene, mRNA, and/or protein in comparison to an otherwise identical plastocyanin gene having a wild-type 5′ UTR. In certain embodiments, the insertion, deletion, and/or substitution in the 5′ UTR results in altered/reduced plastocyanin activity in comparison to an otherwise identical plastocyanin gene having a wild-type 5′ UTR. In certain embodiments, when present in a plant and/or plant chromosome, the variant polynucleotide results in cold-stress tolerance.
In certain embodiments, the edited plastocyanin gene comprises (iv) a variant polynucleotide comprising a plastocyanin gene promoter. The variant polynucleotide comprises at least one nucleotide insertion, deletion, and/or substitution in the promoter in comparison to the corresponding unedited wild-type polynucleotide sequence. In certain embodiments, the variant polynucleotide is operably linked to a polynucleotide comprising a plastocyanin protein coding region. In certain embodiments, the variant polynucleotide alters/decreases expression, transcription, intron splicing, and/or translation; alters the post-translational processing; and/or alters the sub-cellular localization of the plastocyanin gene, mRNA, and/or protein in comparison to an otherwise identical plastocyanin gene having a wild-type promoter. In certain embodiments, the insertion, deletion, and/or substitution in the promoter results in altered/reduced plastocyanin activity in comparison to an otherwise identical plastocyanin gene having a wild-type promoter. In certain embodiments, when present in a plant and/or plant chromosome, the variant polynucleotide results in cold-stress tolerance.
In certain embodiments, the edited plastocyanin gene comprises (v) a variant polynucleotide comprising a plastocyanin gene intron. The variant polynucleotide comprises at least one nucleotide insertion, deletion, and/or substitution in the intron in comparison to the corresponding unedited wild-type polynucleotide sequence. In certain embodiments, the variant polynucleotide is operably linked to a polynucleotide comprising at least one plastocyanin gene exon. In certain embodiments, the variant polynucleotide alters/decreases expression, transcription, intron splicing, and/or translation; alters the post-translational processing; and/or alters the sub-cellular localization of the plastocyanin gene, mRNA, and/or protein in comparison to an otherwise identical plastocyanin gene having a wild-type intron. In certain embodiments, the insertion, deletion, and/or substitution in the intron results in altered/reduced plastocyanin activity in comparison to an otherwise identical plastocyanin gene having a wild-type intron. In certain embodiments, when present in a plant and/or plant chromosome, the variant polynucleotide results in cold-stress tolerance.
Similarly, in certain embodiments, an edited plastocyanin gene comprises (vi) a variant polynucleotide encoding (a) a transit peptide, a vacuolar targeting peptide, and/or an endoplasmic reticulum targeting peptide; (b) a plastid targeting peptide; and/or (c) a polyadenylation or transcriptional termination signal. The variant polynucleotide comprises at least one nucleotide insertion, deletion, and/or substitution in (a), (b), and/or (c) in comparison to the corresponding unedited wild-type polynucleotide sequence. In certain embodiments, the polynucleotides of (a), (b), and/or (c) are operably linked to a polynucleotide encoding a plastocyanin protein. In certain embodiments, the variant polynucleotide alters/decreases expression, transcription, intron splicing, and/or translation; alters the post-translational processing; and/or alters the sub-cellular localization of the plastocyanin gene, mRNA, and/or protein in comparison to an otherwise identical plastocyanin gene having a wild-type (a), (b), and/or (c). In certain embodiments, the insertion, deletion, and/or substitution in (a), (b), and/or (c) results in altered/reduced plastocyanin activity in comparison to an otherwise identical plastocyanin gene having a wild-type (a), (b), and/or (c). In certain embodiments, when present in a plant and/or plant chromosome, the variant polynucleotide results in cold-stress tolerance.
In certain embodiments of any edited plastocyanin gene of this disclosure, the variant polynucleotide is integrated into the nuclear or plastid genome of a cell. In certain embodiments, the nuclear or plastid genome is of a plant cell. Thus, the disclosure provides for a plant nuclear or plastid genome comprising an edited plastocyanin gene of this disclosure. In certain embodiments, the variant polynucleotide is heterologous to the nuclear or plastid genome. In certain embodiments, the variant polynucleotide is operably linked to an endogenous promoter of the nuclear or plastid genome, for example, a wild-type plastocyanin promoter. In certain embodiments, the edited gene or the nuclear or plastid genome further comprises a wild-type or variant polynucleotide encoding (a) a transit peptide, a vacuolar targeting peptide, and/or an endoplasmic reticulum targeting peptide; (b) a plastid targeting peptide; and/or (c) a polyadenylation or transcriptional termination signal, wherein the polynucleotides of (a), (b), and/or (c) are operably linked to the polynucleotide encoding the plastocyanin protein.
In certain embodiments, the nuclear or plastid genome is a monocot crop plant or a dicot crop plant nuclear or plastid genome and/or the edited plastocyanin gene is from a monocot crop plant or a dicot crop plant. For example, in certain embodiments, a monocot crop plant is selected from the group consisting of a corn, barley, oat, pearl millet, rice, sorghum, sugarcane, turf grass, and wheat. In certain embodiments, the nuclear or plastid genome is a C3 plant or a C4 plant nuclear or plastid genome and/or the edited plastocyanin gene is from a C3 plant or a C4 plant. Representative examples of C3 plants include barley, oats, rice, and wheat, alfalfa (lucerne), cotton, Eucalyptus, sunflower, soybeans, sugar beets, potatoes, and tobacco. Representative examples of C4 plants include maize, sugarcane, and sorghum.
Also provided for herein is a cell comprising the edited gene or nuclear or plastid genome of this disclosure. In certain embodiments, the cell is a plant, yeast, mammalian, or bacterial cell. In certain embodiments, the cell is a plant cell, such as from a plant disclosed elsewhere herein. In certain embodiments, the cell is a plant cell that is non-regenerable.
Also provided for herein is a plant comprising the edited gene or nuclear or plastid genome of this disclosure. In certain embodiments, the plant is a monocot crop plant or a dicot crop plant. For example, in certain embodiments, a monocot crop plant is selected from the group consisting of a corn, barley, oat, pearl millet, rice, sorghum, sugarcane, turf grass, and wheat. In certain embodiments, the plant is a C3 plant or a C4 plant. Representative examples of C3 plants include barley, oats, rice, and wheat, alfalfa (lucerne), cotton, Eucalyptus, sunflower, soybeans, sugar beets, potatoes, and tobacco. Representative examples of C4 plants include maize, sugarcane, and sorghum. In certain embodiments, the edited gene or nuclear or plastid genome confers to the plant cold-stress tolerance in comparison to a control plant that lacks the edited gene or nuclear or plastid genome. Thus, in certain embodiments, the plant is cold-stress tolerant. Further, certain embodiments provide for a plant part of a plant above, wherein the plant part comprises the edited gene or nuclear or plastid genome. In certain embodiments, the plant part is a seed, stem, leaf, root, tuber, flower, or fruit.
Also provided for herein is a seed produced by a plant of this disclosure wherein said seed comprises a detectable amount of the variant polynucleotide encoding a plastocyanin protein variant or fragment. In certain embodiments, the seed comprises a transgene comprising a heterologous variant polynucleotide. In certain embodiments, the seed comprises an endogenous edited gene comprising the variant polynucleotide. In certain embodiments, the seed is coated with a composition comprising an insecticide and/or a fungicide. Also provided for herein is a plant propagation material comprising the coated seed.
Certain embodiments provide for a plant or seed of this disclosure for use in a method of plant breeding, crop production, or for making a processed plant product.
This disclosure provides for a method for obtaining a cold-stress tolerant plant comprising the edited gene or plant nuclear or plastid genome disclosed herein. The plant can be any plant as described herein. In certain embodiments, the plant is a monocot crop plant or a dicot crop plant. For example, in certain embodiments, a monocot crop plant is selected from the group consisting of a corn, barley, oat, pearl millet, rice, sorghum, sugarcane, turf grass, and wheat. In certain embodiments, the plant is a C3 plant or a C4 plant. Representative examples of C3 plants include barley, oats, rice, and wheat, alfalfa (lucerne), cotton, Eucalyptus, sunflower, soybeans, sugar beets, potatoes, and tobacco. Representative examples of C4 plants include maize, sugarcane, and sorghum. Such method comprises the steps of: (i) introducing the edited gene, the variant polynucleotide encoding the plastocyanin protein, the polynucleotide comprising the promoter, a fragment of said polynucleotides, or a combination of said polynucleotides into a plant cell, tissue, plant part, or whole plant; (ii) obtaining a plant cell, tissue, part, or whole plant wherein the edited gene, the variant polynucleotide encoding the plastocyanin protein, the polynucleotide comprising the promoter, a fragment of said polynucleotides, or a combination of said polynucleotides is integrated into the plant nuclear or plastid genome; and (iii) selecting a plant obtained from the plant cell, tissue, part or whole plant of step (ii) for expression of a variant plastocyanin protein, thereby obtaining a plant that is cold-stress tolerant.
This disclosure also provides for a method for obtaining a cold-stress tolerant plant comprising the edited gene or plant nuclear or plastid genome disclosed herein. The plant can be any plant as described herein. In certain embodiments, the plant is a monocot crop plant or a dicot crop plant. For example, in certain embodiments, a monocot crop plant is selected from the group consisting of a corn, barley, oat, pearl millet, rice, sorghum, sugarcane, turf grass, and wheat. In certain embodiments, the plant is a C3 plant or a C4 plant. Representative examples of C3 plants include barley, oats, rice, and wheat, alfalfa (lucerne), cotton, Eucalyptus, sunflower, soybeans, sugar beets, potatoes, and tobacco. Representative examples of C4 plants include maize, sugarcane, and sorghum. Such method comprises introducing into a plant cell one or more gene editing molecules, as described in greater detail elsewhere herein, that target an endogenous plastocyanin gene to introduce at least one nucleotide insertion, deletion, and/or substitution into the endogenous plastocyanin gene. In certain embodiments, the method further comprises selecting a plant comprising a gene edited plastocyanin gene that expresses a variant plastocyanin protein. Certain embodiments further comprise selecting a plant that is cold-stress tolerant. Thus, in certain embodiments, the gene editing transforms a cold-stress susceptible plant or plant line to a cold-stress tolerant plant or plant line.
This disclosure also provides for a method for obtaining a cold-stress tolerant plant comprising the edited gene or nuclear or plastid genome of this disclosure. The plant can be any plant as described herein. In certain embodiments, the plant is a monocot crop plant or a dicot crop plant. For example, in certain embodiments, a monocot crop plant is selected from the group consisting of a corn, barley, oat, pearl millet, rice, sorghum, sugarcane, turf grass, and wheat. In certain embodiments, the plant is a C3 plant or a C4 plant. Representative examples of C3 plants include barley, oats, rice, and wheat, alfalfa (lucerne), cotton, Eucalyptus, sunflower, soybeans, sugar beets, potatoes, and tobacco. Representative examples of C4 plants include maize, sugarcane, and sorghum. Such method comprises the steps of: (i) providing to a plant cell, tissue, part, or whole plant an endonuclease or an endonuclease and at least one guide RNA, wherein the endonuclease or guide RNA and endonuclease can form a complex that can introduce a double strand break at a target site in a nuclear or plastid genome of the plant cell, tissue, part, or whole plant; (ii) obtaining a plant cell, tissue, part, or whole plant wherein at least one nucleotide insertion, deletion, and/or substitution has been introduced into the corresponding wild-type polynucleotide sequence; and (iii) selecting a plant obtained from the plant cell, tissue, part or whole plant of step (ii) comprising the edited gene for expression of a plant plastocyanin protein variant or fragment, thereby obtaining a plant that is cold-stress tolerant. Certain embodiments further include, before step (ii), providing a template polynucleotide comprising the polynucleotide encoding a plastocyanin protein or a fragment thereof. In certain embodiments, the endonuclease is a Cas endonuclease. In certain embodiments, the guide RNA is a clustered regularly interspaced short palindromic repeats (CRISPR)-associated (Cas)-guide RNA.
This disclosure also provides for a method of producing/breeding a cold-stress tolerant plant. The plant can be any plant as described herein. In certain embodiments, the plant is a monocot crop plant or a dicot crop plant. For example, in certain embodiments, a monocot crop plant is selected from the group consisting of a corn, barley, oat, pearl millet, rice, sorghum, sugarcane, turf grass, and wheat. In certain embodiments, the plant is a C3 plant or a C4 plant. Representative examples of C3 plants include barley, oats, rice, and wheat, alfalfa (lucerne), cotton, Eucalyptus, sunflower, soybeans, sugar beets, potatoes, and tobacco. Representative examples of C4 plants include maize, sugarcane, and sorghum. Such method comprises crossing a cold-stress tolerant plant of this disclosure with one or more other plants to produce a population of progeny plants. In certain embodiments, one or more of the other plants comprise an endogenous edited gene. In certain embodiments, the method further comprises screening the population of progeny plants to identify cold-stress tolerant plants. In certain embodiments, the population of plants is screened by genotyping to detect a variant polynucleotide encoding a plastocyanin protein variant or fragment. In certain embodiments, the method further comprises selecting a progeny plant based on its genotype and/or phenotype.
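As a non-limiting illustration of how candidate target sites for a guide RNA and Cas endonuclease complex might be enumerated, the following Python sketch scans the forward strand of a sequence for 20-nucleotide protospacers immediately followed by an NGG protospacer adjacent motif (PAM). The NGG PAM and 20-nucleotide length are the commonly used parameters for the Cas9 of Streptococcus pyogenes and are assumptions of this example, as are the function name and the toy input sequence; the disclosure is not limited to any particular nuclease, PAM, or target site, and the sketch does not consider the reverse strand or off-target effects.

```python
import re

# Illustrative sketch only. The NGG PAM and the 20-nt protospacer length are
# assumed (typical of S. pyogenes Cas9); the toy sequence is hypothetical.

def find_forward_strand_sites(sequence, protospacer_length=20):
    """Return (start, protospacer, pam) for each forward-strand site followed by NGG."""
    seq = sequence.upper()
    # The lookahead allows overlapping candidate sites to be reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_length)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


if __name__ == "__main__":
    toy_sequence = "ACGTACGTACGTACGTACGT" + "AGG" + "TT"
    for start, protospacer, pam in find_forward_strand_sites(toy_sequence):
        print(start, protospacer, pam)
```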
Certain embodiments also provide for the transgenic expression of cold-stress tolerance variants in transgenic plants. Methods of producing transgenic constructs and transgenic plants are well-known in the art. Expression cassettes that provide for expression of polypeptides in monocotyledonous plants, dicotyledonous plants, or both can be constructed. Such expression cassette construction can be effected either in a plant expression vector or in the genome of a plant. Expression cassettes are DNA constructs wherein various promoter, coding, and polyadenylation sequences are operably linked. In general, expression cassettes typically comprise a promoter that is operably linked to a sequence of interest, which is operably linked to a polyadenylation or terminator region. In certain instances including, but not limited to, the expression of transgenes in monocot plants, it can also be useful to include an intron sequence. When an intron sequence is included it is typically placed in the 5′ untranslated leader region of the transgene. In certain instances, it can also be useful to incorporate specific 5′ untranslated sequences in a transgene to enhance transcript stability or to promote efficient translation of the transcript.
The DNA constructs that comprise the plant expression cassettes can either be constructed in the plant genome by using site specific insertion of heterologous DNA into the plant genome, by mutagenizing the plant genome, and/or by introducing the expression cassette into the plant genome with a vector or other DNA transfer method. Vectors contain sequences that provide for the replication of the vector and covalently linked sequences in a host cell. For example, bacterial vectors will contain origins of replication that permit replication of the vector in one or more bacterial hosts. Agrobacterium-mediated plant transformation vectors typically comprise sequences that permit replication in both E. coli and Agrobacterium as well as one or more “border” sequences positioned so as to permit integration of the expression cassette into the plant chromosome. Such Agrobacterium vectors can be adapted for use in either Agrobacterium tumefaciens or Agrobacterium rhizogenes. Selectable markers encoding genes that confer resistance to antibiotics are also typically included in the vectors to provide for their maintenance in bacterial hosts.
A transgenic plant containing or comprising an expression vector can be obtained by regenerating that transgenic plant from the plant, plant cell, protoplast, or plant tissue that received the expression vector.
To generate enough seed for commercial distribution, the seed of commercial crops can be gathered from a plurality of plants and pooled together to create a seed lot. A commercial seed lot of a crop preferably contains a plurality of seeds that share similar or identical characteristics such as species, variety, genetic makeup, and/or similar germination rates. Provided for herein is a method of producing a commercial crop seed lot of sorghum seeds comprising in their genomes at least one cold-stress tolerance genetic locus and/or genotype associated with cold-stress tolerance of this disclosure. In certain embodiments, the genetic locus is an introgressed locus as described herein. In certain embodiments, the method comprises the steps of: (a) producing a population of sorghum plants, such as by but not limited to methods described elsewhere herein, comprising a genotype associated with cold-stress tolerance; and (b) harvesting a commercial seed lot, wherein the harvested crop seed lot comprises a plurality of seed that comprise in their genomes at least one cold-stress tolerance genetic locus. In certain embodiments, the method comprises the steps of: (a) producing a population of sorghum plants, such as by but not limited to methods described elsewhere herein, comprising a genotype associated with cold-stress tolerance and at least one linked marker found in said second sorghum plant comprising a non-cold-stress tolerance genetic locus but not found in said first sorghum plant; and (b) harvesting a commercial seed lot, wherein the harvested crop seed lot comprises a plurality of seed that comprise in their genomes at least one cold-stress tolerance genetic locus.
In certain embodiments, the method comprises the steps of: (a) producing a population of sorghum plants, such as by but not limited to methods described elsewhere herein, comprising an introgressed genomic cold-stress tolerance genetic locus; and (b) harvesting a commercial seed lot, wherein the harvested crop seed lot comprises a plurality of seed that comprise in their genomes at least one introgressed cold-stress tolerance genetic locus. Certain embodiments may include a combination of any of the above.
In certain embodiments, the seed lot comprises at least 100 seeds, at least 500 seeds, at least 1,000 seeds, at least 5,000 seeds, at least 10,000 seeds, at least 25,000 seeds, at least 50,000 seeds, or at least 100,000 seeds. It is contemplated that a harvested commercial crop seed lot of the disclosure will not necessarily contain only seeds that comprise in their genomes at least one cold-stress tolerance genetic locus. For example, some amount of contamination of other seeds into the harvested seed lot may occur. However, a high percentage of seeds comprising a cold-stress tolerance genetic locus is attainable. In certain embodiments, the plurality of seeds that comprise in their genomes at least one cold-stress tolerance genetic locus constitute at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the seed of the harvested seed lot. In certain embodiments, the plurality of seeds that comprise in their genomes at least one introgressed cold-stress tolerance genetic locus constitute at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the seed of the harvested seed lot.
In certain embodiments, such as for a method for producing a commercial crop seed lot, the seed of the harvested commercial crop seed lot is packaged into one or more bags. Such packaging results in one or more packaged seed bags. In one embodiment, the packaged seeds, such as in packaged seed bags, are further distributed to growers for use in crop production.
One of skill in the art will recognize that packaged seed bags of a commercial crop seed lot destined for distribution to growers for use in crop production will preferably comprise a large number of seeds, for example, at least one hundred seeds, at least one thousand seeds, at least ten thousand seeds, at least one hundred thousand seeds, or at least one million seeds.
Provided for herein is a commercial crop seed lot wherein a plurality of seeds comprise at least one cold-stress tolerance genetic locus and/or genotype associated with cold-stress tolerance of this disclosure. In certain embodiments a commercial crop seed lot comprises a plurality of seeds comprising at least one such cold-stress tolerance genetic locus that has been introgressed. Commercial crop seed lots of sorghum seeds can be made by the methods described in detail elsewhere herein.
In certain embodiments, the seed lot comprises at least 100 seeds, at least 500 seeds, at least 1,000 seeds, at least 5,000 seeds, at least 10,000 seeds, at least 25,000 seeds, at least 50,000 seeds, or at least 100,000 seeds. Further, in certain embodiments, the plurality of seeds that comprise in their genomes at least one cold-stress tolerance genetic locus constitute at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the seed of the seed lot. In certain embodiments, the cold-stress tolerance genetic locus is an introgressed locus.
For certain crop species, cross-pollination of certain distinct plant lines can result in hybrid offspring exhibiting a highly desirable heterosis or hybrid vigor which advantageously provides increased yields of the desired crop. In one embodiment, the crossing of parental plants produces a commercial crop seed lot that provides plants that yield at least 5% more than plants produced by selfing either the male parent plants or the female parent plants used to obtain the commercial crop seed lot when the crossed plants and the selfed plants are grown under the same field conditions. In a further embodiment, the crossing of parental plants produces a commercial crop seed lot that provides plants that yield at least 10% more than plants produced by selfing either the male parent plants or the female parent plants used to obtain the commercial crop seed lot when the crossed plants and the selfed plants are grown under the same field conditions. In yet a further embodiment, the crossing of parental plants produces a commercial crop seed lot that provides plants that yield at least 15% more than plants produced by selfing either the male parent plants or the female parent plants used to obtain the commercial crop seed lot when the crossed plants and the selfed plants are grown under the same field conditions.
One of skill in the art will recognize that certain standards may be set—as may be set for certification—for a commercial crop seed lot. These standards may vary according to the crop selected and different classes of standards may exist such as “breeder,” “foundation,” “registered,” and “certified.”
One of skill in the art will recognize that the preceding standards are illustrative and that standards may vary depending on geographical location and as set by different regulatory entities. The standards disclosed herein are consistent with standards that are practiced in the field of commercial seed certification. Such standards are useful guidelines, however, the present disclosure is not to be interpreted as limited only to such standards or crops, but also to encompass other standards and crops as are known to those skilled in the art.
Provided for herein is a method of growing a sorghum plant comprising in its genome at least one cold-stress tolerance genetic locus and/or genotype associated with cold-stress tolerance of this disclosure, whether an introgressed locus or not, under cold-stress conditions. Also included are progeny plants and populations of plants. In certain embodiments, the method comprises growing a sorghum plant under cold conditions sufficient to cause a deleterious effect in a non-cold-stress tolerant variety. In certain embodiments, the sorghum plant is grown at an ambient temperature of less than 15° C., a temperature that will negatively impact both seedling emergence and vigor.
In certain embodiments, sorghum plants are grown in a temperature controlled environment, such as a greenhouse or other enclosure, in which the plants are exposed to cold-stress conditions. In certain embodiments, plants are grown in an open field such as for large scale breeding, seed production, and crop production. When grown in an open field, the sorghum plants will be subject to the prevailing weather conditions.
Due to the seasons, certain times of the year are cooler and less amenable if not prohibitive to planting sorghum. One advantage of the cold-stress tolerant sorghum plants of the present disclosure is to allow planting, for example earlier in the year, to improve yields. In certain embodiments, cold-stress tolerant plants of this disclosure, progeny, or population produced therefrom are planted and/or grown in the month of February, March, or April. In certain embodiments, cold-stress tolerant plants of this disclosure, progeny, or population produced therefrom are planted and/or grown in the month of April.
The climate and thus seasonal temperature varies with geographical location. In general, in the northern hemisphere, the climate is cooler the farther north one goes. Thus, another advantage of the cold-stress tolerant sorghum plants of the present disclosure is to allow planting in geographical locations where sorghum hasn't been traditionally grown or is unable to grow because the climate is too cool. In certain embodiments, cold-stress tolerant plants of this disclosure, progeny, or population produced therefrom are planted and/or grown in the northern portion of the continental United States (
Altitude can also affect temperature. In general, temperatures become cooler at higher altitudes. Thus, even in locations south of, for example, the northern portion of the continental United States, sorghum may not be easily grown or grown at all at higher altitudes. Another advantage of the cold-stress tolerant sorghum plants of the present disclosure is to allow planting in geographical locations at altitudes where sorghum hasn't been traditionally grown or is unable to grow because the climate is too cool. In certain embodiments, cold-stress tolerant plants of this disclosure, progeny, or population produced therefrom are planted and/or grown at an altitude in a geographical location wherein the ambient temperature drops below 15° C. during the growing season.
Further, it is within this disclosure to grow cold-stress tolerant plants of this disclosure, progeny, or population produced therefrom under a combination of any of the above temperature, time of year, geographical location, and/or altitude.
Plant Materials
This study used the sorghum Bioenergy Association Panel (BAP) (Brenton, Z. W., et al. (2016) Genetics, 204(1): 21-33). Details about the panel design, GBS genotyping, marker distribution, population structure, and linkage disequilibrium (LD) decay have been previously described (Brenton, Z. W., et al. (2016) Genetics, 204(1): 21-33). 369 BAP accessions were genotyped at 232,303 SNPs. The BAP accessions represent a racially, geographically, and phenotypically diverse selection of sorghum accessions, but are limited to accessions exhibiting key bioenergy traits, such as height, sensitivity to photoperiod, and delayed flowering.
Experimental Conditions
Three replicates of 369 BAP accessions (1131 plants) were planted in 600 g of Turface® in 8-inch tall tree pots, with 14-14-14 Osmocote (1.5 lb/cubic yard) fertilizer. The potted seeds were held overnight in a growth chamber at 32° C. (day)/22° C. (night). The next day, the pots were loaded into a carrier on an automated phenotyping system within a controlled-environment plant growth chamber. The phenotyping system moved the plants on a closed-loop conveyor path to stations for daily watering, weighing, and imaging. The plants were positioned in the growth chamber in a randomized block design and were rotated one-half lane each day to reduce edge effects. To study the effect of cold stress on these accessions, plants were grown at 15° C. (day)/15° C. (night) for 31 days, 24° C. (day)/19° C. (night) for 7 days, and 32° C. (day)/22° C. (night) for 18 days. The soil was maintained at 100% field water capacity by watering the plants twice daily to a target weight of 1192 grams. The target weight was calculated by adding the weight of the plant carrier (342 g), the water weight at saturation (250 g), and the weight of the Turface®-filled pot (600 g). Water was added after each carrier including the potted plant was weighed, and the volume of water added to reach the target weight of 1192 grams was recorded.
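The watering arithmetic described above can be sketched as follows. This is a minimal illustration only; the function name water_to_add and the rule that a carrier at or above the target receives no water are assumptions of the sketch, not details stated in the disclosure.

```python
# Component weights in grams, taken from the description above.
CARRIER_G = 342              # plant carrier
WATER_AT_SATURATION_G = 250  # water held at 100% field capacity
TURFACE_POT_G = 600          # Turface-filled pot

# Target weight used at each watering station: 342 + 250 + 600 = 1192 g.
TARGET_WEIGHT_G = CARRIER_G + WATER_AT_SATURATION_G + TURFACE_POT_G


def water_to_add(measured_weight_g: float, target_g: float = TARGET_WEIGHT_G) -> float:
    """Return the grams of water needed to bring a carrier back to the target weight.

    Assumption of this sketch: a carrier already at or above the target
    receives no water, so the result is never negative.
    """
    return max(0.0, target_g - measured_weight_g)
```

For example, a carrier weighing 1100 g would receive 92 g of water under this sketch, and that amount would be the value recorded for the watering event.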
Image Collection and Phenotyping
Each plant passed through a visible light imaging chamber daily while on the automated phenotyping system. The imaging cameras recorded two side views and one top view image. As the plants grew, the fields of view on the cameras were adjusted so that the entire plant could be captured in each image. The optical zoom level was reduced for the top view and side view images at 19 DAP and 40 DAP. Scaling factors were calculated for both area and height using a reference object of known size, so that pixel areas across zoom levels were comparable. After eight weeks (56 DAP) the plants were removed from the phenotyping system. The shoot of each plant was cut at the base of the stem. Fresh weight measurements of the shoots were collected immediately.
The images were analyzed using Plant Computer Vision (PlantCV), an image-processing tool coded in Python (Fahlgren, N., et al. (2015) Molecular Plant, 8(10): 1520-1535). The pixel areas from the daily top view and two side view images of each plant were analyzed with PlantCV to generate measurements of area, hull area, height, RGR, and WUE.
The area was calculated by adding the pixel count from the top view image to the two side view images. Endpoint fresh weight measurements were correlated with area calculations at 51 DAP in order to estimate biomass (Fahlgren, N., et al. (2015) Molecular Plant, 8(10): 1520-1535). The values at 51 DAP were used for the correlation because, after that date, plants began to overlap and grow outside of the camera's field of view, and the resulting pixel counts were less indicative of the actual plant size.
The hull area is the area of the convex hull, that is, the smallest convex region in the image plane that encloses all of the plant pixels. WUE was calculated by dividing the derived area by the cumulative water added to each plant. Height was determined from the side view images.
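As an illustration of the two derived quantities described above, the following sketch computes a convex hull area from a set of plant pixel coordinates and a WUE value from an area and a cumulative water amount. It is not the PlantCV implementation; it uses the standard monotone chain hull algorithm and the shoelace formula, and all function names are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def _cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Return hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and _cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and _cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


def polygon_area(vertices: List[Point]) -> float:
    """Area of a simple polygon by the shoelace formula."""
    n = len(vertices)
    if n < 3:
        return 0.0
    twice_area = sum(
        vertices[i][0] * vertices[(i + 1) % n][1]
        - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0


def water_use_efficiency(area: float, cumulative_water: float) -> float:
    """WUE as derived area divided by cumulative water added."""
    if cumulative_water <= 0:
        raise ValueError("cumulative water must be positive")
    return area / cumulative_water
```

In this sketch, interior points do not affect the hull area, which is why hull area can differ markedly from pixel area for plants with sparse, spreading leaves.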
The relative growth rate (RGR) was calculated as described in Hoffmann and Poorter, using estimator 2 to determine the RGR for each distinct time point (Hoffmann, W. A., & Poorter, H. (2002) Annals of Botany, 90(1): 37-42). Briefly, we calculated the natural log of all replicate area values in the experiment, then calculated the mean of the log-transformed values for two time points, t1 and t2. The mean value of the log-transformed areas for t1, W1, is subtracted from the mean value of the log-transformed areas for t2, W2, and the difference is then divided by the difference between t2 and t1 as shown:
$$\mathrm{RGR} = \frac{W_2 - W_1}{t_2 - t_1}$$
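The RGR calculation described above can be sketched in a few lines. This is a minimal illustration of the difference of mean log-transformed areas divided by the time interval; the function name and argument layout are assumptions of the sketch.

```python
import math
from statistics import mean
from typing import Sequence


def relative_growth_rate(
    areas_t1: Sequence[float],
    areas_t2: Sequence[float],
    t1: float,
    t2: float,
) -> float:
    """RGR between two time points from replicate area values.

    W1 and W2 are the means of the natural-log-transformed replicate areas
    at t1 and t2; RGR = (W2 - W1) / (t2 - t1).
    """
    if t2 == t1:
        raise ValueError("t1 and t2 must differ")
    w1 = mean([math.log(a) for a in areas_t1])
    w2 = mean([math.log(a) for a in areas_t2])
    return (w2 - w1) / (t2 - t1)
```

Log-transforming each replicate before averaging, rather than taking the log of the mean area, is the point of the estimator: it avoids the bias that arises when replicate sizes vary.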
Germination dates were determined for each plant based on the top view area measurements. The earliest DAP with a top view area measure was considered the germination date. The resulting germination date for the three replicates of each accession was averaged to determine the germination date for each accession.
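The germination-date rule described above can be sketched as follows. The data layout (one dictionary per replicate mapping DAP to top view area), the treatment of a zero area as "no measurement", and the skipping of replicates that never germinated are assumptions of this sketch.

```python
from statistics import mean
from typing import Dict, List, Optional


def germination_dap(top_view_area_by_dap: Dict[int, float]) -> Optional[int]:
    """Earliest DAP with a non-zero top view area, or None if none exists."""
    daps = [dap for dap, area in top_view_area_by_dap.items() if area > 0]
    return min(daps) if daps else None


def accession_germination_dap(replicates: List[Dict[int, float]]) -> Optional[float]:
    """Average germination DAP across the replicates that germinated."""
    dates = [d for d in (germination_dap(r) for r in replicates) if d is not None]
    return mean(dates) if dates else None
```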
Data Processing and Analysis
Image-derived phenotypic data were generated for 309 of the BAP accessions (S1). Phenotypic analysis was performed on the image-derived data using the R statistical software (R). Plants that never germinated or that died by 46 DAP were excluded from the analysis. A conservative initial outlier removal step was performed on each phenotype to remove likely artifacts from the image analysis. A data point was considered an outlier and excluded from further analysis if its value was greater than 40 median absolute deviations (MAD) from the median (Davies, L., Gather, U. (1993) Journal of the American Statistical Association, 88(423): 782-792). Raw measurements from PlantCV were smoothed using predicted values from a loess smoothing fit. These fitted values were used as the phenotypes for further analyses (Feldman, M. J., et al. (2017) PLoS Genetics, 13(6): e1006841).
The ‘lmer’ function in the lme4 R package was used to estimate variance components for broad-sense heritability (Bates, D., et al. (2015) Journal of Statistical Software, 67(1): 1-48). Heritability was calculated based on the subset of accessions that had three replicates that germinated. Broad-sense heritability was calculated as:
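The MAD-based outlier rule described above can be sketched as follows. The sketch uses the raw, unscaled MAD; whether the original analysis applied a consistency constant (as the default of the R mad function does) is not stated, so that choice, the function names, and the handling of a zero MAD are all assumptions.

```python
from statistics import median
from typing import List, Sequence


def mad_keep_mask(values: Sequence[float], n_mads: float = 40.0) -> List[bool]:
    """True for values within n_mads median absolute deviations of the median."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # Assumption of this sketch: with zero spread, nothing is flagged.
        return [True] * len(values)
    return [abs(v - med) <= n_mads * mad for v in values]


def remove_outliers(values: Sequence[float], n_mads: float = 40.0) -> List[float]:
    """Drop values flagged as outliers by mad_keep_mask."""
    mask = mad_keep_mask(values, n_mads)
    return [v for v, keep in zip(values, mask) if keep]
```

A threshold of 40 MAD is deliberately loose, so a filter of this kind removes only gross artifacts, such as a mis-segmented image, and leaves genuine biological extremes in the data.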
After the heritability calculation, the median value of the replicates for each accession was used for further analysis. Traits were tested for normality and transformed as necessary using the Box-Cox procedure as implemented in R with the ‘boxcox’ function in the MASS package (Venables, W. N., Ripley, B. D. (2002) Modern applied statistics with S Statistics and Computing, pp. 271-300).
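To illustrate the variance-component idea behind broad-sense heritability, the following sketch estimates genotypic and residual variance from a balanced replicated design by one-way ANOVA (method of moments) and reports their ratio. This is a swapped-in technique for illustration: the analysis described above fitted a mixed model with lmer, and the exact heritability formula used there is not reproduced in this text. The plot-basis definition used here, the function name, and the clamping of negative variance estimates to zero are assumptions.

```python
from statistics import mean
from typing import Dict, List


def broad_sense_heritability(reps_by_accession: Dict[str, List[float]]) -> float:
    """Plot-basis H^2 = var_G / (var_G + var_E) from a balanced one-way layout."""
    groups = list(reps_by_accession.values())
    g = len(groups)
    r = len(groups[0]) if groups else 0
    if g < 2 or r < 2 or any(len(x) != r for x in groups):
        raise ValueError("need a balanced design with >= 2 accessions and >= 2 replicates")

    grand_mean = mean([v for x in groups for v in x])
    group_means = [mean(x) for x in groups]

    # Mean squares between accessions and within accessions.
    ms_between = r * sum((m - grand_mean) ** 2 for m in group_means) / (g - 1)
    ms_within = sum(
        (v - m) ** 2 for x, m in zip(groups, group_means) for v in x
    ) / (g * (r - 1))

    var_g = max(0.0, (ms_between - ms_within) / r)  # genotypic variance
    var_e = ms_within                               # residual variance
    total = var_g + var_e
    return var_g / total if total > 0 else 0.0
```

Restricting the calculation to accessions with all three replicates, as described above, is what makes a balanced estimator of this kind applicable.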
Hierarchical Clustering
Hierarchical clustering of each trait from 12-56 DAP was performed using the ‘hclust’ R function using a Euclidean distance matrix and the Ward agglomeration method (Murtagh, F., Legendre, P. (2014) Journal of Classification, 31(3): 274-295). Clustering results were visualized both as a dendrogram using the R package ‘dendextend’ and as a clustergram using the function ‘clustergram.R’ (Galili T (2010) Clustergram: visualization and diagnostics for cluster analysis on the world wide web at “r-statistics.com/2010/06/clustergram-visualization-and-diagnostics-for-cluster-analysis-r-code”; Schonlau, M. (2002) The Stata Journal. 3: 316-327).
Ideotype Selection
Height-to-biomass ratios were calculated for each day and converted to a percentile with values between 0 and 100. The average percentile of the accessions for height-to-biomass ratio and WUE over the growing period, DAP 15 to 31 or DAP 39 to 56, for the early and late periods, respectively, was used to select ideotype accessions. Ideotype-positive accessions were defined as those with a height-to-biomass ratio in the bottom 5 percent of the population and a WUE in the top 10 percent of the population. Ideotype-negative accessions have a height-to-biomass ratio in the top 5 percent of the population and a WUE in the bottom 10 percent. Radar plots of the ideotype accessions were generated with the R package ‘fmsb’ (Nakazawa, M., & Nakazawa, M. M. (2018) fmsb: Functions for Medical Statistics Book with some Demographic Data. R package version 0.6.3). Heatmaps were generated using the R package ‘pheatmap’ (Kolde, R., & Kolde, M. R. (2018) pheatmap: Pretty Heatmaps. R package version 1.0.10).
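The ideotype selection rule described above can be sketched as follows. The sketch takes one value per accession for each trait, standing in for the period-averaged percentiles; the rank-based percentile definition, the arbitrary ordering of ties, and all function names are assumptions of the sketch.

```python
from typing import Dict, List, Tuple


def percentile_ranks(values: Dict[str, float]) -> Dict[str, float]:
    """Map each accession to a 0-100 percentile based on its rank."""
    ordered = sorted(values, key=lambda acc: values[acc])
    n = len(ordered)
    if n == 1:
        return {ordered[0]: 100.0}
    return {acc: 100.0 * i / (n - 1) for i, acc in enumerate(ordered)}


def classify_ideotypes(
    ratio_pct: Dict[str, float],
    wue_pct: Dict[str, float],
    ratio_cut: float = 5.0,
    wue_cut: float = 10.0,
) -> Tuple[List[str], List[str]]:
    """Return (ideotype_positive, ideotype_negative) accession lists.

    Positive: height-to-biomass ratio in the bottom 5% and WUE in the top 10%.
    Negative: height-to-biomass ratio in the top 5% and WUE in the bottom 10%.
    """
    positive = sorted(
        a for a in ratio_pct
        if ratio_pct[a] <= ratio_cut and wue_pct[a] >= 100.0 - wue_cut
    )
    negative = sorted(
        a for a in ratio_pct
        if ratio_pct[a] >= 100.0 - ratio_cut and wue_pct[a] <= wue_cut
    )
    return positive, negative
```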
Genome-Wide Association Study
Genotyping-by-sequencing (GBS) SNP markers for the BAP have been previously described (Brenton, Z. W., et al. (2016) Genetics. 204(1): 21-33). GWAS was performed using a multi-locus mixed linear model (MLMM) in R to identify loci associated with each trait of interest. The first three principal components of the genotype matrix were included as covariates in the mixed model in order to control for population structure (Price, A. L., et al. (2006) Nature Genetics, 38(8): 904). A kinship matrix, calculated from the genotype matrix using the Astle-Balding method in the ‘synbreed’ package, was also included as a random effect to control for familial and cryptic relatedness between accessions (Wimmer, V., Albrecht, T., Auinger, H. J., Schon, C. C. (2012) Bioinformatics, 28(15): 2086-2087; Astle, W., Balding, D. J. (2009) Statistical Science, 24(4): 451-471) (S2).
MLMM tests for association with the phenotype using a stepwise mixed model regression. In each step, the SNP with the most significant association to the phenotype is added to the model as a covariate. Stepwise addition of SNPs continues until the heritable variance estimate (pseudo-heritability) reaches 0 (Segura, V., et al. (2012) Nature Genetics, 44(7): 825). A final set of high-confidence SNPs was selected as those included as covariates in either of the two optimal models selected by the MLMM software: the multiple-Bonferroni model or the extended BIC model (Segura, V., et al. (2012) Nature Genetics, 44(7): 825). In the multiple-Bonferroni model, all cofactors with a p-value below a Bonferroni corrected threshold are selected. Multiple-Bonferroni was the more stringent of the two models (Segura, V., et al. (2012) Nature Genetics, 44(7): 825). The extended BIC model selects a model based upon BIC penalized by the model complexity (Brenton, Z. W., et al. (2016) Genetics, 204(1): 21-33; Chen, J., Chen, Z. (2008) Biometrika, 95(3): 759-771).
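The Bonferroni-corrected threshold mentioned above can be illustrated with a short sketch. It shows only the threshold arithmetic and a simple filter on cofactor p-values, not the MLMM model selection itself; the significance level of 0.05, the function names, and the example SNP labels are assumptions.

```python
from typing import Dict, List


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test p-value threshold after Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def cofactors_below_threshold(
    p_values: Dict[str, float], n_tests: int, alpha: float = 0.05
) -> List[str]:
    """Names of cofactors whose p-value is below the corrected threshold."""
    threshold = bonferroni_threshold(n_tests, alpha)
    return sorted(name for name, p in p_values.items() if p < threshold)
```

With the 232,303 SNPs genotyped in the BAP and an assumed significance level of 0.05, the corrected threshold would be 0.05 / 232,303, or roughly 2.2e-7.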
The temporal GWAS results were analyzed using ZBrowse, an interactive browser that runs using R (Ziegler, G. R., Hartsock, R. H., & Baxter, I. (2015) PeerJ Computer Science, 1: e3). Using ZBrowse, we were able to view the SNPs for multiple traits simultaneously and plot those traits over time so that we could determine the DAP on which each SNP was significant and thus identify transient QTL peaks that turn “on/off” at specific DAP.
Candidate Gene Identification
The sorghum reference genome assembly version 3.1 was used to identify genes colocalizing with or adjacent to the associated SNPs. A genome scan of 15 kb upstream and 15 kb downstream of each significant SNP was performed to identify candidate genes. Phytozome (https://phytozome.jgi.doe.gov/pz/portal.html) was used to analyze the functional annotation of the candidate genes and to identify putative homologs in other species. Analysis of polymorphisms within candidate genes was done using three diversity panels of resequencing data available for sorghum via Phytozome (Paterson, A. H., et al. (2009) Nature, 457(7229): 551; Mace, E. S., et al. (2013) Nature Communications, 4: 2320; McCormick, R. F., et al. (2018) The Plant Journal, 93(2): 338-354). VCFtools was used to generate final variant calls after merging the three diversity VCFs, based on a minor allele frequency cutoff of >0.05, as well as to generate distinct VCFs for accessions specifically in this study (Danecek, P., et al. (2011) Bioinformatics, 27(15): 2156-2158). Variant effects were estimated using the SnpEff pipeline and the v 3.1 sorghum reference to assess the potential impact of sequence variants on annotated gene models (Cingolani, P., et al. (2012) Fly, 6(2): 80-92). Correlations of the potential effects of polymorphism were based on overlap between publicly available sequence data and BAP accessions with phenotype data from this study.
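The 15 kb window scan described above amounts to an interval overlap test, which can be sketched as follows. The Gene record, the function name, and the choice to count any gene that overlaps the window (rather than only genes fully inside it) are assumptions of this sketch; gene coordinates would come from the reference annotation.

```python
from typing import List, NamedTuple


class Gene(NamedTuple):
    gene_id: str
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def genes_near_snp(
    chrom: str, pos: int, genes: List[Gene], flank: int = 15_000
) -> List[str]:
    """IDs of genes overlapping the window [pos - flank, pos + flank]."""
    window_start, window_end = pos - flank, pos + flank
    return [
        g.gene_id
        for g in genes
        if g.chrom == chrom and g.start <= window_end and g.end >= window_start
    ]
```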
Population Structure and Kinship
In this study, 369 accessions of the BAP grown under early cold stress conditions were evaluated. The BAP (Brenton, Z. W., et al. (2016) Genetics, 204(1): 21-33) comprises six subpopulations that represent a racially, geographically, and phenotypically diverse selection of sorghum accessions. The BAP was designed to limit variation in key bioenergy traits, such as height and sensitivity to photoperiod, within a range of desirable values (Brenton, Z. W., et al. (2016) Genetics, 204(1): 21-33) (S1). A kinship matrix showing the genetic correlations in the BAP was generated (S2). The kinship matrix revealed distinct groups of highly correlated accessions. In many of the highly correlated clusters, the individual accessions shared characteristics such as country of origin and similar photoperiod sensitivity. For example, in one cluster, 47 of 48 accessions originated from Ethiopia. Another cluster consisted of 29 accessions, and 25 of those originated from Ethiopia, while 26 were photoperiod-sensitive. One cluster was composed entirely of photoperiod-insensitive accessions, while 29 of 30 accessions in another cluster were of the cellulosic type. These clusters reflect an uneven population distribution and illustrate the necessity for controlling for population structure in this study.
Germination and Seedling Vigor
Seeds were sown in pots at 15° C. and grown at that temperature for 31 days. The temperature was then increased to 24° C. for seven days and then increased to 32° C. for the final 18 days of the experiment. Plant growth and development were observed using daily, non-destructive imaging throughout eight weeks. Germination dates were determined for each plant by image analysis and manual validation. At least two plants per genotype germinated for a total of 369 unique accessions. Of the 369 accessions, 41 germinated within ten days after planting (DAP) during the 15° C. cold treatment, and the remaining accessions germinated between 11 DAP and 46 DAP (S3). Accessions that failed to germinate or died by 46 DAP were excluded, leaving 309 accessions for subsequent phenotypic and genetic analyses (S3).
Trait Heritability and Phenotypic Variation
To analyze the phenotypic response to early cold stress in the BAP, we evaluated five bioenergy-related traits of interest: biomass, height, hull area, water use efficiency (WUE), and relative growth rate (RGR). Images captured daily from three angles and the amount of water each plant received daily enabled calculation of phenotypic measurements for each plant for each day of the experiment.
We used the three replicates for each accession to calculate broad-sense heritability within the experiment for each of the traits at each DAP to determine the proportion of phenotypic variation explained by genetic variation in the BAP. The heritability for each trait changed over time and as the temperature increased (
Heritability of WUE increased between 10-20 DAP, remained relatively flat until 42 DAP, then decreased. The decreased heritability of WUE was expected and coincides with an increased uncertainty of biomass calculations and water use estimates during the final two weeks of the experiment. There was a significant increase in overlapping of leaves after 42 DAP, resulting in the underestimation of plant biomass from the imaging data. The heritability for RGR had no discernable trend. The range of variation observed for the phenotypes of interest is depicted in the density histogram (
Trait Correlations
The correlations among biomass, height, hull area, RGR, and WUE were evaluated at 51 DAP (
Phenotypic Ranking of Accessions
The daily plant images were analyzed to quantify the biomass, hull area, height, and WUE phenotypes for each DAP. Based on the resulting values, for each trait, we ranked the accessions based on their performance. Some accessions were ranked high or low for multiple growth-related traits. The Venn diagram (
Ideotype-Positive and Ideotype-Negative Accessions
Comparisons of accession rankings among multiple traits facilitated the identification of ‘ideotype-positive’ and ‘ideotype-negative’ accessions. Accessions ranked within the top 10% for WUE but with a height-to-biomass ratio in the bottom 5% of accessions were designated as ideotype-positive (
Accessions within the bottom 10% for WUE but with a height-to-biomass ratio in the top 5% of accessions were designated as ideotype-negative (
Temporal WUE Profiles and Clustering of Accessions
The daily quantification of traits provided the opportunity to assess temporal changes in trait values as the temperature increased and plants developed. For brevity, we only refer to the five top and bottom-ranked accessions for WUE, a key trait.
For each accession, we analyzed the profile of each phenotype over time and at different temperatures. Again, for brevity, here we only refer to WUE. For example, the temporal WUE profiles for the BAP accessions separate into seven clusters, each summarized by a growth curve that is the average of all accessions in that cluster (
Genome-Wide Association Study
A genome-wide association study (GWAS) was performed to characterize the genetic architecture underlying the response to early cold stress conditions in the BAP. GWAS was performed using the multi-locus mixed model (MLMM) algorithm (Segura, V., et al. (2012) Nature Genetics, 44(7): 825). MLMM includes multiple loci in the model to increase detection power and reduce the potential for false discoveries.
In order to identify SNPs significantly correlated with phenotypes at specific DAP and correlated with temperature changes, we also used the values for each phenotype at each DAP (
This analysis revealed several QTL that displayed a transient response, appearing at specific times and developmental stages, and as temperatures increased. (
For the transient QTL that were detected, we also examined the phenotypic differences for the alternative alleles at a particular SNP. For the hull area SNP 7:2,934,702 shown in
The GWAS also identified pleiotropic QTL. A transient QTL on chromosome two (SNP 2:1,199,928) was significant for WUE and biomass from 14-19 DAP. SNP 4:64,256,235 on chromosome four was a significant transient QTL for WUE and biomass from 14-19 DAP, as well as for height at 34 DAP. Transient QTL on chromosomes six and nine (SNP 6:42,116,590 and SNP 9:55,067,080, respectively) were significant for WUE and biomass at 35-36 DAP. S9 shows the traits and DAP associated with each SNP.
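As an illustration of how pleiotropic SNPs can be picked out of per-trait GWAS results, the sketch below groups the associations listed in the preceding paragraph by SNP and keeps those linked to more than one trait. The record layout and the function name are assumptions made for this example; only the SNP positions, traits, and DAP ranges come from the text.

```python
from collections import defaultdict

# (SNP, trait, first DAP, last DAP), transcribed from the paragraph above.
hits = [
    ("2:1199928", "WUE", 14, 19),
    ("2:1199928", "biomass", 14, 19),
    ("4:64256235", "WUE", 14, 19),
    ("4:64256235", "biomass", 14, 19),
    ("4:64256235", "height", 34, 34),
    ("6:42116590", "WUE", 35, 36),
    ("6:42116590", "biomass", 35, 36),
    ("9:55067080", "WUE", 35, 36),
    ("9:55067080", "biomass", 35, 36),
]

def pleiotropic_snps(hits, min_traits=2):
    """Return SNPs associated with at least `min_traits` distinct traits."""
    traits_by_snp = defaultdict(set)
    for snp, trait, _first_dap, _last_dap in hits:
        traits_by_snp[snp].add(trait)
    return {
        snp: sorted(traits)
        for snp, traits in traits_by_snp.items()
        if len(traits) >= min_traits
    }
```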
Candidate Gene Identification
We scanned the 15 kb upstream and 15 kb downstream of each significant SNP to identify candidate genes. Functional annotations, putative homologs, and polymorphisms within the candidate genes based on public genomic resequencing data were analyzed using Phytozome (https://phytozome.jgi.doe.gov/pz/portal.html). GWAS identified highly significant QTL near 72 candidate genes (S10) with putative functions potentially related to biomass, height, hull area, WUE, and RGR, and in response to cold. Many of these candidate genes are close to pleiotropic QTL implicated in cold stress response in addition to the biomass-associated phenotypes.
For brevity, here we will only refer to one a priori candidate gene. GWAS identified SNP 6:40,312,463, which was significantly associated with RGR at 33 DAP and is located within the gene Sobic.006G057866, which encodes PSEUDO-RESPONSE REGULATOR 37 (PRR37)/MATURITY 1 (Ma1) and is involved in flowering time regulation in sorghum (Murphy, R. L., et al. (2011) Proceedings of the National Academy of Sciences, 108(39): 16469-16474). SNP 6:40,312,463 causes a non-synonymous amino acid substitution (Asn184Lys) within Ma1, and although accessions carrying this polymorphism show a slightly reduced mean RGR at 33 DAP, the difference was not significant (S11a). This observation suggests that allelic variation at SNP 6:40,312,463 is not likely to be a robust contributor to phenotypic variation in RGR. Interestingly, however, the majority of accessions (~90%), including both BAP accessions and accessions from other diversity panels, carry a polymorphism in Ma1 resulting in loss of the annotated reference stop codon, which could drastically increase the protein size and affect its function within non-reference accessions (S11b).
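The window scan around each significant SNP amounts to an interval-overlap test. The sketch below is one way it could be written, assuming 1-based inclusive gene coordinates and counting a gene as a candidate if any part of it falls within 15 kb of the SNP. The function name and the gene records in the usage example are hypothetical.

```python
WINDOW_BP = 15_000  # 15 kb on each side of the SNP, as stated above

def genes_near_snp(snp_chrom, snp_pos, genes, window=WINDOW_BP):
    """Return IDs of genes overlapping [snp_pos - window, snp_pos + window].

    genes: iterable of (gene_id, chrom, start, end), 1-based inclusive
           coordinates (assumed layout).
    """
    lo, hi = snp_pos - window, snp_pos + window
    return [
        gene_id
        for gene_id, chrom, start, end in genes
        if chrom == snp_chrom and start <= hi and end >= lo
    ]

# Hypothetical gene coordinates, for illustration only.
example_genes = [
    ("geneA", "7", 2_930_000, 2_933_000),  # inside the window
    ("geneB", "7", 2_960_000, 2_965_000),  # beyond 15 kb downstream
    ("geneC", "6", 2_934_000, 2_935_000),  # different chromosome
]
candidates = genes_near_snp("7", 2_934_702, example_genes)
```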
Plastocyanin Gene Editing
As disclosed herein, the gene Sobic.007G033300 was identified in a GWAS of cold tolerance in sorghum, where a SNP in the gene was associated with hull area, among other phenotypes. Gene editing constructs were designed to generate mutants in this gene in sorghum and in the homologous gene (Sevir.6G069300) in the model plant Setaria viridis.
Four total constructs were built; two targeting the gene in sorghum and two targeting the gene in S. viridis, and targeting either the Exon 1 or the 3′ UTR for each species. Each construct contained two guide RNAs, spaced approximately 100 base pairs apart in the gene.
[Table: guide RNAs by species, four rows for S. viridis and four rows for S. bicolor; other columns not recovered.]
Constructs were generated to express the gRNAs under control of the wheat U3 or wheat U6 promoters, along with the wheat codon-optimized Cas9 gene, using a modular system developed in the Voytas lab. Plasmid maps for these CRISPR-Cas9 gene editing constructs are shown in
The two Sobic.007G033300 CRISPR-Cas9 constructs targeting the sorghum plastocyanin gene were sent to Albert Kausch's group at Rhode Island for transformation.
For S. viridis transformation, plasmids were transformed into Agrobacterium strain AGL1, which was used by the DDPSC Plant Transformation Facility to transform the genotype ME34. For the exon-targeting construct, two events were returned. For the 3′ UTR construct, eight events were returned. T0 plantlets were returned and grown to maturity in a growth chamber. T1 seeds were collected. DNA was isolated from the T1 and T2 generations for characterization of variant alleles. Two unique alleles were identified in Exon 1 and one was identified in the 3′ UTR.
The locations of two deletions (Mu1 and Mu2) in plastocyanin (Sevir.6G069300) Exon 1 are shown in
Sevir.6G069300: misc_feature 1..269/label=5′UTR; exon 270..453/label=Exon1; misc_feature complement(375..397)/label=Exon1_gRNA; misc_feature 381..436/label=Mu1 - 56 bp deletion; misc_feature 394..449/label=Mu3 - putative different 56 bp deletion; misc_feature 420..442/label=Exon1_gRNA; misc_feature 436/label=Mu2 - "A" deletion; misc_feature 454..552/label=Intron; exon 553..899/label=Exon2; misc_feature 900..1072/label=3′UTR; misc_feature complement(905..927)/label=3UTR_gRNA; misc_feature 910^911/label=3′UTR Mu1 - "T" insertion; misc_feature 1010..1032/label=3UTR_gRNA.
Sobic.007G033300: misc_feature 1..157/label=5′UTR; misc_feature 113..135/label=Exon1 gRNA; exon 158..344/label=Exon 1; misc_feature complement(251..273)/label=Exon1 gRNA; misc_feature 345..895/label=Intron; exon 896..1224/label=Exon 2; misc_feature 1225..1704/label=3′UTR; misc_feature 1389..1411/label=3′UTR gRNA; misc_feature complement(1581..1603)/label=3′UTR gRNA.
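The feature coordinates listed above can be sanity-checked with simple arithmetic. The sketch below assumes the coordinates are 1-based and inclusive, as in GenBank-style feature tables. The dictionary and helper names are hypothetical; the coordinates are those given for Sevir.6G069300.

```python
# Selected Sevir.6G069300 features as (start, end), 1-based inclusive (assumed).
sevir_features = {
    "Exon1": (270, 453),
    "Exon1_gRNA_a": (375, 397),
    "Exon1_gRNA_b": (420, 442),
    "Mu1": (381, 436),  # labeled as a 56 bp deletion
    "Mu3": (394, 449),  # labeled as a putative different 56 bp deletion
    "Mu2": (436, 436),  # labeled as a single "A" deletion
}

def span_length(span):
    """Length in bp of an inclusive (start, end) span."""
    start, end = span
    return end - start + 1

def contains(outer, inner):
    """True if `inner` lies entirely within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]

# Each deletion should match its labeled size and fall inside Exon 1.
deletion_lengths = {name: span_length(sevir_features[name]) for name in ("Mu1", "Mu2", "Mu3")}
all_in_exon1 = all(
    contains(sevir_features["Exon1"], sevir_features[name]) for name in ("Mu1", "Mu2", "Mu3")
)
```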
S. viridis
Plants were grown in growth chambers under chilling stress (15° C.) or control conditions (30° C. day/20° C. night) with 12-hour days at 50% RH. In the first iteration of the experiment, the control genotype ME34 did not germinate in either temperature condition. Germination rates were poor for the mutant alleles as well, at 2/12 in the warm condition and 4/12 (Mu1) or 3/12 (Mu2) in the chilling stress condition. Growth rate and plant height are being measured semi-weekly on the plants in both temperatures. Total biomass of the plant will be measured on a fresh weight and dry weight basis at maturity. The experiment will be repeated to try to obtain better germination rates. It will also be repeated with the 3′ UTR allele for comparison with wild-type growth. Plants will be characterized for gene expression differences to confirm that phenotypic differences are due to on-target changes in the targeted plastocyanin gene.
Sorghum
Gene-edited sorghum plants will be characterized for changes in biomass and growth rate under chilling stress, following a protocol similar to that used for S. viridis. Plants will be characterized for gene expression differences to confirm that phenotypic differences are due to on-target changes in the targeted plastocyanin gene.
Sorghum plastocyanin gene Sobic.007G033300
S. viridis gene Sevir.6G069300
S. viridis Exon 1 Mu1 variant encoded polypeptide
S. viridis Exon 1 Mu1 variant polynucleotide
S. viridis Exon 1 Mu2 variant encoded polypeptide
S. viridis Exon 1 Mu2 variant polynucleotide
S. viridis 3′ UTR Mu1 variant polynucleotide
The breadth and scope of the present disclosure should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents.
The present invention can be defined in any of the following numbered paragraphs:
This application claims the benefit of U.S. Provisional Application 63/006,935, filed Apr. 8, 2020, which is incorporated by reference herein in its entirety.
This invention was made with government support under Cooperative Agreement Number DE-AR0000594 awarded by Advanced Research Projects Agency-Energy (ARPA-E), U.S. Department of Energy. The government has certain rights in the invention.
Filing Document | Filing Date | Country | Kind
---|---|---|---
PCT/US2021/026351 | 4/8/2021 | WO |

Number | Date | Country
---|---|---
63006935 | Apr 2020 | US