Modulating drought tolerance in Brassicaceae using the Kanghan gene family

REFERENCE TO A SEQUENCE LISTING

This application contains a sequence listing in ASCII format, which is being submitted as a text file via EFS-Web with the file name “2014-108-07_SL_ST25.txt” (created Aug. 13, 2021; size 313,032 bytes) and which is hereby incorporated by reference in its entirety.

FIELD OF THE INVENTION

The present invention relates to abiotic stress-resistant plants and processes for obtaining them, including flowering plants and seeds thereof.

BACKGROUND OF THE INVENTION

Abiotic stress is a major challenge facing the agricultural industry (see Yang et al., 2010). Abiotic stresses such as drought and heat not only cause a reduction in crop yield, but also cause high variation in crop yield. Improving crop tolerance to abiotic stresses such as heat and drought is essential for maintaining a stable yield under the continued threat of climate change. It is also a key factor for sustaining and expanding arable land areas for crop production.

Plants have evolved various mechanisms to cope with abiotic stress at both the physiological and biochemical levels. Many stress-induced genes have been identified, including those encoding key enzymes for abscisic acid (ABA) biosynthesis and signal transduction components such as protein kinases, protein phosphatases and transcription factors. In recent years, several stress-regulated miRNAs have also been identified in model plants under biotic and abiotic stress conditions. Plants respond differently to drought and heat stress (Rizhsky et al., 2004).

SUMMARY

Methods are provided for modulating an abiotic stress response to drought or heat in a plant, for example by introducing a heritable change to the plant, which alters the expression in the plant of an endogenous or exogenous Kanghan protein. Similarly, plants and plant cells having such heritable changes are provided.

Plants having enhanced drought tolerance are accordingly provided, for example by altering selected quantitative trait loci (QTL) associated with the family of Kanghan genes. Suppression of Kanghan genes, for example in null mutations, confers drought tolerance.

Methods are accordingly provided for modulating an abiotic stress response to drought in a plant, comprising introducing a heritable change to the plant which alters the expression in the plant of an endogenous or exogenous Kanghan protein. The Kanghan protein may for example be at least 35% identical to, or at least 49% positively aligned with, a protein encoded by the nucleotide sequence set forth in SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, and/or SEQ ID NO: 13; and this alignment may for example be over an alignment length of at least 90 amino acids, with BLOSUM or PAM substitution matrix, with gaps permitted. Alternative degrees of sequence similarity are contemplated in alternative embodiments, for example 50%, 75%, 90% or 95% identical to, or at least 75%, 90% or 100% positively aligned with, the protein encoded by the nucleotide sequence set forth in SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 13; over an alignment length of at least 90, 100 or 110 amino acids, with BLOSUM or PAM substitution matrix, and with gaps permitted.
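By way of non-limiting illustration, the percent-identity and alignment-length criteria above may be checked along the following lines. This is a minimal sketch that assumes the two protein sequences have already been aligned by a separate alignment program, with gaps written as "-". The function names, the choice to divide by the number of aligned columns, and the toy sequences are illustrative assumptions; alignment programs differ in how they count gapped columns, and the "positively aligned" percentage is not computed here because it depends on the substitution matrix used.

```python
def percent_identity(aligned_a, aligned_b):
    """Return (percent identity, alignment length) for two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns in which both sequences carry a gap.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment is empty")
    # A column counts as identical only if both residues match and neither is a gap.
    identical = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * identical / len(columns), len(columns)


def meets_threshold(aligned_a, aligned_b, min_identity=35.0, min_length=90):
    """Check an aligned pair against an identity cutoff and a minimum alignment length."""
    identity, length = percent_identity(aligned_a, aligned_b)
    return length >= min_length and identity >= min_identity


# Toy example (hypothetical 12-column alignment, one gapped column).
identity, length = percent_identity("LTVKDCLE-AFK", "LTVKDCLECAFK")
```

In this sketch the default cutoffs mirror the 35% identity and 90 amino acid figures stated above; the alternative cutoffs (for example 50%, 75%, 90% or 95% over 100 or 110 amino acids) can be passed as arguments.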

The Kanghan protein includes a variety of conserved domains, such as domains: identical to hTVKDChphAhp (SEQ ID NO: 6); and/or, at least 80% identical to LTVKDCLEhAhK-G (SEQ ID NO: 7); and/or, at least 70% identical to LTVKDCLEhAFKKG (SEQ ID NO: 8); and/or at least 80% identical to VshKGpVlEstshpEs.chhhpQs-huA+LHlFpPph (SEQ ID NO: 9); and/or, at least 70% identical to VsMKGEVIEspsh-EAhcLllcQP-lGA+LHlFoPcl (SEQ ID NO: 10); and/or, at least 80% identical to cppDYDtStpAAhVAlpLISSARlhLKlDuhhTEYSsQaLhDpsutpp (SEQ ID NO: 11); and/or, at least 70% identical to spphhpShupscGhCHPDC-KAssEpEDYDASQpAAhVAVsLISSARlhLKLDusaTEYSAQYLVDNAGpccs (SEQ ID NO: 12).
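For illustration only, a consensus sequence of this kind may be read as a pattern in which upper-case letters denote specific residues and lower-case letters or symbols denote the residue classes listed in the brief description of FIG. 4 (after Taylor, 1986). The following minimal sketch tests whether a peptide matches such a pattern exactly. The function names are hypothetical, the treatment of "." as "any residue" is an assumption, and the sketch tests only exact (100%) matching; it does not compute the 70% or 80% identity figures recited above.

```python
# Residue classes as listed in the description of FIG. 4 (after Taylor, 1986).
# Note that "l" (lower-case L) is the aliphatic class, distinct from "L" (leucine).
CLASSES = {
    "o": set("ST"),              # alcohol
    "l": set("ILV"),             # aliphatic
    "a": set("FHWY"),            # aromatic
    "c": set("DEHKR"),           # charged
    "h": set("ACFGHIKLMRTVWY"),  # hydrophobic
    "-": set("DE"),              # negative
    "p": set("CDEHKNQRST"),      # polar
    "+": set("HKR"),             # positive
    "s": set("ACDGNPSTV"),       # small
    "u": set("AGS"),             # tiny
    "t": set("ACDEGHKNQRST"),    # turnlike
}


def matches_consensus(peptide, consensus):
    """True if every residue of the peptide satisfies the consensus symbol at its position."""
    if len(peptide) != len(consensus):
        return False
    for residue, symbol in zip(peptide.upper(), consensus):
        if symbol == ".":
            continue  # assumption: "." places no constraint on the residue
        allowed = CLASSES.get(symbol)
        if allowed is None:
            # Not a class symbol, so treat it as a literal residue.
            if residue != symbol:
                return False
        elif residue not in allowed:
            return False
    return True


def find_consensus(protein, consensus):
    """Return 0-based start positions of all windows of the protein matching the consensus."""
    n = len(consensus)
    return [
        i for i in range(len(protein) - n + 1)
        if matches_consensus(protein[i:i + n], consensus)
    ]


DOMAIN_B = "hTVKDChphAhp"  # SEQ ID NO: 6
hits = find_consensus("QQRPVLTVKDCLELAFKKGLP", DOMAIN_B)  # fragment of SEQ ID NO: 14
```

Here the fragment of SEQ ID NO: 14 contains the window LTVKDCLELAFK, which satisfies the SEQ ID NO: 6 pattern at every position.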

In alternative embodiments, the plant may lack an endogenous Kanghan protein, such as a protein that has the sequence characteristics of Kanghan proteins described above, such as being at least 35% identical to, or at least 49% positively aligned with, a protein encoded by the nucleotide sequence set forth in SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 13; over an alignment length of at least 90 amino acids, with BLOSUM or PAM substitution matrix.

The plant may be an angiosperm, and may for example belong to the family of Brassicaceae, Fabaceae, Poaceae, or Asteraceae plants. The plant may for example be a Capsella rubella, Brassica rapa, Brassica napus, Brassica carinata, Eutrema salsugineum, Thellungiella parvula, Camelina sativa, Glycine max, Triticum, Zea mays, Oryza sativa or Helianthus annuus plant.

The heritable change may be one that sufficiently decreases the expression of the Kanghan protein so as to enhance drought tolerance relative to an unmodified plant, for example improving drought tolerance by an objective measure by 10% to 100% or more.

The heritable change may for example involve expressing in the plant an inhibitory polynucleotide that down-regulates the expression of the Kanghan protein, such as an inhibitory RNA, for example an anti-sense oligonucleotide, an RNAi oligonucleotide (including a small interfering RNA), a microRNA, or a CRISPR guide RNA. Alternatively, the heritable change may be an alteration of a Kanghan gene sequence encoding the Kanghan protein, for example by transformation with an exogenous Kanghan gene encoding the exogenous Kanghan protein, or by editing or mutation of an endogenous Kanghan gene encoding the endogenous Kanghan protein. The editing or mutation may for example introduce a change to a coding sequence of the Kanghan gene which changes the amino acid sequence of the Kanghan protein, for example so as to render the Kanghan protein non-functional.
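As a non-limiting illustration of how candidate target sites for a CRISPR guide RNA might be located in a Kanghan coding sequence, the following minimal sketch scans the forward strand for 20-nucleotide protospacers followed by an NGG motif. The NGG motif and the 20-nucleotide length are assumptions corresponding to the commonly used Streptococcus pyogenes Cas9; the present description does not specify a particular Cas9 variant or guide design rule, and the function name and toy sequence are hypothetical. Reverse-strand sites and off-target screening are not addressed.

```python
def find_protospacers(cds, spacer_len=20):
    """List (start, protospacer, PAM) for forward-strand sites followed by an NGG motif.

    Assumes an SpCas9-style NGG PAM; other nucleases use different motifs.
    """
    cds = cds.upper()
    hits = []
    # The last usable start leaves room for the protospacer plus a 3-nt PAM.
    for i in range(len(cds) - spacer_len - 2):
        pam = cds[i + spacer_len:i + spacer_len + 3]
        if pam[1:] == "GG":
            hits.append((i, cds[i:i + spacer_len], pam))
    return hits


# Toy sequence (hypothetical): a 20-nt stretch followed by TGG and three further bases.
toy = "A" * 20 + "TGG" + "CCC"
sites = find_protospacers(toy)
```

The same function could be applied to any of the coding sequences in the sequence listing; candidate sites would still need to be evaluated experimentally.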

In accordance with the foregoing methods, there are also provided parental plants or plant cells that are produced by these processes. Similarly, plant lines, varieties or cultivars are provided that include the parental plant or plant cell, and the plant line, variety or cultivar may for example be characterized by an improved drought tolerance characteristic. Seeds and plant parts are provided, for example from foregoing plant lines, varieties or cultivars.

Seeds in turn may be used to provide progeny plants, such as progeny plants that are genetically derived from the plant line, variety or cultivar so as to retain the improved drought tolerance characteristic.

Methods of marker assisted selection may for example be used to introduce the heritable change, with subsequent screening of the plant or plant cell or progeny for the desired modulation of the abiotic stress response to drought.

A further embodiment is a method for producing a plant having increased tolerance to heat stress, comprising introducing into a plant cell an expression construct comprising a nucleic acid molecule encoding a polypeptide with at least 80% identity to SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, and/or SEQ ID NO: 19 over an alignment length of at least 90 amino acids, operatively linked to at least one regulatory element, said at least one regulatory element being effective to direct expression of said nucleic acid molecule in the plant; and growing the plant cell into the plant. In another embodiment, the nucleic acid molecule encodes a polypeptide with at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, and/or SEQ ID NO: 19 over an alignment length of at least 90 amino acids, at least 100 amino acids, at least 110 amino acids, or over the full length of the amino acid sequence set forth in SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, or SEQ ID NO: 19. The polypeptide encoded by the nucleic acid molecule will preferably have the same biological activity as the polypeptide set forth in SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, or SEQ ID NO: 19. In an embodiment, at least one regulatory element comprises a promoter, for example a constitutive promoter. In a further embodiment, the regulatory element is a regulatory element that is not naturally in operative linkage with the nucleic acid molecule. 
For example, the regulatory element may be a synthetic regulatory element, a regulatory element derived from a different species than the nucleic acid molecule, or a regulatory element derived from a different gene within the same species as the nucleic acid molecule. In an embodiment, the nucleic acid molecule is derived from a different species than the plant cell into which the expression construct is introduced. In a further embodiment, the nucleic acid molecule is derived from Arabidopsis and the plant cell is a Triticum cell.
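For illustration, the relationship between a coding sequence and the polypeptide it encodes can be sketched as a translation under the standard genetic code. The function and variable names below are hypothetical, and unrecognized codons are reported as "X" by assumption. The example input is the first 30 nucleotides of SEQ ID NO: 3.

```python
# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO_ACIDS)
)


def translate(cds):
    """Translate a coding sequence from its first base up to the first stop codon."""
    cds = cds.upper().replace("U", "T")
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE.get(cds[i:i + 3], "X")  # "X" marks an unrecognized codon
        if amino_acid == "*":
            break  # stop codon ends translation
        protein.append(amino_acid)
    return "".join(protein)


# First 30 nucleotides of SEQ ID NO: 3.
peptide = translate("ATGGATATGAATCAGCTATTCATGCAATCT")
```

The resulting ten residues correspond to the start of SEQ ID NO: 16.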

The method may further comprise a step of assessing the heat tolerance of the plant relative to a control plant of the same variety or genetic background that does not comprise the expression construct and identifying the plant as having increased tolerance to heat stress if it exhibits increased heat tolerance relative to the control plant. Tests for heat tolerance are known and will be understood by one skilled in the art (for example, see Kumar et al., 2013; Hatfield et al., 2015). In wheat, heat tolerance may be assessed by, for example, subjecting newly germinated seedlings, seedlings, or plants to heat stress at a temperature of about 27° C. or higher, or of about 30° C. or higher (e.g. conditions such as 36° C., 42/38° C. (day/night), or 40/38° C. (day/night)) for a period of time (typically days or weeks, for example two or three weeks) then allowing them to recover at a standard growth temperature between about 13-25° C. (e.g. growth conditions such as 25° C., 25/20° C. (day/night), 24/16° C. (day/night), or 18/13° C. (day/night)) for a period of time (e.g. 3-10 weeks) and then measuring viability or another indicator of heat stress, such as yield, biomass, or canopy temperature.

Further provided is a plant cell, plant, seed, or plant tissue comprising an expression construct as described above. In an embodiment, the plant cell, plant, seed, or plant tissue is a Poaceae cell, plant, seed, or tissue. In a further embodiment, the plant cell, plant, seed, or plant tissue is a cereal plant cell, plant, seed, or tissue. Cereal plants include commercially important grain crops such as rice (Oryza sativa), wheat/spelt (Triticum), corn/maize (Zea mays), barley (Hordeum vulgare), Sorghum, oat (Avena sativa), rye (Secale cereale), and Triticale. In a further embodiment, the plant cell, plant, seed, or plant tissue is Triticum. In accordance with the foregoing methods, there are also provided parental plants or plant cells that are produced by these processes. Seeds in turn may be used to provide progeny plants, such as progeny plants that are genetically derived from the plant line, variety or cultivar so as to retain the improved drought tolerance characteristic. Seeds and plant parts that are derived from the foregoing plant lines may be characterized by improved drought tolerance characteristics, for example they may be subjected to RNAseq analyses to identify transcripts that exhibit contrasting differential expression patterns when compared with their respective wild type controls. The combinatory profile of these genes can be an evaluation benchmark for drought tolerance.

Another aspect of the disclosure is a transgenic Brassicaceae plant or plant cell comprising a recombinant nucleic acid construct encoding at least one inhibitory polynucleotide that targets an endogenous Kanghan gene in the transgenic Brassicaceae plant or plant cell to reduce or eliminate expression of a Kanghan protein encoded by the Kanghan gene, wherein:

- the recombinant nucleic acid construct comprises a nucleic acid molecule encoding the at least one inhibitory polynucleotide operably linked to a heterologous promoter;
- the Kanghan protein has at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity to the amino acid sequence set forth in SEQ ID NO: 127 or SEQ ID NO: 128; and
- expression of the at least one inhibitory polynucleotide in the transgenic Brassicaceae plant or plant cell increases drought tolerance of the transgenic Brassicaceae plant or plant cell relative to a control Brassicaceae plant or plant cell of the same species lacking the at least one inhibitory polynucleotide and grown under the same conditions.

In an embodiment of the transgenic Brassicaceae plant or plant cell, the at least one inhibitory polynucleotide comprises an anti-sense oligonucleotide, an RNAi oligonucleotide, or a CRISPR guide RNA.

In an embodiment of the transgenic Brassicaceae plant or plant cell, the Kanghan protein has at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity to the amino acid sequence set forth in SEQ ID NO: 127 and further has at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity to the amino acid sequence set forth in SEQ ID NO: 126.

In an embodiment of the transgenic Brassicaceae plant or plant cell, the Kanghan protein has at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity to the amino acid sequence set forth in SEQ ID NO: 79, SEQ ID NO: 86, SEQ ID NO: 72, SEQ ID NO: 84, SEQ ID NO: 78, SEQ ID NO: 85, or SEQ ID NO: 95. In an embodiment, the Kanghan protein comprises SEQ ID NO: 79, SEQ ID NO: 86, SEQ ID NO: 72, SEQ ID NO: 84, SEQ ID NO: 78, SEQ ID NO: 85, or SEQ ID NO: 95.

In an embodiment of the transgenic Brassicaceae plant or plant cell, the Kanghan protein has at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity to the amino acid sequence set forth in SEQ ID NO: 63, SEQ ID NO: 77, or SEQ ID NO: 121. In an embodiment, the Kanghan protein comprises SEQ ID NO: 63, SEQ ID NO: 77, or SEQ ID NO: 121.

In an embodiment of the transgenic Brassicaceae plant or plant cell, the at least one inhibitory polynucleotide targets two or more endogenous Kanghan genes in the transgenic Brassicaceae plant or plant cell to reduce or eliminate the expression of the Kanghan proteins encoded by the two or more Kanghan genes. In a further embodiment, each of the Kanghan proteins encoded by the two or more Kanghan genes has at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity to the amino acid sequence set forth in SEQ ID NO: 79, SEQ ID NO: 86, SEQ ID NO: 72, SEQ ID NO: 84, SEQ ID NO: 78, SEQ ID NO: 85, or SEQ ID NO: 95. In an embodiment, each of the Kanghan proteins encoded by the two or more Kanghan genes comprises the amino acid sequence set forth in SEQ ID NO: 79, SEQ ID NO: 86, SEQ ID NO: 72, SEQ ID NO: 84, SEQ ID NO: 78, SEQ ID NO: 85, or SEQ ID NO: 95.

Another aspect is a transgenic seed obtained from a transgenic Brassicaceae plant as defined herein, wherein the transgenic seed comprises the recombinant nucleic acid construct.

In an embodiment, the transgenic Brassicaceae plant, plant cell, or seed is a Brassica napus plant, plant cell, or seed.

A further aspect of the disclosure is a method of obtaining a Brassicaceae plant having increased drought tolerance, the method comprising:

- (i) transforming at least one Brassicaceae plant cell with a recombinant nucleic acid construct as defined herein to produce at least one transformed Brassicaceae plant cell;
- (ii) obtaining at least one Brassicaceae plant from the at least one transformed Brassicaceae plant cell produced in step (i); and
- (iii) selecting a Brassicaceae plant from the at least one Brassicaceae plant obtained in step (ii) that exhibits increased drought tolerance relative to a control Brassicaceae plant or plant cell of the same species and grown under the same conditions, wherein the control plant or plant cell is from a Brassicaceae plant that has not been transformed with the recombinant nucleic acid construct.

In embodiments of the method, the Kanghan protein has at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity to the amino acid sequence set forth in SEQ ID NO: 127 or SEQ ID NO: 128.

In embodiments of the method, the Kanghan protein has at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity to the amino acid sequence set forth in SEQ ID NO: 63, SEQ ID NO: 77, or SEQ ID NO: 121.

Another aspect of the disclosure is a method of obtaining a Brassicaceae plant having increased drought tolerance, the method comprising:

- (i) transforming at least one Brassicaceae plant cell with an inhibitory polynucleotide that targets an endogenous Kanghan gene in the transgenic Brassicaceae plant or plant cell to reduce or eliminate expression of a Kanghan protein encoded by the Kanghan gene;
- (ii) obtaining at least one Brassicaceae plant from the at least one transformed Brassicaceae plant cell produced in step (i); and
- (iii) selecting a Brassicaceae plant from the at least one Brassicaceae plant obtained in step (ii) that exhibits increased drought tolerance relative to a control Brassicaceae plant or plant cell of the same species and grown under the same conditions, into which the recombinant nucleic acid construct has not been introduced,
- wherein the Kanghan protein has at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity to the amino acid sequence set forth in SEQ ID NO: 127 or SEQ ID NO: 128.

In an embodiment of the method, the Kanghan protein has at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity to the amino acid sequence set forth in SEQ ID NO: 127 and further has at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity to the amino acid sequence set forth in SEQ ID NO: 126; or

the Kanghan protein has at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity to the amino acid sequence set forth in SEQ ID NO: 128 and further has at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity to the amino acid sequence set forth in SEQ ID NO: 129.

In an embodiment of the method, the Kanghan protein has at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity to the amino acid sequence set forth in SEQ ID NO: 79, SEQ ID NO: 86, SEQ ID NO: 72, SEQ ID NO: 84, SEQ ID NO: 78, SEQ ID NO: 85, or SEQ ID NO: 95.

In an embodiment of the method, the inhibitory polynucleotide comprises an anti-sense oligonucleotide, an RNAi oligonucleotide, or a CRISPR guide RNA.

Another aspect of the disclosure is a Brassicaceae plant, plant cell, or seed produced by the method described in the preceding paragraphs, wherein the Brassicaceae plant, plant cell, or seed comprises at least one non-naturally occurring heritable genetic change in the endogenous Kanghan gene that was induced by the inhibitory polynucleotide. In an embodiment, the plant, plant cell, or seed is a Brassica napus plant, plant cell, or seed.

BRIEF DESCRIPTION OF THE DRAWINGS AND LIST OF SEQUENCES

In order that the invention may be more clearly understood, embodiments thereof will now be described in detail by way of example, with reference to the accompanying drawings, in which:

FIG. 1 is a graph showing segregation of a drought tolerance trait from 500 F2 individual lines, calculated by the survival days after drought treatment (cessation of watering). The survival in days of Col and #95 plants is marked by arrows and legends.

FIG. 2 is a diagram depicting the gene structure of four members of the Kanghan gene family in Arabidopsis ecotype Col and #95. The locations of premature stop codons are indicated as TAA;

FIG. 3 roughly depicts the relative location of 4 conserved protein domains within 5 members of the Arabidopsis Kanghan gene family: at5g18065, at5g18040, at4g29770, at4g29760 and at1g48180.

FIG. 4 is an alternative illustration of the conserved protein domains within the 5 members of the Arabidopsis Kanghan gene family: at5g18065, at5g18040, at4g29770, at4g29760 and at1g48180, with an additional sequence identified as “lcl|Query_10001” which is the sequence of at5g18065 plus the translation of the at5g18065 cDNA following what appears to be a premature stop codon in at5g18065 which truncates the protein. Alternative protein consensus sequences are also set out in FIG. 4, with varying degrees of sequence consensus as illustrated (with lower case descriptors for residues having conserved properties based on Taylor (1986), as follows: alcohol=>o {S, T}, aliphatic=>l {I, L, V}, aromatic=>a {F, H, W, Y}, charged=>c {D, E, H, K, R}, hydrophobic=>h {A, C, F, G, H, I, K, L, M, R, T, V, W, Y}, negative=>−{D, E}, polar=>p {C, D, E, H, K, N, Q, R, S, T}, positive=>+{H, K, R}, small=>s {A, C, D, G, N, P, S, T, V}, tiny=>u {A, G, S}, turnlike=>t {A, C, D, E, G, H, K, N, Q, R, S, T}).

FIG. 5 depicts conserved domain A in Kanghan proteins using a sequence logo, using the sequence of the 5 Kanghan proteins identified by QTL analysis as having the greatest contribution to drought tolerance.

FIG. 6 is a continuation of FIG. 5, depicting conserved domain B in Kanghan proteins using a sequence logo, using the sequence of the 5 Kanghan proteins identified by QTL analysis as having the greatest contribution to drought tolerance.

FIG. 7 is a continuation of FIG. 6, depicting conserved domain C in Kanghan proteins using a sequence logo, using the sequence of the 5 Kanghan proteins identified by QTL analysis as having the greatest contribution to drought tolerance.

FIG. 8 is a diagram depicting the construct for overexpression of Arabidopsis Kanghan1 in wheat wild type (Fielder) based on the monocot special overexpression vector PANIC5E.

FIG. 9A is a photograph of 3 week old wild-type (left) and transgenic (right) wheat seedlings grown under standard conditions (25° C.). The transgenic wheat seedlings heterologously express At5g18040.

FIG. 9B is a photograph of the plants from FIG. 9A, after being incubated for three weeks at 40/38° C. (day/night), followed by three weeks at 25° C.

FIG. 9C is a photograph of the plants from FIG. 9B after being grown for an additional seven weeks at 25° C.

FIG. 10 is a near infrared leaf surface temperature image of wild-type (left) and transgenic (right) wheat plants grown under standard conditions. The transgenic wheat plant expresses At5g18040 from a heterologous At5g18040 expression construct.

FIG. 11 shows a DNA neighbor phylogenetic tree of the Brassica napus Kanghan gene candidates and their Arabidopsis thaliana counterparts.

FIG. 12 shows a protein neighbor phylogenetic tree of the Brassica napus Kanghan gene candidates and their Arabidopsis thaliana counterparts.

FIG. 13 shows a DNA neighbor phylogenetic tree of the Brassica napus Kanghan gene candidates.

FIG. 14 shows a map of the pGEM®-T vector (Promega, USA).

FIG. 15 shows a map of the pCAMBIA 1301-35S-Int-T7 vector.

FIG. 16 shows a partial map of an RNAi construct designed to target Brassica napus Kanghan genes.

FIG. 17 shows infrared thermal images of a wild-type Brassica napus line and a Kanghan RNAi Brassica napus line.

FIG. 18 shows wild-type Brassica napus plants and Brassica napus plants from two Kanghan RNAi lines that have been subjected to drought treatment.

FIG. 19 shows the survival ratio, after 35 days recovery, of wild-type Brassica napus plants and Brassica napus plants from two RNAi lines that have been subjected to drought treatment.

FIG. 20 shows an alignment of potential Brassica napus homologs of Arabidopsis genes at4g29770, at4g29760, at5g18040, and at5g18065. The Brassica napus sequences included in the alignment are listed in Table 6. Alternative protein consensus sequences are also set out in FIG. 20, with varying degrees of sequence consensus as illustrated (with lower case descriptors for residues having conserved properties based on Taylor (1986), as described above in the brief description of FIG. 4).

FIG. 21 shows an alignment of potential Brassica napus homologs of Arabidopsis genes at1g51670 and at1g48180. The Brassica napus sequences included in the alignment are listed in Table 6. Alternative protein consensus sequences are also set out in FIG. 21, with varying degrees of sequence consensus as illustrated (with lower case descriptors for residues having conserved properties based on Taylor (1986), as described above in the brief description of FIG. 4).

FIG. 22 shows a phylogenetic analysis of Kanghan proteins in diverse Brassicaceae species.

FIG. 23 shows a schematic of a DNA cassette for simultaneous expression of four gRNAs.

FIG. 24 illustrates how the DNA cassette shown in FIG. 23 may be constructed using Gibson assembly.

FIG. 25 illustrates assembly of a plant transformation vector comprising Cas9 and a gRNA cassette using a Golden Gate cloning protocol.

FIG. 26 shows a map of a plasmid for CRISPR/Cas9 gene editing of Kanghan family genes.

The following is a list of sequences appearing in this document:

SEQ ID NO: 1 is a CDS of the At4g29760 gene from Arabidopsis;

ATGGCTGAGCGATTATTACAATCTATGTCAAGGGTGGCTGGCCGATGTCATCCAGATTG

CGTAAAAGCAAGTGATGAGCAAGAAGATTACCATGCATCTCAAAATGCAGCTTTGGTAG

CTCTCAATCTGATTAGCTCTGCAACGTTAATACTGAAACTCCACGCTGAGTTTACTGAG

TACTCAGCTCAGTTTTTGATGGACAATGCTGGAAAGGAAGACGACCCGGGAGAAGTGGA

TCAACAACGCAATCAGGTCACGACCGAAAACTGCCTTCGCTACTTGGCCGAAAACGTTT

GGACCAAGAAGGAAAATGGGCAGGGAGGAATGGATCAACAACGCCCTGTGCTCACTGTC

AAAGACTGCTTGGAACTTGCTTTTAAAAAAGGGCTGCCGAGAAGAGAACACTGGGCACA

TTTGGGATGTACCTTCAAGGCTCCCCCATTTGCTTGTCAGATACCTCGCGTTCCTGTGA

AAGGAGAAGTGGTTGAGGTTAAGACTTTTGATGAAGCATTCAAGCTGTTGGTGCATCAA

CCCATTGGAGCAAAACTGCATTTGTTCAGTCCGCAGATTGATAATGTTGGAGAGGGAGT

TTACAAAGGCCTCACGACAGGTAATGAAACACACTATGTTGGACTTAGAGATGTGCTAA

TAGCTTCAGTGGAGGAGTTCGAGGGAGATTCTGTTGCTATTGTGAAGATCTGCTACAAG

AAGAAGCTTTCATTTATCAAAGTGTCTTTGAGCGTTAGGTTTCTCTCAGTAGCACATGA

TGGTGATAAGTCTAAGTTCATAGCGCCAACAGGTCTGCTTGTTGACTTCTGTGTCCCGC

GCTTATCTATCAACTAA

SEQ ID NO: 2 is a CDS of the At4g29770 gene from Arabidopsis;

ATGATGGCAATCTCAGAAAAAGGAGTCATGGCAATCTCAGAAAAAGGAGTCATGGCAAC

GAAAATTGACAAAAACGGCGTCCTTCGAGAGTTAAGGCGACATTTCACTGAGTTTTCTC

TACGCGACGTAGATCTGTGTCTCCGGAGTTCATCGCAGATGGAGTCATTGTTAGAATGT

TTTGCAATCACGGATGGCAAATGTCATCCCGATTGCTTAAAAGCAAACAATGAGCAAGA

AGATTACGATGCATGTCAATCTGCAGCTTTGGTAGCTGTGAGTTTGATTAGCTCTGCAC

GTGTTATCTTCAAGATCGACTCTAAGTATACTGAGTACTCACCTCAGTATTTGGTGGAT

AACGTTGGGAAGGAAGAAGTTGAGGGAGAAATGGATCAACCAAGCTGTCAGTACACTGT

CGGAAACCTCCTTAGTTACTTGGTGGAAAACGTTTGGACCAAGAAGGAAGTTAGGCAGA

GAGAAATGGATCAACAACGCCGTGAGTTCACTGTCAAAGACTGCTTTGAATTTGCTTTT

AAAAAAGGGCTTCCAAGAAATGGACATTGGGCGCATGTGGGATGTATATTCCCGGTTCC

TCCATTTGCTTGTCAAATACCTCGCGTTCCCATGAAAGGAGAAGTGATTGAGGCTGCAA

ATGTGAGTGAAGCGTTGAAGCTGGGTATGCAACAACCAGCGGCAGCAAGGCTGCATTTG

TTCAGTCCAGAGTTTGATCTTGTTGGAGAGGGTATTTACGATGGCCCGTCAGGTAATGA

AACACGATATGTTGGACTTAGAGATGTGCTCATGGTTGAGGCGGAGAAGATCAAGGGAG

AAACTGTTTTTACTGTGCAGATATGCTACAAGAAGAAGACTTCATTTGTCAAAGTGTCT

ACGAGAAGTATGATTCTCCCGCTTAATGGTGACGACGAGTCTCAGGTCACAGAGCCAGC

ATGTCTACTTGTTGACTTCTGTATCCCACGTTTTTCTATCAACTAA

SEQ ID NO: 3 is a CDS of the At5g18065 gene from Arabidopsis;

ATGGATATGAATCAGCTATTCATGCAATCTATTGCAAACAGTCGTGGACTCTGTCATCC

AGATTGCGAAAAAGCAAATAATGAGCGTGAAGATTATGATGCGTCTCAACATGCCGCTA

TGGTAGCGGTGAATCTGATTAGCTCTGCACGGGTTATCCTCAAGCTTGATGCTGTGTAT

ACTGAGTACTCAGCTCAGTATTTGGTGGATAATGCTGGGAAGGAAGACAACCAGGGAGA

AATGGATCAACAAAGCTCTCAGCTCACTCTCCAAAACTTGCTTCAGTATATGGATGAAA

ATGTCTGGAATAAGAAGGAAGATGTGCAGGGAGAAAGGGAGCAACCACTCACTGTCAAA

GACTGCCTTGAATGTGCTTTCAAGTAA

SEQ ID NO: 4 is a CDS of the At5g18040 gene from Arabidopsis;

ATGAATATGATTCAGCGATTCATGCAATCTATGGCAAAGACGCGTGGCCTCTGTCATCC

AGATTGCGTAAAAGCAAGTAGTGAGCAAGAAGATTACGATGCGTCTCAGCTCAGTATTT

GGTGGATAATGCTGGGAAGGAAGACGACCAGGGAGAAATGGATGAACCAAGCTCTCAGT

TCACTATCGAAAACTTGCATCAGTATATGGTGGAAAATGTCTGGAATAAGAGGTAAGAT

GTGCAGGGAGAGGGAGCAACCACTCACTGTCAAAGACTGCCTTGAATGTGCTTTCAAGA

AAGGGCTACCGAGAAGAGAACATTGGGCACATGTGGGATGTACATTCAAGGCTCCCCCA

TTTGCTTGTCACATACCCCGCGTGCCCATGAAAGGAGAAGTGATTGAGACTAAGAGTTT

GGATGAAGCGTTTAAGCTGTTGATTAAACAACCGGTGGGTGCAAGACTCCATGTGTTCA

GTCCAGACCTTGATAATGTTGGAGAGGGAGTTTACGAGGGCCTGTCTAGCCTGTCTCGT

AAGGAATCACGCTATGTTGGACTTAGGGATGTCATCATAGTTGCAGTGAATAAGTCCGA

GGGAAAAACTGTTGCTACTGTGAAGATATGTTACAAGAAGAAGACTTCATTTGTCAAAG

TGTGTTTGAGCCGTATGTTTGTCCAGCTTGGTGGTGGCGAGGAGTCTCAGGTGAAAGAG

CCAACAGGTCTGCTTGTTGACTTCTGTATCCCACGCTTATCTATCAACTAA

SEQ ID NO: 5 is a CDS of the At1g51670 gene from Arabidopsis;

ATGGCACTCCCTCCCTATGATCCGAATTTCACATTGGCTTTTTCATACGGTAGACGCGA

TAATGTCTTTGAGAATGACCCAGAGCACGATGAATCTGCTTCTGCTGCTATCGTAGCGG

TTGAGCTGATAAGCTCTGCACGGCTTGCACTTAAGCTGGATAGTGTCCGCACTGAGTAC

TCAGCTCAGTATTTGGTGGACAAAGCTGGCTCACGCAACCTCAGGCGCAGGCGCAAGCT

CACTGTCAAGGACTGCCTTAACTTTGCGTTAAAGAAAGGCGGCATACCGAGAGCAGAAG

ATTGGCCACCTTTGGGATCTGAGTCAAAGACCCCATCATCGTACGAACCTGCTCTCGTT

TCCATGAAAGGAGAAGTGATTGAGCCTAAGGATATGGACGAAGTACCTGAGTTGTTGGT

GCATCAATCAGCCGTGGGAGCAAAACTGCATGTGTTCACTCCACACATTGAACTTCAAC

AAGACGCAATTTACTTGCCTCGTCAGGTGAGTATGCGCGCTACGTTGGACTTAGAGATG

GGATAG

SEQ ID NO: 6 is a consensus sequence of Kanghan conserved domain B (100% consensus)

hTVKDChphAhp

SEQ ID NO: 7 is a consensus sequence of Kanghan conserved domain B (80% consensus)

LTVKDCLEhAhKXG (where X is Lys or absent)

SEQ ID NO: 8 is a consensus sequence of Kanghan conserved domain B (70% consensus)

LTVKDCLEhAFKKG

SEQ ID NO: 9 is a consensus sequence of Kanghan conserved domain C (80% consensus)

VshKGpVlEstshpEsXchhhpQs-huA + LHlFpPph (where X is any amino acid)

SEQ ID NO: 10 is a consensus sequence of Kanghan conserved domain C (70% consensus)

VsMKGEVIEspsh_EAhcLllcQPlGA + LHlFoPcl

SEQ ID NO: 11 is a consensus sequence of Kanghan conserved domain A (80% consensus)

cppDYDtStpAAhVAlpLISSARlhLKlDuhhTEYSsQaLhDpsutpp

SEQ ID NO: 12 is a consensus sequence of Kanghan conserved domain A (70% consensus)

spphhpShupscChCHPDCXKAssEpEDYDASQpAAhVAVsLISSARlhLKLDusaTEY

SAQYLVDNAGpccs (where X is any amino acid or absent)

SEQ ID NO: 13 is a CDS of the At1g48180 gene from Arabidopsis:

ATGGCACTCCCACCCTATGATCCCAATTTCAAATTTGCATTCTCTCTTGGCACGATTGC

GAAACACCAAGATTACGATGAATCTGCTTCTGCTGCTGTTGTAGCGCTTGATCTGATAA

GCTCTGCACGGTTTGCACTTAAGCTGGATAGTGTCTATACTGAGTACTCTGCTAAGTAT

GTGGTGGACAATGCTGCTGGCTCACACAGTGGGCGCAAGCTCACTGTCAAAGACTGTCT

TGAGTTTGCCTTAAACAAAGGCGGCATACCGAAAGCAGAAGATTGGCCACGCTTGGGAT

CTGTGATAACGCCCCCATCATCGTATAAACCTGATCTCGTTTCGATGAAAGGACAAGTG

ATTGAGCCTCAGACTATTGAGGAAGCATGTGACATGGTGGTGGATCAACCAGTAGGAGC

AAAATTGCATGTGTTCAAGCCACACATTGAACTTCAACAAGACGCAAGTGCTATAACTG

GCATTTACTCTGCCACCTCAGGTGAGCCACCCACCTATCTCCGACTTACAGATCCCATC

ATCGTTGGAGTCGAGAAGATCCAAGGGAAGTCTATTGGAACTGTGAAGGTATGGTACAA

GAAGTTCATATTTCTGAAAGTGGCTATGAGCAGGTGGTTTCAGTTATACTCTCCGGATG

GCACACACACGGGCATAAAGCGAACAGATTACCTTGTTGATTTTTGTGTCCCACGCCTA

TCCATCGATTAA

SEQ ID NO: 14 is the polypeptide encoded by SEQ ID NO: 1

MAERLLQSMSRVAGRCHPDCVKASDEQEDYHASQNAALVAVNLISSARLILKLDAEFTE

YSAQFLMDNAGKEDDPGEVDQQRNQVTTENCLRYLAENVWTKKENGQGGMDQQRPVLTV

KDCLELAFKKGLPRREHWAHLGCTFKAPPFACQIPRVPVKGEVVEVKTFDEAFKLLVHQ

P1GAKLHLFSPQIDNVGEGVYKGLTTGNETHYVGLRDVLIASVEEFEGDSVAIVKICYK

KKLSFIKVSLSVRFLSVAHDGDKSKFIAPTGLLVDFCVPRLSIN

SEQ ID NO: 15 is the polypeptide encoded by SEQ ID NO: 2

MMAISEKGVMAISEKGVMATKIDKNGVLRELRRHETEFSLRDVDLCLRSSSQMESLLEC

FATTDGKCHPDCLKANNEQEDYDACQSAALVAVSLISSARVIFKIDSKYTEYSPQYLVD

NVGKEEVEGEMDQPSCQYTVGNLLSYLVENVWTKKEVRQREMDQQRREFTVKDCFEFAF

KKGLPRNGHWAHVGCIFPVPPFACQIPRVPMKGEVIEAANVSEALKLGMQQPAAARLHL

FSPEFDLVGEGIYDGPSGNETRYVGLRDVLMVEAEKIKGETVETVQICYKKKTSFVKVS

TRSMILPLNGDDESQVTEPACLLVDFCIPRESIN

SEQ ID NO: 16 is the polypeptide encoded by SEQ ID NO: 3

MDMNQLFMQSIANSRGLCHPDCEKANNEREDYDASQHAAMVAVNLISSARVILKLDAVY

TEYSAQYLVDNAGKEDNQGEMDQQSSQLTLQNLLQYMDENVWNKKEDVQGEREQPLTVK

DCLECAFK

SEQ ID NO: 17 is the polypeptide encoded by SEQ ID NO: 4

MNMIQRFMQSMAKTRGLCHPDCVKASSEQEDYDASQLSIWWIMLGRKTTREKWMNQALS

SLSKTCISIWWKMSGIRGKMCREREQPLTVKDCLECAFKKGLPRREHWAHVGCTFKAPP

EACHIPRVPMKGEVIETKSLDEAFKLLIKQPVGARLHVFSPDLDNVGEGVYEGLSSLSR

KESRYVGLRDVIIVAVNKSEGKTVATVKICYKKKTSFVKVCLSRMFVQLGGGEESQVKE

PTGLLVDFCIPRLSIN

SEQ ID NO: 18 is the polypeptide encoded by SEQ ID NO: 5

MALPPYDPNFTLAFSYGRRDNVFENDPEHDESASAAIVAVELISSARLALKLDSVRTEY

SAQYLVDKAGSRNLRRRRKLTVKDCLNFALKKGGIPRAEDWPPLGSESKTPSSYEPALV

SMKGEVIEPKDMDEVPELLVHQSAVGAKLHVFTPHIELQQDAIYLPRQVSMRATLDLEM

G

SEQ ID NO: 19 is the polypeptide encoded by SEQ ID NO: 13

MALPPYDPNFKFAFSLGTIAKHQDYDESASAAVVALDLISSARFALKLDSVYTEYSAKY

VVDNAAGSHSGRKLTVKDCLEFALNKGGIPKAEDWPRLGSVITPPSSYKPDLVSMKGQV

IEPQTIEEACDMVVDQPVGAKLHVFKPHIELQQDASAITGIYCGTSGEPASYVGLRDAI

IVGVEKIQGKSIGTVKVWYKKFIFLKVAMSRWFQLYSPDGTHTGIKRTDYLVDFCVPRL

SMD

SEQ ID NOs: 20 and 21 are a primer pair designed to target BnaC03g77540D

(LOC106364365)

TAGATTCTGCTGAGAGAGCCGCTAC (SEQ ID NO: 20)

GGATCCGTCGACGCACCTATGGGTCCATGCTTTAAC (SEQ ID NO: 21)

SEQ ID NOs: 22 and 23 are a primer pair designed to target BnaA08g12920D

(LOC106424160)

TCATCCAGATTGCCAACGAG (SEQ ID NO: 22)

GGATCCGTCGACACGCATCCTCCAGTGTCTTAG (SEQ ID NO: 23)

SEQ ID NOs: 24 and 25 are a primer pair designed to target hygromycin

TACACAGCCATCGGTCCAGA (SEQ ID NO: 24)

GTAGGAGGGCGTGGATATGTC (SEQ ID NO: 25)

SEQ ID NOs: 26 and 27 are a primer pair designed to target BnaA07g02270D

CGCTACGAGGCACGTACTCAAT (SEQ ID NO: 26)

CTCGGTCTTCCCCGGTTTC (SEQ ID NO: 27)

SEQ ID NOs: 28 and 29 are a primer pair designed to target BnaA08g12920D

GCTTAGAGACGTGATCCTGGTAGC (SEQ ID NO: 28)

CCAGTGTGGTGAACATACGGC (SEQ ID NO: 29)

SEQ ID NOs: 30 and 31 are a primer pair designed to target BnaA01g07670D

GTTTTGTTGGTCTCTTCTCTTTGC (SEQ ID NO: 30)

TTCTTAAGAGGCGTTTCAGATGG (SEQ ID NO: 31)

SEQ ID NOs: 32 and 33 are a primer pair designed to target BnaC03g77540D

TGATTTGGGTTTTGCCTGATAC (SEQ ID NO: 32)

GAAACAAACCATAAATGAGTTGCC (SEQ ID NO: 33)

SEQ ID NOs: 34 and 35 are a primer pair designed to target BnaC03g77550D

CATTTGGGATGTGTCGATTGAG (SEQ ID NO: 34)

CCCACGTAGCTTGTTCCGTT (SEQ ID NO: 35)

SEQ ID NOs: 36 and 37 are a primer pair designed to target BnaA01g06470D

AACACTGTCACGCAGATTGCC (SEQ ID NO: 36)

CTGTCCAGGTTAGCTACCATACGA (SEQ ID NO: 37)

SEQ ID NOs: 38 and 39 are a primer pair designed to target BnaC01g08490D

CGGTATCCAACTCATTCGAAGG (SEQ ID NO: 38)

TCAAGTATATACTGGGTTGGCTGC (SEQ ID NO: 39)

SEQ ID NOs: 40 to 171 are detailed in the sequence listing.

DETAILED DESCRIPTION

In the following detailed description, various non-limiting examples of particular embodiments are set out, together with experimental procedures that may be used to implement a wide variety of modifications and variations in the practice of the present invention. For clarity, a variety of technical terms are used herein in accordance with their commonly understood meaning, as reflected in the definitions set out below.

The term “line” refers to a group of plants that displays very little overall variation among individuals sharing that designation. A “line” generally refers to a group of plants that display little or no genetic variation between individuals for at least one trait. Plants within a group of plants that display little or no genetic variation between individuals may also be referred to as having the same genetic background.

A “variety” or “cultivar” includes a line that is used for commercial production. In some aspects, Brassica varieties may for example be derived from “doubled haploid” (DH) lines, which refers to a line created by the process of microspore embryogenesis, in which a plant is created from an individual microspore. By this process, lines are created that are homogeneous, i.e. all plants within the line have the same genetic makeup. The original DH plant is referred to as DH1, while subsequent generations are referred to as DH2, DH3 etc. Doubled haploid procedures are well known and have been established for several crops. A procedure for B. juncea has been described by Thiagarajah and Stringham (1993).

New lines, varieties or plants may be produced by introducing a heritable change in a parent plant. In this context, a “heritable change” is any molecular alteration, typically a genetic change, that is capable of being passed from one generation of plant to the next. This term is intended to include molecular alterations such as, but not limited to, insertions, deletions, point mutations, frame-shift mutations, inversions, rearrangements, and the introduction of transgenes. There is a wide variety of techniques available for introducing heritable changes to plants and plant cells.

Plant “mutagenesis” in the present context is a process in which an agent known to cause alterations in genetic material is applied to plant material, for example the mutagenic agent ethyl methanesulfonate (EMS). A range of molecular techniques such as recombination with foreign or heterologous nucleic acid fragments or gene editing may also be used for mutagenesis. All such methods of introducing nucleic acid sequence changes are included within the term “mutagenesis” as used herein.

Plant “regeneration” involves the selection of cells capable of regeneration (e.g. seeds, microspores, ovules, pollen, vegetative parts) from a selected plant or variety. These cells may optionally be subjected to mutagenesis, following which a plant is developed from the cells using regeneration, fertilization, and/or growing techniques based on the types of cells mutagenized. Applicable regeneration techniques are known to those skilled in the art; see, for example, Armstrong et al. (1985); and Close et al. (1987).

“Improved characteristics” of a plant means that the characteristics in question are altered in a way that is desirable or beneficial or both in comparison with a reference value or attribute, which in the absence of an express comparator relates to the equivalent characteristic of a wild type strain.

Plant “progeny” means the direct and indirect descendants, offspring and derivatives of a plant or plants, and includes the first, second, third and subsequent generations. Progeny may be produced by self-crossing or by crossing with plants of the same or different genotypes, and may be modified by a range of suitable genetic engineering techniques.

Plant “breeding” includes all methods of developing or propagating plants and includes both intra and inter species and intra and inter line crosses as well as all suitable artificial breeding techniques. Desired traits may be transferred to other lines through conventional breeding methods and can also be transferred to other species through inter-specific crossing. Both conventional breeding methods and inter-specific crossing methods as well as all other methods of transferring genetic material between plants are included within the concept of “breeding”.

“Molecular biological techniques” means all forms of anthropogenic manipulation of biological molecules, such as nucleic acid sequences, for example to alter the sequence and expression thereof, and includes the insertion, deletion, modification or editing of sequences or sequence fragments and the direct or indirect introduction of new sequences into the genome of an organism, for example by directed or random recombination using suitable vectors and/or techniques.

“Marker-assisted selection” (MAS) refers to the use of molecular markers to assist in phenotypic selection in the context of plant breeding. A wide variety of molecular markers, such as single nucleotide polymorphisms (SNPs), may for example be used in MAS plant breeding, including the application of next-generation sequencing (NGS) technologies.

The term “genetically derived” as used for example in the phrase “an improved characteristic genetically derived from the parent plant or cell” means that the characteristic in question is dictated wholly or in part by an aspect of the genetic makeup of the parent plant or cell, applying for example to progeny of the parent plant or cell that retain the improved characteristic of the parent plant or cell.

Various genes and nucleic acid sequences of the invention may be recombinant sequences. The term “recombinant” means that something has been recombined, so that when made in reference to a nucleic acid construct the term refers to a molecule that is comprised of nucleic acid sequences that are joined together or produced by means of molecular biological techniques. Nucleic acid “constructs” are accordingly recombinant nucleic acids, which have generally been made by aggregating interoperable component sequences. The term “recombinant” when made in reference to a protein or a polypeptide refers to a protein or polypeptide molecule which is expressed using a recombinant nucleic acid construct created by means of molecular biological techniques. The term “recombinant” when made in reference to the genetic composition of an organism or cell refers to a gamete or progeny with new combinations of alleles that did not occur in the parental genomes. Recombinant nucleic acid constructs may include a nucleotide sequence which is ligated to, or is manipulated to become ligated to, a nucleic acid sequence to which it is not ligated in nature, or to which it is ligated at a different location in nature. Referring to a nucleic acid construct as ‘recombinant’ therefore indicates that the nucleic acid molecule has been manipulated using genetic engineering, i.e. by human intervention. Recombinant nucleic acid constructs may for example be introduced into a host cell by transformation. Such recombinant nucleic acid constructs may include sequences derived from the same host cell species or from different host cell species, which have been isolated and reintroduced into cells of the host species. Recombinant nucleic acid construct sequences may become integrated into a host cell genome, either as a result of the original transformation of the host cells, or as the result of subsequent recombination and/or repair events.

Recombinant constructs of the invention may include a variety of functional molecular or genomic components, as required for example to mediate gene expression or suppression in a transformed plant. In this context, “DNA regulatory sequences,” “control elements,” and “regulatory elements,” refer to transcriptional and translational control sequences, such as promoters, enhancers, polyadenylation signals, terminators, and protein degradation signals that regulate gene expression. In the context of the present disclosure, “promoter” means a sequence sufficient to direct transcription of a gene when the promoter is operably linked to the gene. The promoter is accordingly the portion of a gene containing DNA sequences that provide for the binding of RNA polymerase and initiation of transcription. Promoter sequences are commonly, but not universally, located in the 5′ non-coding regions of a gene. A promoter and a gene are “operably linked” when such sequences are functionally connected so as to permit gene expression mediated by the promoter. The term “operably linked” accordingly indicates that DNA segments are arranged so that they function in concert for their intended purposes, such as initiating transcription in the promoter to proceed through the coding segment of a gene to a terminator portion of the gene. Gene expression may occur in some instances when appropriate molecules (such as transcriptional activator proteins) are bound to the promoter. Expression is the process of conversion of the information of a coding sequence of a gene into mRNA by transcription and subsequently into polypeptide (protein) by translation, as a result of which the protein is said to be expressed. As the term is used herein, a gene or nucleic acid is “expressible” if it is capable of expression under appropriate conditions in a particular host cell.

Promoters may for example be used that provide for preferential gene expression within a specific organ or tissue, or during a specific period of development. For example, promoters may be used that are specific for leaf (Dunsmuir et al., 1983), root tips (Pokalsky et al., 1989), fruit (Peat et al., 1989; U.S. Pat. No. 4,943,674 issued 24 Jul. 1990; International Patent Publication WO-A 8 809 334; U.S. Pat. No. 5,175,095 issued 29 Dec. 1992; European Patent Application EP-A 0 409 629; and European Patent Application EP-A 0 409 625), embryogenesis (U.S. Pat. No. 5,723,765 issued 3 Mar. 1998 to Oliver et al.), or young flowers (Nilsson et al. 1998). Promoters demonstrating preferential transcriptional activity in plant tissues are, for example, described in European Patent Application EP-A 0 255 378 and International Patent Publication WO-A 9 113 980. Promoters may be identified from genes which have a differential pattern of expression in a specific tissue by screening a tissue of interest, for example, using methods described in U.S. Pat. No. 4,943,674 and European Patent Application EP-A 0255378. The disclosure herein includes examples of this embodiment, showing that plant tissues and organs can be modified by transgenic expression of a Kanghan gene.

An “isolated” nucleic acid or polynucleotide as used herein refers to a component that is removed from its original environment (for example, its natural environment if it is naturally occurring). An isolated nucleic acid or polypeptide may contain less than about 50%, less than about 75%, less than about 90%, less than about 99.9% or less than any integer value between 50 and 99.9% of the cellular or biological components with which it was originally associated. A polynucleotide amplified using PCR so that it is sufficiently distinguishable (on a gel for example) from the rest of the cellular components is, for example, thereby “isolated”. The polynucleotides of the invention may be “substantially pure,” i.e., having the high degree of isolation as achieved using a purification technique.

In the context of biological molecules “endogenous” refers to a molecule such as a nucleic acid that is naturally found in and/or produced by a given organism or cell. An “endogenous” molecule may also be referred to as a “native” molecule. Conversely, in the context of biological molecules “exogenous” refers to a molecule, such as a nucleic acid, that is not normally or naturally found in and/or produced by a given organism or cell in nature.

As used herein to describe nucleic acid or amino acid sequences the term “heterologous” refers to molecules or portions of molecules, such as DNA sequences, that are artificially introduced into a particular host cell, for example by transformation. Heterologous DNA sequences may for example be introduced into a host cell by transformation. Such heterologous molecules may include sequences derived from the host cell. Heterologous DNA sequences may become integrated into the host cell genome, either as a result of the original transformation of the host cells, or as the result of subsequent recombination events.

Transformation techniques that may be employed include plant cell membrane disruption by electroporation, microinjection and polyethylene glycol based transformation (such as are disclosed in Paszkowski et al. (1984); Fromm et al. (1985); Rogers et al. (1986); and in U.S. Pat. Nos. 4,684,611; 4,801,540; 4,743,548 and 5,231,019); biolistic transformation such as DNA particle bombardment (for example as disclosed in Klein et al. (1987); Gordon-Kamm, et al. (1990); and in U.S. Pat. Nos. 4,945,050; 5,015,580; 5,149,655 and 5,466,587); and Agrobacterium-mediated transformation methods (such as those disclosed in Horsch et al. (1984); Fraley et al. (1983); and U.S. Pat. Nos. 4,940,838 and 5,464,763). Transformation systems adapted for use in Camelina sativa are for example described in US Patent Publication 20140223607. Varieties of Camelina sativa are for example described in US Patent Publication 20120124693, and are the subject of seed samples deposited under ATCC Accession No. PTA-11480. Aspects of the present invention involve altering known plant varieties, such as Camelina sativa, to alter endogenous Kanghan genes.

Transformed plant cells may be cultured to regenerate whole plants having the transformed genotype and displaying a desired phenotype, as for example modified by the expression of a heterologous Kanghan gene during growth or development. A variety of plant culture techniques may be used to regenerate whole plants, such as are described in Gamborg et al. (1995); Evans et al. (1983); Binding (1985); Klee et al. (1987).

Various aspects of the present disclosure encompass nucleic acid or amino acid sequences that are homologous to other sequences. As the term is used herein, an amino acid or nucleic acid sequence is “homologous” to another sequence if the two sequences are substantially identical, as defined herein, and the functional activity of the sequences is conserved (as used herein, sequence conservation or identity does not imply evolutionary relatedness). Nucleic acid sequences may also be homologous if they encode substantially identical amino acid sequences, even if the nucleic acid sequences are not themselves substantially identical, for example as a result of the degeneracy of the genetic code.
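By way of non-limiting illustration, the percent identity that underlies the term “substantially identical” may be computed for two sequences that have already been aligned to equal length. The following Python sketch is an assumption-laden example only: the function name percent_identity is hypothetical, the two fragments are the domain B regions read from SEQ ID NO: 14 and SEQ ID NO: 19, and no alignment step is performed.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of aligned positions carrying identical residues.

    Assumes the inputs are already aligned to equal length; a gap
    character '-' never counts as a match.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not a:
        raise ValueError("empty alignment")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


# Domain B regions taken from SEQ ID NO: 14 and SEQ ID NO: 19
# (ungapped and of equal length, so no alignment step is needed here).
frag_14 = "LTVKDCLELAFKKG"
frag_19 = "LTVKDCLEFALNKG"

identity = percent_identity(frag_14, frag_19)
print(f"{identity:.1f}% identical")  # 11 of 14 positions match
```

In practice an alignment tool would be used to place gaps before identity is counted; the sketch only shows the final counting step.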

With reference to biological sequences “substantial homology” or “substantial identity” is meant, in the alternative, a sequence identity of greater than 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% up to 100% sequence identity. Homology may refer to nucleic acid or amino acid sequences as the context dictates. In alternative embodiments, sequence identity may for example be at least 75%, at least 90% or at least 95%. Optimal alignment of sequences for comparisons of identity may be conducted using a variety of algorithms, such as the local homology algorithm of Smith and Waterman (1981), the homology alignment algorithm of Needleman and Wunsch (1970), the search for similarity method of Pearson and Lipman (1988), and the computerized implementations of these algorithms (such as GAP, BESTFIT, FASTA and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, Madison, Wis., U.S.A.). Sequence identity may also be determined using the BLAST algorithm, described in Altschul et al. (1990) (using the published default settings). Software for performing BLAST analysis may be available through the National Center for Biotechnology Information (NCBI) at their Internet site. The BLAST algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence that either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold. Initial neighborhood word hits act as seeds for initiating searches to find longer HSPs. The word hits are extended in both directions along each sequence for as far as the cumulative alignment score can be increased. 
Extension of the word hits in each direction is halted when any of the following conditions is met: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T and X determine the sensitivity and speed of the alignment. The BLAST program may use as defaults a word length (W) of 11, the BLOSUM62 scoring matrix (Henikoff et al., 1992), alignments (B) of 50, expectation (E) of 10, M=5, N=−4, and a comparison of both strands. One measure of the statistical similarity between two sequences using the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. In alternative embodiments, nucleotide or amino acid sequences are considered substantially identical if the smallest sum probability in a comparison of the test sequences is less than about 1, less than about 0.1, less than about 0.01, or less than about 0.001.
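The halting rules for word-hit extension described above may be illustrated by the following simplified Python sketch. It is not the BLAST implementation: it extends an exact seed in one direction only, without gaps, and the function name extend_right, the match reward of 5, the mismatch penalty of −4 and the X value of 10 are all assumptions chosen for illustration.

```python
def extend_right(query, subject, q_start, s_start, seed_len,
                 match=5, mismatch=-4, x_drop=10):
    """Ungapped rightward extension of an exact seed hit.

    Returns (best_score, best_length), where best_length counts aligned
    positions from the start of the seed. Scoring values are illustrative.
    """
    score = seed_len * match          # the seed is assumed to match exactly
    best_score, best_len = score, seed_len
    length = seed_len
    i, j = q_start + seed_len, s_start + seed_len
    # Halting rule 3: stop when the end of either sequence is reached.
    while i < len(query) and j < len(subject):
        score += match if query[i] == subject[j] else mismatch
        length += 1
        if score > best_score:
            best_score, best_len = score, length
        # Halting rule 2: cumulative score at zero or below.
        # Halting rule 1: score has fallen by X from its maximum.
        if score <= 0 or best_score - score >= x_drop:
            break
        i += 1
        j += 1
    return best_score, best_len


query = "ACGTACGATTTTTTTT"
subject = "ACGTACGAGGGGGGGG"
# Seed "ACGT" at position 0 of both sequences; four more matches follow,
# then three mismatches trigger the X-drop rule (40 -> 28, a drop of 12).
result = extend_right(query, subject, 0, 0, 4)
print(result)
```

A complete seed-and-extend search would apply the same rule leftward from the seed and keep the segment pair with the highest combined score.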

An alternative indication that two amino acid sequences are substantially identical is that one peptide is specifically immunologically reactive with antibodies that are also specifically immunoreactive against the other peptide. Antibodies are specifically immunoreactive to a peptide if the antibodies bind preferentially to the peptide and do not bind in a significant amount to other proteins present in the sample, so that the preferential binding of the antibody to the peptide is detectable in an immunoassay and distinguishable from non-specific binding to other peptides. Specific immunoreactivity of antibodies to peptides may be assessed using a variety of immunoassay formats, such as solid-phase ELISA immunoassays for selecting monoclonal antibodies specifically immunoreactive with a protein (see Harlow et al., 1988).

An alternative indication that two nucleic acid sequences are substantially identical is that the two sequences hybridize to each other under moderately stringent, or stringent, conditions. Hybridization to filter-bound sequences under moderately stringent conditions may, for example, be performed in 0.5 M NaHPO₄, 7% sodium dodecyl sulfate (SDS), 1 mM EDTA at 65° C., and washing in 0.2×SSC/0.1% SDS at 42° C. (see Ausubel, et al. (eds), 1989). Alternatively, hybridization to filter-bound sequences under stringent conditions may, for example, be performed in 0.5 M NaHPO₄, 7% SDS, 1 mM EDTA at 65° C., and washing in 0.1×SSC/0.1% SDS at 68° C. (see Ausubel, et al. (eds), 1989). Hybridization conditions may be modified in accordance with known methods depending on the sequence of interest (see Tijssen, 1993). Generally, stringent conditions are selected to be about 5° C. lower than the thermal melting point for the specific sequence at a defined ionic strength and pH. The term “a polynucleotide that hybridizes under stringent (low, intermediate) conditions” is intended to encompass both single and double-stranded polynucleotides although only one strand will hybridize to the complementary strand of another polynucleotide. Washing in the specified solutions may be conducted for a range of times from several minutes to several days and those skilled in the art will readily select appropriate wash times to discriminate between different levels of homology in bound sequences.

In alternative embodiments, the invention provides nucleic acids, such as isolated or recombinant nucleic acid molecules, comprising the sequence of a Kanghan allele of the invention. Isolated nucleic acids of the invention may include coding sequences of the invention recombined with other sequences, such as cloning vector sequences. Homology to sequences of the invention may be detectable by hybridization with appropriate nucleic acid probes, by PCR techniques with suitable primers or by other techniques. In particular embodiments there are provided nucleic acid probes which may comprise sequences homologous to portions of the alleles of the invention. Further embodiments may involve the use of suitable primer pairs to amplify or detect the presence of a sequence of the invention, for example a sequence that is associated with an abiotic stress response, such as drought or heat resistance.

In alternative embodiments, the invention provides methods for identifying plants, such as Camelina, Brassica or Triticum plants, with a desirable abiotic stress response, such as drought tolerance and/or heat resistance, or a desired genomic characteristic. Methods of the invention may for example involve determining the presence in a genome of particular Kanghan alleles. In particular embodiments the methods may comprise identifying the presence of: a nucleic acid polymorphism associated with one of the identified alleles; or an antigenic determinant associated with one of the alleles. Such a determination may for example be achieved with a range of techniques, such as PCR amplification of the relevant DNA fragment, DNA fingerprinting, RNA fingerprinting, gel blotting and RFLP analysis, nuclease protection assays, sequencing of the relevant nucleic acid fragment, the generation of antibodies (monoclonal or polyclonal), or alternative methods adapted to distinguish the protein produced by the relevant alleles from other variants or wild type forms of that protein.

In selected embodiments, a specific base pair change in a Kanghan allele may for example be used to design protocols for MAS, such as the use of allele-specific probes, markers or PCR primers. For an exemplary summary of allele-specific PCR protocols, see Myakishev et al. (2001) or Tanhuanpaa et al. (1999). In alternative embodiments, for example, various methods for detecting single nucleotide polymorphisms (SNPs) may be used for identifying Kanghan alleles of the invention. Such methods may for example include TaqMan assays or Molecular Beacon assays (Tapp et al., 2000), Invader Assays (Mein et al., 2000) or assays based on single strand conformational polymorphisms (SSCP) (Orita et al., 1989).

In alternative embodiments, the invention provides progeny of parent plant lines having altered endogenous or heterologous Kanghan genes, for example progeny of the Camelina sativa parent line that is the subject of ATCC Accession number PTA-11480. Such progeny may for example be selected to have a desired alteration in an abiotic stress response compared to the parent strain, such as improved drought resistance or heat tolerance.

In alternative embodiments, a plant seed is provided, such as an Arabidopsis, Camelina, Triticum or Brassica seed. In alternative embodiments, genetically stable plants are provided, such as plants of the genus Arabidopsis, Camelina, Triticum or Brassica. In further alternative embodiments the invention provides processes of producing genetically stable plants, such as Arabidopsis, Camelina, Triticum or Brassica plants, for example plants having a desired alteration in an abiotic stress response compared to a reference strain that does not have a particular alteration in a Kanghan gene, such as improved drought resistance or heat tolerance.

In various aspects, the invention involves the modulation of the number of copies of an expressible Kanghan coding sequence in a plant genome. By “expressible” it is meant that the primary structure, i.e. sequence, of the coding sequence indicates that the sequence encodes an active protein. Expressible coding sequences may nevertheless not be expressed as an active protein in a particular cell, for example due to gene silencing. This ‘gene silencing’ may for example take place by various mechanisms of homologous transgene inactivation or epigenetic silencing in vivo. Homologous transgene inactivation and epigenetic silencing in transgenic plants have been described in plants where a transgene has been inserted in the sense orientation, with the result that both the gene and the transgene are down-regulated (Napoli et al., 1990; Rajeevkumar et al., 2015). In the present invention, the expressible coding sequences in a genome may accordingly not all be expressed in a particular cell, and may in some embodiments result in suppression of Kanghan gene expression.

In other aspects, reduction of Kanghan gene expression may include the reduction, including the suppression or elimination (aka knockout), of expression of a nucleic acid sequence that encodes a Kanghan protein, such as a nucleic acid sequence of the invention. By elimination of expression, it is meant herein that a functional amino acid sequence encoded by the nucleic acid sequence is not produced at a detectable level. By suppression of expression, it is meant herein that a functional polypeptide encoded by the nucleic acid sequence is produced at a reduced level relative to the wild type level of expression of the polypeptide. Reduction of Kanghan expression may include the elimination of transcription of a nucleic acid sequence that encodes a Kanghan protein, such as a sequence of the invention encoding a Kanghan protein. By elimination of transcription it is meant herein that the mRNA sequence encoded by the nucleic acid sequence is not transcribed at detectable levels. Reduction of Kanghan activity may also include the production of a truncated amino acid sequence from a nucleic acid sequence that encodes a Kanghan protein, meaning that the amino acid sequence encoded by the nucleic acid sequence is missing one or more amino acids of the functional amino acid sequence encoded by a wild type nucleic acid sequence. In addition, reduction of Kanghan activity may include the production of a variant Kanghan amino acid sequence, meaning that the amino acid sequence has one or more amino acids that are different from the amino acid sequence encoded by a wild type nucleic acid sequence. A variety of mutations may be introduced into a nucleic acid sequence for the purpose of reducing Kanghan activity, such as frame-shift mutations, introduction of premature stop codon(s), substitutions and deletions. 
For example, mutations in coding sequences may be made so as to introduce substitutions within functional motifs or conserved domains in a Kanghan protein, such as conserved Kanghan protein domains A, B or C.
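The effect of a premature stop codon or a frame-shift mutation on the encoded polypeptide may be illustrated with the following Python sketch, which translates the first 30 nucleotides of SEQ ID NO: 13 using the standard genetic code. The two mutations shown are hypothetical examples chosen for illustration and are not mutations disclosed herein; the helper name translate is likewise an assumption.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate a coding sequence in frame 1, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


wild_type = "ATGGCACTCCCACCCTATGATCCCAATTTC"  # first 30 nt of SEQ ID NO: 13

# Hypothetical nonsense mutation: codon 6 TAT (Tyr) -> TAA (stop).
nonsense = wild_type[:17] + "A" + wild_type[18:]

# Hypothetical frame-shift: deletion of the fifth nucleotide.
frameshift = wild_type[:4] + wild_type[5:]

print(translate(wild_type))   # MALPPYDPNF
print(translate(nonsense))    # MALPP
print(translate(frameshift))  # MDSHPMIPI
```

The wild type translation agrees with the first ten residues of SEQ ID NO: 19, whereas the nonsense variant is truncated after five residues and the frame-shift variant diverges from the second residue onward.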

In an alternative aspect, the down-regulation of Kanghan genes may be used to alter a plant response to abiotic stress, for example to enhance drought tolerance. Such down-regulation may be tissue-specific. For example, anti-sense oligonucleotides may be expressed to down-regulate expression of Kanghan genes. The expression of such anti-sense constructs may be made to be tissue-specific by operably linking anti-sense encoding sequences to tissue-specific promoters. Anti-sense oligonucleotides, including anti-sense RNA molecules and anti-sense DNA molecules, act to block the translation of mRNA by binding to targeted mRNA and inhibiting protein translation from the bound mRNA. For example, anti-sense oligonucleotides complementary to regions of a DNA sequence encoding a Kanghan protein may be expressed in transformed plant cells during development to down-regulate the expression of the Kanghan gene. Alternative methods of down-regulating Kanghan gene expression may include the use of ribozymes or other enzymatic RNA molecules (such as hammerhead RNA structures) that are capable of catalyzing the cleavage of RNA (as disclosed in U.S. Pat. Nos. 4,987,071 and 5,591,610).
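As a non-limiting illustration of how an anti-sense sequence relates to its target, the following Python sketch derives the reverse complement of a short sense-strand region. The target shown is the first 20 nucleotides of SEQ ID NO: 13 and is chosen only for illustration; the function names are hypothetical, and no claim is made that this region is an effective anti-sense target.

```python
# Translation table for Watson-Crick complements of DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_dna(sense: str) -> str:
    """Return the reverse complement of a sense-strand DNA sequence."""
    return sense.upper().translate(COMPLEMENT)[::-1]


def antisense_rna(sense: str) -> str:
    """Return the anti-sense RNA (reverse complement with U in place of T)."""
    return antisense_dna(sense).replace("T", "U")


target = "ATGGCACTCCCACCCTATGA"  # first 20 nt of SEQ ID NO: 13 (illustrative)

print(antisense_dna(target))  # TCATAGGGTGGGAGTGCCAT
print(antisense_rna(target))  # UCAUAGGGUGGGAGUGCCAU
```

Applying antisense_dna twice returns the original sequence, which provides a simple check that the complement table is symmetric.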

Aspects of the invention involve the use of gene editing to alter Kanghan gene sequences. For example, CRISPR-Cas system(s) (e.g., single or multiplexed) can be used to perform plant gene or genome interrogation or editing or manipulation. Kanghan genes may for example be edited for functional investigation and/or selection and/or interrogation and/or comparison and/or manipulation and/or transformation of plant Kanghan genes. This editing may be carried out so as to create, identify, develop, optimize, or confer trait(s) or characteristic(s) to plant(s) or to transform a plant genome, for example to alter an abiotic stress response in a plant, such as drought or heat tolerance. Gene editing can in this way be used to provide improved production of plants, new plants with new combinations of traits or characteristics or new plants with enhanced traits. Such CRISPR-Cas system(s) can for example be used in Site-Directed Integration (SDI) or Gene Editing (GE) or any Near Reverse Breeding (NRB) or Reverse Breeding (RB) techniques (see the University of Arizona website “CRISPR-PLANT” http://www.genome.arizona.edu/crispr/). Embodiments of the invention can be used in genome editing in plants alone or in combination with other molecular biological techniques, such as RNAi or similar genome editing techniques (see, e.g., Nekrasov, 2013; Brooks, 2014; Shan, 2013; Feng, 2013; Xie, 2013; Xu, 2014; Caliando et al., 2015; U.S. Pat. Nos. 6,603,061; 7,868,149; US 2009/0100536; Morrell et al., 2011). Protocols for targeted plant genome editing via CRISPR/Cas9 are also available in Li et al., 2015.
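By way of example only, candidate CRISPR/Cas9 target sites in a Kanghan coding sequence may be located by scanning for a protospacer followed by a protospacer adjacent motif (PAM). The Python sketch below assumes a Streptococcus pyogenes Cas9 with an NGG PAM and a 20-nucleotide protospacer, neither of which is specified herein; it scans the forward strand only, performs no off-target or efficiency scoring, and uses the hypothetical function name find_protospacers. The input is the first 59 nucleotides of SEQ ID NO: 13.

```python
import re


def find_protospacers(seq: str, spacer_len: int = 20):
    """List (start, protospacer, PAM) for every forward-strand NGG site.

    Assumes an SpCas9-style NGG PAM located immediately 3' of the
    protospacer. A lookahead is used so overlapping sites are reported.
    """
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % spacer_len)
    return [(m.start(), m.group(1), m.group(2))
            for m in pattern.finditer(seq.upper())]


# First 59 nt of SEQ ID NO: 13 (At1g48180 CDS), used here for illustration.
cds_fragment = "ATGGCACTCCCACCCTATGATCCCAATTTCAAATTTGCATTCTCTCTTGGCACGATTGC"

for start, spacer, pam in find_protospacers(cds_fragment):
    print(start, spacer, pam)
```

Sites on the reverse strand could be found by applying the same function to the reverse complement of the input.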

In some embodiments, the invention provides new Kanghan polypeptide sequences, which may be produced from wild type Kanghan proteins by a variety of molecular biological techniques. It is well known in the art that some modifications and changes can be made in the structure of a polypeptide without substantially altering the biological function of that peptide, to obtain a biologically equivalent polypeptide. As used herein, the term “conserved amino acid substitutions” refers to the substitution of one amino acid for another at a given location in the peptide, where the substitution can be made without any appreciable loss or gain of function, to obtain a biologically equivalent polypeptide. In making such changes, substitutions of like amino acid residues can be made on the basis of relative similarity of side-chain substituents, for example, their size, charge, hydrophobicity, hydrophilicity, and the like, and such substitutions may be assayed for their effect on the function of the peptide by routine testing. Conversely, as used herein, the term “non-conserved amino acid substitutions” refers to the substitution of one amino acid for another at a given location in the peptide, where the substitution causes an appreciable loss or gain of function of the peptide, to obtain a polypeptide that is not biologically equivalent.

In some embodiments, conserved amino acid substitutions may be made where an amino acid residue is substituted for another having a similar hydrophilicity value (e.g., within a value of plus or minus 2.0), where the following hydrophilicity values are assigned to amino acid residues (as detailed in U.S. Pat. No. 4,554,101): Arg (+3.0); Lys (+3.0); Asp (+3.0); Glu (+3.0); Ser (+0.3); Asn (+0.2); Gln (+0.2); Gly (0); Pro (−0.5); Thr (−0.4); Ala (−0.5); His (−0.5); Cys (−1.0); Met (−1.3); Val (−1.5); Leu (−1.8); Ile (−1.8); Tyr (−2.3); Phe (−2.5); and Trp (−3.4). Non-conserved amino acid substitutions may be made where the hydrophilicity values of the residues are significantly different, e.g. differing by more than 2.0.
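The hydrophilicity criterion can be expressed as a simple lookup. The sketch below uses the values listed in the paragraph above; the function name and the use of three-letter residue codes as keys are illustrative assumptions.

```python
# Hydrophilicity values as listed in the paragraph above.
HYDROPHILICITY = {
    "Arg": 3.0, "Lys": 3.0, "Asp": 3.0, "Glu": 3.0, "Ser": 0.3,
    "Asn": 0.2, "Gln": 0.2, "Gly": 0.0, "Pro": -0.5, "Thr": -0.4,
    "Ala": -0.5, "His": -0.5, "Cys": -1.0, "Met": -1.3, "Val": -1.5,
    "Leu": -1.8, "Ile": -1.8, "Tyr": -2.3, "Phe": -2.5, "Trp": -3.4,
}

def is_conserved_by_hydrophilicity(a, b, window=2.0):
    """True if residues a and b differ in hydrophilicity by no more than window."""
    return abs(HYDROPHILICITY[a] - HYDROPHILICITY[b]) <= window

print(is_conserved_by_hydrophilicity("Leu", "Ile"))  # identical values
print(is_conserved_by_hydrophilicity("Arg", "Trp"))  # values differ by 6.4
```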

In alternative embodiments, conserved amino acid substitutions may be made where an amino acid residue is substituted for another having a similar hydropathic index (e.g., within a value of plus or minus 2.0). In such embodiments, each amino acid residue may be assigned a hydropathic index on the basis of its hydrophobicity and charge characteristics, as follows: Ile (+4.5); Val (+4.2); Leu (+3.8); Phe (+2.8); Cys (+2.5); Met (+1.9); Ala (+1.8); Gly (−0.4); Thr (−0.7); Ser (−0.8); Trp (−0.9); Tyr (−1.3); Pro (−1.6); His (−3.2); Glu (−3.5); Gln (−3.5); Asp (−3.5); Asn (−3.5); Lys (−3.9); and Arg (−4.5). Non-conserved amino acid substitutions may be made where the hydropathic indices of the residues are significantly different, e.g. differing by more than 2.0.

In alternative embodiments, conserved amino acid substitutions may be made where an amino acid residue is substituted for another in the same class, where the amino acids are divided into non-polar, acidic, basic and neutral classes, as follows: non-polar: Ala, Val, Leu, Ile, Phe, Trp, Pro, Met; acidic: Asp, Glu; basic: Lys, Arg, His; neutral: Gly, Ser, Thr, Cys, Asn, Gln, Tyr. Non-conserved amino acid substitutions may be made where the residues do not fall into the same class, for example substitution of a basic amino acid for a neutral or non-polar amino acid.
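The class-based criterion amounts to a set-membership test. The following sketch encodes the four classes exactly as listed above; the dictionary and function names are illustrative assumptions.

```python
# Residue classes as listed in the paragraph above.
AMINO_ACID_CLASSES = {
    "non-polar": {"Ala", "Val", "Leu", "Ile", "Phe", "Trp", "Pro", "Met"},
    "acidic": {"Asp", "Glu"},
    "basic": {"Lys", "Arg", "His"},
    "neutral": {"Gly", "Ser", "Thr", "Cys", "Asn", "Gln", "Tyr"},
}

# Reverse lookup: residue -> class name.
CLASS_OF = {
    residue: name
    for name, members in AMINO_ACID_CLASSES.items()
    for residue in members
}

def same_class(a, b):
    """True if residues a and b belong to the same class (a conserved substitution)."""
    return CLASS_OF[a] == CLASS_OF[b]

print(same_class("Asp", "Glu"))  # both acidic
print(same_class("Lys", "Gly"))  # basic versus neutral
```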

Example 1: Arabidopsis Kanghan Genes

This Example illustrates that drought tolerance in Arabidopsis is conferred by novel QTLs located on three different chromosomes. These genes were identified in an extremely drought tolerant Arabidopsis ecotype, designated herein as #95. The #95 ecotype was isolated during a series of drought treatment experiments, and assessed as follows.

In one assay, 36 plants of ecotype Col and 36 plants of ecotype #95 were used for drought sensitivity testing. At the outset, soil for each pot was dried and weighed to ensure that each pot had the same amount of soil, after which water was added to maintain moisture.

Seeds from Col and #95 were first germinated, then sown separately, one seedling per pot. The plants were grown in a controlled environment under long-day conditions (16-h-light/8-h-dark cycle) at 23° C., a light intensity of 50 μmol m⁻² s⁻¹ and 70% relative humidity (rH). Watering was stopped for both Col and #95 plants three weeks after germination, and all pots were then weighed again, and additional water was supplied to keep every pot at the same weight. Thereafter, drought treatment was initiated and survival days were recorded for both ecotypes. After a period of 15 days without watering, all 36 plants of ecotype Col had died. In contrast, the plants of ecotype #95 retained considerable vigor, and fully recovered to maturity when water supply was resumed.

The extreme drought tolerance of Arabidopsis ecotype #95 was particularly evident after withdrawing water for 38 days. Plants of ecotype Col were all severely wilted due to drought. Ecotype #95, in contrast, still exhibited clear vigor. The F1 progeny of a cross between Col and #95 were also sensitive to drought, indicating the recessive nature of the #95 drought resistance trait. In one assay, 27 days after water was withdrawn, the plants segregated into two groups: those that had died, and those that maintained vigor and were recoverable to full maturity when watering was resumed. In alternative drought tolerance tests of F2 progeny derived from a cross between Col and #95, segregation of F2 population plants after drought treatment (50 days after water withdrawal) was much lower than 3:1. This segregation is consistent with the involvement of major QTLs in controlling the drought tolerance trait. FIG. 1 is a graph illustrating the drought tolerance diversity of the F2 generation of these Arabidopsis plants (Col × #95). Segregation of the drought tolerance trait among 500 individual F2 lines was calculated from the survival days after drought treatment (cessation of watering). In FIG. 1, the survival in days of Col and #95 plants is marked by arrows with legends. The normal distribution of the F2 drought tolerance phenotype indicates that several QTLs govern the drought tolerance trait.
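A departure from a 3:1 ratio of this kind is conventionally assessed with a chi-square goodness-of-fit test. The sketch below illustrates the arithmetic only: the plant counts are hypothetical placeholders, not data from this Example, and the comparison value of 3.841 is the standard critical value for one degree of freedom at the 0.05 level.

```python
def chi_square_3_to_1(sensitive, tolerant):
    """Chi-square statistic for observed counts against a 3:1 sensitive:tolerant ratio."""
    total = sensitive + tolerant
    expected = (0.75 * total, 0.25 * total)
    observed = (sensitive, tolerant)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

CRITICAL_VALUE_DF1_P05 = 3.841

# Hypothetical counts for illustration only (not taken from the experiments above).
statistic = chi_square_3_to_1(sensitive=450, tolerant=50)
print(statistic, statistic > CRITICAL_VALUE_DF1_P05)
```

A statistic above the critical value would argue against a single recessive locus, which is the inference the paragraph above draws from the observed segregation.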

Map-based cloning through crossing with ecotype Col revealed that the drought-related trait was governed by three major QTLs distributed on three different chromosomes. To delineate the underlying genetic components, an F1 generation was developed from the seeds of a cross between Col and #95. The F1 seeds were then used to develop a large F2 population of 5000 lines. The F2 population showed significant segregation of the drought tolerance trait, with some plants showing significant drought tolerance and others showing no drought tolerance, which indicated that the drought tolerance trait of #95 was controlled by several QTLs.

Fine mapping of the genes was further pursued using 500 lines of this population, from which 20 extremely drought tolerant individuals and 20 extremely drought sensitive individuals were selected to conduct a Bulked Segregant Analysis (BSA) with 106 molecular markers covering all 5 chromosomes of Arabidopsis. Based on this analysis, three major QTLs distributed on three different chromosomes were identified. Specifically, QTLs were identified on chromosomes 1, 4 and 5 of the Arabidopsis genome. The contribution rates of these 3 loci to the observed drought tolerance trait were 13.8%, 29.3% and 37.7%, respectively, explaining in the aggregate more than 80% of the drought tolerance variation between ecotype #95 and Col.
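The aggregate figure follows directly from the three contribution rates. A short check, with the mapping of rates to chromosomes following the "respectively" ordering above (the variable names are illustrative):

```python
# Contribution rates (%) of the three QTLs, in the order chromosomes 1, 4 and 5.
contributions = {"Chr.1": 13.8, "Chr.4": 29.3, "Chr.5": 37.7}

total = round(sum(contributions.values()), 1)
unexplained = round(100.0 - total, 1)
print(total, unexplained)
```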

Fine mapping first focused on the loci on Chr.4 and Chr.5, and was carried out using 700 extremely drought tolerant individuals from a total of 5000 F2 plants. The candidate genes were narrowed down to two regions of 540 kb on Chr.4 and 189 kb on Chr.5. Single nucleotide polymorphism (SNP) and insertion/deletion (indel) analysis, as well as expression level analysis based on the TAIR database, was carried out for all of the genes identified in these two regions on Chr.4 and Chr.5.

The full genome sequence of ecotype #95 was compared with the full genome sequence of Arabidopsis ecotype Columbia (ecotype Col). The three major QTLs associated with drought tolerance on Chr. 1, Chr. 4 and Chr. 5 of ecotype #95 were revealed to harbor members of a protein coding gene family: At1g51670 (also referred to as Kanghan3 or KH3), At4g29760 (also referred to as Kanghan4 or KH4), At4g29770 (also referred to as Kanghan2 or KH2), At5g18065 (also referred to as Kanghan5 or KH5) and At5g18040 (also referred to as Kanghan1 or KH1). An additional member of the gene family was recognized by sequence similarity: At1g48180 (also referred to as Kanghan6 or KH6). This gene family is designated herein as the Kanghan gene family, the first five members of which have very strong roles in drought tolerance (a GenBank database accession number for a protein encoded by each of the native Arabidopsis genes is given after the gene name in brackets): Kanghan1 (At5g18040; NP_197305.1), Kanghan2 (At4g29770; NP_001154277.1), Kanghan3 (At1g51670; NP_175578.2), Kanghan4 (At4g29760; NP_194705.1), Kanghan5 (At5g18065; NP_680172.2), Kanghan6 (At1g48180; NP_175252.1).

Analysis of the genomic sequence of ecotype #95 reveals that mutations within Kanghan family genes are associated with drought tolerance. In ecotype #95, all 5 members of the Kanghan family strongly associated with drought tolerance have dramatic mutations. Specifically, four members of the Kanghan gene family (At4g29770, At5g18065, At5g18040 and At1g51670) contain a premature stop codon (see FIG. 2), which is indicative of loss-of-function (null) mutations in ecotype #95 compared to the Col variety. A fifth member of the Kanghan gene family, At4g29760, does not contain a premature stop codon, but 5 amino acid substitutions occur in the coding region of this gene. Among the five Kanghan genes strongly associated with drought tolerance, At5g18040, At4g29770 and At1g51670 are much more highly expressed (over 10 times) in both Col and #95 compared to At5g18065 and At4g29760, suggesting that At5g18040, At4g29770 and At1g51670 may in some circumstances contribute more than the other two genes to the drought tolerance trait.
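Detection of a premature stop codon in a coding sequence can be sketched as follows. The snippet builds the standard genetic code, translates an in-frame CDS, and reports the first stop codon that precedes the final codon. The two toy sequences are hypothetical and differ by a single C-to-T change; they are not Kanghan sequences.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(cds):
    """Translate an in-frame coding sequence; trailing partial codons are ignored."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))

def premature_stop_position(cds):
    """Return the 1-based codon number of the first stop before the final codon, or None."""
    protein = translate(cds)
    idx = protein.find("*")
    if idx == -1 or idx == len(protein) - 1:
        return None
    return idx + 1

# Hypothetical toy sequences, for illustration only.
reference = "ATGAAACGATGGTAA"  # M K R W *
variant = "ATGAAATGATGGTAA"    # CGA -> TGA at codon 3
print(premature_stop_position(reference), premature_stop_position(variant))
```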

Example 2: Reversing Drought Resistance

To further illustrate the role of the Kanghan genes in drought tolerance, two full-length Kanghan genes (At5g18040 and At4g29770) from Arabidopsis ecotype Col, each including at least 2 kb of 5′ UTR, 1 kb of 3′ UTR and the CDS, were used to transform Arabidopsis ecotype #95. The transformants lost their drought resistance, confirming that the modulation of Kanghan gene expression plays a dramatic role in drought resistance.

A further illustration of the dramatic effect of Kanghan genes on drought tolerance was provided by introducing five Kanghan gene alleles from ecotype #95 into ecotype Columbia (Col) by crossing and molecular marker based selection, generation by generation. The 7th generation of backcrossed lines was used for self-crossing to provide homozygous plants which contained the five Kanghan gene alleles from #95 strongly associated with drought tolerance. These homozygous plants were subjected to drought treatment. The result was that introduction of the #95 Kanghan gene alleles drastically enhanced the drought tolerance of ecotype Columbia.

A further illustration of the effect of Kanghan genes on abiotic stress response was provided by measuring the canopy temperatures of Col, #95 and the backcrossed lines bearing the Kanghan alleles. Increased canopy temperature was clearly evident in #95 plants and backcrossed lines, when compared with Col ecotype plants. Further, subjecting seedlings of #95 and Col to heat treatment at 45° C. confirmed heat sensitivity in ecotype #95.

As this Example illustrates, functional expression of Kanghan gene family proteins plays a positive role in heat tolerance, and a negative role in drought tolerance. The invention accordingly provides a variety of avenues for modulating abiotic stress response in plants. In some embodiments, this involves balancing Kanghan gene expression to achieve a desired phenotype of abiotic stress response, for example balancing drought and heat tolerance.

The negative role of the Kanghan family of genes in drought tolerance serves as a basis for improving plant drought tolerance by down-regulating or silencing members of the Kanghan gene family. This may for example be achieved through a wide variety of techniques, including mutagenesis (TILLING) or targeted gene editing, as discussed above.

Example 3: Kanghan Sequence Similarity and Protein Domains

TABLE 1

BLAST alignments of Kanghan proteins, with AT4G29770 as reference sequence.

| Sequence Accession | Gene | Percent Identities | Percent Positives | Length of Alignment | Mismatches | Gaps | SEQ ID NO: |
|---|---|---|---|---|---|---|---|
| NP_001154277.1 | AT4G29770 | 100 | 100 | 329 | 0 | 0 | 15 |
| NP_194705.1 | AT4G29760 | 60.432 | 73.38 | 278 | 108 | 2 | 14 |
| NP_197305.1 | AT5G18040 | 48.227 | 60.99 | 282 | 112 | 4 | 17 |
| NP_680172.2 | AT5G18065 | 63.415 | 73.98 | 123 | 42 | 1 | 16 |
| NP_175252.1 | AT1G48180 | 35.907 | 50.19 | 259 | 120 | 5 | 19 |
| NP_175578.2 | AT1G51670 | 36.628 | 49.42 | 172 | 70 | 4 | 18 |

TABLE 2

Continuation of BLAST alignments of Kanghan proteins, with AT4G29770 as reference sequence.

| Sequence Accession | Query Start | Query End | Subject Start | Subject End | E Value | Max Score | SEQ ID NO: |
|---|---|---|---|---|---|---|---|
| NP_001154277.1 | 1 | 329 | 1 | 329 | 0 | 685 | 15 |
| NP_194705.1 | 54 | 329 | 3 | 280 | 2.53E−115 | 347 | 14 |
| NP_197305.1 | 51 | 329 | 2 | 252 | 1.06E−80 | 258 | 17 |
| NP_680172.2 | 56 | 178 | 7 | 126 | 1.16E−42 | 155 | 16 |
| NP_175252.1 | 77 | 329 | 21 | 239 | 1.45E−39 | 150 | 19 |
| NP_175578.2 | 80 | 249 | 28 | 162 | 8.24E−20 | 95.1 | 18 |
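The fields reported in Tables 1 and 2 resemble the twelve default columns of NCBI BLAST+ tabular output (-outfmt 6): query and subject identifiers, percent identity, alignment length, mismatches, gap openings, query and subject coordinates, E value and bit score. Whether that output format was used to generate the tables is an assumption. Under that assumption, a hit line could be parsed as sketched below; the example row is assembled from the AT4G29760 entries of Tables 1 and 2, and the class and function names are illustrative.

```python
from typing import NamedTuple

class BlastHit(NamedTuple):
    qseqid: str
    sseqid: str
    pident: float
    length: int
    mismatch: int
    gapopen: int
    qstart: int
    qend: int
    sstart: int
    send: int
    evalue: float
    bitscore: float

def parse_hit(line):
    """Parse one tab-separated line with the twelve default tabular columns."""
    f = line.rstrip("\n").split("\t")
    return BlastHit(f[0], f[1], float(f[2]), int(f[3]), int(f[4]), int(f[5]),
                    int(f[6]), int(f[7]), int(f[8]), int(f[9]),
                    float(f[10]), float(f[11]))

# Row assembled from Tables 1 and 2 (query AT4G29770, subject AT4G29760).
row = "NP_001154277.1\tNP_194705.1\t60.432\t278\t108\t2\t54\t329\t3\t280\t2.53e-115\t347"
hit = parse_hit(row)
print(hit.sseqid, hit.pident, hit.qend - hit.qstart + 1)
```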

TABLE 3

BLAST alignments of Kanghan proteins, with AT1G51670 as reference sequence.

| Sequence Accession | Gene | Percent Identities | Percent Positives | Length of Alignment | Mismatches | Gaps | SEQ ID NO: |
|---|---|---|---|---|---|---|---|
| NP_175578.2 | AT1G51670 | 100 | 100 | 178 | 0 | 0 | 18 |
| NP_175252.1 | AT1G48180 | 65.625 | 76.25 | 160 | 48 | 3 | 19 |
| NP_194705.1 | AT4G29760 | 43.407 | 53.85 | 182 | 62 | 7 | 14 |
| NP_001154277.1 | AT4G29770 | 36.628 | 49.42 | 172 | 70 | 4 | 15 |
| NP_197305.1 | AT5G18040 | 45.833 | 60.42 | 96 | 47 | 4 | 17 |
| NP_680172.2 | AT5G18065 | 43.75 | 51.04 | 96 | 21 | 2 | 16 |

TABLE 4

Continuation of BLAST alignments of Kanghan proteins, with AT1G51670 as reference sequence.

| Sequence Accession | Query Start | Query End | Subject Start | Subject End | E Value | Max Score | SEQ ID NO: |
|---|---|---|---|---|---|---|---|
| NP_175578.2 | 1 | 178 | 1 | 178 | 6.95E−127 | 366 | 18 |
| NP_175252.1 | 1 | 160 | 1 | 153 | 5.32E−61 | 201 | 19 |
| NP_194705.1 | 20 | 162 | 19 | 198 | 6.40E−25 | 108 | 14 |
| NP_001154277.1 | 28 | 162 | 80 | 249 | 4.46E−20 | 95.1 | 15 |
| NP_197305.1 | 69 | 162 | 77 | 169 | 9.09E−13 | 73.6 | 17 |
| NP_680172.2 | 28 | 90 | 31 | 126 | 1.36E−09 | 63.2 | 16 |

As set out in the tables above, which in turn set out BLAST alignments against each of the two most divergent Kanghan reference sequences (AT4G29770 and AT1G51670), the Kanghan gene family may be defined as including genes that encode proteins that, when optimally aligned, have at least 35% identity and/or at least 49% positive alignments over a length of at least 90 amino acids, using a BLOSUM or PAM substitution matrix, with gaps permitted.
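The thresholds stated above can be applied to alignment statistics as a simple filter. In the sketch below, the function name and the require_both switch (which selects between the "and" and "or" readings of the definition) are illustrative assumptions; the example values are rows of Table 3.

```python
def meets_family_thresholds(identity, positives, length,
                            min_identity=35.0, min_positives=49.0,
                            min_length=90, require_both=False):
    """Apply the identity / positives / alignment-length thresholds to one alignment."""
    if length < min_length:
        return False
    identity_ok = identity >= min_identity
    positives_ok = positives >= min_positives
    if require_both:
        return identity_ok and positives_ok
    return identity_ok or positives_ok

# Rows taken from Table 3 (AT1G51670 as reference sequence).
print(meets_family_thresholds(36.628, 49.42, 172))  # AT4G29770
print(meets_family_thresholds(43.75, 51.04, 96))    # AT5G18065
```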

This Example further illustrates the existence of conserved protein domains encoded by Kanghan family genes, as depicted in FIGS. 3 through 7.

Conserved domain A is close to the amino end of the proteins, and as shown in FIGS. 4 and 5, comprises a region that may be defined as having a reasonably high degree of consensus (80%) to the following sequence: cppDYDtStpAAhVAlpLISSARlhLKlDuhhTEYSsQaLhDpsutpp. Alternatively, at a slightly reduced level of consensus, conserved domain A may be defined as comprising a region that is defined as having a reasonably high degree of consensus (70%) to the following sequence: spphhpShupscGhCHPDC-KAssEpEDYDASQpAAhVAVsLISSARlhLKLDusaTEYSAQYLVDNAGpccs.

Conserved domain B, as shown in FIGS. 4 and 6, comprises a region that may be defined as having a high degree of consensus (100%) to the following sequence: hTVKDChphAhp. Alternatively, at a reduced level of consensus, conserved domain B may be defined as comprising a region that is defined as having at least 80% identity to the following sequence: LTVKDCLEhAhK-G. Alternatively, at a further reduced level of consensus, conserved domain B may be defined as comprising a region that is defined as having at least 70% identity to the following sequence: LTVKDCLEhAFKKG.

Conserved domain C, as shown in FIGS. 4 and 7, comprises a region that may be defined as having at least 80% identity to the following sequence: VshKGpVlEstshpEs.chhhpQs-huA+LHlFpPph. Alternatively, at a reduced level of consensus, conserved domain C may be defined as comprising a region that is defined as having at least 70% identity to the following sequence: VsMKGEVIEspsh-EAhcLllcQP-lGA+LHlFoPcl. FIGS. 5, 6 and 7 illustrate consensus sequences using a sequence logo, which is a graphical representation of an amino acid or nucleic acid multiple sequence alignment (ClustalW). Each logo consists of stacks of symbols, one stack for each position in the sequence. The overall height of the stack indicates the sequence conservation at that position, while the height of symbols within the stack indicates the relative frequency of each amino or nucleic acid at that position. The width of the stack is proportional to the fraction of valid symbols in that position; positions with many gaps have thin stacks (Crooks et al., 2004; Schneider et al., 1990). Shading of the weblogo images reflects amino acid chemistry (AA).
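The stack height in a sequence logo reflects the information content of each alignment column, which for proteins is at most log2(20) bits. The sketch below computes this quantity as log2(20) minus the Shannon entropy of the column; it omits the small-sample correction used in published logo methods, and the three-sequence alignment is a hypothetical toy example rather than the alignment underlying FIGS. 5 to 7.

```python
import math
from collections import Counter

def column_information(column, alphabet_size=20):
    """Information content (bits) of one alignment column, ignoring gap characters."""
    residues = [c for c in column if c not in "-."]
    if not residues:
        return 0.0
    n = len(residues)
    entropy = -sum((k / n) * math.log2(k / n) for k in Counter(residues).values())
    return math.log2(alphabet_size) - entropy

def alignment_information(rows):
    """Per-column information content for equal-length aligned sequences."""
    return [column_information(col) for col in zip(*rows)]

# Hypothetical toy alignment, for illustration only.
toy_alignment = ["LTVKDC", "LTVKDC", "ITVRDC"]
info = alignment_information(toy_alignment)
print([round(x, 2) for x in info])
```

Fully conserved columns reach the maximum, while columns with mixed residues score lower, which is why conserved positions appear as taller stacks.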

Conserved domain A is absent in Kanghan1 (At5g18040) in Columbia (Col) due to an 82 bp deletion compared to the orthologous gene in other species of Arabidopsis. As shown in FIG. 2, the At5g18040 gene in Ecotype #95 contains conserved domain A, before the premature stop codon, so that the existence of this domain on its own does not appear to confer drought tolerance.

Conserved domain B is relatively highly conserved in all members of the Kanghan gene family in Columbia. In contrast, the premature stop codons of Kanghan1, Kanghan2, Kanghan3 and Kanghan5 cause the loss of conserved domain B in #95. Accordingly, the absence of this domain is closely associated with the drought tolerance trait.

Conserved domain C and the tasi-RNA target site are not present in Kanghan5 (At5g18065) in either Columbia or #95.

BLAST searching reveals that Kanghan family genes are widely distributed in Brassicaceae. In addition to the six Kanghan genes in Arabidopsis thaliana, there are 5 members in Arabidopsis lyrata, 6 members in Capsella rubella, 5 members in Brassica rapa, 11 members in Brassica napus, 3 members in Eutrema salsugineum, 1 member in Thellungiella parvula, and at least 24 members in Camelina sativa. Most of these Kanghan genes include all three conserved domains, and all of them contain conserved domain B.
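Per-species tallies of this kind can be derived from BLAST hit descriptions, which conventionally end with the source organism in square brackets. The sketch below counts hits per species for a handful of descriptions of the kind listed in Table 5; the function name is illustrative, and such a tally counts protein records (including isoforms), so it does not necessarily equal the number of distinct genes.

```python
import re
from collections import Counter

def species_of(description):
    """Return the organism named in trailing square brackets, or None if absent."""
    m = re.search(r"\[([^\]]+)\]\s*$", description)
    return m.group(1) if m else None

# A few hit descriptions of the kind listed in Table 5.
descriptions = [
    "hypothetical protein CARUB_v10023817mg [Capsella rubella]",
    "PREDICTED: uncharacterized protein LOC104730345 [Camelina sativa]",
    "PREDICTED: uncharacterized protein LOC104721886 [Camelina sativa]",
    "hypothetical protein EUTSA_v10026005mg [Eutrema salsugineum]",
]
counts = Counter(species_of(d) for d in descriptions)
print(counts.most_common())
```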

TABLE 5

BLASTP search results identifying plant Kanghan proteins based on sequence similarity to the protein encoded by AT4G29770.

| Sequence | Gene | Identities | Positives | Length of Alignment | SEQ ID NO: |
|---|---|---|---|---|---|
| NP_001154277.1 | AT4G29770 | 100 | 100 | 329 | 15 |
| CAB43652.1 | hypothetical protein [Arabidopsis thaliana] | 100 | 100 | 282 | 45 |
| NP_567833.1 | target of trans acting-siR480/255 [Arabidopsis thaliana] | 100 | 100 | 277 | 46 |
| XP_002869410.1 | hypothetical protein ARALYDRAFT_491783 [Arabidopsis lyrata subsp. lyrata] | 85.56 | 89.89 | 277 | 47 |
| XP_006293511.1 | hypothetical protein CARUB_v10023817mg [Capsella rubella] | 73.188 | 83.33 | 276 | 48 |
| XP_010447809.1 | PREDICTED: uncharacterized protein LOC104730345 [Camelina sativa] | 71.326 | 82.8 | 279 | 49 |
| XP_010438266.1 | PREDICTED: uncharacterized protein LOC104721886 [Camelina sativa] | 70.504 | 82.37 | 278 | 50 |
| XP_010433066.1 | PREDICTED: uncharacterized protein LOC104717221 [Camelina sativa] | 70.922 | 81.56 | 282 | 51 |
| XP_010447810.1 | PREDICTED: uncharacterized protein LOC104730347 [Camelina sativa] | 68.1 | 79.57 | 279 | 52 |
| XP_010436343.1 | PREDICTED: uncharacterized protein LOC104720070 [Camelina sativa] | 74.194 | 82.66 | 248 | 53 |
| XP_002869411.1 | predicted protein [Arabidopsis lyrata subsp. lyrata] | 65.233 | 76.34 | 279 | 54 |
| XP_010441644.1 | PREDICTED: uncharacterized protein LOC104724792 [Camelina sativa] | 66.791 | 78.36 | 268 | 55 |
| XP_010494756.1 | PREDICTED: uncharacterized protein LOC104771851 [Camelina sativa] | 66.045 | 77.99 | 268 | 56 |
| XP_010451117.1 | PREDICTED: uncharacterized protein LOC104733215 [Camelina sativa] | 65.108 | 76.26 | 278 | 57 |
| XP_010441643.1 | PREDICTED: uncharacterized protein LOC104724791 [Camelina sativa] | 62.816 | 76.9 | 277 | 58 |
| XP_002871796.1 | predicted protein [Arabidopsis lyrata subsp. lyrata] | 64.234 | 74.09 | 274 | 59 |
| XP_006280936.1 | hypothetical protein CARUB_v10026934mg [Capsella rubella] | 66.415 | 77.36 | 265 | 60 |
| XP_010438262.1 | PREDICTED: uncharacterized protein LOC104721884 [Camelina sativa] | 65.556 | 74.81 | 270 | 61 |
| XP_002871797.1 | predicted protein [Arabidopsis lyrata subsp. lyrata] | 63.296 | 72.28 | 267 | 62 |
| NP_194705.1 | AT4G29760 | 60.432 | 73.38 | 278 | 14 |
| XP_006413298.1 | hypothetical protein EUTSA_v10026005mg [Eutrema salsugineum] | 54.373 | 70.72 | 263 | 64 |
| XP_013601305.1 | PREDICTED: uncharacterized protein LOC106308720 [Brassica oleracea var. oleracea] | 53.409 | 70.45 | 264 | 65 |
| XP_013720359.1 | PREDICTED: uncharacterized protein LOC106424160 [Brassica napus] | 54.444 | 67.78 | 270 | 66 (protein), 170 (cDNA) |
| XP_006412791.1 | hypothetical protein EUTSA_v10027444mg [Eutrema salsugineum] | 54.412 | 68.75 | 272 | 67 |
| XP_013628081.1 | PREDICTED: uncharacterized protein LOC106334325 [Brassica oleracea var. oleracea] | 54.851 | 68.28 | 268 | 68 |
| XP_010436344.1 | PREDICTED: uncharacterized protein LOC104720071 [Camelina sativa] | 62.673 | 70.05 | 217 | 69 |
| XP_009108974.1 | PREDICTED: uncharacterized protein LOC103834660 isoform X2 [Brassica rapa] | 53.333 | 66.67 | 270 | 70 |
| XP_010438269.1 | PREDICTED: uncharacterized protein LOC104721889 [Camelina sativa] | 50.158 | 57.41 | 317 | 71 |
| CDY23253.1 | BnaA08g12930D [Brassica napus] | 52.239 | 68.28 | 268 | 72 (protein), 171 (cDNA) |
| XP_006294905.1 | hypothetical protein CARUB_v10023956mg [Capsella rubella] | 49.811 | 63.77 | 265 | 73 |
| XP_009108973.1 | PREDICTED: uncharacterized protein LOC103834660 isoform X1 [Brassica rapa] | 51.493 | 67.16 | 268 | 74 |
| XP_009127652.1 | PREDICTED: uncharacterized protein LOC103852500 [Brassica rapa] | 51.515 | 68.56 | 264 | 75 |
| XP_013659423.1 | PREDICTED: uncharacterized protein LOC106364376 [Brassica napus] | 52.453 | 65.28 | 265 | 76 |
| NP_197305.1 | AT5G18040 | 48.227 | 60.99 | 282 | 17 |
| CDX68686.1 | BnaA01g07670D [Brassica napus] | 53.815 | 70.28 | 249 | 78 (protein), 165 (cDNA) |
| XP_013668007.1 | PREDICTED: uncharacterized protein LOC106372351 [Brassica napus] | 50 | 67.42 | 264 | 79 |
| XP_013720313.1 | PREDICTED: uncharacterized protein LOC106424116 isoform X2 [Brassica napus] | 50.562 | 66.29 | 267 | 80 |
| AAM64385.1 | unknown [Arabidopsis thaliana] | 49.451 | 60.81 | 273 | 81 |
| XP_013720312.1 | PREDICTED: uncharacterized protein LOC106424116 isoform X1 [Brassica napus] | 50.562 | 66.29 | 267 | 82 |
| XP_013659411.1 | PREDICTED: uncharacterized protein LOC106364365 [Brassica napus] | 52 | 69.6 | 250 | 83 |
| CDY23252.1 | BnaA08g12940D [Brassica napus] | 49.064 | 65.17 | 267 | 84 |
| CDY55618.1 | BnaC03g77520D [Brassica napus] | 47.94 | 65.17 | 267 | 85 (protein), 166 (cDNA) |
| XP_009102300.1 | PREDICTED: uncharacterized protein LOC103828450 [Brassica rapa] | 46.792 | 63.77 | 265 | 86 (protein), 167 (cDNA) |
| XP_009108975.1 | PREDICTED: uncharacterized protein LOC103834660 isoform X3 [Brassica rapa] | 47.94 | 62.55 | 267 | 87 |
| CDY55620.1 | BnaC03g77540D [Brassica napus] | 48.387 | 63.71 | 248 | 88 |
| XP_010495074.1 | PREDICTED: uncharacterized protein LOC104772124 [Camelina sativa] | 49.434 | 58.11 | 265 | 89 |
| XP_013674022.1 | PREDICTED: uncharacterized protein LOC106378439 [Brassica napus] | 47.059 | 65.16 | 221 | 90 |
| XP_010433021.1 | PREDICTED: uncharacterized protein LOC104717183 [Camelina sativa] | 40.892 | 55.76 | 269 | 91 |
| XP_010438210.1 | PREDICTED: uncharacterized protein LOC104721842 [Camelina sativa] | 42.804 | 54.98 | 271 | 92 |
| XP_010447759.1 | PREDICTED: uncharacterized protein LOC104730304 [Camelina sativa] | 42.857 | 55.64 | 266 | 93 |
| XP_006393225.1 | hypothetical protein EUTSA_v10011766mg [Eutrema salsugineum] | 42.912 | 52.87 | 261 | 94 |
| CDY55622.1 | BnaC03g77550D [Brassica napus] | 43.939 | 56.82 | 264 | 95 (protein), 169 (cDNA) |
| KFK22930.1 | hypothetical protein AALP_AAs51418U000100 [Arabis alpina] | 42.339 | 55.24 | 248 | 96 |
| XP_010447760.1 | PREDICTED: uncharacterized protein LOC104730305 [Camelina sativa] | 42.578 | 55.08 | 256 | 97 |
| XP_002894098.1 | F21D18.8 [Arabidopsis lyrata subsp. lyrata] | 39.147 | 52.33 | 258 | 98 |
| XP_009108976.1 | PREDICTED: uncharacterized protein LOC103834661 [Brassica rapa] | 47.541 | 64.48 | 183 | 99 |
| XP_010479661.1 | PREDICTED: uncharacterized protein LOC104758482 [Camelina sativa] | 39.683 | 53.97 | 252 | 100 |
| XP_010462001.1 | PREDICTED: uncharacterized protein LOC104742681 [Camelina sativa] | 39.044 | 53.39 | 251 | 101 |
| KFK30349.1 | hypothetical protein AALP_AA7G249900 [Arabis alpina] | 42.387 | 54.73 | 243 | 102 |
| XP_006304151.1 | hypothetical protein CARUB_v10010162mg [Capsella rubella] | 39.768 | 53.28 | 259 | 103 |
| XP_010482049.1 | PREDICTED: uncharacterized protein LOC104760782 [Camelina sativa] | 38.672 | 51.95 | 256 | 104 |
| NP_680172.2 | AT5G18065 | 63.415 | 73.98 | 123 | 16 |
| XP_010479658.1 | PREDICTED: uncharacterized protein LOC104758479 [Camelina sativa] | 36.822 | 52.33 | 258 | 106 |
| XP_002891651.1 | predicted protein [Arabidopsis lyrata subsp. lyrata] | 38.492 | 51.98 | 252 | 107 |
| NP_175252.1 | AT1G48180 | 35.907 | 50.19 | 259 | 19 |
| XP_006304149.1 | hypothetical protein CARUB_v10010150mg [Capsella rubella] | 36.863 | 51.37 | 255 | 109 |
| XP_002891717.1 | hypothetical protein ARALYDRAFT_892299 [Arabidopsis lyrata subsp. lyrata] | 37.549 | 51.78 | 253 | 110 |
| XP_010471249.1 | PREDICTED: uncharacterized protein LOC104751067 [Camelina sativa] | 36.957 | 50.87 | 230 | 111 |
| AAF79518.1 | F21D18.8 [Arabidopsis thaliana] | 35.125 | 48.39 | 279 | 112 |
| XP_006303339.1 | hypothetical protein CARUB_v10010206mg [Capsella rubella] | 36.8 | 51.6 | 250 | 113 |
| XP_010501962.1 | PREDICTED: uncharacterized protein LOC104779303 [Camelina sativa] | 34.348 | 49.57 | 230 | 114 |
| AAG50884.1 | unknown protein [Arabidopsis thaliana] | 36.111 | 49.6 | 252 | 115 |
| XP_010442215.1 | PREDICTED: uncharacterized protein LOC104725285 [Camelina sativa] | 38.095 | 50.6 | 168 | 116 |
| XP_010500744.1 | PREDICTED: uncharacterized protein LOC104778076 [Camelina sativa] | 30.038 | 44.49 | 263 | 117 |
| KFK24575.1 | hypothetical protein AALP_AAs45078U000200 [Arabis alpina] | 47.581 | 61.29 | 124 | 118 |
| NP_175578.2 | AT1G51670 | 36.628 | 49.42 | 172 | 18 |
| XP_013684707.1 | PREDICTED: uncharacterized protein LOC106389038 isoform X1 [Brassica napus] | 31.818 | 50 | 176 | 120 |
| CDY43538.1 | BnaA01g07060D [Brassica napus] | 31.818 | 50 | 176 | 121 |
| XP_013684772.1 | PREDICTED: uncharacterized protein LOC106389038 isoform X2 [Brassica napus] | 31.818 | 50 | 176 | 122 |
| XP_013596364.1 | PREDICTED: uncharacterized protein LOC106304487 isoform X3 [Brassica oleracea var. oleracea] | 35.537 | 53.72 | 121 | 123 |
| XP_013750812.1 | PREDICTED: uncharacterized protein LOC106453111 isoform X3 [Brassica napus] | 33.871 | 50 | 124 | 124 |
| XP_013750806.1 | PREDICTED: uncharacterized protein LOC106453111 isoform X1 [Brassica napus] | 33.871 | 50 | 124 | 125 |

Example 4: Modulating Abiotic Stress Response in Wheat with Kanghan Genes

This example illustrates a genetic modification of a wild-type wheat by gene gun mediated transformation using a Kanghan gene construct, to modulate an abiotic stress response, in this case conferring heat tolerance. Transgenic constructs for overexpression of Arabidopsis Kanghan family genes in wheat were produced using the monocot-specific overexpression vector PANIC5E. This vector was designed for stable transformation and overexpression of heterologous Kanghan genes in wheat. Overexpression of Arabidopsis Kanghan1 (At5g18040) in a wild-type wheat cultivar (Fielder) was achieved in this way by gene gun mediated transformation. The construct used to perform this transformation is shown in FIG. 8.

To illustrate the heat tolerance of the wheat transgenic lines, three-week-old seedlings of both wild-type and T1 transgenic lines were heat treated at 42/38° C. (day/night). After two weeks of heat treatment, the plants were allowed to recover at normal growth temperature, and phenotypes were observed. Heat tolerance was clearly observed in T1 transformants compared to non-transgenic plants under heat treatment. Non-transgenic plants displayed wilt symptoms or died. The transformants, on the other hand, recovered after transfer to normal growth temperature conditions, and were able to grow normally and transition to reproductive growth.

To further illustrate the heat tolerance of the wheat transgenic lines, three-week-old seedlings of both wild-type and T1 transgenic lines were subjected to 40/38° C. (day/night) for three weeks, followed by a three-week recovery period at 25° C. After this recovery period, the transgenic plants had fully recovered whereas the control plants failed to recover (FIGS. 9A and 9B). After a further seven weeks at 25° C., the transgenic plants reached maturity and produced seeds (FIG. 9C).

Under standard growth conditions of 23° C. day/18° C. night, a 16 h photoperiod (16 h light/8 h dark), and a light intensity of 200 μmol m⁻² s⁻¹, wild-type and transgenic plants are visually indistinguishable. However, as determined by infrared thermal imaging using a FLIR T640 infrared camera, the canopy temperature of T1 transgenic wheat plants is significantly lower (FIG. 10).

These studies illustrate the utility of the Kanghan genes in modulating abiotic stress response in crop species such as wheat, in this case to improve heat tolerance.

Example 5: Identifying Kanghan Homologs in Brassica napus

A BLAST sequence search was carried out on available genome and transcript data from Brassica napus to identify potential homologues of at4g29770 (SEQ ID NOs: 2 and 15), at4g29760 (SEQ ID NOs: 1 and 14), at5g18040 (SEQ ID NOs: 4 and 17), at5g18065 (SEQ ID NOs: 3 and 16), at1g51670 (SEQ ID NOs: 5 and 18), and at1g48180 (SEQ ID NOs: 13 and 19). The potential candidates identified are provided in Table 6.

TABLE 6

Homologs of Arabidopsis thaliana Kanghan family genes in Brassica napus.

| Homologs of at4g29770, at4g29760, at5g18040, and at5g18065 | Homologs of at1g51670 and at1g48180 |
|---|---|
| BnaA01g06470D (SEQ ID NO: 79) | BnaC01g08520D (SEQ ID NO: 63) |
| BnaA07g02270D (SEQ ID NO: 86) | BnaC01g08490D (SEQ ID NO: 77) |
| BnaA08g12920D (SEQ ID NO: 66) | BnaA01g07060D (SEQ ID NO: 121) |
| BnaA08g12930D (SEQ ID NO: 72) | |
| BnaA08g12940D (SEQ ID NO: 84) | |
| BnaA01g07670D (SEQ ID NO: 78) | |
| BnaC03g77520D (SEQ ID NO: 85) | |
| BnaC03g77540D (SEQ ID NO: 88) | |
| BnaC03g77550D (SEQ ID NO: 95) | |

A DNA neighbor phylogenetic tree of the Brassica napus Kanghan gene candidates and their Arabidopsis thaliana counterparts is provided in FIG. 11 and a protein neighbor phylogenetic tree is provided in FIG. 12. A DNA neighbor phylogenetic tree of the Brassica napus Kanghan gene candidates is shown in FIG. 13. The candidates indicated by arrows were selected for targeting by RNAi.

A sequence alignment of the Brassica napus homologues of at4g29770, at4g29760, at5g18040, and at5g18065 is provided in FIG. 20. These homologues may be characterized by their consensus sequences: a first 100% consensus sequence DsucpAshlAssLISstRhhhpLDp.hTpYSsQaLVDNAh . . . p (SEQ ID NO: 126) and a second 100% consensus sequence ptsplhl+tsLthAhKcGlP+ . . . WsHlGsl . . . Ps.h.h.shV.hKGphhEsKp.-tA.cLhppt.luAKLhVFsPph-h . . . tha.G.uG . . . topYVGLRDshlsu.tphps.shhpVplhYKKp.thhpVuhs.hh . . . ppus. pppltP.hLLVDFhlPph.h (SEQ ID NO: 127).

A sequence alignment of the Brassica napus homologues of at1g51670 and at1g48180 is provided in FIG. 21. These homologues may be characterized by their consensus sequences: a first 100% consensus sequence MAD.HLhPtLTRHRHTVPsISDDFYNYMKLIpKT-PEIMSKLLPILRTIPDSGIQLlp (SEQ ID NO: 128) and a second 100% consensus sequence R-chpL-cQYAVLQYD-HEhVWAVIAAp.l.h (SEQ ID NO: 129).

Example 6: Targeting Brassica napus Kanghan Genes by RNAi

Primer Design

Two conserved fragments from 12 putative Brassica napus Kanghan genes, identified based on ClustalW multiple alignment, were used to design two pairs of RNAi primers. The reverse primers were designed to include a BamH1 restriction site and a Sal1 restriction site to facilitate cloning.

The first primer pair was designed to target BnaC03g77540D (LOC106364365):

RNAiF1 GP438:

(SEQ ID NO: 20)

TAGATTCTGCTGAGAGAGCCGCTAC

RNAiR1 GP439:

(SEQ ID NO: 21)

GGATCCGTCGACGCACCTATGGGTCCATGCTTTAAC

The second primer pair was designed to target BnaA08g12920D (LOC106424160):

RNAiF2 GP440:

(SEQ ID NO: 22)

TCATCCAGATTGCCAACGAG

RNAiR2 GP441:

(SEQ ID NO: 23)

GGATCCGTCGACACGCATCCTCCAGTGTCTTAG
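
As a quick consistency check on the primers listed above, the following minimal Python sketch confirms which primers carry the two cloning sites. It assumes the standard recognition sequences GGATCC (BamH1) and GTCGAC (Sal1), which are not spelled out in the text. The helper name `cloning_sites` is hypothetical and is not part of the original disclosure.

```python
# Standard recognition sequences (an assumption; not stated in the text).
SITES = {"BamH1": "GGATCC", "Sal1": "GTCGAC"}

# Primer sequences copied from SEQ ID NOs: 20-23 as listed above.
PRIMERS = {
    "RNAiF1 GP438": "TAGATTCTGCTGAGAGAGCCGCTAC",
    "RNAiR1 GP439": "GGATCCGTCGACGCACCTATGGGTCCATGCTTTAAC",
    "RNAiF2 GP440": "TCATCCAGATTGCCAACGAG",
    "RNAiR2 GP441": "GGATCCGTCGACACGCATCCTCCAGTGTCTTAG",
}


def cloning_sites(primer):
    """Return the names of the sites found in a primer, in 5'->3' order."""
    hits = [(primer.find(site), name)
            for name, site in SITES.items() if site in primer]
    return [name for _, name in sorted(hits)]


for name, seq in PRIMERS.items():
    print(name, cloning_sites(seq) or "no site")
```

Under these assumptions, only the two reverse primers report both sites, which matches the statement that the reverse primers were designed to include them.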

Production of BnKanghan RNAi Construct and Establishment of Brassica napus RNAi Lines

To generate a cDNA library of Brassica napus, total RNA was isolated from 3-week-old leaves of canola wild type ‘Hero’ using the Plant RNeasy Mini Kit (Qiagen). The RNA samples were then used for library construction with the QuantiTect Reverse Transcription Kit (Qiagen). The primer pairs RNAiF1 GP438 (SEQ ID NO: 20)+RNAiR1 GP439 (SEQ ID NO: 21) and RNAiF2 GP440 (SEQ ID NO: 22)+RNAiR2 GP441 (SEQ ID NO: 23) were used separately to amplify fragments of the two target BnKanghan genes from the resulting cDNA library. Each PCR product was isolated and cloned into the pGEM®-T vector (Promega, USA). A map of the pGEM-T vector is provided in FIG. 14. Two copies of each Kanghan gene fragment were then subcloned into the pCAMBIA 1301-35S-Int-T7 vector in opposite orientations using a Pst1, Sal1 digest and a BamH1, Sac1 digest to generate two RNAi constructs, one for each gene fragment. A map of the pCAMBIA 1301-35S-Int-T7 vector is provided in FIG. 15 and a partial map of the resulting RNAi constructs is provided in FIG. 16.

Next, canola wild type ‘Hero’ was genetically modified using both of these completed RNAi constructs through Agrobacterium-mediated transformation, with the aim of obtaining increased drought tolerance. Positive transformants were confirmed using a pair of hygromycin-specific primers (HptF TACACAGCCATCGGTCCAGA (SEQ ID NO: 24) and HptR GTAGGAGGGCGTGGATATGTC (SEQ ID NO: 25)). T1 positive transformants from the two different constructs were then crossed. In the C2 generation, lines harboring both constructs were selected for further evaluation of BnKanghan family gene silencing and drought tolerance traits.

To assess the expression of BnKanghan family genes in the transgenic and crossed lines, primer pairs were designed for qRT-PCR assays of seven candidate Kanghan genes from Brassica napus. The targets of these primer pairs are identified in Table 7. In total, 12 C2-generation lines harboring both RNAi constructs were selected to detect changes in the expression of BnKanghan family genes. Each line tested showed decreased expression of at least three BnKanghan genes. The most commonly suppressed genes were BnaA07g02270D, BnaA08g12920D, and BnaC03g77550D, followed by BnaC03g77540D and BnaC01g08490D.

TABLE 7

qRT-PCR primers targeting BnKanghan family genes.

Primer no.	SEQ ID NO:	Primer sequence	Product length (bp)	Kanghan gene
GP635	26	CGCTACGAGGCACGTACTCAAT	103	BnaA07g02270D
GP636	27	CTCGGTCTTCCCCGCTTTC		
GP637	28	GCTTAGAGACGTGATCCTGGTAGC	128	BnaA08g12920D
GP638	29	CCAGTGTGGTGAACATACGGC		
GP639	30	GTTTTGTTGGTCTCTTCTCTTTGC	71	BnaC01g07670D
GP640	31	TTCTTAAGAGGCGTTTCAGATGG		
GP641	32	TGATTTGGGTTTTGCCTGATAC	69	BnaC03g77540D
GP642	33	GAAACAAACCATAAATGAGTTGCC		
GP645	34	CATTTGGGATGTGTCGATTGAG	165	BnaC03g77550D
GP646	35	CCCACGTAGCTTGTTCCGTT		
GP649	36	AACACTGTCACGCAGATTGCC	124	BnaA01g06470D
GP650	37	CTGTCCAGGTTAGCTACCATACGA		
GP655	38	CGGTATCCAACTCATTCGAAGG	121	BnaC01g08490D
GP656	39	TCAAGTATATACTGGGTTGGCTGC		
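
The text does not state how the qRT-PCR expression changes were calculated. One common approach is the comparative Ct (2^-ΔΔCt) method, and the sketch below illustrates it purely as an assumption. The function name, the use of a single reference gene, and all Ct values are hypothetical and are not data from this disclosure.

```python
def relative_expression(ct_target_line, ct_ref_line, ct_target_wt, ct_ref_wt):
    """Fold change of a target gene in a transgenic line relative to wild type.

    Uses the comparative Ct method (an assumed method, not stated in the text):
    each target Ct is normalized to a reference gene, then the line is compared
    to the wild type.
    """
    delta_line = ct_target_line - ct_ref_line  # normalized Ct, transgenic line
    delta_wt = ct_target_wt - ct_ref_wt        # normalized Ct, wild type
    return 2.0 ** -(delta_line - delta_wt)


# Hypothetical Ct values for illustration only.
fold = relative_expression(ct_target_line=26.0, ct_ref_line=18.0,
                           ct_target_wt=24.0, ct_ref_wt=18.0)
print(f"{fold:.2f}")  # 0.25, i.e. a four-fold decrease in this made-up case
```

A fold change below 1 would correspond to decreased expression of the targeted BnKanghan gene in the RNAi line.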

Testing Canopy Temperature and Drought Tolerance of Brassica napus RNAi Lines

Individual lines C2-83-20 and C2-83-10 each showed decreased expression of six BnKanghan genes. These two lines were selected for further drought tolerance measurements. To predict the potential drought tolerance of the C2-83-20 and C2-83-10 lines, canopy temperatures were measured using an infrared camera. In comparison to wild type plants, higher canopy temperatures were observed for the transgenic plants (FIG. 17), indicating a lower leaf water potential. These RNAi phenotypes are similar to those of loss-of-function alleles of At Kanghan genes in Arabidopsis, which suggests a similar role for the BnKanghan genes in canola.

To assess the drought tolerance of the transgenic plants, four-week-old plants of the wild type and of these two transgenic lines were subjected to drought treatment. The same amounts of soil and water were supplied to each individual plant before treatment, and then the water supply was stopped. After two weeks of drought treatment, the plants were re-watered to allow recovery. The resulting phenotypes are shown in FIGS. 18 and 19. Increased drought tolerance was clearly observed in the transgenic lines compared to wild type plants under drought conditions that led to wilting or death of the wild type plants. The transformants, on the other hand, recovered after being returned to normal watering conditions and were able to grow normally and transition to reproductive growth. This demonstrates that silencing of BnKanghan family genes in crop species, such as canola, can improve drought tolerance.

Example 7: Targeting Brassica napus Kanghan Genes by CRISPR

The pan-genome architecture of Brassica napus was recently released (Song et al., 2020), making it possible to identify all members of the Kanghan gene family in B. napus. Furthermore, through CRISPR/Cas genome editing technologies, knock-out of designated member(s) of the Kanghan gene family can be used to generate non-GMO B. napus lines with high abiotic stress resistance traits.

Identifying Kanghan Homologs in Brassica napus and Other Brassicaceae Species

Genome-wide identification of Kanghan gene family members was performed in multiple Brassicaceae species for which whole genome sequence information has been released. Kanghan homologs in A. thaliana, A. lyrata, A. halleri, B. napus, B. oleracea and B. rapa were identified and a phylogenetic tree was built based on their protein sequences (FIG. 22). The genes included in the phylogenetic tree are identified in Table 8. Pairwise analysis between members will be conducted to identify the closest homologs of each member in the different species. After confirmation of their phylogenetic relationships, CRISPR/Cas knock-outs will be designed to target different combinations of homologs in B. napus.

TABLE 8

Sequences used to produce the phylogenetic tree provided in FIG. 22

Gene name	Species	SEQ ID NO:
BnaA08g12920D	Brassica napus	66
Bra010276	Brassica rapa	144
B03g175440	Brassica oleracea	68
BnaC03g77550D	Brassica napus	95
BnaA08g12930D	Brassica napus	72
Bra039897	Brassica rapa	86
BnaA07g02270D	Brassica napus	86
BnaC03g77520D	Brassica napus	85
Bra010278	Brassica rapa	147
B01g011940	Brassica oleracea	65
BnaA01g07670D	Brassica napus	78
Bra011210	Brassica rapa	75
g04250	Arabidopsis halleri	150
At4g29760 (KH4)	Arabidopsis thaliana	14
g04249	Arabidopsis halleri	151
AI scaffold 0007 1135	Arabidopsis lyrata	54
fgenesh2 kg.7 1216 AT4G29770.1	Arabidopsis lyrata	47
g04248	Arabidopsis halleri	154
At4g29770 (KH2)	Arabidopsis thaliana	15
At5g18040 col (KH1)	Arabidopsis thaliana	17
At5g18040 95 (KH1)	Arabidopsis thaliana	155
AI scaffold 0006 1720	Arabidopsis lyrata	59
gl3293	Arabidopsis halleri	157
At5g18065 (KH3)	Arabidopsis thaliana	16
AI scaffold 0006 1721	Arabidopsis lyrata	62
g13290	Arabidopsis halleri	159
fgenesh1 pm.C scaffold 1003033	Arabidopsis lyrata	98
g17457	Arabidopsis halleri	161
At1g48180 (KH6)	Arabidopsis thaliana	19
scaffold 105093.1	Arabidopsis lyrata	110
At1g51670 (KH5)	Arabidopsis thaliana	18
AI scaffold 0001 4560	Arabidopsis lyrata	107
g21951	Arabidopsis halleri	164

Multiplexed Gene Editing Through an Optimized CRISPR/Cas9 Toolkit

A multiplexed toolkit (Cermak et al., 2017) has been selected and optimized for application in B. napus. This toolkit can carry up to 12 guide RNAs (gRNAs), allowing multiple target genes to be knocked out with a single construct. Reducing the number of constructs should reduce the cost of plant transformations and of downstream molecular confirmation of gene editing. Targeted gRNA design will be performed using multiple bioinformatic tools to avoid potential off-targets and to cover as many Kanghan homologs as possible. gRNAs targeting conserved regions and specific regions of Kanghan family genes will be confirmed after genome-wide SNP/indel screening for duplicates and homologs in the different subgenomes (AA and CC). The final 6 selected gRNAs will be connected in tandem with Csy-type ribonuclease 4 (Csy4) for simultaneous expression from a Pol II promoter (FIG. 23) using Gibson assembly (Gibson et al., 2009) (FIG. 24) with a specifically designed primer list (Table 9). The final plasmid for plant transformation will be constructed following the Golden Gate® protocol to link Cas9, the gRNA cassette and selection markers together in the pTRANS_220d backbone (FIG. 25), a binary vector for T-DNA insertion with neomycin phosphotransferase II (npt II) selection (FIG. 26).

TABLE 9

Primers for gRNA production

Primer name	Sequence	SEQ ID NO:
DG564	TGCTCTTCGCGCTGGCAGACATACTGTCCCAC	130
DG565	TCGTCTCCAGCGCACTCGAGCTGCCTATACGGCAGTGAAC	131
DG566	TCGTCTCACGCTTTCAAGGAGTTTTAGAGCTAGAAATAGC	132
DG567	TCGTCTCCCTTTGAAAGAAGCTGCCTATACGGCAGTGAAC	133
DG568	TCGTCTCAAAAGCGTACTCGGTTTTAGAGCTAGAAATAGC	134
DG569	TCGTCTCCCTCTCAGCAGAACTGCCTATACGGCAGTGAAC	135
DG570	TCGTCTCAAGAGAGCTGCTAGTTTTAGAGCTAGAAATAGC	136
DG571	TCGTCTCCGCCGAGTACTCGCTGCCTATACGGCAGTGAAC	137
DG572	TCGTCTCACGGCTCAGTTCCGTTTTAGAGCTAGAAATAGC	138
DG573	TCGTCTCCGCATTGGGCACACTGCCTATACGGCAGTGAAC	139
DG574	TCGTCTCAATGCTCTCTCCTGTTTTAGAGCTAGAAATAGC	140
DG575	TCGTCTCCACCATACGAGCACTGCCTATACGGCAGTGAAC	141
DG576	TCGTCTCATGGTAGCTAACCGTTTTAGAGCTAGAAATAGC	142
DG577	TGCTCTTCTGACCTGCCTATACGGCAGTGAAC	143
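
The structure of the Table 9 primers can be inspected with a short script. The sketch below is an observation drawn from the sequences themselves, not a description taken from the text. It looks for the motifs CGTCTC and GCTCTTC, which match the recognition sequences of the type IIS enzymes BsmBI/Esp3I and SapI, and for the two 3′ tails shared among the primers. One tail, GTTTTAGAGCTAGAAATAGC, matches the start of a commonly used gRNA scaffold. The role of the other tail is not stated in the text and is left uninterpreted. The names `describe`, `TAIL_SCAFFOLD` and `TAIL_OTHER` are hypothetical.

```python
MOTIF_BSMBI = "CGTCTC"    # matches the BsmBI/Esp3I recognition sequence
MOTIF_SAPI = "GCTCTTC"    # matches the SapI recognition sequence
TAIL_SCAFFOLD = "GTTTTAGAGCTAGAAATAGC"  # start of a common gRNA scaffold
TAIL_OTHER = "CTGCCTATACGGCAGTGAAC"     # second shared tail, role not stated

# Sequences copied from Table 9 (SEQ ID NOs: 130-143).
PRIMERS = {
    "DG564": "TGCTCTTCGCGCTGGCAGACATACTGTCCCAC",
    "DG565": "TCGTCTCCAGCGCACTCGAGCTGCCTATACGGCAGTGAAC",
    "DG566": "TCGTCTCACGCTTTCAAGGAGTTTTAGAGCTAGAAATAGC",
    "DG567": "TCGTCTCCCTTTGAAAGAAGCTGCCTATACGGCAGTGAAC",
    "DG568": "TCGTCTCAAAAGCGTACTCGGTTTTAGAGCTAGAAATAGC",
    "DG569": "TCGTCTCCCTCTCAGCAGAACTGCCTATACGGCAGTGAAC",
    "DG570": "TCGTCTCAAGAGAGCTGCTAGTTTTAGAGCTAGAAATAGC",
    "DG571": "TCGTCTCCGCCGAGTACTCGCTGCCTATACGGCAGTGAAC",
    "DG572": "TCGTCTCACGGCTCAGTTCCGTTTTAGAGCTAGAAATAGC",
    "DG573": "TCGTCTCCGCATTGGGCACACTGCCTATACGGCAGTGAAC",
    "DG574": "TCGTCTCAATGCTCTCTCCTGTTTTAGAGCTAGAAATAGC",
    "DG575": "TCGTCTCCACCATACGAGCACTGCCTATACGGCAGTGAAC",
    "DG576": "TCGTCTCATGGTAGCTAACCGTTTTAGAGCTAGAAATAGC",
    "DG577": "TGCTCTTCTGACCTGCCTATACGGCAGTGAAC",
}


def describe(seq):
    """Return (motif label, 3' tail label) for one primer sequence."""
    if MOTIF_BSMBI in seq:
        motif = "BsmBI"
    elif MOTIF_SAPI in seq:
        motif = "SapI"
    else:
        motif = "none"
    if seq.endswith(TAIL_SCAFFOLD):
        tail = "scaffold"
    elif seq.endswith(TAIL_OTHER):
        tail = "other"
    else:
        tail = "unique"
    return motif, tail


summary = {name: describe(seq) for name, seq in PRIMERS.items()}
scaffold_primers = [n for n, (_, tail) in summary.items() if tail == "scaffold"]
for name, (motif, tail) in summary.items():
    print(name, motif, tail)
print("primers ending in the scaffold tail:", len(scaffold_primers))
```

Six primers end in the scaffold tail, which agrees with the six gRNAs described above. Each of those primers carries a distinct 12-nucleotide segment between the CGTCTC motif and the tail.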

Generating Transgenic Plants Through Agrobacterium-Mediated Transformation

Transformation will be conducted in canola cultivar DH12075 with the generated CRISPR/Cas9 construct through Agrobacterium-mediated transformation. Positive transformants will be confirmed in T0 generation transgenic lines using a pair of npt II-specific primers.

High Throughput Validation for the Gene Editing

Mutations in targeted genes from the T0 generation will be identified. To detect editing of all ten KH homologs in B. napus across hundreds of T0 and T1 generation positive lines, a cost-efficient, high-throughput detection method is desired. A workflow using a droplet digital PCR (ddPCR) assay will be established to achieve high-throughput validation. Fluorescent probes targeting gDNA-associated regions will be designed, and the corresponding primers will be selected based on the SNP/indel information obtained above. Thus, gene editing information for every Kanghan homolog in each transgenic line will be identified. The combination of different mutations in different homologs will provide transgenic materials to investigate knock-out lines of At5g18040 (KH1) homologs, At4g29770 (KH2) homologs, At5g18065 (KH3) homologs, At4g29760 (KH4) homologs and At1g51670 (KH5) homologs, respectively, as well as knock-out lines for all Kanghan gene family members in canola. Non-GMO lines with successful gene editing in Kanghan gene(s) but without the transformed plasmid will be identified in the T1 and T2 generations.
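
The text does not describe how ddPCR droplet counts would be converted into editing estimates. The sketch below shows one possible calculation under explicit assumptions: a two-probe "drop-off" layout in which a reference probe detects every allele of a homolog and a second probe detects only unedited alleles, together with the standard Poisson correction for droplet occupancy. The probe layout, function names and droplet counts are all hypothetical.

```python
import math


def copies_per_droplet(negative, total):
    """Poisson-corrected mean copies per droplet from the negative-droplet count."""
    if not 0 < negative <= total:
        raise ValueError("negative must be in the range 1..total")
    return -math.log(negative / total)


def edited_fraction(total, ref_negative, wt_negative):
    """Estimate the fraction of edited alleles for one homolog.

    ref_negative: droplets negative for the reference probe (no template at all)
    wt_negative:  droplets negative for the unedited-allele probe
    Both probe roles are assumptions made for this sketch.
    """
    lam_all = copies_per_droplet(ref_negative, total)
    lam_wt = copies_per_droplet(wt_negative, total)
    if lam_all == 0:
        raise ValueError("no template detected by the reference probe")
    return (lam_all - lam_wt) / lam_all


# Hypothetical droplet counts for illustration only.
frac = edited_fraction(total=20000, ref_negative=10000, wt_negative=14142)
print(f"{frac:.2f}")  # 0.50
```

In this made-up case, about half of the detected alleles lack the unedited sequence, as would be expected for a heterozygous edit at a single homolog.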

While the present application has been described with reference to specific examples, it is to be understood that the application is not limited to the disclosed examples. To the contrary, the present application is intended to cover various modifications and equivalent arrangements encompassed by the scope of the appended claims.

All publications, patents and patent applications are herein incorporated by reference in their entirety to the same extent as if each individual publication, patent or patent application was specifically and individually indicated to be incorporated by reference in its entirety. Where a term in the present application is found to be defined differently in a document incorporated herein by reference, the definition provided herein is to serve as the definition for the term.

REFERENCES

Altschul et al. (1990), J Mol. Biol. 215:403-10.

Armstrong, C. L., and Green, C. E., Planta 165:322-332 (1985).

Ausubel, et al. (eds), 1989, Current Protocols in Molecular Biology, Vol. 1, Green Publishing Associates, Inc., and John Wiley & Sons, Inc., New York, at p. 2.10.3.

Binding, “Regeneration of Plants, Plant Protoplasts”, CRC Press, Boca Raton, 1985.

Brooks, Plant Physiology September 2014 pp 114.247577.

Caliando et al, Nature Communications 6:6989 (2015).

Cermak, T. et al. (2017). A Multipurpose Toolkit to Enable Advanced Genome Engineering in Plants. Plant Cell 29: 1196-1217.

Close, K. R., and Ludeman, L. A., Planta Science 52:81-89 (1987).

Crooks et al., Genome Research, 14:1188-1190, (2004).

Dunsmuir, et al Nucleic Acids Res, (1983) 11:4177-4183.

EP0255378A1

EP0409625A1

EP0409629A1

Evans et al. “Protoplasts Isolation and Culture”, Handbook of Plant Cell Culture, Macmillan Publishing Company, New York, 1983.

Feng, Cell Research (2013) 23:1229-1232.

Fraley et al., Proc. Nat'l Acad. Sci. USA 80:4803 (1983).

Fromm et al., Proc. Natl. Acad. Sci. USA 82:5824 (1985).

Gamborg and Phillips, “Plant Cell, Tissue and Organ Culture, Fundamental Methods”, Springer Berlin, 1995.

Gibson, D. G. et al. (2009). Enzymatic assembly of DNA molecules up to several hundred kilobases. Nat. Methods 6: 343-345.

Gordon-Kamm, et al. “The Plant Cell” 2:603 (1990).

Harlow and Lane (1988) Antibodies, A Laboratory Manual, Cold Spring Harbor Publications, New York.

Hatfield and Prueger (2015), Weather and Climate Extremes 10:4-10.

Henikoff and Henikoff (1992) Proc. Natl. Acad. Sci. USA 89: 10915-10919.

Horsch et al. Science 233: 496 (1984).

International Patent Publication WO-A 8 809 334.

International Patent Publication WO-A 9 113 980.

Klee et al., Ann. Rev. of Plant Phys. 38:467 (1987).

Klein, et al., Nature 327: 70 (1987).

Kumar et al, Journal of Plant Biochemistry and Biotechnology, July 2013.

Li et al, Targeted Plant Genome Editing via the CRISPR/Cas9 Technology, Methods in Molecular Biology, volume 1284, pp 239-255, 10 Feb. 2015.

Mein et al., Genome Research 10: 330-343, 2000.

Morrell et al., Nat Rev Genet. 2011 Dec. 29; 13(2):85-96.

Myakishev et al., 2001, Genome Research 11: 163-169.

Napoli et al., 1990 Plant Cell 2: 279-289.

Needleman and Wunsch (1970) J. Mol. Biol. 48:443.

Nekrasov, Plant Methods 2013, 9:39.

Nilsson et al., Flowering-Time Genes Modulate the Response to LEAFY Activity, Genetics, 150(1): 403-410, 1998.

Orita et al., Proc. Natl. Acad. Sci. U.S.A. 86: 2766-2770, 1989.

Paszkowski et al. EMBO J. 3:2717 (1984).

Pearson and Lipman (1988) Proc. Natl. Acad. Sci. USA 85: 2444.

Peat et al., Plant Mol. Biol, (1989) 13:639-651.

Pokalsky, et al., Nucleic Acids Res, (1989) 17:4661-4673.

Rajeevkum et al., 2015 Front Plant Sci 6:693.

Rizhsky et al., Plant Physiology, April 2004, Vol. 134, pp. 1683-1696.

Rogers et al., Methods Enzymol. 118:627 (1986).

Schneider and Stephens, 1990, Nucleic Acids Res. 18:6097-610.

Shan, Nature Biotechnology 31, 686-688 (2013).

Smith and Waterman (1981) Adv. Appl. Math 2: 482.

Song, J. M. et al. (2020). Eight high-quality genomes reveal pan-genome architecture and ecotype differentiation of Brassica napus. Nat. Plants 6: 34-45.

Tanhuanpaa et al., 1999, Molecular Breeding 4: 543-550.

Tapp et al., BioTechniques 28: 732-738.

Taylor W. R. (1986) J. Theor. Biol. 119:205-218

Thiagarajah and Stringham (1993), A comparison of genetic segregation in traditional and microspore-derived populations of Brassica juncea L. Czern and Coss. Plant Breeding 111:330-334.

Tijssen, 1993, Laboratory Techniques in Biochemistry and Molecular Biology Hybridization with Nucleic Acid Probes, Part I, Chapter 2 “Overview of principles of hybridization and the strategy of nucleic acid probe assays”, Elsevier, N.Y.

US 2009/0100536

US 2012/0124693

US 2014/0223607

U.S. Pat. No. 4,554,101

U.S. Pat. No. 4,684,611

U.S. Pat. No. 4,743,548

U.S. Pat. No. 4,801,540

U.S. Pat. No. 4,940,838

U.S. Pat. No. 4,943,674

U.S. Pat. No. 4,945,050

U.S. Pat. No. 4,987,071

U.S. Pat. No. 5,015,580

U.S. Pat. No. 5,149,655

U.S. Pat. No. 5,175,095

U.S. Pat. No. 5,231,019

U.S. Pat. No. 5,464,763

U.S. Pat. No. 5,466,587

U.S. Pat. No. 5,591,610

U.S. Pat. No. 5,723,765

U.S. Pat. No. 6,603,061

U.S. Pat. No. 7,868,149

Xie, Mol Plant. 2013 November; 6(6):1975-83.

Xu, Rice 2014, 7:5 (2014).

Yang et al., Molecular Plant, Volume 3, Issue 3, May 2010, Pages 469-490.

Number	Name	Date	Kind
4554101	Hopp	Nov 1985	A
4743548	Crossway et al.	May 1988	A
4801540	Hiatt et al.	Jan 1989	A
4940838	Schilperoort	Jul 1990	A
4943674	Houck et al.	Jul 1990	A
4945050	Sanford et al.	Jul 1990	A
4987071	Cech et al.	Jan 1991	A
5015580	Christou et al.	May 1991	A
5149655	McCabe et al.	Sep 1992	A
5175095	Martineau et al.	Dec 1992	A
5231019	Paszkowski et al.	Jul 1993	A
5283184	Jorgensen et al.	Feb 1994	A
5464763	Schilperoort et al.	Nov 1995	A
5466587	Fitzpatrick-McElligott et al.	Nov 1995	A
5591610	Cech et al.	Jan 1997	A
5723765	Oliver et al.	Mar 1998	A
6603061	Armstrong et al.	Aug 2003	B1
6603062	Schmidt et al.	Aug 2003	B1
7867149	Webber et al.	Jan 2011	B1
8030473	Carrington et al.	Oct 2011	B2
8476422	Carrington et al.	Jul 2013	B2
20060107345	Alexandrov et al.	May 2006	A1
20090100536	Adams et al.	Apr 2009	A1
20100192237	Ren et al.	Jul 2010	A1
20120124693	Guillen-Portal	May 2012	A1
20120198585	Xiao	Aug 2012	A1
20140223607	Kuvshinov et al.	Aug 2014	A1

Number	Date	Country
2300692	Aug 2000	CA
103172716	Jun 2013	CN
104561040	Apr 2015	CN
0255378	Feb 1988	EP
0409625	Jan 1991	EP
0409629	Jan 1991	EP
20120119211	Oct 2012	KR
8809334	May 1988	WO
9113980	Mar 1991	WO

	Number	Date	Country
Parent	16131395	Sep 2018	US
Child	17462586		US
Parent	PCT/IB2017/051474	Mar 2017	US
Child	16131395		US
