SOYBEAN RESISTANT TO CYST NEMATODES

Abstract
A transgenic soybean plant or parts thereof, resistant to soybean cyst nematodes, transformed to express Glyma18g02570, Glyma18g02580, or Glyma18g02590, or a variant thereof. Also provided is a method of making such a plant. Also provided is an artificial DNA construct encoding Glyma18g02570, Glyma18g02580, or Glyma18g02590, or a variant thereof.
Description
MATERIAL INCORPORATED-BY-REFERENCE

The Sequence Listing, which is a part of the present disclosure, includes a computer readable form comprising nucleotide and/or amino acid sequences of the present invention. The subject matter of the Sequence Listing is incorporated herein by reference in its entirety.


BACKGROUND OF THE INVENTION

Soybean (Glycine max (L.) Merr.) is a major crop that provides a sustainable source of protein and oil worldwide. Soybean cyst nematode (SCN), Heterodera glycines Ichinohe, is a major constraint to soybean production. This nematode causes more than $1 billion in yield losses annually in the United States alone, making it the most economically important pathogen of soybeans. Although planting of resistant cultivars forms the core management strategy for this pathogen, nothing is known about the nature of resistance. Moreover, the increase in virulent populations of this parasite on most known resistance sources necessitates the development of novel approaches for control.


SUMMARY OF THE INVENTION

Disclosed herein are methods of transforming a soybean plant using artificial DNA constructs to increase resistance to soybean cyst nematode (SCN).


One aspect provides a transgenic soybean resistant to SCN, or a seed, plant part, or progeny thereof. In some embodiments, the soybean plant can be transformed with an artificial DNA construct. In some embodiments, the DNA construct includes, as operably associated components in the 5′ to 3′ direction of transcription, a promoter that functions in a soybean. In some embodiments, the DNA construct also includes a transcribable nucleic acid molecule.


In some embodiments, the transcribable nucleic acid molecule includes a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 (Glyma18g02570), SEQ ID NO: 2 (Glyma18g02580), and SEQ ID NO: 3 (Glyma18g02590). In some embodiments, the transcribable nucleic acid molecule includes a nucleotide sequence at least 95% identical to a nucleotide sequence selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 2, and SEQ ID NO: 3 encoding a polypeptide having Glyma18g02570, Glyma18g02580, or Glyma18g02590 activity, respectively.


In some embodiments, the transcribable nucleic acid molecule includes a nucleotide sequence encoding a polypeptide comprising SEQ ID NO: 4 (Glyma18g02570), SEQ ID NO: 5 (Glyma18g02580), SEQ ID NO: 6 (Glyma18g02590), and SEQ ID NO: 7 (Forrest SNAP A111D mutant). In some embodiments, the transcribable nucleic acid molecule includes a nucleotide sequence encoding a polypeptide having an amino acid sequence at least 95% identical to a polypeptide comprising SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, and SEQ ID NO: 7 having Glyma18g02570, Glyma18g02580, Glyma18g02590, or SNAP activity, respectively.


In some embodiments, the transcribable nucleic acid molecule includes a nucleotide sequence that hybridizes under stringent conditions to a nucleic acid sequence of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3. In some embodiments, the polynucleotide encodes a polypeptide having Glyma18g02570, Glyma18g02580, or Glyma18g02590 activity. In some embodiments, stringent conditions include incubation at 65° C. in a solution including 6×SSC (0.9 M sodium chloride and 0.09 M sodium citrate). In some embodiments, the transcribable nucleic acid molecule includes a nucleotide sequence which is the reverse complement of nucleotide sequences disclosed herein.
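By way of non-limiting illustration, the salt concentrations recited for 6×SSC follow from scaling a 1×SSC stock. The sketch below assumes the conventional 1×SSC composition of 0.15 M sodium chloride and 0.015 M sodium citrate, which is not recited above; the function name `ssc_molarity` is hypothetical.

```python
# Conventional 1x SSC composition in mol/L (an assumption; not recited in the text).
SSC_1X = {"sodium chloride": 0.15, "sodium citrate": 0.015}


def ssc_molarity(fold):
    """Return the molar concentration of each solute in a `fold`-x SSC solution."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    # Round to 4 decimal places to hide floating-point noise.
    return {solute: round(fold * conc, 4) for solute, conc in SSC_1X.items()}


six_x = ssc_molarity(6)
print(six_x)
```

Under that assumption, the 6× values agree with the 0.9 M sodium chloride and 0.09 M sodium citrate recited above.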


In some embodiments, the DNA construct also includes a transcriptional termination sequence. In some embodiments, the transgenic soybean exhibits increased SCN resistance compared to a control not expressing the transcribable nucleic acid molecule.


In some embodiments, the nucleotide sequence can be at least 95% identical to SEQ ID NO: 3 having one or more mutations selected from the group consisting of C163225G, G174968T, A164972AGGT, C164974A, C163208A, G164965C, G164968C, A164972AGGC, and C164974A. In some embodiments, the encoded polypeptide includes an amino acid sequence at least 95% identical to SEQ ID NO: 6 having one or more mutations selected from the group consisting of D208E, D286Y, D287E, -288V, L289I, Q203K, E285Q, D286H, D287E, -288A, L289I, and A111D.


In some embodiments, the encoded polypeptide includes an amino acid sequence at least 95% identical to SEQ ID NO: 6, a mutation of A111D, and Glyma18g02590 polypeptide activity. In some embodiments, the transcribable nucleic acid molecule is expressed in epidermis, vascular tissue, meristem, cambium, cortex, pith, leaf, sheath, root, flower, developing ovule or seed.


In some embodiments, the promoter includes an inducible promoter or a tissue-specific promoter. In some embodiments, the promoter includes a nematode-inducible promoter. In some embodiments, the promoter is selected from the group consisting of factor EF1α gene promoter; rice tungro bacilliform virus (RTBV) gene promoter; cestrum yellow leaf curling virus (CmYLCV) promoter; tCUP cryptic promoter system; T6P-3 promoter; S-adenosyl-L-methionine synthetase promoter; Raspberry E4 gene promoter; cauliflower mosaic virus 35S promoter; figwort mosaic virus promoter; conditional heat-shock promoter; promoter sub-fragments of sugar beet V-type H+-ATPase subunit c isoform; and beta-tubulin promoter.


In some embodiments, increased SCN resistance comprises at least about 20%, at least about 30%, at least about 40%, at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 100%, at least about 200%, at least about 300%, at least about 400%, at least about 500%, at least about 600%, at least about 700%, at least about 800%, at least about 900%, or at least about 1000% decrease in susceptibility to SCN as compared to a non-transformed control.


In some embodiments, the transcribable nucleic acid molecule includes a nucleotide sequence selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 2, and SEQ ID NO: 3. In some embodiments, the transcribable nucleic acid molecule includes a nucleotide sequence at least 95% identical to SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3, and encodes a polypeptide having Glyma18g02570, Glyma18g02580, or Glyma18g02590 activity, respectively. In some embodiments, the transcribable nucleic acid molecule encodes a polypeptide including SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, or SEQ ID NO: 7. In some embodiments, the transcribable nucleic acid molecule encodes a polypeptide including an amino acid sequence at least 95% identical to SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, or SEQ ID NO: 7 and having Glyma18g02570, Glyma18g02580, Glyma18g02590, or SNAP activity, respectively.


In some embodiments, the transgenic progeny, seed, or part comprises the transcribable nucleic acid molecule.


One aspect provides a soybean plant including in its genome at least one introgressed allele locus associated with an SCN resistant phenotype. In some embodiments, the locus can be in a genomic region flanked by at least two loci selected from TABLE 6. In some embodiments, the soybean plant also includes in its genome one or more polymorphic loci including alleles or combinations of alleles that are not found in an SCN resistant variety and that are linked to said locus associated with an SCN resistant phenotype, or a progeny plant therefrom. In some embodiments, the at least one allele locus is selected from the group consisting of Glyma18g02570, Glyma18g02580, and Glyma18g02590.


One aspect provides a method of producing a soybean plant as disclosed herein including crossing a first soybean plant lacking a locus associated with an SCN resistant phenotype with a second soybean plant. In some embodiments, the second soybean plant includes an allele of at least one polymorphic nucleic acid associated with an SCN resistant phenotype located in a genomic region flanked by at least two loci selected from TABLE 6. In some embodiments, the second soybean plant also includes at least one additional polymorphic locus located outside of said region that is not present in the first soybean plant, to obtain a population of soybean plants segregating for the polymorphic locus associated with an SCN resistant phenotype and said additional polymorphic locus.


In some embodiments, the method also includes detecting said polymorphic locus in at least one soybean plant from said population of soybean plants. In some embodiments, the method also includes selecting a soybean plant including the locus associated with an SCN resistant phenotype that lacks the additional polymorphic locus, thereby obtaining a soybean plant including in its genome at least one introgressed allele of a polymorphic nucleic acid associated with an SCN resistant phenotype. In some embodiments, the first soybean plant includes germplasm capable of conferring agronomically elite characteristics to a progeny plant of the first soybean plant and the second soybean plant.
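By way of non-limiting illustration, the detecting and selecting steps can be pictured as a filter over genotyped plants of the segregating population. The sketch below is a simplification under stated assumptions: each plant is represented only by the set of marker alleles detected in it, and the marker names `rhg1_R` and `donor_M2` and the function `select_introgressed` are hypothetical and do not correspond to loci of TABLE 6.

```python
def select_introgressed(population, resistance_marker, additional_marker):
    """Return IDs of plants that carry the resistance-associated marker allele
    but lack the additional polymorphic locus derived from the second parent."""
    return [
        plant_id
        for plant_id, markers in population.items()
        if resistance_marker in markers and additional_marker not in markers
    ]


# Hypothetical genotyping results: plant ID -> set of detected marker alleles.
population = {
    "plant_1": {"rhg1_R", "donor_M2"},
    "plant_2": {"rhg1_R"},
    "plant_3": {"donor_M2"},
    "plant_4": set(),
}

selected = select_introgressed(population, "rhg1_R", "donor_M2")
print(selected)
```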


One aspect provides an artificial DNA construct including, as operably associated components in the 5′ to 3′ direction of transcription, a promoter that functions in a soybean.


In some embodiments, the DNA construct also includes a transcribable nucleic acid molecule. In some embodiments, the transcribable nucleic acid molecule includes a nucleotide sequence selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 2, and SEQ ID NO: 3, or a nucleotide sequence at least 95% identical thereto encoding a polypeptide having Glyma18g02570, Glyma18g02580, or Glyma18g02590 activity, respectively. In some embodiments, the transcribable nucleic acid molecule includes a nucleotide sequence encoding a polypeptide including SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, or SEQ ID NO: 7, or an amino acid sequence at least 95% identical thereto having Glyma18g02570, Glyma18g02580, Glyma18g02590, or SNAP activity, respectively. In some embodiments, the transcribable nucleic acid molecule includes a nucleotide sequence that hybridizes under stringent conditions to a nucleic acid sequence of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3. In some embodiments, the polynucleotide encodes a polypeptide having Glyma18g02570, Glyma18g02580, or Glyma18g02590 activity. In some embodiments, said stringent conditions include incubation at 65° C. in a solution including 6×SSC (0.9 M sodium chloride and 0.09 M sodium citrate). In some embodiments, the transcribable nucleic acid molecule includes a nucleotide sequence which is the reverse complement of nucleotide sequences disclosed herein.


In some embodiments, the DNA construct also includes a transcriptional termination sequence.


One aspect provides a method of increasing SCN resistance of a soybean including transforming a soybean plant with an artificial DNA construct disclosed herein.





DESCRIPTION OF THE DRAWINGS

Those of skill in the art will understand that the drawings, described below, are for illustrative purposes only. The drawings are not intended to limit the scope of the present teachings in any way.



FIG. 1 is a series of drawings and a sequence listing illustrating the positional cloning of the Rhg1 gene. FIG. 1A shows high-density genetic maps of the Rhg1 locus developed using two recombinant inbred line populations, developed from crosses between a resistant line “Forrest” (F) and a susceptible line “Essex” (E) or “Williams 82” (W), showing recombinant lines WxF6034 (I, SCN-susceptible), ExF3126 (II, SCN-resistant) and ExF4361 (III, SCN-resistant). Black horizontal lines represent approximately 370 kb of the Rhg1 chromosomal interval. Arrows designate DNA marker positions and names. Numbers above the black horizontal line denote marker position relative to marker RLK (an LRR-RLK gene at the Rhg1 locus; assigned position ‘0’). Arrows with one asterisk designate the physical position of each tested DNA marker within the Rhg1 locus using published DNA sequence of Williams 82 as a reference. Arrows with no asterisk represent the DNA markers with Forrest alleles found in recombinants WxF6034, ExF3126 and ExF4361. Arrows with two asterisks represent the DNA markers with heterozygote alleles (Forrest allele with Essex allele). The arrow with three asterisks represents DNA marker 600 having polymorphisms between Essex and Forrest, and between Essex and Williams 82, but not between Forrest and Williams 82. FIG. 1B shows the genomic DNA gene model for the SNAP gene. The gene is 4,223 bp from start codon to stop codon and contains nine exons (light-grey boxes) and eight introns (solid black lines). Numbers above the light-grey boxes and solid black line indicate the length (bp) of each exon or intron, while the numbers under the dotted lines indicate the nucleotide position relative to the first nucleotide of the start codon. FIG. 1C shows a comparison of the predicted SNAP protein sequences between Forrest and Essex with the amino acid differences (Y206D, E207D, V288- and I289L) highlighted. FIG. 1D shows the predicted armadillo/beta-catenin-like repeat sequence (marker 570). FIG. 1E shows the predicted amino acid transporter sequence (marker 580).



FIG. 2 is a schematic representation of the amino acid differences in the predicted SNAP protein sequences in 11 soybean lines with the number indicating amino acid position. Amino acid differences detected in the exons between Forrest and Essex are boxed along with the special amino acids of the PI88788-type PIs in SNAP. Amino acids marked in the boxes are the different amino acids between Peking type SNAP, PI88788 type SNAP, and susceptible type SNAP.



FIG. 3 shows images of a soybean plant. FIG. 3A and FIG. 3B show the virus-induced gene-silencing (VIGS) phenotype of Glyma18g02590-VIGS-AS bombarded plants at 16 days post-inoculation. FIG. 3C and FIG. 3D show the VIGS phenotype of Glyma18g02590-VIGS-AS rub-inoculated plants at 16 days post-inoculation.



FIG. 4 is a Venn diagram illustrating the overlapping SNPs, insertions, and deletions conferring resistance of Rhg1 to SCN. The number following S, I, and D represents the number of SNPs (S), insertions (I), or deletions (D). EFP88, EFP, EF88, EP88, FP88, EP, E88, FP, F88, and P88 represent the overlapping SNPs, insertions, or deletions of Essex (E), Forrest (F), Peking (P), and PI88788 (88); Essex, Forrest, and Peking; Essex, Forrest, and PI88788; Essex, Peking, and PI88788; Forrest, Peking, and PI88788; Essex and Peking; Essex and PI88788; Forrest and Peking; Forrest and PI88788; and Peking and PI88788, respectively.



FIG. 5 is a histogram illustrating SCN susceptibility of Forrest SNAP Type III mutant A111D, as compared to SCN-resistant wild-type Forrest.





DETAILED DESCRIPTION OF THE INVENTION

The present disclosure is based, at least in part, on the discovery that three genes mapped to the Rhg1 (for resistance to Heterodera glycines 1) locus confer resistance to SCN.


Reported herein is the map-based cloning of three genes at the Rhg1 locus, a major quantitative trait locus conferring resistance to this pathogen. Results herein indicate that three genes that can confer SCN-resistance at the Rhg1 locus include Glyma18g02570 (an armadillo/beta-catenin-like repeat), Glyma18g02580 (an amino acid transporter), or Glyma18g02590 (a SNAP-like protein).


According to the approach described herein, a soybean cell or plant can be transformed so as to provide for SCN resistance. In some embodiments, a soybean host cell or plant can be transformed with a nucleic acid molecule encoding a polypeptide having activity of Glyma18g02570, Glyma18g02580, or Glyma18g02590. A nucleic acid encoding a polypeptide having Glyma18g02570, Glyma18g02580, or Glyma18g02590 activity can confer SCN resistance.


Since the discovery of the genes involved in resistance to SCN, others have published data providing confirmation that the three genes are involved in the resistance to SCN. Proof of principle data includes the following additional evidence that the Rhg1 locus confers SCN-resistance in soybean. It has since been shown that upregulation of genes at nematode feeding sites in near-isogenic lines of resistant and susceptible soybean cultivars differs at the Rhg1 locus (Kandoth et al., Plant Physiology, 155:1960-1975, 2011). These results show that expression of Glyma18g02580 and Glyma18g02590 increased in resistant cells as described herein (see e.g., TABLE 1). The effect of copy number variation of multiple genes at the Rhg1 locus was shown for nematode resistance in soybean (Cook et al., Science, 338(6111):1206-1209, 2012).









TABLE 1
Glyma18g02580 and Glyma18g02590 gene expression in SCN-resistant cells.

Gene             Description                     Fold increase (R:S)
Glyma18g02580.1  Amino acid transporter          4.08
Glyma18g02590.1  NSF soluble attachment protein  1.50









TRANSFORMED ORGANISM

Provided herein is a soybean plant genetically engineered to be SCN-resistant. The host genetically engineered to resist SCN can be any soybean plant or cell.


Assays to assess SCN resistance are well known in the art (see e.g., Examples). Therefore, except as otherwise noted herein, assessment of plant SCN resistance can be carried out in accordance with such assays.


One aspect of the current invention is therefore directed to the aforementioned plants, and parts thereof, and methods for using these plants and plant parts. Plant parts include, but are not limited to, pollen, an ovule, and a cell. The invention further provides tissue cultures of regenerable cells of these plants, which cultures regenerate soybean plants capable of expressing all the physiological and morphological characteristics of the starting variety. Such regenerable cells may include embryos, meristematic cells, pollen, leaves, roots, root tips or flowers, or protoplasts or callus derived therefrom. Also provided by the invention are soybean plants regenerated from such a tissue culture, wherein the plants are capable of expressing all the physiological and morphological characteristics of the starting plant variety from which the regenerable cells were obtained.


Such SCN-resistant plants can have a commercially significant yield, for example, a yield of at least 90% to at least 110% (e.g., at least 95%, 100%, 105%) of a soybean check line. Plants are provided comprising the Glyma18g02570, Glyma18g02580, or Glyma18g02590 alleles and SCN resistance and a grain yield of at least about 90%, 94%, 98%, 100%, 105% or about 110% of these lines.


In various embodiments, a nucleic acid sequence encoding a polypeptide having Glyma18g02570, Glyma18g02580, or Glyma18g02590 activity is engineered in a host plant (e.g., a soybean plant) so as to result in an SCN-resistant phenotype. A nucleic acid sequence encoding a polypeptide having Glyma18g02570, Glyma18g02580, or Glyma18g02590 activity can be endogenous or exogenous to the host plant. Transformation of a plant to express a polypeptide having Glyma18g02570, Glyma18g02580, or Glyma18g02590 activity can convey SCN resistance to a host lacking such phenotype. Transformation of a plant to express a polypeptide having Glyma18g02570, Glyma18g02580, or Glyma18g02590 activity can increase SCN resistance in a host already possessing such phenotype.


In some embodiments, a host plant transformed to express a polypeptide having Glyma18g02570, Glyma18g02580, or Glyma18g02590 activity can exhibit at least about 10% decrease in susceptibility to SCN. For example, a host plant transformed to express a polypeptide having Glyma18g02570, Glyma18g02580, or Glyma18g02590 activity can exhibit at least about 20%, at least about 30%, at least about 40%, at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, or at least about 100% decrease in susceptibility to SCN as compared to a non-transformed control. As another example, a host plant transformed to express a polypeptide having Glyma18g02570, Glyma18g02580, or Glyma18g02590 activity can exhibit at least about 100%, at least about 200%, at least about 300%, at least about 400%, at least about 500%, at least about 600%, at least about 700%, at least about 800%, at least about 900%, or at least about 1000% decrease in susceptibility to SCN as compared to a non-transformed control.
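By way of non-limiting illustration, a percent decrease in susceptibility can be computed relative to the non-transformed control. The sketch below assumes that susceptibility is scored as a count of cysts per plant, which is one possible measure and is not specified above; the counts shown are hypothetical, as is the function name.

```python
def percent_decrease_in_susceptibility(control_cysts, transformed_cysts):
    """Percent decrease in cyst count of a transformed plant relative to a
    non-transformed control (assumed susceptibility measure: cysts per plant)."""
    if control_cysts <= 0:
        raise ValueError("control cyst count must be positive")
    return 100.0 * (control_cysts - transformed_cysts) / control_cysts


# Hypothetical counts for illustration only.
decrease = percent_decrease_in_susceptibility(control_cysts=200, transformed_cysts=150)
print(decrease)
```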


Genes of particular interest for engineering a soybean plant to exhibit SCN resistance include Glyma18g02570 (SEQ ID NO: 1), Glyma18g02580 (SEQ ID NO: 2), or Glyma18g02590 (SEQ ID NO: 3). As described herein, Glyma18g02570, Glyma18g02580, or Glyma18g02590 have been mapped to the Rhg1 locus and can confer SCN-resistance.


A transformed host soybean plant can comprise a nucleotide sequence of SEQ ID NO: 1 (Glyma18g02570), SEQ ID NO: 2 (Glyma18g02580), or SEQ ID NO: 3 (Glyma18g02590). A transformed host soybean plant can comprise a nucleotide sequence having at least about 80% sequence identity to SEQ ID NO: 1 (Glyma18g02570), SEQ ID NO: 2 (Glyma18g02580), or SEQ ID NO: 3 (Glyma18g02590), wherein the nucleotide sequence encodes a polypeptide having Glyma18g02570, Glyma18g02580, or Glyma18g02590 activity, respectively, or the transformed soybean exhibits SCN resistance. For example, a transformed host soybean plant can comprise a nucleotide sequence having at least about 85%, at least about 90%, at least about 95%, or at least about 99% sequence identity to SEQ ID NO: 1 (Glyma18g02570), SEQ ID NO: 2 (Glyma18g02580), or SEQ ID NO: 3 (Glyma18g02590), wherein the nucleotide sequence encodes a polypeptide having Glyma18g02570, Glyma18g02580, or Glyma18g02590 activity, respectively, or the transformed soybean exhibits SCN resistance.
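By way of non-limiting illustration, a percent identity threshold can be checked once two sequences have been aligned. The sketch below assumes the sequences are already aligned to equal length, with `-` marking gaps, and counts identical non-gap columns over the alignment length; alignment programs may define identity differently. The toy sequences and function names are hypothetical and are not taken from the Sequence Listing.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding the same non-gap residue.

    Assumes the inputs are pre-aligned (equal length, '-' for gaps).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to equal length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a, aligned_b, threshold=95.0):
    """True if the aligned sequences are at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy 20-nucleotide sequences differing at one position (hypothetical).
reference = "ATGGCTGACAAGCTTGAGGA"
variant = "ATGGCTGACAAGCTTGAGGC"
print(percent_identity(reference, variant))
print(meets_identity_threshold(reference, variant))
```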


A nucleotide sequence described herein can include one or more mutations affecting the activity of a Glyma18g02570, Glyma18g02580, or Glyma18g02590 polypeptide or host SCN resistance. For example, a nucleotide sequence of SEQ ID NO: 1 (Glyma18g02570), SEQ ID NO: 2 (Glyma18g02580), or SEQ ID NO: 3 (Glyma18g02590) can have one or more mutations affecting the activity of a Glyma18g02570, Glyma18g02580, or Glyma18g02590 polypeptide or host SCN resistance. For example, a nucleotide sequence variant (e.g., at least 80%, 85%, 90%, 95%, or 99% identity) of SEQ ID NO: 3 (Glyma18g02590) can have one or more of the following mutations: C163225G, G174968T, A164972AGGT, C164974A, C163208A, G164965C, G164968C, A164972AGGC, or C164974A. As another example, the SNAP gene (Glyma18g02590; e.g., SEQ ID NO: 3) in Forrest or Peking backgrounds can include one or more of the following mutations: C163225G, G174968T, A164972AGGT, or C164974A. As another example, the SNAP gene (Glyma18g02590; e.g., SEQ ID NO: 3) in a PI88788 background can include one or more of the following mutations: C163208A, G164965C, G164968C, A164972AGGC, or C164974A.


A transformed host soybean plant can comprise a nucleotide sequence encoding a polypeptide of SEQ ID NO: 4 (Glyma18g02570), SEQ ID NO: 5 (Glyma18g02580), SEQ ID NO: 6 (Glyma18g02590), or SEQ ID NO: 7 (Forrest SNAP A111D mutant). A transformed host soybean plant can comprise a nucleotide sequence encoding a polypeptide having at least about 80% sequence identity to SEQ ID NO: 4 (Glyma18g02570), SEQ ID NO: 5 (Glyma18g02580), SEQ ID NO: 6 (Glyma18g02590), or SEQ ID NO: 7 (Forrest SNAP A111D mutant), wherein the polypeptide has Glyma18g02570, Glyma18g02580, Glyma18g02590, or SNAP activity, respectively, or the transformed soybean exhibits SCN resistance. For example, a transformed host soybean plant can comprise a nucleotide sequence encoding a polypeptide having at least about 85%, at least about 90%, at least about 95%, or at least about 99% sequence identity to SEQ ID NO: 4 (Glyma18g02570), SEQ ID NO: 5 (Glyma18g02580), SEQ ID NO: 6 (Glyma18g02590), or SEQ ID NO: 7 (Forrest SNAP A111D mutant), wherein the nucleotide sequence encodes a polypeptide having Glyma18g02570, Glyma18g02580, Glyma18g02590, or SNAP activity, respectively, or the transformed soybean exhibits SCN resistance.


A polypeptide sequence described herein can include one or more mutations affecting the activity of a Glyma18g02570, Glyma18g02580, or Glyma18g02590 polypeptide or host SCN resistance. For example, an encoded or expressed polypeptide of SEQ ID NO: 4 (Glyma18g02570), SEQ ID NO: 5 (Glyma18g02580), SEQ ID NO: 6 (Glyma18g02590), or SEQ ID NO: 7 (Forrest SNAP A111D mutant) can have one or more mutations affecting the activity of the polypeptide or host SCN resistance. For example, an encoded or expressed polypeptide variant (e.g., at least 80%, 85%, 90%, 95%, or 99% identity) of SEQ ID NO: 6 (Glyma18g02590) can have one or more of the following mutations: D208E, D286Y, D287E, -288V, L289I, Q203K, E285Q, D286H, D287E, -288A, L289I, or A111D. As another example, the SNAP-like protein (SEQ ID NO: 6, encoded by, e.g., Glyma18g02590, SEQ ID NO: 3) in Forrest or Peking backgrounds can include one or more of the following mutations: D208E, D286Y, D287E, -288V, L289I, or A111D. As another example, the SNAP-like protein (SEQ ID NO: 6, encoded by, e.g., Glyma18g02590, SEQ ID NO: 3) in a PI88788 background can include one or more of the following mutations: Q203K, E285Q, D286H, D287E, -288A, L289I, or A111D.
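By way of non-limiting illustration, the amino acid changes listed above can be parsed and compared between backgrounds. The sketch below assumes each label has the form reference residue, position, variant residue, and that a `-` denotes the absence of a residue at that position in the reference (an interpretation, not stated above). The type and function names are hypothetical.

```python
import re
from typing import NamedTuple


class AaChange(NamedTuple):
    reference: str  # residue in the reference sequence, '-' if absent (assumed)
    position: int   # amino acid position
    variant: str    # residue in the variant sequence


_LABEL = re.compile(r"^([A-Z-])(\d+)([A-Z-])$")


def parse_change(label):
    """Parse a label such as 'D208E' or '-288V' into an AaChange."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"unrecognized mutation label: {label!r}")
    ref, pos, var = match.groups()
    return AaChange(ref, int(pos), var)


# Lists as recited above for the two types of background.
forrest_peking = ["D208E", "D286Y", "D287E", "-288V", "L289I", "A111D"]
pi88788 = ["Q203K", "E285Q", "D286H", "D287E", "-288A", "L289I", "A111D"]

# Changes recited for both backgrounds, ordered by position.
shared = sorted(
    set(forrest_peking) & set(pi88788),
    key=lambda label: parse_change(label).position,
)
print(shared)
```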


As another example, a transformed soybean can comprise a nucleotide sequence that hybridizes under stringent conditions to a Glyma18g02570, Glyma18g02580, or Glyma18g02590 polynucleotide (e.g., SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3, respectively) over the entire length thereof, and which encodes a polypeptide having Glyma18g02570, Glyma18g02580, Glyma18g02590, or SNAP A111D mutant (e.g., SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, or SEQ ID NO: 7, respectively) activity.


As a further example, a transformed soybean can comprise the complement to any of the above sequences.
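By way of non-limiting illustration, the reverse complement of a DNA sequence can be obtained by complementing each base and reversing the order. The sketch below handles only unambiguous bases A, C, G, and T; the toy sequence and function name are hypothetical.

```python
# Translation table mapping each base to its Watson-Crick complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    unexpected = set(sequence.upper()) - set("ACGT")
    if unexpected:
        raise ValueError(f"unsupported characters: {sorted(unexpected)}")
    return sequence.translate(_COMPLEMENT)[::-1]


print(reverse_complement("ATGGCC"))
```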


Variant Sequences


As described above, a plant can be transformed with a variant of a Glyma18g02570, Glyma18g02580, or Glyma18g02590 polynucleotide (e.g., SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3) or with a polynucleotide encoding a variant of a Glyma18g02570, Glyma18g02580, Glyma18g02590, or SNAP A111D mutant (e.g., SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, or SEQ ID NO: 7, respectively) polypeptide. These species SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3, and their corresponding encoded polypeptides, are representative of the genus of variant nucleic acid and polypeptides, respectively, because all variants must possess the specified catalytic activity (e.g., Glyma18g02570, Glyma18g02580, or Glyma18g02590 activity) and must have the percent identity required above to the reference sequence.


Promoters


One or more of the nucleotide sequences discussed above (e.g., Glyma18g02570, Glyma18g02580, or Glyma18g02590 or a variant thereof) can be operably linked to a promoter that can function in a plant, such as soybean. Promoter selection can allow expression of a desired gene product under a variety of conditions.


Promoters can be selected for optimal function in a soybean host cell into which the vector construct will be inserted. Promoters can also be selected on the basis of their regulatory features. Examples of such features include enhancement of transcriptional activity and inducibility.


Numerous promoters functional in a soybean plant will be known to one of skill in the art (see, e.g., Weise et al., Applied Microbiology and Biotechnology, 70(3):337-345, 2006; Saidi et al., Plant Molecular Biology, 59(5):697-711, 2005; Horstmann et al., BMC Biotechnology, 4, 2004; Holtorf et al., Plant Cell Reports, 21(4):341-346, 2002; Zeidler et al., Plant Molecular Biology, 30(1):199-205, 1996). Except as otherwise noted herein, therefore, the processes and compositions of the present disclosure can be carried out in accordance with such known promoters. Examples of promoters that can be used in accord with methods and compositions described herein include, but are not limited to, factor EF1α gene promoter (US App Pub No. 2008/0313776); rice tungro bacilliform virus (RTBV) gene promoter (US App Pub No. 2008/0282431); cestrum yellow leaf curling virus (CmYLCV) promoter (Stavolone et al., Plant Molecular Biology, 53(5):663-673, 2003); tCUP cryptic promoter system (Malik et al., Theoretical and Applied Genetics, 105(4):505-514, 2002); T6P-3 promoter (JP2002238564); S-adenosyl-L-methionine synthetase promoter (WO/2000/037662); Raspberry E4 gene promoter (U.S. Pat. No. 6,054,635); cauliflower mosaic virus 35S promoter (Benfey et al., Science, 250(4983):959-966, 1990); figwort mosaic virus promoter (U.S. Pat. No. 5,378,619); conditional heat-shock promoter (Saidi et al., Plant Molecular Biology, 59(5):697-711, 2005); promoter sub-fragments of the sugar beet V-type H+-ATPase subunit c isoform (Holtorf et al., Plant Cell Reports, 21(4):341-346, 2002); beta-tubulin promoter (Jost et al., Current Genetics, 47(2):111-120, 2005); and bacterial quorum-sensing components (You et al., Plant Physiology, 140(4):1205-1212, 2006).


The promoter can be an inducible promoter. For example, the promoter can be induced according to temperature, pH, a hormone, a metabolite (e.g., lactose, mannitol, an amino acid), light (e.g., wavelength specific), osmotic potential (e.g., salt induced), a heavy metal, or an antibiotic. As another example, the promoter can be a nematode-inducible promoter, such as pZF (Kandoth et al. Plant Physiol. 155:1960-1975 (2011)). Numerous standard inducible promoters will be known to one of skill in the art.


The term “chimeric” is understood to refer to the product of the fusion of portions of two or more different polynucleotide molecules. “Chimeric promoter” is understood to refer to a promoter produced through the manipulation of known promoters or other polynucleotide molecules. Such chimeric promoters can combine enhancer domains that can confer or modulate gene expression from one or more promoters or regulatory elements, for example, by fusing a heterologous enhancer domain from a first promoter to a second promoter with its own partial or complete regulatory elements. Thus, the design, construction, and use of chimeric promoters according to the methods disclosed herein for modulating the expression of operably linked polynucleotide sequences are encompassed by the present invention.


Novel chimeric promoters can be designed or engineered by a number of methods. For example, a chimeric promoter may be produced by fusing an enhancer domain from a first promoter to a second promoter. The resultant chimeric promoter may have novel expression properties relative to the first or second promoters. Novel chimeric promoters can be constructed such that the enhancer domain from a first promoter is fused at the 5′ end, at the 3′ end, or at any position internal to the second promoter.
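By way of non-limiting illustration, fusing an enhancer domain at the 5′ end, at the 3′ end, or at an internal position of a second promoter can be modeled as a string insertion. The sketch below uses short placeholder sequences that are hypothetical and do not represent any actual promoter or enhancer; the function name is likewise hypothetical.

```python
def fuse_enhancer(enhancer, promoter, position):
    """Insert `enhancer` into `promoter` at a 0-based offset.

    position == 0             -> fusion at the 5' end
    position == len(promoter) -> fusion at the 3' end
    anything in between       -> fusion at an internal position
    """
    if not 0 <= position <= len(promoter):
        raise ValueError("position must lie within the promoter")
    return promoter[:position] + enhancer + promoter[position:]


# Placeholder sequences (hypothetical); lowercase marks the enhancer domain.
enhancer = "ggcc"
promoter = "TATAAATACG"

five_prime = fuse_enhancer(enhancer, promoter, 0)
three_prime = fuse_enhancer(enhancer, promoter, len(promoter))
internal = fuse_enhancer(enhancer, promoter, 4)
print(five_prime, three_prime, internal)
```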


Constructs


Any of the transcribable polynucleotide molecule sequences described above can be provided in a construct. Constructs of the present invention generally include a promoter functional in the host plant, such as soybean, operably linked to a transcribable polynucleotide molecule encoding a polypeptide with Glyma18g02570, Glyma18g02580, or Glyma18g02590 activity, such as provided in SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3, or variants thereof as discussed above.
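By way of non-limiting illustration, the arrangement of operably linked components in the 5′ to 3′ direction of transcription can be represented as an ordered record. The sketch below is a bookkeeping aid only; the class name, field names, and the "NOS terminator" entry are hypothetical placeholders and are not recited above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Construct:
    """Components of a DNA construct, listed in 5' to 3' order."""
    promoter: str
    transcribable_molecule: str
    terminator: Optional[str] = None  # transcriptional termination sequence, if any

    def elements_5_to_3(self) -> List[str]:
        """Return the component labels in the 5' to 3' direction of transcription."""
        parts = [self.promoter, self.transcribable_molecule]
        if self.terminator is not None:
            parts.append(self.terminator)
        return parts


# Example labels; the terminator name is a hypothetical placeholder.
construct = Construct(
    promoter="cauliflower mosaic virus 35S promoter",
    transcribable_molecule="SEQ ID NO: 3 (Glyma18g02590)",
    terminator="NOS terminator",
)
print(" -> ".join(construct.elements_5_to_3()))
```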


Exemplary promoters are discussed above. One or more additional promoters may also be provided in the recombinant construct. These promoters can be operably linked to any of the transcribable polynucleotide molecule sequences described above.


The term “construct” is understood to refer to any recombinant polynucleotide molecule such as a plasmid, cosmid, virus, autonomously replicating polynucleotide molecule, phage, or linear or circular single-stranded or double-stranded DNA or RNA polynucleotide molecule, derived from any source, capable of genomic integration or autonomous replication, comprising a polynucleotide molecule where one or more polynucleotide molecules have been linked in a functionally operative manner, i.e., operably linked. The term “vector” or “vector construct” is understood to refer to any recombinant polynucleotide construct that may be used for the purpose of transformation, i.e., the introduction of heterologous DNA into a host plant, such as a soybean.


In addition, constructs may include, but are not limited to, additional polynucleotide molecules from an untranslated region of the gene of interest. These additional polynucleotide molecules can be derived from a source that is native or heterologous with respect to the other elements present in the construct.


Host cells developed according to the approaches described herein can be evaluated by a number of means known in the art (see, e.g., Studier, Protein Expr Purif, 41(1):207-234, 2005; Gellissen, ed., (2005) Production of Recombinant Proteins: Novel Microbial and Eukaryotic Expression Systems, Wiley-VCH, ISBN-10:3527310363; Baneyx, (2004) Protein Expression Technologies, Taylor & Francis, ISBN-10:0954523253).


Molecular Engineering


The following definitions and methods are provided to better define the present invention and to guide those of ordinary skill in the art in the practice of the present invention. Unless otherwise noted, terms are to be understood according to conventional usage by those of ordinary skill in the relevant art.


Compositions and methods described herein utilizing molecular biology protocols can be according to a variety of standard techniques known to the art (see, e.g., Sambrook and Russell (2006) Condensed Protocols from Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, ISBN-10: 0879697717; Ausubel et al. (2002) Short Protocols in Molecular Biology, 5th ed., Current Protocols, ISBN-10: 0471250929; Sambrook and Russell (2001) Molecular Cloning: A Laboratory Manual, 3d ed., Cold Spring Harbor Laboratory Press, ISBN-10: 0879695773; Green and Sambrook (2012) Molecular Cloning: A Laboratory Manual, 4th ed., Cold Spring Harbor Laboratory Press, ISBN-10: 1605500569; Elhai, J. and Wolk, C. P. 1988. Methods in Enzymology 167, 747-754; Studier (2005) Protein Expr Purif. 41(1), 207-234; Gellissen, ed. (2005) Production of Recombinant Proteins: Novel Microbial and Eukaryotic Expression Systems, Wiley-VCH, ISBN-10: 3527310363; Baneyx (2004) Protein Expression Technologies, Taylor & Francis, ISBN-10: 0954523253).


The terms “heterologous DNA sequence”, “exogenous DNA segment” or “heterologous nucleic acid,” as used herein, each refer to a sequence that originates from a source foreign to the particular host cell or, if from the same source, is modified from its original form. Thus, a heterologous gene in a host cell includes a gene that is endogenous to the particular host cell but has been modified through, for example, the use of DNA shuffling. The terms also include non-naturally occurring multiple copies of a naturally occurring DNA sequence. Thus, the terms refer to a DNA segment that is foreign or heterologous to the cell, or homologous to the cell but in a position within the host cell nucleic acid in which the element is not ordinarily found. Exogenous DNA segments are expressed to yield exogenous polypeptides. A “homologous” DNA sequence is a DNA sequence that is naturally associated with a host cell into which it is introduced.


Expression vector, expression construct, plasmid, or recombinant DNA construct is generally understood to refer to a nucleic acid that has been generated via human intervention, including by recombinant means or direct chemical synthesis, with a series of specified nucleic acid elements that permit transcription or translation of a particular nucleic acid in, for example, a host cell. The expression vector can be part of a plasmid, virus, or nucleic acid fragment. Typically, the expression vector can include a nucleic acid to be transcribed operably linked to a promoter.


A “promoter” is generally understood as a nucleic acid control sequence that directs transcription of a nucleic acid. An inducible promoter is generally understood as a promoter that mediates transcription of an operably linked gene in response to a particular stimulus. A promoter can include necessary nucleic acid sequences near the start site of transcription, such as, in the case of a polymerase II type promoter, a TATA element. A promoter can optionally include distal enhancer or repressor elements, which can be located as much as several thousand base pairs from the start site of transcription.


A “transcribable nucleic acid molecule” as used herein refers to any nucleic acid molecule capable of being transcribed into an RNA molecule. Methods are known for introducing constructs into a cell in such a manner that the transcribable nucleic acid molecule is transcribed into a functional mRNA molecule that is translated and therefore expressed as a protein product. Constructs may also be constructed to be capable of expressing antisense RNA molecules, in order to inhibit translation of a specific RNA molecule of interest. For the practice of the present disclosure, conventional compositions and methods for preparing and using constructs and host cells are well known to one skilled in the art (see, e.g., Sambrook and Russell, (2006) Condensed Protocols from Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, ISBN-10:0879697717; Ausubel et al., (2002) Short Protocols in Molecular Biology, 5th ed., Current Protocols, ISBN-10:0471250929; Sambrook and Russell, (2001) Molecular Cloning: A Laboratory Manual, 3d ed., Cold Spring Harbor Laboratory Press, ISBN-10:0879695773; Elhai, J. and Wolk, C. P., Methods in Enzymology, 167:747-754, 1988).


The “transcription start site” or “initiation site” is the position surrounding the first nucleotide that is part of the transcribed sequence, which is also defined as position +1. With respect to this site all other sequences of the gene and its controlling regions can be numbered. Downstream sequences (i.e., further protein encoding sequences in the 3′ direction) can be denominated positive, while upstream sequences (mostly of the controlling regions in the 5′ direction) are denominated negative.


“Operably-linked” or “functionally linked” refers preferably to the association of nucleic acid sequences on a single nucleic acid fragment so that the function of one is affected by the other. For example, a regulatory DNA sequence is said to be “operably linked to” or “associated with” a DNA sequence that codes for an RNA or a polypeptide if the two sequences are situated such that the regulatory DNA sequence affects expression of the coding DNA sequence (i.e., that the coding sequence or functional RNA is under the transcriptional control of the promoter). Coding sequences can be operably-linked to regulatory sequences in sense or antisense orientation. The two nucleic acid molecules may be part of a single contiguous nucleic acid molecule and may be adjacent. For example, a promoter is operably linked to a gene of interest if the promoter regulates or mediates transcription of the gene of interest in a cell.


A “construct” is generally understood as any recombinant nucleic acid molecule such as a plasmid, cosmid, virus, autonomously replicating nucleic acid molecule, phage, or linear or circular single-stranded or double-stranded DNA or RNA nucleic acid molecule, derived from any source, capable of genomic integration or autonomous replication, comprising a nucleic acid molecule where one or more nucleic acid molecules have been operably linked.


Constructs of the present disclosure can contain a promoter operably linked to a transcribable nucleic acid molecule operably linked to a 3′ transcription termination nucleic acid molecule. In addition, constructs can include but are not limited to additional regulatory nucleic acid molecules from, e.g., the 3′-untranslated region (3′ UTR). Constructs can include but are not limited to the 5′ untranslated regions (5′ UTR) of an mRNA nucleic acid molecule which can play an important role in translation initiation and can also be a genetic component in an expression construct. These additional upstream and downstream regulatory nucleic acid molecules may be derived from a source that is native or heterologous with respect to the other elements present on the promoter construct.


The term “transformation” refers to the transfer of a nucleic acid fragment into the genome of a host cell, resulting in genetically stable inheritance. Host cells containing the transformed nucleic acid fragments are referred to as “transgenic” cells, and organisms comprising transgenic cells are referred to as “transgenic organisms”.


“Transformed,” “transgenic,” and “recombinant” refer to a host cell or organism such as a bacterium, cyanobacterium, animal or a plant into which a heterologous nucleic acid molecule has been introduced. The nucleic acid molecule can be stably integrated into the genome as generally known in the art. Known methods of PCR include, but are not limited to, methods using paired primers, nested primers, single specific primers, degenerate primers, gene-specific primers, vector-specific primers, partially mismatched primers, and the like. The term “untransformed” refers to normal cells that have not been through the transformation process.


“Wild-type” refers to a virus or organism found in nature without any known mutation.


Design, generation, and testing of the variant nucleotides, and their encoded polypeptides, having the above required percent identities and retaining a required activity of the expressed protein is within the skill of the art. For example, directed evolution and rapid isolation of mutants can be according to methods described in references including, but not limited to, Link et al., Nature Reviews, 5(9):680-688, 2007; Sanger et al., Gene, 97(1):119-123, 1991; and Ghadessy et al., Proc Natl Acad Sci USA, 98(8):4552-4557, 2001. Thus, one skilled in the art could generate a large number of nucleotide and/or polypeptide variants having, for example, at least 95-99% identity to the reference sequence described herein and screen such for desired phenotypes according to methods routine in the art.


Nucleotide and/or amino acid sequence identity percent (%) is understood as the percentage of nucleotide or amino acid residues that are identical with nucleotide or amino acid residues in a candidate sequence in comparison to a reference sequence when the two sequences are aligned. To determine percent identity, sequences are aligned and if necessary, gaps are introduced to achieve the maximum percent sequence identity. Sequence alignment procedures to determine percent identity are well known to those of skill in the art. Often publicly available computer software such as BLAST, BLAST2, ALIGN2 or Megalign (DNASTAR) software is used to align sequences. Those skilled in the art can determine appropriate parameters for measuring alignment, including any algorithms needed to achieve maximal alignment over the full-length of the sequences being compared. When sequences are aligned, the percent sequence identity of a given sequence A to, with, or against a given sequence B (which can alternatively be phrased as a given sequence A that has or comprises a certain percent sequence identity to, with, or against a given sequence B) can be calculated as: percent sequence identity=(X/Y)×100, where X is the number of residues scored as identical matches by the sequence alignment program's or algorithm's alignment of A and B and Y is the total number of residues in B. If the length of sequence A is not equal to the length of sequence B, the percent sequence identity of A to B will not equal the percent sequence identity of B to A.
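As a minimal sketch of the calculation described above, the following Python function computes percent sequence identity from two sequences that have already been aligned. The function name, the use of “-” as the gap character, and the example sequences are hypothetical and chosen only for illustration; the alignment itself is assumed to come from a program such as BLAST.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent sequence identity of A to B, computed as 100 * X / Y.

    X = aligned positions where A and B carry the same residue.
    Y = total number of residues in B (gap characters excluded).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # X: identical matches, ignoring positions where either sequence has a gap
    x = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap)
    # Y: residues in B only
    y = sum(1 for b in aligned_b if b != gap)
    if y == 0:
        raise ValueError("sequence B contains no residues")
    return 100.0 * x / y


# Hypothetical pre-aligned sequences: A has 6 residues, B has 8.
seq_a = "ATGCTA--"
seq_b = "ATGGTACC"

a_to_b = percent_identity(seq_a, seq_b)  # 5 matches over 8 residues in B
b_to_a = percent_identity(seq_b, seq_a)  # 5 matches over 6 residues in A
print(a_to_b, b_to_a)
```

Because the denominator is the residue count of the second sequence, the two calls return different values when the sequences differ in length, which is the asymmetry noted above.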


Generally, conservative substitutions can be made at any position so long as the required activity is retained. So-called conservative exchanges can be carried out in which the amino acid which is replaced has a similar property as the original amino acid, for example the exchange of Glu by Asp, Gln by Asn, Val by Ile, Leu by Ile, and Ser by Thr. Deletion is the replacement of an amino acid by a direct bond. Positions for deletions include the termini of a polypeptide and linkages between individual protein domains. Insertions are introductions of amino acids into the polypeptide chain, a direct bond formally being replaced by one or more amino acids. Amino acid sequence can be modulated with the help of art-known computer simulation programs that can produce a polypeptide with, for example, improved activity or altered regulation. On the basis of these artificially generated polypeptide sequences, a corresponding nucleic acid molecule coding for such a modulated polypeptide can be synthesized in vitro using the specific codon usage of the desired host cell.


“Highly stringent hybridization conditions” are defined as hybridization at 65° C. in a 6×SSC buffer (i.e., 0.9 M sodium chloride and 0.09 M sodium citrate). Given these conditions, a determination can be made as to whether a given set of sequences will hybridize by calculating the melting temperature (Tm) of a DNA duplex between the two sequences. If a particular duplex has a melting temperature lower than 65° C. in the salt conditions of a 6×SSC, then the two sequences will not hybridize. On the other hand, if the melting temperature is above 65° C. in the same salt conditions, then the sequences will hybridize. In general, the melting temperature for any hybridized DNA:DNA sequence can be determined using the following formula: Tm=81.5° C.+16.6(log10[Na+])+0.41(% G/C content)−0.63(% formamide)−(600/l), where l is the length of the hybrid in base pairs. Furthermore, the Tm of a DNA:DNA hybrid is decreased by 1-1.5° C. for every 1% decrease in nucleotide identity (see, e.g., Sambrook and Russell, (2006)).
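The determination described above can be sketched in Python as follows. This is an illustrative sketch only: it reads the G/C term as percent G+C and the final term as 600 divided by the duplex length in base pairs, as in the commonly cited form of this formula, and both readings are assumptions here. The function names and the example values (a 500 bp duplex with 50% G+C, no formamide, and 0.9 M sodium, which counts only the sodium chloride in 6×SSC) are hypothetical.

```python
import math


def melting_temperature(na_molar: float, gc_percent: float,
                        formamide_percent: float, length_bp: int) -> float:
    """Tm (deg C) of a perfectly matched DNA:DNA duplex.

    Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%G+C) - 0.63*(%formamide) - 600/l
    """
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.63 * formamide_percent
            - 600.0 / length_bp)


def hybridizes(tm_c: float, identity_percent: float,
               hyb_temp_c: float = 65.0,
               drop_per_percent: float = 1.5) -> bool:
    """Apply the 1-1.5 deg C drop per 1% mismatch and compare with 65 deg C.

    drop_per_percent defaults to the upper bound (1.5), the more
    conservative choice; this default is an assumption of the sketch.
    """
    adjusted_tm = tm_c - drop_per_percent * (100.0 - identity_percent)
    return adjusted_tm > hyb_temp_c


# Hypothetical duplex: 500 bp, 50% G+C, no formamide, 0.9 M Na+
tm = melting_temperature(na_molar=0.9, gc_percent=50.0,
                         formamide_percent=0.0, length_bp=500)
print(round(tm, 2))
print(hybridizes(tm, identity_percent=90.0))  # 10% mismatch
print(hybridizes(tm, identity_percent=70.0))  # 30% mismatch
```

Under these assumed inputs the perfectly matched duplex has a Tm of roughly 100° C., so a sequence with 90% identity would still be expected to hybridize at 65° C., whereas one with 70% identity would not.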


Host cells can be transformed using a variety of standard techniques known to the art (see, e.g., Sambrook and Russell (2006); Ausubel et al. (2002); Sambrook and Russell, (2001); Elhai, J. and Wolk, C. P., 1988). Such techniques include, but are not limited to, viral infection, calcium phosphate transfection, liposome-mediated transfection, microprojectile-mediated delivery, receptor-mediated uptake, cell fusion, electroporation, and the like. The transfected cells can be selected and propagated to provide recombinant host cells that comprise the expression vector stably integrated in the host cell genome.


Exemplary nucleic acids which may be introduced to a host cell include, for example, DNA sequences or genes from another species, or even genes or sequences which originate with or are present in the same species, but are incorporated into recipient cells by genetic engineering methods. The term “exogenous” is also intended to refer to genes that are not normally present in the cell being transformed, or perhaps simply not present in the form, structure, etc., as found in the transforming DNA segment or gene, or genes which are normally present and that one desires to express in a manner that differs from the natural expression pattern, e.g., to over-express. Thus, the term “exogenous” gene or DNA is intended to refer to any gene or DNA segment that is introduced into a recipient cell, regardless of whether a similar gene may already be present in such a cell. The type of DNA included in the exogenous DNA can include DNA which is already present in the cell, DNA from another individual of the same type of organism, DNA from a different organism, or a DNA generated externally, such as a DNA sequence containing an antisense message of a gene, or a DNA sequence encoding a synthetic or modified version of a gene.


Host strains developed according to the approaches described herein can be evaluated by a number of means known in the art (see, e.g., Studier, 2005; Gellissen, ed. (2005) Production of Recombinant Proteins: Novel Microbial and Eukaryotic Expression Systems, Wiley-VCH, ISBN-10: 3527310363; Baneyx (2004) Protein Expression Technologies, Taylor & Francis, ISBN-10: 0954523253).


Methods of down-regulating or silencing genes are known in the art. For example, expressed protein activity can be down-regulated or eliminated using antisense oligonucleotides, protein aptamers, nucleotide aptamers, and RNA interference (RNAi) (e.g., small interfering RNAs (siRNA), short hairpin RNA (shRNA), and micro RNAs (miRNA)) (see, e.g., Fanning and Symonds, Handb Exp Pharmacol., 173:289-303G, 2006, describing hammerhead ribozymes and small hairpin RNA; Helene, C., et al., Ann. N.Y. Acad. Sci., 660:27-36, 1992; Maher, Bioessays 14(12):807-15, 1992, describing targeting deoxyribonucleotide sequences; Lee et al., Curr Opin Chem Biol., 10:1-8, 2006, describing aptamers; Reynolds et al., Nature Biotechnology, 22(3):326-330, 2004, describing RNAi; Pushparaj and Melendez, Clin. and Exp. Pharm. and Phys., 33(5-6):504-510, 2006, describing RNAi; Dillon et al., Annual Review of Physiology, 67:147-173, 2005, describing RNAi; Dykxhoorn and Lieberman, Annual Review of Medicine, 56:401-423, 2005, describing RNAi). RNAi molecules are commercially available from a variety of sources (e.g., Ambion, TX; Sigma Aldrich, MO; Invitrogen). Several siRNA molecule design programs using a variety of algorithms are known to the art (see, e.g., Cenix algorithm, Ambion; BLOCK-iT™ RNAi Designer, Invitrogen; siRNA Whitehead Institute Design Tools, Bioinformatics & Research Computing). Traits influential in defining optimal siRNA sequences include G/C content at the termini of the siRNAs, Tm of specific internal domains of the siRNA, siRNA length, position of the target sequence within the CDS (coding region), and nucleotide content of the 3′ overhangs.
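Two of the traits listed above, siRNA length and position of the target within the CDS, together with overall G/C content, can be tabulated for candidate target windows with a short Python sketch. This is not any of the named design programs or their algorithms; the function name, the default 21-nucleotide window, and the example sequence are hypothetical, and no scoring thresholds are implied.

```python
def candidate_windows(cds: str, length: int = 21) -> list:
    """List every window of the given length in a coding sequence.

    For each window, report its 1-based start position, its relative
    position within the CDS (0.0 = 5' end), and its G/C fraction.
    """
    cds = cds.upper()
    last_start = len(cds) - length
    windows = []
    for start in range(last_start + 1):
        window = cds[start:start + length]
        gc_fraction = (window.count("G") + window.count("C")) / length
        windows.append({
            "start": start + 1,
            "relative_position": start / last_start if last_start else 0.0,
            "sequence": window,
            "gc_fraction": gc_fraction,
        })
    return windows


# Hypothetical 40-nucleotide sequence used only to exercise the function
example_cds = "ATGC" * 10
for w in candidate_windows(example_cds)[:3]:
    print(w["start"], w["sequence"], round(w["gc_fraction"], 3))
```

A table of this kind would typically be only the first step; ranking the candidates would rely on one of the design tools referenced above.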


Breeding


It is disclosed herein that a quantitative trait locus (QTL) with major effects for SCN resistance and single nucleotide polymorphism (SNP) markers in the proximity of this locus have been identified that can be used for the introgression of this genomic region to desirable germplasm, such as by marker-assisted selection and/or marker-assisted backcrossing.


The present disclosure provides genetic markers and methods for the introduction of Glyma18g02570, Glyma18g02580, or Glyma18g02590 alleles into agronomically elite soybean plants. The invention therefore allows the creation of plants that combine these Glyma18g02570, Glyma18g02580, or Glyma18g02590 alleles that confer SCN resistance with a commercially significant yield and an agronomically elite genetic background. Using the methods of the invention, loci conferring the SCN phenotype may be introduced into a desired soybean genetic background, for example, in the production of new varieties with commercially significant yield and SCN resistance.


As used herein, the term “population” means a genetically heterogeneous collection of plants that share a common parental derivation.


As used herein, the terms “variety” and “cultivar” mean a group of similar plants that by their genetic pedigrees and performance can be identified from other varieties within the same species.


As used herein, an “allele” refers to one of two or more alternative forms of a genomic sequence at a given locus on a chromosome.


A “Quantitative Trait Locus (QTL)” is a chromosomal location that encodes for alleles that affect the expressivity of a phenotype.


As used herein, a “marker” means a detectable characteristic that can be used to discriminate between organisms. Examples of such characteristics include, but are not limited to, genetic markers, biochemical markers, metabolites, morphological characteristics, and agronomic characteristics.


As used herein, the term “phenotype” means the detectable characteristics of a cell or organism that can be influenced by gene expression.


As used herein, the term “genotype” means the specific allelic makeup of a plant.


“Agronomically elite” refers to a genotype that has a culmination of many distinguishable traits such as emergence, vigor, vegetative vigor, disease resistance, seed set, standability, and threshability, which allows a producer to harvest a product of commercial significance.


As used herein, the term “introgressed,” when used in reference to a genetic locus, refers to a genetic locus that has been introduced into a new genetic background. Introgression of a genetic locus can thus be achieved through plant breeding methods and/or by molecular genetic methods. Such molecular genetic methods include, but are not limited to, various plant transformation techniques and/or methods that provide for homologous recombination, non-homologous recombination, site-specific recombination, and/or genomic modifications that provide for locus substitution or locus conversion.


As used herein, the term “linked,” when used in the context of nucleic acid markers and/or genomic regions, means that the markers and/or genomic regions are located on the same linkage group or chromosome.


As used herein, the term “denoting” when used in reference to a plant genotype refers to any method whereby a plant is indicated to have a certain genotype. This includes any means of identification of a plant having a certain genotype. Indication of a certain genotype may include, but is not limited to, any entry into any type of written or electronic medium or database whereby the plant's genotype is provided. Indications of a certain genotype may also include, but are not limited to, any method where a plant is physically marked or tagged. Illustrative examples of physical marking or tags useful in the invention include, but are not limited to, a barcode, a radio-frequency identification (RFID), a label, or the like.


Marker assisted introgression involves the transfer of a chromosome region defined by one or more markers from one germplasm to a second germplasm. The initial step in that process is the localization of the trait by gene mapping, which is the process of determining the position of a gene relative to other genes and genetic markers through linkage analysis. The basic principle for linkage mapping is that the closer together two genes are on the chromosome, the more likely they are to be inherited together. Briefly, a cross is generally made between two genetically compatible but divergent parents relative to traits under study. Genetic markers can then be used to follow the segregation of traits under study in the progeny from the cross, often a backcross (BC1), F2, or recombinant inbred population.
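The linkage principle described above is commonly quantified as a recombination frequency, the fraction of progeny gametes that carry a non-parental combination of alleles at two loci. The following Python sketch illustrates that calculation under stated assumptions: the two-letter haplotype labels, the parental types, and the counts are hypothetical and do not come from the present disclosure.

```python
from collections import Counter


def recombination_frequency(gametes, parental_types) -> float:
    """Fraction of gametes whose two-locus haplotype is non-parental."""
    counts = Counter(gametes)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no gametes scored")
    # Any haplotype that is not one of the parental combinations is recombinant
    recombinant = sum(n for hap, n in counts.items()
                      if hap not in parental_types)
    return recombinant / total


# Hypothetical backcross data: "AB" and "ab" are the parental haplotypes
scored = ["AB"] * 92 + ["ab"] * 88 + ["Ab"] * 11 + ["aB"] * 9
rf = recombination_frequency(scored, parental_types={"AB", "ab"})
print(rf)
```

A low recombination frequency indicates that the two loci are close together and tend to be inherited together, which is what makes a nearby marker useful for following a trait locus.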


The term quantitative trait loci, or QTL, is used to describe regions of a genome showing quantitative or additive effects upon a phenotype. The Rhg1 loci, containing Glyma18g02570, Glyma18g02580, or Glyma18g02590 alleles, represent exemplary QTL because Glyma18g02570, Glyma18g02580, or Glyma18g02590 alleles result in SCN resistance. Herein identified are genetic markers for non-transgenic Glyma18g02570, Glyma18g02580, or Glyma18g02590 alleles that enable breeding of soybean plants comprising the Glyma18g02570, Glyma18g02580, or Glyma18g02590 alleles with agronomically superior plants, and selection of progeny that inherited the mutant Glyma18g02570, Glyma18g02580, or Glyma18g02590 alleles. Thus, the invention allows the use of molecular tools to combine these QTLs with desired agronomic characteristics.


Various embodiments of the present disclosure utilize a QTL or polymorphic nucleic acid marker or allele located in this genomic region. Subregions of this genomic region associated with SCN resistant phenotype can be described as being flanked by markers shown in TABLE 6. Such markers are believed to be associated with the SCN resistant phenotype because of their location and proximity to the major QTL. One or more polymorphic nucleic acids can be used from TABLE 6. For example, at least two, three, four, five, six, seven, eight, nine, ten, or more of such markers can be used.


It can be useful to detect in, or determine whether, a soybean plant has an allelic state that is associated with or not associated with an SCN resistant phenotype.


A plant can be identified in which at least one allele at a polymorphic locus associated with an SCN resistant phenotype is detected. For example, a diploid plant can be identified in which the allelic state at a polymorphic locus comprises one allele associated with an SCN resistant phenotype and one allele that is not associated with an SCN resistant phenotype (i.e., the plant is heterozygous at that locus). In certain embodiments of the invention, it may be useful to cross a plant that is heterozygous at a locus associated with an SCN resistant phenotype with a plant that is similarly heterozygous or that does not contain any allele associated with an SCN resistant phenotype at the locus, to produce progeny of which a certain percentage of plants are heterozygous at that locus. Plants homozygous at the locus may then be produced by various breeding methods, such as by self-crossing or dihaploidization.
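The expected proportion of heterozygous progeny from the crosses described above can be illustrated with a short Python sketch. It assumes a single diploid locus with simple Mendelian segregation and no selection; the allele symbols “R” (associated with the SCN resistant phenotype) and “r” (not associated) and the function name are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent1: str, parent2: str) -> dict:
    """Expected genotype fractions among progeny at one diploid locus.

    Each parent is written as its two alleles, e.g. "Rr". Every
    combination of one allele from each parent is equally likely.
    """
    counts = Counter("".join(sorted(pair))
                     for pair in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Heterozygous x heterozygous
print(cross("Rr", "Rr"))
# Heterozygous x plant lacking the resistance-associated allele
print(cross("Rr", "rr"))
```

Under these assumptions, half of the progeny of either cross are expected to be heterozygous, and only the cross between two heterozygous plants is expected to yield progeny homozygous for the resistance-associated allele.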


One of skill in the art will also recognize that it can be useful to identify at a genetic locus a polymorphic nucleic acid marker that is not associated with an SCN resistant phenotype in a plant, such as when introgressing a QTL associated with an SCN resistant phenotype into a genetic background not associated with such a phenotype.


Markers and allelic states disclosed herein are exemplary. From TABLE 6, one of skill in the art would recognize how to identify soybean plants with other polymorphic nucleic acid markers and allelic states thereof related to SCN resistance consistent with the present disclosure. One of skill in the art would also know how to identify the allelic state of other polymorphic nucleic acid markers located in the genomic region(s) or linked to the QTL or other markers identified herein, to determine their association with SCN resistance.


Provided herein are unique soybean germplasms or soybean plants comprising an introgressed genomic region that is associated with an SCN resistant phenotype and methods of obtaining the same. Marker-assisted introgression involves the transfer of a chromosomal region, defined by one or more markers, from one germplasm to a second germplasm. Offspring of a cross that contain the introgressed genomic region can be identified by the combination of markers characteristic of the desired introgressed genomic region from a first germplasm (e.g., an SCN resistant phenotype germplasm) and both linked and unlinked markers characteristic of the desired genetic background of a second germplasm. Flanking markers that identify a genomic region associated with an SCN resistant phenotype include those in TABLE 6.


Flanking markers that fall on both the telomere proximal end and the centromere proximal end of any of these genomic intervals may be useful in a variety of breeding efforts that include, but are not limited to, introgression of genomic regions associated with an SCN resistant phenotype into a genetic background comprising markers associated with germplasm that ordinarily contains a genotype associated with a non-SCN resistant phenotype. Markers that are linked and either immediately adjacent or adjacent to the identified SCN resistant phenotype QTL that permit introgression of the QTL in the absence of extraneous linked DNA from the source germplasm containing the QTL are provided herewith. Those of skill in the art will appreciate that when seeking to introgress a smaller genomic region comprising a QTL associated with an SCN resistant phenotype described herein, that any of the telomere proximal or centromere proximal markers that are immediately adjacent to a larger genomic region comprising the QTL can be used to introgress that smaller genomic region.


Soybean plants or germplasm comprising an introgressed region that is associated with an SCN resistant phenotype wherein at least 10%, 25%, 50%, 75%, 90%, or 99% of the remaining genomic sequences carry markers characteristic of plants or germplasm that otherwise or ordinarily comprise a genomic region associated with a non-SCN resistant phenotype, are thus provided. Furthermore, soybean plants comprising an introgressed region where closely linked regions adjacent or immediately adjacent to the genomic regions, QTL, and markers provided herewith that comprise genomic sequences carrying markers characteristic of soybean plants or germplasm that otherwise or ordinarily comprise a genomic region associated with the phenotype are also provided.
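One way to estimate the percentage of the remaining genome that carries markers characteristic of the recipient germplasm is to compare marker calls outside the introgressed region, as in the following Python sketch. This is an illustrative marker-based proxy under stated assumptions, not a method prescribed by the present disclosure; the marker names (M1 to M10), allele codes, and function name are hypothetical.

```python
def recipient_background_percent(progeny_calls: dict,
                                 recipient_calls: dict,
                                 introgressed_markers: set) -> float:
    """Percent of background markers at which the progeny matches the recipient.

    Markers inside the introgressed region are excluded from the count.
    """
    background = [m for m in recipient_calls if m not in introgressed_markers]
    if not background:
        raise ValueError("no background markers to compare")
    matching = sum(1 for m in background
                   if progeny_calls.get(m) == recipient_calls[m])
    return 100.0 * matching / len(background)


# Hypothetical data: the recipient line carries allele "A" at all ten markers
recipient = {f"M{i}": "A" for i in range(1, 11)}
introgressed = {"M5", "M6"}          # markers defining the introgressed region
progeny = dict(recipient)
progeny.update({"M5": "B", "M6": "B",  # donor alleles in the introgressed region
                "M1": "B", "M9": "B"})  # residual donor alleles elsewhere

print(recipient_background_percent(progeny, recipient, introgressed))
```

In this hypothetical example, six of the eight background markers match the recipient line, so the plant would fall within the "at least 75%" category recited above.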


Genetic markers that can be used in the practice of the present disclosure include, but are not limited to, Restriction Fragment Length Polymorphisms (RFLP), Amplified Fragment Length Polymorphisms (AFLP), Simple Sequence Repeats (SSR), Single Nucleotide Polymorphisms (SNP), Insertion/Deletion Polymorphisms (Indels), Variable Number Tandem Repeats (VNTR), and Random Amplified Polymorphic DNA (RAPD), and others known to those skilled in the art. Marker discovery and development in crops provides the initial framework for applications to marker-assisted breeding activities (U.S. Patent Pub. Nos.: 2005/0204780, 2005/0216545, 2005/0218305, and 2006/00504538). The resulting “genetic map” is the representation of the relative position of characterized loci (polymorphic nucleic acid markers or any other locus for which alleles can be identified) to each other.


As a set, polymorphic markers serve as a useful tool for fingerprinting plants to inform the degree of identity of lines or varieties (U.S. Pat. No. 6,207,367). These markers form the basis for determining associations with phenotypes and can be used to drive genetic gain. In certain embodiments of the present disclosure, polymorphic nucleic acids can be used to detect in a soybean plant a genotype associated with an SCN resistant phenotype, identify a soybean plant with a genotype associated with an SCN resistant phenotype, or to select a soybean plant with a genotype associated with an SCN resistant phenotype. In certain embodiments of methods of the present disclosure, polymorphic nucleic acids can be used to produce a soybean plant that comprises in its genome an introgressed locus associated with an SCN resistant phenotype. In certain embodiments of the invention, polymorphic nucleic acids can be used to breed progeny soybean plants comprising a locus associated with an SCN resistant phenotype.


Certain genetic markers useful in the present invention include “dominant” or “codominant” markers. “Codominant” markers reveal the presence of two or more alleles (two per diploid individual). “Dominant” markers reveal the presence of only a single allele. The presence of the dominant marker phenotype (e.g., a band of DNA) is an indication that one allele is present in either the homozygous or heterozygous condition. The absence of the dominant marker phenotype (e.g., absence of a DNA band) is merely evidence that “some other” undefined allele is present. In the case of populations where individuals are predominantly homozygous and loci are predominantly dimorphic, dominant and codominant markers can be equally valuable. As populations become more heterozygous and multiallelic, codominant markers often become more informative of the genotype than dominant markers.
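The difference in information content between dominant and codominant markers can be illustrated with a small Python sketch for a diploid individual. The allele symbols and function names are hypothetical; “a” stands for any undefined allele that does not produce the dominant marker band.

```python
def genotypes_from_dominant_marker(band_present: bool) -> set:
    """Genotypes consistent with a dominant marker result.

    A band shows that allele "A" is present but not whether the plant
    is homozygous or heterozygous; no band implies some other allele.
    """
    return {"AA", "Aa"} if band_present else {"aa"}


def genotype_from_codominant_marker(observed_alleles: set) -> str:
    """Genotype from a codominant marker, which reveals each allele present."""
    alleles = sorted(observed_alleles)
    if len(alleles) == 1:
        return alleles[0] * 2          # one allele observed: homozygous
    if len(alleles) == 2:
        return "".join(alleles)        # two alleles observed: heterozygous
    raise ValueError("a diploid individual carries one or two alleles per locus")


print(genotypes_from_dominant_marker(True))        # ambiguous result
print(genotype_from_codominant_marker({"A", "B"}))  # heterozygote resolved
print(genotype_from_codominant_marker({"B"}))       # homozygote resolved
```

The dominant marker leaves two genotypes possible whenever the band is present, whereas the codominant marker resolves the genotype directly, which is why codominant markers become more informative in heterozygous, multiallelic populations.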


Nucleic acid-based analyses for determining the presence or absence of the genetic polymorphism (i.e. for genotyping) can be used in breeding programs for identification, selection, introgression, or the like. A wide variety of genetic markers for the analysis of genetic polymorphisms are available and known to those of skill in the art. The analysis may be used to select for genes, portions of genes, QTL, alleles, or genomic regions that comprise or are linked to a genetic marker that is linked to or associated with an SCN resistant phenotype.


As used herein, nucleic acid analysis methods include, but are not limited to, PCR-based detection methods (for example, TaqMan assays), microarray methods, mass spectrometry-based methods and/or nucleic acid sequencing methods, including whole genome sequencing. In certain embodiments, the detection of polymorphic sites in a sample of DNA, RNA, or cDNA may be facilitated through the use of nucleic acid amplification methods. Such methods specifically increase the concentration of polynucleotides that span the polymorphic site, or include that site and sequences located either distal or proximal to it. Such amplified molecules can be readily detected by gel electrophoresis, fluorescence detection methods, or other means.


One method of achieving such amplification employs the polymerase chain reaction (PCR) (Mullis et al. 1986 Cold Spring Harbor Symp. Quant. Biol. 51:263-273; European Patent 50,424; European Patent 84,796; European Patent 258,017; European Patent 237,362; European Patent 201,184; U.S. Pat. No. 4,683,202; U.S. Pat. No. 4,582,788; and U.S. Pat. No. 4,683,194), using primer pairs that are capable of hybridizing to the proximal sequences that define a polymorphism in its double-stranded form. Methods for typing DNA based on mass spectrometry can also be used. Such methods are disclosed in U.S. Pat. Nos. 6,613,509 and 6,503,710, and references found therein.


Polymorphisms in DNA sequences can be detected or typed by a variety of effective methods well known in the art including, but not limited to, those disclosed in U.S. Pat. Nos. 5,468,613; 5,217,863; 5,210,015; 5,876,930; 6,030,787; 6,004,744; 6,013,431; 5,595,890; 5,762,876; 5,945,283; 6,090,558; 5,800,944; 5,616,464; 7,312,039; 7,238,476; 7,297,485; 7,282,355; 7,270,981; and 7,250,252, all of which are incorporated herein by reference in their entireties. However, the compositions and methods of the present disclosure can be used in conjunction with any polymorphism typing method to type polymorphisms in genomic DNA samples. These genomic DNA samples include, but are not limited to, genomic DNA isolated directly from a plant, cloned genomic DNA, or amplified genomic DNA.


For example, polymorphisms in DNA sequences can be detected by hybridization to allele-specific oligonucleotide (ASO) probes as disclosed in U.S. Pat. Nos. 5,468,613 and 5,217,863. U.S. Pat. No. 5,468,613 discloses allele specific oligonucleotide hybridizations where single or multiple nucleotide variations in nucleic acid sequence can be detected in nucleic acids by a process in which the sequence containing the nucleotide variation is amplified, spotted on a membrane and treated with a labeled sequence-specific oligonucleotide probe.


Target nucleic acid sequence can also be detected by probe ligation methods as disclosed in U.S. Pat. No. 5,800,944 where sequence of interest is amplified and hybridized to probes followed by ligation to detect a labeled part of the probe.


Microarrays can also be used for polymorphism detection, wherein oligonucleotide probe sets are assembled in an overlapping fashion to represent a single sequence such that a difference in the target sequence at one point would result in partial probe hybridization (Borevitz et al., Genome Res. 13:513-523 (2003); Cui et al., Bioinformatics 21:3852-3858 (2005)). On any one microarray, it is expected there will be a plurality of target sequences, which may represent genes or non-coding regions, wherein each target sequence is represented by a series of overlapping oligonucleotides rather than by a single probe. This platform provides for high throughput screening of a plurality of polymorphisms. Typing of target sequences by microarray-based methods is disclosed in U.S. Pat. Nos. 6,799,122; 6,913,879; and 6,996,476.
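By way of non-limiting illustration, an overlapping probe set of the kind described above can be sketched in Python. The probe length of 25 nucleotides, the step of 5 nucleotides, the example target, and the function names are assumptions chosen for illustration only and are not taken from the cited references.

```python
def tile_probes(target, probe_len=25, step=5):
    """Return (start, probe) pairs that cover the target with overlapping probes."""
    if probe_len > len(target):
        raise ValueError("probe_len must not exceed the target length")
    probes = []
    for start in range(0, len(target) - probe_len + 1, step):
        probes.append((start, target[start:start + probe_len]))
    return probes


def probes_covering(probes, position):
    """Return the start coordinates of every probe that spans a given position."""
    return [start for start, probe in probes if start <= position < start + len(probe)]


# Hypothetical 60-nucleotide target sequence.
target = "ACGT" * 15
probes = tile_probes(target)
print(len(probes), probes_covering(probes, 30))
```

Because several probes span any interior position, a single-base difference at that position would be expected to affect hybridization of more than one probe in the set.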


Target nucleic acid sequence can also be detected by probe linking methods as disclosed in U.S. Pat. No. 5,616,464, employing at least one pair of probes having sequences homologous to adjacent portions of the target nucleic acid sequence and having side chains which non-covalently bind to form a stem upon base pairing of the probes to the target nucleic acid sequence. At least one of the side chains has a photoactivatable group which can form a covalent cross-link with the other side chain member of the stem.


Other methods for detecting SNPs and Indels include single base extension (SBE) methods. Examples of SBE methods include, but are not limited to, those disclosed in U.S. Pat. Nos. 6,004,744; 6,013,431; 5,595,890; 5,762,876; and 5,945,283. SBE methods are based on extension of a nucleotide primer that is adjacent to a polymorphism to incorporate a detectable nucleotide residue upon extension of the primer. In certain embodiments, the SBE method uses three synthetic oligonucleotides. Two of the oligonucleotides serve as PCR primers and are complementary to sequence of the locus of genomic DNA which flanks a region containing the polymorphism to be assayed. Following amplification of the region of the genome containing the polymorphism, the PCR product is mixed with the third oligonucleotide (called an extension primer), which is designed to hybridize to the amplified DNA adjacent to the polymorphism, in the presence of DNA polymerase and two differentially labeled dideoxynucleoside triphosphates. If the polymorphism is present on the template, one of the labeled dideoxynucleoside triphosphates can be added to the primer in a single base chain extension. The allele present is then inferred by determining which of the two differential labels was added to the extension primer. Homozygous samples will result in only one of the two labeled bases being incorporated, and thus only one of the two labels will be detected. Heterozygous samples have both alleles present and will thus direct incorporation of both labels (into different molecules of the extension primer), and thus both labels will be detected.
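By way of non-limiting illustration, the inference of genotype from the two labels can be sketched in Python. The function name, the signal values, and the detection threshold are hypothetical and are not taken from the cited patents; a label is simply assumed to be "detected" when its signal meets the threshold.

```python
def call_sbe_genotype(signal_1, signal_2, allele_1, allele_2, threshold):
    """Infer a diploid genotype from the signals of two differentially labeled bases.

    One detected label -> homozygous for that allele.
    Both labels detected -> heterozygous.
    Neither label detected -> no call (None).
    """
    has_1 = signal_1 >= threshold
    has_2 = signal_2 >= threshold
    if has_1 and has_2:
        return allele_1 + allele_2
    if has_1:
        return allele_1 * 2
    if has_2:
        return allele_2 * 2
    return None


# Hypothetical signal intensities for an A/G polymorphism.
for s1, s2 in [(900, 20), (850, 910), (10, 700), (5, 8)]:
    print(s1, s2, call_sbe_genotype(s1, s2, "A", "G", threshold=100))
```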


In another method for detecting polymorphisms, SNPs and Indels can be detected by methods disclosed in U.S. Pat. Nos. 5,210,015; 5,876,930; and 6,030,787, in which an oligonucleotide probe has a 5′ fluorescent reporter dye and a 3′ quencher dye covalently linked to the 5′ and 3′ ends of the probe, respectively. When the probe is intact, the proximity of the reporter dye to the quencher dye results in the suppression of the reporter dye fluorescence, e.g., by Förster-type energy transfer. During PCR, forward and reverse primers hybridize to a specific sequence of the target DNA flanking a polymorphism, while the hybridization probe hybridizes to the polymorphism-containing sequence within the amplified PCR product. In the subsequent PCR cycle, DNA polymerase with 5′ to 3′ exonuclease activity cleaves the probe and separates the reporter dye from the quencher dye, resulting in increased fluorescence of the reporter.


In another embodiment, the locus or loci of interest can be directly sequenced using nucleic acid sequencing technologies. Methods for nucleic acid sequencing are known in the art and include technologies provided by 454 Life Sciences (Branford, Conn.), Agencourt Bioscience (Beverly, Mass.), Applied Biosystems (Foster City, Calif.), LI-COR Biosciences (Lincoln, Nebr.), NimbleGen Systems (Madison, Wis.), Illumina (San Diego, Calif.), and VisiGen Biotechnologies (Houston, Tex.). Such nucleic acid sequencing technologies comprise formats such as parallel bead arrays, sequencing by ligation, capillary electrophoresis, electronic microchips, “biochips,” microarrays, parallel microchips, and single-molecule arrays, as reviewed by R.F. Service Science 2006 311:1544-1546.


The markers to be used in the methods of the present disclosure can be diagnostic of origin in order for inferences to be made about subsequent populations. Experience to date suggests that SNP markers may be ideal for mapping because the likelihood that a particular SNP allele is derived from independent origins in the extant populations of a particular species is very low. As such, SNP markers (see e.g., TABLE 6) appear to be useful for tracking and assisting introgression of QTLs.


Research Tools


The Glyma18g02570, Glyma18g02580, or Glyma18g02590 genes can be used to find or characterize related (interactive) genes or identify or further characterize the cascade for SCN resistance. The discovery of a Glyma18g02570, Glyma18g02580, or Glyma18g02590 as part of the resistance signaling pathway against SCN provides novel insight into this complex host-pathogen interaction. Insights reported herein can be used to discern the relationship between Glyma18g02570, Glyma18g02580, or Glyma18g02590 and metabolism.


In some embodiments, the Glyma18g02570, Glyma18g02580, or Glyma18g02590 genes can be used in a genomics, proteomics, bioinformatics, or statistical modeling approach to fish or isolate candidate genes or encoded proteins or other molecules with a direct or indirect function in mediating disease resistance to SCN in soybeans. In some embodiments, the Glyma18g02570, Glyma18g02580, or Glyma18g02590 genes can be used in a genomics, proteomics, bioinformatics, or statistical modeling approach to fish or isolate candidate genes or encoded proteins or other molecules with a direct or indirect function in mediating compatible or incompatible responses of soybeans to SCN (e.g., to a nematode or any intermediate). Thus are provided various methods to find or characterize related (interactive) genes involved with SCN resistance.


Targeting-Induced Local Lesions in Genomes (TILLING) is a method permitting identification of gene-specific mutations. In particular, this process uses traditional mutagenesis and SNP discovery methods for a reverse genetic strategy that takes advantage of a mismatch endonuclease to locate and detect induced mutations in a high-throughput and low cost manner. EcoTILLING, which is a variant of TILLING, examines natural genetic variation in populations to discover SNPs (reviewed in Barkley and Wang, Curr Genomics, 9(4):212-26, 2008).


Kits


Also provided are kits. Such kits can include an agent or composition described herein and, in certain embodiments, instructions for administration. Such kits can facilitate performance of the methods described herein. When supplied as a kit, the different components of the composition can be packaged in separate containers and admixed immediately before use. Components include, but are not limited to an antibody (e.g., a monoclonal antibody) specific for a transcribable nucleic acid molecule described herein (e.g., SEQ ID NOS: 1-3, or variants thereof) or encoded polypeptides disclosed herein (e.g., SEQ ID NOS: 4-6, or variants thereof). Methods for generating such a monoclonal antibody are known in the art and can be adapted to the methods or compositions described herein. Such packaging of the components separately can, if desired, be presented in a pack or dispenser device which may contain one or more unit dosage forms containing the composition. The pack may, for example, comprise metal or plastic foil such as a blister pack. Such packaging of the components separately can also, in certain instances, permit long-term storage without losing activity of the components.


Kits may also include reagents in separate containers such as, for example, sterile water or saline to be added to a lyophilized active component packaged separately. For example, sealed glass ampules may contain a lyophilized component and, in a separate ampule, sterile water or sterile saline, each of which has been packaged under a neutral non-reacting gas, such as nitrogen. Ampules may consist of any suitable material, such as glass; organic polymers, such as polycarbonate or polystyrene; ceramic; metal; or any other material typically employed to hold reagents. Other examples of suitable containers include bottles that may be fabricated from similar substances as ampules, and envelopes that may consist of foil-lined interiors, such as aluminum or an alloy. Other containers include test tubes, vials, flasks, bottles, syringes, and the like. Containers may have a sterile access port, such as a bottle having a stopper that can be pierced by a hypodermic injection needle. Other containers may have two compartments that are separated by a readily removable membrane that upon removal permits the components to mix. Removable membranes may be glass, plastic, rubber, and the like.


In certain embodiments, kits can be supplied with instructional materials. Instructions may be printed on paper or other substrate, and/or may be supplied as an electronic-readable medium, such as a floppy disc, mini-CD-ROM, CD-ROM, DVD-ROM, Zip disc, videotape, audio tape, and the like. Detailed instructions may not be physically associated with the kit; instead, a user may be directed to an Internet web site specified by the manufacturer or distributor of the kit.


Definitions and methods described herein are provided to better define the present disclosure and to guide those of ordinary skill in the art in the practice of the present disclosure. Unless otherwise noted, terms are to be understood according to conventional usage by those of ordinary skill in the relevant art.


In some embodiments, numbers expressing quantities of ingredients, properties such as molecular weight, reaction conditions, and so forth, used to describe and claim certain embodiments of the present disclosure are to be understood as being modified in some instances by the term “about.” In some embodiments, the term “about” is used to indicate that a value includes the standard deviation of the mean for the device or method being employed to determine the value. In some embodiments, the numerical parameters set forth in the written description and attached claims are approximations that can vary depending upon the desired properties sought to be obtained by a particular embodiment. In some embodiments, the numerical parameters should be construed in light of the number of reported significant digits and by applying ordinary rounding techniques. Notwithstanding that the numerical ranges and parameters setting forth the broad scope of some embodiments of the present disclosure are approximations, the numerical values set forth in the specific examples are reported as precisely as practicable. The numerical values presented in some embodiments of the present disclosure may contain certain errors necessarily resulting from the standard deviation found in their respective testing measurements. The recitation of ranges of values herein is merely intended to serve as a shorthand method of referring individually to each separate value falling within the range. Unless otherwise indicated herein, each individual value is incorporated into the specification as if it were individually recited herein.


In some embodiments, the terms “a” and “an” and “the” and similar references used in the context of describing a particular embodiment (especially in the context of certain of the following claims) can be construed to cover both the singular and the plural, unless specifically noted otherwise. In some embodiments, the term “or” as used herein, including the claims, is used to mean “and/or” unless explicitly indicated to refer to alternatives only or the alternatives are mutually exclusive.


The terms “comprise,” “have” and “include” are open-ended linking verbs. Any forms or tenses of one or more of these verbs, such as “comprises,” “comprising,” “has,” “having,” “includes” and “including,” are also open-ended. For example, any method that “comprises,” “has” or “includes” one or more steps is not limited to possessing only those one or more steps and can also cover other unlisted steps. Similarly, any composition or device that “comprises,” “has” or “includes” one or more features is not limited to possessing only those one or more features and can cover other unlisted features.


All methods described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The use of any and all examples, or exemplary language (e.g., “such as”) provided with respect to certain embodiments herein is intended merely to better illuminate the present disclosure and does not pose a limitation on the scope of the present disclosure otherwise claimed. No language in the specification should be construed as indicating any non-claimed element essential to the practice of the present disclosure.


Groupings of alternative elements or embodiments of the present disclosure disclosed herein are not to be construed as limitations. Each group member can be referred to and claimed individually or in any combination with other members of the group or other elements found herein. One or more members of a group can be included in, or deleted from, a group for reasons of convenience or patentability. When any such inclusion or deletion occurs, the specification is herein deemed to contain the group as modified thus fulfilling the written description of all Markush groups used in the appended claims.


Citation of a reference herein shall not be construed as an admission that such is prior art to the present disclosure.


Having described the present disclosure in detail, it will be apparent that modifications, variations, and equivalent embodiments are possible without departing from the scope of the present disclosure defined in the appended claims. Furthermore, it should be appreciated that all examples in the present disclosure are provided as non-limiting examples.


EXAMPLES

The following non-limiting examples are provided to further illustrate the present disclosure. It should be appreciated by those of skill in the art that the techniques disclosed in the examples that follow represent approaches the inventors have found function well in the practice of the present disclosure, and thus can be considered to constitute examples of modes for its practice. However, those of skill in the art should, in light of the present disclosure, appreciate that many changes can be made in the specific embodiments that are disclosed and still obtain a like or similar result without departing from the spirit and scope of the present disclosure.


Example 1
Positional Cloning of the Rhg1 Gene

The following example describes the positional cloning of the Rhg1 gene. Three genetic populations segregating for resistance to SCN PA3 (Hgtype 0) were used for mapping. These included an F2:6 recombinant inbred line (RIL) population from a cross between Forrest and Essex (98 individuals; Meksem et al., 2001), and two large F2 populations generated from crosses between Forrest and either Essex (1,755 lines) or Williams 82 (2,060 lines).


To enrich the chromosomal interval carrying the Rhg1 locus with recombinants, SCN phenotyping was conducted according to Brown et al. (2010). Because Forrest SCN-resistance requires both the Rhg1 and Rhg4 loci (Meksem et al., 2001), genotyping was conducted using DNA markers flanking both loci to detect informative recombinants at the Rhg1 locus. The SSR markers Sat210 and Satt309 (see, e.g., SoyBase and the Soybean Breeder's Toolbox at soybase.org) and SIUC-SAT143 were used to identify chromosomal breakpoints at the Rhg1 locus and the Rhg1 genotype of each recombinant. PCR amplifications were performed using DNA from individuals from each of the three genetic populations. To enrich the chromosomal regions carrying the Rhg1 locus with DNA markers, the GenBank published Williams 82 sequences were used to design PCR primers every 5 to 10 kbp across the 370 kbp carrying the Rhg1 locus. DNA from Forrest and Essex was tested with each primer using a modified EcoTILLING protocol to find and map polymorphic sequences at the Rhg1 locus (Meksem et al., 2008; Liu et al., 2011). The identified SNP and InDel DNA markers were integrated into the informative recombinants to identify chromosomal breakpoints and the interval that carried the Rhg1 locus. A high density genetic map was developed for the Rhg1 locus (see e.g., FIG. 1). Comparison of the SNAP gene sequences between Forrest and Essex identified some significant changes, including three SNPs (G2464C, T4206G and A4215C) and three InDels (G4211-, G4212- and G4213-) within the exons (see e.g., FIG. 1B).


Example 2
Relationship Between Genes and Resistance to SCN

The following example shows the relationship between the Glyma18g02570 (armadillo/beta-catenin-like repeat), Glyma18g02580 (amino acid transporter) and Glyma18g02590 (SNAP) genes and resistance to SCN. A haplotype map was developed using 4 DNA markers (560, 570, 590 and Satt309) at the Rhg1 locus and 1 DNA marker (Sat162) plus the Rhg4 GmSHMT gene at the Rhg4 locus, respectively. The Forrest genotype was classified resistant (R) and the Essex genotype was classified susceptible (S). Lines were classified resistant (R) to SCN if the female index (FI) was ≤10% and susceptible (S) if FI was >10% (see e.g., TABLE 2).
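By way of non-limiting illustration, the 10% female index cutoff can be sketched in Python. This passage does not state how FI is calculated; the sketch assumes the conventional definition (mean number of females on the test line divided by the mean number on a susceptible check, times 100). The function names and the example counts are hypothetical.

```python
def female_index(mean_females_test, mean_females_susceptible_check):
    """Female index (FI) in percent, under the assumed conventional definition."""
    return 100.0 * mean_females_test / mean_females_susceptible_check


def classify_scn(fi_percent, cutoff=10.0):
    """Classify a line as resistant (R) if FI <= cutoff, otherwise susceptible (S)."""
    return "R" if fi_percent <= cutoff else "S"


# Hypothetical counts: 12 females on the test line, 200 on the susceptible check.
fi = female_index(12, 200)
print(fi, classify_scn(fi))
```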









TABLE 2
Haplotype map of SCN resistance in soybean.

              SCN infection  Rhg1 locus                  Rhg4 locus
Plant line    phenotype      560   570   590   Satt309   GmSHMT   Sat_162
Forrest       R              R     R     R     R         R        R
Peking        R              R     R     R     R         R        R
PI437654      R              R     R     R     R         R        R
PI89772       R              R     R     R     R         R        R
PI90763       R              R     R     R     R         R        R
PI88788       R              R     R     R     S         S        R
PI546316      R              R     R     R     S         S        R
PI209332      R              R     R     R     S         S        S
Essex         S              S     S     S     S         S        S
Williams 82   S              S     S     S     S         S        S
PI603428C     S              S     S     S     R         S        R


In addition, a detailed haplotype analysis was conducted for the SNAP gene. The SNAP coding region from 11 soybean lines was sequenced, representing the SCN-resistance variability in soybean germplasm. The amino acid differences in the predicted protein sequences of SNAP from the 11 soybean lines are shown with the number indicating the amino acid position in the predicted protein (see e.g., FIG. 2B). Haplotyping results from these 11 soybean lines indicate three types of SNAP haplotypes. The data further indicate that there are at least two resistant types: Peking Type I, including Peking, Forrest, PI437654, PI89772 and PI90763, and PI88788 Type II, including PI88788, PI548316 and PI209332; and one susceptible Type III, including Essex, Williams 82 and PI603428C.
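By way of non-limiting illustration, the agreement between each marker in TABLE 2 and the SCN infection phenotype can be tallied with a short Python sketch. The genotype calls are transcribed from TABLE 2; the data structure and function name are hypothetical, and the tally is a simple count of matching lines, not a statistical test.

```python
MARKERS = ["560", "570", "590", "Satt309", "GmSHMT", "Sat_162"]

# plant line -> (SCN phenotype, marker calls in the order given in MARKERS)
TABLE_2 = {
    "Forrest": ("R", "RRRRRR"),
    "Peking": ("R", "RRRRRR"),
    "PI437654": ("R", "RRRRRR"),
    "PI89772": ("R", "RRRRRR"),
    "PI90763": ("R", "RRRRRR"),
    "PI88788": ("R", "RRRSSR"),
    "PI546316": ("R", "RRRSSR"),
    "PI209332": ("R", "RRRSSS"),
    "Essex": ("S", "SSSSSS"),
    "Williams 82": ("S", "SSSSSS"),
    "PI603428C": ("S", "SSSRSR"),
}


def concordance(table, markers):
    """Count, for each marker, the lines whose marker call equals the phenotype."""
    counts = {}
    for i, marker in enumerate(markers):
        counts[marker] = sum(
            1 for phenotype, calls in table.values() if calls[i] == phenotype
        )
    return counts


for marker, n in concordance(TABLE_2, MARKERS).items():
    print(f"{marker}: {n}/{len(TABLE_2)} lines match the phenotype")
```

In this tally, markers 560, 570 and 590 match the phenotype in all 11 lines, whereas Satt309, GmSHMT and Sat_162 match in 7, 8 and 9 lines, respectively.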


Example 3
Virus-Induced Gene-Silencing (VIGS)

The following example describes VIGS in soybean. The Bean pod mottle virus (BPMV) VIGS vectors pBPMV-IA-R1M and pBPMV-IA-D35 were used in this example (Zhang et al., 2010). pBPMV-IA-D35 is a derivative of pBPMV-IA-R2 containing BamHI and KpnI restriction sites between the cistrons encoding the movement protein and the large coat protein subunit. Briefly, a 328 bp fragment of the SNAP cDNA sequence was amplified from soybean (cv. Forrest) root cDNA by RT-PCR. PCR products were digested with BamHI and KpnI and ligated into pBPMV-IA-D35 digested with the same enzymes to generate pBPMV-IA-SNAP. Gold particles coated with plasmid DNA corresponding to pBPMV-IA-R1M and pBPMV-IA-SNAP were co-bombarded into soybean leaf tissue (Zhang et al., 2010). At 3-4 weeks post-inoculation, BPMV-infected leaves were collected, lyophilized, and stored at −20° C. for future experiments. Infected soybean leaf tissues were ground with a mortar and pestle in 0.05 M potassium phosphate buffer (pH 7.0) and used as virus inoculum for VIGS assays.


The SCN-resistant RIL ExF67 was inoculated with pBPMV-IA-SNAP (Glyma18g02590). Control plants were infected with BPMV only. Each treatment consisted of at least 12 plants. Unifoliate leaves of 9-day-old plants were rub-inoculated with virus using carborundum (Zhang et al., 2010). Plants were grown in a growth chamber set to the following conditions: 20-21° C., 16 h light/8 h dark, and 100 μE m−2 s−1 light intensity. A strong hypersensitive cell death-like response was observed in the leaves of plants infected with pBPMV-IA-SNAP (see e.g., FIG. 3). Infection also resulted in poor root development, which compromised the ability to conduct nematode infection assays on these plants.


Silencing of Glyma18g02590.1 in soybean leaves caused a strong hypersensitive cell death response (necrotic lesions) and compromised root growth. Consequently, the plants were not phenotyped against SCN (see e.g., FIG. 3).


SCN-resistance can be manifested at the site of nematode feeding as a strong hypersensitive response (HR) that leads to death of the feeding cell and nematode. Thus, these data indicate that interference in SNAP gene function in the resistant cultivar can mediate the SCN-resistance response.


Example 4
Near Isogenic Lines (NILs)

The following example describes additional evidence from Near Isogenic Lines (NILs). NILs that differ in SCN-resistance because of variations at the Rhg1 locus, but not the Rhg4 locus, were analyzed by genome resequencing and GoldenGate SNP analysis. SNPs in and around the three additional genes were found to be polymorphic (see e.g., TABLE 3) and therefore were identified as conferring SCN-resistance.
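By way of non-limiting illustration, the comparison of the two NILs can be sketched in Python using the genotype calls listed in TABLE 3. The data structure and function name are hypothetical, and "NC" is assumed here to denote a position with no genotype call, which is therefore excluded from the comparison.

```python
# (position on Gm18, call in EXF34-23 susceptible NIL, call in EXF34-32 resistant NIL)
TABLE_3_CALLS = [
    (1552671, "BB", "AA"),
    (1562162, "AA", "BB"),
    (1567581, "BB", "NC"),
    (1582570, "BB", "BB"),
    (1612017, "BB", "AA"),
    (1620585, "BB", "BB"),
    (1625693, "BB", "BB"),
    (1630870, "BB", "BB"),
    (1634453, "AA", "AA"),
    (1640404, "BB", "AA"),
    (1652357, "BB", "AA"),
    (1663298, "BB", "BB"),
    (1671483, "AA", "BB"),
    (1674972, "AA", "AA"),
    (1677273, "NC", "BB"),
]


def polymorphic_positions(calls, no_call="NC"):
    """Return positions where both NILs were called and the calls differ."""
    result = []
    for position, susceptible, resistant in calls:
        if no_call in (susceptible, resistant):
            continue  # skip positions lacking a call in either line (assumption)
        if susceptible != resistant:
            result.append(position)
    return result


print(polymorphic_positions(TABLE_3_CALLS))
```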









TABLE 3
NIL polymorphisms around the three additional genes underlying SCN-resistance.

Index   Name               Position   EXF34-23 Sus   EXF34-32 Res
43751   Gm18_1552671_A_G   1552671    BB             AA
43760   Gm18_1562162_G_A   1562162    AA             BB
43771   Gm18_1567581_G_A   1567581    BB             NC
43784   Gm18_1582570_T_C   1582570    BB             BB
43815   Gm18_1612017_G_A   1612017    BB             AA
43823   Gm18_1620585_T_C   1620585    BB             BB
43829   Gm18_1625693_A_G   1625693    BB             BB
43836   Gm18_1630870_C_A   1630870    BB             BB
43840   Gm18_1634453_G_A   1634453    AA             AA
43845   Gm18_1640404_C_A   1640404    BB             AA
43849   Gm18_1652357_C_T   1652357    BB             AA
43859   Gm18_1663298_A_G   1663298    BB             BB
43863   Gm18_1671483_A_G   1671483    AA             BB
43865   Gm18_1674972_C_T   1674972    AA             AA
43869   Gm18_1677273_T_G   1677273    NC             BB


Example 5
Sequencing Recombinant Inbred Lines

The following example describes additional evidence from sequencing recombinant inbred lines (RILs). RILs that differ in SCN-resistance because of variations at the Rhg1 locus, but not the Rhg4 locus, were analyzed by genome resequencing. SNPs in and around the three additional genes were found to be polymorphic (see e.g., TABLE 4) and therefore were identified as conferring SCN-resistance.
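The columns of TABLE 4 are not labeled in the source. By way of non-limiting illustration, the following Python sketch tests one possible reading, offered as an inference from the numbers only: that the fourth and fifth fields of each row are the number of reads supporting the called base (k) and the total number of reads (n), and that the sixth field is the binomial point probability of observing exactly k of n reads when both bases are equally likely. The function name is hypothetical.

```python
from math import comb


def binomial_point_probability(k, n, p=0.5):
    """Probability of exactly k successes in n trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# (k, n, value printed in TABLE 4) for a few rows of the first SNP block.
rows = [
    (17, 17, "0.000008"),  # Essex
    (8, 9, "0.017578"),    # RIL-2
    (22, 25, "0.000069"),  # RIL-44
    (19, 26, "0.009802"),  # RIL-46
    (20, 25, "0.001583"),  # RIL-74
]
for k, n, reported in rows:
    computed = f"{binomial_point_probability(k, n):.6f}"
    print(k, n, computed, reported, computed == reported)
```

Under this assumed reading the computed values agree with the tabulated values for the rows shown, which is consistent with, but does not establish, the interpretation given above.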









TABLE 4





SNPs among RILs from Illumina Sequencing.







>SNP_Gm18_15187890 Essex: A Forrest: G














1
SNP
A
17
17
0.000008
PARENT_A
Essex


2
COV
G
31
31
0.000000
PARENT_B
Forrest


3
SNP
A
19
19
0.000002
PARENT_A
RIL-1


4
COV
G
12
12
0.000244
PARENT_B
RIL-10


5
COV
G
19
19
0.000002
PARENT_B
RIL-11


6
SNP
A
33
33
0.000000
PARENT_A
RIL-12


7
SNP
A
15
15
0.000031
PARENT_A
RIL-13


9
COV
G
30
37
0.000075
PARENT_B
RIL-15


10
SNP
A
13
13
0.000122
PARENT_A
RIL-17


11
SNP
A
25
25
0.000000
PARENT_A
RIL-18


12
SNP
A
19
19
0.000002
PARENT_A
RIL-19


13
COV
G
8
9
0.017578
PARENT_B
RIL-2


14
COV
G
11
11
0.000488
PARENT_B
RIL-20


16
COV
G
24
24
0.000000
PARENT_B
RIL-22


17
COV
G
21
21
0.000000
PARENT_B
RIL-23


18
COV
G
18
18
0.000004
PARENT_B
RIL-24


19
COV
G
10
10
0.000977
PARENT_B
RIL-25


20
SNP
A
8
8
0.003906
PARENT_A
RIL-26


21
COV
G
33
33
0.000000
PARENT_B
RIL-27


22
COV
G
19
19
0.000002
PARENT_B
RIL-28


24
COV
G
15
15
0.000031
PARENT_B
RIL-37


25
SNP
A
6
6
0.015625
PARENT_A
RIL-38


26
SNP
A
8
8
0.003906
PARENT_A
RIL-4


27
SNP
A
28
28
0.000000
PARENT_A
RIL-40


28
COV
G
18
18
0.000004
PARENT_B
RIL-41


29
COV
G
23
23
0.000000
PARENT_B
RIL-42


30
COV
G
22
25
0.000069
PARENT_B
RIL-44


31
COV
G
19
26
0.009802
PARENT_B
RIL-46


32
SNP
A
28
28
0.000000
PARENT_A
RIL-47


34
COV
G
9
9
0.001953
PARENT_B
RIL-49


35
SNP
A
15
15
0.000031
PARENT_A
RIL-51


37
SNP
A
53
53
0.000000
PARENT_A
RIL-53


38
COV
G
39
39
0.000000
PARENT_B
RIL-54


39
COV
G
22
25
0.000069
PARENT_B
RIL-55


40
COV
G
26
26
0.000000
PARENT_B
RIL-56


41
COV
G
25
25
0.000000
PARENT_B
RIL-57


42
COV
G
34
34
0.000000
PARENT_B
RIL-58


43
SNP
A
6
6
0.015625
PARENT_A
RIL-6


44
SNP
A
27
27
0.000000
PARENT_A
RIL-67


45
SNP
A
73
73
0.000000
PARENT_A
RIL-7


46
SNP
A
17
17
0.000008
PARENT_A
RIL-72


47
COV
G
20
25
0.001583
PARENT_B
RIL-74


48
SNP
A
31
31
0.000000
PARENT_A
RIL-75


50
COV
G
23
23
0.000000
PARENT_B
RIL-79


51
COV
G
40
40
0.000000
PARENT_B
RIL-8


52
SNP
A
43
43
0.000000
PARENT_A
RIL-80


53
COV
G
53
53
0.000000
PARENT_B
RIL-84


54
SNP
A
26
26
0.000000
PARENT_A
RIL-85


55
COV
G
40
40
0.000000
PARENT_B
RIL-89


56
SNP
A
47
47
0.000000
PARENT_A
RIL-9


57
COV
G
26
26
0.000000
PARENT_B
RIL-91


58
COV
G
34
34
0.000000
PARENT_B
RIL-92


59
COV
G
28
28
0.000000
PARENT_B
RIL-94


60
COV
G
27
27
0.000000
PARENT_B
RIL-95


61
SNP
A
37
37
0.000000
PARENT_A
RIL-96







>SNP_Gm18_1608702 Essex: A Forrest: T














1
SNP
A
58
58
0.000000
PARENT_A
Essex


2
COV
T
114
114
0.000000
PARENT_B
Forrest


3
COV
T
58
58
0.000000
PARENT_B
RIL-1


4
COV
T
42
65
0.006155
PARENT_B
RIL-10


5
COV
T
38
44
0.000000
PARENT_B
RIL-11


6
COV
T
70
70
0.000000
PARENT_B
RIL-12


7
COV
T
44
46
0.000000
PARENT_B
RIL-13


8
SNP
A
47
47
0.000000
PARENT_A
RIL-14


9
COV
T
41
72
0.047140
PARENT_B
RIL-15


10
COV
T
51
58
0.000000
PARENT_B
RIL-17


11
SNP
A
46
46
0.000000
PARENT_A
RIL-18


12
SNP
A
52
52
0.000000
PARENT_A
RIL-19


13
COV
T
27
27
0.000000
PARENT_B
RIL-2


14
COV
T
30
30
0.000000
PARENT_B
RIL-20


15
COV
T
65
65
0.000000
PARENT_B
RIL-21


16
SNP
A
77
77
0.000000
PARENT_A
RIL-22


17
COV
T
30
30
0.000000
PARENT_B
RIL-23


18
COV
T
53
53
0.000000
PARENT_B
RIL-24


19
COV
T
30
30
0.000000
PARENT_B
RIL-25


20
COV
T
28
36
0.000440
PARENT_B
RIL-26


21
SNP
A
111
111
0.000000
PARENT_A
RIL-27


22
SNP
A
23
23
0.000000
PARENT_A
RIL-28


23
COV
T
29
29
0.000000
PARENT_B
RIL-36


24
COV
T
25
25
0.000000
PARENT_B
RIL-37


25
COV
T
37
37
0.000000
PARENT_B
RIL-38


26
COV
T
32
41
0.000159
PARENT_B
RIL-4


27
COV
T
72
72
0.000000
PARENT_B
RIL-40


28
SNP
A
49
49
0.000000
PARENT_A
RIL-41


29
SNP
A
51
51
0.000000
PARENT_A
RIL-42


30
COV
T
57
57
0.000000
PARENT_B
RIL-44


32
COV
T
47
47
0.000000
PARENT_B
RIL-47


33
SNP
A
18
18
0.000004
PARENT_A
RIL-48


34
SNP
A
40
40
0.000000
PARENT_A
RIL-49


35
SNP
A
24
24
0.000000
PARENT_A
RIL-51


36
SNP
A
87
87
0.000000
PARENT_A
RIL-52


37
SNP
A
99
99
0.000000
PARENT_A
RIL-53


38
SNP
A
88
91
0.000000
PARENT_A
RIL-54


39
COV
T
28
43
0.017227
PARENT_B
RIL-55


40
COV
T
59
59
0.000000
PARENT_B
RIL-56


41
COV
T
79
80
0.000000
PARENT_B
RIL-57


43
SNP
A
27
27
0.000000
PARENT_A
RIL-6


44
COV
T
91
91
0.000000
PARENT_B
RIL-67


45
COV
T
148
148
0.000000
PARENT_B
RIL-7


46
SNP
A
50
50
0.000000
PARENT_A
RIL-72


47
COV
T
65
65
0.000000
PARENT_B
RIL-74


48
COV
T
74
83
0.000000
PARENT_B
RIL-75


49
SNP
A
124
124
0.000000
PARENT_A
RIL-76


50
SNP
A
78
78
0.000000
PARENT_A
RIL-79


51
SNP
A
83
83
0.000000
PARENT_A
RIL-8


52
SNP
A
103
103
0.000000
PARENT_A
RIL-80


53
COV
T
102
102
0.000000
PARENT_B
RIL-84


54
SNP
A
70
70
0.000000
PARENT_A
RIL-85


55
COV
T
121
121
0.000000
PARENT_B
RIL-89


56
SNP
A
123
123
0.000000
PARENT_A
RIL-9


57
SNP
A
73
73
0.000000
PARENT_A
RIL-91


58
SNP
A
81
81
0.000000
PARENT_A
RIL-92


59
SNP
A
79
79
0.000000
PARENT_A
RIL-94


60
COV
T
48
49
0.000000
PARENT_B
RIL-95


61
SNP
A
124
124
0.000000
PARENT_A
RIL-96







>SNP_Gm18_1608832 Essex: G Forrest: A














1   SNP  G   5   5  0.031250  PARENT_A  Essex
2   COV  A   9   9  0.001953  PARENT_B  Forrest
3   COV  A  14  14  0.000061  PARENT_B  RIL-1
4   COV  A  17  19  0.000326  PARENT_B  RIL-10
5   COV  A   9  10  0.009766  PARENT_B  RIL-11
6   COV  A  31  31  0.000000  PARENT_B  RIL-12
7   COV  A  11  12  0.002930  PARENT_B  RIL-13
8   SNP  G  11  11  0.000488  PARENT_A  RIL-14
10  COV  A  21  22  0.000005  PARENT_B  RIL-17
11  SNP  G  18  18  0.000004  PARENT_A  RIL-18
12  SNP  G   5   5  0.031250  PARENT_A  RIL-19
14  COV  A  11  11  0.000488  PARENT_B  RIL-20
15  COV  A  19  19  0.000002  PARENT_B  RIL-21
16  SNP  G  20  20  0.000001  PARENT_A  RIL-22
17  COV  A  10  10  0.000977  PARENT_B  RIL-23
18  COV  A  20  20  0.000001  PARENT_B  RIL-24
19  COV  A  11  11  0.000488  PARENT_B  RIL-25
21  SNP  G  31  31  0.000000  PARENT_A  RIL-27
22  SNP  G  17  17  0.000008  PARENT_A  RIL-28
24  COV  A   9   9  0.001953  PARENT_B  RIL-37
28  SNP  G  20  20  0.000001  PARENT_A  RIL-41
29  SNP  G  16  16  0.000015  PARENT_A  RIL-42
30  COV  A  11  11  0.000488  PARENT_B  RIL-44
32  COV  A  15  15  0.000031  PARENT_B  RIL-47
33  SNP  G   5   5  0.031250  PARENT_A  RIL-48
36  SNP  G  13  13  0.000122  PARENT_A  RIL-52
37  SNP  G  38  39  0.000000  PARENT_A  RIL-53
38  SNP  G  38  40  0.000000  PARENT_A  RIL-54
40  COV  A  18  18  0.000004  PARENT_B  RIL-56
41  COV  A  26  26  0.000000  PARENT_B  RIL-57
43  SNP  G  14  14  0.000061  PARENT_A  RIL-6
44  COV  A  28  28  0.000000  PARENT_B  RIL-67
45  COV  A  44  44  0.000000  PARENT_B  RIL-7
46  SNP  G  24  24  0.000000  PARENT_A  RIL-72
47  COV  A  11  11  0.000488  PARENT_B  RIL-74
48  COV  A  18  18  0.000004  PARENT_B  RIL-75
49  SNP  G  39  39  0.000000  PARENT_A  RIL-76
50  SNP  G  19  19  0.000002  PARENT_A  RIL-79
51  SNP  G  28  28  0.000000  PARENT_A  RIL-8
52  SNP  G  34  34  0.000000  PARENT_A  RIL-80
53  COV  A  30  30  0.000000  PARENT_B  RIL-84
54  SNP  G  25  25  0.000000  PARENT_A  RIL-85
55  COV  A  42  42  0.000000  PARENT_B  RIL-89
56  SNP  G  29  29  0.000000  PARENT_A  RIL-9
57  SNP  G  26  26  0.000000  PARENT_A  RIL-91
58  SNP  G  28  28  0.000000  PARENT_A  RIL-92
59  SNP  G  30  30  0.000000  PARENT_A  RIL-94
61  SNP  G  28  28  0.000000  PARENT_A  RIL-96
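The sixth value in each row above appears to equal the binomial probability of observing exactly the first read count out of the second when both alleles are equally likely, that is C(n, k) × 0.5^n. This reading is inferred from the tabulated numbers (for example 5 of 5 gives 0.031250 and 17 of 19 gives 0.000326) and is not stated in the text; the function name below is hypothetical. A minimal sketch under that assumption:

```python
from math import comb


def binomial_pmf_half(k: int, n: int) -> float:
    """Probability of exactly k successes in n trials with p = 0.5.

    Assumption: k is the number of reads carrying the called allele and
    n is the total read depth at the SNP position.
    """
    return comb(n, k) * 0.5 ** n


# Read counts copied from three rows of the SNP_Gm18_1608832 table.
rows = [("Essex", 5, 5), ("RIL-10", 17, 19), ("RIL-11", 9, 10)]
for name, k, n in rows:
    print(f"{name}\t{k}/{n}\t{binomial_pmf_half(k, n):.6f}")
```

Under this reading, rows with deeper coverage round to 0.000000 at six decimal places, which matches the many zero entries in the table.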









Example 6
Targeted Genome Enrichment and SNAP Identification

The following example describes additional evidence, obtained by next-generation sequencing of a targeted 300 kb region of Gm18 in soybean, for the identification of the Rhg1 gene, Glyma18g02590 (SNAP), which confers resistance to SCN.
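The per-position strings in Table 6 use the standard IUPAC nucleotide codes, in which R, Y, K, M, S, and W each stand for a pair of bases (for example R = A or G). The sketch below tallies one such string. It assumes that each character is the call for one sequenced sample and that "-" marks a position with no call; neither assumption is stated in the table, and the function name is hypothetical.

```python
from collections import Counter

# Standard IUPAC nucleotide codes (only those that occur in Table 6).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "K": "GT",
    "M": "AC", "S": "CG", "W": "AT",
}


def summarize_calls(calls: str) -> dict:
    """Count unambiguous bases, two-base (ambiguous) calls, and gaps."""
    bases = Counter()
    ambiguous = 0
    missing = 0
    for ch in calls:
        if ch == "-":
            missing += 1        # assumption: '-' means no call at this position
        elif len(IUPAC[ch]) == 1:
            bases[ch] += 1      # a single, unambiguous base
        else:
            ambiguous += 1      # a two-base code such as R (A or G)
    return {"bases": dict(bases), "ambiguous": ambiguous, "missing": missing}


# String copied from the Table 6 row at position 1556277.
print(summarize_calls("GAGAGGGGGGGAAAAGAAGAGG-GGRGGGGAG"))
```

A position at which the unambiguous calls split into two groups, as in this row, is the pattern that distinguishes one haplotype from another across the samples.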









TABLE 5

SNPs, insertions, and deletions at the targeted 300 kb region of Gm18 (Gm18: 1480001-1780000) in Essex, Forrest, Peking, and PI88788.

            Essex  Forrest  Peking  PI88788  Total
SNPs          632      618     649      736   1081
Insertions    109      120     123      120    183
Deletions     146       97     100      165    208
Total         887      835     872     1021   1472

















TABLE 6

Gm18 SNAP-cluster SNPs.

Position
Sequence
SEQ ID NO:





1552671
AGAG-AGGAA-GGG-AG-AGAAGA--AAAAGA
   8





1552732
TGGGGG-G--KGGGGGGGGGTGGTTGGTGTGG
   9





1552753
AGAGAAGGA-AGGGGAGGAGAAGAARAAA-GA
  10





1552799
CCCCCCAACCCCCC-CCCCCCCCCCCCCCCCC
  11





1553174
GAGAGGGGGGGAAAAGAAGGGGGGGGGGGGGG
  12





1553377
CTCTCCCTCCCTTTTCTTCTCCTCCYCCCCTC
  13





1553485
TCTCTTTTTTTCCCCTCCTTTTTTTTTTTTTT
  14





1553949
ACACAAAAAAACCC-ACCAAAAAAAAAAAAAA
  15





1554124
GGGGGG-GGGGGGGGGGGGTG-TG--GGGGT-
  16





1554570
AAAAAACCAAAAAAA-AAAAAAAA-AAAAAAA
  17





1554604
AAAAAAGGAAAAAAA-AARAAAAA-AAA-AAA
  18





1554733
T-TWTT-TTTTWWT--T--TTTTA-TTTTTTT
  19





1554848
TTTTTTCCTTTTTTT-TTTTTTTTTTTTTTTT
  20





1554938
TATAT-AATTTAAAATAA-AT-ATTATTTT-T
  21





1554942
AAAAAAAAAA-AAAAAAA-AAAAAAAWAAA-W
  22





1555085
ACAC-ACCAAACCCC--CACAACAAM---ACA
  23





1555204
AT-TAAT-AAA-TTTATTATAATAA-AAAATA
  24





1555236
AAAAAAAAAAA-AAAAA-ATAATAAAAAA-TA
  25





1555481
A-A-AA---AA-AA-AAAA-AA-AAWAA--TA
  26





1555562
AAA-AAAMAAA-AA----A-A-M-A--AA---
  27





1555572
A--TAA-AAAA----AT--AA-A-A--A----
  28





1555739
ACACAAA-A-A-CCC-CCAAAAAA-AAAA-A-
  29





1555772
G-G-GGG-G-G-TTTG--GGGGGGGGGGG-GG
  30





1556277
GAGAGGGGGGGAAAAGAAGAGG-GGRGGGGAG
  31





1556372
GAGAGGGGGGGAAA-GAAG-GGGGGGGGGGGG
  32





1556588
GCGCGGGGG-GCCCCGCCGGGGGGGGGGGGGG
  33





1556678
AA-AGG-GAGRAAAAGAAGAAAAAARGAGAAG
  34





1556781
TCTCTTCCTTTCCC-TCCTTTTTTTTTTTTTT
  35





1557165
CTTTTTTTCTYTTTTTT-TTCCTTCTTCTCTT
  36





1557549
AAAAAWA-A-AAAAAAAAAAAATAAWTA--A-
  37





1557751
AGA-AAAAAAA--G-A-GAAAAAAAAAAAAAA
  38





1557752
CGC-CCCCCCC--G-C-GCCCCCCCCCCCCCC
  39





1557771
TATATTTTTTT--A-TA-TTTTTTTTTTTTTT
  40





1557934
GGRGGAGGGGGGGGGGGGAGGGGGGRAGAGGA
  41





1557991
TTTTTTTTTTTTTTTTT-TTTTTCTTTTTTCT
  42





1558100
AAAAAAAAAAAAAAAAAAATAATTAWAAAA-A
  43





1558103
CCTCTTCCCCCCCCCTCCTTCCTTCTTCTC--
  44





1558128
GG-GGGG-GTGGGGT-GG--GGGGGG-GGG--
  45





1558129
TT-TTTT-TGTTTTG-TT--TTTTTT-TTT--
  46





1558137
T----TT-TTT-ATT-----TT-TT--TWT--
  47





1558318
AA-ATTA-AAWAAAAT-ATTAATTA--A-AT-
  48





1558319
AA-AATA-AAAAAAAA-A-TAATTA--A-AT-
  49





1558322
AAAAT-A-AAWAAAAT-A-AAATAA--A-AA-
  50





1558323
TTATA-T-TTWTTTTA-T-TTTA-T--T-TT-
  51





1558334
TTGTGGT-TTKTTTT--TGGTTGGT--T-TGG
  52





1558551
TCTCTTCCTCTCCCCTCCTTTTTTTYTTTTTT
  53





1558913
AAAAAGAAAAAAAAAAAAGAAAAAAR-AGAAG
  54





1558979
GGAGAGGGGGGGGGG--GGGGGGGGGGGGGGG
  55





1559151
ATTTTATTAT-TTTTTTTATAATT-WAAAATA
  56





1559399
ATATAA-AAAATTTTATTAAAAAAAAAAAAAA
  57





1559585
CACA-CCCCCCAAAACAACCCCCCCC-CCCCC
  58





1559603
CCCC-T-CCCCCCCCCCCTCCCCC-TTCTCCT
  59





1559659
GAGAGA-AGAGAAAAGAAAAGGAAGAAGAGAA
  60





1559787
AAAAACAAAAAAAAAAAACAAAA-AMC-CAAC
  61





1559970
TTKTTG-TTTT-TTTTTTGTTTTTTKGTGTTG
  62





1560043
TTATAT-TTATTTTTATTTTTTTT-TTTTTTT
  63





1560088
CCCCCCTCCTCCCCCCCCCCCCCCCCCCCCCC
  64





1560108
TGTGTTTTTTTGGGGTGGTTTTT-TTTTTTTT
  65





1560166
GGAGAGAGR-GGGGGAGGG-GGGGGGGGGGG-
  66





1560182
TTGTGTGTT-TTTTT-TTTTTTTTTTTTTTTT
  67





1560390
AAMACACAA-AAAAAC-AAAAAAAAAAAA-AA
  68





1560442
TTTTTTATTATTTTTTTT-TTTTTTTTTTTTT
  69





1560517
GARAGAAAGAGAAAAGAAAGGGAA-AAGAGAA
  70





1560584
CAMACA-ACCCAAAACAAACCCAA-AACACAA
  71





1560705
ACCMCCCCAC-CCCCCCCCAA-CCACCACACC
  72





1560784
AAAAAAG--G-AAAAAAAAAAAAAAAA-AAAA
  73





1560860
CTCTCTYTCCCTTTTCTTTCCCTTCTTCTCTT
  74





1561009
GGGGGGAGGA-GGGGG-GGGGGGGGGGGGGGG
  75





1561036
ACCCCCCCACACCCCCCCCAAACCACCACACC
  76





1561047
AAGAGAAAAAAAAAAGAAAAAAAA-AAAAAAA
  77





1561190
TTTTTTATTATTTT-TTTTTTTTTTTTTTTTT
  78





1561230
A-GAGAAAAAAAAAAGAA-AAAAA-AAA-AAA
  79





1561392
AA-AAAAAAA-AAAAAAAAAAAGG-GAAAAGA
  80





1561412
CC-CACCCCCCCCC-ACCCCCCCCC-C-CCCC
  81





1561429
CC-CTCTCCTCCCCCTCCCCCCCCC--CCCCC
  82





1561461
AARAGAAAAAAAAAAGAAAAAA-AAAAAA-AA
  83





1561493
AAAAAAAAAAAAAAA-AAAAAAGGAGAAA-GA
  84





1561533
CC-C-C--C--CCC---CCCCCAA--CCC-AC
  85





1561651
CT-T-CTC-TCTTTT-TTCCCCCCC-CCCCCC
  86





1561706
CC-CTCC-CCCCCCCTCC-CCCCCC-CCC-CC
  87





1561766
T----T-TT-T-AA-AA-TTTT--TTTTTT-T
  88





1561783
AAAA-A-AA-A-AA-A-AAAAAGGARAAAA-A
  89





1561792
TTCT-T-TTTT-TTTC-TT-TTTTTTTTTT-T
  90





1561828
ACGCAA-AACA-CCCGCCAAAACCAAAAAACA
  91





1561849
GGAGAGGGGGGGGGGAGGGGGGGGGGGGGG-G
  92





1561866
CC-CACCCCCCCCCCACCCCCCCCC-CCCC-C
  93





1561983
GGG-GG-GGGGARGR-R-GGGGGGGGGGGGGG
  94





1561989
ATTATA-AAAAWWWATW-AAAAAAAAAAAAAA
  95





1562128
GGGGSG-GG-GGGGG-GGGGGGSGGGGGGGGG
  96





1562155
AATATATAA-WAAAATAAAAAATTAWAAAATA
  97





1562239
CACACC-CC-CAAAACA-CCCCCCCCCCCCCC
  98





1562453
CCCCCTCTC-YCCCCCCCTCCCCCCTTCTCCT
  99





1562660
GGGGG--CGGCGGG-G--CGGGGGGSCGCGGC
 100





1562719
ATTTT-TTA-TTTTTTT-TAAAAAAWTATAAT
 101





1562751
GTGTGGGGG-GTTTTGT-GGGGGGGGGGGGGG
 102





1562768
GGSGCGGGG-GGGGGCG-GGGGGGGGGGGGGG
 103





1562844
AG-GGG-GAGRGGG-GGGGAAAAAARGAGAA-
 104





1562877
A-ATAWAWAAAW--AAA-AAAAAAAAWAAAAA
 105





1562884
A-A-AAGAAGAAAAAAA-AAAAAAAAAAAAAA
 106





1563239
TTTTTKT-TTTTTTTTTTTTTTTTTK-TKTT-
 107





1563245
GGGGGTGTGGGGGGGGGGTGGGGGGK-GTGG-
 108





1563541
AGGGGG-GA-AGGGGGGGGAAAAAARGAGAAG
 109





1563768
AAAAACACAAAAAA-AAACAAAAAAMCACAAC
 110





1563924
GTTTTTTTGTKTTTTTTTTGGGGGGKTGTGGT
 111





1564092
AAAAAATAATAAAAAAA-AAAAAAAAAAAAAA
 112





1564318
GTGTGGGGGGGTTTTGTTGGGGGGGGGGGGGG
 113





1564390
CTCTCCCCCCCTTTTCTTCCCCCCCCCCCCCC
 114





1564756
AACAAAAAAAAAAAACAAAAAAAAAAAAAAAA
 115





1564816
CCMCACCCCCMCCCCACMCCCCCCCCCCCCCC
 116





1565451
CCTCTTC-CCYCCCCTCCTCCCCCCYTCTCCT
 117





1565457
TTGTGGT-TTKTTTTGTTGTTTTTTKGTGTTG
 118





1565592
GGTGTTGTGGTGGGGTGGTGGGGGGKTGTGGT
 119





1565646
TTTTTTGTTGTTTTTTTTTTTTTTTTTTTKTT
 120





1565826
TTAT-A-AT-WTTTTAT-ATTTTTTWATATTA
 121





1565857
CT-T----C-CTT---T--CCCCCCC-C-CC-
 122





1565858
TC-C----C-C-C---C--CTCCCCC-C-CC-
 123





1565931
A------AA-AG---G---AAAAAA--A-AA-
 124





1565944
T--C----T-TCC--T---TTTTTT--T-TT-
 125





1566440
C--CG---C-CCCC-CCC-CCCCCC-SCSCCS
 126





1566449
G-GTG---G-GTT---TT-GGGGGGGGGKGGG
 127





1566453
CCTCTC--CCYCCC-TCCCCCCCCCCCCCCCC
 128





1566550
GGCGCCGCGGSGGGGCGGCGGGGGGSCGCGGC
 129





1566626
GGKGTGGTGGKGGGGTG-GGGGGGGGGGGGGG
 130





1566638
TTKTGTTTTTKTTTTGT-TTTTTTTTTTTTTT
 131





1566726
GGTGTTGTGGKGGGGTGGTGGGGGGKTGTGGT
 132





1566728
TTCTCCTCTTYTTTTCTTCTTTTTTYCTCT-C
 133





1566793
TTCT-CTCTTTTTTTCTTCTTTTTTYCTCTTC
 134





1566821
T-CT--TCTTTTTTTCT-CTTTT-T--TCTTC
 135





1566823
A-CC--CCACACCCCCC-CAAAA-A--ACAAC
 136





1566867
T-CT-CT-TTTTTTTCT--TTTTT---T-TT-
 137





1566882
C-CC-CT-CTCCCCCCCC-CCCCC---CCCC-
 138





1566890
G-AG-AG-GGGGGGGAGG-GGGGGGG-GAGG-
 139





1566891
C-CT-CC-CCCTTTTCTT-CCCCCCC-CCCC-
 140





1566906
TTTT-YT-TTTTTTTTTT-TTTTTTTC-YTTC
 141





1566911
GG-G-GA-GAGGGGG-GG--GGGGGGG-GGGG
 142





1566963
TC-C----T-TCC---C--TTTTTTT-T-TT-
 143





1566964
TC-C----T-TCC---C--TTTTTTT-T-TT-
 144





1566975
T--T----TCTCCCC-C--TTTTTTT-T-TT-
 145





1567049
GGAGGAGGG-GGGGGGGGAGGGGG-AAGAGGA
 146





1567111
TATATTTTT-TAAAATAA-TTTTTTTTTT-TT
 147





1567133
GGAG-AGAGGRGGGGAGGAGGGGGGRAGA-GA
 148





1567183
CCTCTTCTCCYCCCCTC-TCCCCCCTTCTCCT
 149





1567261
TTCTCTTCT--TTTTCTTTTTT-TTTTTTTTT
 150





1567327
GGAGAAGAGGRG-GGAGGAGGGGGGRAGAGGA
 151





1567332
ACCC-CCCACMC-CCCCCCAAAAAAMCACAAC
 152





1567385
ACACAAAAAAACCCCACCAAAAAAAAAAAAAA
 153





1567427
G-GGGG-GG-GGTKKGGGGGGGGGG-G-GGKG
 154





1567428
T-TTTT-KT-TTGKKTT-TTTTTTT-T-TTKT
 155





1567438
TTCTC--TT-TTTTTCT--TTTTTT--TCTT-
 156





1567439
TTCTC---T-TYTTYCT--TTTTYT--TCTY-
 157





1567449
TTGTGGTTTTTTTTTGT--TTTTTT-GTGTTG
 158





1567459
CCACAACCCCCCCCCAC--CCCCCC-ACACCA
 159





1567536
GAGAGG-GGGGAAAAGAAGGGGGGGGGGGGGG
 160





1567581
GGGGGA-GG-GGGGGGGGAGGGGG-GAGAGGA
 161





1567585
CCTCTT-CC-YCCCCTCC-CCCCC-CTCTCCT
 162





1567588
TTTTTCTTT-TTTTTTTT-TTTTT-TCTCTT-
 163





1567602
AGGGGGGGA-RGGGGGGGGAAAAAARGAGAAG
 164





1567636
AAGAGAAAAAAAAAAGAAAAAAAAAAAAAAAA
 165





1567642
TTTTTATTTTTTTTTTTTATTTTTTWATATTA
 166





1567648
AAGAGGAAAAAAAAAGAAGAAAAAARGAGAAG
 167





1567659
GGGGGAGGGGGGGGGGGGAGGGGGGRAGAGGA
 168





1567661
CCCCCCTCCTCCCCCCCCCCCCCCCCCCC-CC
 169





1567665
CC-CCCTCCTCCCCCCCCCCCCCCCCCCC-CC
 170





1567673
GG-GAGGGGGRGGGGAGGGGGGGGGGGGGGGG
 171





1567679
TTCTCCTTTTYTTTTCTTCTTTTTTTCTCTTC
 172





1567685
CCCCCCTCCTCCCCCCCCCCCCCCCCCCCCCC
 173





1567691
TTTTTGTTTTTTTTTTTTGTTTTTTTGTGTTG
 174





1567714
AAAAAGAAAAAAAAAAAA-ARAAAAAGAGAAG
 175





1567716
CCCCCTCCCCCCCCCCCC-CCCCCCCTCTCCT
 176





1567728
TTYT-CTTTTTTTTTCTT-TTTTTTTCTCTT-
 177





1567768
CCTCTC-CCC-CCCCTCCCCCCCCCCCCCCCC
 178





1567770
CCCCCT-CCC-CCCCCCCTCCCCCCYTCTCCT
 179





1567788
CCCCCT-CCCCCCCCCC-TCCCCCCYTCTCCT
 180





1567986
ATTTTT-TATWTTTTTTTTAAAAAAWTATAAT
 181





1568005
AGGGG-AAAARGGGGGGGGAAAAAAR-AGA-G
 182





1568012
GGGGG-GGGGGGGGGGGG-GGGGGGG-GAGGA
 183





1568019
AAAAA-AAAAWAAAAAAA-AAAAAAA-ATAA-
 184





1568021
CTCTC-CCCCCTTTTCTT-CCCCCCC-C-CC-
 185





1568085
ATTTT-AAAAWTTTTTTT-AAAAWAA-A-AA-
 186





1568120
AAGAG-AAAAAAAAAGAA-AAAAAAA-A-AA-
 187





1568124
CCTCT-CCCCSCCCCTCC-CCCCCCC-C-CC-
 188





1568168
AAAAA-AAAAAAAAAAAAAAAAAAAM-ACAAC
 189





1568196
GGRGAGGGGGGGGG-AGGGGGGGGGGGGGGGG
 190





1568214
AARAAGAAAARAAAAAAAGAAAAAARGAGAAG
 191





1568478
GGGGGCGGGGSGGGGGGGCGGGGGGSCGCGGC
 192





1568490
AAAAAACAACAAAAAAAAAAAAAAAAAAAAAA
 193





1568548
TTWTTATTTTTTTTTTT-ATTTTTTWATATTA
 194





1568634
TTTTTATTTTTTTTTTTTATTTTTTTA-ATTA
 195





1568727
TTTTTGTTTTTTTTTTTTGTTTTTTTGTGTTG
 196





1568784
CCCCCG-CCCCCCCCCC-GCCCCCCCG--CC-
 197





1568820
GGGGGA-GGGGGGGGGG-AGGGGGGGAGAGG-
 198





1568826
AAAAAG-AAAAAAAAAA-GAAAAAAAGAGAA-
 199





1568868
CCCCCACCCC-CCCCCC-ACCCCCCCA-A-C-
 200





1568870
TTTTTGTTTT-TTTTTT-GTTTTTTTG-G-T-
 201





1568916
G--G-G--G-GG---GG--GGGAA-G-GGGA-
 202





1568919
G--K-T--G-GG---GG--GGGGG-G-G-GG-
 203





1568929
A-GAG---AGAAA-AGA--AAAAA-A-A-AA-
 204





1568939
G-GGG---GGGGG-GGG--GGGGG-G-G-GGA
 205





1568952
C-CTCC--CCCTT-TCT-CCCCCC-CCC-CCC
 206





1568963
G-GGGA--GARGGGG-G-AGGGGG-GAG-GGA
 207





1569035
GG-G-TTGGTKGGGG-GG-GGGGGG--G-GG-
 208





1569058
CTCTCCCCC-CTTTTCT--CCYCCC-CCCCCC
 209





1569074
TTTTTCTTTTTTTTTTTTCTTTTT--CTCTTC
 210





1569093
AAAAACAAAAAAAAAAAACAAAAA-ACACAAC
 211





1569128
CCCCCTCCCCCCCCCCCC-CCCCC-CTCTCC-
 212





1569136
TTTTTCTTTTTTTTTTTT-TTTTTTTCTCTT-
 213





1569138
TTCTCTTTTTTTTTTCTT-TTTTTTTTTTTT-
 214





1569140
AAAWATAAAAAAAAAAAA-AAAAAAATATAA-
 215





1569146
CCCCCCTCCTCCCCCCCC-CCCCCCCCCCCC-
 216





1569167
AGGGG-AGAARGGGGG-G-AAAAAAA-AGAA-
 217





1569185
GGAGA-GGGGRGGGGA-G-GGGGGGG---GG-
 218





1569190
CTCT--TCC-CTTTTC-T-CCCCCCC---CC-
 219





1569227
CAA-----C-CA--A----CC-CCCC---CC-
 220





1569374
TCCCCCC-T--CCC-CCCCTT-TTT--T-TT-
 221





1569395
GGGGGAG-GGGGGG-GGGAGGGGGG--GAGG-
 222





1569396
GGAGAGG-GGGGGG-AGGGGGGGGG--GGGG-
 223





1569405
GGGGGTG-GGGGGG-GGGTGGGGGGT-GTGGT
 224





1569441
TTTTTTTTTTTTTTTTTTTTTTTTT-GTGTT-
 225





1569442
GGGGGKGGGGGGGGGGGGGGGGGGG-TGTGG-
 226





1569557
TTWTATTTT--TTTTTT-TTTTTTTWTT-TTA
 227





1569564
AAWAA-AAA-AAAAAWA-TAAAAAAAWA-AAA
 228





1569704
TTWTATTTTTTTTTTAT-TTTTTTTTTTTTTT
 229





1569788
C-CCCGCCCCCCCCCCCCGCCCCCC-GCGCCG
 230





1569791
G-GGGTGGGGGGKKGGGG-GGGGGG--GTGGT
 231





1569794
T-TTTWTTTTTTTTTTTTATTTTTT-ATTTTT
 232





1569797
A-AAATAAAAAWWWAAWAGAAAAAA-AATAAT
 233





1570104
AAAA--AA--AAAA-AA--AAAAAA--A-AAT
 234





1570126
GGGG-CGGG-GGGG-GG-CGGGGGGSCGCGGC
 235





1570416
GGTGTGGGGGGGGGGTGGGGGGGGGGGGGGGG
 236





1570660
TTTTTTCTTCTTTTTTTT-TTTTTTTTTTTTT
 237





1570881
TTTWTT-T-TTWWWATWW--TTTTTTTTTTTT
 238





1571274
TTYTTC-TTTTTTT-TTTCTTT-TTYCTCTTC
 239





1571487
TCCCCCCCTCYCCCCCC-CTTTTTTYCTCTTC
 240





1571774
AGGGGGGGA-GGGGGGGGGAAAAAARGAGAAG
 241





1572279
GAAAAAAAGARAAAAA-AAGGGGGGRAGAGGA
 242





1572432
TT-TTC-TTTTTTT-TT--TTTTTTYCTCT-C
 243





1572888
T--TGTTTT-TTTTTGT-TTTTT-TTTTTTT-
 244





1572987
ACCCCCCC-CMCCCCCCCCAAAAAAMCAC-AC
 245





1573060
CGGGGGGGCGGGGGGGGGGCCCCCCSGCGCCG
 246





1573221
ACCCCCCCACMCCCCCC-CAAAAAAACACAAC
 247





1573239
AAMACA-AAAAAAAACA-AAAAAAAAAAAAAA
 248





1573328
GGCGGC-GGGGGGGGGGGCGGGGGGCCGCGGC
 249





1573482
CCTCTC-CCCCCCCCTCCCCCCCCCCCCCCCC
 250





1574247
AATAATAAAAAAAAAAAATAAAAAAWTATAAT
 251





1574545
GGGGGTGGGGGGGGGGGGTGGGGGGKTGTGGT
 252





1575684
TCTCTT-TTTTCCCCTCCTTTTTTTTYTTTTT
 253





1575961
CGCGCCCCCCCGGGGCG--CCC-CCCCCCCCC
 254





1576052
G--G-A--GGGAAG-A---GGGGGGGGG-GG-
 255





1576055
TG-K-G--TTTGGT-G---TTTTTTT-T-TT-
 256





1576059
CCCS-C--C-CCCG-C---CCCCCCCCCCCC-
 257





1576291
CCCC-TC-CC-CCCCC-C--CCCCCCTCTCC-
 258





1576327
A----A--A----T-----WWAAAA----AA-
 259





1576345
A-------A-A-TAT----AAAAAA----AA-
 260





1576372
AACACA--A-A-AAAC-A-AAAAAAAAAAAAA
 261





1576552
A-T-TT-AAAW-TTTTTTTAAAAAATT---AT
 262





1576569
G-G-GG--GGG-GRGG-AGGGGGGGG----G-
 263





1576597
GGGGGG--GGGGGGGGGGAGGGGGG-A-A-R-
 264





1576682
GTTTTTTTGTGTTTTTTTTGG-GGGGT-TGGT
 265





1576719
GGRGGAG-GGGGGGGGGGRGGGGGGGAGAGGA
 266





1576755
G-A--G--G-G-AAAA---GGGGGGGGGGGGG
 267





1576816
ACACAAA-AAACCCCACCAAA-AAAAAAAAAA
 268





1576824
CTCTCCCCCCCTTTTCTTC-C-CCCCCCCCCC
 269





1576881
CC-CCTCCCCCCCCCCCCT-CCCCCTTYTCCT
 270





1577186
AAA-AATAATAAAAAAA-AAAAAAAAAAAAA-
 271





1577187
AAA-AATAAT-AAAAAA-AAAAAAAAAAAAA-
 272





1577188
AAA-AATAATTAAAAAA-AAAAAAAAAAAAA-
 273





1577205
ACC-CCCCACACCCCC--CAAAAAAMCACAAC
 274





1577558
T----A--T-TT-T-TTT-TTTTTTTAT-TTT
 275





1577559
T----A--T-TT-T-TTT-TTTTTTTAT-TTA
 276





1577560
A----A--A-WT-T-TTT-AAAAAAAAA-AAA
 277





1577562
A----T--A-AA-A--AA-AAAAAAA-A-AAT
 278





1577563
A----T--A-AA-A--AA-AAAAAAA-A-AAT
 279





1577633
TT-T-TG-TGTTTTTT-T-TTTTTTTTTTTTT
 280





1577638
AG-G-AA-AAAGGGG--G-AAAAAAAAAAAAA
 281





1577661
TT-T--C-T-TTTTT-TT-TTTTTTYC-CTTC
 282





1577669
TT-T--G-T-TTTTT-TT-TTTTTTTGT-TTG
 283





1577673
CT-T--C-CCCTTTT-TT-CCCCCCCCC-CCC
 284





1577684
TC----C-TCYCCCC-CC-TTTTTTT-T-TT-
 285





1577691
TC-C----TCTCCCC--C-TTTCTTT-T-TT-
 286





1577708
GG-GC---GCGGG-G--G-GGGGGGGCG-GG-
 287





1577712
TC-CCTT-TTTCC-C--C-TTTTTTTTT-TT-
 288





1577745
CCCCCAC-CCCCCCCC-CACCCCCCCA-ACCA
 289





1577746
GGRGGAG-GGGGGGGGGGAGGGGGGGA-AGGA
 290





1577755
GCGCGGG-GGGCCCCGCCGGGGGGGGG-GGGG
 291





1577762
GAGAGGG-GGGAAAAGAAGGGGGGGGG-GGGG
 292





1577765
AAWATAA-AAAAAAATAAAAAAAAAAA-AAAA
 293





1577792
GGTGTG--GGAGGGGTGGGGGGGGGGGGGGGG
 294





1577795
CGCGCC--CCCGGGGCGGCCCCCCCCCCCCCC
 295





1577857
TWTTTTTTTTTTATTTWTTTTTTTT-TTTTTT
 296





1577867
CCCTCCCCCCCC-TTCTTCCCCCCCCCCCCCC
 297





1577890
TTTTTTTGTTKTTTTTTTTTTTTTTTTTTTTT
 298





1578154
TCCCCC-CTCYCCCCCCCCTTTTTTYCTCTTC
 299





1578364
TATATT-TT-TAAAA-AA-TTTTTTT-T-TTT
 300





1578462
AAAAAG-AAAAAAAAAAAGAAAAAARGAGAAG
 301





1578538
TTYTTC-TTTTTTTTTTTCTTTTTTYCTCTTC
 302





1578727
GGGGGGTGGTGGGGGGGGGGGGGGGGGGGGKG
 303





1578925
AAAAAAACAAMAAAAAAAAAAAAAAAAAAAAA
 304





1579270
AAAAAAGAAGAAAAAAAAAAAAAA-AAAAAAA
 305





1579346
TGKG-TGGTGKGGG-GGGT-TTTTTTTT-TTT
 306





1579707
TTTTTTGTTGTTTTTTTTTTTTTTTTTTTTTT
 307





1579708
CCCCCCTCCTCCCCCCCCCCCCCCCCCCCCCC
 308





1580305
CTTTTTTTCTYTTT-TTTTCCCCCCYYCTCCT
 309





1581345
GGRGAGGGGGGGGGGAGG-GGGGGGGGGGGGG
 310





1581602
TCCCCCCCTCYCCCCCCCCTTTTTTCCTCTTC
 311





1581762
TTYTTCTCTTTTTTTTTTCTTTTTTTCTCTTC
 312





1581931
TTTTTGTTKTTTTTTTTTGTTTTTTKGTGTTG
 313





1582195
CTTTTTTTCTYTTTTTTTTCCCTTCYTCTCTT
 314





1582351
AGGGGG-GAAAGGGGGGGGAAAAAAR-AGAAG
 315





1582357
GAAAAA-AGGGAAAAAAAAGGGGGGRAGAGGA
 316





1582363
CTT-TTTTCCCTTTTTTTTCCCCCCYTCTCCT
 317





1582479
A-GA-A-AAAAAA-AG--AAAAAAAAAAAAAA
 318





1582483
A-AA-R--AAA---AA--AAAAAAAARAGAAA
 319





1582484
T-TT-W--TTT---TTA-TTTTTTTTWTATTT
 320





1582487
T-GT-K--TTT-T-G-T-GTTTTTTTTTTTTG
 321





1582566
AAAAAAAAATTAAAAAAAAAAAAAAAAAAAAA
 322





1582570
TCCCCCCCTTTCCCCCCCCTTTTTTYCTCTTC
 323





1582623
G--G-K--G-G-G---KT-GGGGGGK-G-G--
 324





1582625
G--G-K--G-G-G---KTGGGGGGGG-G-G--
 325





1582636
TT-T-A--T-T-TT-TTTATTT---W-T-T--
 326





1582637
TT-T-G--TTT-TT-TTTGTTT---K-T-T--
 327





1582694
CC-C-CCCCCCCCC-CCCCCCC-ACCCC-CAC
 328





1582722
GAGG-G--GG-------AAGGG-RGGAG-GG-
 329





1582723
ATAA-A--AA-----A--TAAA-WAATA-AA-
 330





1582737
G-GG-G-GGG-----G--GGGG-AGG-G-GA-
 331





1582739
A-GA-G-GAA-A---A--GAAA-AAA-A-AA-
 332





1582767
C-TT-T-TCCCT---T--TCCC-CCC-C-CCT
 333





1582785
A-G-----AAA--------AAA-AAA-A-AAG
 334





1582786
T-G-----TTT--------TTT-TTT-T-TTG
 335





1582824
GGGGGGG-GGGGGG----GGGG-AGGGG-GAG
 336





1582825
CCCCCCC-CCCCCC-C--CCCC-TCCCC-CTC
 337





1582843
AGGGGGGGAAAGGG-GGGGAAAAAARGAGAAG
 338





1582860
GAAAAAAAGGGAAA-AAAAGGGGGGGAGAGGA
 339





1582871
CCCCCCCCCCCCCCCCCCCCCCTTC-CCCCTC
 340





1582941
AAAAAAAAAAAAAAAAAAAAAACCAAAAAACA
 341





1582946
TAAAAATATAWAAAAAAAATTTAATWATATAA
 342





1583115
AAAAAAAAAAAAAAAAAAAAAARGAAAAAAGA
 343





1583431
GCCCCCCCGCSCCCCCCCCGGGCCGSCGCGCC
 344





1583461
GAGAGG-GGGGAARAGAAGGGGGGGGGGGGGG
 345





1583655
TTTTTTG-TTTTTTTTTT-TTT--TK-T-T--
 346





1583764
CCMCCAAACCCCCCCCM-ACCCAACMACACAA
 347





1583859
GGGGGTTTGGGGGGGGG-TGGGTTGKTGTG--
 348





1583939
ATWTAT-TAAA-TT-AT--AAATTA-TATATT
 349





1584144
TAWATAAATAWAAAATA-ATTTAATWATW-AA
 350





1584266
TKTTTT-KT-T-TT-T--TTTT-TTTTT-TTT
 351





1584267
GRGGGG-RG-G-GG-G--GGGG-RGGGR-G-G
 352





1584541
AAGAGG-GAGAAAAAGAAGAAAGGARGAGAGG
 353





1584669
AGGG-GGGAGGGGGGGR-GAAAGGARGAGAGG
 354





1585055
TTTTTA-ATTTTTTTTTTATTTTTTWWTATTA
 355





1585295
TTTTTT-AT-TTT---TT-TTTTTTT-T-TA-
 356





1585304
T-TW---TTTT-WT--T-TTTT--TT-T--T-
 357





1585332
T-GG---GTGT-GG-GG-GTTT--TK-TG-G-
 358





1585543
GAAAAAAAGAAAAAA-AAAGGGAAGRAGAGAA
 359





1585768
TA--AA-A-AT-AA-AT-AT--AA---TAT--
 360





1586016
TA--TT-TT-T-WT-TT-TTTTTTTT-TTTT-
 361





1586018
AT--AA-AA-A-WA-AA-AAAAAAAA-A-AA-
 362





1586074
AA-AT--WA-AAAA-AA-AAAAAAAA-A-A--
 363





1586080
AM-MAA-AA-MCAC-CM--AAA--AA-A-A--
 364





1586082
CM-MCC-CC-C-CA-AM--CCC--CCCC-C--
 365





1586217
ATTTTT-TATWTTTTTT--AAATTAWTATATT
 366





1586324
TAT-TAATTT---AAT-A-TTTTTTW-TATTA
 367





1586325
AAA-AAAAAA---AAA-A-AAAAAAW-AWAAT
 368





1586334
ACC-MCCCACC--CCC-C-AAACCAACACACC
 369





1586942
GTTT-TT-GTTTTTTTTT-GGGTTGGTGTGT-
 370





1586943
TAAA-AA-TAAAA-AAAA-TTTAATTATATA-
 371





1586945
TGGG-GG-TGGGGGGGGG-TTTGGTTGT-TG-
 372





1587141
GTTTTT-TGTKTTTTTTTTGGGTTGKTGTGTT
 373





1587173
CTTTTT-TCTCTTTTTTTTCCCTTCYTCTCTT
 374





1587518
CTTTTTTTCTYTTTTTTTTCYCTTCYTCTCTT
 375





1587643
GTKTTTTTGTGTTTTTTTTGGGTTGKTGTGTT
 376





1588896
AAWATAAA--AAAAATAAAAAAAAAAAAAAAA
 377





1589020
ATATATTAAAATTTTATTTAAAAAAWTATAAT
 378





1589177
T-T-T--WTT-TTW--W-TTTTA-TT-TWTTA
 379





1589187
G--GAG-GGR-R-G--G-GGGGGGGG-GGG-G
 380





1589259
TTATATTTTTTTTTTAT-TTTTTTTTTTT-TT
 381





1589715
GTTTTTTTGTKTTTTTTTTGGGTTGKTGTGTT
 382





1589780
AT-TTTTTATTTTT-TT-TAAATTAWTATATT
 383





1589870
TTTTTTTTTTTTTTTTTTTTTTCCTTTTTTCT
 384





1589938
CCGSGCCCCCCCCCCGC-CCCCCCCCCCCCCC
 385





1591968
GTGKGGGGGGGGGGGGGGGGGGGGGGGGGGGG
 386





1592485
GGAGAGG-GGGGGGGAGG-GGGGGGGGGGG-G
 387





1592711
TYYTTTTTTTTTTTTTTTYTYTTTTT--TTTT
 388





1592832
TYCTTTYTYTTTTT-YYTTTYTYCTTYTTYTY
 389





1592838
CYYCCCYCYCCCCC-YCCCCYCYTCCCCCCCY
 390





1593700
AAAAAAAAAAAAAA-AAAAAAACCAAAAAACA
 391





1593863
AAWATAAAA-AAAA-TAAAAAATTAAAAAATA
 392





1594079
TTWTATTTTTTTTTTATTTTTTT-TTTTTT-T
 393





1594162
GGKGTGGGGGGGGGGTGGGGGGGGGGGGGGGG
 394





1594233
TTYTCTTTTTTYTTTCTTTTTTTTTTTTTTTT
 395





1594426
A-GGGGG-AGRGGGGGGGGAAAGG-RGAGAGG
 396





1594480
T-TYTCTTTTTCCCCTC-CTTTTTTYCTCTTC
 397





1594800
A-TATAAAAAAAAAATAAA-AAAAAAAAAAAA
 398





1594961
GGGKGTGG-GGTTTTGTTTGGGGGGKTGTGGT
 399





1594983
AAWATAAAAAAAAAATAAAAAAAAAAAAAAAA
 400





1595159
TTTTTTTTKTKTTTTTTTTTTTTTKTTTTTTT
 401





1595360
AAARAGA-AAAGGGGAGGGAAAAAAGGAGAAG
 402





1595545
GGGGG-GGGGG-AR-G-A-GGGGGG--G-GG-
 403





1595560
TTCTCTTCTT-TTT-CTT-TTTTTT-TT-TT-
 404





1595571
CCTCTCCTCC-CYC-TCC-CCCCCC-CCCCC-
 405





1595887
CT-Y-C--CTCCCC-C-C-CCCCCC-CCCCCC
 406





1595916
TTCTCT-CTTYTTTT--T-TTTTTTTTTTTTT
 407





1595942
AAGAGAAGAAAAAAAGAAAAAAAAAA-AAAA-
 408





1596204
TT-TCTT-TYTTTTTCTTTTTT-TTTTTTT-T
 409





1596317
TTCTCTT-TTY-TTTCTTTTTTTTTTTT-TTT
 410





1596434
AATATA--AWTAAAATAAAAAAAAA-AAAA-A
 411





1596445
AATATA--AWWAAAATAAAAAAAAA-A-AA-A
 412





1597089
CAAAAAAAM-AAAA-AAAACCCCAC-ACACA-
 413





1597188
TAAAAAATTATAAAAAAAATTTTATTATATAA
 414





1597206
ATWT-TTAA--TTTTTTTTAAAATAWTATATT
 415





1597307
CTTTTT-CC-CTTT-TTTTCCCCTCTTCTCTT
 416





1597320
TCCCCC-TT-TCCC-CCCCTTTTCTCCTCTCC
 417





1597401
CCCCCTCCCCCTTTTCTTTCCCCCCY-CTCCT
 418





1597531
G-GC-C-GGCG---C-C--GG-G-GG-G-G-C
 419





1597534
G-GA-G-GG-G----GA--GG-G-GG-GGG-A
 420





1597566
A-TTTT-AATATTT-TTTTAA-ATAWTATATT
 421





1597599
TAWAAAATTATAAAAAAAATTTTATWATATAA
 422





1597812
CCYCTCCCCCCCCCCTCCCCCCCCCCCCCCC-
 423





1597849
TCCCCC-TTCTCCCCCC-CTTTTCTYCTCTC-
 424





1597865
ATTTTT-AATATTT--T-TAAAA-AATA-AT-
 425





1597868
A-TAAA-A-AA-AA--W-AAAAA-AA-ATAA-
 426





1597869
G-AGRG-G-GG-GG--R-GGGGGAGG-GAGR-
 427





1598084
AAAWA-AAAAA----A---AA-AAAA-ATAAT
 428





1598085
CCCMC-CCCCC----C---CC-CCCC-CACCA
 429





1598141
GCGGGGGGG-G-GG-GGGGGGGGCGGGGGG-G
 430





1598160
G-AA-A-GG-G--A-AAAAGGGG-GGAGAG-A
 431





1598175
GGTT---TG-K-GT--TT-GGG--G--GTG--
 432





1598279
ACMCCCAAACACCCCCCCCAAAACAMCACACC
 433





1598409
A-A--AAAA-A-AATAA--AAAAAAA---AT-
 434





1598416
TGTG-KTTKGT--TGTT--TTTTTTTG--TGG
 435





1598417
AAAA-AAAAAA--TAT---AAAA-AAA-AAAA
 436





1598418
TTTT-TTTTTT--ATW-T-TTTTTTTT-TTTT
 437





1598562
TGGGGGTTT-TGGGGGGG-TT-TGTTGTGTGG
 438





1599197
T-W-TT---WTT-T-----T--TT----TT-T
 439





1599227
CGCGCCC--CCCCCCCC-CCCCCC-C-CCC-C
 440





1599306
TCCCCC-T-CTCCCCCCCCTTTTCTYCTCTCC
 441





1599529
A---T-AAA-A--------AAAA-AA-A-A-T
 442





1599531
GA--A-GGG-G-A------GGGGAGG-G-G-A
 443





1599532
ACC-C-AAA-ACC------AAAACAA-A-A-C
 444





1599608
CTYTTTCCCTCTTTTTTTTCCCCTCYTCTCTT
 445





1599686
TGKGGG-TTGTGGGGG-GGTTTTGTK-TGTGG
 446





1599688
TCYCCC-TTCTCCCCCCCCTTTTCTY-TCTCC
 447





1599708
GGGGAG-GGGGGG--AGGGGGGGGGG-GGGGG
 448





1599712
GG-GGG-GGGGGG--GGGGGGGGAGG-GGGAG
 449





1599720
CG-GGG-CCGCGGG-GGGGCCCCGCS-CGCGG
 450





1599826
T-A-AGTTT-T--GGA--GTTTTGTT-T-TGG
 451





1599827
C-C-CTCCC-C--TTC--TCCCCCCC-C-CCT
 452





1599828
G-R-AAGGG-G--A-A--AGGGGAGG-G-GA-
 453





1599836
T-K-GGTTT-T----G--GTTTTGTK-T-TG-
 454





1599868
T---T--CT-Y-C------TTTTCTY-TCT--
 455





1599869
G------GG-G-A------GGGGAGR-GAG--
 456





1599900
G------GG-G--------GGGG-GG-GCGC-
 457





1599907
AG-G---GA-RG-G-GG-GAAAA-AA-AGAG-
 458





1599975
A-AG-GAAA-AGG-GGG-GAAAAGAAGAG-GG
 459





1599986
C-C---CCC-C-T-TCT-TCCCC-CCTCTCCT
 460





1599993
G-G---GGG-G----AA--GGGG-GG-G-G--
 461





1600015
G-----GGG-G--A-A---GGGG-G--GAGA-
 462





1600017
G-----GGG-G--A-A---GGGG-G--GAGA-
 463





1600060
CGG-G-ACC---GG-----CCCC-CSGCGC--
 464





1600072
G-C---GGG-G-CC-----GGGG-GGCG-G--
 465





1600084
T----CTTT-T-CC--C--TTTT-TT-T-T--
 466





1600128
C-C-TCCCC-CC-CCTCC-CCCC-CC-CCCC-
 467





1600162
C-C---CCC-C-G-G----CCCC-CC-C-C--
 468





1600179
A--TT-TAA-ATTTTTTT-AAAA-AW-A-A-T
 469





1600193
T--CCCCTT-TCCCCCCC-TTTT-TY-T-T-C
 470





1600209
G-GGGGGGG-GGGGGGGGGGGGG-GGCG-G-C
 471





1600945
TCYCCCCTTCTCCCCC--CTTTTCTCCT-TCC
 472





1600951
CTYTTTTCCTCTTTTT--TCCCCTC-TCTCTT
 473





1600980
CTYTTT-CCTCTTTT-TTTCCCCTC-TCTCTT
 474





1600987
TTWTATTTTTTTTTTATTTTTTTTT-TTTTTT
 475





1601238
AATATAAAAAAWAAATAAAAA-AAAAAAA-AA
 476





1601551
ATTTTTTTA-WTTTTTTTTAAAATAWTATTTT
 477





1602219
GGGKGTGGGGGTTTTGTTTGGGGGGKTGTGGT
 478





1602244
CCSCGCGGCCCCCCCGCCCCCCCCCCCCCCCC
 479





1602297
ATATAAAAATAAAAAAAAAAAAATAAAAAATA
 480





1602308
ACAMAAAAACAAAAAAAAAAAAACAAAAAACA
 481





1602593
GGGGGGGGG-GTGG-K---GGG--G--G-G-G
 482





1602594
ATAWTTAAA-AA-A-T---AAA--A--A-A-T
 483





1603109
TTTT-ATTTTT-A-T-T--TTTT-TWAT-TA-
 484





1603138
AA-A-TAAAAA-----A--AAAA-AAA--AA-
 485





1603142
AG-R-GAA-AAGGG-----AAAAGAAAA-AA-
 486





1603143
AA-A-AAA--AAAA-----AAAAAAW-A-AT-
 487





1603220
CCCYCTC-CCCTTTT-TTTCCCCTCYTCTC-T
 488





1603235
CCCYCTC-CCCTTTTCTTTCCCCTCYTCTC-T
 489





1603332
TTYTCTTTTTTTTTTCTTTTTTTTT-TTTTTT
 490





1603367
CTTTTTCCCTCTTTTTTT-CCCCTC-TCTCTT
 491





1603440
TTTT----TTTTKT-T-G-TTTT-TTTT-TK-
 492





1603441
GGGG----GGGGGG-G-A-GGGG-GGGG-GR-
 493





1603653
AAAAAGAAAAAGGGGAG-GAAAAGARGAGAGG
 494





1603713
AGG-GGGGA-GGGGGGGGGAAA-GARGAGAGG
 495





1603719
ATA-AAAAA-TAAAAA-AAAAAAAAAAAAAAA
 496





1603723
CCC-CGCCC-CGGGGC-GGCCCCGCSGCGCGG
 497





1603741
TTTTT-TTT-T--CCT---TTTTCTYCT-TCC
 498





1603750
TTTTT-TTT-T----T---TTTT-TYCT-TCC
 499





1603771
GGGGG-GGG-G--C-G---GGGG-GG-GC--C
 500





1603774
CCTCTCCCC-C--C-T---CCCC-CC-CC--C
 501





1603778
TCCCCCCCT-Y--C-C---TTTT-TT-TC--C
 502





1603794
GGGGGAGGGGGAAA-GA--GGGGAGGAGA-AA
 503





1603797
TCCCCTCCTCYTTT-CT--TTTTTTTTTT-TT
 504





1603800
AAAAAGAAAAAGGG-AG--AAAAGAAGAG-GG
 505





1603857
CGARGA-GCGCAAAAGA-ACCCCACCAC-CAA
 506





1603877
CTCTCCCCCTY--YYC--YCCCCCCCCC-CCC
 507





1603887
AAAAAAAWAAA--AA---A-AAAAAATAAAWW
 508





1603952
GGGRGAGGGGGAAAAGA-AGG-GAGRA-AGAA
 509





1604000
CCCYCT-CCCCTTTTCTTTCCCCTCYTCT-TT
 510





1604145
TATATT-TTAT-TTTT--TTTTTTTTTTTTTT
 511





1604181
TTYTCT-TTTT-TTTCTTTTTTTTTTTTTTTT
 512





1604183
GGGGGA-GGGG-AAAGAAAGGGGAGRAGAGAA
 513





1604206
CCCCCT-CCCCTTTTCTTTCCCCTCYTCTCTT
 514





1604236
CCCYCTCCCCCTTTTCTTTCCCCTCYTCT-TT
 515





1604259
GGGGGAGGGGGAAAAGAAAGGGGAGRAGAGAA
 516





1604304
GGAGAGGG--RGGG-AGGGGGGGGGGGGGGGG
 517





1604307
GGGGGAGGG-GAAA-GAAAGGGGAGRAGAGAA
 518





1604385
TTTTT--TTTTTTA-TT-ATTTTTTTTTATT-
 519





1604387
CCCCC--CCCCCCA-CC-ACCCCCCCCCACC-
 520





1604388
AAAAA--AAAATTT-AT-TAAAA-AATATAT-
 521





1604389
TTTTT--TTTTCTT-T--TTTTT-TTCTTTC-
 522





1604437
CCTCTCCCCCCCCC-TC-CCCCCCCCCCCCCC
 523





1604478
CCCCCTCCCCCTTTTCT-TCCCCTCYTCTCTT
 524





1604482
TTTTTATTTTTAAAATA-WTTTTATWATATAA
 525





1604540
TCTT-T-YT-T--T-T---TTTT-T--TTT--
 526





1604541
TTGGT--KT-T----T---TTTT-T--TGT--
 527





1604542
GGCCG--SG-G----G---GGGG-G--GCG--
 528





1604543
CCA-C--MC-C----C---CCCC-C--C-C--
 529





1604568
TTATA-TTT-TT-T-A---TTTTTTTTT-T--
 530





1604611
CGGGGGGGCGCGGG-GG-GCCCCGCSGC-CGG
 531





1604637
T--TTKTTGT-TGT----TTTTTTTTKT-TT-
 532





1604638
A--AAAAAAA-ATW----AAAAA-AAWA-AA-
 533





1604653
CCCYCTCCC-CTTT-CT-YCCCC-CYTC-C--
 534





1604820
T-T-T-T-TTTT---TA--TTTTATT-T-TAA
 535





1604867
AAAMACAAAAACCCCAMCCAAAACAMCACACC
 536





1605056
T-TY-CTTTTTCCCCT-CCTTTTCTCCTCTC-
 537





1605193
TGTKTTTTT-TTTT-TTTTTTTTT-T-TTTTT
 538





1605226
CCCCCACCCCCAAM-CAAACCCCA-CACACAA
 539





1605297
GGGGGA-GG-GGGGGGGG-GGGGGGRA--GG-
 540





1605327
TTTT-T-TT--GTK--GT-TTTTTTTT-GTTG
 541





1605336
TTT-TA-TT--T-T--TT-TTTT-TAT-TTTT
 542





1605337
TTT-TC--T--T-T--TT-TTTT-TCT-TT-T
 543





1605389
CCCCC--CC-CAAA-CAAACCCCACMACA-AA
 544





1605417
GGGG---GG-GAAA-GAAAGGGG-GR-GA-A-
 545





1605443
A-AA-A-TAAA----T---AAAA-AA-AAA-A
 546





1605445
GA-AAA-AGAG----A---GGGGAGR-GAG-A
 547





1605467
GGGGGA-GGGGAAA-GAAAGGGGAGRAGAGAA
 548





1605520
GGRGAGGGGGGGGGGAGGGGGGGGGGGGGGGG
 549





1605526
AGGGGGAAAGGGGGGGG-GAAAAGARGAGAGG
 550





1605527
CCCCCTCCCCCTTTTCT-TCCCCTCYTCTCTT
 551





1605559
CGGGGT-GCGSTT---T-TCCCCTCYTCTCTT
 552





1605573
GGGGGAGGGGGAAA----AGGGGGGRAGA--A
 553





1605598
GGGGGG-GGGGG-G----GGGGG-GG-GGG-A
 554





1605606
AAAAAG-AAAAG------GAAAA-AA-A-AGG
 555





1605613
GAAAGG-GGAAG------GGGGG-GG-G-GGG
 556





1605623
GGAGGG-GGGAG------GGGGG-GG-GGGGG
 557





1605629
GCGC-G-GGCGGG--G--GGGGG-GG-GGGGG
 558





1605631
CCCC-T-CCCCTT--C--TCCCC-CC-CTCCT
 559





1605665
AAAAA-AAAAA-G--A---AAAA-AA-A-AGG
 560





1605667
C-CCC-CCCCC-T--C---CCCC-CC-C-CTT
 561





1605687
GGGGGAGGGGG--G-G---GGGGGGG-GAGG-
 562





1605702
AAAAAAAAAAAATAAAT-AAA-AAAA-AAAAA
 563





1605716
GGGGGAGGGGGAAAAGA-AGGGGAGR-GAGAA
 564





1605853
CCCYCT-CCCCTTTTCTT-CCCCTCYTYTCTT
 565





1605879
CTCTCC-CCTYCCCCCCCCCCCCCCCCC-CCC
 566





1605938
TCTCTCT-TCC-CCCTC-CTTTTC-Y-T-T-C
 567





1605946
AAAAAGAAAAA-GGGA--GAAAAG-A-A-A-G
 568





1605957
C-CCC-CCCCC--T-C--TCCCCTCC-C-C--
 569





1605958
C-CCC-CCCCC--T-C--TCCCCTCC-C-C--
 570





1605965
A-AAACAAAAA----A--CAAAACAACA-A--
 571





1605983
A-AAAGAAAAA-G--A---AAAARAAGRG-G-
 572





1605993
G-GGGAGGGGG-A--GA--GGGGRGGARA-AA
 573





1606046
TTT-------T----T----TTT-TT-WAT--
 574





1606053
GAA-----A-A---------GGG-GG-G-G--
 575





1606065
AAA-A---A-R--G------AAA-AA-A-A--
 576





1606094
CTY-CT-TC-Y-TTTCT-TCCCC-CCTCT-TT
 577





1606158
GCCCCCCCG-SCCCSCCCCGGGGCGSCGCGCC
 578





1606358
AGARAAAAA-AAAAAAAAAAAAAAAAAAAAA-
 579





1606360
CCYYCTCCC-CTTTTCTTTCCCCTCYTCTCT-
 580





1606554
GGGGG-GG-GG-AA-G-A-GGGGAGG-GAG-A
 581





1606615
TTTYTCTTTTTCCCCTC-CTTTTCTCCTC-CC
 582





1606726
CTTTT-TTCTY-T-C---ACCCCACC-C-C--
 583





1606727
AAAAA-AAAAA-A-----GAAAAGAAGA-AG-
 584





1606728
GGGGG-GGGGG-G-----TGGGGTGGTG-GT-
 585





1606768
CA-AC--CCAM----C---CCCC-CC-C-C--
 586





1606777
GGGGG--GGGG-TT-G--TGGGG-GKTGTG-T
 587





1607231
TAWWTTTTTAATTTTTTTTTTTTTTTTTTTTT
 588





1607235
GCSSGGGGGCCGGGGGGGGGGGGGGGGGGGGG
 589





1607243
GTGKGGGGGTTGGGGGGGGGGGGGGGGGGGGG
 590





1607470
ATAWAAATATAAAAAAAAAAAAAAAAAAAAAA
 591





1607624
AARRAGGAAARGGGGAGGGAAAAGARGAGAGG
 592





1607724
CCCCCCCCCCCCCCCCCCCCCCCACCCCCCAC
 593





1607885
CCYYCTTCCCYTTTTCTTTCCCCTCTTCTCTT
 594





1608109
TTGTGTTTTTTTTTTGTTTTTTTTTTTTT-TT
 595





1608326
AAAAAAAGAARAAAAAAAAAAAAGAAAAAAGA
 596





1608370
GGGKGTGGGGGTTTTGTTTGGGGGGKTGTGGT
 597





1608498
AAAAAACCAAMAAAAAAAAAAAACAAAAAACA
 598





1608523
AAAAAAAMAAMAAAAAA-AAAAA-AAAAAA-A
 599





1608720
TTTTTTTTTATTTTTTTTTTTTTATTTTTTAT
 600





1608832
AGAGAGAAAAAGGGGA--GAAAAAARGAGAAG
 601





1609704
TGGG-GGGTGKGGGGGGGGTTTTTTGGTGTTG
 602





1609752
GGGGGCGGGGGCGGGGGGCGGGGGGSCGCGGC
 603





1610363
T----K------T---G--TT--G----TG--
 604





1610368
CC-C-M---C--A---C--A---C-----C-C
 605





1610369
AA-A-A---A--C---A--C---A-----A-A
 606





1610748
GGGGGG-GGGGGAAAGAAGGGGGGGGGGGG-G
 607





1610778
GAGAGGGGGGG-GGGGGGGGG-GGGGGGGG-G
 608





1610902
TCCCTTTTTTYTCC-CC-T-TTTCTTTT--TT
 609





1611052
A--G--A-A---G-AA-G-AAAAGA----A-A
 610





1611054
G--A--A-A---A-AA-A-GGGGAG----G-A
 611





1611234
GGKGKGGGGTGKGKKKGG--GGGG-GGGGGGT
 612





1611303
AAA--GG-AAAGAAAAAAGAAAAAAAG-GAAG
 613





1611422
GAAAGAA-GARAAAAAAAAGGGGAGRAGAGG-
 614





1611490
TT-TTTTTTTTTAAAA-ATTTTTATT--TTT-
 615





1611491
TT-TTTTTTTTTAAAA-ATTT-TATT--TTT-
 616





1611492
TT-TTTTTTTWTAAAA-ATTT-TATT--TTT-
 617





1611666
AGGGAGGGAGR-GGGGGGGAAAAGA-GAGAA-
 618





1611710
GTTTGTTTGGKTTTTTTT-GGGGTG-T-T-GT
 619





1611901
TTKTTTTTTTTTTTTTTTTTTT-KTTGTT-TT
 620





1611921
CTTTCTTTCCCTTTTTTTTCCCCT-TTCTCCT
 621





1612042
A-GGAGG-AGRGGGGGG-GAAAAGARGA--AG
 622





1612060
T-GGTGGG-GK-GGGGG-GTTTTGTK-T--TG
 623





1612073
A-ATATATAAA-AAAAA--AAAAAAW-A--A-
 624





1612200
ATAWATATAAATAAAAAAT-AAAAAWTAT-A-
 625





1612354
AAAAAGAAAAAGAAAAAAGAAAAAARGAGAAG
 626





1612360
AAAAACAAAAACAAAAAACAAAAAAACAC-AC
 627





1612711
TT-WTW-WTTTTTTTTW-TTTTTTTTTT-TTT
 628





1612712
TT-YTC-YTTTTTTTTY-TTTTTTTTTT-TTT
 629





1612720
GC-SGG-GGCG-GS-SG-GGGGGGGSGG-GG-
 630





1612721
CACMCC-CCAC-CM-MC-CCCCCCCMCC-CC-
 631





1612760
TAAATA-ATAWAAA-AA-ATTTTATWATATTA
 632





1613279
CCCCC--MC-CCCC-CMCMCCCCCCCCCC-C-
 633





1613280
AAAAA--WA-AAAA-AWAWAAAAAAAAAA-AT
 634





1613290
AAAWAAAAAAATWW-AAAAAAAAAAATA--AA
 635





1613292
TTTTTTTTTTT-WT-TTTTTTTTTTT-T--TT
 636





1613314
GGGRGGGGGGGGAA-GAAGGGGGGGGGGGGGG
 637





1613355
GTGKGGGGGGGGGGGG-GGGGGGGGGGGGGGG
 638





1613593
CTCTCC-CC-CCCCCCCCCCCCCCCCCCCCCC
 639





1613850
TTTTTATWTTTATTTTTTATTTTTTWTTATTT
 640





1614075
CCCMCCCCCCCCAAACAACCCCCACCCCCCCC
 641





1614423
TGGGTTGG-GKTGGGGGGTTTTTGTTGTTTTG
 642





1614447
GGGGGTGGGGGTGGGGGGTGGGGGGKGG-GGG
 643





1614490
AAAAAWAAAAA-AAAA-ATAAA-AAAAA-AAA
 644





1614715
T-WTTTT-WTTTTW-TTTTTTTTATTTTTTTT
 645





1614758
GCCCGGGCGCSGCCCCCCGGGGGGGGCGGGGC
 646





1614819
AGGG-AAGA-RAGGGGG-AAAAAGAAGAAAAG
 647





1615080
AGGGAAGAAGRAGGGGGGAAAAAAAAGAAAAG
 648





1615669
AAA-A---R-R--RG-A--AAAAAAA-A-A-A
 649





1615670
GAG-R---G-R--RA-G--GGGGAGG-G-G-G
 650





1615672
AGA-A---A-A--AA-A--AAAAGAA-A-A-A
 651





1615675
AGA-A---A-A-GAA-A--AAAAGAA-A-A-A
 652





1615684
TGTGTGG-T-T-GTTTT--TTTTGTT-T-T-T
 653





1615728
AGAGAGGAAAAGGAAAAAGAAAAGAAAA-AAA
 654





1615729
TCTCTCCTTTTCCTTTTTCTTTTCTTTT-TTT
 655





1615738
G-GA-AAGGGG-AGGGGGAGGGGAGRGG-GGG
 656





1615882
AGARAGG-AGAGGAAAAAGAAAAGA-AAGAAA
 657





1615940
C-TTCTT-CTYTTTTTTTTCCCC-CCTCTCCT
 658





1615996
T-T-TC--T-T-CTTTTT-TTTT-T-T--TTT
 659





1615997
T-T-TA--T-T-ATTTTT-TTTT-T-T--TTT
 660





1616062
T-TTTGTTTTTGGTTT-TGTTTTTT-TTGTT-
 661





1616174
TCTC-CCCTCTCCTTTTTCTTTTCTYTTCTTT
 662





1616203
C-CA-AA-CACAACCCCCACCCCACACCACCC
 663





1616335
T-TCT-Y-T-TTTTTTT--TTTT-TTTTCTT-
 664





1616336
A-AAA-M-A-AMCAAAA--AAAA-AAAAAAA-
 665





1616538
TCTCTCCCTCTCCTTTTTCTTTTCTYTTCTTT
 666





1616691
GA-AGAAAGARAAAAAA-AGGGGAGAAGA-GA
 667





1617198
AGGGAGGGAGRGGGGGGGGAAA---RGAGAAG
 668





1617696
AGRRAG-GAGAGGAAGAAGAAAA-AGGAGAAG
 669





1617770
CTTTCTTTCTYTTTTTTTTCCCCTCYTCTCCT
 670





1618051
ATTTATTTATWTTTTTTTTAAAATAWTAT-AT
 671





1618090
AAAWA-AWAWATWAA-AAAAAAATAA-AAAA-
 672





1618231
CGSGCC-CCCCCCCCGCCCCCCCG-CG-CCCG
 673





1618254
A-WTAAAAAAAAAAATAAAAAAATAATAAAAT
 674





1618273
TCTCTTTTTTTTTTTTTTTTTT-TTTTTTTTT
 675





1618347
T-TTTT--T-T-TC-TTC-TTTTT-TTTTTTT
 676





1618372
C-TCCC--C-C-TC--CC-CCCC----C-Y-C
 677





1618374
T-GTT---T-T-GK--TT-TTTT----T-K-T
 678





1618376
A-ARAA--A-AAAA-AA--AAAA----A-A-G
 679





1618456
CTT-CTT-CTTTTT-TTTTCCCCTCCTCTCC-
 680





1618461
CTC-CCC-CCCCCC-TCCCCCCC-CC-CCCC-
 681





1618502
C-YYCC-TCCCCCCCTCYCCCCCTCCTCCCCT
 682





1618804
GGGRGGGGG-GGGAAGGAGGGGGGGGGGGGGG
 683





1618913
CCMCCCCCCCCCCC-ACCCCCCCACCACCCCA
 684





1618914
AAWAAAAAAAAAAA-TAAAAAAATAATAAAAT
 685





1619145
AAGAAAGGAGRAAAAGG--AAAAGAAGAAAAG
 686





1619316
A--C----A----C-C---AAA-C-A-A-AA-
 687





1619317
C--G----C--G-C-G---CCC-GGC-CGCC-
 688





1619428
T-WTTTTTTTTTTTTAT--TTTTATTATTTT-
 689





1619605
AAWAAAATATWAAAAATAAAAAAAAAAAAAAA
 690





1619793
AGGGGGGGAGRGGGGGGGGAAAAGGRGAGAAG
 691





1619889
CCCT-C-CC-YC-CCTC-CCCCCCC--CCCCC
 692





1619893
TTTA-T-TT-WT-T-AT-TTTTT-T--T-TTA
 693





1619897
T-ATTA--T-T-TA-TT--TTTT-A--T-TTT
 694





1619898
A-TAAW--A-A-AT-AA--AAAA----A-AAA
 695





1619989
GG-GAAGGGGG-AG-GGGAGGGGGAA-GAGG-
 696





1619991
AC-CCCCCACA-CC-CCCCAAAACCC-ACAA-
 697





1620043
CCYCCCCT---CCC-CTCCCC-CCCC-CCCCC
 698





1620056
CCCCTTCCC--TTC-CCCTCCCCCTC--TCCC
 699





1620095
TTT-TTA-T---WT---A-TTTT----TWTTT
 700





1620101
AAAT-AT-A--AAWT--T-AAAA----AAAAA
 701





1620103
ATTA-AA-A--AAWA--A-AAAA-A--AAAA-
 702





1620104
TAAT-WT-T--TWWT--T-TTTT-A--TWTT-
 703





1620185
CAMAAAACMCCAAAA-CAACCCCAAAACACC-
 704





1620249
GCCCCCCCGC--CCCCCCCGGGGCCS-GCGGC
 705





1620527
TTTYTTTTTTTTTCCTTCTTTTTTTT-TTT-T
 706





1620544
CCMCCCCACCMCCCCCACCCCCCCCC-CCC-C
 707





1620585
TCCCCC-CTCYCCCCCCCCTTTTCCYCTCT-C
 708





1620728
TTT--T-T--T-TG--KT-TTTTTTT-T-T--
 709





1620739
T-T--G-----GGT-TTT-TTTT-GK-T--TT
 710





1620813
TGTGTTGTTTTTTTTTTT-TTTTTTTTTTTTT
 711





1620946
ATTT-A--A-AAAW---AAAAAAAAA-A-AA-
 712





1620954
TAWW-A----TAAW-TATATTTTTAW-T-TTT
 713





1620955
TAAA-A----TAAW-AATATTTTTAW-T-TTA
 714





1621237
A--TTTT----TT--T----A---TTTA---T
 715





1621391
GGGKGGGGGG-GGTTGG-GGGGGGGGGGGGGG
 716





1621467
CCACAACCCCCAACCACCA-CCCAAMACACCA
 717





1621759
CCCCCCC-C-CCCA-CCACCCCCCCCCCC-CC
 718





1621770
AAAAAAAAAAAAAT-AATAAAAAAAAAAAAAA
 719





1621799
AAAAAAAAAAAAAT-AATAAAAAAAAAAAAAA
 720





1621800
CCCCCCCCCCCCCA-CCA-CCCCCCCCCCCCC
 721





1621931
TTTTCCT-T-TCCTTTTTCTTTTTCYTTCTTT
 722





1622029
A--A-W-AAAA--AAW-AAAAAA-AA-A-AAA
 723





1622034
A-TA-T--ATATTA-T-ATAAAA--ATA-AAT
 724





1622108
A-TTTT--A-WTTTTTTTTAAAATTTTATAAT
 725





1622131
G-ARAA--G-RAAGG-AGAGGGGAARAGAGGA
 726





1622144
C-TYTT--CTYTTCC--CTCCCCTTYTC-CCT
 727





1622152
TAAWAA---AWAATTA-TATTTTAAWAT-TTA
 728





1622535
CCTYTT-TCCCTTTTTT-TC-CCCTYTCTCCT
 729





1622598
T-TWTTTTTTTTTAATTATTTTTTTTTTTTTT
 730





1622610
A-GRGGAGAAAGGGGGGGGAAAAAGAGAGAAG
 731





1622623
TTTTTTTTTTTTTGGTT-TTTTTTTTTTTTTT
 732





1622630
TTAWAATATTTAAAAAA-ATTTTTATATATTA
 733





1622659
ATAWAATAATAAAAAAA-AAAAATAAAAAAAA
 734





1622728
CCACAACAC-CAAA-AA-ACCCCCAMACACCA
 735





1622766
TATATTATTTTTTTTTT-TTTTTTTTTTTTTT
 736





1622876
GGRGGGGGGGGGGGGAG-GGGGGGGGAGGGGA
 737





1622961
TTCTTTTTTTTTTTTCT-TTTTTTTTCTTTTC
 738





1623024
TCCCCCCCTCYCCTTCC-CTTTTCCYCTCTTC
 739





1623076
GGGGGGGGGG-GGK-GGTGGGGGGGGGGGGGG
 740





1623155
TT-TTTTTTTTTTTTATTTTTTTTTT-TTTTW
 741





1623157
TA-ATTATTATTTTT-TTTTTTTATT-TTTTT
 742





1623183
CTCTCCTCCTCCCCC-CCCCCCCTCCCCCCCC
 743





1623346
AGARAA-AA-AAAAAAAAAAAAARAAAAAAAA
 744





1623426
CTCYCCTCCTCCCCCCCCCCCCCTCCCCCCCC
 745





1623482
AGARAAGAAGAAAAAAAAAAAAAGAA-AAAAA
 746





1623619
GGGG-T-GGGGTTGGGGGTGGGGG-GGGTGGG
 747





1623625
AGARGG--AGAGGAAAAAGAAAAG--AAGAAA
 748





1623626
CTCTAA--CTCAACCCCCACCCCT-MCC-CCC
 749





1623789
TTCTTTTCT-YTTTTCCTTTTTTTTTCTTTTC
 750





1623925
TTTYCCTTTTTCCT-TTTCTTTTTCYTTCTTT
 751





1624123
GAGAGG--GAGGGGGGGGGGGG-GGGGGGGGG
 752





1624435
TCTYTTCT-TTTTTT-TTTTTTTTTTTTTTTT
 753





1624569
GTGK-GTTGTGGGGGGGTGGGGGGGGGGGGGG
 754





1624739
GGGGGGGG--GGGAAGA-GGGGGGGGG-GGGG
 755





1624817
AAAATT-AAAA-TA-AA-T-AAAAT-A-TAAA
 756





1625263
TCCC-CCCTCYC-CCCCCCTTTTCCYCTCTT-
 757





1625295
A-TA-T-AA-AT---TA--AAAAATA-A-AA-
 758





1625296
A-TA-T-AA-AT---TA--AAAAATA-A-AA-
 759





1625300
A------AA-W-----T--AAAAT-A-A-AA-
 760





1625304
A------AA-W--------AAAAT-A-A-AA-
 761





1625331
T----C-TT-T--CC-C--TTT-CCT-T-TT-
 762





1625346
CT---T-TC-C--TT-T--CCCCTTC-C-CC-
 763





1625392
T-GG-GG-T-T--------TTTT-G-G-G-TG
 764





1625409
TCCCCCC-T-T---CC--CTTTT-CTCTCTTC
 765





1625424
GAGAGGAAG-G---AG-AGGGGG-GGGGGGGG
 766





1625443
CCYCCCCCC-CC-CCTCC-CCCCCCCTCCCCT
 767





1625454
GAAAAA-AG-RA-AAAAA-GGGGAARAGAGGA
 768





1625472
GGGGAA-GG-GAAG-AGG-GGGGGARAGAGG-
 769





1625548
G-------G---AA-----GGGG----G-GG-
 770





1625586
GAA----GG-G-GG-----GGGG--G-G-GG-
 771





1625587
TGG----TT-T-TT-----TTTT--T-T-TT-
 772





1625588
ATT----AA-A-AA-----AAAA--A-A-AA-
 773





1625599
CCY---C-C-C-C--TC-CCCCC--C---CC-
 774





1625658
C-YT-CT-CTCCCTTCTTCCCCCTCCC-CCCC
 775





1625660
C-AA-AA-CACAAAAAAAACCCCAAMA-ACCA
 776





1625677
T-CC-CCCTCTCCCCCCC-TTTTCCYC-CTTC
 777





1625693
AG-G-GGGAGAGGGGGGG-AAAAGGRG-GAAG
 778





1625694
CT-T-CTCCTCCCCCCCC-CCCCCCCC-CCCC
 779





1625725
CCAC-ACCCCCAAC-ACC-CCCCC-MACACCA
 780





1625788
CT-T--TTCTY-TTT-TT-CCCCT---C-CC-
 781





1625833
AA-W-TA-A-A--A-TAW-AAAAA-WTA-AAT
 782





1625895
AGAG-AG-AGAAAGGAGG-AAAAG-AA-AAAA
 783





1625923
AGAG-AGGAGAAAAAAAAAAAAAAAAAAAAAA
 784





1625924
TCTC-TCCTCTTTTTTTTTTTTTTTTTTTTTT
 785





1626140
CCACAACACCCAACCACCACCCCCAMACACCA
 786





1626248
T----C-CT-TTT--TT--TT-TT-T-TTTT-
 787





1626250
C----T-TC-C-C---C--CC-CC-C-CCCC-
 788





1626252
A----A-AA-A-T---A--AA-AA-A-ATAA-
 789





1626253
T----T-TT-T-A--TT--TT-TT-T-TATT-
 790





1626254
G----G-GG-A----GA--GG-GA-G-G-GG-
 791





1626263
T----TGTT-TTT--TTG-TT-TTTT-TTTT-
 792





1626278
GAG--GAGG-GGGAAGGAGGG-GGGG-GGG-G
 793





1626298
CCTCTTCCCCCTTCCTCCTCCCCCTC-CT-CT
 794





1626400
ACACAACAA-AAACCAA-AAAAAAAAAAAA-A
 795





1626505
CCYCCCCC-CCCCCCTCCCCCCCC-CTCCC-T
 796





1626585
AAAAAAATAAA-AAAATAAAAAAAAAAAAAAA
 797





1626676
TCYYCCCCTCTCCTTCCTCTTTTTCTCTCTTC
 798





1626838
GGGGGG-AGGRGGGGGG-GGGGGGGGGGGGGG
 799





1626935
ATTWTTTTATATTAATTATAAAAATWTATAAT
 800





1626986
TATWA-ATTAT-ATTTAA-TTTTTATTTATTT
 801





1627040
AAAAAWAA-AATWAAAAAWAAAAAWWAAAAAA
 802





1627079
GTGTGGT-GTGGGGGGGGGGGGG-GG--GGGG
 803





1627098
TTTTTY--TTTTTTTTTTTTTT--TYT-Y-TT
 804





1627135
GGGG---GG-GCCGGGGG-GGGGG--G--G-G
 805





1627136
TTTT---TT-T-CTTTTT-TTTTT--T--T-T
 806





1627245
A----A--A--AAAAT----AAA----AA--T
 807





1627248
T----T--T---TTTA----TTT----TT--A
 808





1627489
T-TA-TA-TATTTTTTATTTTTTATTTTTTTT
 809





1627535
TA-AAA--TAWAA-T--TATTTT-A--TA-T-
 810





1627607
ACCMCC-CACCCCAAC-ACAAAACCMCACAA-
 811





1627619
TTGTTT-TTTTTTTTG-TTTTTTTTTGTTTT-
 812





1627637
AGARAA-AAGAAAA-A-AAAAAAGAAAAAAAA
 813





1627649
CCTCCCCCCCCCCCCT-CCCCCCCCCTCCCCT
 814





1627669
GGCGGGG-GGGGGGGC---GGGGGG-CGGGGC
 815





1627686
AAAAATA-A-AAAAAA--A-AA-ATAAAW--A
 816





1627687
TTTTTAT-T-TTTTTT--T-TT-TATTTW--T
 817





1627688
AAAAATA-A-AAAAAA--A-AA-ATAAAW--A
 818





1627771
TGGKGG-GTGKGGTTGGTGTTTTG-GGTGTTG
 819





1627780
CCCCAA-CCCCAACCCC-ACCCCC--CCACCC
 820





1627783
GAARAA-AGARAAGGAA--GGGGA--AGAGGA
 821





1627802
GAGAGG-GGGGGGGGGG-GGGGGGGGGGGGGG
 822





1627934
T-TT---TTTTT-T-TTTCTTTT-TT-TTTT-
 823





1628083
AATATT-TAATTTA-T-ATAAAATTWTAT-AT
 824





1628315
TT-TTTCT-CTTTTTTCTTTTTTCTTTTTTTT
 825





1628562
GGGGCC-GG-GCCGGGGGCGGGGGCSGGCGGG
 826





1628644
A--AAACC--MAAAACCA-AAAACAACAAAAC
 827





1628651
T--TTTAA--WTTTTAAT-TTTTATTATTTTA
 828





1628760
TTCTTT-CT-YTTTT-CTTTTTTCT-CTTTTC
 829





1628782
AA-AAW--A-A-WAA-AAAAAAAA-TAAAAA-
 830





1628793
TT-TTT--T-T-TTTA-TTTTT---TAT-W--
 831





1628955
TTGT-T--T--TTTTGGTTTTT-KTTTT-T--
 832





1628956
TTTT-T--T--TTTTTTTTTTT-KTTKT-TT-
 833





1628957
GGGG-G--G--TKGGGGGGGGG-GGGKG-GG-
 834





1629663
C-TC-C---CCCCCC-C-CCCCT-CC-CCCT-
 835





1629665
T-AT-T---TTTTTT-T-TTTT---T-T-TAT
 836





1629698
C--C-C--CCCCCCCA-CCCCCAA-CACCC-A
 837





1629751
CCTCCC-TCTCCCCCTTCCCCCTTCCTCC-TT
 838





1629813
A-GAAAAGAARAAA-GAAAAAAGGAA-AAAGG
 839





1629885
TTGTTT--TTTTTTTGTTTTTT--TTGTTT-G
 840





1629924
TTWTTT--TTTTTTTTATTTTT-TT-TTTTTT
 841





1629966
GGAGGG--GGR-GG-AGG-GGG-AGG-GG-AA
 842





1629972
CC-CCC--CCCCCC-TCCCCCC-TCC-CC-TT
 843





1629986
GG-GGGA-GGGGGG-AGGGGGG--GG-GGGAA
 844





1630016
A-GAAAA-AAAAAAAGAAAAAA--AA-AAA--
 845





1630038
A-AAAAGAAGAAAAAAGAAAAA-AAAAAAAAA
 846





1630042
T--TTTTCTTYTTTTCTTTTTT-CTTCTTTCC
 847





1630059
TT-TTTTCTTYTTTTCTTTTTT-CTTCTTTCC
 848





1630073
GG-GGG-AGGAGGGGAGG-GGG-AGGAGGGAA
 849





1630087
AAAATT-AA-ATTAAA-A-AAA-ATTAATAAA
 850





1630090
GGAGGG-AG-RGGGGA-G-GGG-AGGAGGGAA
 851





1630106
TTCTTTT-T-TTTTTCTT-TTT-CTTCTT-CC
 852





1630132
TT-TTTT-TTYTTTT-TT-TTT-CTT-TTTC-
 853





1630139
GG-GGGG-GGRGGGG-GG-GGG-AGG-GGGA-
 854





1630142
CC-CCCC-CCYCCCC-CC-CCC-TCC-CCCT-
 855





1630154
GGAGGGG-GGR-GGG-GG-GGG-AGGAGGGA-
 856





1630196
A-CAAAA-AAAAAAACAAAAAAC-AA-AAAC-
 857





1630198
T-ATTTT-TTTTTTTATTTTTTA-TT-TTTA-
 858





1630217
CCTCCCC-CCCCCCCTCCCCCCT-CC-CCCT-
 859





1630219
AA-AAAA-AAAAAAAG-AAAAAG-AA-AAAG-
 860





1630225
CC-CCCC-CCCCCCCT-CCCCCT-CC-CCCT-
 861





1630284
TTATTTTAT-ATTTTATT-TTT-A-TAT-TAA
 862





1630303
AAAAAW-AA-A-WAAAAA-AAA-A-AAA-AAA
 863





1630311
AA-AAM-AA-A-AAAA-A-AAA-A-MAA-AAA
 864





1630325
TT-TTT-ATTTTTTTTTTTTTT-TTTAT-T-T
 865





1630357
C-TCCCC-CCYCCCCTCCCCCCTTCCTCCCTT
 866





1630376
CCGCCCC-CCSCCCCGCCCCCCGGCCGCCCGG
 867





1630447
TT---G--T-TGGT---TGTTT---T-T-T--
 868





1630475
CC-CAAAACAM-AAA-AA-CCC---M-CACA-
 869





1630513
AA-AAA--AARAAAA-AAAAAA-GAA-AAAG-
 870





1630515
CC-CCC--CCYCCCC-CCCCCC-TCC-CCCT-
 871





1630608
TT-TTTTATTTTTTTAT-TTTT--TT-TTTA-
 872





1630765
TTWTTT--TTATTTTTTTTTTTAWTTATTT--
 873





1630766
TTATTT-ATTATTTTAATTTTTAATT-TTT--
 874





1630870
CCACCCCAC-ACCC-AACCCCCAACCACCC-A
 875





1631079
GGGGGGGGGGGGGGGGG-GGG-AAGGGGGGAG
 876





1631156
GGCGGGCCGCSGGGGCCGGGGGCCGGCGGGCC
 877





1631191
AAWAAAATA-AAAAA-AAAWAA-TAAWAAA--
 878





1631193
CCMCCCCAC-CCCCC-CCCCCC--CCMCCC--
 879





1631316
TTATTTTATTW-TTTAATTTTTAA-TATTTAA
 880





1631449
CCCCCCTCCTCCCCCCCCCCCCCCCCCCCCCC
 881





1631761
GGTGGGTTGTGGGGGTTGGGGGTTGGTGGGTT
 882





1631765
AATAAAATAAAAAAATTAAAAATTAATAAATT
 883





1632488
TTCTTTTCTTYTTTTCTTTTTTCCTTCTTTCC
 884





1633118
TTCTTTTCTTYTTTTCCTTTTTCCTTCTTTCC
 885





1633465
AAGAAAAGAARAAAAG-AAAAAGGAARAAAGG
 886





1633629
TTTTTTA-TATTTTTTTTTTTTTTTTTTTTTT
 887





1633840
GGGGGGAGGAGGGGGGGGGGGGGGGGGGGGGG
 888





1633948
GGAGGGAAG-RGGGGAAGGGGGAAGGAGGG-A
 889





1633983
TTCTTTCCT-YTTTTCT-TT-TCCTTCTTTCC
 890





1634118
GGAGGGAAGARGGGGAAGGGGGAAGGAGG-AA
 891





1634206
TT-TTTTGTTTTTTT--TTTTT--TT-TTTGG
 892





1634213
GG-GGG--GGGGGGK--GGGGG--GG-GG-TT
 893





1634260
CCTCCCTTCTYCCCCTTCCCC-TTCCTCCCTT
 894





1634452
TTTTCCCTT-TCCCCTTCCTTTTTCYTTCTTT
 895





1634453
GGGGAAAGG-GAAAAGAAAGGGGGARGGAGGG
 896





1634537
TTW----AT-W----TA--TTTAA-TTT--AT
 897





1634543
TTTY-C-TT-T-CC-TTC-TTTTT-YTT--TT
 898





1634611
AAAW-T-AA-A-TT-A-T-AAAAATWAA-AAA
 899





1634612
TTTW-A-TT-T-AA-T-A-TTTTTAWTT-TTT
 900





1634643
CCCC-TTCC-C----C-T-CCCCC-CCC-CCC
 901





1634649
CCCC-TCCC-C-TT-C-TTCCCCC-CCC-CCC
 902





1634854
AAARGGAAA-AGGGGAAGGAAAAAGRAAGAAA
 903





1634907
AAAATTT-ATATTTT-TT-AAAAATWAATAAA
 904





1634974
A-AAGGG-AGAGGGGAGGGAAAAAGRA-GAAA
 905





1635001
CCCCCCTCCTCCCCCCTCCCCCCCCCC-CCCC
 906





1635093
A-AAAAG-AGAAAAAAAAAAAAAA--AAAAAA
 907





1635121
C-CCTT--C-CTTTTCTTTCCC-C-YCCTCCC
 908





1635161
T-TTCCT-TTTC-CCTTCCTTTTTCYTTCTTT
 909





1635172
T-TTA-T-TTT-TA-TTAATTTTTTTTTATTT
 910





1635188
A-AA-TA-AAATTT-AA--AAAAATWAA-AAA
 911





1635269
G-GKTTG-GGGTTTTG-TTGG-GGTTGGTGGG
 912





1635378
C-CC-T-C--CTTT-C-T-C--CCTTCCTCC-
 913





1635479
A-AA---AA-AC---A----A-AA--AA-AA-
 914





1635553
CCCSGGGCCGCGGGGCGGGCC-CCGSCCGCCC
 915





1635588
A-AAAATA-TAAAAAATAAAA-AAAAAAAAAA
 916





1635621
A-AACCMAAAACCCCAACCAAAAACCAAC-AA
 917





1635628
T-TTGGTTTTTGGGGTTGGTTTTTGGTTG-TT
 918





1635716
AAAA--AAAAAT-T-AA--AAAAA-WAA-AAA
 919





1635908
CCCC-C-CC-CYYC-Y-CTCCC-YCCCCCC--
 920





1635913
TTTT-W--T-TAWT-T--ATTT-TTTTTWT--
 921





1635916
TTTT-W--T-TTWW-T--TTTT-TATTTWT--
 922





1635919
AWAA-W----AAAT-A--AAAAAW-ATAAA--
 923





1636172
TTCYCCTCTCYCCCCCCCCTTTCCCYCTCTCC
 924





1636305
CCCCCCTCCCCCCC-CCCCCCCCCCCCCCCCC
 925





1636413
TTTYTTTTTTTTTCCTTCTTTTTTTTTTTTTT
 926





1638089
CCCCCTCCCCC-CT-CCCCCCCCC-CCCYCCC
 927





1638581
AAAA-AAAA-AAWA--A-WAAAAW-AWAAAA-
 928





1638591
CCTC-CY-C-CYCYC-T-CCCCYC-CYC-C-C
 929





1638593
TTCT-TY-T-TTTTT---TTTTYT-TYT-T-T
 930





1638641
CCYC-TCCCCYTTTTC-TTCC-CCTYCCTCCC
 931





1639354
AAAAAATAAAAATAAATAAAAAAAAAAA-AAA
 932





1639358
TTWTTTAATTTTTTTATTTTTTAATTAT-TAA
 933





1639379
CCTYTTTTCTYTTTTTTTTCCCTT-YTC-CTT
 934





1639405
T-ATAA-ATA-AAAAAAA-TTTAA-WAT-T-A
 935





1639422
T--T----T-----AW---TTTA--W---T--
 936





1639423
T--T----T------T---TTTA--W---T--
 937





1639425
T--T-A--T------A-A-TTTT--T---T--
 938





1639589
CCGCGGGGCGGGCGGGCGGCCCGG-SGCGCGG
 939





1639649
T--T-A--T--AA--AA--TTTA--T-TAT--
 940





1639650
T--T-C--T--CT--CT--TTTC--T-TCT--
 941





1639656
G--GA-A-G--AG-AAG-AGGG---G-GAGAA
 942





1639658
T--TCCC-T--CT-CCT-CTTT---T-TCTCC
 943





1639685
AAWAAAATAAAAAAATAAAAAATTAATAAATT
 944





1639720
CCCCCCCCCCCCGCCCGCC-CCCCCCCCCCCC
 945





1640319
TTTTAATATAWAAA-T--ATTTTTATTTATTT
 946





1640338
AAAACCCCACACCC-AC-CAAAAAC-AACAAA
 947





1640347
CCCCTTCTCTYTTT-CT-TCCCCCTTCCTCCC
 948





1640404
CCCCAACACAMAAAACAAACCCCCAMCCACCC
 949





1640483
CCMMAAAACAMAAAACAAACCCCCAMCCACCC
 950





1641130
TTTKGG-GTGK-TGGTGGGTTTTTGKTTGTTT
 951





1641172
TTTWAA-ATAWTTAATTAATTTTTAWTTATTT
 952





1641442
AAAAGGGGA-AAAG-AA-GAAAAAGRAAGAAA
 953





1641449
CCCC-CC-C-CCCCACC-CCCCCCMCCCACCC
 954





1641451
TTTT-AATT-TTTTTTT-TTTTTTTWTTTTTT
 955





1641714
CCYYTTCTCTTCCYTCCTTCCCCCTCCCTCCC
 956





1642236
TTTTCCCCTCY-TCCTT-CTTTTTCYTTCTTT
 957





1642267
CCYYTTTTCTYCCTTCC-TCCCCCTCCCTCCC
 958





1642307
CCYYTTTTCTYCCTTCC-TCCCCCTYCCTCCC
 959





1642711
TTTTGKTKTGK-TGGTTGGTTT-TGKTTGTTT
 960





1643324
CCYCTTT-CTTCCTTCC-TCCCCCTYCCTCCC
 961





1643682
TYTTTTTTYTTTTYTTYTTTTTTT-TTTTTTT
 962





1643963
GGGRAAGGGGGGGAAGGAAGGGGGAAGGAGGG
 963





1644011
CCTYTTTTCTTTTTTTTTTCCCTTTYTCTCTT
 964





1644076
GGGSCCCCGCSGGCCGGCCGGGGGCSGGC-GG
 965





1644328
TTTYCCTCTCYTTC-TTCC-TT-TCYTTCTTT
 966





1644511
TTTT--T-T-T-CT-TCT-TTTTT-TTTTT-T
 967





1644513
GGGG-TG-G-KGGT-GGT-GGGGG-GGGTG-G
 968





1644525
TTTTTTCTT-TTTT-TTTTTTTTT-T-TTTTT
 969





1644537
AAWAAAATA-WAAAAAAAAAAAAAAAAAAAAA
 970





1644750
GGGGGGGGGGGTTGGTTGGGGGGGGGTGGGGT
 971





1645218
AATAAATAA-ATTAATT-AAAAATAA-AAAAT
 972





1645228
GGGKTTGGG-GGGTTGG-TGGG-GTTGGTGGG
 973





1645334
CYCCTTCCCCCCCTTCCT-CCCCCTYC-TCCC
 974





1645582
GGCGGGCGGGGCCGGCC-GGGGGGGGCGGSCC
 975





1645815
AAGAAAAAA--GGAAGGAAAAAAAAAGAAAAG
 976





1646008
AACMCCCCM-ACCCC-CC-AAAAACMCACAAC
 977





1646226
AAAAAA-MA-AAA--CA--AAAAA-AAAA-AA
 978





1646229
CCCCCC-CC-CYY--CC-CCCCCC-C-CC-C-
 979





1646564
TTTYTTTTTTTTTC-TTCTTTTTTTTTTTTTT
 980





1646600
AAWW-ATAA-ATATAW-ATAAAAAAATAAAA-
 981





1646602
GGRR-GAGG-GAG-GR-GAGGGGGGGAGGGG-
 982





1646890
AAWAAAATAAWAAAAAA-AAAAAAAAAAAAA-
 983





1647863
GGGSGSGGG-GGGS-GSGGGGGGSG-SGSGGC
 984





1648561
AAMACCAAAAACCCCCC-CAAAACCCCACAAC
 985





1649069
GGGGGGTGGGGKGGGGGGG-GGGT-GGGG-GG
 986





1649189
AATWTTTTATWTTT-TTTTAAAATTATATAAT
 987





1649194
TTAWAAAATAWAAA-AAAAWTTTAATATATTA
 988





1649212
AAAAAACAAAAAAA-AAAAAAAACAAAAAAAA
 989





1649292
A-GAAAAAAAAGGA--G-AAAAAAAAGAAAAG
 990





1649335
T--TTWT-TTTWTT--TTTTTT-A-T--TT-T
 991





1649343
A-AAAA-AAAAWW--AWAAAAAAAAAA-AA-A
 992





1649385
AAA-AAGA-AAAAA-AAAAAAAAGAAAAAAAA
 993





1649892
TTATTTATTTWAATTAATTTTTTTTTATTTTA
 994





1649975
AAGAAAAAAARGGAAGG-AAAAARAAGAAAAG
 995





1650007
AAGAAAAAAAAG-AAGGAAAAAAAAAGAAAAG
 996





1650035
TTCTTTTTTTYCCTTCCTTTTTTTTTCTTTT-
 997





1650053
TTYTTTTTTTTCCTT-CTTTTTTTTTCTTTT-
 998





1650057
AARAAAAAAARGGAAGGAAAAAAAAAGAAAA-
 999





1650062
AARAAAAAAARGGAAGGAAAAAAAAAGAAAA-
1000





1650140
GGTGGGTGGGKTTGGTTGGGGGGGGGTGGG-T
1001





1650162
T-TTTTTTTTTTTTTTTTTTTTWTTTTTTTAT
1002





1650201
G-GGGGAGGGGGGG-GG-GGGGGGGGGGGGGG
1003





1650312
GGARAAAGGGRAAAAAAAAG-GGGAR-GAGGA
1004





1650501
TTTTTTATTTATTTTTTTTTTTTTTTTTTTTT
1005





1650892
TTTT-AT-T-T-T--T--ATTTTTT--TWT--
1006





1651006
TTYTTTTTTC--TTTC-T-TTTTT--CTTT-C
1007





1651019
TTWTA-T-TAT--AAATA-TTTTT--AT-T-A
1008





1651075
AAAAGGA-AAAAAGGAAGGAAAAAGRAAGAAA
1009





1651085
CCCCCCC-CCCAACCCACCCCCCCCCCCCCCC
1010





1651266
TTTTTTTCTTTTTTTTT-TTTT-YTTTTTTCT
1011





1651404
AAWAAAATATWAAA-TAAAAAA-A-ATAAATT
1012





1651496
AAAA-AAAAAAGGAAAGAAA-AAAA-AAAAAA
1013





1651574
A--AAAAAAAAWTAA-TAAAAA-AAA-AAAA-
1014





1651575
T--TTTTTTATW-TT-TTTTTT-TTT-TWTT-
1015





1651579
AA-AAAA-AAARGAAAAAAAAA-AAA-AAA-A
1016





1651787
A-AAAAA-AA-GGAAA-A-AAAAAAAAAAAAA
1017





1651950
A-AMCCM-AAM-ACCAA-CAAACACCAACACA
1018





1651977
AAAWTTT-AAW-TTTATT-AAATATWAATA-A
1019





1651991
AAWWTTT-ATT-ATTTAT-AAA-ATWTATA-T
1020





1651993
TTTTTTT-TTT-ATTTAT-TTT-TTTTTTT-T
1021





1651996
TTTWAA--TTW-TAATTA-TTT-TAWTTAT-T
1022





1652011
CCCMAAA-CCCAAAACAAACCCACAMCCACAC
1023





1652098
AAAAAAAAAAACCAAACAAAAAAAAAAAAAAA
1024





1652139
AACAAAAAACAAAAACAAAAAAAAAACAAAAC
1025





1652235
GGCGCCCCGCGCCCCCC-CGGGCGCSCGCGCC
1026





1652253
AAAACCCCAAACCCCAC-CAAACACMAACA-A
1027





1652352
TTTTTTTTTTTTTTTCTTTTTTTTTTCTTTTC
1028





1652357
CCCYTTCCCCCTTTTCTCTCCCCCTYCCTCCC
1029





1652401
TTTTTYCCTTYTTTTTTCTTTTCTTTTTTTCT
1030





1652453
AAGRGGGGAGRGGGGGG--AAAGAGRGAGAGG
1031





1652715
GCGGG-GGSG--SG-GG-CG-SGGG---SGG-
1032





1652723
TTT-T-CCT-TTTT-TT-TTTTC-TT--T-C-
1033





1652747
GGGGGGGAGGGGGG-G--GGGGAGGGGGG-AG
1034





1653494
AAAAAAAAAAAGGAAAGAAAAAAA-AAAAAAA
1035





1653520
CCMCCCCCCCACCCCACACCCCCC-CACCCCA
1036





1653597
GGGGGGGAGGGGGGGGGGGGGGAG--GGGGAG
1037





1653769
CCCCCCCCCCSCCCCCCGCCCCCCCCCCCCCC
1038





1653795
AAAAAAAGAAAAAAAAAAAAAAGAAAAAAAGA
1039





1653979
GGGGGGGTGGGTTGGG--GGGGTGGGGGGGTG
1040





1654008
AAAAAAAAAAATTAAA--AAAA-AAAAAAAAA
1041





1654084
TTTTTTTTTTTGGTTTGTTTTTTTTTTTTTTT
1042





1654113
CCCCCCCCCCC-GCCCGCCCCCCCCCCCCCCC
1043





1654119
GGAGAAAAGAR-AAAAAAA-GGAGARAGAGAA
1044





1654174
GGGGGGGTGGGGGGGGG-GGGG-GGGGGGGTG
1045





1654178
TTTTTTTCTTTTTTTTT-TTTT-TTTTTTTCT
1046





1654234
AAAAAAAAAAAGAAAAR-AAAAAAAA-AAA-G
1047





1654235
AAAAAACAAAA-AAAAM-AAAA-AAA-AAA--
1048





1654287
AAAAAAT-AAA--AA-T-AAAA-AAA-AAA-T
1049





1654362
TTTTTTTCTTTTTTTTTT-TTTCT-TTTTTCT
1050





1654368
GGAGGGGAGGGAAGGAAGGGGGAG-GAGGGAA
1051





1654467
TTKTTTTTTTTGGTTGG-TTTTTT-TGTTTTG
1052





1654484
AAGAAAGGAAAGGAAGGAAAAAGA-AGAAAGG
1053





1654519
TTATTTTTTTTAATTAATTTTTTTTTATTTTA
1054





1654529
CC-C-CCCCCC--CC--CCCCCCCCCGCCCCG
1055





1654530
TT-T-TTATTT--TT--TTTTTATTTTTTTAT
1056





1654634
GGGGGGGGGG-AAGGGAGGGGGGGGGGGGGGG
1057





1654649
GGGGGGGGGGGAAGGGAGGGGGGGGGGGGGGG
1058





1654795
TTTTTTAATTAAAT-TAATTTTATTTTTTTAT
1059





1654849
CCYCCCTTCCYTTCCTTTCCCCTCCCTCCCTT
1060





1654906
CCCCCCAACCMCCC-CCACCCCMCCCCCCCAC
1061





1655007
CCCCCC-ACCCCCC-CC-CCCCACCCCCCCAC
1062





1655042
GGGGGGGGGGGCCG-GCGGGGGGGGGGGGGGG
1063





1655195
CCCCCCTTCCYCCCCCC-CCCCCCCCCCCCC-
1064





1655349
TTTTTT--TTT-ATT-T-TTTT-TTTATTTTT
1065





1655353
GG-GGGAAGGR-AGG-AAGGGG-GGGAGGGG-
1066





1655408
GGGGGGAAGGGGGGGGGGGGGGGGGGGGGGG-
1067





1655881
TTGTTTTTT-TTTTTGTT-TTTTTTTGTTTTG
1068





1656044
GGGGGGAGGGGAAGGGAGGGGGGGGGGGGGGG
1069





1656188
AARAAAG-AAAAAA-AA--RAAAAAARAAAAA
1070





1656189
GGSGGGC-GGGGGG-GG--GGGGGGGSGGGGG
1071





1656263
GGG-GGCCGGGGGGGGGGGGGGGGGGGGGGGG
1072





1656348
TTATAAAATAAAAA-AAAATTTATATATATAA
1073





1656394
CCYCCC-TCCCTTCCTTTCCCCCCCCTCCCCT
1074





1656622
TTYTTTTTTTTTTTTCTTTTTTTTTTCTTTTC
1075





1656645
TTTTTTTTTTTCCTTTCTTTTTTTTTTTTTT-
1076





1656898
C-CCCCTTCCCCCCC-C-CCCCCCCCC-CCCC
1077





1656979
GGGGGGAGGGGGGGGGGGGGGGGGGGGGGGGG
1078





1657025
AAWAAATT-AWTWAATT-AAAAAAAATAAAA-
1079





1657162
CCCCCCTTCCCCCCCCCCCCCC-CCCCCCCCC
1080





1657319
GGGGGGGGGGGCCG-G---GGGGGGGGGGSG-
1081





1657593
AAAAAAATAAATAAAA-AAAAAAAAA-AAAAA
1082





1657661
AAAAAAATAAATTAAAT-AAAAAAAAAAAAAA
1083





1657803
TTTTTT--TTAAAT-TA-TTTTTTTT-TTTTT
1084





1657814
AAAATT--AAT-TT-A--TAAAAAWWAATAAA
1085





1657857
TTTTTTTC-TTCCTTT--TTTTTTTTTTTTTT
1086





1657867
AAAAAAAGAAAGGAAA-AAAAAAAAAAAAAAA
1087





1658176
CCCCCCCCCC-T-CCCTTCCCCCCCCCCCCC-
1088





1658177
GGGGGGGGGG-TKGGGTTGGGGGGGGGGGGG-
1089





1658205
CCCCCCC-CCCAAC-CACCCC-CCCCCCCCC-
1090





1658219
CCCCCCC-CCTTTCCCTT-CCCCCCCCCCCC-
1091





1658281
TTTT-TTTT-YCCTTTCCT-TTTTTTTT-TTT
1092





1658580
AAAAAAAA-AAAAAATTAAAAAAAA--AAAAT
1093





1658589
AAAAAAACAACAAAAC-CAAAAAAA--AAAA-
1094





1658590
CCCCAACCCCCCCCCC-CAACCCCC---ACC-
1095





1658617
AAAA-AGAAAAGGAAAGAAAAAAAA-AAAAAA
1096





1658632
CCCCCCC-CCCTTCCCTCCCCCCCC-CCCCCC
1097





1658707
GGAGGGA-GGG-AGGAA-GGG-GGGGAGGGG-
1098





1658756
GG-GGGGGGGGTTGGGT-GGGGGGGGGGGGG-
1099





1659101
AAAAAAAAAAAGGAAAGAAAAAAAAAAAAAAA
1100





1659190
AAAAAAAAAAAGGAAAGAAAAAAAAAAAAAAA
1101





1659382
TTTTTWTTTTT-TT--TT-TTTTT-W-TTTT-
1102





1659384
AWWAAAATAAW-AA---T-AAAAA-ATAAAA-
1103





1660414
CCMCCCCACCMAACCAAACCCCCCCCACCCCA
1104





1660604
AAWA---AA-A-AT-WT--AAAAA--AA-AA-
1105





1660606
TTTT---TT-T-AA-TA--TTTTT--TT-TT-
1106





1660618
GGGGT--GG-GTG--G---GGGGG-GGG-GGG
1107





1660621
AAAAG--AA-AGAG-A-A-AAAAA-AAA-AAA
1108





1660668
AAAAGGAAA-AAAG-AAAGAAAAA-RAAGAAA
1109





1660682
TTATAAATT-TTAA-ATAATT-TT-WATATTA
1110





1660704
TT-TTTTTTTTTCTTCT-TTTTTT-TC-TTT-
1111





1660795
CCCCCCWCC-CCCY-CC--CCCCCTC-C--CC
1112





1660796
TTTTTTTTT-TTTY-TT--TTTTTCT-T--TT
1113





1660935
T-YTTTTTTTTT--TCTC-TTTTTTTCTT-TC
1114





1660958
CCMCCCCCCCCCAC-AC--CCCCCCCACCCCA
1115





1660963
TTCTTTTTTTTTCT-CT--TTTTTTTCTTTTC
1116





1661003
CC-CCCC-CCCTTCCTT-CCCCCCCCTCCCC-
1117





1661011
TT-TTTT-TTTTCTTCT-TTTTTTTTCT-TT-
1118





1661015
GG-RAAG-GAGGGAAGG-AGGGGGAGG--GG-
1119





1661025
CC-SGGCCCGCC-GGCC-GCCCCCGCC--CC-
1120





1661066
A-AA-AAAAAAAAAWAAA-AAAAA-AAA-AAA
1121





1661084
T-CT-TTTT-TCC-TCCC-TTTTTTTCT-TTC
1122





1661088
A-GA-GGAA-AGG-GGGG-AAA-AGAGA--AG
1123





1661104
T-CT-CCTT-TCC-CCCC-TTT-TCTCT--TC
1124





1661128
A-GA-GG-A-AGGGGG-G-AAAAAGAGA-AAG
1125





1661129
C-CC-TC-C-C-CTTC-C-CCCCCTCCC-CCC
1126





1661131
G--G-GG-G-GG-GGC-C-GGGGGGGCG-GGC
1127





1661150
C--C-CC-C-CCT----T-CCCCCCC-C-CCT
1128





1661160
T--T---TT-TCC----C-TTTTT-TCT-TTC
1129





1661214
T-GTTTTTTTKGGTT-GG-TTTTT-TGTTTTG
1130





1661239
TTCYCCCTTCC-CCCC-CCTTTTT-TCTCTTC
1131





1661259
CCSSGGGCCGCCCGGCCCGCCCCCGCCCGCCC
1132





1661275
CCYCCC-CCCYCTCCTCTCCCCCCCCTCCCCT
1133





1661328
T-CYCCCTTC-CCCCCCCC-TTTTCTCTCTTC
1134





1661356
A-RAGGGAAGAAAGGAAAG-AAAAG-A-GAAA
1135





1661456
TT---AA-TAT---A----TTT-T-T-TATT-
1136





1661461
AA---AA--AA---AG-G-AAAAA-AGAA---
1137





1661539
T--TTTTTT-CTCTT-TCTTT-TTTTCTTTTC
1138





1661560
T-ATTTT-T--T-TTAT--TT-TTTT-TTTT-
1139





1661768
CCYCCCCCCCYCTCCTCTCCCCCCCCTCCC-T
1140





1661783
GGRGGGGGGGRAAGGAA-GGGGGGGGAGGGGA
1141





1661971
AAWAAAA-AAWATAATAT-AAAAAAATAAA-T
1142





1662071
CCMCCC--CCCCACCAC-C-CCCCCC--CCC-
1143





1662122
TT-T-T-TTTTTT---T--TTTTT-TATATTT
1144





1662124
AATATT-AATAAA---A--AAAAA-WAAAAAA
1145





1662258
CCYCCCCCCCYCTCCTCTCCCCCCCCTCCCCT
1146





1662283
CCACCCCCCCMCACCACACCCCCCCCACCCCA
1147





1662373
CCCCCCCGCCCGCCCCG-CCCCCCCCCCCCCC
1148





1662399
TT-TTTGKTTTGKTTT--TTTTTTTTKTTTTT
1149





1662401
TT-TTTTTTTKTTTTGT-TTTTTTTTKTTTTT
1150





1662555
GGGGAAGGGAGGGAAGGGAGGGGGAGGGA-GG
1151





1662568
AARAAAAAAA-AGAAGAGAAAAAAAAGAAAAG
1152





1662576
AACAAAAAAA-ACAACACAAAAAAAAC-AAAC
1153





1662653
GGKGGG-GGGKGTGGTG--GGGGGGGTGGGGT
1154





1662656
CCSCCC-CCCSCGCCGC--CCCCCCCGCCCCG
1155





1662666
GG-GGG-GGGSGCGGCG--GGGGGGGCGGGGC
1156





1662672
TT-KGG-TTGTTTGGTT--TTTTTGKTTGTTT
1157





1662692
CC-CCC-CC-YCTCCTCT-CCCCCCCTCCCCT
1158





1662694
TTCTGG-TT-YCCGGCCC-TTTTTGTCTGTTC
1159





1662706
CCCCTT-CC--TCTT-TC-CCCCCTYCC-CCC
1160





1662725
GGCG-G-GG---CGG-GC-GGG-G-G-G-GGC
1161





1662758
GG-G---GG-RGA----A-GGGGG-G-G-GG-
1162





1662766
TTCTCC-CT-YCCC---CCTTTTTCY-T-TT-
1163





1662804
CCTCTT-CC-C--T--C--CCCCCTY-CTCC-
1164





1662841
CC-CTT-CC----T----TCCCCCTY-CTCC-
1165





1662854
TT-T-GTT--T--G----GTTTTTTT-TKTT-
1166





1662858
GG-R--GGG-G--------GGGGGAR-GAGG-
1167





1662879
GG-R-AGAG-GG-A--G--GGGGGAR-GAGG-
1168





1662882
GG-G-GGGG-GT-G--T--GGGKGGG-GGGG-
1169





1662894
AAGR-GGAA-AG-G--G-GAAAAAGG-AGAA-
1170





1662916
GGKGGGGGGGKGTGGTG-GGGGGGGGTGGGGT
1171





1662935
TTCTCCCCTCYCCCCCCCCTTTTTCYCTCTTC
1172





1662980
GGAGGGGGGGR-AGGA-RGGGGGGGGRGGGGA
1173





1662997
CCCCCMCCCCCCCCCC-MCCCCCCCCMCCCCC
1174





1663007
TT-TTTCTTTTCTTTT-TTTTTTTTTTTTTTT
1175





1663014
GG-GGRTGGGGGGGGG-RGGGGGGGGRGGGGG
1176





1663032
GG-GGRAGGGGGAGG--AGGGGGGGGAGGGG-
1177





1663033
CC-CCYCCCCCCCCC--TCCCCCCCCTCCCC-
1178





1663051
GGARAAAGGAG--A-A-AAGGGGGAR-GAGG-
1179





1663064
AAGAAAGAAAAGGA-G-GAAAAAAAR-AAAA-
1180





1663110
GGGGGGGAGGG--GG--AGGGGGGGG-GGGG-
1181





1663143
AAGAGGGGAGA-GGG-G-GAAAAAGR-AGAAG
1182





1663181
CCTYTTTCCTCATTTTA-TCCCCCTYTCTCCT
1183





1663182
GGGGGGGGGGGAGGGGA-GGGGGGGGGGGGGG
1184





1663188
AAGAAAAAAAAAGAAGA-AAAAAAAAGAAAAG
1185





1663219
CC-CCCCCC-CA-C--A--CCCCCCC-C-CC-
1186





1663226
CC-CTTTCC-CT-T-----CCCCCTC-C-CC-
1187





1663237
CC-CC--CC-CA-MC-A-CCCCCCCC-CCCC-
1188





1663250
GGAGGGAGG-GA-GG-A-GGGGGGGGAGGGG-
1189





1663251
CCTCCCCCC-CC-CC-C-CCCCCCCCTCCCC-
1190





1663288
CCSCCCCCCCCCGCCGCGCCCCCCCCGCCCCG
1191





1663298
AAGRGGGAAG-GGG-GGGGAAAAAGAGAGAAG
1192





1663445
GGGGGG-GGGGAGGGGAGGGGGGGGGGGGGGG
1193





1663463
CCCCCC-CCCCTCCCCTCCCCCCCCCCCCCCC
1194





1663479
AAGRGG-AA-AGGGGGGGGAAAAAGR-AGAAG
1195





1663486
GGARAA-GG-GG-A-GG--GGGGGAR-GAGGG
1196





1663501
CC-Y---CC-CC-T--C--CCCCCTY-C-CC-
1197





1663505
CC-CA--CC-CM-C--C--CCCCCCC-C-CC-
1198





1663520
CCCCC-CCCCCT-CC-TC-CC-CC-CCCCCCC
1199





1663534
GGRRAAGGGAGGGAA-GG-GG-GG-GGGAGGR
1200





1663535
TTTTTTCTTTTCTTT-CT-TT-TT-TTTTTTT
1201





1663553
TTYTTT-TTTTCCT---C-TT-TTTT-T-TTC
1202





1663562
CCMCCC-CCCCCAC-A--CCCCCCCCACCCC-
1203





1663564
AARAAA-AAAAGGA-G--AAAAAAAAGAAAA-
1204





1663568
AARRGG-AAGAGAG-A-AGAAAAAGRAAGAA-
1205





1663573
AAARGG-AAAAAAG-A-AGAAAAAGRAAGAA-
1206





1663620
TTCT-C-TT-TC-----C-TTTTT-TCT-TT-
1207





1663628
TT-T-G-TT-TT---GGTGTTTTTGTTT-TT-
1208





1663698
CCYCCCC-CCYCTCCTCTCCCCCCCCTCCCCT
1209





1663719
CCYCCCC--CYCTCCTCTCCCCCCCCTCCCCT
1210





1663724
GGRGAAG--AGGGAA-GGAGGGGGA-GGAGGG
1211





1663731
CCACAA-CCAMAAAA-AAACCCCCAC-CACCA
1212





1663747
TTKTTT-TTTKT-TT-TG-TTTTTTTG-TTTG
1213





1663777
GGRGGGGGGGRA-GGAAA-GGGGGGGAG-GGA
1214





1663798
GG-G--GGG-KG---TG-GGGGGG-GTG-G-T
1215





1663907
CCGSGG-CC-CGGGG-G--CCCCCGSGCGCC-
1216





1663935
AA-AGA-A--AAAGAAAA-AAAAA-A-A-AAA
1217





1663957
AA-ATT-AATAAA-TAAATAAAA-TW-A--AA
1218





1664033
AAGAAAA-AARGGAAGGGAAAAAA-AGAA-AG
1219





1664115
GGARAAAGGARAAAAAA-AGGGGGARAGAGGA
1220





1664472
AAAA-T-AA-A-ATTA-AWAAA-A-AAAAAAA
1221





1664473
TTTT-A-WT-W-TA-T-TTTTT-T-TTTTTTT
1222





1664494
AAW----AA-A-A--TTAAAAAAA--AAAAA-
1223





1664496
TTW---TTT-T-T--AATTTTTTT-TTTTTT-
1224





1664567
AATATTTTATTATTTTATTAAAAATWTATAAT
1225





1665072
AAAAAAAAA-ACAAA-CAAAAAAA-AAAAAAA
1226





1665111
GGGGGGGGGGGTGGG-TGG-GGGGGGGGGGGG
1227





1665167
CCCCCCC-C-CACCCCACCCCCCCCCCCCCC-
1228





1665182
T-AWAAAATAWAAAAAAAATTT-TAWATATT-
1229





1665185
AAAAAAAAAAATAAAATAWAAA-AAAAAAAA-
1230





1665206
AATATTATATWTTTTTTTTAAAAATATATAA-
1231





1665230
AAAAAAAAAAAGAAAAGAAAAAAAAAAAAAAA
1232





1665914
TTTTTT-TTTTATTTTATTTTTTTTTTTTTTT
1233





1665941
AAWAAA-AAAWATAATATAAAAAAAATAAAAT
1234





1666046
TTATTT-TTTT-TTTWT-TTTTTTTT-TTTTA
1235





1666527
TTTTTTATTTTATTTTATTTTTTTTTTTTTTT
1236





1666561
TTCTTTCTTTYCCTTCCCTTTTTTTTCTTTTC
1237





1666912
AAAAAAAAAAAGAA-AG-AAAAAAAAAAAAAA
1238





1667354
TTTT-TCTTTTCTTTTC-TTTTTTTTTT--TT
1239





1667743
TTTTTTATTTTAWTT-A-TTTTTTTTTTTTT-
1240





1667747
GGSGGGGGGGGGSGG-G-GGGGGGGGCGGGGG
1241





1667810
CCCCCCCCCCCTCCCCTCCCCCCCCCCCCCC-
1242





1668348
AAAWTTATAAAAATTAA-TAAAAATWAATAAA
1243





1668441
AAGAAAGAAARGGAAGGGAAAA-AAAGAAAAG
1244





1668887
GGRGGGAGG-RAAGGAA-GGGGGGGGAGGGGA
1245





1668893
CCYCCCCCC-YCTCCTC-CCCCCCCCTCCCCT
1246





1668996
TTGTTTTTTTKTTTTGT-TTTTTTTTGTT-TG
1247





1669034
AA-A---AA-W-WA-A--AAAAAAAAWAAAA-
1248





1669145
AAAATTA-A-AAATTAA-TAAAAATAAA--AA
1249





1669151
TTATTTA-T-TTATTAT-TTTTTTT-AT--TA
1250





1669190
GGRGGGAGGGRAAGGAAAGGGGGGGGAGGGGA
1251





1669348
CCTYTT-TCTYTTTTTTTTCCCCCTYTCTCCT
1252





1669541
TTYTTT-TTT-TCTTCTC-TTTTTTTCTTTTC
1253





1669557
GGGGGG-GGG-GAGG-G-GGGGGG-GAGGGGA
1254





1669564
AAAAAA-AAA-A-AATA-AAAAAA-ATAAAAT
1255





1669584
CCMCCC-CCCCCACCACACCCCCCACACCCCA
1256





1670005
GGARAA-AGAGAAAAAAAAGGGGGAAAGAGGA
1257





1670010
TTTTTT-TTTTCTTTTCTTTTTTTTTTTTTTT
1258





1670034
TTCTTT-TTTTTCTTCTCTTTTTTTTCTTTTC
1259





1670046
TTGTTTGTTTTTGTT-T-TTTTTTTT-TTTTG
1260





1670786
TTCTCC--TC-TCCCCTCCTTTTTCC-TCTTC
1261





1671225
TTCYCCTCTCCTTCCCTTCTTTTTCYCTC-TC
1262





1671483
AAGAGGAGAGRAAGGGAAGAAAAAGRGAGGAG
1263





1671607
TT-T--TAT-T-TA--TT-TTTTT-T-T-TT-
1264





1671644
GGGG--GRG-G-GG-GRGGGGGGGRG-GR-G-
1265





1671646
TTKT--TKT-T-TT-TKTTTTTTTKTGTK-TK
1266





1671655
TTTT-T-TTTT-T-TCT-CTTTTTTTTTTTTT
1267





1672660
CCCCCCCCCCCCACCCCACCCCCCCCCCCCCC
1268





1673096
TTGKGGGGTGKGGGGGGGGTTTTTGKGTGGTG
1269





1673273
AAWWTT-TWTAATTTAAT-AAAAATWAATAAA
1270





1673437
AAAAAAAAAAAAGAAAAGAAAAAAAAAAAAAA
1271





1673454
CCCYTTCTCCCCTTTCCTTCCCCCTYCCTCCC
1272





1673470
AA-AGGAGAAAAAGGAAAGAAAAAGR-AGA--
1273





1673499
CCCYTTCTCCCCTTTCCTTCCCCCTY-CTCCC
1274





1673512
GGRGGGGGGAGGGGGGG--GGGGGRGGGGGGG
1275





1673576
GGAGGGGGGGAGGGGAG-GGGGGGGGAGGGGA
1276





1674853
CCCCCC-CCCSCCCCCC-CCCCCCCCCCCGCC
1277





1674957
TTTTTTATT-TTTTTTTTTTTTTTTTTTTATT
1278





1674960
CCCCCCTCC-CCCCCCYCCCCCCCCCCCCTCC
1279





1674972
CCCYTTTTC-CCTTTCCTTCCCCCTYCCTTCC
1280





1675064
GG-RAA-AGGGGAAAGGAAGGGGGARGGAAGG
1281





1675070
GG-RAA-AGGGGAAAGGAAGGGGGARGGAAGG
1282





1675285
CCCCCC-CCCCTCCCCTCCCCCCCCCCCCCCC
1283





1675368
AA-A-T--AAAT---A----AAAAAAAA--AA
1284





1675396
AAAA-G-GAAA-G--A----AAAAAAAA--AA
1285





1675397
GGGG-A-AGGG-A--G----GGGGGGGGAAGG
1286





1675405
TTTT-T-TTTTTTTTTT---TTTTTTTTTTTT
1287





1675552
AAGAAAAAAGAAAAAGAAA-AAAAA-GAAAAG
1288





1675709
GGAGGGGGGAGGGGGAGGGGGGGGGGAGGGGA
1289





1675849
AAAMCCACAAAACCCAA-CAAAAACM-ACAAA
1290





1675982
CCTCCCCCCTCCCCCCCCCCCCCCCCCCCCCC
1291





1676018
AACMCCCCACM-CCCA-CCAAAAACMAACCAA
1292





1676087
AAAAAAMAA-A-AAA-AAAAAAAAAACAACAA
1293





1676145
AATTTTTTA-WTTT--TTT-AA-T-A-ATTAT
1294





1676226
AATA-TTAA-A-ATTT--AAAAAA---ATTA-
1295





1676227
TTGT-GGTT-T-TGGG--TTTTTT---TGGT-
1296





1676291
CCTCCCC-C-CCCCCTCCCCCCCC-CTCCCC-
1297





1676587
TTGT-TG-TGKGGK-GG--TTTTT-KGTK-TG
1298





1676704
GGAGA---G-GA-------GGGGG---G--G-
1299





1676705
AACAC---A-AC-------AAAAA---A--A-
1300





1676732
TT-TAAA-T--AA------TTTTT---TA-T-
1301





1676809
T--T-TA----T-T-TAT-TTTTTTTWT--T-
1302





1676925
TTAWAA--TA--AAA--AATTTT--AATAAT-
1303





1676981
TTCTC----C--CC-----T-------T--T-
1304





1677110
C--Y----C---T--T---CC---------CT
1305





1677142
G--K----G-G--T-T---GG--G------GT
1306





1677210
A-GAGG--A-G-G----G--AAA--A---GA-
1307





1677244
TTCYCCCCT-Y-C-CCCC-TTTT--TCTCCT-
1308





1677273
TTGKGGGGTGKGGGGGGGGTTTTTGKGTGGT-
1309





1552671
AGAG-AGGAA-GGG-AG-AGAAGA--AAAAGA
   8



(SNAP Stop)









127 SNPs (FP88-S127, light-grey), 21 insertions (FP88-I21, light-grey), and 27 deletions (FP88-D27, light-grey), possessed by Forrest, Peking, and PI88788 but not by Essex, confer resistance of Rhg1 to SCN (see e.g., FIG. 4). Six SNPs (C163208A, C163225G, G164965C, G164968T, G164968C, and C164974A) and 4 insertions (A164972AGGT and A164972AGGC) are located within the exons of SNAP and cause amino acid (AA) changes to the predicted SNAP protein (see e.g., TABLE 7).
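The variant labels in the paragraph above can be read mechanically. The following is a minimal sketch, assuming each label has the form reference allele, position, alternate allele (e.g., C163208A or A164972AGGT); the names `Variant` and `parse_variant` are hypothetical and are not part of the disclosure.

```python
import re
from typing import NamedTuple


class Variant(NamedTuple):
    ref: str   # reference allele (assumed to be the leading letters)
    pos: int   # position as written in the label
    alt: str   # alternate allele (assumed to be the trailing letters)
    kind: str  # "SNP", "insertion", "deletion", or "complex"


# Assumed label layout: <ref bases><position><alt bases>
_LABEL = re.compile(r"^([ACGT]+)(\d+)([ACGT]+)$")


def parse_variant(label: str) -> Variant:
    """Split a label such as 'C163208A' into its parts and classify it."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"unrecognized variant label: {label!r}")
    ref, pos, alt = match.group(1), int(match.group(2)), match.group(3)
    if len(ref) == 1 and len(alt) == 1:
        kind = "SNP"
    elif len(alt) > len(ref) and alt.startswith(ref):
        kind = "insertion"
    elif len(ref) > len(alt) and ref.startswith(alt):
        kind = "deletion"
    else:
        kind = "complex"
    return Variant(ref, pos, alt, kind)


if __name__ == "__main__":
    for label in ["C163208A", "C163225G", "A164972AGGT", "A164972AGGC"]:
        print(label, parse_variant(label))
```

Under this assumed reading, single-base labels are classified as SNPs, and labels whose alternate allele extends the reference base (such as A164972AGGT) are classified as insertions.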













TABLE 7

| Gene | SNPs/Insertions | Total AA changes | AA (nucleotide) changes in Forrest/Peking | AA (nucleotide) changes in PI88788 |
|---|---|---|---|---|
| Glyma18g02353 | 1 | 1 | 1 | |
| Glyma18g02420 | 1 | 1 | 1 | |
| Glyma18g02450 | 3 | 3 | 3 | |
| Glyma18g02520 | 1 | 1 | | 1 |
| Glyma18g02590 (SNAP) | 10 | 9 | 3 + 2*: D208E (C163225G); D286Y (G164968T); D287E* and −288V (A164972AGGT); L289I* (C164974A) | 4 + 2*: Q203K (C163208A); E285Q (G164965C); D286H (G164968C); D287E* and −288A (A164972AGGC); L289I* (C164974A) |
| Glyma18g02650 | 1 | 1 | 1 | |
| Glyma18g02660 | 1 | 1 | 1 | |
| Glyma18g02681 | 1 | 1 | 1 | |
| Glyma18g02690 | 3 | 3 | 3 | |
| Glyma18g02700 | 1 | 1 | 1 | |
| Glyma18g02720 | 2 | 2 | | 2 |
| Glyma18g02741 | 1 | 1 | | 1 |

Where * indicates that the amino acid changes take place in both Forrest/Peking and PI88788.






Example 7
TILLING Screening and Phenotyping

The following example describes additional evidence for the identification of a missense Forrest SNAP Type III mutation of A111D in SEQ ID NO: 7.


Forrest SNAP Type III missense mutant A111D was identified by TILLING screening of the newly developed, chemically mutagenized SCN-resistant soybean Forrest population. SNAP in the A111D mutant was sequenced to characterize the identified allele and the resulting amino acid changes within the predicted protein sequence. SIFT predictions were performed on the identified mutation. SIFT predicts whether an amino acid substitution affects protein function based on sequence homology and the physical properties of amino acids (reviewed in Henikoff and Comai, Annu Rev Plant Biol., 54:375-401, 2003). SIFT predictions with a median conservation (MC) value < 3.25 are considered confident. Changes with a SIFT score < 0.05 are predicted to be damaging to the protein.


As shown in TABLE 8, the A111D SIFT score (0.03) is < 0.05, so this mutation is predicted to damage the SNAP protein. The MC value of the A111D mutation (3.00) is < 3.25, so the SIFT prediction for the A111D mutant can be considered confident.
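The two thresholds described above can be expressed as a short check. This is a minimal sketch, assuming only the cutoffs stated in this example (SIFT score < 0.05 for a damaging prediction, MC < 3.25 for a confident prediction); the function name `interpret_sift` is hypothetical and does not correspond to any SIFT software interface.

```python
SIFT_DAMAGING_CUTOFF = 0.05   # scores below this are predicted damaging
MC_CONFIDENT_CUTOFF = 3.25    # MC values below this are considered confident


def interpret_sift(sift_score: float, mc: float) -> dict:
    """Apply the two cutoffs from the text to a SIFT score and its MC value."""
    return {
        "predicted_damaging": sift_score < SIFT_DAMAGING_CUTOFF,
        "confident": mc < MC_CONFIDENT_CUTOFF,
    }


if __name__ == "__main__":
    # Values reported for the A111D mutant (line F1292) in TABLE 8.
    print(interpret_sift(sift_score=0.03, mc=3.00))
```

Both comparisons are strict, matching the "<" used in the text, so a score of exactly 0.05 or an MC of exactly 3.25 would not meet the respective criterion in this sketch.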









TABLE 8

SIFT and MC prediction of TILLING-identified soybean mutant.

| Mutant line no. | Effect: Nucleotide | Effect: Amino acid | SIFT (MC) |
|---|---|---|---|
| F1292 | C322A | A111D | 0.03 (3.00) |










SCN susceptibility of Forrest SNAP Type III mutant A111D, as compared to SCN-resistant wild-type Forrest, indicates functionality of SNAP A111D mutation in soybean resistance to SCN (see e.g., FIG. 5).












SEQUENCES















>Gm18:1625005..1645004


SEQ ID NO: 1


AAGGAGAAGCAACAAATCAAAGGATGTTTACTTCTGATTTTGGAATGCATCCAAACACTC





CTCACGGGGAATTTTTTAGGCTCTCTACTCCCGCGCGATCTCTCTTGTATCACCAGCAAT





GAAACTTTGTCATATTTTTCCTTCTCTCTCCACACGAGCCATCTCCTATCCTTCTTTTTC





ATATTAGACATTAATTCAAACCCTCTTTCATTTAACTTTCCCTCATGAAAGTGAAATTTA





AATGATAATATTGGATTCTGATTGTGCAAATTAAATACAAGTAAAGGCTTAAATAAATAA





ATAAATAAATATATATATATATATATTCCAATAAATTAATACGTAATTTTATGATTTTGG





TCTTAATAATTTTTTTTAATATAATTTTTCTGTGCCTTTTCTAATAAGGATGATGTCAAG





TTAAAATCATGATGATGTCATTAAGTATCGCATCATCAAATGATGAAGTCATCCATTATC





ATATTAATAAAAAAAATGATATCATTAAATAGGTCGCCTACATCATCAAATGATATCATT





AAGGATTAAAAATAAAAGTTATATATATGCAAGAATAAAAAGTAAAAAAAATGACATTTG





TAAGAACAAAAAAAATATTTAAACATTAAATTAAATGTTTACTTCTATATACTCCCTTAT





CACACTTATCACTCTCAAGTTTGTTTATACAAATTCTATTTATCAAATATTATCACCAAA





CTTTCTCATACTAAAAACTTTTAGTCATGAATGTTGAATTGAGATACCATCTTTCATTTG





AATCAGGGACCAAAATAGTAAAAAAATATTCATTTACTAAAATTCAAAATAATATAATTA





TATAATTTTTTTGGTGAAAACATAATTATATAAATAAATAAAAGAAAGTAATGAAAAATA





TAAACACATATTAAATTAATCTAACAAATAAAAGGTATTTCAATTAGTTGACAATAAAAA





ATATACATTATTAACAAGATTATGACTTAAATTGTCTATCCACAATTGCCAATCAAAATA





CATCACTAAATATAATTATTATAATTATTGATTAAAAAAAAGCTATCAATCCATATTTTG





TAGAATAATACCTTAAAACAAGCATAAACATAAATTCTTAGAACTTAACAAAATACAATT





ATTATTATATTTAAATATATATTAATATAAAGTTCAATTTTGATCCTTATAGTTATATAA





ATTTTATCTTTTAGTTTCTATACTTAAAAATCATCTCTTTTAATCCTATGCATAATATTT





TTAATCTCTTTTAGTTCTTATTATGAGTTTAAACGGTATGGATTAAAAGAAATAAAAATG





ATACAAATTAAAAAAGATTAAAAATAATATGTGTAAAGATTAAAATACACAAATTTAAGT





ATAAGAATTAGGATAAAAATTATATAATTATAAAAATTAAATGAGTAATTAAGCATTAAT





ATAATACTATTATATGAAAAGTTTTGTTTCTAAGAAGAGTCTCGTCCACTTTGCTTTTTA





CAATCACATGTTAAAAAAATTATATCATATTAAGGGATGGCCAATTATCTATGCCTATAT





AGTTTATATGTTTTGAAAAAATTTGGGGCGTCGTGACTCCCCCGCCCTCAAATAGCTCCG





TCAATGACCATTCACATCTTATAGTTCAAGTATCATTGTTACAAGTACACATATTTATTA





CAATATATATACCTAAAGTAAGGTTGTTTTTGCTAACTTAAACATCAAAGTGATTTTGTG





GATAACCCACTGCCACTCAAGGAGTTCGGTGTCGCAGAGTTCAGGCATCTAAAATCTAGA





CCATCGCTTGTCGCACTTTGAACCAAACATGAAGGATGGAGCTTTGAGAAGGAGGGGTGG





AGAAGAAAATGGGGTGCAAATATATATAGCGATGAGGAGTATTTTAATATTTAATTGAAT





GCTTAAATATATTTTTATCTCCTAAATATGTCATTTATTACTTTTAATTCCAGTAAAAAA





ATATATTTTAATCCTTGTAAATTTGTTACTATTACATTTTGTGTTTAATAACATCATCAT





TTGATGATGTAACATTTTAATTATACAATCATTTGATGACATAATTTTATCTTTATATTA





TATTTGATTACATTATCATTTAGTAATGTGGTGCTTGATGATATTATTATGATGCCAATA





TAATATCATCACTATCGGTTCTGATGGTTTTTTTTTTAACAGCGAGAGAAACTAATTAAA





>Gm18:1625005..1645004


SEQ ID NO: 2


AGACACATAAAAAAGATAGAAAGTAATATATATATATATATATATATATATATATATATA





TATATATGTGTGTGTGTGTGTGTGAGTGTGTGTGTGTATACGGGTTTACAGGGTATGTAC





AACATGGTTATATATCATATAGTATCTAACTAACTATATATGAAAATAATTTCATAAATT





GTCATGAGTGGTTAGTATAATTTATTGTAAAAATTATTGAGATATGTATGTTTTTTAACT





GCATTGAGATATAGTTTTTTTGACTCATAATATATATTTTATCTGCACTCATAATATATA





GTATCAAGATTTTTTTTTTTGTCAGCGTCATATGTGATGTTAAATGTGATACTTTATATT





GATGAAAAATGCTAATAAGTTTATTTAGCATATTTTTTAAGGAATTAAAAATAATTTTTT





TATTCGAATGCATAAAATTGTGTTGTTTATAATTTTTTTTTATAATTTCCTAAATATTTA





CTTTTTTTAGTTTTTTAACCCATGAGCACTGGTTAACAAAACCATATATCAATTAGGGGT





TATAAATTTTTTAGGCATGATAATTAATGATAATAAAGTTTTAAACTTATGATTTATAAT





GCAGTGCAAAAGTCAATTTTCCAAGACATGTTAGGCAAAACTTGGATTTGAATTCTCTGC





CATCGGAGAAATCAGGCATTCACGTAGTCAGATGATTAAAAGCGTCGACTAGTGAAGATC





TAACAATTCATAAGTCATTTAATAACTTCAACTTTAAAAATATCTAAAAAATACTAACCA





TATGATGAAATCAAACGGTCCATAATTATGACTGTGTGAATGCAGAGAATCCAAATCCGT





AAAACTTCTGAAAAATGCATATCGAGCGTAAAATTTTATCGAGAATGAGCACGTACTCAG





AAGCATCTAGTATGAATTGAATTATACATATAGATTCAGATGTCTACATGCACACACATA





TATAATGAGACGGATCTTAGATCATATCATACGTATTTGGTATTTAGAGCTGTTACTTCC





TTGTGTTGCTTCCAGTATTGTCACCATTCCACAAGGACAAATCAGGTACCCACCTCTATT





CACATCTGTGTCCATTAATCATCACAACAAAACAATATCCTCCATACATATATCCTGCTC





CTACCTTGTCTATTGCCTTATATGCTCTGAACTCTCCTAGAGTACTCTTTTAGATCCCAT





AATGTTCACCATTCATCAAGGTTTTCCTAATTACTTTAATTAAGTGTTGGAAATGTTGTA





TTTTGAGTTGTTGGCCATATTTTGATGAGTTCTCAAAAGACCAGTTTACTTAGAAATTAT





TAATTTTTTTTAAGTTAGTATGTTCGCTATATGGCCAAGACTAGGGGAAAAAAACATAAC





ATGCATGTAACCCTAGGTCATAATAAAGAATTAAAACCAAGTTATTTTATTAAATAGTAT





AAGGGCTATCAATAAATAATTTTTACAAAAATTAAAATTAAAATATTGATATTTTTACAA





AAGTTCAACTTCTATGTTACTAATTAATAATATTCTGCAAATAGAAAATCATGTTAAATT





TTTTTTTTTAAGATAATTATTATAAAAGTTGATAAAATTATGCATGCATGATTATTTTAA





TGACTATTTGAAAGTGTAAGAATTATTTACATTATCTATATATATAATCGTTATTTTCTG





AGTTGAATGAACGTGTCCGAGTCATTATCCATATATTTGTATTTTTTTTTTTGAGTATTT





TTTTGTTGTTGAATAGTCCATATATTTGATGGTATCAGGAAAATGTGGAAAAGATACAAA





ATCATACAGGTAAGCAATTTGTCTTGAGTCGTCATGAGTCTGTTAAAAGTCTATCGATCA





GCAACCTTATGTAATATATACACTTTGGGTGCGATCATTTCAATCAAAATCACATTCCCT





AGCTTTGGCTCAGAAATCAATAATTGAGCAAGTAATTTTGTGGATACATAAAATAAAAAG





AGTTGGCCTGTATGGTGAGGTTCACAGGTAATTACACATTATTCCAGTTTATTTTAGATC





GAAACCAGATATTAATGATCACAAAAAAAGATGGATCGAAGAAAATAAAATATTGCTGGC





CCTCCCTTATGCTATTTCCTATTTCTTCTTATGCTTTGAACCTATTAATAGCACATTTTA





GCCTGAGCAAGGAGTCCGAATGTCCTTTTTCATGGACTTGGCAGTTGCAATAAAGTTGGA





GAAGCTGATAAGATTCTGTTTCAGATTTGATTATTCTTACTCACATGTTCCACGATTGAC





GAGAATATATATATATATATATATATATATATATATATATATATATATATATATATTGGT





TTTTAAGTGTGTCTCTATCCTCCCGTTTAATTTGAATTTGATGACTTTTTGGTAGGGTTA





AAGACTAAAGCTAAAACTCCTCTAAATAGTAACTTGTTCTTAAAAAAAATGATTTTTTTT





TATCAGAGTTAACCACACATACTTAATTTAAAAGATTTGAACTCCTTTAACTTGAATCAA





AAGATGCTAGCTTTTTATTAATGAATATGATGATATGTACGAGGTAATGTTTTAGTATGA





TTATTGGAAATTTGGCATCTTGCTAGCATGTAGGTGATACTCTTATTTCTCATGTTAAAT





AGTTAAATTCATTTTAAAAGTATAGAATACCAATAAATTGATTCTTAAAAAATAAAAATT





TAAATTTAATTATTGAATATGTAAAAAAATATGATAAATTAGTCTTGCAAATAATTTAAT





GGCAAGTTAGTTCTTGAAAATATAATTTATTATCAAATTAATCATTAAAAATTATAATTT





AATGACAAAATAGTTTCACAAATTTAATGACAAGAATAATTTATTGCATTTTTTACCTAT





TTAGAAATTAAATTTATGTTTTTCATTTTTTAAGGACCAATTTATCATCGTATCATACTT





TCGATAACAAATTTGACTATTTATATTTCTTATTAACATGCTCAATGATGATCTATTTTT





CTGTCTATATAGATAGATCTATGTTCTATAATTTACAATTGAATTATATACAAATATCAT





TATATAACAACATATATTAAAAAAAAACTTGTATTTTTTTTATTTTAGAGAGTTTTTGTA





TCTTTTTATCTTCCTGTTGGATGAATAATTTCCATGTACATATATACTAGTGCTTTGTAT





CATGGAACTTTTACATTTTTTTTTTCCTGGAAACAAGCCATGTTCATAGGCTTAAAAATA





ATTAAAGTGACTTTTATCTTTTTCAACAAAGTCTTCTTCATACGCATAACCAACAAATAC





ATATTTAAGAAATTCAATCATCTATCAACTGTGTTAAATCTTGTTGATATCTCTATTACA





ACACTTTTTATTAATCCGTGTATTAAACTTGAGATGTGGACTTAATTTTAACAATTAATT





AGAAAGCTATGCTAACTAAATGAGAGTAATTACAAGTAATAGAACCAAAATACAATAAAA





ACGTTCCAGAAATTAATGTTCAACTAGTATATTTTTTATGAAAAATAAACAGAAAAGTTT





TTAAAAAAATAAAGGGTTATAAATCACCTTGGTTGACACCCAATGAGATAATGGGCTAAA





ATTACTCATTCATTTCAAGCCCAGTAAGTCTGGGCCTAGCATTAGCTTCAAGTAGTTCGG





ATTCACCCGACCCAGATAAGACATTCGGGTTCGGATCCTGTAGTGTCATCACTCGTTAGA





TTTCGATGAAGAATAGTTAGAGAGTGCTGTGAGCTTCAGCAAAATGCGCGCCCTAGCAGC





TCAGTTCTCTAATGTAATGTTCTCTTTAATCCTTGTCTCTGCCTCTGTTCAATTTTAACC





CTTTATAATTTAAAGAAGAAAATAAAGTAATCATTTTCAAATTCAATTCATATTTGAAAT





CGATGTTGATCTTAAGTAAGTACCCATTTGCGGATTCATTTTATTCTTAACTTTTTTCAT





TTTTTTAACCCTTTTTGCAGTATTTATGCAGAAGAAAAGTTGGGGTCAATCTGCGATCTC





GTAATTTTTCATCATATAACAGTAAAGGTATCACCTTGTTTTTTTCTATCGGTACTTTTG





AAAGAAAGAAATAATTATAAGAAATTATGACTAATTTTGTCAGGCTATGGAATCATTCTG





TTGAGAACGAGAAAAATGAAATAGGTACTTGTATTGAACGTTCTAGTAGTAAATGATAGG





TACATATGTTGATTTTGGGGATTTCTGGGATTAGATCATCAAGCCAAAATCTTGCACTGT





AAAATACTAAAATACAAAGAAAATGGTTTTCTTTTTTCGTTCAATTTTTGGTTTCATTGT





TTGAGCTGCAATTATGTTGCGTGACTTCACTTGTAACTCTTTTGATTTCCAGATGAGCTA





ACCATCGAGGAAGAAGCTGAGAGAAAAGTTGGATGGCTATTGAAGACGATATTTTTTGTC





ACTGCAGGGGTAGCAGGATACCATTTCTTTCCTTATATGGGTATATCAGCAAAATCCCTG





CAACAATTTTTAACTTGCAAAGCCTTCATTTGCTCTGGGTAGTTCAAGCTTCACAATGCT





GTTAAAATTCATATTTTCAGGAGAGAATTTGATGCAACAGTCTGTTTCGCTTTTGCGTGT





CAAGGATCCCTTGTTCAAAAGGATGGGAGCTTCTAGATTGGCTCGTTTTGCAGTAGATGG





TAAGTTTTACTATCTGTATCTTTGTGTCACTAATTGCTTGCTGTTGTTGTTTTATGACCC





ATATTTCTTTGGCACATGACATATATATTGAATTGTTTATTAATTGTTAGTCATTTGCAT





ATTAGGGCAGTTGTTTTCTAGAACAGATTCCTATTCTTGCAACAAGCATATTTTCATTAT





CCTTGTGCTTTACACTAGTGACATTTAGTCATTTAGTATTAATCTCAGCTTATTCTTGAA





ATGACAATTTTGGTTGAAGGGGAGAGTTGATGGGAGATTTTGAACTTGGATAAAAGAGTC





ATAGATTGAAATTTTTCTCTTGAACTGATAATCAAATAGTTATTGAGATTTTTAATTGAG





CTGCATTTGTTAAGAAGTCACGGCTAAAAGAGTTACCTAGTTGTCAGTTATACTATTTTC





ATGACTAAGCAGCAAGCACAGATATTGCAGTGATACACAACCGAGAGCATATTCTCCAAA





AGGCAAATTGCCTTGATGCAATTTGCTAGTTTGTACACTGATAGAATTGCTTATTTATCA





ATAGTGTTCCAATGTATAGGTATCTCATGGCACTGATTAAGGATAAAACATGGTGATTTA





TTCCTTTCTAAAATCTTTTGTCCCTGCACAAGTTGTTTATATTAAACATTGTTTACCTTA





CTTTTGCTCACAAGATGAAAGAAGGAAGAAGATAGTTGAGATGGGTGGAGCTCAAGAACT





CTTAAATATGTTAAGCACTGCTAAAGACGACCGTACACGGAAAGAAGCATTGCATGCTCT





TGATGCACTGTCACAATCAGGTGAAATCATAATTTTAATATTTTTTTAAATAGTTATTAT





CATGCTGGTGGAGAGGTAGATTATTGTGATCAATTAGTACCTTTGTGGTTCTAAATAGTA





AACCAGAAATGCCCAACCACTTATGGAATTTGTTAATTTATTTGTATAATATTGAGCTGG





AAATCAATTTTATGAGCAAAGTGTGATAAGAAGCATATGAAACTTTTATATGTTTCCAGT





TTCTGTTATCCTTATTCAATAGATATGGGCTTGTAAAGATGAAAATGAACATAAATTGTT





TTGTGCATTAATTTTGGGACATATTATACGTGCACAAGCTTATGTGTAACAATATCTATA





CCTGCTACTTTCCCTGTCAACATATTGATTTTTAAGAATCCAGTTCAAGTAATATTTATG





AGGTTGAAAGATATGCAACAGTACAACCAAATTAGTAGTGCTAGCTAGTACTAGCCTTTT





CTGTTCTCTTTCGATTGAGATAAGCTACTTCAATGTTATAGATGAAGCTCTTGCATCCTT





GCATCATGCTGGGGCCATTTCAGTAATTAGGTCTGCACCAAATTCACTTGAGGATGCAGA





AGTTGAGGGATTCAAGTTGAGCTTGATGAAAAGATTTCAAGATCTCAGATATGATGTGCC





ATCATGACTTGAGGTGCATGCCTCCTTTTGCTTTATGTTTTTGGTTGGTTGGAGCATGAA





ATAACATGATATGAGAAATTAAGCTGGCAACCAAAGCTTTTGTGGGGAAGAGTACTTGAA





ATTACTGTGTATCATTTGACCAAATCTAATGGAAGATTATAGTTCTATTGTCATTTTAGT





TTTTTTCACTTGTCAAATGCGATTTGTCGCTCATTGTTCTGTCAATCATAATAAAATGGA





AAAGATTTATGTGCATGTCAATTTTTATTTTTTGAAATATGTGTTTAGAAGATAAAAGAT





TACAACAAACTAGTATTGAAGTTGTAAGTGTTTAGATACTGTAATTGTATATTTGGTTAA





CACTACTAGATTAAATTTAAGCCTCAACTTTCAAATGTGATTGATCATATAATGTCATAA





AATGTGTGTAATTATAGGTTGATGCTTTAATTGTTATTTACATATGCCTCAACTCTCAAC





CGTATGTCATCATCAGGCTTCTTGTTTAGGTTCAGCTGGCGCCTTGCTCGTTCTATTTTG





TTCGTATTCCTTTGTTCATTCGATGTTTTTTTAGGAAAATATGTTCGTTGAAAGAAAAAA





TCAGTCAACAGAAGATACATGCCCTTACTTTTCTCTACTCCACGTCTCCACCTACCCTAA





CCCTTGGAGTTACTTTTCAATTGAGCACGTTAACAAGCCTAACTAAACGTGGTTCTGTGG





AGATAATATACTAAAAAAATATTATTTTTTATTTAATTTAATAAGACTTGTGCGCGTAAC





TCTTTTCAAAGTGCTAGCTTTCTTTTTTGGTGAATTTTCAAAGTGCTAGGTGAATATGCG





TATTTGGAGATAGAAAGCTTTTTTTTTTGGGACACAAAAGCTTGTTTAACATGTGATAAA





CTTAAAACTAATAATCATTTTTTTTAATTATCCCTATTCAATGTTTTAGTTTTAAAAATA





CCCCATTTGGGAAAATAGCCCATCTGTGGATGTAAAAATTACTAGAGTACAAGTTAATTA





GGGTTAGTTGTTTTTTTTTTCTTTCTTCTTTTCCTACGAGATCAAGAGGAACGGAGCAAA





ATCATGTTTTTTCTTCCACCAAAATAATGTAAAATTTTAAAACAAAATACTAAAATAATA





GGTTATAATATATTTCTTTTCCTTCATTTTATACTTAGTCTTTTATTTTTTTTAAAAATA





TCATTCTAGTTATTTAACTCTCACATTTGAGTTAATGTTATACTTTTGAAACTTCAGTCA





TTTTTTATTTATTTTGGATACGATTTTGAGCCTTTTTTCTTTTACTAATTAAAGATATAA





AATTTGTCTCATTAAATTAAACTATGAATCCAATTAATGTTCAAACAAAACATAAATCCT





TAACATGATGAAAATTGAATGAAAAATGAATTTCATAAGCAATTGGAAAACTGAAAAAAA





AAATCTAATAACAATATTTGAAAATCTATTAACAATGAGCAGTAGGCTTCTTTGGAACTT





GAAATGAAGAATAAAAGAGTCATTGGAAATAAATCTCAATTAATTGAAAGATATTTTTTA





GAAAATGTCATTAAATAGAATAAAAATAATCAAAATTTGCTTATATTTGAATCTAATAAA





AAAATTGTTAATTACTCTTTTTGTCTCGAATATAAACAAAAAATACTTTGAGTATTAGTC





TCAAATAAAAGTAAAATTTAACTATTGTTACTTTATTTAATGAGATATTCCTAAATTATA





TTTTATTTAATTAAAGTTTTATTACTTATTATCTCTCTTTTTTATTTTTGATATGAACTT





TCAAGAAAGTGTAGTTTAGAAGAAGAATTTTTTTTAATAAGAAAGGTGTAATTAAATAAT





CTAATTATCTAACTACTAACTTTTTGAATAAACCGTAAGTTAATTTCTTTTATATAGAAA





GAAGGGATTAAATTAATTTTAAAAGAGTTCCTCTTCATTTTAATCATTAATTTTTTTGAA





ATTCAATAATCAATACCTACTCATTACAATAATAAAATACAAAAACTTCTTCACGAAATA





>Gm18:1625005..1645004


SEQ ID NO: 3


AATTGATTCCTTGTGTTATATATATATATATATATATATATATATATATATATATATATA





TATATATATATATATATATATATATATATTCAGTATATAATTTAGTGTAGCTGGACACAT





ATTAGTGCCCCGTGGCCGTGTGTTGTGCTTTTTTGTTGGGCGAAACAACGGCCAAAGCGA





CGAATCACTTCCTTACGTGACACACCTCTGTCTAATAGACGATAGGCCAAAGTCACAAAT





ACTTTTTGATTGAGTATTTTTTAATGCCACATATCATATCTGTCAGCGTCACATGTTCAA





ATAAATCCCTAGTAAAAACGCCAGAAAACAAATGCATGAGCAATTTTTGGACTTTGGACT





AGTTACAATTTTTCAACGTCACATCTTTAAATGATTTCGTCTCTATTTAGTAGTTGTTTT





TAATGCGGCCATGCCCACTTTCTCGTTACAAAGCATGCATTTTATTATTTGAACGAAAAT





ATTTAATATTGTGTATTACTCGTTTTACACAATTGCTTTTATTCTTTTTTTTTTATTTTT





TAAATACAGTCATCTTTAAAAACAAAATCTTCGATCATTCCATTTCATTGTTCACAAAAC





ATTATCCTATCACATGCACCCTATGTAATATAATACACGGTTGTGGATAAAATAATTCTG





CACCTGCCCAACTTTTGTATTGATATCATTTTTTTATTTCCTCTATATTTTCCATATTTA





TATATTAATTCTATCACTTTCCTGCACCCCAATAAGTCATGCTCCAAATATTATTGTTTT





GGATAAGTATATTGCACAATCTATTCTTGGCTTCTTCATGACCATGACACGGCAATGGAG





ACGAACGATAATGAAAGGCGCGTAAATCATTTGAAAGGTTGAATTATGTACCAAATGCTA





TTATATTAGCATTTCATACGCCATTCTAACAATAATGAGAACGAGTTGCCCCACAATTGA





TCAATATTTGTATCCTTGCACGGCACAAACTTGTAAGATATGGCCCCAACTTCGTCACCC





CATCAAGTTGATTTCATTTCTTTAATATTTGGTATTTTACATAGAAATGCTGCCACAAGA





CATGAATTCTACAATAAACAACAAGGGATCCAAAATACAAAAGTAACACAATCGCCACAG





CAACCAAATGCCTCTAGAATCTCCCCAGCCATACCCTCTCTCCCCAAATAAAATTTTCTA





TACATATAAAAAAAACAATAGAAGGGGAGAAATGAGAGAACCAGCTTGTATTTATGACTT





GCTACTAAAAGCATTATATATGTTGGTGGAAATGGCAAGCACACTTGTAACCACAGCTAG





TATAATCATTATCAGTGCAATAATTTTGTCTCTTCTCGTTGATATACCTTTAACATCCCT





GTCATCAGAAACAATGAAAATTAAGAGACTAATTTATTTATTTATTCAATTCAATTTTCA





ACTTGAGCTCTACTTTATCAGAAGTCAATAAGTCTATGCTAGATTACCTTAAAACAATAG





AGCCGGGGAAAATGAAGGCAAGGCACACTGCGGATGAGGATCCCAGGAACTGAAAGAAGT





ACCAAATATCTGGGATTGCTATAGCTGCAAGGTAGGAGAATACAAGCAGCACCAGAGTGA





GGATCATAAATCTTTTGTTGTCTGTGGCTAGCATAGGCTTCTTAGGGAAGAGAACTTCAT





CTATGTTGGTTCTCAAAGAGAAGTTCAAGAGAGGAAACACCAGCATGATGTGGAGGGCAT





AGCTTACACGGACCAAACTATTGAGCAAGGAACCAACTGCTGAACCAGCATTCTGGTCAA





AATTGATGAGAATGTCTGACTGGGTTGAATCCCCAAATAACATGTACCCAAATAAGCCTA





TTGCAAGGTAGATCACAGCACAAAGCAATAATGCTAATCGAACTGCTGTTGTCATTTGGG





ATGCCTTGGCAAGCTCAAACCCAATGGGGTGCACTGAAAGTAAAATCAACCATCAGCACA





GAAATTTTCTAGGCATACCCTTAAAAAAAATGAGCAAATCCATTAGGCCAATTACCATTA





AAGTGAAATGTGAAGGCTGTGACAACAACAGGAACTGCAGTGAACAGATCAAAGAATGAG





GTTTGGTAGTCTAGCCGAGGAAACAATCTAGGAGTTTGTGTTTTTCCTTGCACCAGAGCT





GTGATAGCCAACCCACAACATATGCCAACAAATGCCACTGCAAGAAGAGTTGACACTGCA





GAGCTGTACTTCAAGGACTCTACCAATTCCAAAACACCAAGAGGAAGAACCCATGTTAGT





TACTGAAACTCTTCAATTCCAAATACACAAGAAAGCATGTTATATAACAAAATATTCAAT





TGTAACACTCACCTACACGTTTGTACAATACCAATGGAAGCATAACAAAGACCAAGGTGA





AAAGCAAAGCAAATTCCCGGGAATTCCACCAGTGAATTCCAAACCACTGTTGCAAAATGC





CCAAATGCACTTCCCCTCCATTTTGCTTTCCAGATAGCACATCTCCTGTTGTTCAACAAA





AGGTAATTATTAAAATACATCTATTTTTTTCTTTCATCTTCTTTCTACTACATTTTCTTC





TTATATTTCTCTTTATTTCCTGTCATTTCACTTTTTTTTTTCCTGTCTTTCTTCTACCCA





TTACATAGACCAAAAATTGAGGTGTGCATTGCGGAAACAGATTCCCCACATTCACTTTTT





TTTTCATGAATCAGAACTATTAGATTGAATATCGAAGGTTATATTTGGAATACATATTTT





AAGAGTATGTTTGGATACAAAGTTAGAAGTGCATTTGACAACTTTTGGGAGCTCCTCTAA





TGGAAAAAGATTAATATTTTTAGTAGAAAACTCTCAAAAACACTTCTAGTTTGTATCGAA





ACAGGCCTAAGTTAGATTCATTGAGATAAGTTATGAGTAATACAATCTGTATTTAACAGT





GAATTCCCAGCGGTAGGAAATCCCGATCCTTACATTTCCATTATCTTGTTAAATTAATTA





CCATACACAAACATTAAAGAAAACGACTTACTACATATATTTTTTCAAAAAATGAACCTT





TCTATTTATAGTAAAAAAAATAAAAAACAAATTTCACTTATTATTAGCAACAATTTCACC





AATCAAAATGAATCTGACTGAAAACCCGGCAAAACTCAGAACAAGCATACCAAACCAAAA





ACATGAAAAAATCTACATTTTTTTTTCCTTTTTTACGAATTCAGTAGAAAGGAAATTAAA





AAAAAAGGGAAAAGTTCCGTTACGTTACCGATGATGATAAGGTAGAGAATTAAACCCCCA





ACGTTGGTGATGATGACGCAAACTTGCGCGGCTAATGCTCCACCCGATCCGAACGCCTCC





CTCATGACGCCAGCGTACGTCGTCGTTTCGCCGGAGTGCGTGAACCGCATCAGGAAGTCC





ACGGACAGTTCCGCCAGCACGGCCACCACGAGAATCATCGCGAAAGCGGGAACTACGCCG





AGAACCTTCATGATCGCCGGAATCGACATGATTCCGGCGCCGACTATGCTGGTGGCCACG





TTGAACACCGCGCCGGGGACGGAAGCCGGCGGCGGCGTTCCTTTGGAATCCCCCAGGAGG





GGGACGCTGACTCCGGCGGCCGGAGACATGCCGGAGGCAAAATTGTGAGGATCGGAGAAA





GTGCGGTGGCGGTGTGCGGTGCCTGGCAGCCTACTATCTTTGAATTGAATGGTTTTTGTT





TTGTTGTCTCTCACGAAAATTTCACTTCCTCTCTCTTTATAAATGATACAAGTGGATTTG





GGAAGTTAAGAAAACAAAAAATGAAGTTATAATAAGTAAGATTTTATTTTATAAGTTTTG





TAGGATGAAAAGAAATATAGTTGAATGAGGAAATTTCATTGAAAATAGTTAGCTAGATTT





TATAATAGAGATTAAACAATAATAAAATCTGCAGATACTTCAACATGAGTATGATAATAA





TAATAATAAAAAATTGTTGTTTTCTATTTTTACTCCAACATGGACTGAAATTCATATGAA





TTTTTTTGAATAGTCTATTTTTTTTTATTTAATTTAATATTCATATCAAAGTTATTTCAT





ACTGAAAAAAATATTAAATACTAGCATTCTATTATTACCATTTGGAGGAATGATTGAAAG





AGTGTTAAAGTGCACCTTTTCAGTCAACAGTTAAAAATAAGGCGTTTAATTCAATTCAAT





ATTACAAAGTTAAGTTGGCTGTATAATAATAACAGTGGTAGTAAGTAGTAGAGTGAAAGA





AAAATTTTTTTGGTCAAAATATTTAAATCAAGACTAGAAGATATGCAAATCAGAGATTAC





ATTGGATGATACGGTCGACCAATAAAAAATAAAAGAAAAAACATAAATTGGGATGTTCAA





ATACTAATAATAATAACTCTAAACAAACATTAACACGTGAGTTTTCTTTCCCACGTTGTA





ATCATTTTGAATTTTTAAAATGTTATGACACAAATAATAAGTTAATAATAATTATAATTT





AACATTTGAATTGATAAAAGTGTTTAGTTTTATTGTAGATTAAACTAATCTTTCTTCGAG





TAAAAATAACATTAAATTCCTACACAACAGGTTTATCAGTTTATAGAGTAATAACACTCT





TATTCTTAATCGTTTTCTTTTCTGGAAGAAAAAATAAATCTTAGTCTTGTTATTTTTTTG





AGAATGTAAAATATACCTTAAAAAATTCCCTTAAAGTTTGTATAATTTTTTGGTATGTAA





ATATATTTATAAATAAAAAAATGTTTGCGAAAAGTAATATTTACATAACAAACACTATTT





ACAGAACATTGATGAAATTATTTTTAGATATATAATTATTAATACGAATATATGAATATG





TTATTAAAGTAATCAATAGTTATGTTAAAACTGATCTGTTGACTAGACAGTTTGTCAATT





TATTTTTTATTCACTTAATTGCTATTTTTTTCTAGGTTTGTTCTTTCGTTAAAAAACCTT





GCATTGGAGGAAGGCCAATGCTAGTTATAAAAATATAAACCATGATTTGAATATAAAATT





ATTTTTAGTCGAAAAACAATGAATTATGTTGCAAGTATCACTATTGAAAAAATGCCAACG





GAGCCCAAGAAGGTGAGGCCCAAACTGAAAGCGTGAAGCGGCCCAAGACTGAGTGAGGAA





ATAAATAATTATCCAGAAAATCGGAAATGGACAATCCTTCTTGTTACGCAATTCTGAATT





TGCGGGTTTTGGATTTGGACTTGGTCGTCAACACAGTCTAATTAATATCTTTTTGCTCCT





TCGCTTATGAATCTTCTTCTTCTTCTTCTTGTTCCTGCAACGCACTGAATTCGATCAATC





AATCCATCTTCAATTGCTTTGTTTCGATCGGAGGAAAATGGCCGATCAGTTATCGAAGGG





AGAGGAATTCGAGAAAAAGGCTGAGAAGAAGCTCAGCGGTTGGGGCTTGTTTGGCTCCAA





GTATGAAGATGCCGCCGATCTCTTCGATAAAGCCGCCAATTGCTTCAAGCTCGCCAAATC





ATGTTTTTCCTCTTTCTCTCTACTTTTTTTAAATTCCATTTCGTGTCTCCTCAAAATGCT





GATTTAGTGTCATAAATCATAATTATTATTCTCTTCTATTGTTGTTATTTTATTGTTATT





ACTTCAATCGACGAGTGTGTTGAGTTTTGAGGTGTCCGATTTCCCGATTAATTGAAGTAT





AGTTTTAATCTGATTTTACTGGAAAATATTTTTTGCCTGATTTTTTTTTTTTGGAACAAT





TACTAGCATATAAATTAGAATTGTGGATGAAGTACGACAATCAACTCTGTGTTGTTTGTG





ACTGCGCTCACTTTCAATTTGACGACTAATCTCTTTATTTTGTTGAAAGTGACGAACTTT





GAAATTGATGTTGGAATAGTTCTGTTTATTGTTCTTGATTTGATCTATGTGGCATTTTAG





GGGACAAGGCTGGAGCGACATACCTGAAGTTGGCAAGTTGTCATTTGAAGGTAACATTCA





TCAGACTTGGGGTTTTGGAGTGGGCTGAATCTCTTTTGCATCCTTTAGTTCTCTATTAAG





CCTGCATGACATTGTTGTGTTCTGTTTCCATTTAGTTGGAAAGCAAGCATGAAGCTGCAC





AGGCCCATGTCGATGCTGCACATTGCTACAAAAAGACTAATATAAACGGTATGCATGTGT





CTCAGTTGTTACCACTACATGCACTACAATACTTTCTCATTTATGATTTGTGCTTTAAAT





GCTGCTCTTGCTTCCATGCAGCAAGGCCAATTCCTTTTAGCCTCAATGTTTCTCTGTATA





ACTTTAATGTAAATCATATAAAACAATTGCTACCTTTTTGCATGAACAAATTATATAAAG





CAAATCTCTTTGTTTAATCTTTACATATGTGTAAATCAAATACTGGGCTTCATATCGATA





AGGTCTAAGTAGGGGTTCAGTCTTTTATTTGGATTAGTTTAAGTCAGAAATTGAAGTTAA





TTTGTGCTTGCATAAGTTGCTTCCATCTGATTGCTTTCTTTTTATGGCTGTCTGTATGTC





ATAGCCTTATTTTGATTTGTTATTTGCTGACTATTATTAGATTGGAACTCATGATCATAT





CCCTAAGCAGGAGCAAATTATTTTGCTGTCTTGCTTGTCTTAGTATGTCCCACTTGCATT





AGGAAGAACTAAGACAATTAAAGTTACCTTTTCTTTCTTTGAATACAGAGTCTGTATCTT





GCTTAGACCGAGCTGTAAATCTTTTCTGTGACATTGGAAGACTCTCTATGGCTGCTAGAT





ATTTAAAGGTATATTATGTTTATGATATTGATATCTCTTCTCCTGGGTATGATTTTTAAT





TTATTCTCTTGTCCATATCCCAGATTTTAGATATTGATCCTGCAATAAAATGCGTTGAAG





TATACTAAGTTATCTGAATCCCCATTAACATGTTTTAACTGGGTTCACTATTTTATACAC





AGGAAATTGCTGAATTGTACGAGGGTGAACAGAATATTGAGCAGGCTCTTGTTTACTATG





AAAAATCAGCTGATTTTTTTCAAAATGAAGAAGTGACAACTTCTGCGAACCAATGCAAAC





AAAAAGTTGCCCAGTTTGCTGCTCAGCTAGAACAGTAAGATATTGTCCTTTCTGCATATA





TTATCTCTTTTATTATGCTGATGAATTGATCAATATTTCTTCAACTTGGGTTTATTCTTT





AATTGGTTAGTAATTTCTTCTGAGAACTTTCTTTCTGGCCTTTATTTTGTTCAGTACCCT





TTCTCTAACCCACTCTCCTCAGGTTAACATTAGCTTAGGTCAGTGTAGGTTGTTTGACAC





TGAGTTTTTATTGGTATGGATGTATGGTCTATTATGATCTCAATGGAAATCTAGCATATT





TTTTTTCCACAATCCATAATATGATGACTTGTGTACATGGTGTGAATAAAAGTCAGTCCA





TTGCTGCATTTGGTATTGGTTACGTGTTACTGTACTTTCTGCATATATTATCTCTTTTAT





CATGTCGATGATTTGATTAATATTTCTTCAATTTGGATTTATTCTTTAATTGGTTAGTAA





TTTCTTCTGTGAACTTCTAGTTAGAGCATGAACTGCTAAAGAAATCCAAAACTTTATTTT





TTACATGGAAGGAACTTTATCAGAGTTTTATTTATTTATTTATTTTTATGTTAAATTGAA





CTTTAACTGTTTCTATGTTATGATAACTCTTCTTCAGATATCAGAAGTCGATTGACATTT





ATGAAGAGATAGCTCGCCAATCCCTCAACAATAATTTGCTGAAGTATGGAGTTAAAGGAC





ACCTTCTTAATGCTGGCATCTGCCAACTCTGTAAAGAGGACGTTGTTGCTATAACCAATG





CATTAGAACGATATCAGGTCTAAGTTTTTTCAATAGTTCACTTCTGGAGACTGGACAGCT





TATTTGTTGCTAAATTATTCAGATATGTTTTTATTTTGCAGGAACTGGATCCAACATTTT





CAGGAACACGTGAATATAGATTGTTGGCGGTAGGTCACTGGTTTTGAAATTTCGTTATGA





ATTTTTTATGACCAAGTAAATTGGATTAGAATATTTGAACTTCTTTGTAGCTGTCTCCTG





GGTCATAATGTTTTATTATATTTTTGTATTTATCATAGCATTGTGATAGCCCTGTTACTA





CTTTGTTTGCTGATTTACTCATACATTTGCCAGATGAAACTGACATTTTTTTTTAATCCT





GGTGGATAGGACATTGCTGCTGCAATTGATGAAGAAGATGTTGCAAAGTTTACTGATGTT





GTCAAGGAATTTGATAGTATGACCCCTCTGGTAAGCTCCAAAAGTTGTTAAATAGGATAA





CTTCTAGTGGTGTTTAACAAAAAAAAAAATTCCACTTGTATTTTTTATCCACATTTTATA





ACAGAATAATCATAACCTTTCACAACTTAATTCTCAATTTTCACAGTAATTAAATGTGTA





ATTTTAAAAAAATATTTTCCTTAACTTAAACCTGATTGAAATTTCCCCCTGAAATTTAAG





TTCTATTTGATTACCTAGAGTGTAATTTCCGTGTTTTGTCACTTAATCACTGTGTAAAGT





TAATTTTTTTGCTTACAAAGGTGTCTTGTTTGGAATGCTAAAATAACAAGTACACGTGTC





ACCAAATTTAGTAGGATTAACATTTGTTGTTTTTTGCCATAATAAACGGTTGAACTTAAC





ATTTGTTGTACGTGTCATCAAATTCTACAAATTGTGAGCTGCTTAGTGGGTTGGACAAAC





ATTTTAGCAGGTGGTTTCGATTGCCTGTTGAATACGTGAAATTAAACCAAGGCAAAATTA





TAATTTGTTTCTTTTGTCTGTGTTTCACTCATACACATTGAATCTTGATGATACACAGCC





TTGTTAATTGTTATCCTTCCAATTTTTTTTAGTGTTTTTGAGCATCTATTCTTGTTGGTC





ATGTGTTTTCTTCACTCATGTACCTGGTTCTTTTCCTACAACGATAAATATGTATCCTTT





GTTTTTTTTTTCCAACTAAATATGTAATTTCAAATTTCTAATCAATCATTGCTTCCAAAA





TACTCTCTCTGTTTCAAAATAAGTATTATCCTATATTGTTTTACAAGACCAAGAAAAGCT





AATATATAGATGAAAGAAATTAGTAATTTTACAAAACTAACCTTAGTATTAATATTATAC





TGAAAAACTAAATTGACACTTATTAGGGGTGTTAGTGTAAAAAAGCAATTAATATTACAT





TGAAAAGCTAACATGATACTTATTTTGGGACAACTTTTTTCTTTCAAATGCAACACTTGT





TTTGGAACGGAGGGAATACTAGATATTGTGCTCCCTTGTATGCCCTGGACATAACGTATT





TAACTGGTCTGGATGAGTTTATGAATGTCATTAATTTAGGGGGAGTCATTTAGAATAGCT





TACCTATAAGTACTTTCTAACTTTTCTCAATTAGTTTCACAGTGCAATTTATTAAAAATG





TCTGTATCTAATCAACATTGTCTGTGTGCTTGTGCAGGATTCTTGGAAGACCACACTTCT





CTTAAGGGTGAAGGAAAAGCTGAAAGCCAAAGAACTTGAGGAGGATGATCTTACTTGAAT





TGTACCTTTAATATTCCTGG





Glyma18g02570.1:peptide


SEQ ID NO: 4


MRALAAQFSN YLCRRKVGVN LRSRNFSSYN SKDELTIEEE AERKVGWLLK TIFFVTAGVA





GYHFFPYMGE NLMQQSVSLL RVKDPLFKRM GASRLARFAV DDERRKKIVE MGGAQELLNM





LSTAKDDRTR KEALHALDAL SQSDEALASL HHAGAISVIR SAPNSLEDAE VEGFKLSLMK





RFQDLRYDVP S*





Glyma18g02580.1:peptide


SEQ ID NO: 5


MSPAAGVSVP LLGDSKGTPP PASVPGAVFN VATSIVGAGI MSIPAIMKVL GVVPAFAMIL





VVAVLAELSV DFLMRFTHSG ETTTYAGVMR EAFGSGGALA AQVCVIITNV GGLILYLIII





GDVLSGKQNG GEVHLGILQQ WFGIHWWNSR EFALLFTLVF VMLPLVLYKR VESLKYSSAV





STLLAVAFVG ICCGLAITAL VQGKTQTPRL FPRLDYQTSF FDLFTAVPVV VTAFTFHFNV





HPIGFELAKA SQMTTAVRLA LLLCAVIYLA IGLFGYMLFG DSTQSDILIN FDQNAGSAVG





SLLNSLVRVS YALHIMLVFP LLNFSLRTNI DEVLFPKKPM LATDNKRFMI LTLVLLVFSY





LAAIAIPDIW YFFQFLGSSS AVCLAFIFPG SIVLRDVKGI STRRDKIIAL IMIILAVVTS





VLAISTNIYN AFSSKS*





Glyma18g02590.1:peptide


SEQ ID NO: 6


MADQLSKGEE FEKKAEKKLS GWGLFGSKYE DAADLFDKAA NCFKLAKSWD KAGATYLKLA





SCHLKLESKH EAAQAHVDAA HCYKKTNINE SVSCLDRAVN LFCDIGRLSM AARYLKEIAE





LYEGEQNIEQ ALVYYEKSAD FFQNEEVTTS ANQCKQKVAQ FAAQLEQYQK SIDIYEEIAR





QSLNNNLLKY GVKGHLLNAG ICQLCKEDVV AITNALERYQ ELDPTFSGTR EYRLLADIAA





AIDEEDVAKF TDVVKEFDSM TPLDSWKTTL LLRVKEKLKA KELEEDDLT*





Forrest SNAP Type III polypeptide mutant A111D:


SEQ ID NO: 7


MADQLSKGEE FEKKAEKKLS GWGLFGSKYE DAADLFDKAA NCFKLAKSWD KAGATYLKLA





SCHLKLESKH EAAQAHVDAA HCYKKTNINE SVSCLDRAVN LFCDIGRLSM DARYLKEIAE





LYEGEQNIEQ ALVYYEKSAD FFQNEEVTTS ANQCKQKVAQ FAAQLEQYQK SIDIYEEIAR





QSLNNNLLKY GVKGHLLNAG ICQLCKEEVV AITNALERYQ ELDPTFSGTR EYRLLADIAA





AIDEEDVAKF TDVVKEFDSM TPLDSWKTTL LLRVKEKLKA KELEEYEVIT
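The relationship between SEQ ID NO: 6 and SEQ ID NO: 7 at residue 111 can be checked with a short sketch. It assumes the usual substitution notation (original residue, 1-based position, new residue) and uses only residues 61-120 of each sequence as printed above; the helper name `apply_substitution` and the constant names are hypothetical.

```python
def apply_substitution(segment: str, mutation: str, first_residue: int = 1) -> str:
    """Apply a single-residue substitution such as 'A111D' to a peptide segment.

    `first_residue` is the 1-based position of segment[0] in the full protein.
    """
    ref, position, alt = mutation[0], int(mutation[1:-1]), mutation[-1]
    index = position - first_residue
    if not 0 <= index < len(segment):
        raise ValueError(f"position {position} lies outside the segment")
    if segment[index] != ref:
        raise ValueError(
            f"expected {ref} at position {position}, found {segment[index]}"
        )
    return segment[:index] + alt + segment[index + 1:]


# Residues 61-120 of SEQ ID NO: 6 (second printed row), spaces removed.
SEQ6_RES_61_120 = (
    "SCHLKLESKH EAAQAHVDAA HCYKKTNINE "
    "SVSCLDRAVN LFCDIGRLSM AARYLKEIAE"
).replace(" ", "")

# Residues 61-120 of SEQ ID NO: 7 (second printed row), spaces removed.
SEQ7_RES_61_120 = (
    "SCHLKLESKH EAAQAHVDAA HCYKKTNINE "
    "SVSCLDRAVN LFCDIGRLSM DARYLKEIAE"
).replace(" ", "")

if __name__ == "__main__":
    mutated = apply_substitution(SEQ6_RES_61_120, "A111D", first_residue=61)
    print(mutated == SEQ7_RES_61_120)
```

The reference-residue check guards against off-by-one numbering: the substitution is applied only if residue 111 of the input segment is alanine.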








Claims
  • 1. A transgenic soybean resistant to soybean cyst nematode (SCN), or a seed, plant part, or progeny thereof, the soybean plant transformed with an artificial DNA construct comprising, as operably associated components in the 5′ to 3′ direction of transcription: (a) a promoter that functions in a soybean; (b) a transcribable nucleic acid molecule comprising (i) a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 (Glyma18g02570), SEQ ID NO: 2 (Glyma18g02580), and SEQ ID NO: 3 (Glyma18g02590), or a nucleotide sequence at least 95% identical thereto encoding a polypeptide having Glyma18g02570, Glyma18g02580, or Glyma18g02590 activity, respectively; (ii) a nucleotide sequence encoding a polypeptide comprising SEQ ID NO: 4 (Glyma18g02570), SEQ ID NO: 5 (Glyma18g02580), SEQ ID NO: 6 (Glyma18g02590), or SEQ ID NO: 7 (Forrest SNAP A111D mutant), or an amino acid sequence at least 95% identical thereto having Glyma18g02570, Glyma18g02580, Glyma18g02590, or SNAP activity, respectively; (iii) a nucleotide sequence that hybridizes under stringent conditions to a nucleic acid sequence of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3, wherein the polynucleotide encodes a polypeptide having Glyma18g02570, Glyma18g02580, or Glyma18g02590 activity, wherein said stringent conditions comprise incubation at 65° C. in a solution comprising 6×SSC (0.9 M sodium chloride and 0.09 M sodium citrate); or (iv) a nucleotide sequence which is the reverse complement of (i), (ii), or (iii); and (c) a transcriptional termination sequence; wherein the transgenic soybean exhibits increased SCN resistance compared to a control not expressing the transcribable nucleic acid molecule.
  • 2. The transgenic soybean of claim 1, wherein the nucleotide sequence is at least 95% identical to SEQ ID NO: 3 (Glyma18g02590) having one or more mutations selected from the group consisting of C163225G, G164968T, A164972AGGT, C164974A, C163208A, G164965C, G164968C, A164972AGGC, and C164974A; or the encoded polypeptide comprises an amino acid sequence at least 95% identical to SEQ ID NO: 6 (Glyma18g02590) having one or more mutations selected from the group consisting of D208E, D286Y, D287E, -288V, L289I, Q203K, E285Q, D286H, D287E, -288A, L289I, and A111D.
  • 3. The transgenic soybean of claim 1, wherein the encoded polypeptide comprises an amino acid sequence at least 95% identical to SEQ ID NO: 6 (Glyma18g02590), a mutation of A111D, and Glyma18g02590 polypeptide activity.
  • 4. The transgenic soybean of claim 1, wherein the transcribable nucleic acid molecule is expressed in epidermis, vascular tissue, meristem, cambium, cortex, pith, leaf, sheath, root, flower, developing ovule or seed.
  • 5. The transgenic soybean of claim 1, wherein the promoter comprises an inducible promoter or a tissue-specific promoter.
  • 6. The transgenic soybean of claim 5, wherein the promoter comprises a nematode-inducible promoter.
  • 7. The transgenic soybean of claim 1, wherein the promoter is selected from the group consisting of factor EF1α gene promoter; rice tungro bacilliform virus (RTBV) gene promoter; cestrum yellow leaf curling virus (CmYLCV) promoter; tCUP cryptic promoter system; T6P-3 promoter; S-adenosyl-L-methionine synthetase promoter; Raspberry E4 gene promoter; cauliflower mosaic virus 35S promoter; figwort mosaic virus promoter; conditional heat-shock promoter; promoter sub-fragments of sugar beet V-type H+-ATPase subunit c isoform; and beta-tubulin promoter.
  • 8. The transgenic soybean of claim 1, wherein increased SCN resistance comprises at least about 20%, at least about 30%, at least about 40%, at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 100%, at least about 200%, at least about 300%, at least about 400%, at least about 500%, at least about 600%, at least about 700%, at least about 800%, at least about 900%, or at least about 1000% decrease in susceptibility to SCN as compared to a non-transformed control.
  • 9. The transgenic soybean of claim 1, wherein the transcribable nucleic acid molecule comprises a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 (Glyma18g02570), SEQ ID NO: 2 (Glyma18g02580), and SEQ ID NO: 3 (Glyma18g02590).
  • 10. The transgenic soybean of claim 1, wherein the transcribable nucleic acid molecule comprises a nucleotide sequence at least 95% identical to SEQ ID NO: 1 (Glyma18g02570), SEQ ID NO: 2 (Glyma18g02580), or SEQ ID NO: 3 (Glyma18g02590), and encodes a polypeptide having Glyma18g02570, Glyma18g02580, or Glyma18g02590 activity, respectively.
  • 11. The transgenic soybean of claim 1, wherein the transcribable nucleic acid molecule encodes a polypeptide comprising SEQ ID NO: 4 (Glyma18g02570), SEQ ID NO: 5 (Glyma18g02580), SEQ ID NO: 6 (Glyma18g02590), or SEQ ID NO: 7 (Forrest SNAP A111D mutant).
  • 12. The transgenic soybean of claim 1, wherein the transcribable nucleic acid molecule encodes a polypeptide comprising an amino acid sequence at least 95% identical to SEQ ID NO: 4 (Glyma18g02570), SEQ ID NO: 5 (Glyma18g02580), SEQ ID NO: 6 (Glyma18g02590), or SEQ ID NO: 7 (Forrest SNAP A111D mutant) and having Glyma18g02570, Glyma18g02580, Glyma18g02590, or SNAP activity, respectively.
  • 13. The transgenic progeny, seed, or plant part from the transgenic soybean of claim 1, wherein the transgenic progeny, seed, or part comprises the transcribable nucleic acid molecule.
  • 14. A soybean plant comprising in its genome: a) at least one introgressed allele locus associated with an SCN resistant phenotype wherein the locus is in a genomic region flanked by at least two loci selected from TABLE 6; and b) one or more polymorphic loci comprising alleles or combinations of alleles that are not found in an SCN resistant variety and that are linked to said locus associated with an SCN resistant phenotype, or a progeny plant therefrom.
  • 15. The soybean plant of claim 14, wherein the at least one allele locus is selected from the group consisting of Glyma18g02570, Glyma18g02580, and Glyma18g02590.
  • 16. The soybean plant of claim 14, produced by a method comprising: a) crossing a first soybean plant lacking a locus associated with an SCN resistant phenotype with a second soybean plant comprising: (i) an allele of at least one polymorphic nucleic acid that is associated with an SCN resistant phenotype located in a genomic region flanked by at least two loci selected from TABLE 6; and (ii) at least one additional polymorphic locus located outside of said region that is not present in the first soybean plant, to obtain a population of soybean plants segregating for the polymorphic locus that is associated with an SCN resistant phenotype and said additional polymorphic locus; b) detecting said polymorphic locus in at least one soybean plant from said population of soybean plants; and c) selecting a soybean plant comprising the locus associated with an SCN resistant phenotype that lacks the additional polymorphic locus, thereby obtaining a soybean plant comprising in its genome at least one introgressed allele of a polymorphic nucleic acid associated with an SCN resistant phenotype.
  • 17. The soybean plant of claim 16, wherein the first soybean plant comprises germplasm capable of conferring agronomically elite characteristics to a progeny plant of the first soybean plant and the second soybean plant.
  • 18. An artificial DNA construct comprising, as operably associated components in the 5′ to 3′ direction of transcription: (a) a promoter that functions in a soybean; (b) a transcribable nucleic acid molecule comprising (i) a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 (Glyma18g02570), SEQ ID NO: 2 (Glyma18g02580), and SEQ ID NO: 3 (Glyma18g02590), or a nucleotide sequence at least 95% identical thereto encoding a polypeptide having Glyma18g02570, Glyma18g02580, or Glyma18g02590 activity, respectively; (ii) a nucleotide sequence encoding a polypeptide comprising SEQ ID NO: 4 (Glyma18g02570), SEQ ID NO: 5 (Glyma18g02580), SEQ ID NO: 6 (Glyma18g02590), or SEQ ID NO: 7 (Forrest SNAP A111D mutant), or an amino acid sequence at least 95% identical thereto having Glyma18g02570, Glyma18g02580, Glyma18g02590, or SNAP activity, respectively; (iii) a nucleotide sequence that hybridizes under stringent conditions to a nucleic acid sequence of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3, wherein the polynucleotide encodes a polypeptide having Glyma18g02570, Glyma18g02580, or Glyma18g02590 activity, wherein said stringent conditions comprise incubation at 65° C. in a solution comprising 6×SSC (0.9 M sodium chloride and 0.09 M sodium citrate); or (iv) a nucleotide sequence which is the reverse complement of (i), (ii), or (iii); and (c) a transcriptional termination sequence.
  • 19. A method of increasing soybean cyst nematode (SCN) resistance of a soybean comprising: transforming a soybean plant with an artificial DNA construct according to claim 18; wherein the transformed soybean plant exhibits increased SCN resistance compared to a control not expressing the transcribable nucleic acid molecule.
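Several of the claims above recite a sequence "at least 95% identical" to a reference sequence. As an illustration only, the following minimal Python sketch shows one way such a threshold could be checked for two sequences that have already been aligned. The function names, the toy sequences, and the choice of denominator (all aligned columns, excluding columns where both sequences have a gap) are assumptions made for this sketch; they are not taken from the claims, which do not specify how percent identity is computed.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences ('-' marks a gap).

    Assumption: identity = matching columns / compared columns, where columns
    that are gaps in both sequences are skipped. Other conventions exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A gap opposite a residue counts as a mismatch.
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 95.0) -> bool:
    """True if the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Hypothetical 20-residue toy sequences, not any SEQ ID NO of the disclosure.
    reference = "ATGGCTGATCAGTTAGCTAA"
    variant = "ATGGCTGATCAGTTAGCTAC"  # one substitution in 20 positions
    print(percent_identity(reference, variant))
    print(meets_identity_threshold(reference, variant))
```

In practice the alignment itself would come from a dedicated alignment tool, and the result depends on the alignment parameters and on how gaps are counted, so this sketch should not be read as the method intended by the claims.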
CROSS-REFERENCE TO RELATED APPLICATIONS

The present application claims the benefit of U.S. Provisional Application No. 61/799,912 filed 15 Mar. 2013, which is incorporated herein by reference in its entirety.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT

This invention was made with government support under grant number 0820642 awarded by the National Science Foundation Plant Genome Research Program and grant number DBI-0845196 awarded by the National Science Foundation. The government has certain rights in the invention.

Provisional Applications (1)
Number Date Country
61799912 Mar 2013 US