Identification of genes associated with growth in plants

Information

  • Patent Application
  • 20030188343
  • Publication Number
    20030188343
  • Date Filed
    January 07, 2003
  • Date Published
    October 02, 2003
Abstract
Genes, nucleic acids and polypeptides associated with growth traits in plants are provided. Related probes, antibodies, marker sets, and arrays are provided as well as methods for predicting plant growth traits.
Description


COPYRIGHT NOTIFICATION

[0003] Pursuant to 37 C.F.R. 1.71(e), Applicants note that a portion of this disclosure contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.



FIELD OF THE INVENTION

[0004] This invention is in the field of genes which control growth traits in plants. The present invention relates, e.g., to the identification of candidate genes associated with growth in plants, polypeptides encoded by these genes, related probes, marker sets, methods for predicting the presence of growth traits in plants, and the like.



BACKGROUND OF THE INVENTION

[0005] Improvement of plant crops has generally proceeded incrementally through the intentional and/or incidental selection of individual plants with desired traits for cultivation. Crossing of unique individuals can result in vigorous individual hybrid plants with desirable characteristics. These established methods of hybrid generation and selection have provided mankind with vastly improved crop plants, but continued improvement by these methods is slow and unpredictable.


[0006] Plant growth traits are among the most important crop characteristics in commercial agriculture. The green revolution has increased plant growth rates with fertilizers and inhibited plant (weed) growth through herbicide application, providing significant improvements in crop yields to feed the world population since at least the 1960s. However, marginal improvements in green revolution technologies are tapering off and new approaches are needed to increase the productivity of agriculture.


[0007] Agricultural biotechnology can provide a directed approach to enhancing the quality and quantity of crops. Identification of genes associated with a desired plant characteristic, or trait, can be the first step to control of the trait. Gene recombination technologies can be employed to incorporate the identified genes into expression systems which can modulate display of a trait, screen for plants having a trait, and/or screen for additional genes associated with the trait. Plant growth traits are of special significance in agriculture, and identification of genes controlling plant growth is critical to providing food for the growing world population. Thus, identification and characterization of gene(s) controlling plant characteristics is of great interest, and will be of significant scientific and commercial importance.


[0008] The present invention relates to the identification of genes associated with plant growth traits. Polypeptides encoded by these genes, as well as related probes, marker sets, and methods for predicting growth traits in plants, as well as other features, will become apparent upon review of the following materials.



SUMMARY OF THE INVENTION

[0009] The present invention relates to a set of polynucleotide sequences which control growth traits in plants, exemplified by, e.g., SEQ ID NO: 1 through SEQ ID NO: 30 and, e.g., a set of polypeptide sequences which control growth traits in plants, exemplified by, e.g., SEQ ID NO: 31 through SEQ ID NO: 60.


[0010] In a first aspect, the invention relates to compositions including one or more nucleic acid expression vectors which include the polynucleotide sequences of the invention. For example, such expression vectors include nucleic acids including at least one polynucleotide sequence selected from SEQ ID NOs: 1-30. Similarly, sequences that hybridize under stringent hybridization conditions, or that are at least about 70% (or at least about 75%, about 80%, about 85%, about 90%, about 95%, about 97%, about 98%, or at least about 99%) identical to one or more of SEQ ID NO: 1-30 can be included in the expression vectors of the invention. In addition, expression vectors, including polynucleotide sequences that encode a polypeptide sequence selected from among SEQ ID NO: 31-SEQ ID NO: 60, or conservative variations thereof, are compositions of the invention. Likewise, expression vectors incorporating nucleic acids with subsequences of at least 10 contiguous nucleotides of, e.g., SEQ ID NOs: 1-30 (or at least 12, 14, 16, or 17 or more contiguous nucleotides of one of the designated sequences) are included among the compositions of the invention. The polynucleotide sequences of the invention also include polynucleotide sequences complementary to any one of the polynucleotide sequences described above. In some embodiments, the expression vector includes a promoter operably linked to one or more of the nucleic acids described above. Such expression vectors can encode expression products such as sense or antisense RNAs, or polypeptides.


[0011] Polypeptides having an amino acid sequence selected from the group consisting of SEQ ID NO: 31 to SEQ ID NO: 60, and conservative variants thereof, are also a feature of the invention, as are polypeptides encoded by a polynucleotide sequence of the invention (e.g., SEQ ID NO: 1-SEQ ID NO: 30, sequences that hybridize under stringent conditions to any one of SEQ ID NO: 1-SEQ ID NO: 30, sequences that are at least about 70% identical to any one of SEQ ID NO: 1-SEQ ID NO: 30, sequences that encode a polypeptide or conservative variations of any such sequences, or subsequences thereof). Polypeptides (and oligopeptides and peptides) including amino acid subsequences of SEQ ID NO: 31 through SEQ ID NO: 60 are also a feature of the invention. For example, fusion proteins including a polypeptide of SEQ ID NO: 31 through SEQ ID NO: 60, or a subsequence, e.g., an antigenic subsequence, thereof are included in the polypeptides of the invention. Likewise, proteins having a sequence selected from SEQ ID NO: 31 to SEQ ID NO: 60, and homologous or variant polypeptides, and a peptide or polypeptide tag, such as a reporter peptide or polypeptide, localization signal or sequence, or antigenic epitope, are included among the polypeptides of the invention.


[0012] Cells comprising an expression vector, and/or expressing a polypeptide as described above, are also a feature of the invention. In certain embodiments, the expressed polypeptide can be encoded by an exogenous polynucleotide, e.g., an expression vector. Such expression vectors typically include a polynucleotide sequence encoding the polypeptide of interest operably linked to, and under the transcriptional regulation of, a constitutive or inducible promoter. In other embodiments, the polypeptide is encoded by an endogenous polynucleotide sequence activated by an exogenous promoter and/or enhancer.


[0013] Antibodies specific for the polypeptides of the invention, e.g., SEQ ID NO: 31-SEQ ID NO: 60, and conservatively modified variants, etc., are also a feature of the invention. Such specific antibodies can be either derived from a polyclonal antiserum or can be monoclonal antibodies. For example, such antibodies are specific for an epitope including or derived from a subsequence of one of SEQ ID NO: 31-SEQ ID NO: 60.


[0014] Another aspect of the invention provides labeled nucleic acid or polypeptide probes. For example, nucleic acid probes of the invention include DNA or RNA molecules incorporating a polynucleotide sequence of the invention, e.g., selected from SEQ ID NO: 1-SEQ ID NO: 30, sequences that hybridize under stringent conditions to any one of SEQ ID NO: 1-SEQ ID NO: 30, sequences that are at least about 70% identical to any one of SEQ ID NO: 1-SEQ ID NO: 30, sequences that encode a polypeptide selected from SEQ ID NO: 1-SEQ ID NO: 30, sequences complementary to any such sequences, or a subsequence thereof including at least 10 contiguous nucleotides. Optionally, the subsequences include at least 12 contiguous nucleotides of one of, e.g., SEQ ID NOs: 1-30. Often such subsequences include at least 14 contiguous nucleotides, typically at least 16 contiguous nucleotides, and usually at least 17 or more contiguous nucleotides, e.g., of SEQ ID NO: 1 to SEQ ID NO: 30. These nucleic acid probes can be, e.g., synthetic oligonucleotides and probes, cDNA molecules, amplification products (e.g., produced by PCR or LCR), transcripts, or restriction fragments. In other embodiments, the labeled probes are polypeptides, such as polypeptides with amino acid sequences corresponding to SEQ ID NOs: 31-60, or subsequences thereof (e.g., a peptide subsequence comprising at least six amino acids). Antibodies specific for such polypeptides or peptides are also a feature of the invention (as are polypeptides which bind to such antibodies). For example, a polypeptide probe can be a fusion protein, or a polypeptide with an epitope tag. A peptide probe can be an antigenic peptide derived from one of SEQ ID NO: 31 through SEQ ID NO: 60.


[0015] The label of the nucleic acid, polypeptide or antibody probe can be any of a variety of detectable moieties including isotopic, fluorescent, fluorogenic, or colorimetric labels.


[0016] In another aspect, the invention relates to a marker set, e.g., for predicting at least one growth trait of a plant cell. Such marker sets can include a plurality of members, where the members comprise nucleic acids, polypeptides, peptides, and/or antibodies. Marker sets can include two or more of one type of member, or optionally can include one or more of two or more different types of members. For example, marker sets can include a plurality of nucleic acids including one or more polynucleotide sequences selected from SEQ ID NO: 1 to SEQ ID NO: 30, or SEQ ID NO: 61 to SEQ ID NO: 403, or conservative modifications thereof; polynucleotide sequences that hybridize under stringent hybridization conditions, or that are at least about 70% (or at least about 75%, 80%, 85%, 90%, 95%, 97%, 98%, or at least about 99%) identical to one or more of SEQ ID NOs: 1-30; sequences complementary to any such sequences or subsequences thereof including at least 10 contiguous nucleotides of, e.g., SEQ ID NOs: 1-30 (or at least 12, 14, 16, 17 or more contiguous nucleotides of one of the designated sequences).


[0017] In one embodiment, the marker set includes a plurality of oligonucleotides, such as synthetic oligonucleotides. In other embodiments, the marker set includes expression products, amplification products, nucleic acid probes, or the like. The marker set of the invention can also include multiple nucleic acids selected from among different molecular classifications, e.g., oligonucleotides, expression products (such as cDNAs), amplification products, restriction fragments, etc. In one embodiment, the marker set is made up of nucleic acids including polynucleotide sequences corresponding to each of SEQ ID NO: 1 through SEQ ID NO: 30, or a subsequence selected from each of SEQ ID NO: 1 through SEQ ID NO: 30, or their complements. In one embodiment, the marker set is made up of a plurality or a majority of members that together comprise a plurality, majority, or all of sequences or subsequences selected from a plurality, a majority or each nucleic acid represented by SEQ ID NO: 61-SEQ ID NO: 403, or their complements.


[0018] Markers of the invention can also be polypeptides, e.g., polypeptides encoded by SEQ ID NO: 31-SEQ ID NO: 60, or polypeptide or peptide subsequences thereof. Typically, a peptide subsequence comprises, e.g., at least about 6 contiguous amino acids, 10 contiguous amino acids or more, often at least about 15 contiguous amino acids, and frequently at least about 20 contiguous amino acids of, e.g., one of SEQ ID NOs: 31-60.


[0019] Markers of the invention can also be antibodies, e.g., monoclonal or polyclonal antibodies, or anti-sera specific for an epitope derived from a polypeptide of the invention, e.g., one or more of SEQ ID NO: 31 through SEQ ID NO: 60.


[0020] In certain useful embodiments, the marker set is logically or physically arrayed. For example, the members of the marker set, whether nucleic acid, polypeptide, peptide or antibody, or a combination thereof, can be physically arrayed in a solid phase or liquid phase array, such as a bead (or microbead) array. Arrays, including a plurality of SEQ ID NO: 1 to SEQ ID NO: 30, SEQ ID NO: 31-SEQ ID NO: 60, SEQ ID NO: 61-SEQ ID NO: 403, or antibodies specific therefor, are also a feature of the invention. In some embodiments, the arrays include members corresponding to a majority of SEQ ID NO: 1 to SEQ ID NO: 30, SEQ ID NO: 61-SEQ ID NO: 403, SEQ ID NO: 31 to SEQ ID NO: 60, or antibodies specific therefor. In one embodiment, the array includes members corresponding to each of SEQ ID NO: 1 to SEQ ID NO: 30, SEQ ID NO: 31 to SEQ ID NO: 60, or antibodies specific therefor. In an embodiment, the marker set is comprised of at least 10 contiguous nucleotides of each of SEQ ID NO: 61-SEQ ID NO: 403, at least 10 contiguous nucleotides of a plurality of SEQ ID NO: 61-SEQ ID NO: 403, at least 10 contiguous nucleotides of a majority of SEQ ID NO: 61-SEQ ID NO: 403, or complementary sequences thereof. In an embodiment, the marker set is a mixed marker set including members that are selected from nucleic acids, polypeptides or peptides, and antibodies.


[0021] In one embodiment, the marker set of the invention is used to predict at least one growth trait of a plant cell by hybridizing one or more nucleic acids of the marker set to a DNA or RNA sample from a cell or tissue, and detecting at least one polymorphic polynucleotide or differentially expressed expression product in the sample. In another related embodiment, differentially expressed expression products are detected using an array, e.g., an antibody array.


[0022] Another aspect of the invention provides methods for modulating a plant growth trait. The methods of the invention for modulating plant growth in a cell or tissue optionally include modulating expression or activity of at least one polypeptide encoded by a nucleic acid with a polynucleotide sequence selected from SEQ ID NO: 1 to SEQ ID NO: 30, or conservative modifications thereof; a polynucleotide sequence encoding a polypeptide sequence selected from SEQ ID NO: 31 to SEQ ID NO: 60; a polynucleotide sequence that hybridizes under stringent hybridization conditions, or that is at least 70% (or at least 75%, 80%, 85%, 90%, 95%, 97%, 98%, or at least 99%) identical to at least one of SEQ ID NOs: 1-30; sequences complementary to any such sequences, or subsequences thereof including at least 10 contiguous nucleotides of, e.g., SEQ ID NOs: 1-30 (or at least 12, 14, 16, 17 or more contiguous nucleotides of one of the designated sequences).


[0023] In one embodiment, plant growth is regulated by modulating expression or activity of at least one polypeptide contributing to a plant growth trait. The modulation of plant growth traits can be done in a variety of plants, e.g., flowering plants, a member of the family of Brassicaceae, or Arabidopsis, Brassica, Zea, Oryza, Triticum, Hordeum, Lolium, Sorghum, Glycine, Medicago, Helianthus, Lactuca, Beta, Vitis, Solanum, Lycopersicon, Capsicum, Gossypium, Hevea, Linum, Prunus, Citrus, Populus, Pinus, Quercus, Aspergillus, Neurospora, Candida and Saccharomyces. In an embodiment, expression is modulated by expressing an exogenous nucleic acid including a polynucleotide sequence selected from SEQ ID NO: 1 to SEQ ID NO: 30. In other embodiments, expression of an endogenous nucleic acid, such as an endogenous nucleic acid encoding one of SEQ ID NO: 31 through SEQ ID NO: 60, is induced or suppressed, for example, by introducing, e.g., integrating, an exogenous nucleic acid including at least one promoter that regulates expression of the endogenous nucleic acid. In other embodiments, altered expression or activity of an expression product encoded by a nucleic acid, e.g., a polynucleotide sequence selected from SEQ ID NO: 1 to SEQ ID NO: 30 or conservative variants thereof, is detected, e.g., in a high throughput assay.


[0024] In some embodiments, expression or activity is modulated in response to an environmental factor, a chemical or biological agent, a pathogen, a bacterium, a virus, a fungus or an insect. An aspect of the invention includes methods which involve detecting altered expression or activity of an expression product, such as an RNA or polypeptide, encoded by a nucleic acid including a polynucleotide sequence selected from, e.g., SEQ ID NO: 1 to SEQ ID NO: 30. In some cases, altered expression or activity in response to the presence of a fertilizer or a herbicide is detected. In certain embodiments, a plurality of expression products are detected, e.g., in an array, a bead array or in a high-throughput assay.


[0025] In an embodiment, a data record related to the altered expression or activity is recorded in a database. For example, a data record can be a character string recorded in a database made up of a plurality of character strings recorded in a computer or on a computer readable medium.


[0026] In another aspect, the invention provides methods for detecting genes for a plant growth trait. The methods of the invention for detecting genes for a plant growth trait involve providing a subject cell or tissue sample of nucleic acids and detecting at least one polynucleotide sequence or expression product corresponding to a polynucleotide sequence of the invention, e.g., a polynucleotide sequence selected from SEQ ID NO: 1-SEQ ID NO: 30, sequences that hybridize under stringent conditions to any one of SEQ ID NO: 1-SEQ ID NO: 30, sequences that are at least about 70% (or at least 75%, 80%, 85%, 90%, 95%, 97%, 98%, or at least 99%) identical to any one of SEQ ID NO: 1-SEQ ID NO: 30, sequences that encode a polypeptide encoded by any one of SEQ ID NO: 1-SEQ ID NO: 30, sequences complementary to any such sequences, or subsequences thereof including at least 10 contiguous nucleotides, e.g., of SEQ ID NOs: 1-30 (or at least 12, 14, 16, or 17 or more contiguous nucleotides of one of the designated sequences).


[0027] Detection of expression products is performed either qualitatively (presence or absence of one or more product of interest) or quantitatively (by monitoring the level of expression of one or more product of interest). In one embodiment, the expression product is an RNA expression product, such as differentially expressed RNA. The present invention optionally includes monitoring an expression level of a nucleic acid or polypeptide as noted herein for detection of a plant growth trait in a plant or in a population of plants.


[0028] Kits which incorporate one or more of the nucleic acids, polypeptides, antibodies, or arrays noted above are also a feature of the invention. Such kits can include any of the above noted components and further include, e.g., instructions for use of the components in any of the methods noted herein, packaging materials, containers for holding the components, and/or the like.


[0029] Digital systems which incorporate one or more representation (e.g., character string, data table, or the like) of one or more of the nucleic acids or polypeptides herein are also a feature of the invention.







BRIEF DESCRIPTION OF THE DRAWINGS

[0030] FIG. 1 shows a chart of differential gene expression between a plant having long roots and a plant having short roots versus chromosome position. A QTL plot for association with root length is also mapped on the same genome.


[0031] FIG. 2 shows Arabidopsis QTL plots for three growth related traits (root length, aerial mass, and root mass). The LOD score for association of each marker interval in the genome with each phenotype is shown.







DETAILED DISCUSSION

[0032] Control of plant growth is perhaps the most important goal in modern agriculture. The rate of plant growth, overall yield of usable plant mass, fertilizer response, and sensitivity to herbicides can all affect a farmer's productivity. First, the rate of plant growth can be critical, e.g., where growing seasons are short, where several crops are planted each year, or for long growing crops such as lumber. Second, maximum growth in the usable plant mass is desirable, e.g., in the roots of a potato plant, trunk of a pine tree, leaves of tobacco and grain of wheat. Third, growth modulation by application of fertilizers and herbicides must be efficient to reduce costs and to protect the environment. As a result, effective control of plant growth traits is central to productive agriculture.


[0033] Plant growth is a complex trait subject to complex interactions of genes and the environment. Multiple genes, e.g., metabolic, structural and tissue specific genes, interact to influence plant growth. Multiple environmental factors, e.g., availability of nutrients, light conditions, temperature, the presence of herbicides, availability of water, the presence of salts, etc., also play roles in plant growth. Finally, the multiple genetic and environmental factors interact to provide the ultimate plant growth trait. Thus, identification of genes associated with growth in plants can furnish tools to investigate interactions that can produce a desired plant growth trait.


[0034] The present invention provides genes associated with plant growth, which are useful tools in deciphering the complex interactions for improved plant growth. The provided genes can be employed directly, e.g., to produce recombinant plants with desired characteristics. The polynucleotides and polypeptides of the invention can be used as tools, e.g., as elements of marker sets, sequence databases, probes, enzymes, and processes, to investigate interactions resulting in desired growth traits.


[0035] Definitions


[0036] Unless defined otherwise, all scientific and technical terms are understood to have the same meaning as commonly used in the art to which they pertain. For the purpose of the present invention, the following terms are defined below.


[0037] The term “plant growth trait” refers to quantifiable plant growth parameters such as, e.g., root length, aerial mass, root mass, total plant mass, stem growth rate, etc.


[0038] The term “nucleic acid” is generally used in its art-recognized meaning to refer to a ribose nucleic acid (RNA) or deoxyribose nucleic acid (DNA) polymer, or analog thereof, e.g., a nucleotide polymer comprising modifications of the nucleotides, a peptide nucleic acid, or the like. In certain applications, the nucleic acid can be a polymer that includes both RNA and DNA subunits. A nucleic acid can be, e.g., a chromosome or chromosomal segment, a vector (e.g., an expression vector), a naked DNA or RNA polymer, the product of a polymerase chain reaction (PCR), an oligonucleotide, a probe, etc.


[0039] The term “polynucleotide sequence” refers to a contiguous sequence of nucleotides in a single nucleic acid or to a representation, e.g., a character string, thereof. “Polymorphic polynucleotides” are polynucleotide sequences corresponding to a single locus, i.e., alleles at a locus, characterized by at least one variant (or alternative) nucleotide subunit. Thus, a polymorphic polynucleotide is a polynucleotide that differs, e.g., from another allele at the same locus, or between an otherwise homologous or similar polynucleotide, at one or more nucleotide positions.


[0040] A “phenotype” is the display of a trait in an individual organism resulting from the interaction of gene expression and the environment.


[0041] An “expression vector” is a vector, e.g., a plasmid, capable of producing transcripts and, potentially, polypeptides encoded by a polynucleotide sequence. Typically, an expression vector is capable of producing transcripts in an exogenous cell, e.g., a bacterial cell, or a plant cell, in vivo or in vitro, e.g., a cultured plant protoplast. Expression of a product can be either constitutive or inducible depending, e.g., on the promoter selected. In the context of an expression vector, a promoter is said to be “operably linked” to a polynucleotide sequence if it is capable of regulating expression of the associated polynucleotide sequence. The term also applies to alternative exogenous gene constructs, such as expressed or integrated transgenes. Similarly, the term “operably linked” applies equally to alternative or additional transcriptional regulatory sequences, such as enhancers, associated with a polynucleotide sequence.


[0042] An “expression product” is a transcribed sense or antisense RNA, or a translated polypeptide corresponding to a polynucleotide sequence. Depending on context, the term also can be used to refer to an amplification product (amplicon) or cDNA corresponding to the RNA expression product transcribed from the polynucleotide sequence.


[0043] A polynucleotide sequence is said to “encode” a sense or antisense RNA molecule, or a polypeptide, if the polynucleotide sequence can be transcribed (in spliced or unspliced form) or translated into the RNA or polypeptide, or a fragment thereof.


[0044] A probe and a gene (or expression product) are said to “correspond” when they share substantial structural identity, or complementarity, depending on context. For example, a probe or an expression product, e.g., a messenger RNA, corresponds to a gene when it is derived from a genetic element with substantial sequence identity.


[0045] Polynucleotides of the Invention


[0046] The present invention is based on the identification of nucleic acid sequences and full length genes associated with control of growth traits in plants. The gene sequences of the invention can influence plant growth by their presence in the genome of a plant species or by the abundance of their expression products in such a plant.


[0047] The sequences of the invention can be implicated in control of plant growth traits by their differential expression between plants with high growth and low growth characteristics. The specified sequences can also be implicated in the control of growth traits in plants by their differential regulation in response to environmental factors known to induce or suppress display of the growth traits. Unlike the vast majority of polynucleotide sequences present in the plant genome, e.g., randomly selected unique or repetitive polynucleotide sequences, this defined and limited group of polynucleotides possesses an extraordinarily high probability of association with loci involved in the growth traits in plants.


[0048] Given the sequences of the invention, as disclosed herein, those skilled in the art can readily synthesize the sequences or screen them from nature. Screening from nature can be, e.g., by massively parallel signature sequencing (MPSS). Massively parallel signature sequencing is a wide ranging and sensitive quantitative cDNA analysis tool for preparation of expression profiles; see Brenner et al. “In vitro cloning of complex mixtures of DNA on microbeads: Physical separation of differentially expressed cDNAs”, (2000) PNAS 97, 1665-1670. In MPSS, cDNA is prepared from poly(A) RNA (mRNA) using a biotin-labeled oligo-dT primer. The oligo-dT is designed to prime each mRNA molecule exactly at the poly(A) junction. The cDNA fragments are then digested with DpnII (recognition sequence GATC), and the 3′-most DpnII-poly(A) fragments are purified utilizing the biotin label at the end of each molecule. The fragments are subsequently bound to 5 micron diameter microbeads using a complex set of 32 base tag/antitags. This process yields a library of beads where one mRNA molecule is represented by one microbead, and each microbead contains approximately 100,000 identical cDNA fragments from that mRNA. All molecules are covalently attached to the microbeads at their poly(A) ends; therefore, the DpnII end is available for sequencing reactions. Expression differences between organisms, e.g., of different phenotypes, can be identified using MPSS as a tool.
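The digestion step described above can be illustrated in silico. The following is a minimal Python sketch, not the MPSS protocol itself: it locates the 3′-most DpnII site (GATC) in a cDNA string and returns the fragment running from that site to the 3′ end, which is the portion that would remain attached at the poly(A) end. The function names and the `length` parameter of `signature` are hypothetical and chosen for illustration only; the text does not specify a signature length.

```python
from typing import Optional

DPNII_SITE = "GATC"  # DpnII recognition sequence, as stated in the text


def three_prime_most_dpnii_fragment(cdna: str) -> Optional[str]:
    """Return the segment from the last GATC site to the 3' end of the cDNA.

    Returns None when the cDNA contains no DpnII site, in which case no
    DpnII-poly(A) fragment would be recovered for that transcript.
    """
    seq = cdna.upper()
    pos = seq.rfind(DPNII_SITE)  # rfind gives the 3'-most occurrence
    if pos == -1:
        return None
    return seq[pos:]


def signature(cdna: str, length: int) -> Optional[str]:
    """Return the first `length` bases of the 3'-most DpnII fragment.

    `length` is an illustrative assumption, not a value taken from the text.
    Returns None if there is no site or the fragment is too short.
    """
    fragment = three_prime_most_dpnii_fragment(cdna)
    if fragment is None or len(fragment) < length:
        return None
    return fragment[:length]


if __name__ == "__main__":
    example = "AAGATCTTTGATCCCGGGAAAA"  # made-up sequence with two GATC sites
    print(three_prime_most_dpnii_fragment(example))
    print(signature(example, 8))
```

Counting how often each signature occurs across a bead library would then give a per-transcript expression estimate, which is the basis for comparing plants of different phenotypes.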


[0049] Accordingly, in one aspect, the polynucleotide sequences of the invention are useful for identifying corresponding cDNAs associated with growth in plants and/or chromosomal segments associated with growth. More generally, the polynucleotide sequences of the invention and corresponding polypeptides are useful, individually and/or collectively, as probes (e.g., probes labeled with a detectable moiety) and markers. In addition, the polynucleotide sequences of the invention are useful for the production of plant and cell culture models useful for the monitoring of agents and evaluation of protocols aimed at controlling growth in plants. Nucleic acid sequences of the invention, e.g., SEQ ID NO: 1 through SEQ ID NO: 30, can also be used in vector systems to control plant growth, e.g., by transformation of plant cells to modulate expression of growth correlated genes.


[0050] Polynucleotide sequences of the invention include, e.g., the polynucleotide sequences represented by SEQ ID NO: 1 through SEQ ID NO: 30 and SEQ ID NO: 61 through SEQ ID NO: 403. In addition to the sequences expressly provided in the accompanying sequence listing, the invention includes polynucleotide sequences that are highly related structurally and/or functionally. For example, polynucleotides encoding polypeptide sequences represented by SEQ ID NO: 31 through SEQ ID NO: 60, or subsequences thereof, are one embodiment of the invention. In addition, polynucleotide sequences of the invention include polynucleotide sequences that hybridize under stringent conditions to a polynucleotide sequence comprising any of SEQ ID NO: 1-SEQ ID NO: 30.


[0051] In addition to the polynucleotide sequences of the invention, e.g., enumerated in SEQ ID NO: 1 to SEQ ID NO: 30, or SEQ ID NO: 61-SEQ ID NO: 403, polynucleotide sequences that are substantially identical to a polynucleotide of the invention can be used in the compositions and methods of the invention. Substantially identical or substantially similar polynucleotide (or polypeptide) sequences are defined as polynucleotide (or polypeptide) sequences that are identical, on a nucleotide by nucleotide basis, with at least a subsequence of a reference polynucleotide (or polypeptide), e.g., selected from SEQ ID NO: 1-30 (or 61-403). Such polynucleotides can include, e.g., insertions, deletions, and substitutions relative to any of SEQ ID NO: 1-30. For example, such polynucleotides are typically at least about 70% identical to a reference polynucleotide (or polypeptide) selected from among SEQ ID NO: 1 through SEQ ID NO: 30 (or 61-403). That is, at least 7 out of 10 nucleotides (or amino acids) within a window of comparison are identical to the reference sequence selected from SEQ ID NO: 1-30. Frequently, such sequences are at least about 80%, usually at least about 90%, and often at least about 95%, or even at least about 98%, or about 99%, identical to the reference sequence, e.g., at least one of SEQ ID NO: 1 to SEQ ID NO: 30 or SEQ ID NO: 61 to SEQ ID NO: 403.
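The "7 out of 10" criterion above can be made concrete with a short Python sketch. It assumes the two sequences have already been aligned and trimmed to the same comparison window, with no gaps; real comparisons would normally use an alignment tool to choose the window. The function names and the default threshold argument are illustrative assumptions.

```python
def percent_identity(reference: str, candidate: str) -> float:
    """Percent of positions that match across an equal-length, ungapped window."""
    if len(reference) != len(candidate):
        raise ValueError("sequences must cover the same comparison window")
    if not reference:
        raise ValueError("comparison window is empty")
    matches = sum(r == c for r, c in zip(reference.upper(), candidate.upper()))
    return 100.0 * matches / len(reference)


def is_substantially_identical(reference: str, candidate: str,
                               threshold: float = 70.0) -> bool:
    """True when identity over the window meets the threshold (70% by default)."""
    return percent_identity(reference, candidate) >= threshold


if __name__ == "__main__":
    ref = "ACGTACGTAC"   # made-up 10-base window
    cand = "ACGTACGTTT"  # differs from ref at the last two positions
    print(percent_identity(ref, cand))
    print(is_substantially_identical(ref, cand))
```

Raising `threshold` to 80, 90, 95, 98 or 99 corresponds to the progressively stricter identity levels listed in the paragraph.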


[0052] Subsequences of the polynucleotides of the invention described above, e.g., SEQ ID NOs: 1-30, including at least 10 contiguous nucleotides or complementary subsequences thereof are also a feature of the invention. More commonly a subsequence includes at least 12 contiguous nucleotides, e.g., of one or more of SEQ ID NO: 1 through SEQ ID NO: 30 or SEQ ID NO: 61 through SEQ ID NO: 403. Typically, the subsequence includes at least 14, frequently at least 16, and usually at least 17 or more contiguous nucleotides of one of the specified polynucleotide sequences. Such subsequences can be, e.g., oligonucleotides, such as synthetic oligonucleotides, or full-length genes or cDNAs.
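As an illustration of the contiguous-subsequence criterion, the following Python sketch checks whether a candidate oligonucleotide shares a run of at least `min_len` contiguous nucleotides with a reference sequence or with its reverse complement. It handles unambiguous DNA bases (A, C, G, T) only, and the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string containing only A, C, G and T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def shares_contiguous_run(candidate: str, reference: str, min_len: int = 10) -> bool:
    """True if any min_len-base window of candidate occurs in the reference
    or in the reference's reverse complement."""
    cand = candidate.upper()
    ref = reference.upper()
    targets = (ref, reverse_complement(ref))
    # Slide a window of min_len bases along the candidate.
    for start in range(len(cand) - min_len + 1):
        window = cand[start:start + min_len]
        if any(window in target for target in targets):
            return True
    return False


if __name__ == "__main__":
    reference = "ATGGCGTACGTTAGCCTAGGA"  # made-up reference sequence
    probe = "GCGTACGTTAGC"               # 12 bases taken from the reference
    print(shares_contiguous_run(probe, reference, min_len=12))
    print(shares_contiguous_run(reverse_complement(probe), reference, min_len=12))
```

Setting `min_len` to 10, 12, 14, 16 or 17 mirrors the subsequence lengths recited in the paragraph.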


[0053] In addition, polynucleotide sequences complementary to any of the above described sequences are included among the polynucleotides of the invention. Where the polynucleotide sequences are translated to form a polypeptide or subsequence of a polypeptide, the nucleotide changes can result in either conservative or non-conservative amino acid substitutions. Conservative amino acid substitutions refer to the interchangeability of residues having functionally similar side chains. Conservative substitution tables providing functionally similar amino acids are well known in the art. Table 1 sets forth six groups which contain amino acids that are “conservative substitutions” for one another. Other conservative substitution charts are available in the art, and can be used in a similar manner.
TABLE 1
Conservative Substitution Group
1    Alanine (A)          Serine (S)          Threonine (T)
2    Aspartic acid (D)    Glutamic acid (E)
3    Asparagine (N)       Glutamine (Q)
4    Arginine (R)         Lysine (K)
5    Isoleucine (I)       Leucine (L)         Methionine (M)    Valine (V)
6    Phenylalanine (F)    Tyrosine (Y)        Tryptophan (W)
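For illustration, the six groups of Table 1 can be encoded as sets of one-letter amino acid codes and used to test whether a given substitution is conservative. This Python sketch is an assumption-laden example rather than part of the disclosure: the names are hypothetical, and an unchanged residue is treated as no substitution at all.

```python
# The six conservative substitution groups of Table 1, as one-letter codes.
CONSERVATIVE_GROUPS = [
    set("AST"),   # group 1: Ala, Ser, Thr
    set("DE"),    # group 2: Asp, Glu
    set("NQ"),    # group 3: Asn, Gln
    set("RK"),    # group 4: Arg, Lys
    set("ILMV"),  # group 5: Ile, Leu, Met, Val
    set("FYW"),   # group 6: Phe, Tyr, Trp
]


def is_conservative_substitution(original: str, replacement: str) -> bool:
    """True if two different residues fall in the same Table 1 group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues: nothing was substituted
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)
```

Other conservative substitution charts could be accommodated by replacing the contents of `CONSERVATIVE_GROUPS`.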


[0054] One of skill in the art will appreciate that many conservative substitutions of the nucleic acid constructs which are disclosed yield a functionally identical construct. For example, as discussed above, owing to the degeneracy of the genetic code, “silent substitutions” (i.e., substitutions in a nucleic acid sequence which do not result in an alteration in an encoded polypeptide) are an implied feature of every nucleic acid sequence which encodes an amino acid. Similarly, “conservative amino acid substitutions,” in which one or a few amino acids in an amino acid sequence (e.g., about 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10% or more) are substituted with different amino acids with highly similar properties, are also readily identified as being highly similar to a disclosed construct. Such conservative variations of each disclosed sequence are a feature of the present invention.
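The notion of a silent substitution can be illustrated with the standard genetic code. The Python sketch below builds the standard codon table and reports whether two coding sequences differ at the nucleotide level while encoding the same polypeptide. The function names are hypothetical, and the example assumes in-frame DNA sequences of equal reading frame containing only the bases A, C, G and T.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (third position varying fastest); '*' marks a stop codon.
_BASES = "TCAG"
_AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
                "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_silent(original: str, variant: str) -> bool:
    """True if the nucleotide sequences differ but encode the same polypeptide."""
    return (original.upper() != variant.upper()
            and translate(original) == translate(variant))
```

For instance, the codons GCT and GCC both encode alanine, so a change from one to the other is silent, whereas a change from GCT to GAT replaces alanine with aspartic acid.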


[0055] Methods for obtaining conservative variants, as well as more divergent versions of the nucleic acids and polypeptides of the invention are widely known in the art. In addition to naturally occurring homologues which can be obtained, e.g., by screening genomic or expression libraries according to any of a variety of well-established protocols, see, e.g., Ausubel et al. Current Protocols in Molecular Biology (supplemented through 2001) John Wiley & Sons, New York (“Ausubel”); Sambrook et al. Molecular Cloning—A Laboratory Manual (2nd Ed.), Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., 1989 (“Sambrook”), and Berger and Kimmel Guide to Molecular Cloning Techniques, Methods in Enzymology volume 152 Academic Press, Inc., San Diego, Calif. (“Berger”), additional variants can be produced by a variety of mutagenesis procedures. Many such procedures are known in the art, including site directed mutagenesis, oligonucleotide-directed mutagenesis, and many others. For example, site directed mutagenesis is described, e.g., in Smith (1985) “In vitro mutagenesis” Ann. Rev. Genet. 19:423-462, and references therein, Botstein & Shortle (1985) “Strategies and applications of in vitro mutagenesis” Science 229:1193-1201; and Carter (1986) “Site-directed mutagenesis” Biochem. J. 237:1-7. Oligonucleotide-directed mutagenesis is described, e.g., in Zoller & Smith (1982) “Oligonucleotide-directed mutagenesis using M13-derived vectors: an efficient and general procedure for the production of point mutations in any DNA fragment” Nucleic Acids Res. 10:6487-6500. Mutagenesis using modified bases is described, e.g., in Kunkel (1985) “Rapid and efficient site-specific mutagenesis without phenotypic selection” Proc. Natl. Acad. Sci. USA 82:488-492, and Taylor et al. (1985) “The rapid generation of oligonucleotide-directed mutations at high frequency using phosphorothioate-modified DNA” Nucl. Acids Res. 13: 8765-8787. Mutagenesis using gapped duplex DNA is described, e.g., in Kramer et al. (1984) “The gapped duplex DNA approach to oligonucleotide-directed mutation construction” Nucl. Acids Res. 12: 9441-9460. Point mismatch repair is described, e.g., by Kramer et al. (1984) “Point Mismatch Repair” Cell 38:879-887. Double-strand break repair is described, e.g., in Mandecki (1986) “Oligonucleotide-directed double-strand break repair in plasmids of Escherichia coli: a method for site-specific mutagenesis” Proc. Natl. Acad. Sci. USA, 83:7177-7181, and in Arnold (1993) “Protein engineering for unusual environments” Current Opinion in Biotechnology 4:450-455. Mutagenesis using repair-deficient host strains is described, e.g., in Carter et al. (1985) “Improved oligonucleotide site-directed mutagenesis using M13 vectors” Nucl. Acids Res. 13: 4431-4443. Mutagenesis by total gene synthesis is described, e.g., by Nambiar et al. (1984) “Total synthesis and cloning of a gene coding for the ribonuclease S protein” Science 223: 1299-1301. DNA shuffling is described, e.g., by Stemmer (1994) “Rapid evolution of a protein in vitro by DNA shuffling” Nature 370:389-391, and Stemmer (1994) “DNA shuffling by random fragmentation and reassembly: In vitro recombination for molecular evolution.” Proc. Natl. Acad. Sci. USA 91:10747-10751.


[0056] Many of the above methods are further described in Methods in Enzymology Volume 154, which also describes useful controls for trouble-shooting problems with various mutagenesis methods. Kits for mutagenesis, library construction and other diversity generation methods are also commercially available. For example, kits are available from, e.g., Amersham International plc (e.g., using the Eckstein method above), Anglian Biotechnology Ltd (e.g., using the Carter/Winter method above), Bio/Can Scientific, Bio-Rad (e.g., using the Kunkel method described above), Boehringer Mannheim Corp., Clontech Laboratories, DNA Technologies, Epicentre Technologies (e.g., the 5 prime 3 prime kit), Genpak Inc, Lemargo Inc, Life Technologies (Gibco BRL), New England Biolabs, Pharmacia Biotech, Promega Corp., Quantum Biotechnologies, and Stratagene (e.g., the QuikChange™ site-directed mutagenesis kit and the Chameleon™ double-stranded, site-directed mutagenesis kit).


[0057] Determining Sequence Relationships


[0058] The nucleic acid and amino acid sequences of the invention include, e.g., those provided in SEQ ID NO: 1 to SEQ ID NO: 403 as well as similar sequences. Similar sequences are objectively determined by any number of methods, e.g., percent identity, hybridization, immunologically, and the like. A variety of methods for determining relationships between two or more sequences (e.g., identity, similarity and/or homology) are available, and well known in the art. The methods include manual alignment, computer assisted sequence alignment and combinations thereof. A number of algorithms (which are generally computer implemented) for performing sequence alignment are widely available, or can be produced by one of skill. These methods include, e.g., the local homology algorithm of Smith and Waterman (1981) Adv. Appl. Math. 2:482; the homology alignment algorithm of Needleman and Wunsch (1970) J. Mol. Biol. 48:443; the search for similarity method of Pearson and Lipman (1988) Proc. Natl. Acad. Sci. (USA) 85:2444; and/or by computerized implementations of these algorithms (e.g., GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package Release 7.0, Genetics Computer Group, 575 Science Dr., Madison, Wis.).
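To make the local homology approach concrete, the following Python sketch computes the score of the best local alignment between two sequences using the dynamic programming recurrence of Smith and Waterman. It is a minimal illustration only: the match, mismatch and linear gap values are arbitrary assumptions, the function name is hypothetical, and the packaged implementations named above use more elaborate scoring (e.g., affine gap penalties and substitution matrices).

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score of `a` and `b` with a linear gap penalty.

    Only the score is returned (no traceback), so two rows of the
    dynamic programming matrix are sufficient.
    """
    previous = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            current[j] = max(
                0,                                # start a new local alignment
                previous[j - 1] + substitution,   # align a[i-1] with b[j-1]
                previous[j] + gap,                # gap in b
                current[j - 1] + gap,             # gap in a
            )
            best = max(best, current[j])
        previous = current
    return best
```

Flooring each cell at zero is what makes the alignment local: a poorly matching region cannot drag down the score of a well matching region elsewhere in the sequences.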


[0059] For example, software for performing sequence identity (and sequence similarity) analysis using the BLAST algorithm is described in Altschul et al. (1990) J. Mol. Biol. 215:403-410. This software is publicly available, e.g., through the National Center for Biotechnology Information on the world wide web at ncbi.nlm.nih.gov. This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold. These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, a cutoff of 100, M=5, N=−4, and a comparison of both strands. For amino acid sequences, the BLASTP (BLAST Protein) program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see, Henikoff & Henikoff (1992) Proc. Natl. Acad. Sci. USA 89:10915).
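The extension step and its three halting conditions can be sketched as follows. This Python example is a simplified illustration rather than the BLAST implementation itself: it extends a seed in one direction only and without gaps, it uses the nucleotide reward and penalty values M=5 and N=−4 recited above, and the drop-off value X=20, the function name and the argument names are assumptions chosen for the example.

```python
def extend_right(query: str, subject: str, q_start: int, s_start: int,
                 word_len: int, reward: int = 5, penalty: int = -4,
                 x_drop: int = 20) -> tuple:
    """Extend a seed word hit rightward without gaps.

    Returns (best_score, best_length), where best_length counts the
    aligned positions from the start of the seed to the point at which
    the cumulative score was highest.
    """
    # Score the seed itself; a neighborhood word hit may contain mismatches.
    score = sum(
        reward if query[q_start + k] == subject[s_start + k] else penalty
        for k in range(word_len)
    )
    best_score, best_length = score, word_len
    q, s, length = q_start + word_len, s_start + word_len, word_len

    # Halting condition 3: the end of either sequence is reached.
    while q < len(query) and s < len(subject):
        score += reward if query[q] == subject[s] else penalty
        length += 1
        if score > best_score:
            best_score, best_length = score, length
        # Halting condition 1: score has fallen off by X from its maximum.
        # Halting condition 2: cumulative score has gone to zero or below.
        if best_score - score >= x_drop or score <= 0:
            break
        q += 1
        s += 1
    return best_score, best_length
```

A complete search would perform the same extension leftward from the seed and combine the two results; a larger X tolerates longer runs of mismatches at the cost of speed, consistent with the statement above that W, T and X determine sensitivity and speed.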


[0060] Additionally, the BLAST algorithm performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin & Altschul (1993) Proc. Nat'l. Acad. Sci. USA 90:5873-5877). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (p(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is considered similar to a reference sequence (and, therefore, in this context, homologous) if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.1, or less than about 0.01, or even less than about 0.001.


[0061] Another example of a useful sequence alignment algorithm is PILEUP. PILEUP creates a multiple sequence alignment from a group of related sequences using progressive, pairwise alignments. It can also plot a tree showing the clustering relationships used to create the alignment. PILEUP uses a simplification of the progressive alignment method of Feng & Doolittle (1987) J. Mol. Evol. 25:351-360. The method used is similar to the method described by Higgins & Sharp (1989) CABIOS 5:151-153. The program can align, e.g., up to 300 sequences of a maximum length of 5,000 letters. The multiple alignment procedure begins with the pairwise alignment of the two most similar sequences, producing a cluster of two aligned sequences. This cluster can then be aligned to the next most related sequence or cluster of aligned sequences. Two clusters of sequences can be aligned by a simple extension of the pairwise alignment of two individual sequences. The final alignment is achieved by a series of progressive, pairwise alignments. The program can also be used to plot a dendrogram or tree representation of clustering relationships. The program is run by designating specific sequences and their amino acid or nucleotide coordinates for regions of sequence comparison.


[0062] An additional example of an algorithm that is suitable for multiple DNA, or amino acid, sequence alignments is the CLUSTALW program (Thompson, J. D. et al. (1994) Nucl. Acids. Res. 22: 4673-4680). CLUSTALW performs multiple pairwise comparisons between groups of sequences and assembles them into a multiple alignment based on homology. Gap open and Gap extension penalties can be, e.g., 10 and 0.05 respectively. For amino acid alignments, the BLOSUM algorithm can be used as a protein weight matrix. See, e.g., Henikoff and Henikoff (1992) Proc. Natl. Acad. Sci. USA 89: 10915-10919.


[0063] Nucleic Acid Hybridization


[0064] Similarity between nucleic acids of the invention can also be evaluated by “hybridization” between single stranded (or single stranded regions of) nucleic acids with complementary or partially complementary polynucleotide sequences.


[0065] Hybridization is a measure of the physical association between nucleic acids, typically, in solution, or with one of the nucleic acid strands immobilized on a solid support, e.g., a membrane, a bead, a chip, a filter, etc. Nucleic acid hybridization occurs based on a variety of well characterized physico-chemical forces, such as hydrogen bonding, solvent exclusion, base stacking, and the like. Numerous protocols for nucleic acid hybridization are well known in the art. An extensive guide to the hybridization of nucleic acids is found in Tijssen (1993) Laboratory Techniques in Biochemistry and Molecular Biology—Hybridization with Nucleic Acid Probes, part I, chapter 2, “Overview of principles of hybridization and the strategy of nucleic acid probe assays,” (Elsevier, New York), as well as in Ausubel et al. Current Protocols in Molecular Biology (supplemented through 2001) John Wiley & Sons, New York (“Ausubel”); Sambrook et al. Molecular Cloning—A Laboratory Manual (2nd Ed.), Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., 1989 (“Sambrook”), and Berger and Kimmel Guide to Molecular Cloning Techniques, Methods in Enzymology volume 152 Academic Press, Inc., San Diego, Calif. (“Berger”). Hames and Higgins (1995) Gene Probes 1, IRL Press at Oxford University Press, Oxford, England (Hames and Higgins 1) and Hames and Higgins (1995) Gene Probes 2, IRL Press at Oxford University Press, Oxford, England (Hames and Higgins 2) provide details on the synthesis, labeling, detection and quantification of DNA and RNA, including oligonucleotides.


[0066] Conditions suitable for obtaining hybridization, including differential hybridization, are selected according to the theoretical melting temperature (Tm) between complementary and partially complementary nucleic acids. Under a given set of conditions, e.g., solvent composition, ionic strength, etc., the Tm is the temperature at which the duplex between the hybridizing nucleic acid strands is 50% denatured. That is, the Tm is the temperature at the midpoint of the transition from helix to random coil; for long stretches of nucleotides, it depends on the length of the polynucleotide, the nucleotide composition, and the ionic strength.
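The dependence of Tm on length, nucleotide composition and ionic strength can be illustrated with a widely used empirical approximation for long DNA duplexes, commonly attributed to Meinkoth and Wahl: Tm = 81.5° C. + 16.6(log M) + 0.41(% GC) − 0.61(% formamide) − 500/L, where M is the molarity of monovalent cations and L is the duplex length in base pairs. This formula is not recited in the present disclosure and is offered here only as an assumption for the purpose of the example; the function and parameter names in the Python sketch below are likewise hypothetical.

```python
import math


def estimate_tm_long_duplex(sequence: str, monovalent_molar: float,
                            percent_formamide: float = 0.0) -> float:
    """Approximate Tm (degrees C) of a long DNA duplex.

    Uses the empirical relation
        Tm = 81.5 + 16.6*log10(M) + 0.41*(%GC) - 0.61*(%formamide) - 500/L
    which is an approximation intended for long duplexes, not short oligos.
    """
    seq = sequence.upper()
    length = len(seq)
    if length == 0:
        raise ValueError("sequence is empty")
    if monovalent_molar <= 0:
        raise ValueError("cation concentration must be positive")
    percent_gc = 100.0 * (seq.count("G") + seq.count("C")) / length
    return (81.5
            + 16.6 * math.log10(monovalent_molar)
            + 0.41 * percent_gc
            - 0.61 * percent_formamide
            - 500.0 / length)
```

Under this approximation, lowering the salt concentration or adding formamide lowers the estimated Tm, which is consistent with the use of those variables to adjust stringency.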


[0067] After hybridization, unhybridized nucleic acids can be removed by a series of washes, the stringency of which can be adjusted depending upon the desired results. Low stringency washing conditions (e.g., using higher salt and lower temperature) increase sensitivity, but can produce nonspecific hybridization signals and high background signals. Higher stringency conditions (e.g., using lower salt and higher temperature that is closer to the Tm) lower the background signal, typically with primarily the specific signal remaining. See, also, Rapley, R. and Walker, J. M. eds., Molecular Biomethods Handbook (Humana Press, Inc. 1998).


[0068] “Stringent hybridization wash conditions” or “stringent conditions” in the context of nucleic acid hybridization experiments, such as Southern and northern hybridizations, are sequence dependent, and are different under different environmental parameters. An extensive guide to the hybridization of nucleic acids is found in Tijssen (1993), supra, and in Hames and Higgins 1 and Hames and Higgins 2, supra.


[0069] An example of stringent hybridization conditions for hybridization of complementary nucleic acids which have more than 100 complementary residues on a filter in a Southern or northern blot is 2× SSC, 50% formamide at 42° C., with the hybridization being carried out overnight (e.g., for approximately 20 hours). An example of stringent wash conditions is a 0.2× SSC wash at 65° C. for 15 minutes (see Sambrook, supra for a description of SSC buffer). Often, the wash determining the stringency is preceded by a low stringency wash to remove signal due to residual unhybridized probe. An example low stringency wash is 2× SSC at room temperature (e.g., 20° C. for 15 minutes).


[0070] In general, a signal to noise ratio of at least 2.5×-5× (and typically higher) than that observed for an unrelated probe in the particular hybridization assay indicates detection of a specific hybridization. Detection of at least stringent hybridization between two sequences in the context of the present invention indicates relatively strong structural similarity to, e.g., the nucleic acids of the present invention provided in the sequence listings herein.
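The signal to noise criterion above reduces to a simple ratio test. The short Python sketch below is illustrative only; the function names are hypothetical, and the signal values passed to it are assumed to be background-corrected measurements in the same arbitrary units for the test probe and for an unrelated probe in the same assay.

```python
def signal_to_noise(probe_signal: float, unrelated_probe_signal: float) -> float:
    """Ratio of the test probe signal to the unrelated (control) probe signal."""
    if unrelated_probe_signal <= 0:
        raise ValueError("control signal must be positive")
    return probe_signal / unrelated_probe_signal


def is_specific_hybridization(probe_signal: float,
                              unrelated_probe_signal: float,
                              min_ratio: float = 2.5) -> bool:
    """Apply the minimum ratio (2.5x by default) that indicates specific hybridization."""
    return signal_to_noise(probe_signal, unrelated_probe_signal) >= min_ratio
```

The `min_ratio` argument can be raised, e.g., to 5, to apply the upper end of the 2.5×-5× range.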


[0071] For purposes of the present invention, generally, “highly stringent” hybridization and wash conditions are selected to be no more than about 5° C. below the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH (as noted below, highly stringent conditions can also be referred to in comparative terms). Target sequences that are closely related or identical to the nucleotide sequence of interest (e.g., “probe”) can be identified under stringent or highly stringent conditions. Lower stringency conditions are appropriate for sequences that are less complementary.


[0072] For example, in determining stringent or highly stringent hybridization (or even more stringent hybridization) and wash conditions, the hybridization and wash conditions are gradually increased (e.g., by increasing temperature, decreasing salt concentration, increasing detergent concentration, and/or increasing the concentration of organic solvents, such as formamide, in the hybridization or wash), until a selected set of criteria are met. For example, the hybridization and wash conditions are gradually increased until a probe comprising one or more polynucleotide sequences of the invention, e.g., selected from SEQ ID NO: 1 to SEQ ID NO: 30, SEQ ID NO: 61 through SEQ ID NO: 403, and/or complementary polynucleotide sequences thereof, binds to a perfectly matched complementary target (again, a nucleic acid comprising one or more nucleic acid sequences or subsequences selected from SEQ ID NO: 1 to SEQ ID NO: 30, SEQ ID NO: 61 through SEQ ID NO: 403, and complementary polynucleotide sequences thereof), with a signal to noise ratio that is at least 2.5×, and optionally 5×, or 10×, or 100× or more, as high as that observed for hybridization of the probe to an unmatched target, as desired.


[0073] For example, using subsequences derived from the nucleic acids encoding the polypeptides of the invention, novel target nucleic acids can be obtained; such target nucleic acids are also a feature of the invention. For example, such target nucleic acids include sequences that hybridize under stringent conditions to an oligonucleotide probe that encodes a unique subsequence in any of the polypeptides of the invention, e.g., SEQ ID NOs: 31-60.


[0074] For example, hybridization conditions are chosen under which a target oligonucleotide that is perfectly complementary to the oligonucleotide probe hybridizes to the probe with at least about a 5-10× higher signal to noise ratio than for hybridization of the target oligonucleotide to a negative control non-complementary nucleic acid.


[0075] Higher ratios of signal to noise can be achieved by increasing the stringency of the hybridization conditions such that ratios of about 15×, 20×, 30×, 50× or more are obtained. The particular signal will depend on the label used in the relevant assay, e.g., a fluorescent label, a colorimetric label, a radioactive label, or the like.


[0076] Probes


[0077] Nucleic acids including one or more polynucleotide sequences of the invention are favorably used as probes for the detection of complementary, corresponding, or related nucleic acids in a variety of contexts, such as the nucleic acid hybridization experiments discussed above. The probes can be either DNA or RNA molecules, such as restriction fragments of genomic or cloned DNA, cDNAs, amplification products, transcripts, and oligonucleotides, and can vary in length from oligonucleotides as short as about 10 nucleotides in length to chromosomal fragments or cDNAs in excess of one or more kilobases. For example, in some embodiments, a probe of the invention includes a polynucleotide sequence or subsequence selected from among SEQ ID NO: 1 to SEQ ID NO: 30, SEQ ID NO: 61 through SEQ ID NO: 403, or sequences complementary thereto. Alternatively, polynucleotide sequences that are variants of one of the above designated sequences can be used as probes. Most typically, such variants include one or a few nucleotide variations. For example, pairs (or sets) of oligonucleotides can be selected, in which the two (or more) polynucleotide sequences are conservative variations of each other, wherein one polynucleotide sequence corresponds identically to a first allele or allelic variant and the other(s) correspond identically to additional alleles or allelic variants. Such pairs of oligonucleotide probes are particularly useful, e.g., for allele specific hybridization experiments to detect polymorphic nucleotides. In other applications, probes are selected that are more divergent, that is, probes that are at least about 70% (or 80%, 90%, 95%, 98%, or 99%) identical are selected.


[0078] The probes of the invention, as exemplified by sequences derived from SEQ ID NO: 1 through SEQ ID NO: 30 and SEQ ID NO: 61 through SEQ ID NO: 403, can also be used to identify additional useful polynucleotide sequences according to procedures routine in the art. In one set of embodiments, one or more probes, as described above, are utilized to screen libraries of expression products or chromosomal segments (e.g. expression libraries or genomic libraries) to identify clones that include sequences identical to, or with significant sequence similarity to, one or more of SEQ ID NO: 1-30, i.e., allelic variants, homologues or orthologues. In turn, each of these identified sequences can be used to make probes, including pairs or sets of variant probes as described above. It will be understood that in addition to such physical methods as library screening, computer assisted bioinformatic approaches, e.g., BLAST and other sequence homology search algorithms, and the like, can also be used for identifying related polynucleotide sequences. Polynucleotide sequences identified in this manner are also a feature of the invention.


[0079] For example, oligonucleotide probes are most typically produced by well known synthetic methods, such as the solid phase phosphoramidite triester method described by Beaucage and Caruthers (1981) Tetrahedron Letts. 22(20):1859-1862, e.g., using an automated synthesizer, as described in Needham-VanDevanter et al. (1984) Nucleic Acids Res., 12:6159-6168. Oligonucleotides can also be custom made and ordered from a variety of commercial sources known to persons of skill. Purification of oligonucleotides, where necessary, is typically performed by either native acrylamide gel electrophoresis or by anion-exchange HPLC as described in Pearson and Regnier (1983) J. Chrom. 255:137-149. The sequence of the synthetic oligonucleotides can be verified using the chemical degradation method of Maxam and Gilbert (1980) in Grossman and Moldave (eds.) Academic Press, New York, Methods in Enzymology 65:499-560.


[0080] In addition, essentially any nucleic acid can be custom ordered from any of a variety of commercial sources, such as The Midland Certified Reagent Company (mcrc@oligos.com), The Great American Gene Company (http://www.genco.com), ExpressGen Inc. (www.expressgen.com), Operon Technologies Inc. (Alameda, Calif.) and many others. Similarly, peptides and antibodies can be custom ordered from any of a variety of sources, such as PeptidoGenic (pkim@ccnet.com), HTI Bio-products, inc. (http://www.htibio.com), BMA Biomedicals Ltd (U.K.), Bio.Synthesis, Inc., and many others.


[0081] As noted, in one embodiment, oligonucleotide probes of the invention include subsequences of SEQ ID NO: 1 through SEQ ID NO: 30, SEQ ID NO: 61 through SEQ ID NO: 403, and/or complementary sequences thereof, including e.g., at least 10 contiguous nucleotides in length. Commonly, the oligonucleotide probes are at least 12 contiguous nucleotides in length; usually, the oligonucleotides are at least 14 contiguous nucleotides in length; frequently, the oligonucleotides are at least 16 contiguous nucleotides in length, and in many cases the oligonucleotides are at least 17 or more contiguous nucleotides of at least one sequence selected from SEQ ID NO: 1 to SEQ ID NO: 30 or SEQ ID NO: 61 through SEQ ID NO: 403. In some cases, the oligonucleotide probes consist of a polynucleotide sequence selected from SEQ ID NO: 1 through SEQ ID NO: 30 or from SEQ ID NO: 61 through SEQ ID NO: 403.


[0082] In other circumstances, e.g., relating to functional attributes of cells or organisms expressing the polynucleotides and polypeptides of the invention, probes that are polypeptides, peptides, or antibodies are favorably utilized. For example, polypeptides, polypeptide fragments, and peptides corresponding to, or derived from SEQ ID NO: 31 to SEQ ID NO: 60, are favorably used to identify and isolate antibodies or other binding proteins, e.g., from phage display libraries, combinatorial libraries, polyclonal sera, and the like.


[0083] Antibodies specific for any one of SEQ ID NO: 31 to SEQ ID NO: 60 are likewise valuable as probes for evaluating expression products, e.g., from cells or tissues. In addition, antibodies are particularly suitable for evaluating expression of proteins corresponding to SEQ ID NOs: 31-60, in situ, in a cell, tissue or whole plant, e.g., a plant providing an experimental model for manipulation of growth traits. Antibodies can be directly labeled with a detectable reagent as described below, or detected indirectly by labeling of a secondary antibody specific for the heavy chain constant region (i.e., isotype) of the specific antibody. Additional details regarding production of specific antibodies are provided below in the section entitled “Antibodies.”


[0084] Labeling and Detecting Probes


[0085] Numerous methods are available for labeling and detection of the nucleic acid and polypeptide (or peptide or antibody) probes of the invention, these include: 1) fluorescence (using, e.g., fluorescein, Cy-5, rhodamine or other fluorescent tags); 2) isotopic methods, e.g., using end-labeling, nick translation, random priming, or PCR to incorporate radioactive isotopes into the probe polynucleotide/oligonucleotide; 3) chemifluorescence using alkaline phosphatase and the substrate AttoPhos (Amersham) or other substrates that produce fluorescent products; 4) chemiluminescence (using either horseradish peroxidase and/or alkaline phosphatase with substrates that produce photons as breakdown products, kits providing reagents and protocols are available from such commercial sources as Amersham, Boehringer-Mannheim, and Life Technologies/Gibco BRL); and, 5) colorimetric methods (again using both horseradish peroxidase and alkaline phosphatase with substrates that produce a colored precipitate, kits are available from Life Technologies/Gibco BRL, and Boehringer-Mannheim). Other methods for labeling and detection will be readily apparent to one skilled in the art.


[0086] More generally, a probe can be labeled with any composition detectable by spectroscopic, photochemical, biochemical, immunochemical, electrical, optical, chemical, or other available means. Useful labels in the present invention include spectral labels such as fluorescent dyes (e.g., fluorescein isothiocyanate, Texas red, rhodamine, and the like), radiolabels (e.g., 3H, 125I, 35S, 14C, 32P, 33P, etc.), enzymes (e.g., horse-radish peroxidase, alkaline phosphatase, etc.), spectral colorimetric labels such as colloidal gold, or colored glass or plastic (e.g. polystyrene, polypropylene, latex, etc.) beads. The label may be coupled directly or indirectly to a component of the detection assay (e.g., a probe, such as an oligonucleotide, isolated DNA, amplicon, restriction fragment, or the like) according to methods well known in the art. As indicated above, a wide variety of labels may be used, with the choice of label depending on sensitivity required, ease of conjugation with the compound, stability requirements, available instrumentation, and disposal provisions. In general, a detector which monitors a probe-target nucleic acid hybridization is adapted to the particular label which is used. Typical detectors include spectrophotometers, phototubes and photodiodes, microscopes, scintillation counters, cameras, film and the like, as well as combinations thereof. Examples of suitable detectors are widely available from a variety of commercial sources known to persons of skill. Commonly, an optical image of a substrate comprising a nucleic acid array with particular set of probes bound to the array is digitized for subsequent computer analysis.


[0087] Because incorporation of radiolabeled nucleotides into nucleic acids is straightforward, this detection represents one favorable labeling strategy. Exemplary technologies for incorporating radiolabels include end-labeling with a kinase or phosphatase enzyme, nick translation, incorporation of radioactive nucleotides with a polymerase and many other well known strategies.


[0088] Fluorescent labels are desirable, having the advantage of requiring fewer precautions in handling, and being amenable to high-throughput visualization techniques. Preferred labels are typically characterized by one or more of the following: high sensitivity, high stability, low background, low environmental sensitivity and high specificity in labeling. Fluorescent moieties, which are incorporated into the labels of the invention, are generally known, including Texas red, fluorescein isothiocyanate, rhodamine, etc. Many fluorescent tags are commercially available from SIGMA chemical company (Saint Louis, Mo.), Molecular Probes (Eugene, Oreg.), R&D systems (Minneapolis, Minn.), Pharmacia LKB Biotechnology (Piscataway, N.J.), CLONTECH Laboratories, Inc. (Palo Alto, Calif.), Chem Genes Corp., Aldrich Chemical Company (Milwaukee, Wis.), Glen Research, Inc., GIBCO BRL Life Technologies, Inc. (Gaithersburg, Md.), Fluka Chemica-Biochemika Analytika (Fluka Chemie AG, Buchs, Switzerland), and Applied Biosystems (Foster City, Calif.) as well as other commercial sources known to one of skill. Similarly, moieties such as digoxigenin and biotin, which are not themselves fluorescent but are readily used in conjunction with secondary reagents, i.e., anti-digoxigenin antibodies, avidin (or streptavidin), that can be labeled, are suitable as labeling reagents in the context of the probes of the invention.


[0089] The label is coupled directly or indirectly to a molecule to be detected (a product, substrate, enzyme, or the like) according to methods well known in the art. As indicated above, a wide variety of labels are used, with the choice of label depending on the sensitivity required, ease of conjugation of the compound, stability requirements, available instrumentation, and disposal provisions. Non-radioactive labels are often attached by indirect means. Generally, a ligand molecule (e.g., biotin) is covalently bound to a nucleic acid such as a probe, primer, amplicon, or the like. The ligand then binds to an anti-ligand (e.g., streptavidin) molecule which is either inherently detectable or covalently bound to a signal system, such as a detectable enzyme, a fluorescent compound, or a chemiluminescent compound. A number of ligands and anti-ligands can be used. Where a ligand has a natural anti-ligand, for example, biotin, thyroxine, and cortisol, it can be used in conjunction with labeled anti-ligands. Alternatively, any haptenic or antigenic compound can be used in combination with an antibody. Labels can also be conjugated directly to signal generating compounds, e.g., by conjugation with an enzyme or fluorophore or chromophore. Enzymes of interest as labels will primarily be hydrolases, particularly phosphatases, esterases and glycosidases, or oxidoreductases, particularly peroxidases. Fluorescent compounds include fluorescein and its derivatives, rhodamine and its derivatives, dansyl, umbelliferone, etc. Chemiluminescent compounds include luciferin, and 2,3-dihydrophthalazinediones, e.g., luminol. Means of detecting labels are well known to those of skill in the art. Thus, for example, where the label is a radioactive label, means for detection include a scintillation counter or photographic film as in autoradiography. Where the label is optically detectable, typical detectors include microscopes, cameras, phototubes and photodiodes and many other detection systems which are widely available.


[0090] It will be appreciated that probe design is influenced by the intended application. For example, where several allele-specific probe-target interactions are to be detected in a single assay, e.g., on a single DNA chip, it is desirable to have similar melting temperatures for all of the probes. Accordingly, the lengths of the probes are adjusted so that the melting temperatures for all of the probes on the array are closely similar (it will be appreciated that different lengths for different probes may be needed to achieve a particular Tm where different probes have different GC contents). Although melting temperature is a primary consideration in probe design, other factors are optionally used to further adjust probe construction, such as selecting against probe self-complementarity and the like.
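
By way of illustration only, the length adjustment described above can be sketched as follows. The sketch assumes the simple Wallace rule (roughly 2 °C per A or T and 4 °C per G or C), which is only a coarse estimate for short oligonucleotides and is not prescribed by this disclosure. The function names, the 60 °C target, and the two example regions are hypothetical placeholders, not sequences of the invention.

```python
def wallace_tm(oligo: str) -> int:
    """Estimate Tm in degrees C with the Wallace rule: 2*(A+T) + 4*(G+C)."""
    s = oligo.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc


def probe_for_target_tm(template: str, target_tm: int, min_len: int = 10) -> str:
    """Return the shortest prefix of `template` (at least `min_len` nt) whose
    estimated Tm reaches `target_tm`, or the whole template if none does."""
    for length in range(min_len, len(template) + 1):
        candidate = template[:length]
        if wallace_tm(candidate) >= target_tm:
            return candidate
    return template


# Hypothetical target regions with very different GC contents.
gc_rich = "GCGGCCGCGGCCGCGGCCGCGGCC"
at_rich = "ATTATAATTTAATATTAAATTATAATTTAATATTAAAT"

for region in (gc_rich, at_rich):
    probe = probe_for_target_tm(region, target_tm=60)
    print(len(probe), wallace_tm(probe), probe)
```

With these inputs the GC-rich region yields a shorter probe than the AT-rich region for the same target Tm, which is the behavior the paragraph above describes. A nearest-neighbor thermodynamic model would be a more realistic substitute for the Wallace rule.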


[0091] Marker Sets


[0092] Sets of probes, including multiple nucleic acids with polynucleotide sequences or sequences selected from among the polynucleotides of the invention, e.g., SEQ ID NO: 1 through SEQ ID NO: 30, SEQ ID NO: 61 through SEQ ID NO: 403, or subsequences thereof, or conservative variants thereof, or sequences complementary to any of the foregoing are also a feature of the invention. Such sets of probes are useful as marker sets, e.g., for predicting plant growth traits before they become apparent, identifying plant or cell phenotype, and/or the like.


[0093] Marker sets of the invention favorably include any of the probe sequences described above, such as polynucleotide sequences that hybridize under stringent conditions to any one of SEQ ID NO: 1-SEQ ID NO: 30, any one of SEQ ID NO: 61 through SEQ ID NO: 403, sequences that are at least 70% identical to any one of SEQ ID NO: 1-SEQ ID NO: 30, sequences that encode a polypeptide or peptide comprising a subsequence encoded by any one of SEQ ID NO: 31-SEQ ID NO: 60, sequences complementary to any such sequences, or subsequences thereof.


[0094] In one embodiment, the marker set of the invention is a plurality of oligonucleotides, e.g., synthetic oligonucleotides produced by the phosphoramidite triester synthesis method on an automated synthesizer, as described above. For example, at least two oligonucleotides including a polynucleotide sequence of at least 10 contiguous nucleotides of sequences selected from a polynucleotide of the invention, e.g., SEQ ID NO: 1 to SEQ ID NO: 30 or SEQ ID NO: 61 through SEQ ID NO: 403, can be used as a set to predict plant growth traits before they become apparent. Frequently, the oligonucleotides selected will be longer than 10 contiguous nucleotides in length, for example, oligonucleotides of at least 12, or 14, or 16 or 17, or more contiguous nucleotides are favorably employed in the marker sets of the invention.
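
As a minimal sketch of the selection just described, candidate oligonucleotides of a chosen length can be enumerated from a source sequence, together with their reverse complements. The input sequence below is a made-up placeholder rather than any of the SEQ ID NOs, and the helper names are hypothetical.

```python
from typing import List

# Translation table for the four unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def contiguous_oligos(seq: str, length: int = 17) -> List[str]:
    """List every window of `length` contiguous nucleotides in `seq`.

    The lower bound of 10 nt mirrors the minimum stated in the text."""
    if length < 10:
        raise ValueError("oligonucleotides shorter than 10 nt are not considered")
    seq = seq.upper()
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


# Placeholder sequence, for illustration only.
placeholder_seq = "ATGGCTAGCTAGGATCCGATCGTACGATCG"
oligos = contiguous_oligos(placeholder_seq, length=17)
antisense = [reverse_complement(o) for o in oligos]
print(len(oligos), oligos[0], antisense[0])
```

In practice the candidates would be filtered further, e.g., for melting temperature and self-complementarity, before any were synthesized.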


[0095] While as few as one or two probes can constitute a marker set, it is frequently desirable to employ marker sets with more than two members. Typically, a marker set of the invention has at least 3, often at least about 5 or more members selected from among any of the polynucleotides of the invention. In one favorable embodiment, the marker set includes oligonucleotides corresponding in sequence to at least part of each of SEQ ID NO: 1 through SEQ ID NO: 30 or SEQ ID NO: 61 through SEQ ID NO: 403. In another embodiment, the marker sets are made up of expression products such as cDNAs, or amplification products corresponding to cDNA or RNA expression products.


[0096] In some applications, the marker set includes labeled nucleic acid probes as described in the preceding section. In other applications, e.g., certain array applications, a labeled nucleic acid sample is hybridized to a set of unlabeled marker nucleic acids.


[0097] The marker sets of the invention are frequently employed in the context of a polynucleotide sequence array. Any of the polynucleotide sequences of the invention, as described above, can be logically or physically arrayed to produce a useful array. For example, nucleic acids, e.g., oligonucleotides, cDNAs, amplicons, and/or chromosomal segments, can be physically arrayed in a solid phase or liquid phase array. Common solid phase arrays include a variety of solid substrates suitable for attaching nucleic acids in an ordered manner, such as membranes, filters, chips, beads, pins, slides, plates, etc. Common liquid phase arrays include, e.g., arrays of wells (e.g., as in microtiter trays) or containers (e.g., as in arrays of test tubes).
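
The notion of a logical array can be illustrated by a simple mapping from probe identifiers to positions on a 96-well microtiter plate (rows A through H, columns 1 through 12). This is a sketch only: the probe names are hypothetical placeholders, and the row-by-row fill order is an assumption rather than a requirement of the invention.

```python
import string
from typing import Dict, Iterable


def assign_wells(probe_ids: Iterable[str], rows: int = 8, cols: int = 12) -> Dict[str, str]:
    """Map well labels (e.g. 'A1') to probe identifiers, filling row by row."""
    ids = list(probe_ids)
    if len(ids) > rows * cols:
        raise ValueError("more probes than wells on the plate")
    layout = {}
    for index, probe_id in enumerate(ids):
        row, col = divmod(index, cols)
        well = f"{string.ascii_uppercase[row]}{col + 1}"
        layout[well] = probe_id
    return layout


# Fourteen hypothetical probes: twelve fill row A, two spill into row B.
layout = assign_wells(f"probe_{i}" for i in range(1, 15))
for well, probe_id in layout.items():
    print(well, probe_id)
```

The same lookup serves a physical array equally well, since the well label is simply the address at which a given marker nucleic acid is deposited or read.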


[0098] Nucleic acids of the marker sets are optionally immobilized, for example by direct or indirect cross-linking, to the solid support. Essentially any solid support capable of withstanding the reagents and conditions used in the particular detection assay can be utilized. For example, functionalized glass, silicon, silicon dioxide, modified silicon, any of a variety of polymers, such as (poly)tetrafluoroethylene, (poly)vinylidenedifluoride, polystyrene, polycarbonate, membranes (e.g., nylon or nitrocellulose), or combinations thereof, can all serve as the substrate for a solid phase array.


[0099] In one embodiment, the array is a “chip” composed, e.g., of one of the above specified materials. Polynucleotide probes, e.g., RNA or DNA, such as cDNA, synthetic oligonucleotides, and the like, as discussed above are adhered to the chip in a logically ordered manner, i.e., in an array. Additional details regarding methods for linking nucleic acids and proteins to a chip substrate can be found in, e.g., U.S. Pat. No. 5,143,854 “Large Scale Photolithographic Solid Phase Synthesis of Polypeptides and Receptor Binding Screening Thereof” to Pirrung et al., issued Sep. 1, 1992; U.S. Pat. No. 5,837,832 “Arrays of Nucleic Acid Probes on Biological Chips” to Chee et al., issued Nov. 17, 1998; U.S. Pat. No. 6,087,112 “Arrays with Modified Oligonucleotide and Polynucleotide Compositions” to Dale, issued Jul. 11, 2000; U.S. Pat. No. 5,215,882 “Method of Immobilizing Nucleic Acid on a Solid Substrate for Use in Nucleic Acid Hybridization Assays” to Bahl et al., issued Jun. 1, 1993; U.S. Pat. No. 5,707,807 “Molecular Indexing for Expressed Gene Analysis” to Kato, issued Jan. 13, 1998; U.S. Pat. No. 5,807,522 “Methods for Fabricating Microarrays of Biological Samples” to Brown et al., issued Sep. 15, 1998; U.S. Pat. No. 5,958,342 “Jet Droplet Device” to Gamble et al., issued Sep. 28, 1999; U.S. Pat. No. 5,994,076 “Methods of Assaying Differential Expression” to Chenchik et al., issued Nov. 30, 1999; U.S. Pat. No. 6,004,755 “Quantitative Microarray Hybridization Assays” to Wang, issued Dec. 21, 1999; U.S. Pat. No. 6,048,695 “Chemically Modified Nucleic Acids and Method for Coupling Nucleic Acids to Solid Support” to Bradley et al., issued Apr. 11, 2000; U.S. Pat. No. 6,060,240 “Methods for Measuring Relative Amounts of Nucleic Acids in a Complex Mixture and Retrieval of Specific Sequences Therefrom” to Kamb et al., issued May 9, 2000; U.S. Pat. No. 6,090,556 “Method for Quantitatively Determining the Expression of a Gene” to Kato, issued Jul. 18, 2000; and U.S. Pat. No. 6,040,138 “Expression Monitoring by Hybridization to High Density Oligonucleotide Arrays” to Lockhart et al., issued Mar. 21, 2000.


[0100] In addition to being able to design, build and use probe arrays using available techniques, one of skill can simply order custom-made arrays and array-reading devices from manufacturers specializing in array manufacture. For example, custom arrays are available through Agilent Technologies, Inc. or through Affymetrix Corp. in Santa Clara, Calif., which manufactures DNA VLSIP™ arrays.


[0101] In addition to marker sets made up of nucleic acid probes described above, marker sets including polypeptide, peptide, and antibody probes as discussed in the section entitled “Labeled Probes” are favorably used in certain applications. As discussed above for individual probes, sets of probes including multiple members selected from SEQ ID NOs: 31-60, or antibodies specific to such sequences can be used in liquid phase, or immobilized as described above with respect to nucleic acid markers.


[0102] Vectors, Promoters and Expression Systems


[0103] The present invention includes recombinant constructs incorporating one or more of the nucleic acid sequences described above. Such constructs include a vector, for example, a plasmid, a cosmid, a phage, a virus, a bacterial artificial chromosome (BAC), a yeast artificial chromosome (YAC), etc., into which one or more of the polynucleotide sequences of the invention, e.g., comprising any of SEQ ID NO: 1-30 or SEQ ID NO: 61-403, or a subsequence thereof, has been inserted, in a forward or reverse orientation. For example, the inserted nucleic acid can include a chromosomal sequence or cDNA including all or part of at least one of SEQ ID NO: 1 through SEQ ID NO: 30, such as a sequence originating on Arabidopsis chromosome 2, or a cDNA corresponding to an mRNA expression product transcribed from a polynucleotide sequence on Arabidopsis chromosome 2. In an embodiment, the construct further comprises regulatory sequences, including, for example, a promoter, operably linked to the sequence. Large numbers of suitable vectors and promoters are known to those of skill in the art, and are commercially available.


[0104] The polynucleotides of the present invention can be included in any one of a variety of vectors suitable for generating sense or antisense RNA, and optionally, polypeptide expression products. Such vectors include chromosomal, nonchromosomal and synthetic DNA sequences, e.g., derivatives of SV40; bacterial plasmids; phage DNA; baculovirus; yeast plasmids; vectors derived from combinations of plasmids and phage DNA; viral DNA such as vaccinia, adenovirus, fowl pox virus, pseudorabies, adeno-associated virus, retroviruses and many others. Any vector that is capable of introducing genetic material into a cell, and, if replication is desired, which is replicable in the relevant host can be used.


[0105] In an expression vector, the polynucleotide sequence of interest is physically arranged in proximity and orientation to an appropriate transcription control sequence (promoter, and optionally, one or more enhancers) to direct mRNA synthesis. That is, the polynucleotide sequence of interest is operably linked to an appropriate transcription control sequence. Examples of such promoters include: LTR or SV40 promoter, E. coli lac or trp promoter, phage lambda PL promoter, and other promoters known to control expression of genes in prokaryotic or eukaryotic cells or their viruses. The expression vector also contains a ribosome binding site for translation initiation, and a transcription terminator. The vector optionally includes appropriate sequences for amplifying expression.


[0106] For example, constitutive promoters useful in vectors of the invention include the cauliflower mosaic virus (CaMV) 35S transcription initiation region, the 1′- or 2′-promoter derived from T-DNA of Agrobacterium tumefaciens, and other transcription initiation regions from various bacterial, plant or animal genes known to those of skill. Alternatively, the promoter can direct expression of a polynucleotide of the invention in a specific tissue (tissue-specific promoters) or can be otherwise under more precise environmental control (inducible promoters). Examples of tissue-specific promoters under developmental control include promoters that initiate transcription only in certain tissues, such as fruit, seeds, or flowers.


[0107] Any of a number of promoters which direct transcription in cells can be suitable. The promoter can be either constitutive or inducible. For example, in addition to the promoters noted above, promoters of bacterial origin which operate in plants include the octopine synthase promoter, the nopaline synthase promoter and other promoters derived from native Ti plasmids. See, Herrera-Estrella et al. (1983), Nature, 303:209-213. Viral promoters include the 35S and 19S RNA promoters of cauliflower mosaic virus. See, Odell et al. (1985) Nature, 313:810-812. Other plant promoters include the ribulose-1,5-bisphosphate carboxylase small subunit promoter and the phaseolin promoter. The promoter sequence from the E8 gene and other genes can also be used. The isolation and sequence of the E8 promoter is described in detail in Deikman and Fischer, (1988) EMBO J. 7:3315-3327. Many other promoters are in current use and can be coupled to an exogenous DNA sequence to direct expression of the nucleic acid.


[0108] In addition, the expression vectors optionally comprise one or more selectable marker genes to provide a phenotypic trait for selection of transformed host cells, such as dihydrofolate reductase or neomycin resistance for eukaryotic cell culture, or such as tetracycline or ampicillin resistance in E. coli. The vector comprising the sequences (e.g., promoters or coding regions) from genes encoding expression products and polynucleotides of the invention optionally includes a nucleic acid subsequence, a marker gene, which confers a selectable, or alternatively, a screenable, phenotype on plant cells. For example, the marker may encode biocide tolerance, particularly antibiotic tolerance, such as tolerance to kanamycin, G418, bleomycin, hygromycin, or in plants: herbicide tolerance, such as tolerance to chlorsulfuron, or phosphinothricin (the active ingredient in the herbicides bialaphos or Basta). See, e.g., Padgette et al. (1996) “New weed control opportunities: Development of soybeans with a Roundup Ready™ gene” In: Herbicide-Resistant Crops (Duke, ed.), pp. 53-84, CRC Lewis Publishers, Boca Raton (“Padgette, 1996”). For example, crop selectivity to specific herbicides can be conferred by engineering genes into crops which encode appropriate herbicide metabolizing enzymes from other organisms, such as microbes. See, Vasil (1996) “Phosphinothricin-resistant crops” In: Herbicide-Resistant Crops (Duke, ed.), pp. 85-91, CRC Lewis Publishers, Boca Raton (“Vasil, 1996”).


[0109] Additional Expression Elements


[0110] Where translation of a polypeptide encoded by a nucleic acid comprising a polynucleotide sequence of the invention is desired, additional translation specific initiation signals can improve the efficiency of translation. These signals can include, e.g., an ATG initiation codon and adjacent sequences. In some cases, for example, full-length cDNA molecules or chromosomal segments including a coding sequence incorporating, e.g., a polynucleotide sequence of the invention, a translation initiation codon and associated sequence elements are inserted into the appropriate expression vector simultaneously with the polynucleotide sequence of interest. In such cases, additional translational control signals frequently are not required. However, in cases where only a polypeptide coding sequence, or a portion thereof, is inserted, exogenous translational control signals, including an ATG initiation codon, are provided for expression of the relevant sequence. The initiation codon is put in the correct reading frame to ensure translation of the polynucleotide sequence of interest. Exogenous translational elements and initiation codons can be of various origins, both natural and synthetic. The efficiency of expression can be enhanced by the inclusion of enhancers appropriate to the cell system in use (Scharf D et al. (1994) Results Probl Cell Differ 20:125-62; Bittner et al. (1987) Methods in Enzymol 153:516-544).
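
The reading-frame requirement can be illustrated with a small check on a coding insert: it should begin with an ATG, have a length that is a multiple of three, and contain no stop codon ahead of its final codon. The sketch below assumes the stop codons of the standard genetic code (TAA, TAG, TGA). The function names and example fragments are hypothetical and are not drawn from the sequences of the invention.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code, DNA alphabet


def is_in_frame_orf(cds: str) -> bool:
    """True if `cds` starts with ATG, is a whole number of codons, and has
    no stop codon before its last codon."""
    s = cds.upper()
    if not s.startswith("ATG") or len(s) % 3 != 0:
        return False
    codons = [s[i:i + 3] for i in range(0, len(s), 3)]
    # An internal stop codon would truncate the translated product.
    return not any(codon in STOP_CODONS for codon in codons[:-1])


def add_start_codon(fragment: str) -> str:
    """Prepend an exogenous ATG when the coding fragment lacks one."""
    s = fragment.upper()
    return s if s.startswith("ATG") else "ATG" + s


# A partial coding fragment with no initiation codon of its own.
partial = "GCTGGATAA"
print(is_in_frame_orf(partial))                   # lacks ATG
print(is_in_frame_orf(add_start_codon(partial)))  # ATG supplied in frame
```

A fragment whose length is not a multiple of three would need a frame-adjusting linker in addition to the ATG; that step is omitted here for brevity.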


[0111] Expression Hosts


[0112] The present invention also relates to host cells which are transduced with vectors of the invention, and the production of polypeptides of the invention by recombinant techniques. Host cells are genetically engineered (i.e., transduced, transformed or transfected) with a vector, such as an expression vector, of this invention. As described above, the vector can be in the form of a plasmid, a viral particle, a phage, etc. Examples of appropriate expression hosts include: bacterial cells, such as Agrobacterium tumefaciens, E. coli, Streptomyces, and Salmonella typhimurium; fungal cells, such as Saccharomyces cerevisiae, Pichia pastoris, and Neurospora crassa; insect cells such as Drosophila and Spodoptera frugiperda; mammalian cells such as COS, CHO, BHK, HEK 293 or Bowes melanoma; plant cells, etc.


[0113] The engineered host cells can be cultured in conventional nutrient media modified as appropriate for activating promoters, selecting transformants, or amplifying the inserted polynucleotide sequences. The culture conditions, such as temperature, pH and the like, are typically those previously used with the host cell selected for expression, and will be apparent to those skilled in the art and in the references cited herein, including, e.g., Freshney (1994) Culture of Animal Cells, a Manual of Basic Technique, third edition, Wiley-Liss, New York and the references cited therein. Expression products corresponding to the nucleic acids of the invention can also be produced in non-animal cells such as plants, yeast, fungi, bacteria and the like. In addition to Sambrook, Berger and Ausubel, details regarding cell culture can be found in Payne et al. (1992) Plant Cell and Tissue Culture in Liquid Systems John Wiley & Sons, Inc. New York, N.Y.; Gamborg and Phillips (eds) (1995) Plant Cell, Tissue and Organ Culture; Fundamental Methods Springer Lab Manual, Springer-Verlag (Berlin Heidelberg New York) and Atlas and Parks (eds) The Handbook of Microbiological Media (1993) CRC Press, Boca Raton, Fla.


[0114] In bacterial systems, a number of expression vectors can be selected depending upon the use intended for the expressed product. For example, when large quantities of a polypeptide or fragments thereof are needed for the production of antibodies, vectors which direct high level expression of fusion proteins that are readily purified are favorably employed. Such vectors include, but are not limited to, multifunctional E. coli cloning and expression vectors such as BLUESCRIPT (Stratagene), in which the coding sequence of interest, e.g., a polynucleotide of the invention as described above, can be ligated into the vector in-frame with sequences for the amino-terminal translation initiating Methionine and the subsequent 7 residues of beta-galactosidase producing a catalytically active beta galactosidase fusion protein; pIN vectors (Van Heeke & Schuster (1989) J Biol Chem 264:5503-5509); pET vectors (Novagen, Madison Wis.); and the like.


[0115] Similarly, in the yeast Saccharomyces cerevisiae a number of vectors containing constitutive or inducible promoters such as alpha factor, alcohol oxidase and PGH can be used for production of the desired expression products. For reviews, see Berger, Ausubel, and, e.g., Grant et al. (1987; Methods in Enzymology 153:516-544).


[0116] In mammalian host cells, a number of expression systems, such as viral-based systems, can be utilized. For example, in cases where an adenovirus is used as an expression vector, a coding sequence is optionally ligated into an adenovirus transcription/translation complex consisting of the late promoter and tripartite leader sequence. Insertion in a nonessential E1 or E3 region of the viral genome will result in a viable virus capable of expressing the polypeptides of interest in infected host cells (Logan and Shenk (1984) Proc Natl Acad Sci 81:3655-3659). In addition, transcription enhancers, such as the Rous sarcoma virus (RSV) enhancer, can be used to increase expression in mammalian host cells.


[0117] Transformed or transfected host cells containing the expression vectors described above are also a feature of the invention. The host cell can be a eukaryotic cell, such as a mammalian cell, a yeast cell, or a plant cell, or the host cell can be a prokaryotic cell, such as a bacterial cell. Introduction of the construct into the host cell can be effected by calcium phosphate transfection, DEAE-Dextran mediated transfection, electroporation, or other common techniques (Davis, L., Dibner, M., and Battey, I. (1986) Basic Methods in Molecular Biology).


[0118] A host cell strain is optionally chosen for its ability to modulate the expression of the inserted sequences or to process the expressed protein in the desired fashion. Such modifications of the protein include, but are not limited to, acetylation, carboxylation, glycosylation, phosphorylation, lipidation, and acylation. Post-translational processing which cleaves a precursor form into a mature form of the protein is sometimes important for correct insertion, folding, and/or function. Different host cells such as bacterial, fungal, plant and animal host cells have specific cellular machinery and characteristic mechanisms for such post-translational activities and can be chosen to ensure the correct modification and processing of the introduced, foreign protein.


[0119] For long-term, high-yield production of recombinant proteins encoded by or having subsequences encoded by the polynucleotides of the invention, stable expression systems are typically used. For example, cell lines which stably express a polypeptide of the invention are transfected using expression vectors which contain viral origins of replication or endogenous expression elements and a selectable marker gene. Following the introduction of the vector, cells are allowed to grow for 1-2 days in an enriched media before they are switched to selective media. The purpose of the selectable marker is to confer resistance to selection, and its presence allows growth and recovery of cells which successfully express the introduced sequences. For example, resistant colonies of stably transformed cells can be proliferated using tissue culture techniques appropriate to the cell type.


[0120] Host cells transformed with a nucleotide sequence encoding a polypeptide of the invention are optionally cultured under conditions suitable for the expression and recovery of the encoded protein from cell culture. The protein or fragment thereof produced by a recombinant cell can be secreted, membrane-bound, or contained intracellularly, depending on the sequence and/or the vector used.


[0121] Plant Transformation


[0122] The nucleic acids of the invention can be introduced into plants to modulate growth of the plants. That is, expression of the nucleic acids, e.g., when present as transgenes can modulate growth of the plants. Similarly, transgenic expression of sense or anti-sense sequences of the invention can modulate expression of endogenous forms or homologues of the nucleic acids, thereby modulating growth of the plants. Thus, the sequences specified herein, or homologues (or other variants) thereof, can be expressed to modulate plant growth.


[0123] The nucleic acids of the invention are optionally expressed under the control of an inducible promoter, e.g., a promoter regulated by an environmental signal (e.g., a chemical, a hormone (e.g., a plant or insect hormone), heat, light, water or the like). Alternatively, a constitutive promoter can be used to drive expression of a nucleic acid of interest.


[0124] It can also be useful to stack expression of multiple nucleic acids of the invention in a single plant to modulate growth of the plant, or to stack expression of the nucleic acids of the invention with any other nucleic acid that provides a desired property (resistance to pests, herbicides, etc).


[0125] As noted, natural homologues, e.g., of the Arabidopsis sequences noted herein can be identified using standard molecular techniques as noted herein, and/or using sequence comparison methods as noted herein. In one embodiment, nucleic acids corresponding to homologues from a species are introduced as components of expression vectors into plants of that species (e.g., a corn homologue is introduced into corn) to modulate plant growth of the resulting transgenic plant. In another embodiment, nucleic acids from a species are introduced into a different species (e.g., a corn homologue is optionally introduced into a different grass family plant) to modulate plant growth of the resulting transgenic plant.


[0126] Accordingly, polynucleotides of the invention can be introduced into an Arabidopsis or any other desired plant genome, e.g., Brassica, Zea, Oryza, Triticum, Hordeum, Lolium, Sorghum, Glycine, Medicago, Helianthus, Lactuca, Beta, Vitis, Solanum, Lycopersicon, Capsicum, Gossypium, Hevea, Linum, Prunus, Citrus, Populus, Pinus, and Quercus, using a number of techniques well established in the art. Methods for transforming a wide variety of higher plant species have been described in the technical and scientific literature (see, e.g., Payne et al. (1992) Plant Cell and Tissue Culture in Liquid Systems John Wiley & Sons, Inc. New York, N.Y.; Gamborg and Phillips (1995) Plant Cell, Tissue and Organ Culture: Fundamental Methods Springer Lab Manual, Springer-Verlag, Berlin; Jones (1995) Plant Gene Transfer and Expression Protocols: Methods in Molecular Biology, Volume 49 Humana Press, Totowa, N.J.; and Croy (1993) Plant Molecular Biology Bios Scientific Publishers, Oxford, U.K., as well as, e.g., Weising et al. (1988) Ann. Rev. Genet. 22:421).


[0127] In many cases, introduction of exogenous nucleic acids into a plant genome is facilitated by molecular transformation of plant protoplasts or isolated plant tissues in a tissue culture system, e.g., a liquid tissue culture system, as described in the references above. Numerous protocols for establishment of transformable protoplasts from a variety of plant types and subsequent transformation of the cultured protoplasts are available in the art and are incorporated herein by reference. For examples, see, Hashimoto et al. (1990) Plant Physiol. 93:857; Fowke and Constabel (eds)(1994) Plant Protoplasts; Saunders et al. (1993) Applications of Plant In Vitro Technology Symposium, UPM 16-18; and Lyznik et al. (1991) BioTechniques 10:295, each of which is incorporated herein by reference.


[0128] Nucleic acids, e.g., DNA expression vectors comprising the polynucleotides of the invention, can be introduced directly into the genomic DNA of a plant cell using techniques such as electroporation (see, e.g., Fromm et al. (1985) Proc Nat'l Acad Sci USA 82:5824), polyethylene glycol precipitation (see, e.g., Paszkowski et al. (1984) EMBO J. 3:2717) and microinjection of plant cell protoplasts. Ballistic methods, such as DNA particle bombardment can be used to introduce DNA into plant tissues (see, e.g., Klein et al. (1987) Nature 327:70; and Weeks et al. Plant Physiol 102:1077).


[0129] Alternatively, the polynucleotides of the invention can be combined with suitable T-DNA flanking regions and introduced into a conventional Agrobacterium tumefaciens host vector. Agrobacterium-mediated transformation is widely used for the transformation of dicots, such as Arabidopsis and numerous other species of experimental and commercial interest, as well as certain monocots. For example, Agrobacterium transformation of rice is described by Hiei et al. (1994) Plant J. 6:271; U.S. Pat. No. 5,187,073; U.S. Pat. No. 5,591,616; Li et al. (1991) Science in China 34:54; and Raineri et al. (1990) Bio/Technology 8:33. Agrobacterium-mediated transformation of maize, barley, triticale and asparagus has also been described (Xu et al. (1990) Chinese J Bot 2:81).


[0130] Agrobacterium mediated transformation techniques take advantage of the ability of the tumor-inducing (Ti) plasmid of A. tumefaciens to integrate into a plant cell genome, to co-transfer a nucleic acid of interest into a plant cell. Typically, an expression vector is produced wherein the nucleic acid of interest, such as a polynucleotide of the invention, is ligated into an autonomously replicating plasmid which also contains T-DNA sequences. T-DNA sequences typically flank the expression cassette nucleic acid of interest and comprise the integration sequences of the plasmid. In addition to the expression cassette, T-DNA also typically includes a marker sequence, e.g., antibiotic resistance genes. The plasmid with the T-DNA and the expression cassette can then be transfected into Agrobacterium cells. Typically, for effective transformation of plant cells, the A. tumefaciens bacterium also possesses the necessary vir regions on a plasmid, or integrated into its chromosome. For a discussion of Agrobacterium mediated transformation, see, Firoozabady and Kuehnle, (1995) Plant Cell Tissue and Organ Culture Fundamental Methods, Gamborg and Phillips (eds.).


[0131] In addition, methods for transforming Arabidopsis in whole plants without tissue culture have been developed, e.g., using vacuum infiltration (Bechtold et al. (1993) “In planta Agrobacterium mediated gene transfer by infiltration of adult Arabidopsis thaliana plants”. CR Acad Sci Paris Life Sci 316:1194-1199) and simple dipping of flowering plants (Desfeux et al. (2000) “Female reproductive tissues are the primary target of Agrobacterium-mediated transformation by the Arabidopsis floral-dip method” Plant Physiol. 123:895-904).


[0132] Plant viral vectors can also be used to introduce exogenous nucleic acids comprising the polynucleotides of the invention into a plant genome. Typically, viral vectors are used when transient expression of the exogenous polynucleotide sequence is desirable. Viral vectors are simple to manipulate in vitro and can be easily introduced into mechanically wounded leaves of intact plants of a variety of laboratory plant species as well as common crop species. Over six-hundred-fifty plant viruses have been identified, and both DNA and RNA viruses have been used as vectors for gene replacement, gene insertion, epitope presentation and complementation (see, e.g., Scholthof, Scholthof and Jackson, (1996) “Plant virus gene vectors for transient expression of foreign proteins in plants,” Annu. Rev. of Phytopathol. 34:299-323). The nucleotide sequences encoding many of these proteins are matters of public knowledge, and accessible through any of a number of databases, e.g., GenBank (available on the world wide web at ncbi.nlm.nih.gov/genbank/) or EMBL (available on the world wide web at ebi.ac.uk/embl/).


[0133] Methods for the transformation of plants and plant cells using sequences derived from plant viruses include the direct transformation techniques described above relating to DNA molecules; see, e.g., Jones, ed. (1995) Plant Gene Transfer and Expression Protocols, Humana Press, Totowa, N.J., for a recent compilation. In addition, viral sequences can be cloned adjacent to T-DNA border sequences and introduced via Agrobacterium mediated transformation, or Agroinfection.


[0134] Viral particles comprising the plant virus vectors of the invention can also be introduced by mechanical inoculation using techniques well known in the art, (see e.g., Cunningham and Porter, eds. (1997) Methods in Biotechnology, Vol. 3. Recombinant Proteins from Plants: Production and Isolation of Clinically Useful Compounds, for detailed protocols).


[0135] Regeneration of Transgenic Plants


[0136] Transgenic plant cells which are derived by plant transformation techniques, including those discussed above, can be cultured to regenerate a whole plant which possesses the transformed genotype (e.g., SEQ ID NO: 1-30), and thus the desired phenotype, such as a desirable growth trait. Such regeneration techniques rely on manipulation of certain phytohormones in a tissue culture growth medium, typically relying on a biocide and/or herbicide marker which has been introduced together with the desired nucleotide sequences. Plant regeneration from cultured protoplasts is described in Evans et al. (1983) Protoplasts Isolation and Culture, Handbook of Plant Cell Culture, pp 124-176, Macmillan Publishing Company, New York; and Binding (1985) Regeneration of Plants, Plant Protoplasts pp 21-73, CRC Press, Boca Raton. Regeneration can also be obtained from plant callus, explants, organs, or parts thereof. Such regeneration techniques are described generally in Klee et al. (1987) Ann Rev of Plant Phys 38:467. See also, e.g., Payne and Gamborg, supra. After transformation with Agrobacterium, the explants typically are transferred to selection medium. One of skill will realize that the selection medium depends on the selectable marker that was co-transfected into the explants. After a suitable length of time, transformants will begin to form shoots. After the shoots are about 1-2 cm in length, the shoots should be transferred to a suitable root and shoot medium. Selection pressure should be maintained in the root and shoot medium.


[0137] Typically, the transformants will develop roots in about 1-2 weeks and form plantlets. After the plantlets are about 3-5 cm in height, they are placed in sterile soil in fiber pots. Those of skill in the art will realize that different acclimation procedures are used to obtain transformed plants of different species. For example, after developing a root and shoot, cuttings, as well as somatic embryos of transformed plants, are transferred to medium for establishment of plantlets. For a description of selection and regeneration of transformed plants, see, e.g., Dodds and Roberts (1995) Experiments in Plant Tissue Culture, 3rd Ed., Cambridge University Press.


[0138] The transgenic plants of this invention can be characterized either genotypically or phenotypically to evaluate the presence of an exogenous nucleic acid, e.g., a polynucleotide of the invention. Genotypic analysis can be performed by any of a number of well-known techniques, including PCR amplification of genomic DNA and hybridization of genomic DNA with specific labeled probes. Phenotypic analysis includes, e.g., survival of plants or plant tissues exposed to a selected biocide or herbicide.


[0139] Essentially any plant can be transformed with the polynucleotides of the invention. Suitable plants include agronomically and horticulturally important species. Such species include, but are not restricted to, members of the families: Gramineae (including corn, rye, triticale, barley, millet, rice, wheat, oats, etc.); Leguminosae (including pea, beans, lentil, peanut, yam bean, cowpeas, velvet beans, soybean, clover, alfalfa, lupine, vetch, lotus, sweet clover, wisteria, and sweetpea); Compositae (the largest family of vascular plants, including at least 1,000 genera, including important commercial crops such as sunflower) and Rosaceae (including raspberry, apricot, almond, peach, rose, etc.), as well as nut plants (including walnut, pecan, hazelnut, etc.), and forest trees (including Pinus, Quercus, Pseudotsuga, Sequoia, Populus, etc.). The ability to modulate growth of commercially relevant plants using the nucleic acids and proteins of the invention provides a clear utility for such nucleic acids and proteins.


[0140] Additional targets for modification by the polynucleotides of the invention, as well as those specified above, include plants from the genera: Agrostis, Allium, Antirrhinum, Apium, Arachis, Asparagus, Atropa, Avena (e.g., oats), Bambusa, Brassica, Bromus, Browallia, Camellia, Cannabis, Capsicum, Cicer, Chenopodium, Cichorium, Citrus, Coffea, Coix, Cucumis, Cucurbita, Cynodon, Dactylis, Datura, Daucus, Digitalis, Dioscorea, Elaeis, Eleusine, Festuca, Fragaria, Geranium, Gossypium, Glycine, Helianthus, Hemerocallis, Hevea, Hordeum (e.g., barley), Hyoscyamus, Ipomoea, Lactuca, Lens, Lilium, Linum, Lolium, Lotus, Lycopersicon, Majorana, Malus, Mangifera, Manihot, Medicago, Nemesia, Nicotiana, Onobrychis, Oryza (e.g., rice), Panicum, Pelargonium, Pennisetum (e.g., millet), Petunia, Pisum, Phaseolus, Phleum, Poa, Prunus, Ranunculus, Raphanus, Ribes, Ricinus, Rubus, Saccharum, Salpiglossis, Secale (e.g., rye), Senecio, Setaria, Sinapis, Solanum, Sorghum, Stenotaphrum, Theobroma, Trifolium, Trigonella, Triticum (e.g., wheat), Vicia, Vigna, Vitis, Zea (e.g., corn), the Olyreae, the Pharoideae, and many others. As noted, plants in the family Brassicaceae are particularly favored target plants for the methods of the invention.


[0141] Common crop plants which are targets of the present invention include corn, rice, triticale, rye, cotton, soybean, sorghum, wheat, oats, barley, millet, sunflower, canola, peas, beans, lentils, peanuts, yam beans, cowpeas, velvet beans, clover, alfalfa, lupine, vetch, lotus, sweet clover, wisteria, sweetpea, and nut plants (e.g., walnut, pecan, etc).


[0142] In cases where expression in the plant chloroplast is desired, the polynucleotide of the invention is modified by the addition of a chloroplast transit sequence peptide to facilitate translocation of the gene products into the chloroplasts. Additionally, methods are available in the art to accomplish transformation directly into the chloroplast accompanied by expression of the transformed polynucleotides (e.g., Daniell et al. (1998) Nature Biotechnology 16:346; O'Neill et al. (1993) The Plant Journal 3:729; Maliga (1993) TIBTECH 11:1). In such cases, it is desirable to employ expression vectors that are designed specifically to function in the chloroplast. Typically, the coding sequence, e.g., a polynucleotide sequence of the invention, is flanked by two regions of homology to the chloroplast genome to effect a homologous recombination with the chloroplast genome; often a selectable marker gene is also present within the flanking plastid DNA sequences to facilitate selection of genetically stable transformed chloroplasts in the resultant transplastomic plant cells (see, e.g., Maliga (1993) and Daniell (1998), and references cited therein).


[0143] Polypeptide Production and Recovery


[0144] Following transduction of a suitable host cell line or strain, and growth of the host cells to an appropriate cell density, the selected promoter is induced by appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an additional period. The secreted polypeptide product is then recovered from the culture medium. Alternatively, cells can be harvested by centrifugation, disrupted by physical or chemical means, and the resulting crude extract retained for further purification. Eukaryotic or microbial cells employed in expression of proteins can be disrupted by any convenient method, including freeze-thaw cycling, sonication, mechanical disruption, use of cell lysing agents, or other methods which are well known to those skilled in the art.


[0145] Expressed polypeptides can be recovered and purified from recombinant cell cultures by any of a number of methods well known in the art, including ammonium sulfate or ethanol precipitation, acid extraction, anion or cation exchange chromatography, phosphocellulose chromatography, hydrophobic interaction chromatography, affinity chromatography (e.g., using any of the tagging systems noted herein), hydroxylapatite chromatography, and lectin chromatography. Protein refolding steps can be used, as desired, in completing configuration of the mature protein. Finally, high performance liquid chromatography (HPLC) can be employed in the final purification steps. In addition to the references noted above, a variety of purification methods are well known in the art, including, e.g., those set forth in Sandana (1997) Bioseparation of Proteins, Academic Press, Inc.; and Bollag et al. (1996) Protein Methods, 2nd Edition Wiley-Liss, NY; Walker (1996) The Protein Protocols Handbook Humana Press, NJ, Harris and Angal (1990) Protein Purification Applications: A Practical Approach IRL Press at Oxford, Oxford, England; Harris and Angal Protein Purification Methods: A Practical Approach IRL Press at Oxford, Oxford, England; Scopes (1993) Protein Purification: Principles and Practice 3rd Edition Springer Verlag, NY; Janson and Ryden (1998) Protein Purification: Principles, High Resolution Methods and Applications. Second Edition Wiley-VCH, NY; and Walker (1998) Protein Protocols on CD-ROM Humana Press, NJ.


[0146] Alternatively, cell-free transcription/translation systems can be employed to produce polypeptides, e.g., corresponding to SEQ ID NO: 31 through SEQ ID NO: 60, subsequences thereof or sequences or subsequences encoded by the polynucleotides of the invention. A number of suitable in vitro transcription and translation systems are commercially available. A general guide to in vitro transcription and translation protocols is found in Tymms (1995) In vitro Transcription and Translation Protocols: Methods in Molecular Biology Volume 37, Garland Publishing, NY.


[0147] In addition, the polypeptides, or subsequences thereof, e.g., subsequences comprising antigenic peptides, can be produced manually or by using an automated system, by direct peptide synthesis using solid-phase techniques (see, Stewart et al. (1969) Solid-Phase Peptide Synthesis, W H Freeman Co, San Francisco; Merrifield J (1963) J. Am. Chem. Soc. 85:2149-2154). Exemplary automated systems include the Applied Biosystems 431A Peptide Synthesizer (Perkin Elmer, Foster City, Calif.). If desired, subsequences can be chemically synthesized separately, and combined using chemical methods to provide full-length polypeptides.


[0148] Conservatively Modified Variations


[0149] The polypeptides of the invention include, e.g., those presented in SEQ ID NO: 31 to SEQ ID NO: 60, but also similar polypeptides such as, e.g., homologues, peptides synthesized with modified amino acids, subsequences, peptides with conservative modifications, etc.


[0150] For example, the polypeptides of the present invention include conservatively modified variations of SEQ ID NO: 31 to SEQ ID NO: 60. Such conservatively modified variations comprise substitutions, additions, or deletions which alter, add or delete a single amino acid or a small percentage of amino acids (typically less than about 5%, more typically less than about 4%, 2%, or 1%) in any of SEQ ID NO: 31 to SEQ ID NO: 60. Typically, substitutions of amino acids are conservative substitutions according to the six substitution groups set forth in Table 1 (supra).


[0151] For example, a conservatively substituted variation of the polypeptide identified herein as SEQ ID NO: 31 will contain “conservative substitutions”, according to the six groups defined above, in up to 17 residues (i.e., 5% of the amino acids) in the 346 amino acid polypeptide.


[0152] For example, if four conservative substitutions were localized in the region corresponding to amino acids 2-26 of SEQ ID NO: 31, examples of conservatively substituted variations of this region,


[0153] ALKSKLVSL LFLIATLSST FAASFS include:


[0154] AMKSKLLSL LFLIAALSST FAASWS and


[0155] ALRSKLVSL LFIIATLTST FAASYS and the like, in accordance with the conservative substitutions listed in Table 1 (in the above example, conservative substitutions are underlined). Listing of a protein sequence herein, in conjunction with the above substitution table, provides an express listing of all conservatively substituted proteins.
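The comparison in paragraphs [0151] to [0155] can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation. Table 1 is not reproduced in this section, so the six groups used here (A/S/T; D/E; N/Q; R/K; I/L/M/V; F/Y/W) are an assumption based on the six-group classification commonly cited in the literature, and the function names are hypothetical. Positions are reported relative to the 25-residue region, so position 1 corresponds to amino acid 2 of SEQ ID NO: 31.

```python
# Assumed six conservative substitution groups. Table 1 is not reproduced in
# this section; these are the groups commonly cited in the literature.
GROUPS = ["AST", "DE", "NQ", "RK", "ILMV", "FYW"]
GROUP_OF = {aa: i for i, members in enumerate(GROUPS) for aa in members}


def substitutions(reference, variant):
    """Return (position, ref_aa, var_aa) for every differing residue.

    Positions are 1-based within the compared region; spaces are ignored.
    """
    ref = reference.replace(" ", "")
    var = variant.replace(" ", "")
    if len(ref) != len(var):
        raise ValueError("sequences must have equal length (substitutions only)")
    return [(i + 1, r, v) for i, (r, v) in enumerate(zip(ref, var)) if r != v]


def is_conservative(ref_aa, var_aa):
    """True if both residues fall in the same assumed substitution group."""
    group = GROUP_OF.get(ref_aa)
    return group is not None and group == GROUP_OF.get(var_aa)


def max_substitutions(length, fraction=0.05):
    """Largest whole number of residues that does not exceed the fraction."""
    return int(length * fraction)


reference = "ALKSKLVSL LFLIATLSST FAASFS"
variants = [
    "AMKSKLLSL LFLIAALSST FAASWS",
    "ALRSKLVSL LFIIATLTST FAASYS",
]

for variant in variants:
    subs = substitutions(reference, variant)
    # Each variant differs from the reference at four positions.
    print(subs, all(is_conservative(r, v) for _, r, v in subs))

# 5% of the 346 amino acid polypeptide, rounded down.
print(max_substitutions(346))
```

Under the assumed groups, each example variant carries four substitutions, all within a single group, and the 5% limit for a 346 amino acid polypeptide works out to 17 residues, in agreement with paragraph [0151].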


[0156] Finally, the addition of sequences which do not alter the encoded activity of a nucleic acid molecule, such as the addition of a non-functional sequence, provides conservative variations of the basic nucleic acid.


[0157] The polypeptides of the invention, including conservatively substituted sequences, can be present as part of larger polypeptide sequences such as occur upon the addition of one or more domains for purification of the protein (e.g., poly his segments, FLAG tag segments, etc.), e.g., where the additional functional domains have little or no effect on the activity of the protein, or where the additional domains can be removed by post synthesis processing steps such as by treatment with a protease.


[0158] Modified Amino Acids


[0159] Expressed polypeptides of the invention can contain one or more modified amino acids. The presence of modified amino acids can be advantageous in, for example, (a) increasing polypeptide serum half-life, (b) reducing polypeptide antigenicity, or (c) increasing polypeptide storage stability. Amino acid(s) are modified, for example, co-translationally or post-translationally during recombinant production (e.g., N-linked glycosylation at N-X-S/T motifs during expression in mammalian cells), or modified by synthetic means (e.g., via PEGylation).


[0160] Non-limiting examples of a modified amino acid include a glycosylated amino acid, a sulfated amino acid, a prenylated (e.g., farnesylated, geranylgeranylated) amino acid, an acetylated amino acid, an acylated amino acid, a PEG-ylated amino acid, a biotinylated amino acid, a carboxylated amino acid, a phosphorylated amino acid, and the like, as well as amino acids modified by conjugation to, e.g., lipid moieties or other organic derivatizing agents. References adequate to guide one of skill in the modification of amino acids are replete throughout the literature. Example protocols are found in Walker (1998) Protein Protocols on CD-ROM Humana Press, Totowa, N.J.


[0161] Antibodies


[0162] The polypeptides of the invention can be used to produce antibodies specific for the polypeptides of SEQ ID NO: 31-SEQ ID NO: 60, and conservative variants thereof. Antibodies specific for, e.g., SEQ ID NOs: 31-60, and related variant polypeptides are useful, e.g., for screening and identification purposes, e.g., related to the activity, distribution, and expression of target polypeptides.


[0163] Antibodies specific for the polypeptides of the invention can be generated by methods well known in the art. Such antibodies can include, but are not limited to, polyclonal, monoclonal, chimeric, humanized, single chain, Fab fragments and fragments produced by an Fab expression library.


[0164] Polypeptides do not require biological activity for antibody production. The full length polypeptide, subsequences, fragments or oligopeptides can be antigenic. Peptides used to induce specific antibodies typically have an amino acid sequence of at least about 10 amino acids, and often at least 15 or 20 amino acids. Short stretches of a polypeptide, e.g., selected from among SEQ ID NO: 31-SEQ ID NO: 60, can be fused with another protein, such as keyhole limpet hemocyanin, and antibodies produced against the chimeric molecule.


[0165] Numerous methods for producing polyclonal and monoclonal antibodies are known to those of skill in the art, and can be adapted to produce antibodies specific for the polypeptides of the invention, e.g., corresponding to SEQ ID NO: 31-SEQ ID NO: 60. See, e.g., Coligan (1991) Current Protocols in Immunology Wiley/Greene, NY; and Harlow and Lane (1988) Antibodies: A Laboratory Manual Cold Spring Harbor Press, NY; Stites et al. (eds.) Basic and Clinical Immunology (4th ed.) Lange Medical Publications, Los Altos, Calif., and references cited therein; Goding (1986) Monoclonal Antibodies: Principles and Practice (2d ed.) Academic Press, New York, N.Y.; Fundamental Immunology, e.g., 4th Edition (or later), W. E. Paul (ed.), Raven Press, N.Y. (1998); and Kohler and Milstein (1975) Nature 256: 495-497. Other suitable techniques for antibody preparation include selection of libraries of recombinant antibodies in phage or similar vectors. See, Huse et al. (1989) Science 246: 1275-1281; and Ward, et al. (1989) Nature 341: 544-546. Specific monoclonal and polyclonal antibodies and antisera will usually bind with a KD of at least about 0.1 μM, preferably at least about 0.01 μM or better, and most typically and preferably, 0.001 μM or better.


[0166] Defining Polypeptides by Immunoreactivity


[0167] The polypeptides of the invention listed in the sequence listing herein, as well as novel variants derived therefrom, which are also encompassed within the present invention, provide a variety of structural features which can be recognized, e.g., in immunological assays. The generation of antisera which specifically binds the polypeptides of the invention, as well as the polypeptides which are bound by such antisera, are a feature of the invention.


[0168] The invention includes polypeptides that specifically bind to or that are specifically immunoreactive with an antibody or antisera generated against an immunogen comprising an amino acid sequence, e.g., selected from one or more of SEQ ID NO: 31 to SEQ ID NO: 60. To eliminate cross-reactivity with non related polypeptides, the antibody or antisera can be subtracted with unrelated polypeptides or proteins.


[0169] In one typical format, the immunological assay uses a polyclonal antiserum which was raised against one or more polypeptides comprising one or more of the sequences corresponding to one or more polypeptides of the invention, such as SEQ ID NO: 31 to SEQ ID NO: 60, or a subsequence thereof (e.g., a substantial subsequence including at least about 30% of the full length sequence provided). Such an antigenic peptide or polypeptide is referred to as an “immunogenic polypeptide.” The resulting antisera are optionally selected to have low cross-reactivity against unrelated polypeptides, e.g., BSA, and any such cross-reactivity can be removed by immunoabsorption with one or more of the unrelated polypeptides, or protein preparations, prior to use of the polyclonal antiserum in the immunoassay.


[0170] In order to produce antisera for use in an immunoassay, one or more of the immunogenic polypeptides is produced and purified as described herein. For example, a recombinant protein can be produced in a bacterial host. An inbred strain of mice (used in this assay because results are more reproducible due to the virtual genetic identity of the mice) can be immunized with the immunogenic protein(s) in combination with a standard adjuvant, such as Freund's adjuvant, and a standard mouse immunization protocol (see, Harlow and Lane (1988) Antibodies, A Laboratory Manual, Cold Spring Harbor Publications, New York, for a standard description of antibody generation, immunoassay formats and conditions that can be used to determine specific immunoreactivity). Alternatively, one or more synthetic or recombinant polypeptides derived from the sequences disclosed herein can be conjugated to a carrier protein and used as an immunogen.


[0171] Polyclonal sera are collected and titered against the immunogenic polypeptide in an immunoassay, for example, a solid phase immunoassay with one or more of the immunogenic proteins immobilized on a solid support. Polyclonal antisera with a titer of 10^6 or greater are selected, pooled and subtracted with the control unrelated polypeptides to produce subtracted pooled titered polyclonal antisera.


[0172] If desired, the subtracted pooled titered polyclonal antisera are tested for cross reactivity against any unrelated polypeptides. Discriminatory binding conditions are determined for the subtracted titered polyclonal antisera which result in at least about a 5-fold to 10-fold higher signal to noise ratio for binding of the titered polyclonal antisera to the immunogenic polypeptide of interest as compared to binding to the unrelated polypeptide. That is, the stringency of the binding reaction can be adjusted by the addition of non-specific competitors such as albumin or non-fat dry milk, or by adjusting salt conditions, temperature, and/or the like. These binding conditions can be used in subsequent assays for determining whether a test polypeptide is specifically bound by the pooled subtracted polyclonal antisera. In particular, a test polypeptide which shows at least a 2-5× (i.e., 2-fold to 5-fold), and preferably 10× or higher, signal to noise ratio relative to the control polypeptides under discriminatory binding conditions, and at least about half the signal to noise ratio of the immunogenic polypeptide(s) (and typically 90% or more of the signal to noise ratio shown for the immunogenic peptide), shares substantial structural similarity with the immunogenic polypeptide as compared to unrelated polypeptides, and is, therefore, a polypeptide of the invention.
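The two signal to noise criteria in paragraph [0172] can be expressed as a simple check. The following Python sketch is illustrative only: the function name, parameter names, and default thresholds (2-fold over control, half of the immunogen signal) are assumptions chosen from the minimum values stated above, and the numeric inputs in the usage lines are hypothetical.

```python
def meets_binding_criteria(test_sn, control_sn, immunogen_sn,
                           min_fold_over_control=2.0,
                           min_fraction_of_immunogen=0.5):
    """Apply the two signal-to-noise criteria to a test polypeptide.

    test_sn, control_sn and immunogen_sn are signal to noise ratios measured
    under the same discriminatory binding conditions. The defaults are the
    minimum values named in the text; stricter values (e.g., 10.0 and 0.9)
    can be passed instead.
    """
    if control_sn <= 0 or immunogen_sn <= 0:
        raise ValueError("signal to noise ratios must be positive")
    exceeds_control = test_sn >= min_fold_over_control * control_sn
    near_immunogen = test_sn >= min_fraction_of_immunogen * immunogen_sn
    return exceeds_control and near_immunogen


# Hypothetical measurements: control S/N of 2, immunogen S/N of 20.
print(meets_binding_criteria(12.0, 2.0, 20.0))  # both criteria satisfied
print(meets_binding_criteria(6.0, 2.0, 20.0))   # below half of the immunogen
```

Passing min_fold_over_control=10.0 and min_fraction_of_immunogen=0.9 applies the preferred and typical values instead of the minimum ones.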


[0173] Such methods are also useful for detecting an unknown test protein or polypeptide, which is also specifically bound by the antisera under conditions as described above. In one format, the immunogenic polypeptide(s) are immobilized to a solid support which is exposed to the subtracted pooled antisera. Test proteins are added to the assay to compete for binding to the pooled subtracted antisera. The ability of the test protein(s) to compete for binding to the pooled subtracted antisera as compared to the immobilized protein(s) is compared to the ability of the immunogenic polypeptide(s) added to the assay to compete for binding (the immunogenic polypeptides compete effectively with the immobilized immunogenic polypeptides for binding to the pooled antisera). The percent cross-reactivity for the test proteins is calculated, using standard calculations.


[0174] In a parallel assay, the ability of the control proteins to compete for binding to the pooled subtracted antisera is determined as compared to the ability of the immunogenic polypeptide(s) to compete for binding to the antisera. Again, the percent cross-reactivity for the control polypeptides is calculated, using standard calculations. Where the percent cross-reactivity is at least 5-10× as high for the test polypeptides as for the control polypeptides, the test polypeptides are said to specifically bind the pooled subtracted antisera.


[0175] In general, the immunoabsorbed and pooled antisera can be used in a competitive binding immunoassay as described herein to compare any test polypeptide to the immunogenic polypeptide(s). In order to make this comparison, the two polypeptides are each assayed at a wide range of concentrations and the amount of each polypeptide required to inhibit 50% of the binding of the subtracted antisera to the immobilized protein is determined using standard techniques. If the amount of the test polypeptide required to inhibit 50% of the binding of the subtracted antisera to the immobilized protein is less than twice the amount of the immunogenic polypeptide that is required, then the test polypeptide is said to specifically bind to an antibody generated to the immunogenic protein; provided the amount is at least about 5-10× as high as for a control polypeptide.
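The 50% inhibition comparison in paragraph [0175] can be sketched as follows. This Python example is an assumption-laden illustration: the text refers only to "standard techniques," so the log-linear interpolation used here to estimate the 50% point is one simple choice among many, the function names are hypothetical, and the dose-response values are invented for demonstration. The further proviso involving a control polypeptide is not modeled.

```python
import math


def ic50(concentrations, percent_binding):
    """Estimate the concentration giving 50% binding.

    concentrations: increasing, positive competitor concentrations.
    percent_binding: antisera binding remaining at each concentration
    (100 = no inhibition). Interpolates linearly on a log10 concentration
    scale between the two points that bracket 50%.
    """
    points = list(zip(concentrations, percent_binding))
    for (c_lo, b_lo), (c_hi, b_hi) in zip(points, points[1:]):
        if b_lo >= 50.0 >= b_hi and b_lo != b_hi:
            frac = (b_lo - 50.0) / (b_lo - b_hi)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    raise ValueError("binding does not cross 50% in the tested range")


def passes_ic50_rule(test_ic50, immunogen_ic50):
    """True if the test polypeptide needs less than twice the immunogen amount."""
    return test_ic50 < 2.0 * immunogen_ic50


# Hypothetical dose-response data (arbitrary concentration units).
conc = [1.0, 10.0, 100.0, 1000.0]
immunogen = ic50(conc, [80.0, 50.0, 20.0, 5.0])
test_a = ic50(conc, [95.0, 60.0, 25.0, 8.0])
test_b = ic50(conc, [90.0, 70.0, 30.0, 10.0])

print(passes_ic50_rule(test_a, immunogen))
print(passes_ic50_rule(test_b, immunogen))
```

In practice a fitted dose-response model would usually replace the interpolation; the decision rule itself is unchanged.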


[0176] As an additional determination of specificity, the pooled antisera can be optionally fully immunosorbed with the immunogenic polypeptide(s) (rather than the control polypeptides) until little or no binding of the resulting immunogenic polypeptide subtracted pooled antisera to the immunogenic polypeptide(s) used in the immunosorption is detectable. These fully immunosorbed antisera are then tested for reactivity with the test polypeptide. If little or no reactivity is observed (i.e., no more than 2× the signal to noise ratio observed for binding of the fully immunosorbed antisera to the immunogenic polypeptide), then the test polypeptide can be deemed specifically bound by the antisera elicited by the immunogenic protein.


[0177] Predicting Plant Growth Traits


[0178] The presence of sequences of the invention, or the amount of their expression products, can be predictive of plant growth traits before they actually become apparent. Detection of polynucleotide sequences of the invention in plant cells can predict plant growth traits, such as root length or leaf mass, well before the maturity of a plant. The presence of particular combinations of polynucleotide sequences of the invention can predict one plant growth trait, e.g., large root mass, while a different combination of polynucleotides of the invention can predict another plant growth trait, e.g., short stalk length. In addition, the amount of expression products, such as the quantity of mRNAs transcribed from polynucleotides of the invention, or amount of translated polypeptides of the invention, can be predictive of plant growth traits. The presence of sequences of the invention, combinations of the sequences, and amount of expression products can predict plant growth traits, e.g., in cultured plant cells and immature plants. Such predictive information can be useful in, e.g., rapid screening of desirable plants in culture or cultivation.


[0179] The probes and marker sets of the invention are favorably employed in methods for predicting plant growth traits in an individual specimen, such as cultured plant cells. Nucleic acids of a marker set or individual probes including one or more polynucleotides of the invention, as described, e.g., in the section entitled “Probes,” are hybridized, e.g., as an array, to a DNA or RNA sample from a subject cell or tissue sample. Upon hybridization of the sample to at least a subset of the probes, a signal is detected corresponding to at least one nucleic acid or to expression or activity of an expression product correlatable to a plant growth trait. When expression is detected, the evaluation can be made on a qualitative basis, that is, detecting whether or not an expression product (or multiple expression products) is expressed in a subject cell or tissue sample. Alternatively, the evaluation can be quantitative, to determine whether levels are adequate to provide the desired trait.


[0180] While a variety of biological samples reflective of a growth trait can be employed, the specimen is usually selected for ease of acquisition, to minimize invasiveness of the collection procedure to the subject, or to focus on the tissue of interest. Thus, in the context of individual whole plants, individual leaves, roots or branches can be preferred samples, and can be obtained by simple cutting. In the case of recombinant inbred lines (RILs), entire individual plants can be sampled knowing they are representative of other available individuals of the line.


[0181] For example, a marker set including a plurality (e.g., several or all of SEQ ID NO: 1 through SEQ ID NO: 30 or of SEQ ID NO: 61 through SEQ ID NO: 403) of the polynucleotides of the invention, can be hybridized individually, or as an array, to an RNA or cDNA sample produced, e.g., by a reverse transcription-polymerase chain reaction (RT-PCR), from a subject RNA sample. Typically, prior to hybridization of the probes or array to a subject or “test” specimen, the probe or array is validated and/or calibrated by comparing samples obtained from classes of subjects known to differ with respect to their growth traits. For example, specimens from individuals displaying a high root mass trait are compared to subjects that display low root mass relative to the general population of individual plants. In one embodiment, for example, nucleic acid SEQ ID NO: 397 through SEQ ID NO: 403 have been associated with enhanced root growth in Arabidopsis plants exposed to environments containing either ammonium sulfate or ammonium nitrate fertilizer. See copending provisional application 60/344,499, Identification of Genes Controlling Complex Traits, by Benjamin A. Bowen, et al., filed Dec. 28, 2001.


[0182] Alternatively, a marker set including a plurality of antibodies, or other binding proteins, specific for a polypeptide of the invention, e.g., SEQ ID NO: 31-SEQ ID NO: 60, is employed as individual probes or marker sets to evaluate expression of proteins, e.g., corresponding to SEQ ID NO: 31-SEQ ID NO: 60 in a cell or tissue specimen. In this case, rather than, or in addition to, preparing RNA from a sample, proteins are recovered and exposed to the probe or marker set of antibodies, in liquid phase or with either the target or antibody immobilized on a solid substrate, such as a solid phase array.


[0183] Patterns of expression that correlate to a particular growth trait are detected by hybridization to one or more probes. In some embodiments, a single probe with a high predictive value is favored, e.g., for ease of handling and cost containment. In other embodiments multiple probes, e.g., the entire marker set, are preferred, e.g., to increase sensitivity or diagnostic or prognostic value. Optimal probes and marker sets are readily ascertained on an empirical basis.


[0184] Alternatively, the invention provides an oligonucleotide or polynucleotide probe that detects sequence polymorphisms rather than expression differences between specimens from individuals with different growth traits. Polymorphisms at a nucleotide level can correspond either directly or indirectly to the gene of interest underlying the growth trait, and can be detected in any of several ways, for example, as restriction fragment length polymorphisms, by allele specific hybridization, as amplification length polymorphisms, and the like.


[0185] For example, oligonucleotide probes including conservative variants of a polynucleotide sequence can be selected which correspond to polymorphic variations in a target sequence. For example, a probe pair incorporating a single variant nucleotide can be designed to hybridize under allele specific hybridization conditions to allelic target sequences in which one allele is correlated to a fast growth trait and the other allele indicates a relatively slow growth trait. For example, probe sequences are selected from among SEQ ID NO: 1-SEQ ID NO: 30 (or other polynucleotides of the invention) and variants thereof. In some instances, for example, where the cDNA or chromosomal segment has been sequenced and a particular nucleotide polymorphism is associated with a high growth trait, the probes can be chosen to detect the nucleotide polymorphism, e.g., by allele specific hybridization.
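As a rough illustration of a probe pair incorporating a single variant nucleotide, the Python sketch below builds two probes that are complementary to the two alleles of a target and differ at one position. The target sequence, the polymorphic position, and the function names are hypothetical and are not taken from the sequence listing; probe length, melting temperature, and hybridization conditions, which matter in practice, are not considered.

```python
# Map each DNA base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def allele_probe_pair(target, snp_index, alt_base, flank=10):
    """Return (reference_probe, variant_probe) centered on a polymorphism.

    target: sense-strand sequence containing the reference allele.
    snp_index: 0-based index of the polymorphic base in target.
    alt_base: base present in the variant allele.
    flank: number of bases kept on each side of the polymorphic base.
    """
    if not 0 <= snp_index < len(target):
        raise ValueError("snp_index is outside the target sequence")
    start = max(0, snp_index - flank)
    end = min(len(target), snp_index + flank + 1)
    ref_window = target[start:end]
    offset = snp_index - start
    alt_window = ref_window[:offset] + alt_base + ref_window[offset + 1:]
    return reverse_complement(ref_window), reverse_complement(alt_window)


# Hypothetical 21-base target with a T-to-C polymorphism at index 10.
target = "ATGGCGTACGTTAGCCTAGGA"
ref_probe, alt_probe = allele_probe_pair(target, 10, "C", flank=5)
print(ref_probe)
print(alt_probe)
```

The two probes are identical except at the central position, which is what allows allele specific hybridization conditions to discriminate between them.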


[0186] Modulating Plant Growth Traits


[0187] The invention also provides experimental methods for modulating plant growth traits in vitro and in vivo. Tissue culture and plant models useful for elucidating the molecular mechanisms underlying growth traits as well as for screening and evaluating potential growth control targets are produced by modulating expression or activity of polypeptides (e.g., represented by SEQ ID NO: 31-SEQ ID NO: 60, and conservative variants thereof) encoded by the nucleic acids of the invention.


[0188] For example, plant cells in culture can be transfected with a nucleic acid, e.g., comprising a polynucleotide sequence selected from SEQ ID NO: 1 through SEQ ID NO: 30, to produce cells that express a polypeptide involved in plant growth. It will be understood that, where exogenous polynucleotide sequences are introduced into cells, tissues or individual plants, the polynucleotide sequences can be selected from among SEQ ID NO: 1-30, conservative variants thereof, polynucleotide sequences encoding SEQ ID NO: 31-60, or other homologous polynucleotide sequences such as polynucleotide sequences that hybridize thereto, or polynucleotides that are at least 70% (or at least about 75%, about 80%, about 85%, about 90%, or at least about 95%) identical thereto. In some cases, it is preferable to link the polynucleotide sequence of interest to the regulatory sequences with which it is typically associated in vivo in nature. Alternatively, in cases where constitutive expression at levels that are in excess of those found in nature is desired, exogenous promoters and enhancers can be employed, as described in detail in the section entitled “Vectors, Promoters and Expression Systems.”
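The percent identity thresholds in paragraph [0188] can be illustrated with a short Python sketch. It assumes the two sequences have already been aligned (with "-" marking gaps) by an alignment method of the reader's choice, and it counts identity over all alignment columns, which is only one of several conventions in use. The function names and example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences.

    Gaps are written as '-'. Columns where both sequences have a gap are
    skipped; a gap opposite a base counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns in the alignment")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a, aligned_b, threshold=70.0):
    """True if the aligned sequences are at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical aligned fragments.
print(percent_identity("ATGGCGTACG", "ATGGCTTACG"))
print(meets_identity_threshold("ATG-CGTA", "ATGGCGTT"))
```

The threshold argument can be raised to 75, 80, 85, 90, or 95 to apply the stricter levels listed above.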


[0189] Expression and/or activity of the gene or polypeptide can also be modulated in a negative manner, that is, suppressed. For example, knock out mutations can be produced by homologous recombination of an exogenous gene homologue, e.g., bearing a stop codon, and/or insertion of, e.g., a selectable marker, that disrupts production of an intact transcript. Alternatively, vectors incorporating the sequence of interest in the antisense orientation can be introduced to suppress translation at a post-transcriptional level.


[0190] Alternatively, cell lines, e.g., plant or bacterial cells, that express a polypeptide of the invention, e.g., corresponding to one or more of SEQ ID NO: 31-SEQ ID NO: 60, or a subsequence thereof, into which vectors have been transduced that randomly activate expression of associated endogenous sequences upon integration can be isolated. Such vectors have been described, e.g., by Harrington et al. “Creation of genome-wide protein expression libraries using random activation of gene expression.” Nature Biotechnology 19: 440-445, which is incorporated herein by reference. Typically, the vector is constructed with a strong exogenous promoter linked to an exon and an unpaired splice donor site. Upon integration into the genome, splicing with a proximal splice-acceptor site occurs, activating expression of a chimeric transcript encoding at least a portion of the endogenous gene. Cells expressing a polypeptide of interest e.g., SEQ ID NO: 31-SEQ ID NO: 60 can be selected by well known methods, including those based on phenotypic screening methods, antibody or receptor binding, RNA analytical methods, e.g., RT-PCR, northern analysis, MPSS, and the like. By preference, the screening is performed in a high-throughput format.


[0191] The above-described methods for producing cell culture or plant cultivation model systems can be adapted for use in the screening of growth modulating environmental factors, e.g., aimed at optimizing application of water, fertilizer or herbicides. For example, it is desirable to select promoters and enhancers that are modulated in response to nutrients or plant hormones.


[0192] Following introduction of environmental factors, e.g., application of fertilizers, herbicides, or other molecules that affect plant growth traits, altered expression or activity can be detected at the RNA or protein level. Detection of altered levels of RNA is most conveniently accomplished by such methods as RT-PCR, MPSS, or northern analysis. Protein expression is conveniently monitored using, e.g., antibody-based detection methods, such as ELISAs, immunoprecipitations, or immunohistochemical methods including western analysis. In each of these procedures, the sample including the expressed protein of interest is reacted with an antibody (e.g., monoclonal antibody) or antiserum specific for the protein of interest. Methods for generating specific antibodies are well known and further details are provided above in the section entitled “Antibodies.”
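
As one hedged illustration of how "altered expression" might be scored once RNA or protein levels have been measured, the following Python sketch compares a treated measurement with an untreated control using a log2 fold change. The two-fold cutoff and the function names are assumptions chosen for the example; the disclosure does not specify a particular statistic or threshold.

```python
import math


def log2_fold_change(treated: float, control: float) -> float:
    """log2(treated / control) for two positive expression measurements."""
    if treated <= 0 or control <= 0:
        raise ValueError("expression values must be positive")
    return math.log2(treated / control)


def is_altered(treated: float, control: float,
               min_abs_log2_fc: float = 1.0) -> bool:
    """Flag a change of at least two-fold in either direction.

    Assumption: the cutoff of 1.0 (two-fold) is illustrative only.
    """
    return abs(log2_fold_change(treated, control)) >= min_abs_log2_fc
```

In practice, replicate measurements and a statistical test would normally accompany such a cutoff.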


[0193] The cell culture models can be used to identify chemical agents capable of favorably regulating the expression or activity of a polypeptide of interest, e.g., a polypeptide selected from among SEQ ID NO: 31-60, in a cell culture system as described above. Most typically, this involves exposing the cells to a chemical or biological composition, e.g., a small organic molecule, or biological macromolecule such as a protein, e.g., an antibody, binding protein, or macromolecular cofactor. Following exposure to the one or more compositions, for example, members of a chemical or biological composition library, such as a combinatorial chemical library, a library of peptide or polypeptide products expressed from a library of nucleic acids, an antibody (or other polypeptide) display library such as a phage display library, etc., modulation of the polypeptide of interest is detected. As discussed above, modulation of the polypeptide can be detected as an alteration in expression at the level of transcription or translation, or as an alteration in the activity of the encoded protein or polypeptide. In some instances, it is desirable to monitor expression or activity of multiple expression products in the same cell or cell line. The monitored expression products can be exogenous, i.e., introduced as described above, or endogenous, such as transcripts or polypeptides whose expression or activity is dependent on the amount or activity of a polypeptide of interest.


[0194] In cases where the expression or activity of multiple products is of interest, or where the effect of a plurality of different compounds on the expression or activity of one or more expression products is to be evaluated, e.g., in screening for growth modulating agents as described above, the monitoring assay is conveniently performed in an array. For example, cells can be arrayed by aliquoting into the wells of a multiwell plate, e.g., a 96, 384, 1536, or other convenient format selected according to available equipment. The arrayed cells can be exposed to members of a composition library, and the cells sampled and monitored by, e.g., FACS, immunohistochemistry, ELISA, etc. Alternatively, nucleic acids or proteins can be prepared from the arrayed cells, in a manual, semi-automatic or automated procedure, and the products arranged in a liquid or solid phase array for evaluation. Additional details regarding arrays are provided above in the section entitled “Marker Sets.” Alternative high throughput processing methods, such as microfluidic devices, are also available, and can favorably be employed in the context of monitoring modulation of expression products, e.g., corresponding to SEQ ID NO: 1-403.
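
The arraying step described above can be sketched, for illustration, as a mapping from library members to named wells. The following Python example assumes the conventional row and column layouts for 96-, 384- and 1536-well plates (8 x 12, 16 x 24 and 32 x 48) and conventional well names such as "A1". The function names and the row-by-row filling order are assumptions made for the example, not part of the disclosure.

```python
# Rows x columns for common multiwell formats (conventional layouts).
PLATE_LAYOUTS = {96: (8, 12), 384: (16, 24), 1536: (32, 48)}

_LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def row_label(row_index: int) -> str:
    """Row 0 -> 'A', 25 -> 'Z', 26 -> 'AA', 31 -> 'AF' (enough for 32 rows)."""
    if row_index < 26:
        return _LETTERS[row_index]
    return _LETTERS[row_index // 26 - 1] + _LETTERS[row_index % 26]


def well_name(well_index: int, plate_size: int) -> str:
    """Name of the well at a zero-based index, filling the plate row by row."""
    rows, cols = PLATE_LAYOUTS[plate_size]
    if not 0 <= well_index < rows * cols:
        raise IndexError("well index outside plate")
    row, col = divmod(well_index, cols)
    return f"{row_label(row)}{col + 1}"


def assign_compounds(compounds: list, plate_size: int) -> dict:
    """Map each library member to one well; raises if the library is too large."""
    rows, cols = PLATE_LAYOUTS[plate_size]
    if len(compounds) > rows * cols:
        raise ValueError("library does not fit on one plate")
    return {well_name(i, plate_size): c for i, c in enumerate(compounds)}
```

A plate map of this kind gives each measurement a stable key (plate and well) under which results can later be recorded.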


[0195] Typically, when processing and evaluating large numbers of samples, e.g., in a high throughput assay, data relating to expression or activity is recorded in a database. Typically, the database includes character strings representing the data, recorded on a computer or in a computer readable medium.
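
One possible way of recording such data is sketched below in Python using the standard-library sqlite3 module. The table layout, column names and sample identifiers are hypothetical and are chosen only to show expression values being stored and retrieved as database records; the disclosure does not prescribe a schema or a database product.

```python
import sqlite3


def open_results_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a results database with a single expression table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS expression ("
        " sample_id TEXT NOT NULL,"   # e.g., a plate/well identifier
        " seq_id TEXT NOT NULL,"      # e.g., 'SEQ ID NO: 1'
        " assay TEXT NOT NULL,"       # e.g., 'RT-PCR', 'ELISA'
        " value REAL NOT NULL)"       # measured expression or activity
    )
    return conn


def record_measurement(conn, sample_id, seq_id, assay, value) -> None:
    """Store one measurement; parameters are bound, not string-formatted."""
    conn.execute(
        "INSERT INTO expression VALUES (?, ?, ?, ?)",
        (sample_id, seq_id, assay, value),
    )
    conn.commit()


def measurements_for(conn, seq_id):
    """Return (sample_id, value) pairs for one sequence, ordered by sample."""
    cursor = conn.execute(
        "SELECT sample_id, value FROM expression"
        " WHERE seq_id = ? ORDER BY sample_id",
        (seq_id,),
    )
    return cursor.fetchall()
```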


[0196] In addition to tissue culture systems, transgenic plants can be produced which have integrated one or more of the polynucleotide sequences of the invention, e.g., selected from SEQ ID NO: 1 to SEQ ID NO: 30. In this context, commonly used experimental plants include, e.g., Arabidopsis and tobacco.


[0197] Such transgenic plant models are useful, in addition to the cultured cells discussed above, for the evaluation of chemical agents suitable for the modulation of plant growth traits. Transgenic plant models, e.g., expressing a polypeptide selected from SEQ ID NO: 31-60, are suitable for evaluating fertilizers, hormones and herbicides useful in modulation of plant growth. For example, following administration of a particular herbicide to a transgenic plant expressing a polypeptide of the invention, leaf growth can be monitored. Monitoring can also involve detecting altered expression or activity of an expression product corresponding to one or more of SEQ ID NO: 1-403 as discussed above.


[0198] Kits and Reagents


[0199] Certain embodiments of the present invention can be optionally provided to a user as a kit. For example, a kit of the invention can contain one or more of the nucleic acids, polypeptides, antibodies, and/or cell lines described herein. Most often, the kit contains a diagnostic nucleic acid or polypeptide, e.g., antibody, probe set, e.g., as a cDNA microarray packaged in a suitable container, or other nucleic acid such as one or more expression vectors. The kit typically further comprises one or more additional reagents, e.g., substrates, labels, primers, for labeling expression products, tubes and/or other accessories, reagents for collecting samples, buffers, hybridization chambers, cover slips, etc. The kit optionally further comprises an instruction set or user manual detailing preferred methods of using the kit components for discovery or application of gene sets. When used according to the instructions, the kit can be used, e.g., for evaluating expression or polymorphisms in a plant sample, e.g., for evaluating growth traits.


[0200] Digital Systems


[0201] The present invention provides digital systems, e.g., computers, computer readable media, and integrated systems, comprising character strings corresponding to the sequence information herein for the polypeptides and nucleic acids herein, including, e.g., those sequences listed herein and the various silent substitutions and conservative variations thereof. Integrated systems can further include, e.g., gene synthesis equipment for making genes corresponding to the character strings.
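
As a minimal sketch of how sequence information can be held as character strings in such a system, the following Python function reads records in the widely used FASTA text format into a dictionary keyed by record name. The choice of FASTA and the function name are assumptions made for illustration; the disclosure does not specify a file format.

```python
def parse_fasta(text: str) -> dict:
    """Parse FASTA-formatted text into {record name: sequence string}.

    Header lines start with '>'; the following lines, up to the next header,
    are joined into one sequence string with internal spaces removed.
    """
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            if name in records:
                raise ValueError(f"duplicate record name: {name}")
            records[name] = []
        elif name is None:
            raise ValueError("sequence data found before the first header")
        else:
            records[name].append(line.replace(" ", ""))
    return {n: "".join(parts) for n, parts in records.items()}
```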


[0202] Various methods known in the art can be used to detect homology or similarity between different character strings, or can be used to perform other desirable functions such as to control output files, provide the basis for making presentations of information including the sequences, and the like. Examples include BLAST, discussed supra. Computer systems of the invention can include such programs, e.g., in conjunction with one or more data file or data base comprising a sequence as noted herein.


[0203] Thus, different types of homology and similarity of various stringency and length can be detected and recognized in the integrated systems herein. For example, many homology determination methods have been designed for comparative analysis of sequences of biopolymers, for spell-checking in word processing, and for data retrieval from various databases. With an understanding of double-helix pair-wise complement interactions among 4 principal nucleobases in natural polynucleotides, models that simulate annealing of complementary homologous polynucleotide strings can also be used as a foundation of sequence alignment or other operations typically performed on the character strings corresponding to the sequences herein (e.g., word-processing manipulations, construction of figures comprising sequence or subsequence character strings, output tables, etc.).
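
The pair-wise complement interactions mentioned above can be modeled directly on character strings. The following Python sketch, offered only as an illustration, computes the reverse complement of a DNA string and tests whether two strands are perfectly complementary. It assumes sequences limited to the four letters A, C, G and T; mismatches, ambiguity codes and hybridization stringency are deliberately not modeled, and the function names are hypothetical.

```python
# Watson-Crick pairing for the four principal DNA bases (both letter cases).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def perfectly_complementary(strand_a: str, strand_b: str) -> bool:
    """True when strand_b, read 5' to 3', pairs base-for-base with strand_a."""
    return strand_a.upper() == reverse_complement(strand_b).upper()
```

The same operation yields the antisense orientation of a sequence when it is held as a character string.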


[0204] Thus, standard desktop applications such as word processing software (e.g., Microsoft Word™ or Corel WordPerfect™) and database software (e.g., spreadsheet software such as Microsoft Excel™, Corel Quattro Pro™, or database programs such as Microsoft Access™ or Paradox™) can be adapted to the present invention by inputting a character string corresponding to one or more polynucleotides and polypeptides of the invention (either nucleic acids or proteins, or both). For example, a system of the invention can include the foregoing software having the appropriate character string information, e.g., used in conjunction with a user interface (e.g., a GUI in a standard operating system such as a Windows, Macintosh or LINUX system) to manipulate strings of characters corresponding to the sequences herein. As noted, specialized alignment programs such as BLAST can also be incorporated into the systems of the invention for alignment of nucleic acids or proteins (or corresponding character strings).


[0205] Systems in the present invention typically include a digital computer with data sets entered into the software system comprising any of the sequences herein. The computer can be, e.g., a PC (Intel x86 or Pentium chip-compatible DOS™, OS2™, WINDOWS™, WINDOWS NT™, WINDOWS95™, WINDOWS98™, or LINUX based machine), a MACINTOSH™, Power PC, or a UNIX based (e.g., SUN™ work station) machine, or other commercially common computer which is known to one of skill. Software for aligning or otherwise manipulating sequences is available, or can easily be constructed by one of skill using a standard programming language such as Visual Basic, Fortran, Basic, Java, or the like.


[0206] Any controller or computer optionally includes a monitor which is often a cathode ray tube (“CRT”) display, a flat panel display (e.g., active matrix liquid crystal display, liquid crystal display), or others. Computer circuitry is often placed in a box which includes numerous integrated circuit chips, such as a microprocessor, memory, interface circuits, and others. The box also optionally includes a hard disk drive, a floppy disk drive, a high capacity removable drive such as a writeable CD-ROM, and other common peripheral elements. Inputting devices such as a keyboard or mouse optionally provide for input from a user and for user selection of sequences to be compared or otherwise manipulated in the relevant computer system.


[0207] The computer typically includes appropriate software for receiving user instructions, either in the form of user input into a set of parameter fields, e.g., in a GUI, or in the form of preprogrammed instructions, e.g., preprogrammed for a variety of different specific operations. The software then converts these instructions to appropriate language for instructing the operation of the fluid direction and transport controller to carry out the desired operation.


[0208] The software can also include output elements for controlling nucleic acid synthesis (e.g., based upon a sequence or an alignment of sequences herein) or other operations.


[0209] General Molecular Techniques


[0210] In the context of the invention, nucleic acids and/or proteins are manipulated according to well-known molecular biology methods. Detailed protocols for numerous such procedures are described in, e.g., Ausubel et al. Current Protocols in Molecular Biology (supplemented through 2000) John Wiley & Sons, New York (“Ausubel”); Sambrook et al. Molecular Cloning—A Laboratory Manual (2nd Ed.), Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., 1989 (“Sambrook”); and Berger and Kimmel Guide to Molecular Cloning Techniques, Methods in Enzymology volume 152 Academic Press, Inc., San Diego, Calif. (“Berger”).


[0211] In addition to the above references, protocols for in vitro amplification techniques, such as the polymerase chain reaction (PCR), the ligase chain reaction (LCR), Qβ-replicase amplification, and other RNA polymerase mediated techniques (e.g., NASBA), useful e.g., for amplifying cDNA probes of the invention, are found in Mullis et al. (1987) U.S. Pat. No. 4,683,202; PCR Protocols A Guide to Methods and Applications (Innis et al. eds) Academic Press Inc. San Diego, Calif. (1990) (“Innis”); Arnheim and Levinson (1990) C&EN 36; The Journal Of NIH Research (1991) 3:81; Kwoh et al. (1989) Proc Natl Acad Sci USA 86, 1173; Guatelli et al. (1990) Proc Natl Acad Sci USA 87:1874; Lomell et al. (1989) J Clin Chem 35:1826; Landegren et al. (1988) Science 241:1077; Van Brunt (1990) Biotechnology 8:291; Wu and Wallace (1989) Gene 4: 560; Barringer et al. (1990) Gene 89:117, and Sooknanan and Malek (1995) Biotechnology 13:563. Additional methods, useful for cloning nucleic acids in the context of the present invention, include Wallace et al. U.S. Pat. No. 5,426,039. Improved methods of amplifying large nucleic acids by PCR are summarized in Cheng et al. (1994) Nature 369:684 and the references therein.


[0212] Certain polynucleotides of the invention, e.g., SEQ ID NO: 61-SEQ ID NO: 403, can be synthesized utilizing various solid-phase strategies involving mononucleotide- and/or trinucleotide-based phosphoramidite coupling chemistry. For example, nucleic acid sequences can be synthesized by the sequential addition of activated monomers and/or trimers to an elongating polynucleotide chain. See e.g., Caruthers, M. H. et al. (1992) Meth Enzymol 211:3. In lieu of synthesizing the desired sequences, essentially any nucleic acid can be custom ordered from any of a variety of commercial sources, such as The Midland Certified Reagent Company (mcrc@oligos.com), The Great American Gene Company (www.genco.com), ExpressGen, Inc. (www.expressgen.com), Operon Technologies, Inc. (www.operon.com), and many others.


[0213] Similarly, commercial sources for nucleic acid and protein microarrays are available, and include, e.g., Affymetrix, Santa Clara, Calif. (http://www.affymetrix.com/); Agilent, Palo Alto, Calif. (http://www.agilent.com); Zyomyx, Hayward, Calif. (http://www.zyomyx.com); and Ciphergen Biosciences, Fremont, Calif. (http://www.ciphergen.com/).



EXAMPLES

[0214] The following examples are offered to illustrate, but not to limit, the claimed invention.



Example 1


Growth Gene Combinations in Different Environments

[0215] Genes associated with a particular plant growth trait, such as root length, can vary depending on the environment in which the plant is grown. For example, as described in “Identification of Genes Controlling Complex Traits” by Benjamin A. Bowen, et al., filed Dec. 28, 2001 (Attorney Docket No. 37-000800US), incorporated herein by reference, gene expression was determined by massively parallel signature sequence (MPSS) analysis for Arabidopsis plants having long roots and short roots in ammonium nitrate fertilizer. FIG. 1 shows differential gene expression between the plants having long and short roots. A similar analysis was carried out comparing gene expression in long root and short root Arabidopsis plants grown in ammonium sulfate fertilizer. In the ammonium nitrate environment, 56 genes were found to have differential expression between long and short root plants and also to be correlated to root growth by quantitative trait locus (QTL) analysis. In the ammonium sulfate environment, 80 genes were found to have differential expression between long and short root plants and also to be correlated to root growth by QTL analysis. Only 7 genes were found to be correlated in the same direction in both environments. The combination of genes associated with root length was considerably different depending on the nutritional environment. Sequences of the present invention are similarly expressed in unique combinations depending on environmental factors.
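
The comparison between environments described in this example amounts to intersecting two gene sets while also requiring that the direction of correlation agree. A minimal Python sketch of that operation follows. The gene identifiers in the usage note are hypothetical placeholders rather than data from the study, and the representation of direction as +1 or -1 is an assumption made for the example.

```python
def shared_same_direction(env_a: dict, env_b: dict) -> set:
    """Genes present in both environments with the same correlation direction.

    Each argument maps a gene identifier to +1 (positively correlated with
    the trait) or -1 (negatively correlated) in one environment.
    """
    return {gene for gene, direction in env_a.items()
            if env_b.get(gene) == direction}


def shared_any_direction(env_a: dict, env_b: dict) -> set:
    """Genes present in both environments regardless of direction."""
    return set(env_a) & set(env_b)
```

For instance, with hypothetical inputs {"gene_A": 1, "gene_B": -1, "gene_C": 1} for one fertilizer and {"gene_A": 1, "gene_B": 1, "gene_D": -1} for the other, only gene_A is shared in the same direction, while gene_A and gene_B are shared when direction is ignored.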



Example 2


Genes Associated with Different Plant Growth Traits

[0216] The combination of genes associated with one plant growth trait, such as root length, is often different from the combination of genes associated with another growth trait, such as aerial mass. FIG. 2 shows Arabidopsis QTL plots for three plant growth traits (root length, aerial mass, and root mass). Although there is some overlap of the plots for each trait, QTL analysis would identify a unique combination of differentially expressed genes associated with each trait. For example, differential expression analyses were carried out on long root and short root plants grown with ammonium nitrate fertilizer. Forty-six genes were found to have differential expression between long and short root plants and also to be correlated to root growth by quantitative trait locus (QTL) analysis. The combination of sequences of the present invention also varies uniquely with different plant growth traits.


[0217] While the foregoing invention has been described in some detail for purposes of clarity and understanding, it will be clear to one skilled in the art from a reading of this disclosure that various changes in form and detail can be made without departing from the true scope of the invention. For example, the sequences, techniques and apparatus described above can be used in various combinations. All publications, patents, patent applications, and/or other documents cited in this application are incorporated by reference in their entirety for all purposes to the same extent as if each individual publication, patent, patent application, and/or other document were individually indicated to be incorporated by reference for all purposes.


[0218] Sequence ID Table:
2SEQ ID NO.SEQ1cacaaatcct aacgccaata gtatagattc aattagaatt aaaaccgatc caagtatagattgattcaat tagaatatgg aattcaaaga gaagattatt gatggactta cacttatcgcaaaccatctt cttcttccga gagagaaata tgaagaaacc ctaacgccta aatcaattcgaatgggttag agttacgacg aaaacttatc ggtgttgaaa tttttatcta tgtttaaatatatttttttt ccttttctgg atttggaaag tcggatatgt ctcgtcaaaa ctcatagcctcacaggtatt ttatgccacg aatcgtaata atccacgtgg tacatcaacc aataaaaacgttccacgtgg tacaaccagc gagataccaa gaacttcgag accttcttct ccagatagaggctttccggt aaacggcaaa tacccttttc cttcactttc ttcgtcttct cgaatctgagagaacgagag atcaacaaca ATGGCGCTCA AATCAAAACT CGTCTCTCTT CTCTTCCTCATAGCAACACT ATCATCCACA TTCGCAGCTT CGTTTTCCGA TTCGGATTCC GATTCAGATCTTCTCAACGA ACTTGTATCT CTCAGATCAA CAAGCGAATC AGGCGTAATC CATCTCGATGACCATGGAAT CTCAAAATTC CTAACCTCCG CTTCCACGCC TCGTCCTTAC TCGTTACTCGTCTTCTTCGA CGCTACTCAA CTCCACAGCA AAAACGAGCT TCGTCTTCAA GAGCTCCGTCGCGAATTCGG CATCGTCTCC GCTTCATTCC TCGCTAACAA CAATGGATCT GAAGGAACTAAGCTTTTCTT CTGTGAGATC GAGTTTTCGA AGTCTCAATC TTCGTTCCAG CTCTTTGGCGTTAACGCTTT ACCTCACATT CGTCTTGTAA GTCCTTCGAT ATCGAATCTA CGTGATGAATCTGGTCAAAT GGATCAATCG GATTACTCTA GATTAGCTGA ATCAATGGCT GAGTTTGTTGAGCAACGAAC TAAACTCAAG GTCGGTCCTA TTCAACGTCC ACCGCTACTT TCGAAACCACAGATCGGTAT TATCGTTGCG TTGATCGTTA TCGCTACTCC GTTTATCATC AAAAGAGTTTTGAAAGGAGA AACTATTCTT CATGATACTA GACTTTGGTT ATCTGGTGCT ATCTTCATTTACTTCTTTAG TGTTGCTGGT ACAATGCACA ACATTATCAG GAAAATGCCG ATGTTTCTTCAAGATCGTAA CGATCCGAAT AAGCTTGTGT TTTTCTACCA AGGATCTGGA ATGCAGCTTGGAGCTGAAGG ATTTGCTGTT GGATTCTTGT ATACTGTTGT TGGATTGCTT TTGGCGTTTGTTACCAATGT GCTTGTTCGA GTGAAGAATA TTACTGCACA AAGGTTGATT ATGCTTTTGGCTTTGTTCAT ATCGTTCTGG GCTGTGAAGA AAGTTGTTTA CTTGGATAAC TGGAAGACTGGATATGGAAT TCATCCGTAT TGGCCATCGA GTTGGCGTTG Attacatcac acttgaggatctctgtttca caaggtaatg gctttagttt tggaaaaaca gttatgggaa ttgagtaatgatgtttctgg atgttttgtg tttcgatttg aaatactttt gaatcggtgt agtactactatttcagatgg tttaaaactc cttactgtta cattagtcca ttgttaagtt atttatctgaatgagtaact tatataacca agaatatggg atctttagtc gattgaatat aggaaccatatttggaaatt caggtactgt ttcttgagat 
cagtctagga ttgttgttat ttggtacattgacactttta gagtttctat gtgtcttcag ccttgcgccc cttgcttact gcatctattcagaaaaaggg actttgtgat tgaggatagt gtttctgttt aagcattatg ggaccttatgttttgtcgtt gactgtgtcc tcttctcgtt ttgctctctg ttttagaatg agtctaagtaa2atttaaatgt gttataatat ttgataaaaa atttgaatct ttttaaaaat atatataattgtgttaaaaa aaactatact ttttattatt ttattttatc ttcctttaaa atgttaaatttaaatttatt ttcaaaaaat ttgataattt taggcttttt gataatgttt ttcaactttttatataatat ataagtacat attgttttat tctaaaatcg tttagatctt aacgaatagttataggcgtt agacggcctc aactaattgt tataagtgtt agacggaaag ttaccgtccccttagcgttt attttaacat taaaagaaaa gatacatact attaaactaa tggagtattaacaagaaaaa aaagaaagag taaaatacga aaggttcctt aagcaagttt ataaatatttatagccaaaa acaaaagcaa aaccaaaaat cacaagtaac cccaaaagaa aaaaagcaaagagagaggaa aagaaaaaaa ATGACGAAGA CGATGATGAT CTTTGCGGCG GCGATGACGGTGATGGCTTT GCTTTTGGTT CCGACTATTG AAGCACAAAC TGAGTGCGTG AGCAAGCTAGTCCCTTGCTT CAACGACCTG AACACGACAA CAACGCCGGT GAAAGAATGT TGCGACTCGATTAAAGAAGC GGTGGAGAAG GAACTTACAT GTCTCTGTAC AATCTACACC AGTCCAGGTTTGCTCGCTCA GTTCAACGTC ACCACTGAGA AAGCTCTCGG TCTTAGCCGT CGTTGCAACGTCACCACTGA TCTCTCCGCT TGTACCGgta accaatttca ttttctccga tctccgattttttaattttt ttgtcaacaa catgcattat gaatggattt gtggattctg attaatgtgaatgtgactaa gaaaattagc atagtttttt gtctactgct aacatttttt agatcttgttgagattatga aacagagatt tgcaatttca tatatcagta ttaatcatgt ttttgttttttgtttagCTA AAGGAGCTCC ATCGCCAAAA GCTTCTTTAC CTCCTCCAGC TCCAGgtatgaaccaaactc ttcacctact ccttacaatt atttccttga atactttgtt atcaaaaaaaaaaaaaaatg aaatattgat cgacttgatt gtgtattaat tgaattattc gattgatttgattagtagag ttaattaacc aaatcaaatg gtgttaatca aggcaattat tcaattgatactctaaatcg atcttataat tttcccagat ttttctctct ttttttgttt tctatataaaaaacataaac agagtgtgaa tgccagcttt tacttgtgta ctttattttg tctcgagtattgacttgaat aattcggaca aaaccactaa aaaatgaaac ttgtcagatt ttttatttttttataaattt tttatttgtt atttgctgat tgacgatttg tcttatatta tatggatgggtttctaaata ttcagCAGGG AATACCAAAA AAGACGCCGG AGCTGGGAAC AAGCTCGCCGGTTATGGAGT CACCACCGTG ATCTTGTCTT TGATCTCATC CATCTTCTTC TGAattcctttacccggttt 
tattattatt agctcaataa attctcgaga tttgtttgct tttggcttaacttatttaat atttaaagaa aaacaaaaag tattttttgt tcacatgtta tgtattatcattgattcatt attgagtccc atgttagtat atttaccggt tataatcgga ctctatcatttgcatatctg atttgagtgt ggatctgtgt tgttaattga tgtaatcttt attatataaattgaaaatga aaacaaaata taaaaaactg tgttggttta aaggtcccaa tcctcattttggtaggtttg actaccaact agaaacaata tcatccataa tattgcttct ttgtgctatcttattaaatg taaaccaaga acgcagtttt attctctaat tgtgttcata aattaaacaacaaaagaaca gaatcgcaaa tttaattagg cgatgcgagt aacaacagca tgtatagcatcagcgagttg agg3ctgtatatga cttatcacca tgagattgta ataactctta tctaataata ctcactcaagtaaaagatcc aataatcttc aaacgaaagt agtaccaggt atgaaactcc agcgttgatgatgtgagctt ctcaatatct actagtcaaa gacgcatcgg atcgatcatc ggagttgcatcggaatttat cgggaaagaa tggattgggc ccaatgtgga aatgataagt cgtatgggcctaaatcattt agtcgtaggc ccaatatgag tttaagctct ttgatatttc agagaatgttattcaattta ttagtaattt tcaaatgata taaattcaat ttattaatca cttggttaaaacttatacac gtgaaaaaat gagaaatcat tttagtacat tgttgaccat ctttttcgtatagactacta tctctgatct cttgcgagtt aagtcagtaa ctaggaaaat tcagaagcgctctcaatctc aaaaatatcc ATGGCGGCGA TTACAGAATT TCTACCAAAA GAGTACGGATATGTCGTTCT CGTCCTCGTC TTCTACTGTT TCCTCAACCT CTGGATGGGT GCTCAAGTCGGCAGAGCTCG CAAAAGgttt ccacgaaact cctagatcgt taacgcttga attgccgtgatttcgccact aaaatcgaat cgaggacgat gctagatcgt tccctttgtt cttgattggaatcgaatttt aactgaaatc tgtagattga tgtgacctaa aactagaatt ttgcaattttcgtcctaagt ttttggattc tgtagtctga ttcattgttt tgatgttatc atcagttcgatttcaagttt attgaactta cgatttcaat ctgttgtttg tttgttcatc ttctactaattgattagtat gagcgagatt gtcttatcgg ttagatctgt tgtttgttca tcttcaattttgaatgatct cacatgagtc tatgatcttg atgcagGTAC AACGTCCCGT ATCCAACTCTATATGCAATA GAATCAGAAA ACAAAGATGC TAAGCTCTTC AACTGTGTTC AGgtttgaaatatagttaaa acaatacttg tgtgattctg ttttcttgta ctacttgtta ttgagatgtgataaaatttg tggttgtagA GAGGACATCA AAACTCTTTA GAGATGATGC CAATGTATTTCATACTGATG ATCCTCGGTG GGATGAAGCA CCCTTGTATC TGTACTGGCC TTGGTTTGCTTTACAACGTT AGCCGATTCT TCTACTTTAA AGGTTATGCT ACTGGAGATC CCATGAAGCGTCTTACGATC GGgtttgttc ttttatcctc ttatcagtgt 
tcattatctt tattgattgatttagttatg ttagtcaata ggatatagag tttagacttg tatataaggt tgtaacttgcaagtatagtt tcattaactg atttcttcgg ttattgtatc aaagcattga tctaaggctctaagctcaac cattttccgt tttgcgtatc aaatgtttct cgctttcttt gtctttgattcttgggaaat ttctttgttt ctgcatacag cttttcccat tcttcgtttc tttactcggttctgtattta ctacgacttt gttccacgtc ttcgtctcta aatcgagttt acgtagataatcgttgtaat ctacaatgtt gcagttaagt tagtcagagt aatagttaag agttaagacttgtacatacg gttgtaagtg aacattttcc taaactgact tcttctgtta tggtgtcagagcgtgaagct aagctcaacg atttcttcgt gtttctgata agtaacaagc caccaaagtctgattactta tctttctaat ctataatgtt gcagGAAATA CGGTTTCTTG GGGTTGCTAGGTCTGATGAT ATGTACCATC TCGTTTGGTG TCACTCTGAT CCTTGCTTGA gctactcgtttctggggtta atgattctct ggtttgctcg aagaatatag aaccaatgct tgtaagctgtccacaaaact tgtgtaatac tttagagttt gtcactttta aaagtttgta ataaatcatggcttcataga acagttgaaa tttcacatcc gtagacgtta ataaagattt gaattatgaagacactttct ggttatttta taattccatc tatctatatc tctgtactga agtgatcaaaacacttacga cacgttatct tggcttgtta ctcaaaaaat gaaaaaaata aactaaaaacgtgaacggca ggattcgaac ctgcgcgggc aaagcccaca tgatttctag tcatgcccgataaccactcc ggcacgtcca ctgtttgaga tgtaacttaa atattaagat aatataattataaataaaga caacacgtta cgatactacg tggatagtaa ctaactattt gctgaattatgataaagtcg4ttcaacttta cctatcagtt tgttggatca atttattacc atccaattct cttgttattattcaaagttc aaacattccg ttccaatgtt aactttgtaa agtagtaaat ggtaagtaacaataactcta aatacctacc cttacaaatt aaaaattcaa cgcctacata aattatctacctactagaat ttaaatatat aaaatcctag aataagtcaa caatcatatt aatgactaaaaattaccaaa actaaattat ttcattagtt taaaaaaaaa acaatttatt atattttatataatattata atgtttgcaa aaacagagta tcacgtcacc ttctctctct ctctatctctgtatcctctc attgcactat aagtactacc acaaccacga actctaaagc atcatctcattaacaaaaat aaaacacaca atctcaagat tttctacttc ttattacaaa gattcaatcttcttgtttct tcttgcaacc ATGAGTCTTC TTGCAGATCT TGTTAACCTT GACATCTCAGACAACAGTGA AAAGATCATC GCTGAATACA TATGgttcgt cttcttcctc tgcttttgaccatttgagtt tctctggttt tttctgttct tatcggaaaa caagagcttg agttaaagatttgaatctta aagtcaatct tatcttaaag tcaatctttg tcatttacca ttttgtattacatctctaat 
ttggttttaa ttcaaatagG GTTGGTGGTT CTGGTATGGA CATGAGAAGCAAAGCCAGGg taatttaatc tttctttaac tataatttct ttgacaaatt gtaacttttctcggagagat ttgattcgat tgaattacta agactctggt ttgttgcctg cagACTCTCCCTGGACCTGT GACCGATCCA TCAAAACTTC CAAAGTGGAA CTATGATGGT TCAAGCACTGGTCAAGCTCC TGGTCAAGAC AGTGAAGTGA TCTTATAgta agtctcttca agattaaaaccaaaaaaaaa agtctcttca agattttctc taaagatcca tctcttttgt tttttgtttactttcttaat aatatttgtt gtatttgtgt ttcttagCCC TCAAGCAATT TTCAAAGATCCATTCCGTAG AGGCAACAAC ATCCTTgtga gtttaaactt tttttttttt tttcttgctatatgttctgt ttttagcggt taaagattaa cgttttttat cggtttgatc agGTTATGTGTGATGCTTAC ACTCCAGCGG GAGAGCCAAT CCCTACTAAC AAGCGACATG CTGCGGCTGAGATCTTTGCT AACCCTGATG TTATTGCTGA AGTGCCATGg ttaatccaaa ttcccctgttctttttatat agctttttcg ctttcttgcg gtggtcgtag atcgctgatt ttttttccggttaattagGT ATGGAATCGA ACAAGAATAC ACTTTGTTGC AGAAGGATGT GAACTGGCCTCTTGGATGGC CCATTGGTGG CTTCCCTGGC CCTCAGgtac attccgtttt tgcggagttttttcgtttgt ttactgctct ttttcgattc tccgttcttg gcttctgaat tatctcttgcactcttgcag GGACCATACT ACTGCAGTAT TGGAGCTGAC AAATCTTTTG GAAGAGACATTGTTGATGCT CACTACAAAG CCTCTTTGTA TGCTGGAATC AACATCAGTG GGATCAATGGAGAAGTCATG CCGGGACAAT GGGAGTTCCA AGTCGGCCCA TCGGTCGGTA TCTCAGCTGCTGATGAAATA TGGATCGCTC GTTACATTTT GGAGgtataa tttaaaacca ttcacttttcgattcttgtt gatctcttta aggaaatata aacttataac acaagttttg gtggttttaaaaacagAGGA TCACAGAGAT TGCTGGTGTG GTTGTATCTT TTGACCCAAA ACCTATTCCTGGTGACTGGA ATGGAGCTGG TGCTCACACC AATTACAGgt aaaaagaatc atgaatcttttctcttgtta gatcattaca atgtttgtga gaacattcaa gaaaatggtg aacgtttttatttcagTACT AAATCAATGA GGGAAGAAGG AGGATACGAG ATAATCAAGA AGGCGATCGAGAAGCTTGGC TTGAGACACA AGGAACACAT TTCCGCTTAC GGTGAAGGAA ACGAGCGTCGTCTCACGGGA CACCATGAAA CTGCTGACAT CAACACTTTC CTTTGGgtaa agattttagaacattgtttt atttgtaaaa tgtttgataa cattttctga tctttgtgtt tgaatcttctttaaaaagGG TGTTGCGAAC CGTGGTGCAT CGATCCGAGT AGGACGTGAC ACCGAGAAAGAAGGGAAGGG ATACTTTGAG GATAGGAGGC CAGCTTCAAA CATGGACCCT TACGTTGTTACTTCCATGAT TGCAGAGACT ACACTCCTCT GGAACCCTTG Aaaggatgat ccgtaactcttgaagttgct tctgattggg ttttttggaa gttccaagct tgtcttttct 
ctacagtgtgtattaagcaa ttgtaccggt tgacactgcc ggagtttgtg atttggggcc tttctttctttttcttcttt ttataatctt ttgggttctg tggttagagc aaattcggtt tgctctgtttgtttgacctt tattgaaacc tttggtattg gtactaataa tacaatctga aaaggcctcttcatgtttca atgttagaga ctaattaaag atctctttta tttttcattt tatacaaacatgaaacacca atgttgatcc tgtctggtcc gtttttgatc tatgactcac aagatcgttgcgtactcata tcaacggctt tttgaacccc tttgtttgca aacaaaccac caatgtgggatgcttatcag tagaccgaac aaatgactac ttctccggaa ttttatttcc tttcaccttcc5taggactttt actatggtaa atcggtttag cacaatacac atgactttat gttattcattcttcattcgt atatggataa aaaatcagcg atgctaaaca gatctcaata tgtatgtgaacttgtgaagt agcaaattgt tgcttattcc actatattaa gtcaagtttc cacaatgtgccagacaatcc ctagttgttt agattccaag atttcgacaa tgtaacaccc gttaataattcacaacagct ctcttattgg caatatattc gataattatt aaatacataa atacaaaatcacattttgga atttaagaca ttttacaatt aaaaaaaaag tggaatcacg ttcaaaggtcgttgatagtc acaacttaac aatgacgcat taaagtattc aaaagtctat ttaactgatctatgattgac acatagaaat gaagctatat aaaagttgta ctctcttttt gaaccatctcacaatcaaac tcaagtcaac ATGTATCAAA AATTTCAGAT CTCCGGCAAA ATTGTTAAGACTTTGGGGCT AAAGATGAAA GTTCTGATAG CAGTCTCCTT TGGTTCCTTA CTATTTATACTATCATACTC AAACAACTTT AACAACAAAC TTCTTGATGC TACAACCAAA Ggtaagaaaattatccatat cttgtgtttt attgttaagt caatgaatcc tcattttggt tttatgttttcattttgttg tagTAGACAT AAAGGAAACC GAAAAACCGG TGGATAAACT TATAGGAGGGCTTTTAACTG CGGATTTTGA TGAAGGTTCT TGCTTGAGTA GGTATCATAA ATATTTCTTGTACCGCAAGC CATCCCCGTA CAAGCCTTCT GAATATCTAG TCTCTAAGCT CAGAAGCTATGAGATGCTTC ACAAACGTTG TGGTCCAGAT ACAGAATATT ACAAAGAAGC AATAGAGAAACTTAGTCGTG ATGATGCAAG CGAATCAAAT GGTGAATGCA GATACATTGT ATGGGTGGCAGGTTACGGGC TTGGAAACAG ATTACTTACT CTTGCTTCTG TTTTCCTCTA CGCTCTCTTGACCGAGAGAA TCATTCTTGT CGACAACCGC AAGGATGTTA GTGATCTCTT ATGCGAGCCATTTCCAGGTA CTTCATGGTT GCTTCCGCTT GACTTTCCAA TGCTGAATTA TACTTATGCTTGGGGCTACA ATAAGGAATA TCCTCGTTGT TACGGAACAA TGTCTGAAAA ACATTCCATCAACTCGACTT CAATCCCGCC GCATCTATAC ATGCATAACC TTCATGATTC AAGGGATAGTGATAAGCTGT TTGTATGCCA AAAGGATCAA AGTTTGATTG ACAAAGTCCC ATGGTTGATTGTTCAAGCCA ATGTTTACTT TGTTCCATCG 
TTATGGTTTA ATCCAACTTT CCAAACCGAACTAGTTAAGC TGTTCCCGCA GAAAGAAACC GTCTTTCACC ACTTGGCTCG GTATCTTTTTCACCCTACAA ATGAAGTTTG GGATATGGTC ACTGACTACT ACCACGCTCA TTTGTCGAAAGCCGACGAGA GACTCGGGAT TCAAATAAGG GTTTTCGGCA AACCTGATGG ACGTTTCAAACATGTCATTG ACCAGGTCAT ATCATGTACA CAAAGAGAGA AACTGTTACC TGAATTTGCTACACCAGAGG AATCAAAAGT CAATATATCA AAAACCCCGA AACTCAAATC TGTTCTTGTCGCATCTCTCT ATCCAGAGTT CTCTGGCAAC TTAACTAACA TGTTTTCAAA GCGACCAAGTTCAACAGGAG AAATTGTTGA AGTTTATCAA CCAAGTGGAG AGAGAGTTCA GCAAACAGACAAGAAAAGTC ACGACCAAAA GGCGCTTGCT GAGATGTATC TTTTGAGCTT AACCGATAACATTGTCACGA GCGCAAGGTC TACATTTGGA TATGTTTCAT ATAGTCTTGG AGGATTAAAGCCATGGTTAC TTTATCAGCC AACAAATTTC ACCACTCCTA ATCCGCCATG TGTTCGATCTAAGTCGATGG AGCCATGTTA CCTAACTCCT CCGTCTCATG GATGTGAAGC TGACTGGGGAACTAACTCGG GGAAGATTCT TCCTTTTGTT AGGCATTGTG AGGATCTTAT ATATGGGGGGCTTAAGCTAT ATGATGAATT TTAGttctat tttatcacat ttgattttat tggattattgagtttttata atctaaggaa aaaatgctat ccgatccctc tttacagttt acacttgtgtcctcttctta tgtattaata tgttagtttt cttaaaacgt ttactaggtt tgtatggtttataatattaa ataaaatgaa atttacatat atacttgtat cacttaaaat cattaagactctaatttaat ttatatcatt gtgatgtttt ctcgaggtta ctttatgtgt catgaagataatggagtatt ggagttgtga ggtatcatgc gtcgtcgttg ttctactcta gtccacctttaaagaatata aaaagagata tttaatcaat gttatgcgtt acaacatttt attatcgaaaaaacgttttg agtataaaag aaaaaataga gaaattttag tgatttccga gatataatattcacctgcaa aagagagtgc tgattttaca caaatattga gagc6atcttccaat ataaagtctg aagcgcgggg tagtggagat ttgaacaatg gagtacataaaatagttcgg accccacctg tctttgatgg gaccatgcgc gcaaagcgct ctttcctcttggatgatgcg tctgatggta atgaatctgg aacggaagag gatcaatctg cttttatgaaagaattggat agttttttta gagagcgaaa catggatttc aaacctccaa aattttacggggagggcatg aactgcctca agtaagcttg atacccatca ttatttggtc actttactgtgttacatttt aaaattttca gcaggagctg atatctaatc aatttctttg gcacaaggttgtggagagct gtaactagat tgggcggata tgacaaggta cgggtcactg tgaatacgcctgttgaatgt cacagcatct tttttgacaa gcaaatgtga cttcggcttt tcatcttttgttccatcctg gcttacttgc ATGCGTACTG TTGTTCATGA TCTAGCAGTG GTGCTTTTGGTGATTTTCTA TGATTATTAT ATGCTTTTTA 
TACTGGATAG GTTACTGGAA GCAAATTATGGCGGCAAGTG GGAGAgtctt tcaggccccc aaagtaagaa gaatgctttt cttattagtggtttgtctta gAAATTTTGG GAAATCATGT GGATATTTTT AAGAATTACC CTCTAATTGGTCAATTGTTT GTTCAGGACA TGTACAACAG TATCATGGAC TTTCCGAGgt ttctacgaaaaggtgagact atattcacca ccttttcctc tctctgcttt tggttcgtct atgtgacttttgtatacact ggcatgggac tgggactcta tgtatcaacc cttctgagaa ataattgaaatgattgaaca gtgaacaact gtgaatcatc ttgagatatg ttttccttaa gatacagtaacatcttgtaa cattatagTT TCTTCATTTT TCAGGCTCTT CTTGAATATG AGCGGCATAAAGTTAGTGAA GGTGAACTTC AGATACCCCT TCCGTTGGAA CTAGAACCGA TGAATATTGATAATCAGgta aaattgagaa aaccatatca tgtgtctgta gtttttgttt gatcttcttcttctgattaa tgtcagtgtt ttaacttaac ccactgcctt gtttctacac tagGCGTCTGGATCAGGGAG AGCAAGGAGA GATGCAGCAT CACGTGCTAT GCAAGGTTGG CATTCACAGCGTCTTAATGG TAACGGTGAA GTTAGTGACC CTGCAATCAA Ggtccggtag aatctttttatatgtttcat tttacattca cactagatct ctcgtttttt ttttgtcaaa catttaatctatatctcata gtctgaacga acatactgtt ttgtaattaa tagGATAAGA ACTTAGTTCTTCATCAAAAG CGCGAAAAAC AGATTGGAAC CACCCCTGgt atgagttctg tttgatgaagaagtgttgtt ctcattttta ttttgaaact ttgacatggg ttatcactta catctcacaatgtcatcagG TTTGCTCAAA CGTAAGAGGG CTGCTGAACA TGGTGCAAAA AATGCCATCCATGTATCTAA ATCTATgtac gatttttggc tttgtggtct ggttttcaat gcgtgataattcacatttga attctgattc cagttgttgt ttttcctagG TTGGATGTGA CTGTTGTTGATGTTGGACCA CCAGCTGACT GGGTGAAGAT TAACGTACAG AGAACGgtaa aatcaattgccactttctta aaaacctgag caatcacttt ctggttttac atatattaat aaactcttccactatctgca gCAAGATTGC TTTGAGGTGT ATGCATTAGT CCCAGGATTA GTCCGTGAAGAGgtaagctc tcaaatctcg ttgtgtttac atatggatcc taagattgag tttagcactcagtttttgtc ttggcaacaa taatacagGT CCGAGTCCAA TCAGATCCGG CTGGGCGGTTAGTAATAAGT GGCGAACCCG AGAACCCTAT GAATCCTTGG GGAGCTACTC CTTTCAAAAAGgtaaatgct ggttacatga tttttcagct tacacgtaga atgttgaatg acattttcaaacctccattg aaactgcagG TGGTAAGTTT ACCAACGAGA ATCGATCCGC ATCACACATCGGCTGTGGTA ACCCTAAACG GGCAGTTATT TGTTCGTGTG CCTCTGGAGC AATTGGAGTAGaaacattta cagtttaaca aagcctttga agatctgaaa gagagaagat tgttagaagtagttgttgag agtattttgt ttgtatatta tgagagatta agcacaacat gagaagagcctttaggaatc 
cttaattagg ccatctagtt tttattgtct ctcctctctt tgattagattcttcttctaa gtgtcatcac tattgatttg ttgtagcacc aaacttcttt aaacctttctattaagaaca cacaaatcta caaccttttt atttttttta attgtttatg tgatttgttttctgtggcag tgaatttttt atattatcaa cttatcatgt tagctcaaga ttgcatctcaatttgtactt atcttagtgg taattagaaa aaaaaacaaa attaggctac aatagttttgtttgtttgtt tgtttaggtg ttagggatag ggtttatttt ttccgaagtt tattagtgtttactatttag agtttaatgt t7gaaagtatta tgataaagaa ggattaaaaa aaaaaaaatc ttcttaatat agcttacaatgttttgttgt taaagtatag ctaagtaaag tatgttataa atggtgcatg attttttatttttgattaaa aagtggtaaa tgatattttt ttcctccatt ttgcattttt acactttgtatgatccaatt tgcttttatt tatctacata taataaatct ctataataaa ccatttacataccattacta aaactaaaat tataatggaa aaatattatt atgttattta ttgttactttggtaaagcat tattatttat tttgcttatt ttaagggcta ataattaatt gaaattaagcagttgacgaa agtttttttt attaatttat aaagcacaac atttccttgt ctacacgatcataaagctca caaagagaga attgagaaga aacaaactcg tcggagaatt cagtactcgccgaagaggaa gaagaagaag ATGTCTTGGC AATCATACGT CGATGATCAC CTTATGTGTGATGTCGAAGG CAACCATCTC ACCGCCGCCG CAATTCTCGG CCAAGACGGC AGTGTCTGGGCTCAGAGCGC CAAATTTCCT CAGgtttttt tacttcttca tcctctcttt tcgccttactacgatccgtc gcttgaattg tcggaatcct ccgtgatcgg atctgacgaa tctcggatctgattttgaat ttttcaatct ccggaatctg atgaatattt tcgatttgca tttctaaatctatcgatccg tatgcgaaat tgaattcaaa cgtagggctc tagaccatta gtctattgtgagatttcttc ggtatcagaa gttattagat cgtagcttcc atagaagaag atccatatgcttgtgaaatt gtacgcatgc gtgtgcaacc atcgatgcaa ggtcttcttc ttcttgtaggcatgtagatt ctatggtctt agtcagaatt actgcttaac aattgcatct tggataatctctgtttccat ttttcttata tgcttgagga aatgttttga tcaatagcct aaaatgttgatttgattttg ccaaaatctg atgatgtgtt attgataatg tgtgtttagT TGAAGCCTCAAGAAATCGAT GGAATCAAGA AGGACTTTGA GGAGCCCGGG TTTCTTGCCC CAACCGGACTATTTCTCGGT GGCGAAAAAT ACATGGTTAT CCAAGGTGAA CAAGGAGCTG TGATCCGAGGGAAGAAGgta actttcttta cttcatacat cagaaagctg catgtagatt ttgatagagaatagaatcgg aattcatgta acaatctgtg aatcttcagG GACCTGGAGG TGTCACTATCAAGAAGACAA ACCAAGCTTT GGTCTTTGGC TTCTACGATG AACCAATGAC TGGAGGTCAATGCAACTTGG TTGTCGAAAG GCTCGGGGAT 
TACCTTATCG AGTCTGAACT CTAAaaccaaggtttcattt caggttcttc ttaactaaag agtgtcaatg cactttttat tgtgattgattgtaatgctt tcaaacacaa atcatttgtt actttagaac caattgtgat tgattggtctccttcgttac cgagtttgag tttgtgtgtt cttgtaatga catttgatca tcttttttctccatatgtat tgagttttga tttttgtttc ttcatattat tactttttct tgaaatgatctgctgtttat gatttggggt tcaaaatatt tttggtttgg caaacaagga agagtttgccaagtattagt agcaagtgct atgagtattt tcggcttggc gaacatcttc gtgtacacgtgtgacataac aaacctattt gagaatggtg taagctaggt agatattaca taaacgatgtaagttgggaa ttcgtttagg agagagatat tgtatggtaa gaatttcact tcgaattctctgcttcaacg tggc8agaagactag gcggaacatc tcatcaaaac cctatacatt caacagggaa attcttttgcacgaatgtta gacttcaata ttgaataaaa ttcatagttt caacaatctc ataaaaaaagagctgggctc cattcgaaga cacattaatt tccatgggcc tggtccacat acaaccatactaaatttgaa gtaatttacc cgccatttaa aaaagcccat aggctccttc tcctagaagctggcgggaaa atcccaaaac ttttcccggg aaagtagata aaaaatttcg gccattaaaggacaaaatca caagaaagta gaaaccctag agattttgaa accgaaaccc caaaaacccctttgacgcct ccttgttctt atctctttat aaaaaaccat ttctttcctg caacatcgttgcttatcatc agacgcacat cacctgttcg ataaaattcc tctgagagtg ttttttttgttttccttctg acaaagaaat ATGTATGTAG TGAAGCGTGA CGGAAGACAG GAAACTGTTCATTTCGATAA GATTACTGCG AGGCTTAAGA AACTTAGCTA TGGGCTTAGC AGTGACCATTGTGACCCTGT CCTCGTTGCT CAGAAGGTCT GTGCCGGTGT CTATAAAGGA GTCACTACGAGTCAACTTGA TGAGTTGGCT GCTGAAACTG CTGCTGCTAT GACTTGTAAC CATCCTGATTATGCATCTgt gagtatctct cttcgttttc ctttctgggt attgcttgat tttgattagtcgtttctgga gaagtgatct ctgtcattgg attggtgttt catttgattg aattgatctgtataatttac atgttatctg tgttcatatg tcagCTTGCT GCTAGGATTG CTGTGTCGAATCTCCACAAG AACACTAAGA AGTCATTTTC TGAGACgtga gtgttgagtt ctttcttagtgtgtattata cccttgatat gagttcaagt ttccatgtgt gttgactccg atggcttgtgtggtatcttg cagGATTAAG GATATGTTCT ATCATGTCAA TGATAGATCT GGACTAAAGTCCCCACTAAT AGCCGATGAT GTGTTTGAGA TAATTATGCA Ggtaaagaaa tcttgtgttaagctcttgat tcaatctgtt tcttggtgtg atatatatat atatatatat gtatgtatcttataaatcac tgacttgtgt gttactggtt tcttcagAAC GCTGCTCGTT TGGACAGTGAGATCATCTAT GACCGTGATT TTGAATATGA TTACTTTGGA TTTAAAACTC 
TTGAGAGATCGTACCTCTTG AAAGTCCAAG GGACTGTTGT TGAAAGGCCT CAACACATGC TGATGAGGGTTGCTGTTGGG ATCCACAAGG ATGATATTGA TTCCGTGATC CAAACCTACC ATTTGATGTCTCAGAGATGG TTCACTCATG CATCTCCTAC TCTCTTCAAC GCAGGAACTC CAAGGCCTCAAgtaaatacc tatcacttga tatttattat atctattaaa taaggcgttt tactttgatacgtgtctttg ctgatctgct attgaaaata attgaaattg cagTTAAGTA GCTGCTTTCTAGTCTGCATG AAAGATGATA GCATTGAGGG CATATATGAA ACACTCAAAG AGTGTGCTGTTATAAGCAAA TCTGCTGGGG GTATTGGTGT TTCAGTTCAT AATATTCGTG CTACCGGAAGTTACATTCGT GGCACAAATG GAACATCTAA TGGTATTGTT CCTATGCTGC GTGTATTCAACGATACAGCT CGTTATGTTG ACCAAGGAGG AGGCAAGAGA AAGGgtacgt atcagctctttgtactatta gcataatcat ctgtccagta tatggtctaa agtgtatctg atttataatttgtaattggt gaagGAGCCT TTGCTGTTTA CCTGGAGCCA TGGCATGCTG ATGTCTATGAGTTTCTGGAG CTGCGAAAGA ACCATGGAAA Ggtatagtca tagctagata attcaccatatctactccct aaatgtgatt accatttgac gctgatacaa cctcttaata cactttgtcgcattgcagGA AGAACACAGG GCTAGAGATT TGTTTTATGC TCTCTGGCTT CCAGATCTTTTCATGGAGAG GGTCCAGAAT AATGGGCAGT GGTCACTGTT TTGTCCTAAC GAAGCTCCAGGTTTGGCAGA TTGCTGGGGA GCTGAATTTG AGACACTGTA CACTAAGTAT GAAAGAGAGgtgagtcccta tttcatccat gtatatgctg cttctttagt aactcaaatt cctgttatctcaatacagtt atgtttgttc atatcttcag GGAAAGGCCA AAAAGGTTGT TCAGGCGCAGCAGCTTTGGT ACGAAATATT GACATCCCAG GTAGAAACAG GAACACCATA CATGCTTTTCAAGgtaagta acagtcatca ttctgtagct acacgttatg gccttataat cattggttcttactccaaat ttgaatgctc ttaaactata gGATTCATGC AACCGAAAAA GTAATCAGCAAAATCTGGGT ACCATAAAGT CGTCCAACTT ATGCACTGAA ATCATTGAGT ACACTAGTCCAACAGAAACT GCTGTGTGCA ATCTTGCATC TATTGCTTTA CCCAGATTTG TAAGGGAGAAGgtgagaggg agactggttt tttaaaattt gctttctctt tattactcaa tgtatagctctaacattctt catctcacaa cagGGTGTCC CATTAGACTC TCATCCACCT AAGCTCGCTGGCAGTCTGGA CTCAAAGAAT CGTTACTTTG ATTTTGAAAA ATTAGCAGAG gtcagatacaagcactcgcc ttgcttgacc tgaaatctga ttcttaagga attatctgtg gagatatttccgtgtctgtg atgtgatgtt tgacttttta atttttctgt gtggccagGT GACTGCTACTGTTACTGTTA ATCTCAATAA GATAATAGAT GTGAATTACT ATCCTGTGGA GACTGCAAAAACTTCAAACA TGCGTCATAG ACCTATTGGT ATTGGTGTAC AAGGCCTTGC AGATGCATTTATCCTCCTTG GAATGCCATT TGATTCTCCA 
GAGgtagact tgttttgaat tatgatcaatcttggaaaat ataattttgt tatctgttct taagcagttt aatttgttac tcagGCCCAACAACTGAATA AGGATATATT CGAAACCATA TACTACCATG CACTCAAAGC ATCTACAGAGCTTGCTGCAA GACTTGGCCC CTATGAAACC TATGCTGGAA GTCCCGTGAG TAAGgtatgcatctcagcca tcaattatat caatttggtt ttcccaaact tcataagcta ccattgtggattgttatgct gactttatcc catgcttctc tagGGAATCC TTCAACCTGA CATGTGGAATGTAATTCCAT CAGACCGCTG GGACTGGGCT GTTCTTAGAG ATATGATATC AAAGAATGGAGTGAGGAACT CTCTTTTAGT AGCACCAATG CCAACTGCTT CAACCAGTCA AATCCTTGGGAACAATGAAT GTTTTGAGCC CTACACATCA AACATCTACA GCCGCAGAGT CTTGAGgtatgtgaatatta aatcatttga caagtatgtt tctggttttc cccatttgat gcttactcacttggttgtct tggtttgtac agTGGTGAAT TCGTAGTGGT TAATAAGCAT CTTCTCCATGACCTAACTGA TATGGGACTT TGGACTCCAA CGCTGAAAAA CAAATTAATT AATGAGAATGGTTCTATAGT TAATGTTGCT GAGATACCTG ATGACTTGAA GGCGATTTAC AGgtatagcttccacttatt ttgtgttttc actctctact gtctagataa agaaatttga cttgtttcttctgtaaaaca acacagAACT GTCTGGGAAA TCAAACAGAG AACAGTGGTG GACATGGCTGCTGATCGTGG ATGCTACATA GATCAAAGCC AAAGCTTAAA CATACACATG GACAAACCCAACTTCGCAAA ACTCACTTCG CTACACTTCT ATACTTGGAA AAAGgtacaa accttaatcatctaaactct tcatatgata attgtgaaat aggttagaga ttctatagag tatctgatccttcactcatc tgacaattac tcttaatctc acttatgttg ttgtgaatct accttaagGGTCTGAAAACC GGGATGTACT ACCTGCGATC CCGTGCTGCA GCTGATGCGA TAAAGTTCACCGTTGACACA GCCATGCTCA AGgtagaaaa aacaatgcaa actctttacg ctgattcttcttgtgaactc agacatttta cctatgagtt gttttcgttg gggtgaatgt agGAGAAGCCGAGTGTAGCA GAAGGAGACA AAGAAGTAGA AGAAGAGGAT AATGAAACTA AGTTGGCGCAGATGGTATGT TCCTTGACAA ACCCTGAAGA GTGTTTGGCC TGCGGAAGTT GAagctctaagttatagttt gggtcttaaa aagttagaaa gtaaaagcat gtctcttgga cggtcttttttatttacttg cttatctggg tgtattttgt taatagtttc ctaatgctta atgttgcttgagtttttgtg taatccaatt tcgtttttac cttttctctt gaaacaataa ggatttgtaacgagaattat gtataaccac caccacctta cggtagattt tactatccat atataaatattttaccatcc atttataaat atttgtagtt tggtactact accaatggtt gtaagtaatctgtaagaata tattctgatc attgtagatt agaaaatgtg ttactacagg tttcactagcttatcctaga actagaaaca tgaaaattat gtatcgaatg gtgaaaatat taatacaaacatatttacgt 
ttaaatgcat gtgtacacaa caaagtttct aaagcaagct ctatcatatagagaataaag ta9ttgctttagg tatccatata gttttgaccg acctcgatga tcatgttata ttctgtggagatttatcaac tatttataaa taccttgaaa ccgctactag acattggagt aatccctcaccttgtctcat ttggcaaata tttcctatag gttcaactta ttagtagaaa tgacaatgtcttggctgaca cttatcaaga actctccttg taatcactta gttacttcca ttatggaaaagttgaccgat cgaaaaaagg tattaaaaaa aaaaaataga aaaattaaga ttttcatagtgtaattgtaa aaaataaaat caaattattt tcagatattc cgtattggga ataaatctcagccgttgatt actatcaacg gtgtacaatt actgcctttg cctgttactt gttctgctccgtcgctcaga taggatctca acaagacacc acaaacccta aatttcgtca actccacagcgactcgattc gatcaaggaa ATGGCGTACG CTTCTCGTTT TCTCTCCAGA TCTAAGCAGgtatatactct ctctccctcg atttttctga ttctcttctt cgttctgttt gattccttttgttttcctcc catttctggg ttttatgtgt ttcgatgcga tggttagagt gagattatcgattttactgt atctctatca ctgaatcaca tcttagggtg tgccatttca atatcgtagtcgaatttttg ttatctttcg tacgatctca atcggagagt ttgttgaaat caaatgataaatttgatggg gtttttttct actcgttgtt gatttctaat acagttcgaa atgataagatgatttgcaag aagtattctt ttcatcaaaa cttgttattg atccataatt tttattatcttactctcatt acgcagCTAC AGGGGGGTCT GGTCATTTTG CAGCAGCAAC ATGCTATTCCAGTCCGAGCT TTTGCTAAGG AAGCTGCTCG TCCAACCTTT AAAGGAGATG gttagtgaccaaaactcata cttcggattt gttattatgc atagaacatt acgttttcaa taacacacctagttgaaaac agttgctttc ctttctttag cccttcgtgc ttttgagttt aacatcgtgactacttaaga atatgtcaag tcactttttt tatgtcgaat gtgtagaaaa actatattggtcaatgtaat ataatcttgt gaaacccagg ccatgattgc taggactgtt gttctgcttacttcttttgt tgagttttat atgtatccag tttatgatgg attatgttta atatgttgctgaaatctgta ctatgtgttt agagtgaaga agcattgctg tttactatta ttgactcaagttttacactt tttgacagAG ATGTTGAAGG GTGTCTTTTT TGATATCAAG AACAAATTCCAGGCTGCTGT TGATATTCTC CGTAAGGAAA AGATCACCCT TGATCCAGAG GACCCAGCTGCCGTAAAACA GTATGCAAAT GTAATGAAGA CCATCAGGCA AAAgtaggcc tcttgttactcttttgtagg tgtttgttat ttagcttgaa tcttgtatgt cgtgatctct atttctgtttgttgggattg gttttacttt tcgacttttc tgaaacgagt taaatatatg tgtcaatgctgctattttaa ccttgttaat ttggttgctt gtcatccgtt tttttggtat gcagGGCAGACATGTTCTCA GAATCTCAGC GCATTAAACA TGACATTGAT 
ACTGAGACTC AAGACATTCCAGATGCTCGT GCATACTTGT TGAAGTTGCA GGAAATTCGC ACCAGgtagc tgttagactttgaataattt tcagttatct taggatagtt ttccctcacc cgtaaacttg ctcttcttatgttattataa tattggaatt atcttcctgt aagatcttga atgtgatcgt taagcagttatctgaagact gcatttaact atctatattt tcatctccct ctttgatctg ctattgtttgcaacatatga agaattgttg gaagcagtct ttagttatac tcccacttgt gatatatcttgcagGAGGGG GCTTACTGAT GAGCTTGGTG CTGAGGCCAT GATGTTCGAG GCTTTGGAGAAAGTCGAGAA GGACATAAAG AAGCCTCTCC TGAGAAGTGA CAAGAAAGGA ATGGATCTTTTGGTTGCAGA GTTTGAGAAA GGCAACAAAA Agtgcgtcat cattcttcaa ccatccatacaaaacacgaa caaatgattc tcattactac ttatatgtat atcgatttac atattgatagctaattgaat tgcatgtttg cgtctcatta atctaaacag GCTTGGGATT AGGAAAGAAGATCTTCCTAA GTACGAAGAA AATTTGGAGC TCAGCATGGC CAAAGCACAG TTGGATGAGCTGAAGAGTGA TGCTGTTGAA GCTATGGAAT CTCAGAAAAA GAAgtgagtt ttgttttcttttcacttttt ttgtttctca atttatcaat cattgatctt actcatgtca taacgcgatggaacttgcgg attattcagG GAGGAATTCC AGGATGAGGA AATGCCGGAC GTGAAGTCTCTAGACATCCG TAACTTCATC TAAggtttga tccttagaaa catttgattt gttgtaagaaaaggcaaaga tctctcactt gattgtcttt gaaagagaag atcgttccct tgctgctgttttggtttggc gttcaataag gtctctcacc tggatttgag tctaactctc tctgtggttattacgcttga gattcttaga cacaaacgtt gtttcatgtt tttttgataa tggtgatcactggaatttga gataattaat aaaagttgtg atgttaattc gaaacaaaag cgtggcaagcaaaatcaacc cgagaaacta ttatagtttt gtatttagta gaccaaattc gaaccaaatctaaccgaaat gggatctgga gtatcataca ttctagatga attaaaccaa tcatatcgaacacgtggctt gtctgtgaac aattataatg ggtttgtctg agagacgtta acaactgttttcttcgccat ggcggcgatt cctctcaaag ctccttctct tcc10tttcgatcag ctttttcgat tttggatcta ttttctatga aatatcagat ctggtgattgttttacatat ttttgggttg aattcacaag attttctgga aacgagatcg attaattgagttttctgtgt ttttatctta agctagatct cgatttctat gtttttggat tgatttgataagattttcga gaattttttg tgtttttgtc aaagttcgat ctcgatttct atatttttggttgaattcac aagactttct ggaaacgaga tcgattttgt gagttttctt tgtttttaatctcgattttt ggattgattt gagaagattt tctgaaagcg agatcgatgt ttttggggattttctttgtt ttgttcaata attcggtctc tgttttctta tcaaaaaatt cgttttccatctcaaatcga tgttcttatt gatttaattg agttttagtt 
tgcagggatt tgatcgttggtaagctatct ttcagcaaac ATGCATGGTT ATGAAGATgt aagcacgctc atgaatttttgttttcagtg attttgtcga attcaattta aggtagatag atttgacatt gttcgataatgttatattgc agGACCTTGA TGAGGAAGCT GGGTATGATG ACTATTACAG CGGTGATGAGGATGAGTATG AAGATGAGGA AGAGGAGGAT GAAGAACCTC CTAAGGAAGA ATTGGAATTTCTTGAGTCAC GCCAAAAGTT GAAGGAATCA ATTCGGAAGA AAATGGGAAA TCGAAGTGCTAATGCTCAAT CTTCACAAGA GAGAAGAAGA AAACTTCCTT ATAACGAgta tgtggtggctaaatcacatt ttctaattca ttacaatgtc ctggaatgtg ttttgatgct gagcttattgatttttctta atgcagCTTT GGTTCTTTCT TTGGTCCTTC ACGGCCTGTT ATTTCCTCAAGGGTTATACA AGAAAGCAAA TCCTTGCTTG AAAACGAGCT ACGTAAAATG TCGAATTCGAGCCAAACTgt atgtgcattt gatctttgtt actctttgta tttttatcat ttaagATGTTTTTGCTGATG GAATTGTTTT TTGGGGTGCA GAAGAAAAGA CCAGTTCCGA CGAATGGTTCAGGCTCTAAG AATGTGTCAC AAGAGAAGCG ACCTAAAGTT GTGAATGAGG TGAGAAGGAAAGTTGAGACT CTTAAGGATA CAAGAGACTA TTCGTTTTTG TTTTCCGATG ACGCGGAGCTTCCTGTTCCG AAGAAGGAAT CTCTTTCACG AAGTGGCTCT TTTCCTAATT CTGgtatgttgtgtcttttg aaaaatcttt ttcgctattt gtgatcttta agCATACCAT TTTCATGAAGATAACTTATA CAGGTTTTTT GCTGATGTTC AAGAGGCTCG ATCTGCTCAA TTATCATCGAGGCCCAAACA ATCATCAGGT ATCAATGGTA GAACTGCTCA CAGTCCCCAT CGTGAGGAGAAGAGACCTGT TTCAGCGAAT GGACATTCAA GACCGTCTTC CTCGGGCAGT CAAATGAATCATTCAAGACC GTCTTCCTCT GGCAGTAAAA TGAATCATTC AAGACCGGCT ACCTCGGGCAGCCAAATGCC AAATTCAAGA CCAGCTTCCT CTGGCAGCCA AATGCAGTCG AGAGCTGTCTCAGGCTCAGG GCGACCTGCT TCCTCAGGCA GCCAGATGCA AAATTCAAGA CCACAAAATTCAAGACCAGC TTCCGCTGGT AGCCAAATGC AGCAAAGGCC TGCGTCCTCA CGCAGCCAAAGGCCTGCGTC CTCAGGCAGC CAAAGGCCTG CGTCCTCAGG CAGCCAAAGG CCAGGTTCGTCGACAAACCG TCAAGCACCT ATGAGGCCAC CAGGTTCAGG TTCCACAATG AATGGTCAATCAGCCAACCG GAATGGCCAA CTGAATTCCA GATCAGATTC CCGAAGATCA GCTCCTGCTAAAGTGCCAGT GGATCATAGG AAACAGATGA GCAGTAGCAA TGGAGTTGGT CCTGGTCGGTCAGCGACCAA TGCAAGACCT TTACCTTCTA AGAGTTCATT GGAAAGAAAA CCCTCAATCTCGGCGGGAAA GAGTTCTCTT CAAAGCCCTC AGAGACCGTC CTCATCAAGA CCAATGTCATCTGATCCTAG GCAACGGGTA GTAGAACAGA GAAAGGTTTC TCGTGACATG GCCACACCCCGAATGATACC TAAACAATCA GCGCCTACCT CGAAACACCA Ggtatcatga tcatgatctttcacatctct ttcttttgtc 
cttcctctag ccaaggcact aatttgtcaa gtaatatttacagATGATGA GTAAACCAGC GCTCAAGAGA CCTCCCTCGC GTGACATAGA TCATGAAAGGAGGCTGTTGA AGAAGAAGAA GCCTGCAAGG TCAGAGGATC AAGAAGCATT CGATATGCTTAGACAGTTAT Tgtaagtatt gctccaaact ttcttcctac tctcaaattg taagttacaattttctaatt ctattttgtc tcctgatact taaatggggg tttgtgtatc aattttagACCACCCAAGCG GTTTTCTCGG TATGACGATG ATGACATAAA CATGGAAGCA GGCTTTGAAGATATCCAAAA GGAAGAGAGA CGAAGgtaca tgagtatttt tgttatcaca cgtttcatttatttgtgttt cttggatatt ccttaacgat tgaattggtt gttaaatgca gTGCGAGAATCGCAAGGGAG GAAGATGAAA GAGAACTTAA GCTCTTAGAG GAAGAAGAAA GGAGAGAAAGACTGAAAAAG AATCGGAAGC TGAGCCGTTA Gaagaatcct ttctcctttg tgtctttgtcttcttttagg acttttttag tgttttctca ttgaaatctc tttggccgct tgaggcaaaaaagagtttga cctttttttt gttttgtgtt ttcaaattaa ggatcttttt tttgttcatggaaattgtac aattagaaat aatatctttt attggggaca cttcaagaag aatctgttggaaaccttccc agttagtgaa agcttgattc tctttttttt ttttggagta aagctaaaaccagaggagga tgataaagaa aaagaaacaa agaatatttc tttattcacg tgtagagttcctttagctga taaaatttca ctttttatga gtctgataac atgattttag tgattctttgtctcttttat tctttggcta aacaaattcg ttgagaaatc aaatggtgac caaagaagaagattgccttc ctcctgtaac ggagaccacg tcgagatgtt attctacttc t11ataccggaaa tgtcgtaccg tcctgaacat aatgcacata atttgactgt agctaggctgtaaaagattt taacaaaatt gttttagaat aaaattataa gtttaaaagg tatggtttgacttgaactgt actggaattt ataccggaaa tatcgtaccg ttctgaacat aatgcacataatttgactgt agttaagcag taaaagattt taacaaaatt gttttaaaat aaaattataagtttaaaagg tatggtttga cttgaactgt accggaattt ataccggaaa tgtcgtaccgtcttccacac ttcggagaaa cgacagataa gctctctctg ttctcttgcc acacttcccaatacatggat ccattttgac gtcatcttta tcactatctc tctattatat aaatctcttcgtaccctttt accgattctt caccgtgatc gcttaatcag acctcaattt cgttgttaaagaacaaagct ttaagcagcc ATGGATCCAA ACCAACGTAT CGCGAGAATC TCTGCTCATCTCAATCCTCC TAATCTTCAT AATCAGgttc aaatttcgtt gaattctctg attcttaaaccaatttggtg atcgaagttt gattcttttt tttttgggtt gatctgattt cgatgatttggatttagATT GCTGACGGGT CAGGTTTGAA TCGGGTGGCT TGTCGGGCAA AAGGTGGATCACCCGGATTC AAAGTGGCGA TACTTGGAGC AGCTGGTGGA ATTGGACAAC CTCTTGCGATGTTGATGAAG 
ATGAATCCTT TGGTTTCGGT TCTTCATCTC TATGATGTTG CTAATGCTCCTGGTGTTACT GCTGATATTA GTCATATGGA TACTAGTGCC GTTgtaagtt ctaaattctccggttttcga ttccaaaatt actactttag atgttttaga gctaataaaa ttgatcaatagtgatgattg ttgttgttga aatagagaaa tgagcttaaa gatcatatac atgagcttaaaaactagtac tttagatgtt gtagagcact agtgatgatt gttgttgtta agatcatatagagattgttg tgaatgtttt tggaaaactt tgttttagGT TCGTGGATTT CTCGGGCAGCCGCAGTTAGA GGAAGCACTT ACGGGTATGG ATTTAGTGAT CATACCTGCT GGTGTTCCGAGGAAACCAGG GATGACGAGG GATGATCTGT TTAACATTAA TGCTGGGATT GTGAGGACACTCTCTGAAGC TATAGCTAAA TGTTGTCCTA AAGCAATTGT GAATATAATC AGTAATCCGGTGAACTCCAC GGTGCCAATC GCAGCTGAGG TTTTCAAGAA AGCTGGAACC TTTGATCCAAAGAAACTCAT GGGTGTCACT ATGCTTGATG TTGTTAGAGC TAATACCTTT GTGgtatgcactcattattt ggtcttagaa tggtgtttag tattgtccat tagaactcaa ctatcttcttctttgcattt atggggttga atagGCGGAA GTAATGAGTC TTGATCCCCG TGAAGTTGAAGTTCCGGTTG TTGGAGGACA CGCAGGAGTT ACGATTTTAC CACTGCTTTC GCAGgtttgagatcagatga ttctcatcat tatgtttgtt tgaagcagat ataatattct catcattatgttggctacag GTGAAACCTC CTTGCTCGTT CACTCAAAAA GAGATTGAAT ATCTCACAGACCGCATCCAA AACGGTGGCA CTGAAGTTGT TGAGgtataa actaatcttt cagctttctttgttttgaac ttcgaattaa gcggtgcatt taccgtttaa atcattttgc agGCTAAAGCTGGAGCAGGT TCTGCAACAC TATCCATGgt aggtcttttg ttgtaacatg ggagttgtatgacaaagctg ggaatttgat tgatatctca atctgttaaa tgataaaata cagGCATATGCAGCAGTGGA GTTTGCAGAT GCTTGCCTCA GGGGTCTACG AGGTGATGCA AACATCGTTGAGTGCGCATA TGTGGCATCC CATgtacagt cctttaattc aactgtacaa tattgtatctataaaagatc tcttaaccct aaaagatgaa catatggact ttgtcttatt cctcatacagGTGACTGAGC TTCCCTTCTT CGCATCGAAG GTGCGTCTGG GACGATGTGG GATCGATGAAGTGTACGGCC TTGGACCATT GAACGAATAT GAGAGgtaaa agttaaaatc ttgatcgatctgacatcttg aatttacttc gacatgtttg tatgttcata tcgtttttcc gccctttctttttgctaatt gatcagGATG GGATTAGAGA AGGCAAAGAA AGAGCTTTCA GTAAGTATTCATAAAGGTGT TACCTTTGCG AAGAAATAAa gagactcgat cgtgaataaa cacacttaagcgatggtttt ggaatagtca gagttttgga ataagaataa tgcctcacaa taaaagctcttgcggtcttc ttggatccaa tcttaaaggt tcaagaaact catctccttt aggtaaaatcttcgattgtt ttatcgttcc atcgaaccac tttgttctta gatacaagaa 
cgtttatgatttatgtagtt gggctataaa agtgagaaca gagcaataat cttgcaacat tttttctcatcttcttggtg tgtttttttt ttgttggttt tcatcttttt gttcttgctc atgagagcatctttagaagg ctattgttgg gaagtaaata agtttgcatc gcggaaaaga tgatcaaggtcattcgggat acctcatacc tgtcatttga gttcatctaa gtaacttctt acgcttttaggctatctacg gttgttctta ggatttaggt gttagtggtt atgctatta12aacaaaaata ctcgaattca aacttaagca gtcacagtaa cttcgtgcag gagcttaccggagatgaatt catcataaac cggcgacggt agcggcggag caaagcaaaa atgcgatgattcatggaata ggtctcaaaa gtcacgagag gatcacgtga gatatcttga aaagaatcggacggctaaga ataaagcaga ctaattctct tatctatctc taaccgttaa ataaaaactaaagttttaac cttttaacct gggactaggg ttttcagatt tcactactct tgtcgtgtaagacttgagca actatataat ctcaactttt ctcaatcact atccgctgcg gtctcgccgtgctgcccaca acaatctccg acttcgtctt cctcatctat catcgtcgtc gtcaaccttatttatctctt aatttatcat taaaaccaaa aaaccaaaaa aaaagcctta gctttcgtttcttcaatccc agcaaaaaaa ATGGCTCAGG TTCAAGCTCC TTCTTCACAT TCTCCTCCTCCTCCTGCTGT TGTTAACGAC GGGGCTGCGA CGGCTTCTGC TACCCCTGGA ATCGGCGTCGGCGGCGGTGG AGACGGAGTC ACTCACGGTG CTCTTTGTTC TCTCTATGTC GGAGATCTGGATTTCAATGT CACCGATTCT CAGCTTTATG ACTATTTCAC CGAGGTGTGT CAGGTTGTATCTGTTCGTGT TTGTCGTGAT GCTGCTACCA ATACTTCTCT TGGTTATGGT TATGTCAACTACAGCAACAC CGACGATGgt ttgtgcccta aaaatttccc cttttttttg ttgattgataacatttgata ttttggtaaa gatctgattt ttcggttttg gaatcattcc tttggctagtttgattgatg ggttttgttt gattttgtta atagatatta atttacacga atttaaaatgttgacactga ttagggattt tgttatcatt gttgtttttt gtaatgtcag CGGAGAAGGCAATGCAGAAG TTGAACTACA GTTATCTCAA TGGGAAGATG ATTCGGATTA CTTACTCTTCTCGTGACTCT TCTGCCCGTA CAAGTGGGGT TGGGAATTTG TTTGTAAAGg tatattctttgtttgatgtc tcttatctag cagcttctct ttttgtttga ttgcctaatt atgtattctttctttatgtg aagAATTTGG ATAAGTCAGT TGACAACAAA ACTCTGCACG AGGCGTTTTCCGGGTGTGGG ACTATTGTGT CCTGTAAGGT TGCTACTGAT CACATGGGTC AGTCTAGAGGATATGGGTTT GTGCACTTTG ACACTGAGGA TTCAGCTAAG AATGCTATTG AGAAGCTGAATGGGAAAGTG TTGAATGACA AACAGATTTT TGTTGGACCT TTTCTTCGTA AGGAGGAAAGAGAGTCTGCT GCTGATAAGA TGAAGTTTAC TAATGTTTAT GTGAAGAATC TTTCGGAGGCGACTACTGAC GATGAGTTGA AGACTACTTT TGGTCAGTAT 
GGTAGTATCT CGAGCGCTGTAGTTATGAGG GATGGAGATG GGAAATCCAG GTGTTTTGGA TTTGTCAACT TTGAGAATCCTGAAGATGCA GCTCGTGCTG TTGAAGCTCT CAATGGAAAG AAGTTTGATG ATAAGGAGTGGTATGTGGGT AAAGCTCAGA AGAAATCTGA GAGGGAACTT GAGTTGAGCC GGAGATATGAACAAGGCTCA AGTGATGGTG GAAACAAATT TGATGGGTTG AATTTATATG TTAAGAACCTTGATGATACC GTCACCGATG AGAAGTTGCG CGAGTTGTTT GCCGAATTTG GTACAATCACCTCTTGCAAG gtcagcattg tttgttttcc gcatacataa taacatgaga gatgcaattttttttgtctc ttgattgatc ggaacctcat acttttgtaa caaacagGTT ATGCGGGACCCTAGTGGTAC TAGCAAAGGA TCAGGATTTG TTGCCTTCTC TGCTGCCAGT GAAGCTTCAAGAGTGgtaat ttaaataatc ctgtgtcaag acaatattaa atttgttttg agcctctattttctttcttg attcaatttc ttttggggtc ttctgcagCT GAATGAAATG AATGGTAAAATGGTTGGTGG CAAACCGTTG TATGTTGCTC TTGCACAGAG GAAAGAAGAA AGGAGGGCTAAGCTGCAGgt agtacttccc accatagata aacaacccct acgtacactt atgtttgctatgtctcaagt ccttatgttt ctttttcagG CACAGTTTTC TCAAATGAGA CCTGCTTTTATCCCCGGTGT CGGTCCTCGA ATGCCAATAT TTACAGGTGG TGCTCCAGGT CTTGGACAACAGATTTTTTA CGGTCAAGGA CCTCCACCAA TCATCCCTCA CCAGgtacca ttttgttctaactgaccact atgtaactct gcttgaatat gggactcttt caatcaataa gcactcacttggttctactt aaatctgtga tatagCCTGG ATTTGGATAT CAGCCTCAGC TGGTTCCTGGAATGAGGCCG GCCTTTTTTG GTGGACCGAT GATGCAGCCA GGTCAGCAAG GTCCACGACCAGGTGGCAGA CGGTCAGGTG ATGGACCCAT GCGCCATCAG CATCAGCAGC CAATGCCTTACATGCAGCCA CAGgttagtt tataaaaaaa ggagaatatg tcttaaatcc cagatcaagatgaatctata agtctttgct ttcttctctc ctctagATGA TGCCAAGAGG ACGAGGGTACCGGTACCCTT CTGGTGGTAG AAACATGCCT GACGGTCCAA TGCCAGGAGG AATGGTTCCAGTTGCTTATC ACATGAATGT AATGCCGTAT AGTCAGCCTA TGTCCGCTGG TCAATTGGCTACTTCCCTTG CTAATGCTAC ACCTGCTCAA CAGAGAACAg taagtctctc tcaatacctcttgacttgct gctatgtagg agaaaaaata agattactta cattcgatat gtttgttttggggtttttgt agCTTCTTGG TGAGAGTCTA TATCCATTAG TGGACCAGAT AGAGAGTGAGCACGCTGCGA AAGTGACTGG TATGCTTCTG GAAATGGATC AGACCGAGGT TTTGCATCTGCTCGAGTCAC CAGAGGCTCT AAATGCCAAA GTTTCAGAGG CATTAGATGT GTTGAGAAACGTGAATCAGC CATCTTCACA GGGAAGTGAA GGCAACAAAA GTGGAAGTCC AAGTGATCTCTTGGCTTCAC TTTCCATCAA TGATCATTTA TGAgaagctt ttgttcgagt ttttttttttactttgactc tcttcctctc 
tatctctctc tctgattgac aaatttttgc gggaatctatttgctgtttt agactttttt tgctcgatat gattgtttct gttttgactt cttacttttttgggttgact taaaaaagga tggttttatt ttattttgtt ggattatatt ttactgttgcaaaattttgc gctcagttta aaacttttta tgattgattt aagtttttag ttatttgttggtaattgtca attttgaacg agaaggtgat gaaattagga tatgtatagt tcattagctaattaatccaa ttttagtttt tcacaaatat taacaactga ttataaatgt atcattttttgtgattacca attttcataa ttctaaacca atagtaaatt actttgtagt aaaatcaacacaaactcatg gaccatgact cgtaaagaag ataaaaacaa gtggtacatt tat13atatcaacat caaacaatat tatagcaaag ataatgtgat tatttggtta ttgtaattgaaattaatcca tataccaatt cattttgttt tgttatatat atcgagaggt tattgtgatttaaaaaaaaa aaatatttaa tcatctaccc agtaaaacta cgccacataa ccaccacaataactctaaga gcacttctta ccttgaaacg tctcttactt aaattaataa ttaaatctttaatttttatc atttattaac ctaagaaaca gctaataaat atttattaat ctaagagacttacacgtctc tctttcttat aacatatcaa catcaaacaa tattatagca aagataatgtgattatttag ttattgaaat tgaaattatc cacacaccaa ttcattttgt tttgttatatatatcgagag gcctaagaca acacttacac gtctatcttt ctttcctttg tataccaaaaaatataaaat aaaaaacact ATGGCGGAAA ACTACGACCG TGCCAGTGAG TTAAAAGCATTCGACGAGAT GAAGATTGGC GTGAAAGGAC TCGTCGACGC CGGAGTCACA AAAGTCCCGCGCATTTTCCA TAACCCGCAT GTTAACGTAG CAAACCCTAA GCCTACATCG ACGGTGGTGATGATTCCAAC AATCGATCTA GGTGGCGTGT TCGAATCCAC GGTCGTGCGA GAGAGTGTAGTTGCGAAGGT TAAAGACGCA ATGGAGAAGT TTGGATTTTT CCAGGCGATT AACCATGGGGTTCCACTTGA TGTGATGGAG AAGATGATAA ATGGTATTCG TCGGTTTCAC GACCAAGATCCAGAAGTGAG GAAAATGTTC TATACCCGAG ACAAAACCAA AAAGCTTAAA TATCACTCTAATGCTGATCT CTATGAGTCT CCTGCTGCGA GTTGGAGAGA TACCTTAAGT TGTGTCATGGCTCCTGATGT TCCAAAAGCA CAGGACTTAC CTGAGGTTTG TGGgtaagaa tacatttctttaatttattt ctaatctaag aagaaacaag actagtttaa actttgattt gatattattgatgtggtttg aaaattggtt ggtgtgaata ttgttagGGA GATCATGTTG GAGTACTCAAAGGAAGTGAT GAAGTTAGCG GAGTTAATGT TTGAAATTTT ATCAGAAGCT TTAGGGTTGAGTCCTAACCA CCTCAAAGAA ATGGATTGCG CAAAAGGTTT ATGGATGCTC TGTCATTGTTTTCCACCCTG TCCTGAGCCA AACCGAACAT TCGGCGGCGC TCAGCACACA GACAGATCTTTCCTTACTAT TCTTCTTAAC GACAACAATG GAGGACTTCA AGTTCTCTAC GATGGATACTGGATCGATGT 
TCCTCCTAAT CCCGAAGCAC TTATCTTTAA CGTAGGAGAT TTCCTCCAGgcaagtcgttg tttactcttg aattgaatgg tctataaaaa cccataagtc acaaaaagtaagtctttttt tttttttttg cagCTTATCT CGAATGACAA GTTTGTAAGC ATGGAGCATAGAATTTTGGC AAATGGAGGT GAAGAGCCGC GCATTTCGGT CGCTTGTTTC TTTGTGCATACTTTTACTTC ACCAAGTTCG AGAGTATATG GACCCATTAA AGAGCTTCTG TCTGAGCTAAACCCTCCAAA ATACAGAGAC ACCACCTCGG AATCCTCCAA TCACTATGTG GCTAGAAAACCTAATGGGAA TTCTTCGTTG GACCATTTAA GGATCTGAaa cttgaaccta tatctcagaggttttcttga gtttccaata aaatttggtg cacgctgtga cgtaccatgt tcaagaccttgaacgtatca ttcaataatt cttccgttgt gagtttcggc tgcatgtttg acccaaaccagagagagtat ggatcaatca aggagagtga acctaaaaat aaaaaaaaaa taaaaaaaagagtgtgaacc tttaattatg taaaatctta aataaacatc gagattgtat ttaaggattttccatttgtt ataatctcaa tttaccttta atatgaggtt tatattcttt cttataacatatcaacatca aacaatatta tagcaaagat aatgtgatta tttagttatt gaaattgaaattatccacac accaattcat tttgttttgt tatatatatc gagaggccta agacaacactttggcgtcta tctttctttc ctttgtatac caaatgtttg attttgttat ttaaatca14acgtacgatg cctgagctgc gtagcaacgc acgcagagat cgggataaga agaacccgaagcagaaccca attgctttga aacaatcacc tgttaggaga aatccgaggc ggcagctgaagaagaaagtg gtggtgaagg aagcgatcgt tgcagctgaa aagacgacgc ctttggtgaaagaggaagaa gaacagatta gggtttcgag tgaagataag aagatggatg agaacgacagtggtggtcaa gcagctccag tgcctgatga tgaaggaaac gctcctccac ttcctgaaaaggtgtcaact ttattgttgg ttttgttgtt tttatgaggt tttagttcat cggaattgtctcttgcattg tgtgttgtgt tttttgatta ggagaaagct ctcaaactta ggcatgccacttaaagttaa aactttctct tgtaggatga tttgattatt gactccttgg tttttacaggttcaggttgg taattcaccc ATGTACAAGT TAGATAGAAA GCTAGGCAAA GGTGGTTTTGGACAAGTTTA TGTTGGTCGA AAGATGGGCA CGAGTACTTC TAATGCTAGA TTTGGCCCGGGAGCTTTGGA Ggtatgctgt ttgtgtttgc aagtttactt gctttctttt ggttttctgtgatctgtaat gtgattttga tgtgtccact tttgtagGTG GCTTTGAAGT TTGAGCATAGAACCAGCAAA GGATGTAACT ATGGGCCACC GTATGAGTGG CAAGTTTACA Agtgagcgttatggtctctt gtctttggct ctaggattca tcttctgctt gttcaaatag tttgtttataaaaggatgag ataactaatg atgctttatc atctgttcgt ccagTGCACT TGGTGGCAGTCATGGTGTGC CACGAGTTCA TTTTAAGGGT CGGCAGGGCG ATTTTTACGT 
GATGgtatgtggaatttagt caggtctgaa caagagcact tgcagtatga tgaattactg tttttaatctttcatacagG TTATGGATAT CCTTGGGCCT AGCTTATGGG ATGTTTGGAA TAGTACCACCCAGGCgtaaa cattcactct gagaaacatt tactttattt tgtagcatct gaagattttgttatatgaac cattgataaa cataattttt cctgagatga gcccttcaat attggtggcactcaccatat gatttgtgtg ttttatacat tccagGATGT CAACAGAGAT GGTTGCATGCATTGCAATTG AGGCAATATC CATATTAGAA AAGATGCATT CTAGAGGgta attttctaatatttctgcta ctgtaactct ctttcttcaa gtggttttta tttgctaaga agcagtgctcctgtttctac agATATGTGC ATGGCGATGT AAAACCAGAG AATTTTCTGC TTGGGCCTCCTGGAACTCCT GAAGAGAAAA AACTTTTCCT TGTAGACCTC GGCTTAGgta cactttatttttgttataag agtgagcgta ctttattgtc tttctgctgc ttatccaatc tgttgatcttgcagCATCCA AATGGCGAGA TACTGCAACT GGACTACATG TTGAATATGA CCAGCGTCCTGATGTTTTTA Ggtaagttga ttcagctagg cataaagcct gtgagattga ttcttatcagggacttcaac tttagggtac ttattaacgt gttggctttt tcattttcag AGGAACAGTACGTTATGCTA GTGTACATGC TCATCTTGGC AGAACTTGCA GTCGGAGGGA TGACCTGGAATCTCTTGCTT ACACTCTTGT TTTCCTTCTT CGAGGCCGGC TTCCATGGCA AGGGTACCAGGTTGGGGACA CTAAAgttat ttgttttatt tcctggcaac tttccttgtc aatcattaacttggtctatt tgttagggag agAACAAAGG TTTCCTTGTT TGCAAGAAGA AGATGGCCACTTCCCCAGAA ACTCTTTGCT GCTTCTGTCC CCAACCTTTT CGTCAGTTTG TCGAGTATGTGGTCAATTTG AAGTTTGATG AGGAGCCTGA TTATGCTAAA TATGTCTCCC TTTTTGATGGAATAGTCGGC CCAAACCCAG ACATTAGGCC AATAAATACT GAGGGTGCAC AGAAGGTGATTTGGTGAtct tctttatgaa acatatattg aggtttacta tttagctccg gtctgaatgtctaaagtttt ttcgtgtttg tctggtgtga agctcataca tcaagtgggt caaaagagggggaggctgac aatggacgag gaggatgaac aaccaacaaa gaagatcaga ttgggcatgccagcaacaca atggatcagc atttacagtg ctcacagacc aatgaaacaa cggtgacatcttggatcata cttgagaatt cttcggctgt acgttgatga ccatgcagct gacatgtcttttatctttgt gcagatatca ttataatgtt actgatacaa ggcttgcaca acacattgaaaaaggaaatg aggatgggtt atttatcagc agtgtggctt cttgcacgga tctctgggctttgatcatgg atgcaggaag tggctttacg gatcaagttt accagttatc accaagctttctccacaagg tagcttcatt taatatt15tgtctaactg catgtctatc atgtacatta agatcaagac taatataaaa ctcacaaatcaatatactac ttaagaaaaa gaaaaaaatc tggttctttt ttattcatgc 
acacacatagtataagttaa aaaatgacca tattaatttg taaactgacc aatcgtgtat ataaaaggacaccttctcta cctacttata tattatacat catttctcta cattgttcac cagctctctccatctctcta ctccaagcat aagaggtaat ctctcaatag tttgaaacaa ccttttgtaaaacgtattgt aacttactta aaattgtaga acgtgagaaa tatcttaaat gtttaaagtcttcctttttc acccaagaac tgaaaatgat tttgcatata tattttctca agtgggtataatggatataa agaaattata caatgactaa ggaacaaaat aaaatctctt ttattgaataatgatttgaa tcagttctcg ATGGCCCAAA GGTTGGAGGC AAAAGGCGGA AAGGGAGGGAATCAATGGGA TGATGGAGCC GACCATGAAA ATGTAACAAA GATACATGTA CGAGGTGGTCTTGAAGGAAT CCAATTCATC AAGTTTGAGT ATGTCAAAGC TGGACAAACA GTTGTTGGACCAATTCATGG TGTCTCGGGT AAAGGTTTCA CACAAACGgt aagcatgtta aatatagaactacctgaact cttttttttt gaagatataa ggttgtatcc tggattgaat gtttagaaaatttgaacaca gaaactaatc ggttgtgaag gtgatatgat gttaatagct agatgtacatgtatatcctt actatatata tcagaacttt ttagttggtc aacttttaat gatcggtgcttaaattttat taattaatcg agtctccata attgttttaa attatccccc acagcttatatattactgat caagttttaa tattcttttt tttttcttac agTTTGAGAT TAATCATCTCAATGGCGAAC ATGTGGTGTC AGTAAAAGGT TGCTATGATA ACATATCCGG TGTGATCCAAGCACTTCAAT TCGAAACCAA TCAAAGGAGT TCTGAAGTCA TGGGATACGA TGACACTGGCACTAAGTTTA CACTTGAAAT CAGTGGAAAC AAAATCACTG GGTTCCATGG ATCTGCTGACGCAAACCTAA AATCTCTTGG AGCTTATTTC ACACCACCTC CTCCTATTAA ACAGGAATACCAAGGTGGTA CTGGAGGCAG CCCATGGGAC CATGGTATTT ACACCGGCAT AAGAAAAGTCTATGTTACAT TTAGTCCCGT TAGCATATCG CATATCAAGG TCGACTACGA CAAAGATGGAAAAGTGGAAA CGCGTCAAGA CGGGGACATG CTTGGAGAAA ATAGGGTCCA AGGACAACCAAACGAGgttc tagttttaac actccttact tcttattatt ttagtttttt ttggtaaaatgctaaatctt taatagaaag gaatatgtca agagtaaatc atatatggga agaatcataaaccattcgtt aacccttcaa ttttttaaaa tatataaatt gaaggatccc tttatttgttttttgcagTT TGTAGTGGAC TATCCATATG AATATATTAC ATCAATAGAA GTGACCTGTGACAAAGTCTC TGGCAATACA AACCGAGTTA GGTCGTTGAG TTTCAAGACA TCAAAAGACAGAACATCTCC TACATATGGA CGTAAGAGCG AGCGAACTTT CGTGTTTGAG AGCAAAGGTAGGGCTCTTGT TGGGCTCCAT GGAAGGTGTT GTTGGCCTAT TGATGCTCTA GGTGCACATTTTGGTGCGCC TCCTATTCCT CCACCTCCTC CCACGGAGAA ACTACAAGGA TCAGGTGGTGACGGAGGAAA ATCATGGGAC GATGGAGCTT 
TCGACGGTGT GAGAAAGATA TACGTGGGACAAGGTGAGAA TGGTATCGCA TCTGTCAAGT TTGTGTATGA CAAGAACAAC CAGTTGGTACTAGGAGAAGA GCATGGAAAG CATACTTTGC TTGGATACGA AGAGgtgatt aattatactatacttcgttg ctattttctt aaactataac tataaagttg tgttattgtt attctgatgaaccgctttca cagTTCGAGT TGGACTATCC GAGTGAATAC ATCACAGCGG TAGAGGGTTATTATGATAAA GTGTTTGGTA GTGAATCTTC AGTAATAGTC ATGCTTAAGT TCAAGACCAATAAACGAACC TCCCCGCCTT ATGGAATGGA TGCTGGCGTT AGCTTCATAC TCGGGAAGGAAGGTCACAAA GTGGTAGGGT TCCATGGAAA AGCTAGTCCC GAGCTCTATC AGATTGGGGTCACTGTTGCC CCAATCACCA AGTGAcgacg tccttgaact ttattctcaa atcaagtttgatcatgcata tttgttaagg cgcctctctc gtattgtctc caccactttt ctacgtgttttgttttctcc gatgttttac tttgaaaaat ctatttcaat caagcaatat cgtgtaataaaagcaaggtt ctcgaacctg cgggtaaact ttttattttg aataatttat tttcaatcaagcattctttt gactttttgc tttaaccaaa tgtctctagt ttcaaaaaag attaagaactcaaagatata agaattactt tcttattaag cttactttct tattaagctt aggaaaattactcaaaacgt aaacaatctc aaagtcttaa tttctctaaa ctcatatagt caaccacagcttgggactca tatatataga gattaataaa ccaaaacata ctaggattag cattagataactcctaacat atatctttag atatctccta aagatttaac ataat16ttctaaggaa atgttttgtt aatatgaatt cattaactgc aacctaaaga aaagtttgtgaataactcag cgtgacctaa tcctacaaaa aaagtataat gttccactca gagtcactggtcaaaaagta ttaattcttt aaaagaacct ctttttgtgt tgtataatga actagtttggttataaactt ataacttaaa gggacatggt tgttgactta aacttaggta gaattgttttttatatagaa atggagcaag tcgatcttaa atgttagatc ataaataaac ttctcatgaaacctaaaaga aaaaatatat aaacacccaa acccattcca ttcacttcaa caactcaattacaattatgc ttatatatct tacatgcaaa acttcatcat tatcatcatc atctctagctcctcctttga atcttttcca aattcaactt ccgaaagaga taaccctaat ttctagtcttcttcttctaa attttcttcc ATGGATATCG AAAAGGCAGG GAGCAGAAGA GAAGAAGAAGAACCCATTGT TCAAAGGCCA AAGCTAGACA AAGGCAAAGG AAAGGCTCAT GTATTTGCTCCTCCTATGAA CTACAACCGG ATCATGGACA AACACAAGCA AGAAAAGATG AGCCCTGCCGGGTGGAAAAG AGGTGTAGCA ATCTTCGATT TTGTTCTTAG ACTCATCGCA GCAATCACAGCTATGGCTGC TGCAGCAAAG ATGGCGACAA CGGAAGAGAC TCTTCCTTTC TTCACTCAGTTCTTGCAGTT CCAAGCTGAC TACACTGATC TACCAACTAT GTCgtaagtt tctctccaaatgttactctt actataggtt 
atgccaagaa tgtagtaacc aactatggaa atgaaaccccaaatgtgtat agtcgtacta tagataatac caagactgct acgtagctta acccgttgaatccaaccaaa gccaggctag ttgcaaagtt caagcagtag ttagagagaa aaaatgagctacgttttaaa taagggggga aaaaaaacta tcaacatgaa tttcgagcaa tgtgcttggtgcttattagg gatttaatta tggtacatga ttttcaatta tataaagatt caaacttatatcattttttt ttattgtttt gttttgcagA TCTTTTGTGA TAGTAAACTC AATCGTGGGTGGCTACCTAA CCCTCTCATT GCCTTTTTCT ATAGTCTGTA TCCTCCGCCC CCTCGCGGTGCCGCCTAGGC TATTCCTGAT CTTATGTGAT ACGgtaacat ttataaaaaa aatttgaaaataaatagtta taataatgca atgccaaaca tacaaatgaa atttctcatt ttgtttgtggtttaacaatg aaacttttcg tagctttaaa aaaaagtaca aacgcaaacg ctaaaataagtcaaggcttt acttaagctc gagtaatcct tatattggtc acaaattaca atgaatatgtttgttgagta aacatatgac aaatccctct aactagttcg tacggttgtg ttggtccagGTGATGATGGG CCTCACCCTC ATCCCCCCAT CCCCTTCCCC ACCCATACTT TACTTGGCGCACAACGGGAA TTCAAGCTCG AACTGGCTTC CGGTTTGCCA GCAGTTTGGT GACTTTTGCCAAGGAACGAG CGGTGCCGTG GTGGCATCCT TTATTGCTGC GACTCTTGTC ATGTTCCTCGTCATCCTATC TGCATTTGCT CTCAAGAGAA CAACCTGAaa acttggattg atcctcttgattaaattttt atgtgctttg atattcattt gtgtgaattt ttattaaaag gttcctatgtataatttggt tttgttgtgt ttggtaactc gggttttagt gtggaaaaat gttgtaaatcaatcttctat attcacatat tgttttcttt ttccctatat aattttcgtt tcaaagataacaaattttaa acttatatct gcccggccat aattttaatt aaattagtaa gggtgttaagttgatgtaat atcacatgat tttaaatatc taagtaacta actaattatatatcattata tttatatatt tgactaggtg gggctcaatt ggctccaaag aattttgtttgcatgcttaa ttattttgta tttggtggat gatttgattt gaaatgataa aagtttaatccattgtcctt ccacctcttc tagcatttga tattttctcc tattaattgt ttaatatg17ttgtaataag taaattcggc cacctagttc tccggtgaaa gaaagaagaa gacacaaatggagctccgtg acgtggaaaa acattattag gcccaaaacc ctctgactta aaaaagacttgataattgaa taaatagttt aatgtcgttg acataaacgt aagccgtctt agctcagtggtagagcgcgt ggcttttaac cacgtggccg tgggttcgat ccccacagac ggcgttttcgtattccgaca taggttgtct tttttgctgc ttttctttaa ctgaaatatt ccgaccaattttttccagct gataagccca acggacaatg tgtaatattg cgattttata taaaagttttgggccttttg attttccttg caataattaa cactcggtct tctccaacct aacaattattctagggtttt 
agagtttccg cacgaatcac gaatctctct ctctttcaca cacttcacactttcaatata cactctcatt ATGACTACCG AAGAGAAAGA GATCCTCGCC GCCAAATTGGAAGAACAGAA GATCGATgta attgattact cttttattct ttacctatct atcatctctgtttatttgtt gttatttgtc ttttagtctg gaaatcatta gactgaattc agagttttttaatctgttcc tgcccagatc tttgcttttg ttttgttttg tatatgcaaa tattggaccttattataaga ctttagatct gaatttacat gtaattaacc tttgtggatt ctctcattttcccaattagt tcaattattg atgatttgtt gtagCTCGAT AAGCCCGAAG TTGAGGACGATGATGATAAC GAAGACGATG ACTCTGATGA CGATGATAAG GATGATGACG AGGCTGATGgtaaaagcttt ctacatttca ttcatcaaat tactggaata attagtatag ttcctagtatttctgttagc ttacatctgg ggcagatttg ttgatgctca cgtgtatgtg tagatatgtagcaatgataa ttatatggcc atagcttgaa aatttagtga aaatgaatcc atcttctttgttttcaaata atctttgcgt tgacttgtgt tgatagacat gtttgtggaa cttaatgttatcatctattt tattcttgtt gattggtgat tggaaaacag GACTAGATGG AGAGGCAGGAGGTAAGTCAA AACAAAGCAG AAGTGAGAAG AAGAGTCGCA AAGCCATGCT CAAGCTTGGCATGAAACCCA TCACTGGTGT TAGCCGAGTC ACCGTCAAAA AGAGCAAGAA Tgtttgtgttttctctttaa tattcagtca atcttaattt cttttattca cacatcaggc tttaatattgatctgttttg gggacatttg ctttggaaca cagATCTTGT TTGTCATATC AAAGCCTGATGTGTTCAAGA GTCCAGCATC AGACACATAT GTGATCTTTG GAGAGGCGAA GATCGAGGATTTGAGCTCTC AGATCCAGTC GCAAGCAGCA GAGCAATTCA AGGCACCAGA TCTCAGCAATGTGATCTCAA AGGGTGAGTC ATCGAGCGCT GCAGTGGTTC AGGATGATGA GGAGGTTGACGAGGAAGGTG TTGAGCCAAA GGACATTGAG TTGGTGATGA CTCAAGCAGG AGTGTCTAGGCCAAATGCTG TGAAGGCTCT CAAGGCTGCA GATGGAGATA TTGTCTCTGC CATCATGGAGCTTACCACCT AAaccaaagt cttttctact tagatgtggt ttaacctgag ttatgtgccagagattgtcc aaagaattcg gaaatttttg gtttcaatgt ttttcatgaa gtgattttcgatgttgtatc agtataaacc tcataagttt ttgattttca gtttgatttt atattgaatatcaagtccaa gtgtttacca ttatagactt gtagttataa tttgtcaagt atcagtctgtttaatgaacc gaacccaaag gatatggaca ccccttcact ccaaccaata cgaggtatcaactgaggtta atcgatacat gcagtacaat gtacaaagtg ctacaagtgg aggttcatagactagaaaag tattcaacag gacctgattc taagagaaat tgttataaag ccgatgtttattacctaact cctcaaggaa ggaggctagg gagttgcaag gaaggagctg gttttatccaagactacgaa agattcaaag gcacactgat ga18tcgatctgtg ttttgatttc 
tcgatcttga atctgttgga tcttgaatcc agtgagctgattttgagtct tgttcagata tatttgatat tgcctagatt cagtttcggg tttctcaatatatttctcga ttgttaggtt tctatattga ttcaaatcga ttcatttgtg gcgagtttgattgatttgag aatgtttgct ttccactatt ctaatggtta attgtgtaat tctttgcttccttgactcac cttgtttgta gaagctacag atctgttgca gaaactatcc ttggactcgccagcaaaagc ttcagagatc cctgagccta acaagaaggt gatttgcaga ttgaattttggttttctgtt gtcacaacct ttgcttcttc cagttttttt taacgctttt gttttgtgtcttgtgtagac tgccgtctac cagtatggag gcgttgatgt tcatggtcaa gttccttcttatgatcgatc tttgacacca ATGCTTCCCA GTGATGCTGC TGACCCTTCA GTTTGCTATGTTCCTAATCC TTACAATCCC TACCAGTATT ACAATGgtag cttcatcctc aaatcatttacaatctagaa acattatttc actaaattgt caccactggt ttaacaagtt tttcgttttgtaacttttca gTATATGGGA GTGGTCAAGA GTGGACTGAC TACCCAGCTT ACACAAATCCTGAGGGTGTT GACATGAATT CTgtaagtgt gtgctgacta gttataatag tgcctttcatcgtctttata ttttctttgc ttaacaggtt caatatttta ccagGGAATT TATGGAGAGAATGGGACTGT TGTGTATCCT CAGGGTTATG GGTATGCAGC GTATCCTTAC TCGCCAGCAACTAGCCCTGC TCCACAGCTT GGCGGGGAAG GGCAGTTGTA CGGTGCTCAG CAGTATCAGTATCCTAACTA TTTTCCAAAC AGTGGACCGT ATGCTTCATC TGTGGCTACA CCTACCCAGCCGGATCTCTC TGCAAACAAA CCTGCTGGTG TGAAGACACT ACCTGCGGAT AGCAATAATGTTGCTTCTGC TGCTGGTATC ACAAAAGGAA GTAATGGATC AGCTCCAGTG AAACCAACTAACCAGGCTAC CCTTAACACC TCAAGTAATT TGTATGGTAT GGGTGCTCCA GGAGGAGGTTTGGCTGCTGG TTATCAGGAC CCCAGGTATG CCTATGAAGG GTATTATGCT CCTGTGCCGTGGCACGATGG CTCTAAGTAC TCTGATGTGC AGAGACCTGT TTCTGGTAGT GGAGTTGCATCCTCCTATTC TAAGTCTAGC ACAGTACCTT CATCGAGGAA TCAAAACTAC CGCTCAAATTCTCACTACAC Ggtatgatgt ctttccaaac ttctttttgc taatgaacac cattgtctgctttactggca tatatatata gccgctcaag tcttccaaat ttgttaactg accttcaatcaacttttttc tttgcagAGC GTGCACCAGC CTTCATCAGT GACTGGCTAT GGTACAGCTCAGGGGTACTA CAACAGGATG TATCAGAACA AGTTATATGG TCAGTATGGT AGCACAGGGAGATCTGCTTT GGGTTATGCT TCATCTGGGT ATGATTCAAC AACAAATGGA AGAGGATGGGCGGCCACAGA CAACAAATAC AGAAGCTGGG GCAGGGGTAA CAGTTACTAT TACGGAAATGAGAACAATGT AGATGGTTTG AATGAACTTA ACAGGGGACC TAGAGCTAAG GGCACAAAGAACCAGAAGGG AAATCTAGAT GATAGCTTAG AGGTTAAGGA GCAGACTGGA 
GAATCAAATGTAACTGAGGT TGGGGAGGCG GATAACACAT GTGTTGTTCC TGACAGAGAA CAGTACAATAAAGAAGATTT CCCAGTGGAT TATGCAAATG CCATGTTCTT TATCATCAAG TCATACAGTGAAGATGATGT GCACAAGAGC ATTAAATATA ATGTTTGGGC TAGCACACCA AATGGAAACAAGAAGCTTGC TGCAGCATAC CAGGAAGCTC AACAGAAAGC TGGCGGCTGT CCCATCTTTCTGTTTTTCTC Ggtgtgtata taatcctgaa attaaaaact gtgctctttt tactttgttttatgatattg ttctttatac tccagttttt gtctttcagG TCAATGCAAG TGGACAATTTGTTGGTCTTG CTGAAATGAC AGGACCAGTT GATTTCAACA CAAATGTGGA GTACTGGCAGCAAGATAAGT GGACCGGCTC TTTCCCCCTC AAGTGGCATA TTGTGAAGGA TGTGCCAAACAGTTTACTGA AGCATATTAC TTTAGAGAAC AATGAGAACA AACCTGTTAC CAACAGCAGAGACACACAAG AGgtaaatat ttgtgacatc ttttggcttg ttttactgat tactccacgagcgtttttgt tttcttgtgc ctaactttct ttgtttggat catattagGT TAAGTTGGAGCAAGGTTTGA AGATTGTGAA AATTTTCAAG GAGCATAGCA GCAAGACTTG CATTTTGGATGATTTCTCAT TCTACGAGGT TCGACAGAAG ACTATCTTGG AGAAGAAAGC CAAGCAAACCCAGAAACAGg taagaactag aaaacaattt cagaaatctt tttcattcag tatatatataacttgagtgt ttctaatgta ttaaagctta acagGTAAGC GAGGAGAAGG TAACCGATGAAAAGAAGGAA TCTGCAACTG CAGAGTCAGC GAGCAAGGAA TCTCCTGCAG CTGTTCAAACGTCCAGTGAT GTTAAGGTTG CTGAGAATGG GTCTGTTGCT AAACCAGTCA CAGGCGATGTGGTGGCAAAT GGTTGCTAAc taagaggatg gtgtcgctca cggcatgggc ataaaactgactagagatga agatatgaac aatcccgttt aacgtttctc ttgagaagaa gattgccgtgagccttgaag catggaagga gctttagtac ctgagacgga tccgtttctt tgcccttagaagtttaaatc ccagttattt ttttttcaat cttttcttgt tttcattttt ccttttcttcaaaatcgcag tctcgttaca agtttatgtt gggtttcttt ttcattttct gttgttcctaccctgtaaaa atgcgcatag gacctactaa atcgtgggaa gaattagaga aaaggagataaaagcagggt gggattttgt tttttcatgt ctgttggatt tttaggcaga gttttcttttcttttggttt cttgctttgg tttcagactt gactctcttg agtcgtttag aatttgagatggtcttttgc ctctctcgtc ttgtttctgt cattctcca19gagcgaggtc ttgtgtccag tttatgtttg aatcggtgat caaaacacaa tcctaaacagtgttagttaa tttaaaagct tcaatagcga aagacttact ttttgttttt ggtttctacacttttataag tttactaatg cagaacttga tgaagctttt ttctgaattc attgattagtgaatatcatt atcttgttat tatcgtagac aaattgatat gagatcctta attatgataccaaataaaaa ccaccactaa agtgaaagaa aaaacaaagt caaagtaata 
tacaatatcatacaaatatc tgcaaaacgt ggaggaaaag aaaaatcgaa taattcgatg attctctctatcaaagaaac gaaaaagtcg tattgaagtt ttgccatttg tttataaaag aagtggctgttcaacgattc taaagtcatt tactttacca ttttgatctg ttgctctgtt tcactgtgcgtgatcgggaa gaagaagaaa ATGTTGGCGA TTTTCGACAA GAACGTGGCG AAAACACCCGAGGCTCTTCA GGGTCAAGAG GGTGGATCGG TTTGTGCTCT TAAAGATAGG TTCTTGCCGAACCATTTCTC CTCTGTTTAT CCTGGTGCTG TCACCATCAA TCTCGGATCT TCTGGTTTCATTGCTTGCTC TCTCGAGAAA CAGAACCCTC TTCTTCCCAG gttttgtaca atagtttattcctcaggatg atgttttctt cttctgtcct agatatgaga gatttgctat cttaatgtttcactggcttg caaagatagt ttaggatatg tttcactgaa tctgagagat tgagatatcgatctgttgtt atgttttgat ggaataatga agttatatat ctactttgtt gtgatgtttaaaatgtgttg aaactggaag gatgtgatta gataagtggt ggtgattttt tcaaaacaattttgtgtgtg tgacagATTG TTTGCTGTGG TGGATGATAT GTTCTGCATA TTCCAAGGACATATAGAGAA CGTTCCAATT CTTAAGCAAC AATATGGACT AACCAAAACA GCTACAGAGGTTACCATTGT GATTGAAGCC TACAGAACTC TAAGAGATCG TGGTCCGTAT TCAGCTGAACAAGTTGTTAG AGATTTTCAA GGCAAATTCG GGTTTATGCT CTATGACTGC TCCACACAAAATGTCTTCCT TGCCGGGgta agtttgaatt ctgcttcttt actatttgac acttatttctgcatattgta atgctgaggt tattattatt atacgcgttt cagGATGTAG ATGGGAGTGTTCCTCTCTAC TGGGGAACCG ATGCTGAAGG ACATCTTGTT GTTTCTGATG ATGTTGAGACTGTCAAGAAG GGTTGTGGTA AATCCTTTGC GCCATTCCCT AAAGgtatgt agcaagccgtttttcgggtt ttgaagacat ctcactgttc tttgatctag tgcaaatatg aattaggatgtggttgtgtg tatgcataat gcagGATGTT TCTTTACCTC ATCTGGAGGT TTGAGGAGCTATGAGCATCC ATCAAATGAG TTAAAGCCGG TACCAAGGGT AGACAGTTCG GGTGAGGTTTGCGGTGTAAC GTTTAAAGTG GATTCTGAGG CCAAGAAAGA AGCGATGCCT AGGGTTGGGAGTGTTCAGAA TTGGTCTAAA CAAATCTGAa ctagctgaaa aaggcttgtt ttattttttacttgttggac tcctgtggct gtgttccaca gatttactct tttcctgata ttctcactgtagccattcta aggactaatg gtgctcttat tgctattgta cctgtacttg gtaacaaggaagctaagaat aaaatatttt ataaacgtct aatgattcca gtgtatgcat atgatgtcatattgataaaa ccagagctgc aagaacatga gctccaacaa taacaattca taaacaacctttggtaacaa aacaaaacct aaaactgtaa tgaaacataa tgacaggtct tagactcttagtaagagcct aaggttaaca ctgcctgcag atttctccac attctcttta cgcagaaacgcctcgggtaa gacttgagcc atccattttc 
agacctctgt tgtctgatgc tgctgctgcatagtcctgac tgttttccct tctccttgaa gtcatatcaa ggccaacac20ttgcttaaca ctcttaaatt attctcaagg aatctttcga ttgtgttctt aggattcaattagtaataga cttgagtgtg tttgacatat ctattgggct tcgattgttt gttgcgtttacatgttataa taggttttta tttcttggtt caaacgaaac caaaacttaa aagtaaatcatttttttcta ctgaattttg tttttgatgc ttttgatttc atttgatcac ttcaactttagttccagggt cttgacgatt taattcaaaa agcaaaaaaa tcaataggaa acaaaaactcataaaggact ttgacataca gatgggccca ttgtttatga ccaatcctta tactatatatgggccttatt agttaaacct aaggcccaaa gtcagattag ggttttcaga aagtgtactataaattcttc ttctttaaac aacttcgtct agtggaacga cgacggcaca aaagcttcaccggagatcag agacgcgaaa ATGgtaaatt gtttcttctc tttcgatgtg attttggaatttgtaaagtt cgttgacttt gaagaacaac aatacatggt tgattgattt attgtattgtttttcagatc tatcataaaa gttttcaatc taaatgatgt ttgtattttg attaaccttaaaagtctctt gattttgtat gtgtgtgagt gattcatttt tgattttatg aattttgaagGTGAACATTC CAAAGACAAA GAACACTTAC TGTAAGAACA AGGAATGCAA AAAGCATACTTTGCACAAGG TTACCCAATA CAAGAAGGGT AAAGACAGTC TTGCTGCTCA AGGAAAGCGTCGTTATGACC GTAAACAATC TGGTTATGGT GGTCAGACTA AGCCTGTCTT CCACAAAAAGgtaacattga ttatcatgca ttgattgttt tttcagtttg aattaggtct agttagttgaaatgaggtag ttttaaggaa ccatttatag tagaattttg gaagtgagct gtgaggaaacagacattcca atagtctcag ttttggactg agatacatct tgtgaatctt gttaacagGCTAAGACCACG AAGAAGATTG TTTTGAGGCT TCAGTGTCAA AGCTGCAAGC ACTTTTCGCAGCGTCCTATC AAGgtgcata gaacatagat cagttcatta taccggattt gtaacttggtaatttgctta cattgtgttg gtttgttgtt tatttcagAG GTGCAAGCAT TTCGAGATCGGTGGTGACAA GAAGGGAAAG GGAACATCTC TGTTTTAAgt tggtttcatc ttattttctgcgatttttgt acttgctgga tttggaatcc atttgtttta gctctctcgt ataagattgtctcatctttg cttgttaact ctatattttg aatcatcaag atatggtttt gctgttaatcattgaccttc gatatttttt tgccaatccg ttctctctac caacctaaga aaaaatcactaatatctcac attagagggt gcaaaatttg gaaggtctat atcattgtcc aattttctgagtcatacaaa ttctttcata tgattcattg aacaagacac tcatttactt ataaagcgcatttatatgtt cacatgattt gtacaaaact catgagactg catcaagcag aaagtatttatttatcttta catgtcaaag ctttgagaat taagcaatga cgaataccct aagttcacctctgtccccgc gagttatgcg 
catggtatca tcaacatagg taacttcgaa atccccag21gacgccctat ctttgggttg aaaacttgag tttccttagc agcttttgtg atattttgaatcatttttat gggatatgtt tgagttattt tgtttttacg atatggtatt ggtaatacatactagttact acatagtcgt agactttcat gtttatttac aaatggatac aggtttaaaaacatttactt gcgactattt gatacacgtt agttacctgt taaaccagat taaataaaactaaaccactt gcacttgtta attgttagtg cttcgttagt tgtaaagctg agtaattttgtttccactcg agagagagaa aatggatctt atcttctttt ttttttttat catcacatcgatcgagaagc ctagagttag ggcctagggg tccactctca tattaataac ataaatgatttcttgtgtga tatagcttca ctgatttatc agatcttttt gcatttgggt cgacaaacaagaaagaagaa gaaagcttca ATGGAGAAGA GTAATGGCCT TCGAGTGATT CTGTTTCCACTTCCATTACA AGGCTGCATC AACCCCATGA TTCAGCTCGC CAAGATCCTC CACTCAAGAGGTTTCTCCAT CACTGTGATC CACACGTGCT TCAACGCGCC AAAAGCTTCA AGCCATCCTCTCTTCACCTT CTTAGAGATC CCAGATGGCT TGTCCGAAAC AGAGAAAAGA ACTAACAATACCAAACTTCT CCTAACGCTT CTCAACCGGA ACTGTGAGTC TCCGTTTCGT GAATGTTTGAGTAAACTGTT GCAGTCTGCA GATTCAGAAA CAGGGGAAGA GAAACAGAGG ATTAGCTGTTTGATCGCTGA TTCTGGATGG ATGTTCACAC AACCCATTGC TCAGAGTTTG AAACTCCCAATATTGGTCCT CAGTGTGTTT ACAGTCTCCT TCTTTCGCTG CCAATTTGTT CTTCCTAAGCTTCGGCGTGA AGTGTATCTT CCACTTCAAG gtattgttat ttcttacatt tttcgtatagaccaagcaac tcgttaacct aaaaacatat atctaaattt tctcacagAT TCAGAACAGGAGGATCTAGT TCAAGAGTTT CCGCCGCTTC GAAAGAAGGA TATTGTACGT ATTCTTGATGTAGAAACAGA TATACTAGAT CCATTCTTGG ACAAAGTTCT ACAAATGACA AAGGCGTCTTCAGGTCTTAT ATTCATGTCA TGTGAAGAGT TGGACCACGA CTCAGTGAGT CAGGCACGTGAAGATTTCAA AATTCCTATC TTTGGGATTG GACCATCTCA CAGCCACTTT CCAGCTACCTCTAGTAGCTT GTCCACACCC GACGAGACTT GCATTCCATG GTTAGACAAA CAAGAAGACAAATCCGTGAT TTACGTCAGT TACGGGAGCA TCGTGACCAT CAGCGAATCA GATTTAATAGAGATTGCTTG GGGTCTAAGA AACAGCGACC AACCCTTCTT GTTGGTCGTA CGGGTTGGTTCAGTCCGTGG CAGAGAATGG ATCGAGACAA TCCCGGAAGA GATCATGGAA AAGCTTAATGAGAAGGGAAA GATAGTGAAA TGGGCTCCGC AACAAGACGT TCTAAAGCAT CGAGCCATTGGGGGATTCCT GACACATAAT GGTTGGAGCT CGACTGTTGA GAGTGTTTGT GAAGCAGTCCCTATGATCTG TTTGCCTTTT CGTTGGGACC AAATGCTAAA TGCAAGATTT GTTAGCGATGTATGGATGGT CGGGATAAAC CTAGAGGATC GGGTTGAAAG GAATGAGATC 
GAGGGAGCGATAAGGAGATT ATTGGTGGAA CCTGAAGGAG AAGCCATCCG AGAGAGGATA GAACATCTTAAGGAGAAAGT AGGACGATCG TTTCAACAAA ACGGTTCCGC ATATCAATCG TTACAAAATTTGATTGATTA TATATCATCT TTTTAGccac tgacatgttg tttctttgtg ttttaagtttttcaaccgat aaattgtttg tgtatcagaa atttcttcct ttgtgtgttt tgtattgttagaataaaatt ttcttcgtaa gttggaattt acatatatac ttaccactta attatcagccacgttttcag caacttttta ctattatttt gcaacctact aatacaaacg catcttgtctttttatgtcc cttaactaat gaaaatcaaa tataaattag accactagtt acatgccctagagggaaaac gaatctggtc tttctttatt agcacatcat gaagagtata gttttgtctcactctcgagt aataaagaat gcgaagtgct aataaagaaa gaccagattc ggaaatttctttatgttata tatagatgtt tgttatcaaa agggaaagaa ttacaccatt cactgaaatatcaggagatt tacatttgga aagaaggtca aaaggagaaa gcttca22tattgttgat tctctatgcc gatttcgcta gatctgttta gcatgcgttg tggttttatgagaaaatctt tgttttgggg gttgcttgtt atgtgattcg atccgtgctt gttggatcgatctgagctaa ttcttaaggt ttatgtgtta gatctatgga gtttgaggat tcttctcgcttctgtcgatc tctcgctgtt atttttgttt ttttcagtga agtgaagttg tttagttcgaaatgacttcg tgtatgctcg attgatctgg ttttaatctt cgatctgtta ggtgttgatgtttacaagtg aattctagtg ttttctcttt gagatctgtg aagtttgaac ctagttttctcaataatcaa catatgaagc gatgtttgag tttcaataaa cgctgctaat cttcgaaactaagttgtgat ctgattcgtg tttacttcat gagcttatcc aattcatttc ggtttcattttacttttttt ttagtgaaaa ATGGCCGATG GTGAGGATAT TCAGCCACTT GTCTGTGACAATGGAACTGG AATGGTGAAG gtgagttaga ctgtttattt agatactgta tggttctaaccttctttgtt gtacatgtgt aagactactg atcatgattt ttgtatatta acagGCTGGTTTTGCTGGTG ATGATGCCCC GAGAGCAGTG TTCCCAAGTA TTGTTGGTCG TCCTAGGCACACTGGTGTCA TGGTTGGTAT GGGTCAGAAA GATGCTTACG TTGGTGATGA AGCTCAGTCCAAGAGAGGTA TCCTCACTCT GAAGTATCCA ATCGAACATG GTATTGTAAG TAACTGGGATGACATGGAAA AGATATGGCA TCACACTTTC TACAACGAGC TTCGTGTTGC CCCTGAGGAGCACCCAGTTC TACTCACAGA GGCACCTCTT AACCCTAAAG CTAACAGGGA GAAGATGACTCAGATCATGT TTGAGACATT CAATGTCCCT GCCATGTATG TTGCCATTCA GGCCGTTCTTTCTCTCTATG CCAGTGGTCG TACAACCGgt tagttcttaa ctctaaacat ccaagtctgagttatattat cttcttactt gtatttactt aaagtcgttc tctttttgta acagGTATTGTGCTCGATTC TGGTGATGGT GTGTCTCACA CTGTGCCAAT 
CTACGAGGGG TATGCTCTTCCTCATGCTAT CCTTCGTCTT GATCTTGCGG GTCGGGATCT CACAGACTCA CTCATGAAGATTCTCACTGA GAGAGGTTAC ATGTTCACCA CTACCGCAGA ACGGGAAATT GTCCGTGACATAAAGGAGAA ACTTGCTTAT GTCGCTCTTG ACTACGAGCA AGAGCTAGAG ACAGCCAAGAGCAGTTCTTC AGTGGAGAAG AACTACGAGC TACCTGATGG ACAAGTCATA ACCATCGGAGCTGAGAGATT CCGTTGTCCT GAGGTTCTGT TCCAGCCATC GCTCATCGGA ATGGAAGCTCCTGGAATCCA TGAAACAACT TACAACTCCA TCATGAAATG TGATGTCGAT ATCAGGAAGGATCTCTATGG AAACATCGTT CTCAGTGGTG GTTCCACCAT GTTCCCAGGA ATTGCTGACCGTATGAGCAA AGAGATCACC GCTCTTGCAC CTAGCAGCAT GAAGATCAAG GTGGTTGCACCGCCAGAGAG AAAATACAGT GTCTGGATCG GAGGATCAAT CCTTGCATCC CTCAGCACCTTCCAACAGgt aaaaatccca attccgcctc tttaaaactt tcagctccat ttatgaaacatgagtgaaaa tactgaaatt ttgttttgtt tgtgtgtgtg aatcagATGT GGATTTCAAAGAGTGAGTAC GATGAGTCAG GTCCATCGAT TGTTCACAGG AAATGCTTCT AAgtgtgtcttgtcttatct ggttcgtggt ggtgagtttg ttacaaaaaa atctattttc cctagttgagatgggaattg aactatctgt tgttatgtgg attttatttt cttttttctc tttagaaccttatggttgtg tcaagaagtc ttgtgtactt tagttttata tctctgtttt atctcttctattttctttag gatgcttgtg atgatgctgt ttttttttgt ccctaagcaa aaaaatatcatattatattt ggtccttggt tcattttttt ggtttttttt tgtcttcaca tataaatattgtttgaatgt cttcaatctt ttatttgtat gagacaatta tttaagtatc gggtgacaatgcagctatta tgtattgtcg atttggatat tggcgcccaa aatatatact tagcctaagaatttggtaag tgagtggctt atgttttact ccagcaaaaa ttgtgtgtgt attaccattctgatgcgaaa ca23aaaccatcta atctaagtct tgtctccttt atctacatat acggacaatt agatatcacatgtacgaata tacaggcaat gtgggacaaa attcaaaaaa atgtgtctaa aaggggacaagtggtcatta accttaattt aaattacggc caaatgttta gtaactaaat aaatatggggtcgaaatgta aattctaaat tatctcacaa agtggggtac agaagtgaac actaataagtcataaagaga gatttaaagg agaaacgaaa agcattaaga tttaatttat atgaaattagtgaaaaccaa ccaaaaagaa tttatatgaa attctaaggg gcaaattgcg gaacaaagattgtaaatagc aaaaggagtt tcagtataaa tatatgggga caagggccat aaaaataacaaaaacattct tagagagctt tggagataac gagaacaaga aagaaagaga agattatatacatagaaaag gagagatcaa ATGGAGTGGG AGAAATGGTA CTTAGATGCG GTTCTTGTGCCAAGTGCTTT ACTTATGATG TTTGGTTACC ACATCTATTT GTGGTATAAG GTTCGAACCGATCCTTTCTG 
CACCATTGTT GGTACAAATT CCCGCGCCCG TCGATCTTGG GTAGCAGCCATCATGAAGgt agttatatta ctcaaaaacg atatatatcc cgaaataatc tttcaaaaatcttgtgttaa gtgattgtag taactagtaa gtagtaatta ctaattaatc atcatattagcgaaagtaat tagcttcatt gaacatatat accataatgt ttactaactg caatttttctatgaaaattg cttatgcaaa aacttagtat aggtgtcggc ccaaaatttt attaagtccgtatgaataca aaataaataa atttgcatgc atatttggcc aataagagac tataaatccatacaatgtca taatatctct atgtatacat cattaacttt cttcatatat atgtacacagtatatacata gaattacttc tcaaatagta acaatatact gtgtctttgt tcagGACAACGAGAAGAAGA ACATCTTAGC GGTACAAACA CTACGAAACA CGATAATGGG AGGGACGTTAATGGCAACCA CTTGCATCCT CCTCTGCGCA GGTCTCGCTG CCGTTTTAAG CAGTACTTATAGCATCAAGA AACCTTTAAA CGACGCCGTA TATGGAGCTC ATGGTGACTT CACTGTTGCACTCAAATACG TAACCATCCT CACAATCTTC CTCTTCGCCT TCTTCTCTCA TTCTCTCTCCATTCGCTTCA TCAACCAAGT CAACATCCTT ATTAACGCTC CTCAAGAACC TTTTTCTGATGATTTCGGCG AAATAGGAAG CTTTGTGACT CCCGAGTATG TCTCTGAACT ACTCGAGAAAGCTTTCTTGC TCAATACGGT AGGTAATAGG CTGTTCTACA TGGGCTTGCC TTTGATGCTATGGATCTTTG GGCCTGTGCT TGTGTTCTTG AGCTCTGCTT TGATAATCCC TGTTCTTTATAACCTCGACT TCGTGTTTTT GTTGAGCAAT AAGGAGAAGG GTAAAGTCGA TTGCAATGGAGGTTGTGATG ACAACTTCTC GCCTTAAtta tctgttgatg ttgaattcga ataatgataaagctgtttgt tattactgat ttactagtct aaaaagtctt tcgatttact cttttcaaagcttaccaaaa aaaaaatgta ctagatccga gtcttttttt aatttttaat tttttttcctggtgaagata ttcatgatct gctatatata attagtaaaa gttccatgga tagtcaaaatggaaattaat taacaaaact atctttttta taaaattttt tattactatg ctgctaacaagtaacaatga tgcgaccatc cttagtccct tacacttgat tcgtctatta ttttttctaattcaaatgtc aattttttaa tggcacagat actcgttttc aagtcaatgg agtgatactcatctgaattg gtcgtgtctt tttcctttat attagcccta tcagcggctt taataattataacagacatt attatattga tgattattgg gatccaatga agaaagc24cattgttatt aagggaaatg aaatatctta actaaaccaa tttgttatct attgtgctctactgttctgt tcgtattgac tcgaacccac taaaccaaga cgagccctga ccgtcattgtctaaattgac tcgaacccac taaagaaaaa aagaaaaaaa aacttagata ataattggcgcagaagggcc gattaataaa aactttaggc ccattaaagt aaagcttatt gtcaaccctatccagtctcc ttgtatatat ttatttacga caccaacgcg gcgttggtga 
ttcattctcttcagtcagag atttcgaaac cctagtcgat ttcgagatcc aaccaactct gctccttatctcaggtaaaa ttctcgctcg agaactcaat tgcttatcca aagttccaat ggaagatgctttcctactga atcttaggtt aatgttttgg atttggaatc ttacccgaaa tttctctgcagcttgttgaa tttgcgaagt ATGGGAGACG CTAGAGACAA CGAAGCCTAC GAGGAGGAGCTCTTGGACTA TGAAGAAGAA GACGAGAAGG TCCCAGATTC TGGAAACAAA GTTAACGGTGAAGCCGTGAA AAAgtgagtt ttatgatttc ctcgatctgt ttcatgagat agtggatgtttaaatttagg gttttcttag attactgctt gataacaacc gactaagttc ttcaattatctatgtgtttg gttagttgct taactttatg acaattgact aagttcttca atgctaaaattcctggaacc tacccaatat tagacggtca tgtgtttatc atcttgtatt ttctctttgtgacagAGGGT ACGTGGGAAT ACACAGTTCT GGATTCAGAG ACTTCCTTTT AAAACCGGAGCTTCTCAGAG CTATTGTTGA CTCTGGATTT GAACATCCAT CTGAAGgtta ttacaatgaaatacagcgta gctttgactt ttctgccttg cctttcacca ttctattacc gaatgatattgtataattta cagaagtgac ttctccataa gatgttttag ttgtccggaa acttttaattatatgtactt cgtctagttt tgagaagata tgttggttaa agatatttta tactttatcttggtcctttg cttatcatct aactaaatta aaaaaagttt gtgttgaggt caaattcttttttatttcct gttataatgg tttttgtttt ctttgtttat taacgtttca ctgattactttttccaggta ataaacgata tttcaatcta ttggtttgga gtgagcttaa acatgtgctaaagccaccaa tttaaaagat atggaggtta tcatctactt ataaaggctt tcttcggtacaattttcttt ggttctccac cagTGCAACA TGAATGTATC CCTCAAGCTA TCTTGGGCATGGATGTCATC TGCCAAGCAA AGTCTGGTAT GGGGAAGACT GCTGTGTTTG TCCTGTCTACTCTACAACAG ATTGAACCAT CTCCTGGCCA GGTTTCTGCA CTTGTCTTGT GCCATACAAGAGAGCTAGCT TACCAGgtat gaccttcttg tttcactcag gttcttggct tatagttttgttgtacgtct tcttcctcta atgctttttg ccttgatgct gacaattact tgcagATCTGCAATGAGTTT GTGCGATTCA GTACCTATCT GCCTGATACA AAGGTTTCGG TGTTCTATGGTGGAGTCAAC ATTAAAATTC ACAAAGACTT GCTGAAGAAT GAATGTCCTC ACATTGTTGTTGGTACCCCT GGTCGGGTGC TTGCACTTGC CAGGGAGAAA GATCTCTCTT TGAAGAATGTGAGGCATTTT ATTCTTGATG AATGTGATAA AATGCTCGAG TCACTTGgta tgctgatttctgacatcatt attacatcga tccctgaata attttatgtt ttaacacttt aactttttttttaccagACA TGCGAAGGGA TGTGCAGGAG ATTTTCAAGA TGACTCCTCA TGACAAACAAGTAATGATGT TCTCAGCAAC GCTCAGCAAA GAGATACGCC CAGTCTGCAA AAAATTTATGCAAGATgtaa tgttccatgg ccaattctct 
ctccctttgc aagtcttcta gttttcaactatttttagcc ttctatgagt gatcatagca ttagttgagc gtcttctgcg gttctgccctggaaaagcgg caactgatct ctcaatgggt ctcaatccaa taatggttgg gtagtttgtagggaacgaga actgtgagtg tgagactctg tagctttggt atggtttcta tgggtgattatagcattatt tgggcatctt ctgcggttct gccctggaaa agctgcaact gatctctcgatgggtctcaa tccactaatg ctttgggtag tttgtaggga tcgagaactg tgagtgtgagcctctgtagc attggtatga atgagtgacc attgcacaac aggatcttct ttcgtcattaccttttattc agtttcaatt tctttgcaat tctagcagtg ctgggtgggt tttgggtggggtactgtgtt gtcccaaggt ttcattgtga ttgtatgggc cttaatgttc cgagcaatatcgctgtatca tagcaaaact cacatctatg aagagaacct ggtggacgag gatctcagatcaggggtttt acatccatct tcacttttgt agtgtaaatc atttcctgag aaaagcttgctaattattac ctgatatcta ttcctttcag CCAATGGAAA TATATGTCGA TGATGAAGCCAAGTTGACTC TTCATGGGCT TGTCCAGgta ctcttatctg gtgttaggtc ttcttattcaatggaaatat agtttgttgt ttgatactta aaagaccttt tactgtcata ctgtaacagCACTATATCAA ACTGAGCGAG ATGGAGAAAA CCCGGAAGTT GAATGACCTT CTTGATGCGTTGGACTTCAA TCAAGTTGTC ATTTTTGTGA AGAGCGTGAG CAGGGCTGCT GAGCTGAACAAGTTACTGGT GGAATGCAAT TTCCCCTCAA TATGCATCCA CTCTGGAATG TCTCAAGAAGAGAGgtctgt acattctctt caaaattcaa tgtttttgaa ggaccctacc tgctcttaaagccctcatgg agaggagtcc aattcttaag gctaatacga tatgttatgt agGTTGACTCGATACAAAAG TTTCAAGGAA GGGCACAAAA GGATCCTTGT GGCGACTGAC TTGGTAGGAAGAGGGATTGA CATTGAGCGT GTCAACATTG TCATCAACTA TGACATGCCA GATTCTGCTGATACCTATCT TCACAGGgta agtacataat actgaaattt attatttgat tgttgatctcactgaaaggg ctcttgtaac tttaccgttt tgctgtgtat ggtatagGTT GGCAGAGCTGGTAGATTTGG AACCAAGGGT CTTGCAATCA CATTTGTTGC ATCTGCTTCA GATTCAGAGGTTCTTAACCA Ggtatggtgt tcaatctttg taataagtcc acggaaaact cctcttgaaattgagttgga tatttagtaa agtggcaatt ataaatcttg gacagGTACA AGAGAGGTTTGAGGTTGATA TAAAGGAACT TCCGGAGCAG ATTGATACTT CAACCTACAG TAAGTGTGAAATCCCTTACC AATTGTTTGT TTAAaagctt ggttttgtct ggttgtgata ttaatgttgtttcttcttct ttctttgttc agtgccttct taaacaagta gcacgtccct caggaaagaagctcttcaga tttcaacctt gtaggtgttc aaagggtcat gggggttcac aactatctctcgctccgttt gttttagtgt tttctatgac gacatttttt tccatatgtt tagaacgtctgttgtactct 
ttaaaggaga ttcgagtcac tctccaaatc gcacagttaa aagctgtccagttttttgta caagagatta ttatgtttga aatatcagga tttagtctcg acctgattactgtgttcctt aggaatcgat ctattatcaa tttatcatgg tgttgctaag aatcgtcattcatcagcgtt acttccttca tgtgatgctt tttttttata acacatttca tttagtgtggaagagataca acacgtatat atggttactt tatatattga aaag25cttgtaagtt gttttccttt tgggatatgg gaagtgactt ctccgaccct tgcaaactaacaatggccat tacacactaa ttacaagcca aatttcctca ctaagcaacc tctcgtgtttatcataagac accgctctat ctcttattat tttattcatt gttttctaat ttcagactgattaatcatac attagagaaa gtttattaaa accatctgat gtaaaaaatc acatttatctaaattaaata aatttgttat ctagtatata actatttatt gttttaacat ttggataaattgtaagaaat tagaatgtaa aataagacag aaaatggtca actatgagca tctatcgccatcatgatata gtttcgtcgt ttgcgttccc gacctaactc aaaacttcac caaccccatttttaagcccc tttctttgtt tttatcctcc gatcgatcaa accaagaaaa aacactttcgtatttccctc gacgaaaaaa ATGGCAACCA TTTCGAATCT CGCTAATCTT CCCCGCGCCACCTGCGTCGA CTCCAAATCT TCTTCCTCTT CCTCCGTCTT ACCTAGATCC TTCGTCAATTTCCGCGCTTT GAATGCAAAG CTTTCCTCTT CTCAGCTTTC TCTTCGTTAT AACCAACGATCAATACCTTC CCTCTCgtaa gtctttatat ccatttgatg catgtctttt gtctctgtttctcgctcttg gggttcacca aaaattgaat ctttttagct ggaaacgtac cacgaatctcaaagtaacat tttttataag atggattagg aaaagcaact gtatttcccc tttttggttggtaaaagtct gatttttttg tttaatttgc agTGTGAGGT GTTCAGTGTC TGGTGGAAATGGAACTGCTG GAAAGAGAAC GACTCTTCAT GATCTATATG AGAAGGAAGG TCAGAGTCCTTGGTATGATA ATCTTTGCCG TCCAGTCACA GATCTTCTCC CGTTGATTGC TCGTGGTGTTAGAGGTGTTA CTAGCAACCC TGCGgtaatt ttatcatctc tctttgtgtg tttggttttgcttttgctct gtgtttgttc atttgtcttt acttcttcac tttttataca tttgcagATCTTCCAAAAAG CCATTTCCAC TTCAAATGCT TATAATGATC AATTCAGgta tctttttgtgattgtcttag acttgtggtt gttaacaaca tgctattaaa actttagagt tcttctttatatgaaaagtt gtctgatatg ttaatggtat acctgacatg cactattagG ACACTTGTGGAATCGGGAAA GGACATTGAA AGTGCGTATT GGGAACTTGT GGTGAAGGAT ATTCAGGATGCCTGCAAACT TTTTGAGCCA ATCTATGACC AGACAGAAGG TGCGGATGGC TATGTCTCTGTTGAAGTTTC ACCTAGGCTT GCTGATGATA CCCAAGGAAC TGTTGAAGCT GCTAAATATCTTAGCAAGGT TGTCAACCGT CGTAATGTCT ACATTAAGAT TCCTGCTACT GCTCCATGCATTCCTTCCAT 
CAGGGATGTC ATTGCAGCTG GAATAAGTGT CAATGTCACG gtaagttatcctagtatgtt tcattattca agtttcttat tgcaagtttt aaagaacttc aaaataaaataagtcataat acttcaaatt catgtattgt gtgatgatgt gctagatcac tggatttcttgggcgtttta aacctgaaac tagattagtt caagggtgtt ccaaggatgc actgatgttaccttttctaa atcgtttctc atatgttctg ttctgtttca gCTTATATTC TCAATCGCCAGATATGAAGC AGTGATCGAT GCATATTTGG ATGGCCTCGA GGCGTCTGGA CTTGATGACCTCTCAAGAGT TACCAGTGTT GCTTCCTTCT TTGTCAGTCG GGTGGATACT CTCATGGACAAGATGCTTGA GCAAATTGGT ACCCCTGAAG CCTTAGATCT CCGTGGGAAG gtaaagctctattcatcgct gagatcttac accagccact gtgagtagag tattagctta tgacacatgatatgtttact cttgcagGCG GCTGTGGCTC AAGCTGCATT AGCATACAAG CTATACCAGCAGAAATTCTC TGGCCCAAGA TGGGAAGCTC TGGTAAAGAA AGGTGCCAAG AAACAGAGACTTCTCTGGGC ATCAACAAGT GTAAAGAACC CAGCTTACTC TGACACCTTA TATGTCGCTCCTCTCATCGG ACCTGACACT gtaagtcatc tttttgtttg tgttgaagtc aataggctgtattaacgctt tggaagtata ttcatagttt ttgtgggtgt gatttagGTA TCAACCATGCCGGATCAAGC CCTGGAAGCA TTCGCAGATC ATGGAATAGT GAAGAGGACA ATAGATGCGAATGTGTCAGA AGCAGAAGGG ATTTACAGTG CACTAGAGAA GCTGGGAATA GACTGGAACAAAGTAGGAGA ACAGTTGGAA GACGAAGGAG TAGATTCCTT CAAGAAGAGT TTCGAGAGTCTGCTCGGTAC ACTGCAAGAC AAGGCCAACA CTCTCAAACT AGCCAGCCAT TGAggaaatgagtcatcatt atgtttttgg ttacgctaaa ataaaaagaa gaacctttgg cttttgttcttcaatcctta tgcatgcttt ctaaagtggt tatgatggat tttgcttgat gttccacattatgggttatt ctattttctt tgttcttgta agatgatgct tcagaagagt ttgttactttttaccgtatt tgtaatttac attttcactg aaaacaattg gcgagtaaaa aagtgtccttgtcttcttct ttgttcggat tatatgaaca attgttccta gaagcctctc tacataaaaagctgagactt tatctctcat ctctctttag acgtacaaaa aaatcagttt tttaagtttcactctaatgg cgtcaatttc gtcctttggc tgcttccctc aatccacagc gctcgccggaacttcctcca ccaccgtacg acgccgcacc atctctctgt ttcttcttct tcttcctttttattcactga atc26attaagctct catttcggga agaattacta caaaagctac taatttgacc taattcatgcacaaatttga ttacaatgaa gaaataactt acaacgttga cgagcagaga aaccttgtagccggtaattg tcggcgagag agcttctacc cttctggttg gattttttag ggttttagaatttcattttc caacaaaaga taaacaaata aaaattggaa cttgtcgtta atacagccctttaatgggtc aacgggtctt atgtctcttg aaaaagccca 
tgggccaaga caggtaaaataacaatgtca ctttcgtaat tatcgcaaag tatatgcctt gttccatcag attccatttgcccaataaag cccgagtttc gagagttaat acctcattgg tgcttttggt tttggcaaagcgtgagtgag atcgggaatc aaacatcgcc tccgtctctc atttcaaacg ctatctccatctccttcctc cgccgccgcc ATGGAATCTC CGAAGAATTC TCTGATCCCG AGCTTCCTCTATTCATCATC TTCATCTCCG AGATCTTTCC TCCTCGACCA GGTGCTCAAT TCCAACTCCAACGCTGCATT CGAGAAATCT CCTTCTCCGG CCCCGCGTTC CTCTCCTACG TCGATGATTTCTCGGAAGAA TTTCCTTATT GCATCTCCCA CCGAGCCAGG GAAGGGGATC GAGATGTATTCACCTGCCTT CTACGCTGCT TGTACCTTTG GTGGAATTCT CAGCTGTGGT CTTACTCACATGACCGTGAC TCCTCTCGAT CTCGTCAAGT GCAATATGCA Ggtatgtaac ctttagatccgttgtctttc gtttgttttc tgagctcatg tttgtggatc tgtgttcctg tgttgtttaggtagtgagat ctgtgttgct agatctgtga tttgattttc tttatcgctt tgttgttttcctgactattg gttttgtgtt tgatttcaat atctgaagaa ttgtttgatc tctgataaacgcatcttcgt ctatccattt ccatgttata tatgaatcat tctatttcaa tatacgttaatatggtctga tttctggttc ttctttcgaa atattgttac ttgacgtgtt atgtgttgaatggttcactt ggtcttgcaa aactgatata tcttgttatc cagATTGATC CAGCGAAGTACAAGAGCATC TCGTCTGGTT TTGGAATTTT GCTGAAAGAG CAAGGAGTCA AAGGCTTTTTCCGTGGATGG GTTCCTACTC TTTTGGGTTA CAGTGCTCAG GGTGCCTGCA AGTTTGGATTCTACGAGTAC TTTAAGAAGA CTTACTCTGA CCTTGCTGGA CCTGAGTACA CTGCCAAATACAAGACTCTC ATCTACCTTG CTGGTTCTGC TTCTGCTGAG ATCATTGCCG ATATTGCACTTTGCCCATTT GAAGCTGTGA AGGTTCGTGT TCAGACACAG CCTGGATTTG CTAGGGGGATGTCTGATGGA TTTCCCAAGT TTATCAAGTC CGAAGGATAC GGAGGgtgag tttttcaataccaataacat tatctccctt gttactgcta gccttttggt ctgatttctg atttttttgcagCTTGTATA AGGGTCTTGC TCCACTCTGG GGACGTCAGA TTCCTTgtaa gttctggcctctattttgca acctgttgca caatcttttt tttttttttt ttttgtttat tgatgaaacatatgtagttc tttaaaagca aaaggtggtg atgatatcta tgaattttac agACACTATGATGAAGTTTG CTTCCTTTGA GACCATTGTT GAGATGATTT ACAAGTACGC AATCCCCAACCCAAAGAGTG AGTGCAGCAA AGGTCTGCAA CTCGGAGTGA GTTTTGCCGG AGGTTACGTTGCCGGAGTGT TCTGTGCCAT CGTTTCTCAT CCAGCAGACA ATCTAGTGTC ATTCCTCAACAACGCTAAGG GAGCAACCGT TGGAGATgta agtcactatg tttgaataca atagcctaatgctagaatgg ctgtggtttg gtagttgtat acaagctatt gatttctgtt acggtagaaataatatttaa tgtttgtaaa 
tgacatgttg cagGCGGTGA AGAAGATTGG TATGGTGGGACTGTTCACAA GAGGGCTTCC TCTTAGAATT GTGATGATCG GGACGTTGAC TGGAGCACAGTGGGGATTAT ACGATGCCTT CAAAGTGTTT GTTGGCCTgt aagttcctct ctctcttcacttactttcgt accttaattg taccttcaaa atgcaaaact ctcaattctt ttgatttggtattcagGCCA ACCACTGGTG GTGTTGCTCC AGCTCCTGCC ATCGCAGCTA CTGAAGCCAAAGCCTAAaca atgacgaaaa aggttattag gagttcgatg gggtaggatt tttgtttggaaaaataagag aaaccatacg gtgatgagga agagtgagta agctcaattt cttcctgatttgaactttat catttttgtt ttttttgaaa tttgtgttcc tgaattcagg atagtgctctctctctcttt acatactctc ttcctattgt ttcttgtcct ttttttcttt gtgtgatgtaatcttaaaag atgagaggga cacactccaa gatagagaga gtgggcatac acccactcactactttttat tcagtttcag ttgaaattct cttttggttg ctctatctat tattttacttttttgtttta gagattatat aaaatctcgt tttaaaacat caaatcatag atagatcttgaatactaatc atatgtatac gtttaaccgc taagcgctaa cataaggaaa atattatgtaggcaaatgat taataaacat atgataa27aatgattttg acctttttaa ataatatatt caaatgtgtt tcaaacacga atcaaactataccaaaaaaa aaaaaaaagt tggataaaaa ataaaacctg actacacctc aactttggatcaaaatctat gaatatattt tcaaaattat cttagtcaaa ttttaaatta attaattatttatataaaat ttaataatta tcataacctt ggattaaatt tatctacagt caaaaattaattttaaatca attaattaat agcattatta caatccctaa ttgtacggga cgaataaaaaagtagaaaac tcaagttcct ttctttacca tacagctttt tcgattggag ttgaataagtcttcatctga cacgtgtaac cctggcacat gccgtccact aaaacacgtg cgagatctgtataaatcaaa cctacgcgtt tcatctctct tttcaaaact caccgacgcg atccgatctcatctctctca tttcgaaacc ATGGTTGAGC CGGCGAATAC TGTTGGTCTT CCGGTGAACCCGACTCCGTT GCTGAAAGAT GAGCTCGATA TCGTGATTCC GACTATCAGA AACCTCGATTTCCTCGAGAT GTGGAGGCCT TTTCTTCAGC CTTACCATCT GATCATCGTC CAGGACGGAGATCCATCGAA GAAGATCCAT GTCCCTGAAG GTTACGACTA CGAGCTCTAC AACAGGAACGACATTAACCG AATCCTCGGA CCTAAGGCTT CTTGTATCTC GTTTAAGGAT TCTGCTTGTCGATGCTTTGG GTACATGGTG TCTAAGAAGA AGTATATCTT CACCATTGAT GACGATTGCTTCgtaagtta cttgaatttt gagttttgta ttcgttttta tgcttgattt gagagttttgtcaattttgg ttctagatct gtttttttga gcttatttgt ttgtgtttgt gtggatttttcaagttcatt gcttgaattt cgtagatttg gtgagagatc aattatacga ttcactaaatttgacggatc ttaggtttgt gagataatcc 
ttggttcgat tagctaggca attcaatgttttgtaccaga tccatagatc tgcttgttga gtctgaatat gttttcactt ttgtgtaattagccatgatc tctaatgttt acttgtagat tttctgtgag ctgatgtctc ttttgttgacgacattgttg ttgagctgat atctctgagt cattatagct acctttacga tatggttgcacgtccttgtt catcactttt ttcttttgtt ttaccttttt gagatttgtg gggcatatccaaggatgagt ctcgatgacg cttgtgttta gtttataatt ttctgagttt tttttggaggaactctttga tcaatggctt gatctggatt ttaaccgctt tttaattcat gtatttctttgatgtgtaca tgtagGTTGC CAAGGATCCA TCAGGCAAAG CAGTGAACGC TCTTGAGCAACACATCAAGA ACCTTCTCTG CCCATCGTCT CCCTTTTTCT TCAACACCTT GTATGATCCTTACCGTGAAG GTGCTGATTT CGTCCGTGGA TACCCTTTCA GTCTCCGTGA AGGTGTTTCCACTGCTGTTT CCCATGGTCT TTGGCTCAAC ATCCCTGACT ACGATGCCCC GACCCAACTCGTGAAGCCTA AGGAGAGGAA CACCAGgtga caataattat catcataaca tgtttatgtgtttttttgtc aggatattca aatgtcagtt tttgctaaac gtttgatatg tcagGTATGTGGATGCTGTC ATGACCATCC CAAAGGGAAC ACTTTTCCCA ATGTGTGGTA TGAACTTGGCTTTTGACCGT GATTTGATTG GCCCGGCTAT GTACTTTGGT CTCATGGGTG ATGGTCAGCCTATTGGTCGT TACGACGATA TGTGGGCTGG TTGGTGCATC AAGgtaattt cttcttattcccttgtaaga ctcataattg agtatagcta aatatgaagc acatgctctg tactaagcgatacctccatt tggggttgaa tcttttatag GTGATCTGTG ACCACTTGAG CTTGGGAGTGAAGACCGGTT TACCGTATAT CTACCACAGC AAAGCGAGCA ACCCTTTTGT TAACCTGAAGAAGGAATACA AGGGAATCTT CTGGCAGGAG GAGATCATTC CGTTCTTCCA GAACGCAAAGCTATCGAAAG AAGCAGTAAC TGTTCAGCAA TGCTACATTG AGCTCTCAAA GATGGTCAAGGAGAAGTTGA GCTCCTTAGA CCCGTACTTT GACAAGCTTG CAGATGCCAT GGTTACATGGATTGAAGCTT GGGATGAGCT TAACCCACCA GCAGCCAGTG GCAAAGCTTG Agagcagtatgagccaaaaa gaaaaagcca ccaaagtttt ggttattttt agctcaaatt atcgttacttttaaatttct gattttacga acctttcttg ctttttttac acatttgagt agttttcatcatcagtactt tctcattgtc cggttatggt ttttgcattt ggtttaaata tcaccggtttatttataaac agtggtggat tagtagtact attttctgag tttttttctt tgtttcattaataaaaaggc cttttcatag gtgtttgcaa ttagtttttt tcccccatta atcatcgattatcataggta tgttatggct ttaaatggta taaggaaatt gcttatagac caaaaaaaagttgaattgct attgagagag cttttacaaa agaaagagca ttgttcaata agcttttcacatttggtcga tattttgatc aacctatcat aggtatctca attaataaac cggaatgttaatatgttttg 
c
28
ttctttaatt tcttcgccaa gaagagcacg aaatgtttgc caaacgcata tgcaacaaccccacgttaca tatttctatt tgtagctata gagcaagcta tattgttaaa aactaaaaagaaaatcttta ctataacata tagatagagg attcgagata tcttgaaaga ctcaacttaataaataaagt cgaaaagaaa acacggaggc gagaggacca cacactcgca cagaaagagtctcatatcct ctataacaaa ttgataaact aaactaaaac gacacgtgat gtcttgatcagccaataaaa agctaccgac ataaggcaaa aatgatcgta ccattaaacg taatccacgtggtttcagat tacacgtggc accacacaag tatctccatt tggcctataa atataaacccttaagcccac atatcttctc aatccatcac aaacaaaaca cacatcaaaa acgattttacaagaaaaaaa tatctgaaaa ATGTCAGAGA CCAACAAGAA TGCCTTCCAA GCCGGTCAGGCCGCTGGCAA AGCTGAGgta ctctttctct cttagaacag agtactgata gattgttcaagttataactc tttgaaaaca gttgaaactt gatcactcct agaacttcca ttttcttgtttaatttagtt tgtcgtaatt atgtaattga ttttgtgttg accatggttg ttatatagGAGAAGAGCAAT GTTCTGCTGG ACAAGGCCAA GGATGCTGCT GCTGCAGCTG GAGCTTCCGCGCAACAGgta aacgatctat acacacatta tgacatttat gtaaagaatg aaaagtcttcttagagcata catttacgca gatttctgat attttcatat ggtttgatgt aaatgttatagGCGGGAAAG AGTATATCGG ATGCGGCAGT GGGAGGTGTT AACTTCGTGA AGGACAAGACCGGCCTGAAC AAGTAGcgat ccgagtcaac tttgggagtt ataatttccc ttttctaattaattgttggg attttcaaat aaaatttggg agtcataatt gattctcgta ctcatcgtacttgttgttgt ttttagtgtt gtaatgtttt aatgtttctt ctccctttag atgtactacgtttggaactt taagtttaat caacaaaatc tagtttaagt tctaagaact ttgttttaccatcctctttt ttattgcact taatgcttat agacttttat gtccatccat ttctcaattcggctacgttg aattataagg gtcacataag caaaaaaata tcttaaaaag tcataacattaaggcaaaga tagattctta aaagtactca aattgagatc acgaaaataa caagttagaagttagaactt ccgtaggata tttataagaa caaaagatta ataaatgaag gcaatgattctggattcctt gcaagttagg aagttcgaaa tcgttg
29
cgttattatt actacttcgc ttttagtgtg attcgtttca ttctcgtttt tttatattcctcgatctgtt tgctcatttg ttgagatcta ttcgctatgt gagttcattt gactcagatctggatatttc gtgttgttcg atttatagat ctggtttctg gatctgttta cgatctatcgtcatctttcc tttgaaaatg attggtgttt ctgtgttcgt attcgtttag atctaaagtttttgatcgat gaatgtcgca tgtgttttta tctgaaagtt ttcgattaca gtatcaagtggtggtagtag tagtagtaga ctcaaaaagc tgcacaaact ttttatacac 
gtgaattgtgattgctttac ggttttcttg gagtttgtta attaaatcat ttaatattaa gaagtttatgaattaagaga acgttatttt atactatgat tttgattttg atttggtttg tgtgttttaatgcagtaaaa gaaaatcaaa ATGGCTTCAC ACATTGTTGG ATACCCACGT ATGGGCCCTAAGAGAGAGCT CAAGTTTGCA TTGGAATCTT TCTGGGATGG TAAGAGCACT GCTGAGGATCTTCAGAAGGT GTCTGCTGAT CTCAGGTCAT CCATCTGGAA ACAGATGTCT GCCGCTGGGACTAAGTTCAT CCCTAGCAAC ACCTTTGCTC ACTACGACCA GGTTCTTGAC ACCACCGCCATGCTCGGTGC TGTTCCACCT AGGTATGGAT ACACTGGTGG TGAGATCGGC CTTGATGTTTACTTCTCCAT GGCTAGAGGA AATGCCTCTG TGCCTGCCAT GGAAATGACC AAGTGGTTCGACACCAACTA gtgagtcttc attgatctct tgtgttcttt ttgttgacat tggtctttttgagttgtgga ctaatttgat tatgcttttg ttgatgcagC CATTACATCG TCCCTGAGTTGGGCCCTGAG GTTAACTTCT CTTACGCATC CCACAAGGCG GTGAATGAGT ACAAGGAGGCCAAGGCTgta cgtatcattc tttactaata tccgtttctt aggaaattac tgtttgctcgtctaattaac tattagagat cataggcttt agtttgagga tatagtgttt aagcttagattcattgagtg gtgtttcact gaggatgcta atatgctagg aaggtctcgg atgcattgaatataaaaacc gttagaaaag tcatctggca ctggttgtct aaagtagttt ttttttctacgaagttctga tctggtttac ttgatgttta tgcagCTTGG TGTTGACACC GTCCCTGTACTTGTTGGCCC AGTCTCTTAC TTGCTGCTTT CCAAGGCTGC CAAGGGTGTT GACAAGTCATTCGAACTTCT TTCTCTTCTC CCTAAGATTC TCCCGATCTA CAAgtaagaa atcactttattgtttttctt tattatgcca tccgtatcct tgatgttatc aatgatcctc tgacataccactgatataat gactttgatt tgtgtacagG GAAGTGATTA CCGAGCTTAA GGCTGCTGGTGCCACCTGGA TTCAGCTTGA CGAGCCTGTC CTTGTTATGG ATCTTGAGGG TCAGAAACTCCAGGCCTTTA CTGGTGCCTA TGCTGAACTT GAATCAACTC TTTCTGGTTT GAATGTTCTTGTCGAGACCT ACTTCGCTGA TATCCCTGCT GAGGCATACA AGACCCTAAC CTCATTGAAGGGTGTGACTG CCTTTGGATT TGATTTGGTT CGTGGCACCA AGACCCTTGA TTTGGTCAAGGCAGGTTTCC CTGAGGGAAA GTACCTCTTT GCTGGTGTTG TTGATGGAAG GAACATCTGGGCCAACGACT TTGCTGCGTC CCTAAGCACC TTGCAGGCAC TTGAAGGCAT TGTTGGTAAAGgtaattgtt cttccaaaat catctgcctt ttacctgaca ttactaggga attattgaaaaacaactgta tgaaatgttg atctgttgtc tttttgatgc agACAAGCTT GTGGTCTCAACCTCCTGCTC TCTTCTCCAC ACCGCTGTTG ATCTTATCAA TGAGACTAAG CTTGATGATGAAATCAAGTC ATGGTTGGCG TTTGCTGCCC AGAAGGTCGT TGAAGTGAAC GCTTTGGCCAAGGCTTTGGC TGGTCAGAAG GACGAGgtat 
tttacccaca tgctccccta gtagtggacccttgaattat ctgtagtgta attgatccag aaaaatctag aactcaatat tttttttctttcagGCTCTT TTCTCTGCCA ATGCTGCGGC TTTGGCTTCA AGGAGATCTT CCCCAAGAGTCACCAACGAG GGTGTCCAGA AGGCTgtaag tttgatttca aactgatgca ctgtgctcacccaatggttt attttcctaa tcttgtattg attgagatag tttctcattc ttgttatctcagGCTGCTGC TTTGAAGGGA TCTGACCACC GTCGTGCAAC CAATGTTAGT GCTAGGCTAGATGCTCAGCA GAAGAAGCTC AATCTCCCAA TCCTACCAAC CACAACCATT GGATCCTTCCCACAGACTGT AGAGCTCAGG AGAGTTCGTC GTGAGTACAA GGCCAAAAAg ttagtctcctaaatttaatc cttgggctta tgcgtcacac attttcttaa attgttgtga tgctaatggtttctttaatc tctcttttac tagGGTCTCA GAGGAGGACT ACGTTAAAGC CATCAAGGAAGAGATCAAGA AAGTTGTTGA CCTCCAAGAG GAACTTGACA TCGATGTTCT TGTCCACGGAGAGCCAGAGg tgaatttttt ttattattct atgtttttgc ctgatatttc tagtaatccttggtactgtt tctgatgaga catgttttca caattttgta gAGAAACGAC ATGGTTGAGTACTTTGGTGA GCAGTTGTCT GGTTTTGCCT TCACTGCAAA CGGATGGGTC CAATCTTATGGATCTCGCTG TGTGAAGCCA CCAGTTATCT ATGGTGATGT GAGCCGTCCC AAGGCAATGACCGTCTTCTG GTCCGCAATG GCTCAGAGCA TGACCTCTCG CCCAATGAAG GGTATGCTTACTGGTCCCGT CACCATTCTC AACTGGTCCT TTGTCAGGAA CGACCAGCCC AGgtacataatgttactata atctaaaaac aaacataaac accaaataaa gaacaaaaca ctaagacaatcttggaatca ttgtagGCAC GAAACCTGTT ACCAGATCGC TTTGGCCATC AAGGACGAAGTCGAGGATCT TGAGAAAGGT GGAATCGGTG TCATTCAGAT TGATGAGGCT GCACTTAGAGAAGGACTACC ACTCAGGAAA TCCGAGCATG CTTTCTACTT GGACTGGGCC GTCCACTCCTTCAGAATCAC CAACTGTGGA GTCCAAGACA GCACCCAGgt ttgcttaaat aaaaactacacataacgagt ctcatgtagt gtaatgcttt ctcagttgct cataacttat gtgtttctggtgtttttttt ttgcagATCC ACACTCACAT GTGCTACTCC CACTTCAATG ACATCATACACTCCATCATC GACATGGATG CTGATGTCAT CACCATTGAG AACTCCAGGT CTGATGAGAAGCTTCTTTCC GTGTTCCGTG AAGGAGTGAA GTACGGTGCT GGAATCGGTC CAGGAGTCTACGACATCCAC TCTCCAAGAA TACCATCTTC TGAGGAAATC GCAGACAGGG TCAACAAGATGCTTGCTGTC CTAGAGCAGA ACATCCTTTG GGTTAACCCT GACTGTGGTC TCAAGACCCGTAAGTACACC GAGGTCAAGC CTGCACTCAA GAACATGGTT GATGCGGCTA AGCTCATCCGCTCCCAGCTC GCCAGTGCCA AGTGAagaaa agcttgattt gaacaaggaa acgtttttttttctctaaaa tggttgtgtt ttatttggtt taataacttt cttaaaaata tttttagtcgaaggtagatt 
tgatgcatat ggtttctttc ttgttgagag agagaaaggc tatagcatcctttggatttg atgcaatgtt tgtgattttc tttttgtctc caatatattt ctctgatggaatgtcttttt tctaaagtat cttgaaaagg aataagagga ttgattctta tacaaatacttttgtttgcg ttgtcctaaa ctcactactt ttttttatcc gacgcaatca gtgctttgtagcctgttctt gaagtaggcc cctttgtatg tctctatctg gctcctgtat cagattgttgtttcccttag atttctttat ttcgttggca aaaagaaaat ctgaattgcc ccacaaagagcgtggtggct gatgttaggt tgcagtctca tggtccacca cttta30ttttgcagaa acattacatt acagatggag aacgccaaaa atcgattctt ttttttaattttcttttttg acaaatcgca ttctgcacac attccttttt tttttaattt tctccactacaccactaatc ttgccgtgat aggtgcatgt gtatgtgttt aagacatatc tcttttgttccggttggatt agtttatgta ataaccaaca actatactta atacattttg tccacttttgaattttctgt ttcttatttt gtttactgta aaaaagaatg aaaatcattg agatattaaaactaactaat cactaaggcc catttagtag acccaataag gcccatatgc tattttttttctccagaatt tgacctttat gtatttgacc gagtggaaaa gtaatacagt tcttttcttctctcctcctc tttcttcttc atgattggaa ttttagggct tttgaaagca cgaacgcgtgaagctctaat cgagaaaaaa ATGGAGGTTT TGGATAGGAG AGACGATGAG ATCAGGGACTCGGGAAACAT GGACAGCATC AAGTCACACT ATGTTACCGA CTCTGTTTCC GAGGAACGCCGCTCTCGTGA GCTCAAGGAT GGAGACCATC CTTTACGGgt ttgtccttta tccttagtatcgattcattt gcaatttgaa tctgatctta gctgaaaatt tgattcccgt tcgtcaaagatttctgaact ggtgatatga cggtttatag ctagagtagt ggaagattcg gattctaaatctttgtttgt tggagttttt gttttcaaat taggttttgc gaatttgttt agatgtatgtgagctcaaat gttataggat tttcgtattg gtggtattga ttgtagctag aacaaggcagattgatttag aggaactgat ttcattgtta agagtaagta ctggctcagt gactctaggatttttggtaa tgatgcagTA CAAGTTTTCG ATATGGTACA CTCGTCGCAC ACCAGGGGTTCGGAACCAGT CTTATGAAGA TAACATCAAG AAGATGGTAG AATTCAGCAC Ggtaagtctaaatatactac tggaagttca ttgttgaagc tgtttgcgat actatcttgt tcgtttctgagttatggctt ttataaacta gGTTGAAGGA TTTTGGGCCT GCTACTGTCA CCTTGCTCGTTCTTCTCTCT TGCCTAGTCC AACAGATCTT CATTTCTTTA AGGATGGGAT TCGTCCATTGTGGGAGgtac gtattcccct gtgttgattt ttcgtattgt gtttttatct ggatcatcgatatagaggga accttttata caacaaaagt ttctcaagag ttgtatcttc ttcaataaaccaactaaact agctaaattc atcaccttta gGATGGTGCC AACTGCAATG 
GAGGAAAGTGGATCATACGT TTCTCAAAAG TTGTATCTGC TCGCTTCTGG GAGGATCTGg tgagttttattttcttgtgg gcactactat tggagtattg acacctttct actttattca aaagaaacccttttgtcaat gttatttata atccatttta catacttagg gtctgagaat catgttaaatactcttccgt ttatttgttt tcttcagCTT CTTGCGTTGG TAGGCGACCA GCTTGATGATGCTGATAACA TATGTGGGGC AGTACTGAGT GTCCGTTTCA ACGAGGACAT CATTAGTGTATGGAATCGCA ATGCTTCTGA CCATCAGgtg agaaaactgt tcacaagaag aactgtctctctccctctcc ttttgattgg tacttacaca gtgcaatgtt ttccttaaac agGCAGTGATGGGTTTGAGA GACTCAATCA AGCGGCATTT GAAGTTGCCT CATGCATATG TCATGGAATACAAGCCACAC GATGCTTCTC TCCGCGACAA CTCTTCCTAC AGAAACACAT GGCTGAGAGGATAGgcccaa agtcgatgat tgtatcatgt aatgtggaga agatttggga agctcatctgcaacctggga agatatctgg attgaaccct gtatccaata ccatactgta ccggaggcttacaatatcag aaaaaacaaa atccgggcta cttctgtgtc agtatgtgtt catttcgtttttcttttaca gtacatcttg ttaacttcaa tggtttgact cttgatcaaa actataaggatgtattttca atgaaaactg gaaattacgt tctggtttac attataactc atgtcttaaaaagtaacagg atgtcaatat acaatgtcac ttcgtacgat gatctctaat gtacatctactgatgaaaaa ctgagtgtgg ctctgtccgt tgatctcaaa agctatagtt tagcatccgcagatgattga agtccgatga tacctggttc aacatcaaag cctcgagtga attacttcacacaatggaaa ctagaaaata agag31MALKSKLVSL LFLIATLSST FAASFSDSDS DSDLLNELVS LRSTSESGVI HLDDHGISKFLTSASTPRPY SLLVFFDATQ LHSKNELRLQ ELRREFGIVS ASFLANNNGS EGTKLFFCEIEFSKSQSSFQ LFGVNALPHI RLVSPSISNL RDESGQMDQS DYSRLAESMA EFVEQRTKLKVGPIQRPPLL SKPQIGIIVA LIVIATPFII KRVLKGETIL HDTRLWLSGA IFIYFFSVAGTMHNIIRKMP MFLQDRNDPN KLVFFYQGSG MQLGAEGFAV GFLYTVVGLL LAFVTNVLVRVKNITAQRLI MLLALFISFW AVKKVVYLDN WKTGYGIHPY WPSSWR*32MTKTMMIFAA AMTVMALLLV PTIEAQTECV SKLVPCFNDL NTTTTPVKEC CDSIKEAVEKELTCLCTIYT SPGLLAQFNV TTEKALGLSR RCNVTTDLSA CTAKGAPSPK ASLPPPAPAGNTKKDAGAGN KLAGYGVTTV ILSLISSIFF *33MAAITEFLPK EYGYVVLVLV FYCFLNLWMG AQVGRARKRY NVPYPTLYAI ESENKDAKLFNCVQRGHQNS LEMMPMYFIL MILGGMKHPC ICTGLGLLYN VSRFFYFKGY ATGDPMKRLTIGKYGFLGLL GLMICTISFG VTLILA*34MSLLADLVNL DISDNSEKII AEYIWVGGSG MDMRSKARTL PGPVTDPSKL PKWNYDGSSTGQAPGQDSEV ILYPQAIFKD PFRRGNNILV MCDAYTPAGE PIPTNKRHAA AEIFANPDVIAEVPWYGIEQ EYTLLQKDVN 
WPLGWPIGGF PGPQGPYYCS IGADKSFGRD IVDAHYKASLYAGINISGIN GEVMPGQWEF QVGPSVGISA ADEIWIARYI LERITEIAGV VVSFDPKPIPGDWNGAGAHT NYSTKSMREE GGYEIIKKAI EKLGLRHKEH ISAYGEGNER RLTGHHETADINTFLWGVAN RGASIRVGRD TEKEGKGYFE DRRPASNMDP YVVTSMIAET TLLWNP*35MYQKFQISGK IVKTLGLKMK VLIAVSFGSL LFILSYSNNF NNKLLDATTK VDIKETEKPVDKLIGGLLTA DFDEGSCLSR YHKYFLYRKP SPYKPSEYLV SKLRSYEMLH KRCGPDTEYYKEAIEKLSRD DASESNGECR YIVWVAGYGL GNRLLTLASV FLYALLTERI ILVDNRKDVSDLLCEPFPGT SWLLPLDFPM LNYTYAWGYN KEYPRCYGTM SEKHSINSTS IPPHLYMHNLHDSRDSDKLF VCQKDQSLID KVPWLIVQAN VYFVPSLWFN PTFQTELVKL FPQKETVFHHLARYLFHPTN EVWDMVTDYY HAHLSKADER LGIQIRVFGK PDGRFKHVID QVISCTQREKLLPEFATPEE SKVNISKTPK LKSVLVASLY PEFSGNLTNM FSKRPSSTGE IVEVYQPSGERVQQTDKKSH DQKALAEMYL LSLTDNIVTS ARSTFGYVSY SLGGLKPWLL YQPTNFTTPNPPCVRSKSME PCYLTPPSHG CEADWGTNSG KILPFVRHCE DLIYGGLKLY DEF*36MRTVVHDLAV VLLVIFYDYY MLFILDRLLE ANYGGKWEKI LGNHVDIFKN YPLIGQLFVQDMYNSIMDFP SFFIFQALLE YERHKVSEGE LQIPLPLELE PMNIDNQASG SGRARRDAASRAMQGWHSQR LNGNGEVSDP AIKDKNLVLH QKREKQIGTT PGLLKRKRAA EHGAKNAIHVSKSMLDVTVV DVGPPADWVK INVQRTQDCF EVYALVPGLV REEVRVQSDP AGRLVISGEPENPMNPWGAT PFKKVVSLPT RIDPHHTSAV VTLNGQLFVR VPLEQLE*37MSWQSYVDDH LMCDVEGNHL TAAAILGQDG SVWAQSAKFP QLKPQEIDGI KKDFEEPGFLAPTGLFLGGE KYMVIQGEQG AVIRGKKGPG GGVIKKTNQA LVFGFYDEPM TGGQCNLVVERLGDYLIESE L*38MYVVKRDGRQ ETVHFDKITA RLKKLSYGLS SDHCDPVLVA QKVCAGVYKG VTTSQLDELAAETAAAMTCN HPDYASLAAR IAVSNLHKNT KKSFSETIKD MFYEVNDRSG LKSPLIADDVFEIIMQNAAR LDSEIIYDRD FEYDYFGFKT LERSYLLKVQ GTVVERPQHM LMRVAVGIHKDDIDSVIQTY HLMSQRWFTH ASPTLFNAGT PRPQLSSCFL VCMKDDSIEG IYETLKECAVISKSAGGIGV SVHNIRATGS YIRGTNGTSN GIVPMLRVFN DTARYVDQGG GKRKGAFAVYLEPWHADVYE FLELRKNHGK EEHRARDLFY ALWLPDLFME RVQNNGQWSL FCPNEAPGLADCWGAEFETL YTKYEREGKA KKVVQAQQLW YEILTSQVET GTPYMLFKDS CNRKSNQQNLGTIKSSNLCT EIIEYTSPTE TAVCNLASIA LPRFVREKGV PLDSHPPKLA GSLDSKNRYFDFEKLAEVTA TVTVNLNKII DVNYYPVETA KTSNMRHRPI GIGVQGLADA FILLGMPFDSPEAQQLNKDI FETIYYHALK ASTELAARLG PYETYAGSPV SKGILQPDMW NVIPSDRWDWAVLRDMISKN GVRNSLLVAP MPTASTSQIL GNNECFEPYT SNIYSRRVLS 
GEFVVVNKHLLHDLTDMGLW TPTLKNKLIN ENGSIVNVAE IPDDLKAIYR TVWEIKQRTV VDMAADRGCYIDQSQSLNIH MDKPNFAKLT SLHFYTWKKG LKTGMYYLRS RAAADAIKFT VDTAMLKEKPSVAEGDKEVE EEDNETKLAQ MVCSLTNPEE CLACGS*39MAYASRFLSR SKQLQGGLVI LQQQHAIPVR AFAKEAARPT FKGDEMLKGV FFDIKNKFQAAVDILRKEKI TLDPEDPAAV KQYANVMKTI RQKADMFSES QRIKHDIDTE TQDIPDARAYLLKLQEIRTR RGLTDELGAE AMMFEALEKV EKDIKKPLLR SDKKGMDLLV AEFEKGNKKLGIRKEDLPKY EENLELSMAK AQLDELKSDA VEAMESQKKK EEFQDEEMPD VKSLDIRNFI*40MHGYEDDLDE EAGYDDYYSG DEDEYEDEEE EDEEPPKEEL EFLESRQKLK ESIRKKMGNGSANAQSSQER RRKLPYNDFG SFFGPSRPVI SSRVIQESKS LLENELRKMS NSSQTMFLLMELFFGVQKKR PVPTNGSGSK NVSQEKRPKV VNEVRRKVET LKDTRDYSFL FSDDAELPVPKKESLSRSGS FPNSAYHFHE DNLYRFFADV QEARSAQLSS RPKQSSGING RTAHSPHREEKRPVSANGHS RPSSSGSQMN HSRPSSSGSK MNHSRPATSG SQMPNSRPAS SGSQMQSRAVSGSGRPASSG SQMQNSRPQN SRPASAGSQM QQRPASSGSQ RPASSGSQRP ASSGSQRPGSSTNRQAPMRP PGSGSTMNGQ SANRNGQLNS RSDSRRSAPA KVPVDHRKQM SSSNGVGPGRSATNARPLPS KSSLERKPSI SAGKSSLQSP QRPSSSRPMS SDPRQRVVEQ RKVSRDMATPRMIPKQSAPT SKHQMMSKPA LKRPPSRDID HERRLLKKKK PARSEDQEAF DMLRQLLPPKRFSRYDDDDI NMEAGFEDIQ KEERRSARIA REEDERELKL LEEEERRERL KKNRKLSR*41MDPNQRIARI SAHLNPPNLH NQIADGSGLN RVACRAKGGS PGFKVAILGA AGGIGQPLAMLMKMNPLVSV LHLYDVANAP GVTADISHMD TSAVVRGFLG QPQLEEALTG MDLVIIPAGVPRKPGMTRDD LFNINAGIVR TLSEAIAKCC PKAIVNIISN PVNSTVPIAA EVFKKAGTFDPKKLMGVTML DVVRANTFVA EVMSLDPREV EVPVVGGHAG VTILPLLSQV KPPCSFTQKEIEYLTDRIQN GGTEVVEAKA GAGSATLSMA YAAVEFADAC LRGLRGDANI VECAYVASHVTELPFFASKV RLGRCGIDEV YGLGPLNEYE RMGLEKAKKE LSVSIHKGVT FAKK*42MAQVQAPSSH SPPPPAVVND GAATASATPG IGVGGGGDGV THGALCSLYV GDLDFNVTDSQLYDYFTEVC QVVSVRVCRD AATNTSLGYG YVNYSNTDDA EKAMQKLNYS YLNGKMIRITYSSRDSSARR SGVGNLFVKN LDRSVDNKTL HEAFSGCGTI VSCKVATDHM GQSRGYGFVQFDTEDSAKNA TEKLNGKVLN DKQIFVGPFL RKEERESAAD KMKFTNVYVK NLSEATTDDELKTTPGQYGS ISSAVVMRDG DGKSRCFGFV NFENPEDAAR AVEALNGKKF DDKEWYVGKAQKKSERELEL SRRYEQGSSD GGNKFDGLNL YVKNLDDTVT DEKLRELFAE FGTITSCKVMRDPSGTSKGS GFVAFSAASE ASRVLNEMNG KMVGGKPLYV ALAQRKEERR AKLQAQFSQMRPAFIPGVGP RMPIFTGGAP GLGQQIFYGQ GPPPIIPHQP GFGYQPQLVP 
GMRPAFFGGPMMQPGQQGPR PGGRRSGDGP MRHQHQQPMP YMQPQMMPRG RGYRYPSGGR NMPDGPMPGGMVPVAYDMNV MPYSQPMSAG QLATSLANAT PAQQRTLLGE SLYPLVDQIE SEHAAKVTGMLLEMDQTEVL HLLESPEALN AKVSEALDVL RNVNQPSSQG SEGNKSGSPS DLLASLSINDHL*43MAENYDRASE LKAFDEMKIG VKGLVDAGVT KVPRIFHNPH VNVANPKPTS TVVMIPTIDLGGVFESTVVR ESVVAKVKDA MEKFGFFQAI NHGVPLDVME KMINGIRRFH DQDPEVRKMFYTRDKTKKLK YHSNADLYES PAASWRDTLS CVMAPDVPKA QDLPEVCGEI MLEYSKEVMKLAELMFEILS EALGLSPNHL KEMDCAKGLW MLCHCFPPCP EPNRTFGGAQ HTDRSFLTILLNDNNGGLQV LYDGYWIDVP PNPEALIFNV GDFLQLISND KFVSMEHRIL ANGGEEPRISVACFFVHTFT SPSSRVYGPI KELLSELNPP KYRDTTSESS NHYVARKPNG NSSLDHLRI*44MYKLDRKLGK GGFGQVYVGR KMGTSTSNAR FGPGALEVAL KFEHRTSKGC NYGPPYEWQVYNALGGSHGV PRVHFKGRQG DFYVMVMDIL GPSLWDVWNS TTQAMSTEMV ACIAIEAISILEKMHSRGYV HGDVKPENFL LGPPGTPEEK KLFLVDLGLA SKWRDTATGL HVEYDQRPDVFRGTVRYASV HAHLGRTCSR RDDLESLAYT LVFLLRGRLP WQGYQVGDTK NKGFLVCKKKMATSPETLCC FCPQPFRQFV EYVVNLKFDE EPDYAKYVSL FDGIVGPNPD IRPINTEGAQKVIW*45MAQRLEAKGG KGGNQWDDGA DHENVTKIHV RGGLEGIQFI KFEYVKAGQT VVGPIHGVSGKGFTQTFEIN HLNGEHVVSV KGCYDNISGV IQALQFETNQ RSSEVMGYDD TGTKFTLEISGNKITGFHGS ADANLKSLGA YFTPPPPIKQ EYQGGTGGSP WDHGIYTGIR KVYVTFSPVSISHIKVDYDK DGKVETRQDG DMLGENRVQG QPNEFVVDYP YEYITSIEVT CDKVSGNTNRVRSLSFKTSK DRTSPTYGRK SERTFVFESK GRALVGLHGR CCWAIDALGA HFGAPPIPPPPPTEKLQGSG GDGGESWDDG AFDGVRKIYV GQGENGIASV KFVYDKNNQL VLGEEHGKHTLLGYEEFELD YPSEYITAVE GYYDKVFGSE SSVIVMLKFK TNKRTSPPYG MDAGVSFILGKEGHKVVGFH GKASPELYQT GVTVAPITK*46MDIEKAGSRR EEEEPIVQRP RLDKGKGKAH VFAPPMNYNR IMDKHKQEKM SPAGWKRGVAIFDFVLRLIA AITAMAAAAK MATTEETLPF FTQFLQFQAD YTDLPTMSSF VIVNSIVGGYLTLSLPFSIV CILRPLAVPP RLFLILCDTV MMGLTLMAAS ASAAIVYLAR NGNSSSNWLPVCQQFGDFCQ GTSGAVVASF IAATLLMFLV ILSAFALKRT T*47MTTEEKEILA AKLEEQKIDL DKPEVEDDDD NEDDDSDDDD KDDDEADGLD GEAGGKSKQSRSEKKSRKAM LKLGMKPITG VSRVTVKKSK NILFVISKPD VFKSPASDTY VIFGEAKIEDLSSQIQSQAA EQFKAPDLSN VISKGESSSA AVVQDDEEVD EEGVEPKDIE LVMTQAGVSRPNAVKALKAA DGDIVSAIME LTT*48MLPSDAADPS VCYVPNPYNP YQYYNVYGSG QEWTDYPAYT NPEGVDMNSG IYGENGTVVYPQGYGYAAYP YSPATSPAPQ LGGEGQLYGA QQYQYPNYFP 
NSGPYASSVA TPTQPDLSANKPAGVKTLPA DSNNVASAAG ITKGSNGSAP VKPTNQATLN TSSNLYGMGA PGGGLAAGYQDPRYAYEGYY APVPWHDGSK YSDVQRPVSG SGVASSYSKS STVPSSRNQN YRSNSHYTSVHQPSSVTGYG TAQGYYNRMY QNKLYGQYGS TGRSALGYGS SGYDSRTNGR GWAATDNKYRSWGRGNSYYY GNENNVDGLN ELNRGPRAKG TKNQKGNLDD SLEVKEQTGE SNVTEVGEADNTCVVPDREQ YNKEDFPVDY ANAMFFIIKS YSEDDVHKSI KYNVWASTPN GNKKLAAAYQEAQQKAGGCP IFLFFSVNAS GQFVGLAEMT GPVDFNTNVE YWQQDKWTGS FPLKWHIVKDVPNSLLKHIT LENNENKPVT NSRDTQEVKL EQGLKIVKIF KEHSSKTCIL DDFSFYEVRQKTILEKKAKQ TQKQVSEEKV TDEKKESATA ESASKESPAA VQTSSDVKVA ENGSVAKPVTGDVVANGC*49MLAIFDKNVA KTPEALQGQE GGSVCALKDR FLPNHFSSVY PGAVTINLGS SGFIACSLEKQNPLLPRLFA VVDDMFCIFQ GHIENVPILK QQYGLTKTAT EVTIVIEAYR TLRDRGPYSAEQVVRDFQGK FGFMLYDCST QNVFLAGDVD GSVPLYWGTD AEGHLVVSDD VETVKKGCGKSFAPFPKGCF FTSSGGLRSY EHPSNELKPV PRVDSSGEVC GVTFKVDSEA KKEAMPRVGSVQNWSKQI*50MVNIPKTKNT YCKNKECKKH TLHKVTQYKK GKDSLAAQGK RRYDRKQSGY GGQTKPVFHKKAKTTKKIVL RLQCQSCKHF SQRPIKRCKH FEIGGDKKGK GTSLF*51MEKSNGLRVI LFPLPLQGCI NPMIQLAKIL HSRGFSITVI HTCFNAPKAS SHPLFTFLEIPDGLSETEKR TNNTKLLLTL LNRNCESPFR ECLSKLLQSA DSETGEEKQR ISCLIADSGWMFTQPIAQSL KLPILVLSVF TVSFFRCQFV LPKLRREVYL PLQDSEQEDL VQEFPPLRKKDIVRILDVET DILDPFLDKV LQMTKASSGL IFMSCEELDH DSVSQAREDF KIPIFGIGPSHSHFPATSSS LSTPDETCIP WLDKQEDKSV IYVSYGSIVT ISESDLIEIA WGLRNSDQPFLLVVRVGSVR GREWIETIPE EIMEKLNEKG KIVKWAPQQD VLKHRAIGGF LTHNGWSSTVESVCEAVPMI CLPFRWDQML NARFVSDVWM VGINLEDRVE RNEIEGAIRR LLVEPEGEAIRERIEHLKEK VGRSFQQNGS AYQSLQNLID YISSF*52MADGEDIQPL VCDNGTGMVK AGFAGDDAPR AVFPSIVGRP RHTGVMVGMG QKDAYVGDEAQSKRGILTLK YPIEHGIVSN WDDMEKIWHH TFYNELRVAP EEHPVLLTEA PLNPKANREKMTQIMFETFN VPAMYVAIQA VLSLYASGRT TGIVLDSGDG VSHTVPIYEG YALPHAILRLDLAGRDLTDS LMKILTERGY MFTTTAEREI TRDIKEKLAY VALDYEQELE TAKSSSSVEKNYELPDGQVI TIGAERFRCP EVLFQPSLIG MEAPGIHETT YNSIMKCDVD IRKDLYGNIVLSGGSTMFPG IADRMSKEIT ALAPSSMKIK VVAPPERKYS VWIGGSILAS LSTFQQMWISKSEYDESGPS IVHRKCF*53MEWEKWYLDA VLVPSALLMM FGYHIYLWYK VRTDPFCTIV GTNSRARRSW VAAIMKDNEKKNILAVQTLR NTIMGGTLMA TTCILLCAGL AAVLSSTYSI KKPLNDAVYG AHGDFTVALKYVTILTIFLF AFFSHSLSIR 
FINQVNILIN APQEPFSDDF GEIGSFVTPE YVSELLEKAFLLNTVGNRLF YMGLPLMLWI FGPVLVFLSS ALIIPVLYNL DFVFLLSNKE KGKVDCNGGCDDNFSP*54MGDARDNEAY EEELLDYEEE DEKVPDSGNK VNGEAVKKGY VGIHSSGFRD FLLKPELLRAIVDSGFEHPS EVQHECIPQA ILGMDVICQA KSGMGKTAVF VLSTLQQIEP SPGQVSALVLCETRELAYQI CNEFVRESTY LPDTKVSVFY GGVNIKIHKD LLKNECPHIV VGTPGRVLALAREKDLSLKN VRHFILDECD KMLESLDMRR DVQEIFKMTP HDKQVMMFSA TLSKEIRPVCKKFMQDPMEI YVDDEAKLTL HGLVQHYIKL SEMEKTRKLN DLLDALDFNQ VVIFVKSVSRAAELNKLLVE CNFPSICIHS GMSQEERLTR YKSFKEGHKR ILVATDLVGR GIDIERVNIVINYDMPDSAD TYLHRVGRAG RFGTKGLAIT FVASASDSEV LNQVQERFEV DIKELPEQIDTSTYSKCEIP YQLFV*55MATISNLANL PRATCVDSKS SSSSSVLPRS FVNFRALNAK LSSSQLSLRY NQRSIPSLSVRCSVSGGNGT AGKRTTLHDL YEKEGQSPWY DNLCRPVTDL LPLIARGVRG VTSNPAIFQKAISTSNAYND QFRTLVESGK DIESAYWELV VKDIQDACKL FEPIYDQTEG ADGYVSVEVSPRLADDTQGT VEAAKYLSKV VNRRNVYIKI PATAPCIPSI RDVIAAGISV NVTLIFSIARYEAVIDAYLD GLEASGLDDL SRVTSVASFF VSRVDTLMDK MLEQIGTPEA LDLRGKAAVAQAALAYKLYQ QKFSGPRWEA LVKKGAKKQR LLWASTSVKN PAYSDTLYVA PLIGPDTVSTMPDQALEAFA DHGIVKRTID ANVSEAEGIY SALEKLGIDW NKVGEQLEDE GVDSFKKSFESLLGTLQDKA NTLKLASH*56MESPKNSLIP SFLYSSSSSP RSFLLDQVLN SNSNAAFEKS PSPAPRSSPT SMISRKNFLIASPTEPGKGI EMYSPAFYAA CTFGGILSCG LTHMTVTPLD LVKCNMQIDP AKYKSISSGFGILLKEQGVK GFFRGWVPTL LGYSAQGACK FGFYEYFKKT YSDLAGPEYT AKYKTLIYLAGSASAEIIAD IALCPFEAVK VRVQTQPGFA RGMSDGFPKF IKSEGYGGLY KGLAPLWGRQIPYTMMKFAS FETIVEMIYK YAIPNPKSEC SKGLQLGVSF AGGYVAGVFC AIVSHPADNLVSFLNNAKGA TVGDAVKKIG MVGLFTRGLP LRIVMIGTLT GAQWGLYDAF KVFVGLPTTGGVAPAPAIAA TEAKA*57MVEPANTVGL PVNPTPLLKD ELDIVIPTIR NLDFLEMWRP FLQPYHLIIV QDGDPSKKIHVPEGYDYELY NRNDINRILG PKASCISFKD SACRCFGYMV SKKKYIFTID DDCFVAKDPSGKAVNALEQH IKNLLCPSSP FFFNTLYDPY REGADFVRGY PFSLREGVST AVSHGLWLNIPDYDAPTQLV KPKERNTRYV DAVMTIPKGT LFPMCGMNLA FDRDLIGPAM YFGLMGDGQPIGRYDDMWAG WCIKVICDHL SLGVKTGLPY IYHSKASNPF VNLKKEYKGI FWQEEIIPFFQNAKLSKEAV TVQQCYIELS KMVKEKLSSL DPYFDKLADA MVTWIEAWDE LNPPAASGKA*58MSETNKNAFQ AGQAAGKAEE KSNVLLDKAK DAAAAAGASA QQAGKSISDA AVGGVNFVKDKTGLNK*59MASHIVGYPR MGPKRELKFA LESFWDGKST AEDLQKVSAD LRSSIWKQMS 
AAGTKFIPSNTFAHYDQVLD TTAMLGAVPP RYGYTGGEIG LDVYFSMARG NASVPAMEMT KWFDTNYHYIVPELGPEVNF SYASHKAVNE YKEAKALGVD TVPVLVGPVS YLLLSKAAKG VDKSFELLSLLPKILPIYKE VITELKAAGA TWIQLDEPVL VMDLEGQKLQ AFTGAYAELE STLSGLNVLVETYFADIPAE AYKTLTSLKG VTAFGFDLVR GTKTLDLVKA GFPEGKYLFA GVVDGRNIWANDFAASLSTL QALEGIVGKD KLVVSTSCSL LHTAVDLINE TKLDDEIKSW LAFAAQKVVEVNALAKALAG QKDEALFSAN AAALASRRSS PRVTNEGVQK AAAALKGSDH RRATNVSARLDAQQKKLNLP ILPTTTTGSF PQTVELRRVR REYKAKKVSE EDYVKAIKEE IKKVVDLQEELDIDVLVHGE PERNDMVEYF GEQLSGFAFT ANGWVQSYGS RCVKPPVIYG DVSRPKAMTVFWSAMAQSMT SRPMKGMLTG PVTILNWSFV RNDQPRHETC YQIALAIKDE VEDLEKGGIGVIQIDEAALR EGLPLRKSEH AFYLDWAVHS FRITNCGVQD STQIHTHMCY SHFNDIIHSIIDMDADVITI ENSRSDEKLL SVFREGVKYG AGIGPGVYDI HSPRIPSSEE IADRVNKMLAVLEQNILWVN PDCGLKTRKY TEVKPALKNM VDAAKLIRSQ LASAK*60MEVLDRRDDE IRDSGNMDSI KSHYVTDSVS EERRSRELKD GDHPLRYKFS IWYTRRTPGVRNQSYEDNIK KMVEFSTVEG FWACYCHLAR SSLLPSPTDL HFFKDGIRPL WEDGANCNGGKWIIRFSKVV SARFWEDLLL ALVGDQLDDA DNICGAVLSV RFNEDIISVW NRNASDHQAVMGLRDSIKRH LKLPHAYVME YKPHDASLRD NSSYRNTWLR 
G*61GATCTCTGTTTCACAAG62GATCTGTGTTGTTAATT63GATCCTTGCTTGAGCTA64GATCCGTAACTCTTGAA65GATCCCTCTTTACAGTT66GATCCCGTGCTGCAGCT67GATCACTGGAATTTGAG68GATCGTTCCCTTGCTGC69GATCTTTTTTTTGTTCA70GATCCAATCTTAAAGGT71GATCATTTATGAGAAGC72GATCAATCAAGGAGAGT73GATCAGCATTTACAGTG74GATCCTCTTGATTAAAT75GATCTCAAAGGGTGAGT76GATCCGTTTCTTTGCCC77GATCAAAACACAATCCT78GATCGGTGGTGACAAGA79GATCGTTTCAACAAAAC80GATCAATCCTTGCATCC81GATCTTTGGGCCTGTGC82GATCTATTATCAATTTA83GATCATGGAATAGTGAA84GATCGGGACGTTGACTG85GATCATTCCGTTCTTCC86GATCCGAGTCAACTTTG87GATCCACACTCACATGT88GATCAAAACTATAAGGA89GATCTGAAAGAGAGAAG90GATCATCTTTTTTCTCC91GATCATGCATATTTGTT92GATCATTGAGAATCCAG93GATCATTCAAATCTTGT94GATCTCGACTTCTCTGC95GATCGTCTTCAAGGGCA96GATCACACCTCTGAGTC97GATCTACTATTATTAAG98GATCCGTTGATTTGCTC99GATCCAGACAACATGAA100GATCCCAATTCCTTGTT101GATCTCTCTGTCTCCCA102GATCTCTATTGGCAATA103GATCTCTACTCTCTTCT104GATCTGAGATAGAGACA105GATCCATTGAGATAATT106GATCTATTCCAGCGGAA107GATCCTAGAATATTTTT108GATCCTGTCATGGAATA109GATCGTTCGTGGTACTT110GATCGGCTTCTGCTCGA111GATCGGCATTACGACCC112GATCTCCTTTTGATTCT113GATCAAAATTCTCAACC114GATCTTGCCTTTTAAAC115GATCTTGTATAATGACA116GATCTTTATGGTGCTAG117GATCAACCCGATTCTTG118GATCAAGATTTTTTTTA119GATCACGCCTTTGTTTC120GATCAAGAATGTGTATG121GATCTGATTTTCTCAAC122GATCACACCGCAATGCT123GATCGACTCTTCTCGTT124GATCAATATGGTTTTGA125GATCGCGTCTGAATTGT126GATCTCTGTCATAGACT127GATCTCGGCATGTGTGT128GATCTTGGGTGCAATTT129GATCAACATGAATGAGG130GATCTTCTGCTAGGGAT131GATCCCGTATCTTGAAC132GATCCAGAAATTTCCAA133GATCGCGTCGTGTTACT134GATCTTAGCTTATGACT135GATCTATATTTTTCTAA136GATCCTTTTTGTAGTTT137GATCGACGATGTCATCT138GATCATTGAGTATGTTT139GATCAATCAATGGTTCA140GATCGACTCTCTTACTT141GATCTTTGTTTTTAAGA142GATCTTGGTTTTTAGAG143GATCTATTCGGTGAAAA144GATCACAGTGAACCCCG145GATCTTGTGGACATCTC146GATCGTTAATTCAATGC147GATCGAAGAAGCAGACC148GATCTGTGTGTCGTCCA149GATCTTCTGTGCTATGT150GATCTCTGGATTCATCG151GATCAGATGCAATTTGC152GATCCTCTCCTATGATG153GATCTTTGTAACGCACC154GATCTCATAAATGTTGG155GATCTCTGTGAGATTTG156GATCTGTAGCAAACACA157GATCATGCCTCTGTTCA158GATCTGGCGGAGCACCA159GATCTGACAAACGCAAC160GATCAATCAACCTTATG161GATCTGTAAAATACTAC162GATCATAAAGAGAC
AGA163GATCCGTGGTGTTAAGA164GATCCTTAACTTGAGGA165GATCGCAGTCGAGGAAT166GATCTTCTTGTTCGCAT167GATCATTCTTCTTTTGG168GATCTCGTCTTTGTTTT169GATCAGATAAAACACCT170GATCTGTAGCCAATGGA171GATCCAAATCCAAAGAG172GATCAGAGGAGAACGTG173GATCTAAGCTTAGCATC174GATCACAGTTTTGAAAT175GATCCAGAGGCGTTCAA176GATCTGATGAGCCAAAG177GATCAAAGCCATTGAAG178GATCCCGTGAGTGGATG179GATCCTGTTTTTGATTG180GATCTGAATAGCTGCGC181GATCATATACCAGTATT182GATCACATCTTTACCAG183GATCCTTCTAAGACTAA184GATCATTTCTGTTAGAA185GATCGTGGCCGTTGGAT186GATCATGCTCTCCAAAC187GATCCCAAACCGATGGT188GATCATTAGTCTCTCAT189GATCGGTGTGTTATACA190GATCTTGTCTCTGAGTA191GATCTTTCGCCTCTTCT192GATCTGCTGAAACTGAA193GATCTTTTTTTTTGTGT194GATCTCATCCATCTTCT195GATCTAAATCTGTGAAA196GATCAAAAAAAAAAAAA197GATCAAAACAACCTGCG198GATCAAAACAATGAGGG199GATCAAAACTGTTACAC200GATCAAAAGCTCTTACA201GATCAAAATTTGAGGGG202GATCAAAATTTGTAGTG203GATCAAACTGGTGAAGG204GATCAAACTTTGCTTGC205GATCAAATCATCTTCCA206GATCAAATGTCCCCACC207GATCAACGCAGCCAAGG208GATCAACTCTTTACATG209GATCAACTGTCAATTCA210GATCAACTTAAGCAAAA211GATCAACTTATAAGTGC212GATCAAGAAAGAAGAAG213GATCAAGAAGGTAACGC214GATCAAGCTGTCTTCAA215GATCAAGTTTACAGGAT216GATCAATAATTGTTTCT217GATCAATCTAGCGAACA218GATCAATTGATGGCGCA219GATCACAGATTCTGAAT220GATCACAGCAAGAGTGG221GATCACATGAGGAAGAT222GATCACCTTGTTGCTGC223GATCACGACCAAGTCAT224GATCACGGTTCTCGTCG225GATCACTGCTTTGGCTC226GATCACTTTCAGTGATA227GATCACTTTTAACTGTT228GATCACTTTTTTGTGGG229GATCAGAAGAGCAACGT230GATCAGAAGCAGTGCGT231GATCAGAAGGAACTGCA232GATCAGAATCATCAATA233GATCAGATGCAATGTGT234GATCAGATGGGATGGTA235GATCAGATTTTCTTGGG236GATCAGCGCCACTCTTC237GATCAGTTAGCTTCTCT238GATCAGTTGATGCTGGA239GATCATATGTTGCTGGA240GATCATCAAAACCATCC241GATCATCAAAATCAGTC242GATCATCACTATTTCAT243GATCATCCCCTGTCTGT244GATCATCCTTCTTTGCC245GATCATCGTTTCGTGTA246GATCATCTATTGGATGA247GATCATCTCACCTTTGT248GATCATCTGAAACCATC249GATCATCTGTGAATTTT250GATCATCTTTTGAATGT251GATCATGAAATGGTATG252GATCATGATTTCCTTCT253GATCATGCAATCAAGCA254GATCATGTGTTTGGTTT255GATCATTCTCCTCGCAA256GATCATTGGGAAATGAT257GATCATTGTTGTCTCAC258GATCATTTTATGTGATT259GATCATTTTCCAAACGC260GATCATTTTGATGCTTT261GATCATTTTTCTCTAAT262GATCATTTTTTTTT
TTT263GATCCAAAAGACAAACA264GATCCAAAGAGTTGGAG265GATCCAAATCAACCTAA266GATCCAAGCTTTTAATG267GATCCAATAATACATAC268GATCCAATGGCACCAGC269GATCCAATTTGGTCAGA270GATCCACATGGAGGTAG271GATCCACCTGATGATGT272GATCCACGAGTTTCAGG273GATCCACGCGTGGGAGA274GATCCAGAAGCCGGAGT275GATCCAGAAGTTCTTGC276GATCCAGAGGTCTGGTT277GATCCAGCAGTGGTGTT278GATCCAGTTATTATGGA279GATCCAGTTTTTGTTTG280GATCCATGAACTGGACC281GATCCATTCACTGTTAA282GATCCATTCCGCAGTTC283GATCCATTTGTGATGAA284GATCCCAAACGACAAAA285GATCCCAAATTCCCAAT286GATCCCAGATTACGATT287GATCCCATTATCGCTAA288GATCCCATTTCTCACTG289GATCCCGATTGGAGTGC290GATCCCTCCGAAGCAGT291GATCCCTGCATACGGTG292GATCCGCTTCGCCTTCA293GATCCGGATATTTACAC294GATCCGTATCGTCGATT295GATCCGTCCTACTTGTC296GATCCGTCTTATTGCGT297GATCCTAACCATTATCC298GATCCTAGGAGAATACA299GATCCTATTCGTTGTTG300GATCCTCATCTTTCCTA301GATCCTCCTCGGACGAA302GATCCTCGGATGTGGCA303GATCCTGACGCCGTAGC304GATCCTGAGAATTTCTT305GATCCTTATCATCCGAG306GATCCTTATTTGGTGCC307GATCCTTCCGCAATGTT308GATCCTTCGTTAACGGC309GATCCTTGGATTTGGTC310GATCCTTGTGGCGACTG311GATCCTTTAGAACATTT312GATCCTTTCGACAAGAT313GATCCTTTCTTGGAAGA314GATCCTTTCTTTGGGGT315GATCCTTTTATCGAATC316GATCGAACCAAGTTTCA317GATCGAACCAGAGATAT318GATCGAATTCCTGGAAG319GATCGACAGTCTGGAGA320GATCGACGACTGGACTC321GATCGATGCCCTTGTGA322GATCGCCATTGAGAACA323GATCGCTGCAACGATGA324GATCGCTGCTCAGTTTG325GATCGGAAAGATTGTGG326GATCGGAATTCGTGATG327GATCGGAATTTCATGTG328GATCGGATTTTTTCTGA329GATCGGGAAGAGAGGAG330GATCGTATACTTCGTCC331GATCGTCAAGAAGAAGC332GATCGTCGTTCGATGAT333GATCGTGGTGTCCTCGC334GATCGTTAATTTTTTTT335GATCTAAACTTTTATGC336GATCTAAGTGGAATCTT337GATCTAATAGCAGAGTT338GATCTACCCGATTCTTT339GATCTACGCGTCCCTCT340GATCTACGTAAGTTTTC341GATCTACTCAACGAAGC342GATCTAGGCGCTTTTAC343GATCTATCCAGTTTGGT344GATCTATCTATTATTCC345GATCTATTCATAGAAGT346GATCTATTCTGTCCAAG347GATCTCAAAGTGACTGT348GATCTCAAGTTTCAATC349GATCTCAGATATTTTAA350GATCTCATACATTATGT351GATCTCATTATGCAATT352GATCTCCAGTTCGATAT353GATCTCCGTCCCAAGAA354GATCTCGAAAGCTATCA355GATCTCGGTGTTCCTTC356GATCTCTACAATTAGTG357GATCTCTCTAGCCTTTG358GATCTCTCTCGGCCTTG359GATCTCTCTTTATTGTC360GATCTCTTACACGTGCC361GATCTCTTTATGAAAGA362GATCTCTTTGTGAC
TAT363GATCTCTTTCTTTTTCT364GATCTGAAATCCGCCGT365GATCTGACTAATGTCAT366GATCTGAGTTTTATTTT367GATCTGATTGGTTTTGG368GATCTGATTGTGTTACC369GATCTGCACAAAGCATG370GATCTGCCAAAAGCACC371GATCTGCTGAAGAAAGT372GATCTGCTGGGAAAGTC373GATCTGGACCTTGTCCC374GATCTGGAGGTGCCTAA375GATCTGGTCTACTATAT376GATCTGGTTCGTTCCGT377GATCTGTTCTTCCAGCA378GATCTGTTTCATTAGAC379GATCTTAGTGACGATGA380GATCTTATTGTTGGTGA381GATCTTCAGTCTTGAGT382GATCTTCCCTTTTCTTT383GATCTTCTTGAGGAGGA384GATCTTCTTGGCATGCA385GATCTTGCAGCATTGGA386GATCTTGCTCGGCTTGC387GATCTTGTACCTTCTGA388GATCTTGTTGAAGGATG389GATCTTGTTTCTCGGTC390GATCTTTATCTTTATCT391GATCTTTCTTGTTTTGT392GATCTTTGTTGGTGTAA393GATCTTTTCTTGGATGA394GATCTTTTGGTCTTTTT395GATCTTTTTGGGGATAA396GATCTTTTTGTATGTTG397GATCTGAAAGAGAGAAG398GATCATCTTTTTTCTCC399GATCACTGGAATTTGAG400GATCGTTCCCTTGCTGC401GATCCAATCTTAAAGGT402GATCAATCAAGGAGAGT403GATCATGCATATTTGTT
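
The 17-nucleotide sequences of SEQ ID NO: 61 to SEQ ID NO: 403 each begin with GATC, and several of them can be located within the longer polynucleotide sequences listed above. The following is a minimal sketch of such a lookup, not a method taken from the disclosure. The helper names `normalize` and `find_tag` are hypothetical, and the excerpt is copied from the listing above, where it appears to fall within SEQ ID NO: 27. Letter case and the spacing of the printed listing are ignored.

```python
import re


def normalize(seq: str) -> str:
    """Strip everything except letters (spaces, digits) and upper-case the rest."""
    return re.sub(r"[^A-Za-z]", "", seq).upper()


def find_tag(tag: str, sequence: str) -> int:
    """Return the 0-based offset of `tag` in `sequence`, or -1 if it is absent.

    Both inputs are normalized first, so the 10-letter blocks and the
    upper/lower-case exon/intron convention of the listing do not matter.
    """
    return normalize(sequence).find(normalize(tag))


# Excerpt copied verbatim from the listing above (spacing as printed).
excerpt = "AGGGAATCTT CTGGCAGGAG GAGATCATTC CGTTCTTCCA GAACGCAAAG"

# SEQ ID NO: 85 as printed in the listing.
tag_85 = "GATCATTCCGTTCTTCC"

offset = find_tag(tag_85, excerpt)
print(offset)  # 22
```

An offset of -1 would indicate that the tag does not occur on the strand as printed. Searching the complementary strand as well would require an additional reverse-complement step, which this sketch omits.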


Claims
  • 1. A composition comprising at least one expression vector, wherein the at least one expression vector comprises a nucleic acid comprising: (a) at least one polynucleotide sequence selected from the group consisting of: SEQ ID NO: 1-SEQ ID NO: 30 or a sequence complementary thereto; (b) at least one polynucleotide sequence comprising a conservative variation of a polynucleotide sequence of (a); (c) at least one polynucleotide encoding a polypeptide sequence selected from the group consisting of SEQ ID NO: 31-SEQ ID NO: 60, or conservative variations thereof; (d) at least one polynucleotide sequence that hybridizes under stringent conditions to a polynucleotide sequence of (a) or (b); (e) at least one polynucleotide that is at least about 70% identical to a polynucleotide sequence of (a), or (b); or, (f) at least one polynucleotide sequence comprising at least 10 contiguous nucleotides of a polynucleotide sequence selected from the group consisting of: SEQ ID NO: 1-SEQ ID NO: 30, or a sequence complementary thereto.
  • 2. The composition of claim 1, wherein the at least one expression vector comprises a promoter operably linked to the nucleic acid comprising the polynucleotide of (a), (b), (c), (d) or (e).
  • 3. The composition of claim 1, wherein the nucleic acid encodes a polypeptide.
  • 4. The composition of claim 1, wherein the polypeptide comprises a polypeptide subsequence of SEQ ID NO: 31-SEQ ID NO: 60.
  • 5. The composition of claim 1, wherein the nucleic acid encodes a sense or antisense RNA.
  • 6. A cell comprising the at least one expression vector of claim 1.
  • 7. The cell of claim 6, which cell expresses a polypeptide selected from the group consisting of SEQ ID NO: 31-SEQ ID NO: 60, and conservative variations thereof.
  • 8. An isolated or recombinant polypeptide comprising: (a) an amino acid sequence selected from the group consisting of SEQ ID NO: 31-SEQ ID NO: 60, and conservative variants thereof; (b) an amino acid sequence encoded by a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 30, and conservative variations thereof; (c) an amino acid sequence encoded by a polynucleotide sequence that hybridizes under stringent conditions to a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 30; (d) an amino acid sequence encoded by a polynucleotide sequence that is at least about 70% identical to a polynucleotide selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 30, or (e) a polypeptide comprising an amino acid subsequence of (a), (b), (c) or (d).
  • 9. The isolated or recombinant polypeptide of claim 8, comprising a fusion protein.
  • 10. The isolated or recombinant polypeptide of claim 8, comprising a peptide or polypeptide tag.
  • 11. The isolated or recombinant polypeptide of claim 10, wherein the peptide or polypeptide tag comprises a reporter peptide or polypeptide.
  • 12. The isolated or recombinant polypeptide of claim 10, wherein the peptide or polypeptide tag comprises an epitope.
  • 13. The isolated or recombinant polypeptide of claim 10, wherein the peptide or polypeptide tag comprises a localization signal or sequence.
  • 14. An array of polypeptides comprising two or more different polypeptides of claim 8.
  • 15. An antibody specific for the isolated or recombinant polypeptide of claim 8.
  • 16. The antibody of claim 15, wherein the antibody comprises a monoclonal antibody or polyclonal serum.
  • 17. The antibody of claim 15, which antibody is specific for an epitope comprising a subsequence of a polypeptide selected from the group consisting of SEQ ID NO: 31-SEQ ID NO: 60.
  • 18. An isolated or recombinant polypeptide which specifically binds to the antibody of claim 15.
  • 19. A cell comprising at least one exogenous nucleic acid, which cell expresses a polypeptide of claim 8.
  • 20. The cell of claim 19, wherein the expressed polypeptide is encoded by the exogenous nucleic acid.
  • 21. The cell of claim 19, wherein the exogenous nucleic acid comprises a promoter, which promoter regulates transcription of an endogenous nucleic acid encoding the polypeptide.
  • 22. A labeled probe comprising a nucleic acid or polypeptide comprising: (a) a polynucleotide sequence selected from the group consisting of: SEQ ID NO: 1-SEQ ID NO: 30; conservative variants of any one of SEQ ID NO: 1-SEQ ID NO: 30; or, a subsequence of SEQ ID NO: 1-SEQ ID NO: 30; or a conservative variant thereof comprising at least 10 nucleotides; or a complementary sequence thereof; (b) a polypeptide or peptide comprising an amino acid sequence selected from the group consisting of: SEQ ID NO: 31-SEQ ID NO: 60; a conservative variant of any one of SEQ ID NO: 31-SEQ ID NO: 60, or, a subsequence of one or more of SEQ ID NO: 31-SEQ ID NO: 60, or one or more conservative variants thereof, comprising at least six amino acids; or, (c) an antibody specific for a polypeptide or peptide sequence of (b).
  • 23. The labeled probe of claim 22, wherein the polynucleotide sequence comprises a subsequence of SEQ ID NO: 1-SEQ ID NO: 30, comprising at least 12 nucleotides.
  • 24. The labeled probe of claim 22, wherein the polynucleotide sequence comprises a subsequence of SEQ ID NO: 1-SEQ ID NO: 30, comprising at least 14 nucleotides.
  • 25. The labeled probe of claim 22, wherein the polynucleotide sequence comprises a subsequence of SEQ ID NO: 1-SEQ ID NO: 30, comprising at least 16 nucleotides.
  • 26. The labeled probe of claim 22, wherein the polynucleotide sequence comprises a subsequence of SEQ ID NO: 1-SEQ ID NO: 30 comprising at least 17 nucleotides.
  • 27. The labeled probe of claim 22, comprising an antigenic peptide.
  • 28. The labeled probe of claim 22, comprising a fusion protein.
  • 29. The labeled probe of claim 22, comprising an epitope tag.
  • 30. The labeled probe of claim 22, comprising an isotopic, fluorescent, fluorogenic or colorimetric label.
  • 31. The labeled probe of claim 22, comprising a DNA or RNA molecule.
  • 32. A labeled probe of claim 22, comprising a cDNA, an amplification product, a transcript, a restriction fragment, or an oligonucleotide.
  • 33. A labeled probe of claim 22, comprising an oligonucleotide consisting of a polynucleotide sequence selected from a subsequence of SEQ ID NO: 61 to SEQ ID NO: 403, or a conservative variation thereof.
  • 34. A marker set for predicting at least one growth trait of a plant cell, the marker set comprising a plurality of members, which members comprise: (a) one or more polynucleotide sequences selected from the group consisting of: SEQ ID NO: 1-SEQ ID NO: 30 or SEQ ID NO: 61-SEQ ID NO: 403; a conservative variant of any one of SEQ ID NO: 1-SEQ ID NO: 30 or SEQ ID NO: 61-SEQ ID NO: 403; a subsequence of SEQ ID NO: 1-SEQ ID NO: 30, SEQ ID NO: 61-SEQ ID NO: 403, or a conservative variant thereof comprising at least 10 nucleotides; and, a complementary sequence thereof; (b) one or more polypeptides or peptides comprising an amino acid sequence selected from the group consisting of: SEQ ID NO: 31 to SEQ ID NO: 60; a conservative variant of any one of SEQ ID NO: 31 to SEQ ID NO: 60; or a subsequence of SEQ ID NO: 31-SEQ ID NO: 60 or a conservative variant thereof comprising at least six amino acids; and/or, (c) one or more antibodies specific for a polypeptide or peptide sequence of (b).
  • 35. The marker set of claim 34, wherein the nucleic acids comprise oligonucleotides, expression products, or amplification products.
  • 36. The marker set of claim 35, wherein the oligonucleotides are synthetic oligonucleotides.
  • 37. The marker set of claim 34, comprising a plurality of labeled nucleic acid probes.
  • 38. The marker set of claim 34, comprising a plurality of polypeptides or peptides.
  • 39. The marker set of claim 34, comprising a plurality of antibodies.
  • 40. The marker set of claim 34, comprising a plurality of members, which members include nucleic acids and polypeptides.
  • 41. The marker set of claim 34, wherein the nucleic acids or polypeptides are logically or physically arrayed.
  • 42. The marker set of claim 34, wherein the nucleic acids or polypeptides are physically arrayed in a solid phase or liquid phase array.
  • 43. The marker set of claim 41, wherein the array comprises a bead array.
  • 44. The marker set of claim 34, wherein each member of the marker set comprises at least 10 contiguous nucleotides from at least one of SEQ ID NO: 1-SEQ ID NO: 30.
  • 45. The marker set of claim 34, comprising a plurality of members that together comprise a plurality of sequences or subsequences selected from a plurality of nucleic acids represented by SEQ ID NO: 61-SEQ ID NO: 403.
  • 46. The marker set of claim 34, comprising a majority of members that together comprise a majority of sequences or subsequences selected from a majority of nucleic acids represented by SEQ ID NO: 61-SEQ ID NO: 403.
  • 47. The marker set of claim 34, wherein each member of the marker set comprises at least 10 contiguous nucleotides from at least one of SEQ ID NO: 61-SEQ ID NO: 403.
  • 48. The marker set of claim 34, wherein each member of the marker set comprises at least six contiguous amino acids from at least one of SEQ ID NO: 31-SEQ ID NO: 60.
  • 49. The marker set of claim 34, comprising at least one antibody specific for each of SEQ ID NO: 31-SEQ ID NO: 60, or a subsequence thereof.
  • 50. The marker set of claim 34, wherein a plant growth trait is predicted by hybridizing the nucleic acids of the marker set to a DNA or RNA sample from a cell or tissue, and detecting at least one polymorphic polynucleotide or differentially expressed expression product.
  • 51. An array comprising the marker set of claim 34.
  • 52. A method for modulating a plant growth trait, the method comprising: modulating expression or activity of at least one polypeptide encoded by a nucleic acid comprising: (a) at least one polynucleotide sequence selected from the group consisting of: SEQ ID NO: 1-SEQ ID NO: 30 or a sequence complementary thereto; (b) at least one polynucleotide sequence comprising a conservative variation of a polynucleotide sequence of (a); (c) at least one polynucleotide encoding a polypeptide sequence selected from the group consisting of SEQ ID NO: 31-SEQ ID NO: 60, or conservative variations thereof; (d) at least one polynucleotide sequence that hybridizes under stringent conditions to a polynucleotide sequence of (a) or (b); (e) at least one polynucleotide that is at least about 70% identical to a polynucleotide sequence of (a), or (b); or, (f) at least one polynucleotide sequence comprising at least 10 contiguous nucleotides of a polynucleotide sequence selected from the group consisting of: SEQ ID NO: 1-SEQ ID NO: 30, or a sequence complementary thereto.
  • 53. The method of claim 52, comprising modulating expression or activity of at least one polypeptide contributing to a plant growth trait.
  • 54. The method of claim 52, comprising modulating a plant growth trait in a flowering plant.
  • 55. The method of claim 52, comprising modulating a plant growth trait in a member of the family Brassicaceae.
  • 56. The method of claim 52, comprising modulating a plant growth trait in a plant selected from the group consisting of Arabidopsis, Brassica, Zea, Oryza, Triticum, Hordeum, Lolium, Sorghum, Glycine, Medicago, Helianthus, Lactuca, Beta, Vitis, Solanum, Lycopersicon, Capsicum, Gossypium, Hevea, Linum, Prunus, Citrus, Populus, Pinus, Quercus, and Saccharomyces.
  • 57. The method of claim 52, comprising modulating expression by expressing an exogenous nucleic acid comprising a polynucleotide sequence selected from SEQ ID NO: 1 to SEQ ID NO: 30.
  • 58. The method of claim 57, comprising modulating expression by inducing or suppressing expression of an endogenous nucleic acid.
  • 59. The method of claim 58, wherein the endogenous nucleic acid encodes a polypeptide selected from among SEQ ID NO: 31-SEQ ID NO: 60, or homologues thereof.
  • 60. The method of claim 57, comprising introducing the exogenous nucleic acid comprising at least one promoter, which promoter regulates expression of an endogenous nucleic acid modulating a plant growth trait.
  • 61. The method of claim 57, further comprising detecting altered expression or activity of an expression product encoded by a nucleic acid comprising a polynucleotide sequence selected from SEQ ID NO: 1-SEQ ID NO: 30, or conservative variants thereof.
  • 62. The method of claim 61, comprising detecting altered expression or activity in a high throughput assay.
  • 63. The method of claim 52, wherein expression is modulated in response to an environmental factor, a chemical or biological agent, a pathogen, a bacterium, a virus, a fungus, or an insect.
  • 64. The method of claim 63, comprising detecting altered expression or activity in response to the presence of a fertilizer, or an herbicide.
  • 65. The method of claim 63, wherein a plurality of expression products are detected.
  • 66. The method of claim 65, wherein the plurality of expression products are detected in an array.
  • 67. The method of claim 66, wherein the array comprises a bead array.
  • 68. The method of claim 63, wherein a data record comprising the altered expression or activity is recorded in a database.
  • 69. The method of claim 68, wherein the database comprises a plurality of character strings recorded on a computer or in a computer readable medium.
  • 70. A method for detecting genes for a plant growth trait, the method comprising: (i) providing a subject cell or tissue sample of nucleic acids; (ii) detecting at least one polymorphic nucleic acid or at least one expression product corresponding to a polynucleotide sequence, comprising: (a) at least one polynucleotide sequence selected from the group consisting of: SEQ ID NO: 1-SEQ ID NO: 30 or a sequence complementary thereto; (b) at least one polynucleotide sequence comprising a conservative variation of a polynucleotide sequence of (a); (c) at least one polynucleotide encoding a polypeptide sequence selected from the group consisting of SEQ ID NO: 31-SEQ ID NO: 60, or conservative variations thereof; (d) at least one polynucleotide sequence that hybridizes under stringent conditions to a polynucleotide sequence of (a) or (b); (e) at least one polynucleotide that is about 70% identical to a polynucleotide sequence of (a), or (b); or, (f) at least one polynucleotide sequence comprising at least 10 contiguous nucleotides of a polynucleotide sequence selected from the group consisting of: SEQ ID NO: 1-SEQ ID NO: 30, or a sequence complementary thereto.
  • 71. The method of claim 70, wherein the expression product comprises an RNA.
  • 72. The method of claim 70, wherein the detecting step comprises qualitative detection.
  • 73. The method of claim 70, wherein the detecting step comprises quantitative detection.
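Several of the claims above turn on three sequence relationships: a complementary sequence, a run of at least 10 contiguous nucleotides, and a percentage of identity of about 70%. The following Python sketch illustrates one way such comparisons could be computed. It is an illustration only and not a method disclosed in this application: the function names are hypothetical, the sequences in the usage note are made-up placeholders rather than any SEQ ID NO, and the identity calculation assumes two already aligned, equal-length sequences in place of a real alignment algorithm.

```python
# Illustrative sketch only; names and example sequences are hypothetical.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def shares_contiguous_run(probe: str, target: str, min_len: int = 10) -> bool:
    """True if probe and target (or the target's reverse complement)
    share at least min_len contiguous nucleotides."""
    probe, target = probe.upper(), target.upper()
    if len(probe) < min_len:
        return False
    strands = (target, reverse_complement(target))
    # Slide a window of min_len over the probe and look for it on either strand.
    for start in range(len(probe) - min_len + 1):
        window = probe[start:start + min_len]
        if any(window in strand for strand in strands):
            return True
    return False


def percent_identity(a: str, b: str) -> float:
    """Percent of matching positions between two pre-aligned,
    equal-length sequences (a simplification; no gaps handled)."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)
```

For example, with the placeholder target "TTTACGTACGTACGG", the call shares_contiguous_run("ACGTACGTAC", "TTTACGTACGTACGG") is true because the 10-nucleotide probe occurs within the target, and percent_identity("ACGTACGTAC", "ACGTACGTAA") gives 90.0, which would exceed a 70% threshold.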
CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims priority to and benefit of a prior U.S. Provisional Application No. 60/347,288, "Identification of Genes Associated with Growth in Plants", by Benjamin A. Bowen, et al., filed Jan. 9, 2002. The full disclosure of the prior application is incorporated herein by reference.

STATEMENT AS TO RIGHTS TO INVENTIONS MADE UNDER FEDERALLY SPONSORED RESEARCH AND DEVELOPMENT

[0002] The work of Edward S. Buckler IV was sponsored by USDA CRIS 6645-21000-022-00D.

Provisional Applications (1)
Number Date Country
60347288 Jan 2002 US