Eph-related tyrosine kinases, nucleotide sequences and methods of use

Information

  • Patent Grant
  • 5457048
  • Patent Number
    5,457,048
  • Date Filed
    Friday, December 3, 1993
  • Date Issued
    Tuesday, October 10, 1995
Abstract
The invention is directed to substantially purified Eph-related protein tyrosine kinases, or functional fragments thereof, having about 23 to 66 percent amino acid sequence identity in their carboxyl terminal variable regions compared to known members of the Eph subclass of tyrosine kinases. Nucleic acids encoding such Eph-related protein tyrosine kinases, vectors and host cells are also provided. The invention is also directed to a method of diagnosing cancer and determining cancer prognosis. The method includes removing a tissue or cell sample from a subject suspected of having cancer and determining the level of Eph-related protein tyrosine kinase in the sample, wherein a change in the level or activity of an Eph-related protein tyrosine kinase compared to a normal sample indicates the presence of a cancer or indicates the level of malignancy of a cancer.
Description

BACKGROUND OF THE INVENTION
This invention relates generally to protein tyrosine kinases and, more particularly, to Eph-related receptor tyrosine kinases and their manipulation for the control of cellular processes.
Receptor tyrosine kinases comprise a large family of proteins that share a number of structural features such as a glycosylated extracellular ligand-binding domain, a hydrophobic transmembrane domain and a conserved cytoplasmic catalytic domain. Integral membrane tyrosine kinases have been shown to mediate cellular signals important for growth and differentiation. The transduction of many extracellular signals to the cytoplasm occurs as a result of the binding of ligands such as growth factors, for example, to receptor tyrosine kinases at the cell surface. In most cases, ligand binding activates the cytoplasmic tyrosine kinase catalytic domain and culminates in tyrosine phosphorylation of multiple substrates in the cytoplasm.
Increased expression of membrane-spanning receptor tyrosine kinases frequently has been associated with alterations in normal cellular processes. The affected cellular processes include cell proliferation, differentiation and cancer, including, for example, human cancers. Specific examples of such cancers can include glioblastomas, squamous carcinomas and mammary carcinomas, which are associated with the amplification of the EGF receptor gene. Adenocarcinomas, breast cancers and gastric cancers similarly are associated with aberrant expression of the HER2/neu receptor and certain breast carcinomas overexpress the erbB-3 gene, for example.
The correlation between aberrant expression and transforming ability also extends to members of the Eph subclass of receptor tyrosine kinases. For example, carcinomas of the liver, lung, breast and colon show elevated expression of Eph. Unlike many other tyrosine kinases, this elevated expression can occur in the absence of gene amplification or rearrangement. Such involvement of Eph in carcinogenesis also has been shown by the formation of foci of NIH 3T3 cells in soft agar and of tumors in nude mice following overexpression of Eph. Moreover, an antigen present on the surface of a pre-B cell leukemia cell line also has been identified as a member of the Eph subclass. Wicks et al., Proc. Natl. Acad. Sci., USA 89:1611-1615 (1992). This leukemia-specific marker, termed Hek, appears to be similar to the chicken Cek4 and mouse Mek4 of the Eph subclass of receptor tyrosine kinases (see Sajjadi et al., The New Biologist 3:769-778 (1991), which is incorporated herein by reference). As with Eph, Hek also was overexpressed in the absence of gene amplification or rearrangements in, for example, hemopoietic tumors and lymphoid tumor cell lines.
In addition to their roles in carcinogenesis, a number of transmembrane tyrosine kinases have been reported to play key roles during development. Examples include the mouse c-kit proto-oncogene and the Drosophila genes "sevenless" and "torso," which are involved in pattern formation. Consistent with this developmental role, many receptor tyrosine kinases other than those described above also have been shown to be developmentally regulated and predominantly expressed in embryonic tissues. Examples of these other tyrosine kinases include Cek1, which belongs to the FGF subclass, and the Cek4 and Cek5 tyrosine kinases (Pasquale et al., Proc. Natl. Acad. Sci., USA 86:5449-5453 (1989); Sajjadi et al. (1991); and Pasquale, E. B., Cell Regulation 2:523-534 (1991), all of which are incorporated herein by reference).
Eph was the first member of the Eph subclass of tyrosine kinases to be identified and characterized by molecular cloning (Hirai et al., Science 238:1717-1720 (1987)). The name Eph is derived from the name of the cell line from which the Eph cDNA was first isolated, the erythropoietin-producing human hepatocellular carcinoma cell line, ETL-1. The general structure of Eph is similar to that of other receptor tyrosine kinases and consists of an extracellular domain, a single membrane spanning region and a conserved tyrosine kinase catalytic domain. However, the structure of the extracellular domain of Eph, which comprises an immunoglobulin (Ig) domain at the amino terminus, followed by a cysteine-rich region and two fibronectin type III repeats in close proximity to the transmembrane domain, is completely distinct from that of previously described receptor tyrosine kinases. The juxtamembrane domain and carboxy-terminus regions of Eph also are unrelated to the corresponding regions of other tyrosine kinase receptors. Thus, the discovery of Eph defined a new subclass of receptor-type tyrosine kinases.
In addition to the isolation and characterization of Eph, other related tyrosine kinases now have been identified. Cek4 and Cek5 were identified by screening a chicken embryo cDNA expression library with anti-phosphotyrosine antibodies (Sajjadi et al. (1991) and Pasquale, E. B. (1991)). This method of identification was successful because Cek4 and Cek5 are expressed in embryonic tissues and have tyrosine kinase activity even when expressed as partial fragments in bacteria. Other Eph-related kinases that have been identified include Hek (Wicks et al. (1992)), Sek (Gilardi-Hebenstreit et al., Oncogene 7:2499-2506 (1992)), Eck (Lindberg and Hunter, Mol. Cell. Biol. 10:6316-6324 (1990)), Elk (Lhotak et al., Mol. Cell. Biol. 11:2496-2502 (1991)) and Eek (Chan and Watt, Oncogene 6:1057-1061 (1991)). These tyrosine kinases were cloned using a variety of methods.
The number of existing Eph-related kinases is not known and cannot be predicted. However, the Eph subclass already represents the largest known subclass of receptor tyrosine kinases, comprising at least 10 distinct members. The kinases belonging to the Eph subclass are so classified because each includes features such as the amino terminal Ig domain, the cysteine-rich stretch and two fibronectin type III repeats in the extracellular domain, which are conserved within the Eph subclass. However, despite these common structural features, the overall amino acid sequences outside the catalytic domain are quite different, indicating that different members of the Eph subclass interact with distinct ligands and substrates and, thus, exert distinct functions. This notion is supported by the differential distribution of different Eph-related kinases in adult tissues.
There is no indication whether other Eph-related kinases exist and, if so, what their relationship is to the known Eph-related kinases. Nevertheless, despite similarities among the Eph-related receptor tyrosine kinases, each is different and, as such, functions in related but distinct cellular processes. For example, many members of the Eph subclass are expressed in the nervous system during development and thus are likely to be involved in nerve regeneration processes. The aberrant expression or uncontrolled regulation of any one of these receptor tyrosine kinases can result in different malignancies and pathological disorders. Therefore, the identification and characterization of novel transmembrane tyrosine kinases should provide important insights into the mechanisms underlying oncogenesis and cellular growth control pathways.
There thus exists a need to identify additional receptor tyrosine kinases and to manipulate them in order to diagnose pathological conditions and control cellular processes. The present invention satisfies this need and provides related advantages as well.
SUMMARY
The invention is directed to substantially purified Eph-related protein tyrosine kinases, or functional fragments thereof, having about 23 to 66 percent amino acid sequence identity in their carboxyl terminal variable region compared to the other known members of the Eph subclass of tyrosine kinases. Nucleic acids encoding such Eph-related protein tyrosine kinases, vectors and host cells also are provided. The invention also is directed to a method of diagnosing cancer. The method includes removing a tissue or cell sample from a subject suspected of having cancer and determining the level of Eph-related protein tyrosine kinase in the sample, wherein a change in the level or activity of an Eph-related protein tyrosine kinase compared to a normal sample indicates the presence of a cancer or correlates with a specific prognosis.





BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 shows a comparison of the amino acid sequences from members of the Eph family. Dots replace residues in Cek4, Cek6, Cek7, Cek8, Cek9 Cek10, Eck and Eph that are identical to the corresponding residue in Cek5. Dashes represent gaps introduced in the sequences to aid in the alignment. The insertion sequence of Cek5 also is presented (Cek5.sup.+) and the insertion sequences of Cek7.sup.+ and Cek10.sup.+ are in parentheses. The conserved cysteines are indicated by the symbol " and the kinase domain is delimited by arrows. Open circles indicate the hydrophobic and aromatic residues that are conserved in the first fibronectin type III repeat and asterisks indicate the conserved residues of the second fibronectin type III repeat. The filled circle indicates the site of putative tyrosine autophosphorylation in the catalytic domain. The putative signal peptide sequences and transmembrane domains are underlined. Amino acids are numbered at the left of the sequences. The symbol + indicates the location of the extracellular domain amino acid insertion RICTPDVSGTVGSRPAADH (SEQ. ID. NO. 23), corresponding to Cek6 amino acids 426-444. Alignments were made by eye in the regions corresponding to Cek5 residues 1-615 and using the program DFALIGN (Feng and Doolittle, J. Mol. Evol., 25:351-360 (1987), which is incorporated herein by reference) in the regions corresponding to Cek5 residues 616-995.
FIG. 2 shows an RNA blot analysis of Cek mRNAs. Polyadenylated chicken RNA from 10-day embryonic and adult tissues was hybridized with Cek-specific cDNA probes and with a chicken β-actin probe. Hybridization conditions were as described in Example I. The positions of RNA molecular weight standards (in kilobases, kb) are indicated on the right. β-actin transcripts are present in the ~2.0 kb size range.
FIG. 3 shows an RNA blot analysis of Cek5 mRNAs. Polyadenylated RNA from body tissues (lanes 1 and 2) and brain (lanes 3 and 4) of 10-day chicken embryos was hybridized with a Cek5-specific cDNA (lanes 1 and 3). The same blots were then stripped and rehybridized with a 48 bp oligonucleotide antisense probe corresponding to the juxtamembrane insertion sequence of Cek5 (lanes 2 and 4). Hybridization conditions were as described in Example I. The positions of RNA molecular weight standards (in kb) are indicated on the right.
FIG. 4 shows immunoblotting with antibodies to different Eph-related kinases. Fractions from 10-day embryonic brain containing either membrane-associated proteins (M) or soluble proteins (S) were probed with anti-Cek4 (4), anti-Cek8 (8), or anti-Cek9 (9) antibodies. Equal amounts of protein were loaded in all the lanes. IP, immunoprecipitates from 11-day embryonic retina with anti-Cek8 antibodies (8) or with normal rabbit IgGs (Ig). The immunoprecipitates were then probed with anti-Cek8 antibodies.
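The dot and dash notation described for FIG. 1 (dots for residues identical to the reference row, dashes for alignment gaps) can be illustrated with a minimal Python sketch. The function names and the toy rows are hypothetical and are not taken from FIG. 1. The identity calculation ignores gapped columns, which is only one possible convention; the text does not state how its percentages were normalized.

```python
def expand_dots(reference: str, row: str) -> str:
    """Replace each '.' in an aligned row with the reference residue in that column."""
    if len(reference) != len(row):
        raise ValueError("aligned rows must have equal length")
    return "".join(ref if ch == "." else ch for ref, ch in zip(reference, row))


def percent_identity(a: str, b: str) -> float:
    """Percent identity over columns in which neither row has a gap ('-')."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


# Toy aligned rows (hypothetical, for illustration only).
reference = "MKV-LSTYAG"   # plays the role of the Cek5 row
row = "..I-..--.."         # dots = same as reference, dashes = gaps
expanded = expand_dots(reference, row)
print(expanded)
print(round(percent_identity(reference, expanded), 1))
```

Expanding the dots first keeps the comparison step independent of the display convention, so the same identity function can be applied to any pair of aligned rows.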





DETAILED DESCRIPTION OF THE INVENTION
The invention relates to the identification and characterization of seven novel members of the Eph subclass of membrane-spanning tyrosine kinases. The identification of these members doubles the number of kinases within this subclass, bringing the total to at least ten different Eph-related kinases. These Eph-related kinases therefore comprise the largest known subclass of integral membrane tyrosine kinases. The large number of different Eph-related kinases indicates that these receptors regulate a number of distinct cellular processes during development as well as in the adult organism. Therefore, identification of novel proteins within this subclass and isolation of their encoding nucleic acids allows the control of different cellular processes through the production of specific agonists and antagonists and through genetic therapy.
In one embodiment seven novel kinases of the Eph subclass of receptor protein tyrosine kinases have been identified. The cDNAs encoding these Eph-related kinases were identified by hybridization at differential stringencies to identify distinct, but related receptor tyrosine kinases. All of the kinases exhibit gross structural features of known receptor tyrosine kinases in that they contain an extracellular ligand binding domain, a transmembrane domain and a cytoplasmic catalytic domain. These novel kinases are related to the Eph subclass of receptor tyrosine kinases and are designated Cek6 through Cek10.sup.+ (SEQ. ID. NOS. 1-14, and 19-22.) The overall sequence identity between these Eph-related kinases varies significantly with each of the novel Eph-related receptors being identified by its carboxyl terminal variable region.
In another embodiment, the novel Eph-related kinases exhibit distinct tissue distribution patterns and developmental expression. Six of the kinases can be found to be expressed in both the embryonic brain and body tissues. The seventh Eph-related kinase, Cek5.sup.+ is expressed only in the embryonic brain. Indicative of their roles in cellular processes, such as embryonic signal transduction pathways, these Eph-related kinases display distinct patterns of expression in adult tissues, including the neuronal specific expression of Cek5.sup.+. These distinct patterns can be used to diagnose aberrations in normal cellular processes, such as those leading to uncontrolled malignant cell growth.
In addition to diagnosing such aberrations, it is also possible to treat defects caused by the unregulated expression of Eph-related kinases through the use of gene therapy. Reagents affecting the expression or activity of Eph-related kinases can also be useful for inducing nerve regeneration following injury.
As used herein, the term "Eph-related protein tyrosine kinase" or "Eph-related kinase" refers to a receptor tyrosine kinase having an extracellular ligand binding domain, a transmembrane domain and a cytoplasmic catalytic domain, and belonging to the Eph subclass of receptor tyrosine kinases. Eph-related kinases include, for example, the receptor tyrosine kinases Cek6, Cek7, Cek7.sup.+, Cek7', Cek8, Cek9, Cek10, Cek5.sup.+ and Cek10.sup.+ (SEQ. ID. NOS. 1-14 and 19-22.) Such kinases exhibit an overall amino acid sequence identity to Eph of greater than about 40 percent. The extreme carboxyl terminal cytoplasmic regions of the kinases are not well conserved and can be used to differentiate among them. This extreme carboxyl terminal cytoplasmic region begins just after the catalytic domain at about residue number 900 and extends to the C-terminal most residue. Therefore, the term "carboxyl terminal variable region" as used herein, refers to this extreme C-terminal region of the sequence which is divergent between the different members of the Eph subclass of tyrosine kinases. The actual sequence identities between different kinases within the Eph subclass are as follows: Cek5-Cek10: 66%; Cek5-Cek6: 54%; Cek5-Cek9: 50%; Cek5-Cek8: 38%; Cek5-Cek7: 34%; Cek5-Cek4: 24%; Cek5-Eek: 39%; Cek5-Eck: 36%; Cek5-Eph: 33%; Cek10-Cek6: 64%; Cek10-Cek9: 56%; Cek10-Cek8: 47%; Cek10-Cek7: 45%; Cek10-Cek4: 32%; Cek10-Eek: 41%; Cek10-Eck: 39%; Cek10-Eph: 37%; Cek6-Cek9: 46%; Cek6-Cek8: 50%; Cek6-Cek7: 40%; Cek6-Cek4: 31%; Cek6-Eek: 39%; Cek6-Eck: 36%; Cek6-Eph: 32%; Cek9-Cek8: 46%; Cek9-Cek7: 47%; Cek9-Cek4: 29%; Cek9-Eek: 36%; Cek9-Eck: 33%; Cek9-Eph: 35%; Cek8-Cek7: 37%; Cek8-Cek4: 26%; Cek8-Eek: 39%; Cek8-Eck: 36%; Cek8-Eph: 30%; Cek7-Cek4: 36%; Cek7-Eek: 35%; Cek7-Eck: 43%; Cek7-Eph: 37%; Cek4-Eek: 29%; Cek4-Eck: 27%; Cek4-Eph: 23%; Eek-Eck: 26%; Eek-Eph: 32%; Eck-Eph: 52%.
Therefore, the carboxyl terminal variable region exhibits an amino acid sequence identity of about 23 to 66 percent between the different Eph-related kinases. The novel Eph-related kinases described herein fall within this level of sequence divergence and can therefore be distinguished by comparison to the known members of the Eph subclass. Known members of this subclass include, for example, Eph, Cek4, Cek5, Mek4, Hek, Sek (or mouse Cek8), Eck, Elk (or rat Cek6) and Eek.
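The stated range of about 23 to 66 percent can be checked directly against the pairwise values listed above. The following Python sketch transcribes those values (percent signs dropped) and reports the extremes; the variable names are illustrative only and the parsing approach is an assumption, not part of the described methods.

```python
import re

# Pairwise identities transcribed from the list above (values in percent).
PAIRWISE = """
Cek5-Cek10: 66; Cek5-Cek6: 54; Cek5-Cek9: 50; Cek5-Cek8: 38; Cek5-Cek7: 34;
Cek5-Cek4: 24; Cek5-Eek: 39; Cek5-Eck: 36; Cek5-Eph: 33;
Cek10-Cek6: 64; Cek10-Cek9: 56; Cek10-Cek8: 47; Cek10-Cek7: 45; Cek10-Cek4: 32;
Cek10-Eek: 41; Cek10-Eck: 39; Cek10-Eph: 37;
Cek6-Cek9: 46; Cek6-Cek8: 50; Cek6-Cek7: 40; Cek6-Cek4: 31; Cek6-Eek: 39;
Cek6-Eck: 36; Cek6-Eph: 32;
Cek9-Cek8: 46; Cek9-Cek7: 47; Cek9-Cek4: 29; Cek9-Eek: 36; Cek9-Eck: 33;
Cek9-Eph: 35;
Cek8-Cek7: 37; Cek8-Cek4: 26; Cek8-Eek: 39; Cek8-Eck: 36; Cek8-Eph: 30;
Cek7-Cek4: 36; Cek7-Eek: 35; Cek7-Eck: 43; Cek7-Eph: 37;
Cek4-Eek: 29; Cek4-Eck: 27; Cek4-Eph: 23;
Eek-Eck: 26; Eek-Eph: 32; Eck-Eph: 52
"""

# Key each unordered pair by a frozenset so lookup order does not matter.
identity = {
    frozenset((a, b)): int(pct)
    for a, b, pct in re.findall(r"(\w+)-(\w+):\s*(\d+)", PAIRWISE)
}

kinases = sorted({name for pair in identity for name in pair})
lowest = min(identity, key=identity.get)
highest = max(identity, key=identity.get)

print(len(kinases), "kinases,", len(identity), "pairs")
print("lowest:", sorted(lowest), identity[lowest])
print("highest:", sorted(highest), identity[highest])
```

Ten kinases give 45 unordered pairs, so the count doubles as a check that no pair was dropped in transcription.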
It is understood that limited modifications may be made without destroying biological functions of Eph-related kinases and that only a portion of the entire primary structure may be required in order to effect a particular activity. Such biological functions and activities can include, for example, signal transduction, ligand binding and/or tyrosine kinase activity. For example, the Eph-related kinases of the invention have amino acid sequences substantially similar to those shown for Cek7, Cek7.sup.+, Cek7', Cek9, Cek10, Cek5.sup.+, Cek10.sup.+ and chicken Cek6 and Cek8 in FIG. 1 (hereinafter referred to as Cek6 through Cek10.sup.+; SEQ. ID. NOS. 2, 4, 6, 8, 10, 12, 14, 20 and 22), but minor modifications of these sequences which do not destroy their activity also fall within the definition of Eph-related kinases and within the definition of the protein claimed as such. Moreover, fragments of the sequences of Cek6 through Cek10.sup.+ in FIG. 1 (SEQ. ID. NOS. 2, 4, 6, 8, 10, 12, 14, 20 and 22) which retain the function of the entire protein as well as functional domains that contain at least one function of the intact protein are included within the definition. Functional domains can include, for example, active ligand binding and catalytic domains. The boundaries of such domains are not important so long as activity is maintained. It is also understood that minor modifications of the primary amino acid sequence can result in proteins which have substantially equivalent or enhanced function as compared to the sequences set forth in FIG. 1 (SEQ. ID. NOS. 2, 4, 6, 8, 10, 12, 14, 20 and 22.) These modifications may be deliberate, as through site-directed mutagenesis, or may be accidental such as through mutation in hosts which produce Eph-related kinases. All of these modifications are included as long as biological function is retained. Further, various molecules can be attached to Eph-related kinases, for example, other proteins, carbohydrates, or lipids. Such modifications are included within the definition of Eph-related tyrosine kinase.
The term "substantially purified," when used to describe the state of Eph-related tyrosine kinases denotes the protein free of a portion of the other proteins and molecules normally associated with or occurring with Eph-related kinases in their native environment. Such substantially purified Eph-related kinases can be derived from natural sources, recombinantly expressed or synthesized by in vitro methods so long as some portion of normally associated molecules is absent.
"Isolated" when used to describe the state of the nucleic acids encoding Eph-related tyrosine kinases denotes the nucleic acids free of at least a portion of the molecules associated with or occurring with Eph-related nucleic acids in the native environment.
As used herein, the term "vector" includes nucleic acids that are capable of harboring a natural or recombinant DNA sequence of interest. Vectors are usually derived from, or contain some sequences from, a natural source. For example, bacteriophage vectors containing specially engineered features are largely derived from the phage's genome and are capable of carrying out some part of its infectious cycle. On the other hand, the sequences contained within plasmids are usually derived from different sources and compiled into a single molecule to carry out specific tasks. Thus, there are many different types of vectors and each is used according to the need to perform a desired function. Functions can include, for example, propagation in a desired host, cloning recombinant or natural fragments of DNA, mutagenesis, expression and the like. In sum, "vector" is given an operative definition, and any DNA sequence which is capable of effecting a function of a specified DNA sequence disposed therein is included in this term as it is applied to the specified sequence.
The invention provides a substantially purified Eph-related protein tyrosine kinase, or functional fragment thereof. Also provided is a substantially purified chicken Eph-related protein tyrosine kinase. The substantially purified Eph-related protein tyrosine kinase exhibits about 23 to 66 percent amino acid sequence identity in its carboxyl terminal variable region compared to known members of the Eph subclass of tyrosine kinases. The amino acid sequences are substantially the same as that shown for Cek6 through Cek10.sup.+ in FIG. 1 (SEQ. ID. NOS. 2, 4, 6, 8, 10, 12, 14, 20 and 22.)
The invention also provides an isolated nucleic acid encoding an Eph-related protein tyrosine kinase, or functional fragment thereof. The isolated nucleic acid encoding an Eph-related protein tyrosine kinase exhibits about 23 to 66 percent amino acid sequence identity in its carboxyl terminal variable region compared to known members of the Eph subclass of tyrosine kinases. The encoding nucleotide sequences are substantially the same as that shown for Cek6 through Cek10.sup.+ in the sequence listing (SEQ. ID. NOS. 1, 3, 5, 7, 9, 11, 13, 19 and 21.)
The isolation of seven cDNAs that encode novel Eph-related receptor tyrosine kinases is disclosed herein. The predicted amino acid sequences of these Eph-related kinases are shown in FIG. 1 along with other known Cek kinase sequences and those of Eph and Eck. A number of conserved features serve to define the newly discovered kinases as members of the Eph subclass. These include an amino terminal immunoglobulin domain followed by a cysteine-rich stretch in the extracellular domain, with the position of most cysteines conserved, and sequences corresponding to two fibronectin type III repeats in close proximity to the transmembrane domain (O'Bryan et al., Mol. Cell. Biol. 11:5016-5031 (1991) and Pasquale, E. B. (1991), the former of which is incorporated herein by reference). Potential sites of N-glycosylation are primarily localized in the C-terminal half of the extracellular regions. The homologies in the extracellular domains indicate that the different members of the Eph family can bind a similar class of ligands. FIG. 1 also shows that the Eph family, with the inclusion of the new members that have been identified, can now be considered the largest known family of membrane-spanning tyrosine kinases. Such a large number of tyrosine kinases in this one class is surprising in view of the fact that the other families of receptor tyrosine kinases have fewer members.
The catalytic domains of the Eph-related kinases are highly conserved and exhibit amino acid identities ranging between 61% and 90%. The C-terminal tails are less conserved (FIG. 1) and therefore constitute a variable region which can be used to specify the distinct Eph-related kinases. Only one of the tyrosines in the C-terminal variable region, corresponding to tyrosine 939 of Cek5, is conserved in all the members of the Eph family, with the exception of Cek4. This conserved tyrosine residue represents a likely site of autophosphorylation and regulation (Ullrich and Schlessinger, Cell 61:203-212 (1990)). The large size of the Eph subclass of receptor tyrosine kinases, the variability within their sequences and their different tissue distributions indicate that each receptor can, for example, serve distinct functions during cellular processes.
The variability in both the lengths and sequences of the juxtamembrane domains observed in the Eph-related kinases is unusual among tyrosine kinases belonging to the same subclass (Ullrich and Schlessinger, supra). Because clones encoding variants with amino acid insertions in the juxtamembrane domain were isolated for Cek5, Cek7 and Cek10, the variability in the lengths of the juxtamembrane domains is likely to originate by alternative splicing (FIG. 1). Juxtamembrane domains are important for the modulation of receptor functions by heterologous stimuli, for example, through phosphorylation by other kinases. The juxtamembrane domains of the members of the Eph family contain numerous serines, threonines and tyrosines that can serve as sites of regulation by phosphorylation (Kemp et al., Trends Biol. Sciences 15:342-346 (1990), which is incorporated herein by reference). For example, Cek9 and Cek10, as well as Cek5, Cek6 and Eck, contain the consensus sequence (S/T)P, which is recognized by proline-dependent protein kinases such as cdc2 (Kemp et al. (1990)). Juxtamembrane domains have also been indicated to be important in the regulation of the subcellular distribution of the kinase and in the binding of some substrates (Ullrich and Schlessinger, supra).
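The (S/T)P consensus mentioned above is simple enough to locate with a regular expression. The Python sketch below is illustrative only: the function name and the toy segment are hypothetical, the segment is not a Cek sequence, and a motif match by itself does not show that a site is phosphorylated.

```python
import re

# Serine or threonine immediately followed by proline: the (S/T)P consensus.
SP_TP_MOTIF = re.compile(r"[ST]P")


def find_proline_directed_sites(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based position, residue) for each S or T directly followed by P."""
    return [
        (match.start() + 1, match.group()[0])
        for match in SP_TP_MOTIF.finditer(sequence.upper())
    ]


# Hypothetical peptide segment in one-letter code, for illustration only.
toy_segment = "KRSPLTDYTPGA"
sites = find_proline_directed_sites(toy_segment)
print(sites)
```

Positions are reported relative to the start of the supplied segment, so an offset would need to be added to map them onto full-length residue numbering such as that used in FIG. 1.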
The mRNA corresponding to Cek5.sup.+, the variant form of Cek5, was shown to be specifically expressed in the central nervous system, indicating that Cek5.sup.+ functions primarily in neuronal cellular functions. Indicative of this is another tyrosine kinase, src, which has been shown to encode neuronal specific variants containing 6 to 17 amino acid insertions in the regulatory (non-catalytic) region (Brugge et al., Nature 316:554-557 (1985); Martinez et al., Science 237:411-415 (1987); Pyper et al., Mol. Cell. Biol. 10:2035-2040 (1990), all of which are incorporated herein by reference). These neuronal forms of c-src have higher specific catalytic activity than non-neuronal c-src.
Although the predicted molecular masses of the different members of the Eph family are similar, the sizes of their transcripts appear quite varied (4 to 10 kb). In addition, several mRNA species for each of the Eph-related kinases, particularly in the central nervous system, were detected using a panel of probes. As described below, the patterns of expression of these novel Eph-related kinases are also distinct.
DNA sequences encoding the polypeptides of Eph-related kinases can be obtained by methods known to one skilled in the art. The sequences described herein are sufficient for one skilled in the art to practice the invention. Such methods include, for example, cDNA synthesis and polymerase chain reaction (PCR). The need will determine which method or combination of methods is to be used to obtain the desired sequence. Expression can be performed in any compatible vector/host system. Such systems include, for example, plasmids or phagemids in procaryotes such as E. coli, yeast systems and other eucaryotic systems such as mammalian cells. Additionally, the Eph-related kinases can also be expressed in soluble or secreted form depending on the need and the vector/host system employed.
Such vectors and vector/host systems are known, or can be constructed by those skilled in the art and should contain all expression elements necessary for the transcription, translation, regulation, and sorting of the polypeptide which makes up the Eph-related kinase. Other beneficial characteristics may also be contained within the vectors such as mechanisms for recovery of the nucleic acids in a different form. Phagemids are a specific example of this because they can be used either as plasmids or as bacteriophage vectors. The vectors can also be for use in either procaryotic or eucaryotic host systems so long as the expression elements are of a compatible origin. One of ordinary skill in the art will know which host systems are compatible with a particular vector. Thus, the invention provides vectors, host cells transformed with the vectors and Eph-related kinases produced from the host cells containing a nucleic acid encoding an Eph-related kinase.
The invention also provides methods of diagnosing cancer and determining cancer prognosis. The method includes removing a tissue or cell sample from a subject suspected of having cancer and determining the level of Eph-related protein tyrosine kinase in said sample, wherein a change in the level or activity of an Eph-related protein tyrosine kinase compared to a normal sample indicates the presence of a cancer or indicates the level of malignancy of a cancer and, therefore, the most appropriate course of treatment.
As stated previously, receptor tyrosine kinases are involved in many signal transduction events that regulate important cellular processes. Such processes include, for example, cellular differentiation and proliferation. Abnormal regulation or expression of the signal transduction machinery can lead to aberrant and malignant growth of the abnormally regulated cells. Abnormal expression of Eph is known to be associated with carcinomas of the liver, lung, breast and colon, for example. Likewise, since some Eph-related tyrosine kinases are, at least, found within the same tissues as Eph, their abnormal expression may also lead to the development of the carcinomas described above as well as other types of cancers. Additionally, cancers of the neuronal lineage are likely to be caused by the abnormal expression or regulation of Cek5.sup.+ since this Eph-related kinase is found exclusively in neuronal tissues. Cek5.sup.+, Cek5 and the other Eph-related kinases expressed in the nervous system also are likely to be involved in nerve regeneration.
The important role that these receptor tyrosine kinases play in cellular processes can be advantageously used to diagnose early stages of cancer within a cell sample or tissue. A change in the amount or activity of an Eph-related kinase in a suspected sample, compared to a normal sample, will be indicative of cancerous stages and of their level of malignancy. Depending on whether the normal state is caused by the presence or absence of an Eph-related kinase, the change can involve either an increase or decrease in the amount or activity of the Eph-related kinase. One skilled in the art can measure these parameters and compare them to those obtained from a normal sample. Methods for determining the levels or activity of Eph-related kinases are known to one skilled in the art and include, for example, RNA and protein blot analysis, ELISA using specific antibodies to each of the Eph-related kinases and direct measurement of catalytic activity such as tyrosine kinase activity. Such methods can be found in Harlow et al., Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory (1989), which is incorporated herein by reference.
The following examples are intended to illustrate, but not limit the invention.
EXAMPLE I
Isolation and Characterization of Eph-Related Tyrosine Kinases
This example shows the cloning and sequencing of the Eph-related kinases Cek4 through Cek10.sup.+. Structural characteristics and patterns of expression are also described.
To find novel members of the Eph family, various cDNA probes were used at different stringencies to screen a 10-day embryonic cDNA library as well as a 13-day embryonic brain cDNA library. The probes were derived from Cek4 or Cek5, which had been previously isolated based on phosphotyrosine content. Following subcloning and sequence analysis, it was found that the newly isolated cDNA clones encoded seven different Eph-related tyrosine kinases. Their isolation and structure are described below.
Briefly, a 10-day chicken embryo .lambda.gt11 cDNA library (Clontech) and a 13-day embryonic brain .lambda.gt11 cDNA library were used to isolate the cDNA clones. Screening was performed at different stringencies using the following procedure. Plaques were transferred to nylon membranes (Micron Separations Inc.) on duplicate filters and hybridized to the appropriate probes at one of two stringencies (50% formamide, 42.degree. C.; or 50% formamide, 37.degree. C.). Conditions used were those recommended by the manufacturer and probes were detected using a nonradioactive DNA labeling and detection method (Boehringer Mannheim). Plaques identified as positive were subjected to three rounds of purification prior to DNA extraction using Lambda-TRAP (Clontech). Inserts from recombinant lambda DNA were subcloned in pBluescript vectors (Stratagene, San Diego, Calif.) using standard procedures and the sequences were analyzed on both strands, using the dideoxynucleotide chain-termination technique with Sequenase (United States Biochemical, Cleveland, Ohio).
Several clones distinguishable over known Eph tyrosine kinases were isolated. The clones include: one Cek5.sup.+ cDNA clone (from the chick embryo library); three Cek6 clones (two from the embryonic brain and one from the chick embryo library); one Cek7 clone (from the chick embryo library); one Cek7.sup.+ clone (from the chick embryo library); one Cek7'.sup.+ clone (from the embryonic brain library); one Cek9 clone (from the chick embryo library); one Cek10.sup.+ clone (from the chick embryo library) and two Cek10 or Cek10.sup.+ clones, which are indistinguishable because they do not encode the juxtamembrane domain, (one from the chick embryo and one from the embryonic brain library). A Cek4 probe (corresponding to nucleotides 748-1756; see Sajjadi et al. 1991), on the other hand, was used to isolate one Cek8 clone (from the chick embryo library). Also, following its initial isolation, a Cek10 probe, corresponding to residues 400-596 in FIG. 2, was used to isolate clones extending further into the 5' end from the chick embryo library. Of the two clones isolated, one represented Cek10 and one Cek10.sup.+.
The above-identified Eph-related kinases were characterized in terms of tissue distribution and expression by RNA blot analysis. Poly-A RNA was prepared from chicken tissues using the procedure of Badley et al., Biotechniques 6:114-116 (1988), which is incorporated herein by reference. Poly-A RNA (4-5 .mu.g) was size-fractionated alongside RNA molecular weight markers on 0.9% agarose gels containing formaldehyde (Sambrook et al., Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989), which is incorporated herein by reference) and transferred to nitrocellulose filters (Schleicher & Schuell) according to methods known to one skilled in the art. The membranes were prehybridized for 2 hours and then hybridized under stringent conditions (50% formamide, 5.times. SSPE, 5.times. Denhardt's reagent, 0.5% SDS, 100 .mu.g/ml salmon testes DNA, 42.degree. C.). Probes were labeled with .sup.32 P dATP by the random-primed method of Feinberg & Vogelstein, Anal. Biochem. 132:6-13 (1983), which is incorporated herein by reference. T4 polynucleotide kinase was used to label the 5' end of the Cek5.sup.+ specific oligonucleotide (Sambrook et al. 1989). Filters were washed to a final stringency of 0.1.times. SSPE, 0.1% SDS at 58.degree. C. prior to exposure to Kodak XAR-5 X-ray film. For autoradiography of .beta.-actin controls, intensifying screens were typically omitted and exposure time was reduced to 2 hours.
The following cDNA probes were used for RNA blot analysis: Cek4, 1.2 Kb, same probe used for the library screening described previously, hybridizes to the region encoding amino acid residues 240-575; Cek5 probe, 1.2 Kb, hybridizes to the 3' untranslated region; Cek6 5' probe, 1.3 Kb, hybridizes to amino acid residues 1-438; Cek6 3' probe, 0.6 Kb, hybridizes to the region following amino acid 844; Cek7 5' probe, 0.4 Kb, hybridizes to amino acid residues 1-136; Cek7 3' probe, 2.0 Kb, hybridizes to the region following amino acid 137, including the 3' untranslated region; Cek8 probe, 1.2 Kb, hybridizes to the region encoding amino acid residues 1-406; Cek9 probe, 0.6 Kb, hybridizes to the region encoding amino acid residues 1-208; Cek10 probe, 0.6 Kb, hybridizes to the region encoding the 10 C-terminal amino acids and to about 600 nucleotides of 3' untranslated region. For Cek6 and Cek7, the 3' Cek6 probe and the 5' Cek7 probe were used for the embryonic tissues mRNAs and a mixture of 5' and 3' probes for the adult tissues mRNAs.
Polyadenylated RNA was isolated from a number of adult chick tissues, as well as from brain and body tissues of 10-day embryos. These RNAs were then used for RNA blot analysis using the above specific probes. Probes were designed to minimize the possibility of cross-hybridization among the related kinases. Chicken .beta.-actin DNA was used as a control probe (Cleveland et al., Cell 20:95-105 (1980), which is incorporated herein by reference).
The amino acid sequence of Cek4 (SEQ ID NO: 16) is 67% identical to that of Cek5 (SEQ ID NO: 18) in the catalytic and C-terminal regions and is most closely related to that of Cek7 (SEQ ID NO: 4) (75% amino acid identity in the same regions) (FIG. 1). Preliminary data had indicated that Cek4 was highly expressed in the developing chicken brain and embryonic tissues, but no information was obtained on the adult pattern of expression in the chick. These data were therefore included in FIG. 2. The 7.5 Kb Cek4 transcript previously described was confirmed to be abundant in 10-day embryonic tissues. Expression was pronounced in the adult brain and retina, and lower but detectable in all other adult tissues examined, except the liver. In addition to the major 7.5 Kb transcript, a smaller Cek4 transcript (of about 5 Kb) was found to be expressed at lower levels in the adult brain.
The Cek6 amino acid sequence (SEQ ID NO: 2) is most closely related to that of rat Elk (96% identity in the catalytic and C-terminal regions). Of the Cek members of the Eph subclass, Cek6 is most closely related to Cek5 (SEQ ID NO: 18) and Cek10 (SEQ ID NO: 10) (82% amino acid identity with both, in the catalytic and C-terminal regions) (FIG. 1). The two Cek6 cDNAs that were isolated from a 13-day chick embryo brain library were identical and both encoded a protein with a deletion of 32 amino acids and an insertion of 19 amino acids in the extracellular region (FIG. 1). However, these may be cloning artifacts, particularly the deletion, since it causes a shift in the reading frame and the premature termination of the encoded protein. A 4.4 Kb Cek6 transcript was found to be expressed at high levels in the 10-day embryo and in adult brain, lung, heart and skeletal muscle (FIG. 2). Low levels of Cek6 expression were detected in all other adult tissues tested. A second larger Cek6 transcript (of about 6.5 Kb) was detected at low levels in the adult brain.
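The effect of a reading-frame shift can be illustrated with the first 47 nucleotides of SEQ ID NO: 1, whose coding sequence begins at nucleotide 3 and is translated as SEQ ID NO: 2. The sketch below uses the standard genetic code; the helper names are introduced here for illustration only, and the out-of-frame translation is a generic demonstration of how a frame shift can produce premature termination, not a reconstruction of the deletion found in the brain-library clones.

```python
# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W"
         "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR"
         "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, start):
    """Translate dna from a 1-based start position, stopping at the
    first stop codon or at the last complete codon."""
    protein = []
    for i in range(start - 1, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Nucleotides 1-47 of SEQ ID NO: 1 as given in the sequence listing.
SEQ1_START = "CAGAAACCCTGATGGACACACGGACAGCGACGGCTGAGCTGGGCTGG"

in_frame = translate(SEQ1_START, 3)  # annotated frame (CDS starts at 3)
shifted = translate(SEQ1_START, 1)   # same DNA read in a different frame

print(in_frame)  # ETLMDTRTATAELGW
print(shifted)   # QKP
```

Read in the annotated frame, the fragment yields the first fifteen residues of SEQ ID NO: 2 (GluThrLeuMetAspThrArgThrAlaThrAlaGluLeuGlyTrp); read from nucleotide 1 instead, a TGA stop codon is reached after three residues.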
The amino acid sequence of Cek7 (SEQ ID NO: 4) is 71% identical to that of Cek5 (SEQ ID NO: 18) in the catalytic and C-terminal regions and is most closely related to those of Cek4 (SEQ ID NO: 16) and Cek9 (SEQ ID NO: 8) (75% amino acid identity with both, in the same regions) (FIG. 1). A variant form of Cek7, containing a 22 amino acid insertion in the juxtamembrane domain (FIG. 1), also was isolated and designated Cek7.sup.+. Cek7 and Cek7.sup.+ (SEQ ID NO: 22) may originate from the same gene by alternative splicing. A second variant form of Cek7, designated Cek7' (SEQ ID NO: 20), which also presumably originates via alternative splicing, differs from Cek7 in the C-terminal 33 amino acids. Cek7 appears to have the lowest levels of expression among all the Eph-related kinases examined. Three different transcripts of about 4.4 Kb, 7 Kb and 8.5 Kb were detected in the 10-day embryonic brain. Expression was weaker in the rest of the 10-day embryo, where only the 4.4 Kb transcript could be detected (FIG. 2). Cek7 transcripts were not detected in the adult tissues, except for a barely detectable 8.5 Kb transcript in the brain (FIG. 2).
Cek8 (SEQ ID NO: 6) is equally related to Cek5 (SEQ ID NO: 18), Cek6 (SEQ ID NO: 2), Cek7 (SEQ ID NO: 4) and Cek10 (SEQ ID NO: 10) (74% amino acid identity in the catalytic and C-terminal regions) (FIG. 1). A single 6 Kb Cek8 transcript was found to be present in both the 10-day embryonic brain and body tissues (FIG. 2). Cek8 expression appears to be the highest in adult brain and retina and is also detectable in kidney, lung, skeletal muscle and thymus (FIG. 2). Cek8 expression was not detected in heart and liver.
Cek9 (SEQ ID NO: 8) is most closely related to Cek5 (SEQ ID NO: 18) (77% identity at the amino acid level in the catalytic and C-terminal regions) (FIG. 1). A 4.4 Kb Cek9 transcript is present in embryonic brain and body tissues. Two additional and very minor transcripts of about 5.5 Kb and 6.5 Kb were detected exclusively in the 10-day embryonic brain (FIG. 2). Among the adult tissues examined, Cek9 expression is prominent in the thymus and detectable in brain, retina, kidney, lung and heart. None of the other kinases examined displays such an elevated level of expression in the thymus. Cek9 expression was not detected in skeletal muscle and liver.
Cek10 (SEQ ID NO: 10) is most closely related to Cek5 (SEQ ID NO: 18) and Cek6 (SEQ ID NO: 2) (84% amino acid identity with both in the catalytic and C-terminal regions) (FIG. 1). A variant form of Cek10, containing a 15 amino acid insertion in the juxtamembrane domain (FIG. 1), was also isolated and designated Cek10.sup.+ (SEQ ID NO: 14). Cek10 and Cek10.sup.+ may originate from the same gene by alternative splicing. Northern blot analysis identified two Cek10 transcripts of about 4.4 Kb and 6 Kb, present at different relative levels in 10-day embryonic brain and body tissues as well as in a number of adult tissues (FIG. 2). Among the adult tissues examined, Cek10 expression was particularly prominent in the kidney. Lower Cek10 expression was detected in the lung and barely detectable transcripts were also present in brain, liver, heart, skeletal muscle and thymus.
A variant form of Cek5, containing a 16 amino acid insertion in the juxtamembrane domain, was also identified and termed Cek5.sup.+ (SEQ ID NO: 12) (FIG. 1). This Cek5 variant may originate as a result of alternative splicing. With a Cek5 DNA probe recognizing both Cek5 and Cek5.sup.+ (see Material and Methods), a 4.4 Kb transcript was detected in both 10-day embryonic brain and body tissues (FIG. 3, lanes 1 and 3). In addition, a much larger transcript (of about 10 Kb) was detected in the 10-day embryonic brain (FIG. 3, lane 3). Consistent with the previously reported expression of the Cek5 protein, Cek5 transcripts are more abundant in the brain than in other 10-day embryonic tissues. Using a probe corresponding to the 16 amino acid insertion in the juxtamembrane domain (FIG. 3, lanes 2 and 4), Cek5.sup.+ was found to be exclusively expressed in the central nervous system and only as the 4.4 Kb transcript. Because Cek5 immunoreactivity in the central nervous system has been previously found to be confined to neurons, Cek5.sup.+ appears to be a neuron-specific variant of Cek5.
Polyclonal antibodies specifically recognizing Cek4, Cek8 and Cek9 have been obtained and will be used for the characterization of these kinases. Peptides corresponding to the carboxy-terminal ends of Cek4, Cek8 and Cek9 were coupled to bovine serum albumin with m-maleimidobenzoyl-N-hydroxysuccinimide ester (Cek4) or with glutaraldehyde (Cek8 and Cek9) and used as immunogens. The peptides used were the following: Cek4, CLETHTKNSPVPV (SEQ ID NO: 24); Cek8, KMQQMHGRMVPV (SEQ ID NO: 25) and Cek9, KVHLNQLEPVEV (SEQ ID NO: 26). The carboxy-terminal regions were chosen because they are poorly conserved within the Eph subclass, increasing the likelihood of obtaining antibodies specific for each kinase. The antibodies were purified from the antiserum by affinity chromatography on the appropriate peptides coupled to N-hydroxysuccinimide-activated agarose (BioRad). As shown in FIG. 4, after affinity purification the antibodies to Cek4, Cek8 and Cek9 recognize a single band of the expected molecular weight (about 120 kd) in membrane-containing fractions isolated from 10-day embryonic brain, but not in fractions containing soluble proteins. These antibodies do not cross-react significantly with related members of the Eph subclass (not shown) and can be used for different applications, such as immunoblotting (FIG. 4), immunofluorescence microscopy and immunoprecipitation (FIG. 4). All of the antibodies are capable of immunoprecipitating the kinases from tissue extracts and, as expected, the immunoprecipitated kinases undergo in vitro autophosphorylation in the presence of ATP. These techniques will allow the characterization of the kinases of the Eph subclass at the protein level. Coupled to a solid support, the antibodies can also be used to purify the kinases from tissues and cell lines. In the cases tested, antibodies generated to the chicken Eph-related kinases recognize the corresponding mammalian homologues.
Thus, these antibodies could be used, for example, to screen tumor samples for the presence of the appropriate Eph-related kinases.
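The poor conservation of the carboxy-terminal regions can be illustrated with the three immunizing peptides themselves. The sketch below is a simplified, ungapped comparison that lines the peptides up from their C-terminal residue; this alignment choice and the function name are assumptions made for illustration and are not the method used to obtain the identity values reported for the catalytic and C-terminal regions.

```python
from itertools import combinations

# Immunizing peptides as listed above (SEQ ID NOS: 24-26).
PEPTIDES = {
    "Cek4": "CLETHTKNSPVPV",
    "Cek8": "KMQQMHGRMVPV",
    "Cek9": "KVHLNQLEPVEV",
}


def c_terminal_identity(a, b):
    """Percent identity of two peptides compared residue by residue,
    anchored at the C-terminus, over the length of the shorter one."""
    n = min(len(a), len(b))
    matches = sum(x == y for x, y in zip(a[-n:], b[-n:]))
    return 100.0 * matches / n


pairwise = {
    (p, q): c_terminal_identity(PEPTIDES[p], PEPTIDES[q])
    for p, q in combinations(PEPTIDES, 2)
}

for (p, q), identity in pairwise.items():
    print(f"{p} vs {q}: {identity:.0f}% identical")
# Cek4 vs Cek8: 25% identical
# Cek4 vs Cek9: 25% identical
# Cek8 vs Cek9: 25% identical
```

Under this simple comparison each pair shares only three of twelve positions, all within the last four residues, which is in keeping with the rationale given for choosing carboxy-terminal peptides as immunogens.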
Although the invention has been described with reference to the disclosed embodiments, it should be understood that various modifications can be made without departing from the spirit of the invention. Accordingly, the invention is limited only by the claims.
__________________________________________________________________________SEQUENCE LISTING(1) GENERAL INFORMATION:(iii) NUMBER OF SEQUENCES: 26(2) INFORMATION FOR SEQ ID NO:1:(i) SEQUENCE CHARACTERISTICS:(A) LENGTH: 3133 base pairs(B) TYPE: nucleic acid(C) STRANDEDNESS: both(D) TOPOLOGY: linear(ix) FEATURE:(A) NAME/KEY: CDS (B) LOCATION: join(3..419, 421..2858)(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:CAGAAACCCTGATGGACACACGGACAGCGACGGCTGAGCTGGGCTGG47GluThrLeuMetAspThrArgThrAlaThrAlaGluLeuGlyTrp15 1015ACTGCCAACCCTCCGTCAGGGTGGGAAGAAGTGAGTGGCTACGACGAG95ThrAlaAsnProProSerGlyTrpGluGluValSerGlyTyrAspGlu 202530AACCTGAACACCATCCGTACCTACCAGGTGTGCAACGTCTTCGAGCCA143AsnLeuAsnThrIleArgThrTyrGlnValCysAsnValPheGluPro 354045AACCAGAACAACTGGCTCCTCACCACCTTCATCAACCGGCGCGGAGCC191AsnGlnAsnAsnTrpLeuLeuThrThrPheIleAsnArgArgGlyAla50 5560CACCGCATCTACACTGAGATGCGCTTCACTGTGCGGGACTGCAGCAGC239HisArgIleTyrThrGluMetArgPheThrValArgAspCysSerSer65 7075CTCCCCAACGTCCCCGGCTCCTGCAAGGAGACCTTCAACCTCTACTAC287LeuProAsnValProGlySerCysLysGluThrPheAsnLeuTyrTyr8085 9095TATGAGACAGACTCTGTCATTGCCACTAAGAAGTCGGCCTTCTGGACG335TyrGluThrAspSerValIleAlaThrLysLysSerAlaPheTrpThr 100105110GAGGCACCCTACCTCAAAGTGGACACCATTGCTGCTGACGAGAGCTTT383GluAlaProTyrLeuLysValAspThrIleAlaAlaAspGluSerPhe 115120125TCCCAGGTGGACTTTGGTGGCAGGTTGATGAAGGGTTTTCTTCAAG429SerGlnValAspPheGlyGlyArgLeuMetLysGlyPhePheLys130 135140AAGTGCCCAAGCGTGGTGCAGAACTTCGCTATCTTCCCTGAGACGATG477LysCysProSerValValGlnAsnPheAlaIlePheProGluThrMet145 150155ACGGGGGCAGAGAGCACCTCTCTGGTGACAGCACGGGGCACCTGCATC525ThrGlyAlaGluSerThrSerLeuValThrAlaArgGlyThrCysIle160 165170CCCAACGCTGAGGAGGTGGACGTGCCCATCAAGCTGTACTGCAACGGG573ProAsnAlaGluGluValAspValProIleLysLeuTyrCysAsnGly175180 185190GATGGGGAGTGGATGGTACCCATAGGTCGCTGCACCTGCAAGGCTGGT621AspGlyGluTrpMetValProIleGlyArgCysThrCysLysAlaGly195 200205TATGAGCCGGAAAACAACGTGGCTTGCAGAGCCTGCCCGGCTGGGACA669TyrGluProGluAsnAsnValAlaCysArgAlaCysProAlaGlyThr210 
215220TTCAAAGCCAGTCAGGGTGCGGGGCTGTGTGCCCGCTGTCCCCCCAAC717PheLysAlaSerGlnGlyAlaGlyLeuCysAlaArgCysProProAsn225 230235AGCCGCTCCAGCGCCGAGGCCTCACCGCTCTGCGCCTGCCGCAACGGC765SerArgSerSerAlaGluAlaSerProLeuCysAlaCysArgAsnGly240 245250TACTTTCGGGCTGACCTGGACCCACCGACAGCTGCCTGCACCAGCGTC813TyrPheArgAlaAspLeuAspProProThrAlaAlaCysThrSerVal255260 265270CCCTCTGGTCCACGCAACGTCATCTCCATTGTCAATGAGACCTCCATC861ProSerGlyProArgAsnValIleSerIleValAsnGluThrSerIle275 280285ATCCTGGAGTGGAACCCGCCACGGGAGACAGGAGGCCGGGATGATGTC909IleLeuGluTrpAsnProProArgGluThrGlyGlyArgAspAspVal290 295300ACTTACAACATTGTCTGCAAGAAGTGCCGGGCAGACCGGCGTGCCTGC957ThrTyrAsnIleValCysLysLysCysArgAlaAspArgArgAlaCys305 310315TCCCGCTGCGACGACAACGTGGAGTTTGTGCCCCGACAGCTGGGGCTG1005SerArgCysAspAspAsnValGluPheValProArgGlnLeuGlyLeu320 325330ACAGAGACCCGCGTCTTCATCAGCAGCCTCTGGGCACACACACCCTAC1053ThrGluThrArgValPheIleSerSerLeuTrpAlaHisThrProTyr335340 345350ACCTTTGAGATCCAGGCGGTCAACGGGGTTTCCAACAAGAGCCCCTTC1101ThrPheGluIleGlnAlaValAsnGlyValSerAsnLysSerProPhe355 360365CCACCCCAGCACGTCTCCGTGAACATCACCACCAACCAAGCTGCACCC1149ProProGlnHisValSerValAsnIleThrThrAsnGlnAlaAlaPro370 375380TCCACTGTCCCCATCATGCACCAGGTGAGTGCCACCATGAGGAGCATC1197SerThrValProIleMetHisGlnValSerAlaThrMetArgSerIle385 390395ACGCTATCCTGGCCGCAGCCGGAGCAGCCCAACGGCATCATCCTGGAC1245ThrLeuSerTrpProGlnProGluGlnProAsnGlyIleIleLeuAsp400 405410TACGAGCTGCGCTACTACGAGAAGCTGAGCCGCATCTGCACGCCCGAT1293TyrGluLeuArgTyrTyrGluLysLeuSerArgIleCysThrProAsp415420 425430GTCAGCGGCACTGTGGGCTCGAGACCGGCGGCGGACCACAACGAGTAC1341ValSerGlyThrValGlySerArgProAlaAlaAspHisAsnGluTyr435 440445AACTCCTCTGTGGCCCGCAGTCAGACCAACACGGCCCGGCTGGAGGGG1389AsnSerSerValAlaArgSerGlnThrAsnThrAlaArgLeuGluGly450 455460CTGCGCCCTGGCATGGTGTACGTGGTGCAGGTGCGAGCAAGGACGGTG1437LeuArgProGlyMetValTyrValValGlnValArgAlaArgThrVal465 470475GCCGGCTATGGGAAGTACAGTGGGAAGATGTGCTTCCAGACACTGACC1485AlaGlyTyrGlyLysTyrSerGlyLysMetCysPheGlnThrLeuThr480 485490GATGATGACTACAAGTCTGAGCTGAGGGAGCAGCTGCCATTGATTGCG1533AspAspAspTyrLysSerGluLeuArgGluGlnLeuProLeuIleAla495500 
505510GGGTCTGCAGCGGCCGGCGTGGTCTTCATTGTTTCGCTGGTGGCCATT1581GlySerAlaAlaAlaGlyValValPheIleValSerLeuValAlaIle515 520525TCCATAGTGTGCAGCAGGAAGCGAGCGTACAGCAAGGAGGTCGTTTAC1629SerIleValCysSerArgLysArgAlaTyrSerLysGluValValTyr530 535540AGCGATAAGCTGCAGCACTACAGCACCGGGAGAGGGTCTCCGGGAATG1677SerAspLysLeuGlnHisTyrSerThrGlyArgGlySerProGlyMet545 550555AAGATTTACATCGACCCCTTCACTTATGAGGACCCCAACGAGGCAGTG1725LysIleTyrIleAspProPheThrTyrGluAspProAsnGluAlaVal560 565570CGTGAGTTCGCCAAGGAGATTGACGTCTCCTTTGTGAAGATTGAAGAG1773ArgGluPheAlaLysGluIleAspValSerPheValLysIleGluGlu575580 585590GTCATTGGAGCAGGGGAGTTTGGAGAGGTGTACAAAGGCCGCCTGAAG1821ValIleGlyAlaGlyGluPheGlyGluValTyrLysGlyArgLeuLys595 600605TTGCCTGGCAAGCGGGAGATCTATGTGGCCATCAAAACACTGAAGGCT1869LeuProGlyLysArgGluIleTyrValAlaIleLysThrLeuLysAla610 615620GGCTACTCAGAGAAGCAGCGCCGGGATTTCCTGAGCGAAGCCAGCATC1917GlyTyrSerGluLysGlnArgArgAspPheLeuSerGluAlaSerIle625 630635ATGGGGCAGTTTGACCACCCCAACATCATCCGGCTGGAAGGGGTGGTG1965MetGlyGlnPheAspHisProAsnIleIleArgLeuGluGlyValVal640 645650ACCAAGAGCCGACCAGTCATGATTATCACAGAGTTCATGGAGAATGGG2013ThrLysSerArgProValMetIleIleThrGluPheMetGluAsnGly655660 665670GCCCTGGACTCGTTCCTGCGGCAAAATGATGGGCAGTTCACAGTGATC2061AlaLeuAspSerPheLeuArgGlnAsnAspGlyGlnPheThrValIle675 680685CAGCTGGTGGGGATGCTCAGAGGGATTGCTGCTGGGATGAAGTACCTG2109GlnLeuValGlyMetLeuArgGlyIleAlaAlaGlyMetLysTyrLeu690 695700GCAGAGATGAACTATGTCCACAGGGATCTGGCGGCCAGGAACATTCTG2157AlaGluMetAsnTyrValHisArgAspLeuAlaAlaArgAsnIleLeu705 710715GTCAACAGCAACCTGGTGTGCAAAGTGTCAGACTTTGGCCTCTCGCGC2205ValAsnSerAsnLeuValCysLysValSerAspPheGlyLeuSerArg720 725730TACCTGCAGGACGACACCTCTGATCCCACCTACACCAGCTCCTTGGGT2253TyrLeuGlnAspAspThrSerAspProThrTyrThrSerSerLeuGly735740 745750GGGAAGATCCCTGTGCGATGGACAGCACCAGAGGCCATTGCGTACCGC2301GlyLysIleProValArgTrpThrAlaProGluAlaIleAlaTyrArg755 760765AAGTTCACGTCAGCCAGTGACGTCTGGAGCTATGGCATCGTCATGTGG2349LysPheThrSerAlaSerAspValTrpSerTyrGlyIleValMetTrp770 775780GAGGTGATGTCGTTCGGAGAGAGGCCCTACTGGGACATGTCCAACCAG2397GluValMetSerPheGlyGluArgProTyrTrpAspMetSerAsnGln785 
790795GACGTCATCAATGCCATCGAGCAGGACTACCGGCTCCCGCCGCCCATG2445AspValIleAsnAlaIleGluGlnAspTyrArgLeuProProProMet800 805810GACTGCCCAGCTGCCCTGCACCAACTGATGCTGGACTGCTGGCAGAAG2493AspCysProAlaAlaLeuHisGlnLeuMetLeuAspCysTrpGlnLys815820 825830GACCGCAACACCCGGCCTCGCTTGGCCGAGATTGTCAACACCCTGGAC2541AspArgAsnThrArgProArgLeuAlaGluIleValAsnThrLeuAsp835 840845AAAATGATCCGCAACCCGGCAAGCCTCAAAACTGTGGCTACCATCACC2589LysMetIleArgAsnProAlaSerLeuLysThrValAlaThrIleThr850 855860GCTGTGCCTTCTCAGCCCCTCCTCGACCGCTCTATCCCTGATTTCACT2637AlaValProSerGlnProLeuLeuAspArgSerIleProAspPheThr865 870875GCCTTTACCTCAGTAGAAGACTGGCTGAGTGCCGTCAAGATGAGCCAG2685AlaPheThrSerValGluAspTrpLeuSerAlaValLysMetSerGln880 885890TATAGAGACAACTTCCTGAGCGCTGGATTCACCTCCCTCCAGCTGGTC2733TyrArgAspAsnPheLeuSerAlaGlyPheThrSerLeuGlnLeuVal895900 905910GCCCAGATGACATCTGAAGACCTCCTGAGAATAGGAGTAACGCTGGCT2781AlaGlnMetThrSerGluAspLeuLeuArgIleGlyValThrLeuAla915 920925GGGCACCAGAAGAAGATCCTGAACAGCATCCAGTCCATGCGCGTGCAG2829GlyHisGlnLysLysIleLeuAsnSerIleGlnSerMetArgValGln930 935940ATGAGTCAGTCTCCGACCTCGATGGCGTGACGTCCCTCGCTCGACGAGGAGGGG2883MetSerGlnSerProThrSerMetAla945950GACGGGGA GGGCAGGTGGCAGAGGTGGGAGGGGAGGAACTGATCTGATGGGAGCCGTGGG2943GCCGCAGCTGGAGAGGGGCAGCCACGGCCGGGGCTGTGCCTGACCGCGGAGGACGTTCCT3003GGGACTCGCCTCGGCCTGGTGACTTCCATCCCTCACCAACAGAAGCACACT TACCGATGT3063CACGGGGGACAGCGTATAAATAAGTATAAATATGTACAAATCATATATTTAAAAAAAAAA3123AAAAAAAAAG3133(2) INFORMATION FOR SEQ ID NO:2:(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 951 amino acids(B) TYPE: amino acid(D) TOPOLOGY: linear(ii) MOLECULE TYPE: protein(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:GluThrLeuMetAspThrArgThrAlaThrAlaGluLeuGlyTrpThr1510 15AlaAsnProProSerGlyTrpGluGluValSerGlyTyrAspGluAsn202530LeuAsnThrIleArgThrTyrGlnValCysAsnValPheGluPr oAsn354045GlnAsnAsnTrpLeuLeuThrThrPheIleAsnArgArgGlyAlaHis505560ArgIleTyrThrGlu MetArgPheThrValArgAspCysSerSerLeu65707580ProAsnValProGlySerCysLysGluThrPheAsnLeuTyrTyrTyr85 9095GluThrAspSerValIleAlaThrLysLysSerAlaPheTrpThrGlu100105110AlaProTyrLeuLysValAspThrI 
leAlaAlaAspGluSerPheSer115120125GlnValAspPheGlyGlyArgLeuMetLysGlyPhePheLysLysCys13013514 0ProSerValValGlnAsnPheAlaIlePheProGluThrMetThrGly145150155160AlaGluSerThrSerLeuValThrAlaArgGlyThrCysIleProAs n165170175AlaGluGluValAspValProIleLysLeuTyrCysAsnGlyAspGly180185190GluTrp MetValProIleGlyArgCysThrCysLysAlaGlyTyrGlu195200205ProGluAsnAsnValAlaCysArgAlaCysProAlaGlyThrPheLys210 215220AlaSerGlnGlyAlaGlyLeuCysAlaArgCysProProAsnSerArg225230235240SerSerAlaGluAlaSerProLeuCysA -aCysArgAsnGlyTyrPhe245250255ArgAlaAspLeuAspProProThrAlaAlaCysThrSerValProSer260265 270GlyProArgAsnValIleSerIleValAsnGluThrSerIleIleLeu275280285GluTrpAsnProProArgGluThrGlyGlyArgAspAspValThrTy r290295300AsnIleValCysLysLysCysArgAlaAspArgArgAlaCysSerArg305310315320CysAspAsp AsnValGluPheValProArgGlnLeuGlyLeuThrGlu325330335ThrArgValPheIleSerSerLeuTrpAlaHisThrProTyrThrPhe340 345350GluIleGlnAlaValAsnGlyValSerAsnLysSerProPheProPro355360365GlnHisValSerValAsnIleThrThrA snGlnAlaAlaProSerThr370375380ValProIleMetHisGlnValSerAlaThrMetArgSerIleThrLeu385390395 400SerTrpProGlnProGluGlnProAsnGlyIleIleLeuAspTyrGlu405410415LeuArgTyrTyrGluLysLeuSerArgIleCysThrProAspVa lSer420425430GlyThrValGlySerArgProAlaAlaAspHisAsnGluTyrAsnSer435440445SerValAla ArgSerGlnThrAsnThrAlaArgLeuGluGlyLeuArg450455460ProGlyMetValTyrValValGlnValArgAlaArgThrValAlaGly465470 475480TyrGlyLysTyrSerGlyLysMetCysPheGlnThrLeuThrAspAsp485490495AspTyrLysSerGluLeuArgGluG lnLeuProLeuIleAlaGlySer500505510AlaAlaAlaGlyValValPheIleValSerLeuValAlaIleSerIle515520 525ValCysSerArgLysArgAlaTyrSerLysGluValValTyrSerAsp530535540LysLeuGlnHisTyrSerThrGlyArgGlySerProGlyMetLysIle545 550555560TyrIleAspProPheThrTyrGluAspProAsnGluAlaValArgGlu565570575PheAla LysGluIleAspValSerPheValLysIleGluGluValIle580585590GlyAlaGlyGluPheGlyGluValTyrLysGlyArgLeuLysLeuPro595 600605GlyLysArgGluIleTyrValAlaIleLysThrLeuLysAlaGlyTyr610615620SerGluLysGlnArgArgAspPheLeuSerGluA laSerIleMetGly625630635640GlnPheAspHisProAsnIleIleArgLeuGluGlyValValThrLys645650 
655SerArgProValMetIleIleThrGluPheMetGluAsnGlyAlaLeu660665670AspSerPheLeuArgGlnAsnAspGlyGlnPheThrValIleGl nLeu675680685ValGlyMetLeuArgGlyIleAlaAlaGlyMetLysTyrLeuAlaGlu690695700MetAsnTyrValHis ArgAspLeuAlaAlaArgAsnIleLeuValAsn705710715720SerAsnLeuValCysLysValSerAspPheGlyLeuSerArgTyrLeu725 730735GlnAspAspThrSerAspProThrTyrThrSerSerLeuGlyGlyLys740745750IleProValArgTrpThrAlaProG luAlaIleAlaTyrArgLysPhe755760765ThrSerAlaSerAspValTrpSerTyrGlyIleValMetTrpGluVal77077578 0MetSerPheGlyGluArgProTyrTrpAspMetSerAsnGlnAspVal785790795800IleAsnAlaIleGluGlnAspTyrArgLeuProProProMetAspCy s805810815ProAlaAlaLeuHisGlnLeuMetLeuAspCysTrpGlnLysAspArg820825830AsnThr ArgProArgLeuAlaGluIleValAsnThrLeuAspLysMet835840845IleArgAsnProAlaSerLeuLysThrValAlaThrIleThrAlaVal850 855860ProSerGlnProLeuLeuAspArgSerIleProAspPheThrAlaPhe865870875880ThrSerValGluAspTrpLeuSerAlaV alLysMetSerGlnTyrArg885890895AspAsnPheLeuSerAlaGlyPheThrSerLeuGlnLeuValAlaGln900905 910MetThrSerGluAspLeuLeuArgIleGlyValThrLeuAlaGlyHis915920925GlnLysLysIleLeuAsnSerIleGlnSerMetArgValGlnMetSe r930935940GlnSerProThrSerMetAla945950(2) INFORMATION FOR SEQ ID NO:3:(i) SEQUENCE CHARACTERISTICS:(A) LENGTH: 3059 base pairs(B) TYPE: nucleic acid(C) STRANDEDNESS: both (D) TOPOLOGY: linear(ix) FEATURE:(A) NAME/KEY: CDS(B) LOCATION: 2..2167(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:CCTCAAATTCACCCTGAGGGACTGTAACAGCCTTCCAGGAGGACTT46LeuLysPheThrLeuArgAspCysAsnSerLeu ProGlyGlyLeu151015GGGACTTGCAAGGAGACTTTTAACATGTACTACTTTGAGTCAGATGAT94GlyThrCysLysGluThrPheAsnMetTyrT yrPheGluSerAspAsp202530GAAGATGGGAGGAACATCAGAGAGAATCAGTACATCAAGATAGATACC142GluAspGlyArgAsnIleArgGluAsnG lnTyrIleLysIleAspThr354045ATTGCTGCTGATGAGAGCTTCACGGAGTTGGACCTCGGCGACAGAGTT190IleAlaAlaAspGluSerPheThrGluL euAspLeuGlyAspArgVal505560ATGAAGTTAAACACAGAAGTGAGAGATGTTGGGCCTCTAACAAAAAAA238MetLysLeuAsnThrGluValArgAspValG lyProLeuThrLysLys657075GGATTTTACCTTGCTTTCCAGGATGTGGGCGCCTGCATTGCCCTGGTC286GlyPheTyrLeuAlaPheGlnAspValGlyAlaCysI 
leAlaLeuVal80859095TCTGTGCGTGTGTACTACAAGAAATGCCCATCAGTGATCCGCAACCTG334SerValArgValTyrTyrLysLysCysProS erValIleArgAsnLeu100105110GCACGCTTTCCAGATACCATCACAGGAGCAGATTCCTCGCAGCTGCTA382AlaArgPheProAspThrIleThrGlyA laAspSerSerGlnLeuLeu115120125GAAGTGTCAGGCGTCTGTGTCAACCACTCAGTGACTGATGAGGCACCA430GluValSerGlyValCysValAsnHisS erValThrAspGluAlaPro130135140AAGATGCACTGCAGTTCAGAGGGAGAATGGCTGGTGCCCATTGGGAAG478LysMetHisCysSerSerGluGlyGluTrpL euValProIleGlyLys145150155TGTTTGTGCAAGGCAGGGTACGAGGAGAAGAACAACACCTGCCAAGCA526CysLeuCysLysAlaGlyTyrGluGluLysAsnAsnT hrCysGlnAla160165170175CCTTCTCCAGTCAGTAGTGTGAAAAAAGGGAAGATAACTAAAAATAGC574ProSerProValSerSerValLysLysGlyL ysIleThrLysAsnSer180185190ATCTCCCTTTCCTGGCAGGAGCCAGATCGACCCAACGGCATCATCCTG622IleSerLeuSerTrpGlnGluProAspA rgProAsnGlyIleIleLeu195200205GAATACGAAATCAAATATTTTGAAAAGGACCAGGAGACAAGCTACACC670GluTyrGluIleLysTyrPheGluLysA spGlnGluThrSerTyrThr210215220ATCATCAAATCCAAAGAGACCGCAATTACGGCAGATGGCTTGAAACCA718IleIleLysSerLysGluThrAlaIleThrA laAspGlyLeuLysPro225230235GGCTCAGCGTACGTCTTCCAGATCCGAGCCCGGACAGCTGCTGGCTAC766GlySerAlaTyrValPheGlnIleArgAlaArgThrA laAlaGlyTyr240245250255GGTGGCTTCAGTCGAAGATTTGAGTTTGAAACCAGCCCAGTGTTAGCT814GlyGlyPheSerArgArgPheGluPheGluT hrSerProValLeuAla260265270GCATCCAGTGACCAGAGCCAGATTCCTATAATTGTTGTGTCTGTAACA862AlaSerSerAspGlnSerGlnIleProI leIleValValSerValThr275280285GTGGGAGTTATTCTGCTGGCTGTTGTTATCGGTTTCCTTCTCAGTGGA910ValGlyValIleLeuLeuAlaValValI leGlyPheLeuLeuSerGly290295300AGGCGCTGTGGCTACAGCAAGGCTAAACAAGACCCAGAAGAAGAAAAG958ArgArgCysGlyTyrSerLysAlaLysGlnA spProGluGluGluLys305310315ATGCATTTTCATAATGGCCACATTAAACTGCCTGGTGTAAGAACCTAC1006MetHisPheHisAsnGlyHisIleLysLeuProGlyV alArgThrTyr320325330335ATTGATCCCCACACCTATGAGGACCCTAATCAAGCTGTCCACGAGTTT1054IleAspProHisThrTyrGluAspProAsnG lnAlaValHisGluPhe340345350GCCAAGGAAATAGAAGCTTCGTGCATAACCATCGAGAGAGTTATCGGA1102AlaLysGluIleGluAlaSerCysIleT hrIleGluArgValIleGly355360365GCTGGTGAATTTGGAGAAGTCTGCAGTGGACGGCTGAAACTGCAGGGA1150AlaGlyGluPheGlyGluValCysSerG 
lyArgLeuLysLeuGlnGly370375380AAACGCGAGTTTCCAGTGGCTATCAAAACCCTGAAGGTGGGCTACACA1198LysArgGluPheProValAlaIleLysThrL euLysValGlyTyrThr385390395GAGAAGCAAAGGCGAGATTTCCTGGGAGAAGCGAGCATCATGGGGCAG1246GluLysGlnArgArgAspPheLeuGlyGluAlaSerI leMetGlyGln400405410415TTCGACCACCCCAACATCATCCACCTGGAAGGTGTCGTCACAAAAAGC1294PheAspHisProAsnIleIleHisLeuGluG lyValValThrLysSer420425430AAACCTGTAATGATAGTAACGGAATACATGGAAAATGGTTCTCTGGAT1342LysProValMetIleValThrGluTyrM etGluAsnGlySerLeuAsp435440445ACATTTTTAAAGAAGAACGATGGGCAGTTCACGGTCATTCAGCTGGTC1390ThrPheLeuLysLysAsnAspGlyGlnP heThrValIleGlnLeuVal450455460GGGATGCTGCGAGGCATCGCATCAGGGATGAAGTACCTGTCTGACATG1438GlyMetLeuArgGlyIleAlaSerGlyMetL ysTyrLeuSerAspMet465470475GGTTACGTACACAGAGACCTCGCTGCCAGGAATATCCTCATCAACAGC1486GlyTyrValHisArgAspLeuAlaAlaArgAsnIleL euIleAsnSer480485490495AACTTAGTCTGCAAGGTGTCTGACTTTGGCCTCTCCAGAGTCCTAGAA1534AsnLeuValCysLysValSerAspPheGlyL euSerArgValLeuGlu500505510GATGATCCTGAAGCAGCGTACACAACCAGGGGAGGGAAGATCCCCATC1582AspAspProGluAlaAlaTyrThrThrA rgGlyGlyLysIleProIle515520525CGATGGACGGCACCTGAAGCAATCGCCTTCCGCAAATTCACGTCGGCC1630ArgTrpThrAlaProGluAlaIleAlaP heArgLysPheThrSerAla530535540AGCGATGTGTGGAGCTACGGCATTGTGATGTGGGAAGTGATGTCCTAT1678SerAspValTrpSerTyrGlyIleValMetT rpGluValMetSerTyr545550555GGCGAGAGACCTTACTGGGAAATGACAAACCAAGATGTGATTAAAGCC1726GlyGluArgProTyrTrpGluMetThrAsnGlnAspV alIleLysAla560565570575GTGGAGGAAGGCTATCGCCTGCCAAGTCCCATGGACTGCCCTGCTGCT1774ValGluGluGlyTyrArgLeuProSerProM etAspCysProAlaAla580585590CTCTACCAGTTGATGCTTGACTGCTGGCAGAAAGACCGCAACAGCAGG1822LeuTyrGlnLeuMetLeuAspCysTrpG lnLysAspArgAsnSerArg595600605CCCAAGTTTGATGAAATTGTCAGCATGTTGGACAAGCTCATCCGTAAC1870ProLysPheAspGluIleValSerMetL euAspLysLeuIleArgAsn610615620CCAAGCAGCTTGAAGACGTTGGTTAATGCATCGAGCAGAGTATCAAAT1918ProSerSerLeuLysThrLeuValAsnAlaS erSerArgValSerAsn625630635TTGTTGGTAGAACACAGTCCAGTGGGGAGCGGTGCCTACAGGTCAGTG1966LeuLeuValGluHisSerProValGlySerGlyAlaT yrArgSerVal640645650655GGTGAGTGGCTGGAAGCCATCAAAATGGGTCGATACACCGAGATTTTC2014GlyGluTrpLeuGluAlaIleLysMetGlyA 
rgTyrThrGluIlePhe660665670ATGGAGAATGGATACAGTTCGATGGATTCTGTGGCTCAGGTGACCCTA2062MetGluAsnGlyTyrSerSerMetAspS erValAlaGlnValThrLeu675680685GAGGATTTGAGGCGGCTGGGAGTGACACTTGTTGGTCACCAGAAGAAG2110GluAspLeuArgArgLeuGlyValThrL euValGlyHisGlnLysLys690695700ATAATGAACAGCCTTCAAGAGATGAAGGTCCAGTTGGTGAATGGGATG2158IleMetAsnSerLeuGlnGluMetLysValG lnLeuValAsnGlyMet705710715GTGCCATTGTAACTCGGTTTTTAAGTCACTTCCTCGAGTGGTCGGTCCT2207ValProLeu720GCACTTTGTATACTAGCTCTGAGATT TATTTTGACTAAAGAAGAAAAAAGGGAAATTCAG2267TGGTTTCTGTAACTGAAGGACGCTGGCTTCTGCCACAGCATTTATAAAGCAGTGTTTGAC2327TGAAGTTTTCATTTTCTTCCTATTTGTGTCCTCATTCTCATGAAGTAAATGTAACATGCA2387 TGGAACATGGAAATGGATCTACTGTACATGAGGTTACCCAATTTCTTGCGCTTCAGCATG2447ACAACAGCAAGCCTTCCCACCACATGTTGTCTATACATGGGAGATATATATATATGCATA2507TATATATATAGCACCTTTATATACTGAATTACAGCAGCAGCACA TGTTAATACTTCCAAG2567GACTTACTTGACTAGAGAAGTTTTGCAGCCATTGTGGGCTCACACAAGCTGCGGTTTACT2627GAAGTTTACTTCAAGTCTTACTTGTCTACAGAAGTGTATTGAAGAGCAATATGATTAGAT2687TATTTCTGGATAGATATTT TGTTTTGTAAATTTAAAAAATCGTGTTACACAGCGTTAAGT2747TATAGAGACTAGTGTATAAACATGTTGCTTGCTCAATGGCAAATACAATACAGGGTGTAT2807ATTTTTTTCTCTCTGTGTTGCAAAGTTCTTTTAGTTTGCTCTTCTGTGAGGATAATACGT 2867TATGATGTATATACTGTACAGTTTGCTACACATCAGGTACAAGATTGGGGCTTTCTCAAT2927GTTTTGTTCTTTTTCCCTCTTTTGTTTCATTTTGTCTTCCTTTTGTGTTAACCACTATGC2987TTTGTATTTTTGCTGCTGTTTGGTTTGAGGCAACATA TAAAGCTTTCAGGTGTTTTGATT3047ATAAAAAAAAAG3059(2) INFORMATION FOR SEQ ID NO:4:(i) SEQUENCE CHARACTERISTICS:(A) LENGTH: 722 amino acids(B) TYPE: amino acid(D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:LeuLysPheThrLeuArgAspCysAsnSerLeuProGlyGlyLeuGly151015ThrCysLysGluThrPheAsnMetTyrTyrPhe GluSerAspAspGlu202530AspGlyArgAsnIleArgGluAsnGlnTyrIleLysIleAspThrIle354045AlaAlaAspGluSerPheThrGluLeuAspLeuGlyAspArgValMet505560LysLeuAsnThrGluValArgAspValGlyProLeuThrLysLysGly65 707580PheTyrLeuAlaPheGlnAspValGlyAlaCysIleAlaLeuValSer859095ValArgValTyrTy rLysLysCysProSerValIleArgAsnLeuAla100105110ArgPheProAspThrIleThrGlyAlaAspSerSerGlnLeuLeuGlu115 
120125ValSerGlyValCysValAsnHisSerValThrAspGluAlaProLys130135140MetHisCysSerSerGluGlyGluTrpLeuValProIleGly LysCys145150155160LeuCysLysAlaGlyTyrGluGluLysAsnAsnThrCysGlnAlaPro165170 175SerProValSerSerValLysLysGlyLysIleThrLysAsnSerIle180185190SerLeuSerTrpGlnGluProAspArgProAsnGlyIleIleLeuGlu 195200205TyrGluIleLysTyrPheGluLysAspGlnGluThrSerTyrThrIle210215220IleLysSerLysGluThrAlaIl eThrAlaAspGlyLeuLysProGly225230235240SerAlaTyrValPheGlnIleArgAlaArgThrAlaAlaGlyTyrGly245 250255GlyPheSerArgArgPheGluPheGluThrSerProValLeuAlaAla260265270SerSerAspGlnSerGlnIleProIleIleVal ValSerValThrVal275280285GlyValIleLeuLeuAlaValValIleGlyPheLeuLeuSerGlyArg290295300ArgC ysGlyTyrSerLysAlaLysGlnAspProGluGluGluLysMet305310315320HisPheHisAsnGlyHisIleLysLeuProGlyValArgThrTyrIle 325330335AspProHisThrTyrGluAspProAsnGlnAlaValHisGluPheAla340345350LysGluIleGluAl aSerCysIleThrIleGluArgValIleGlyAla355360365GlyGluPheGlyGluValCysSerGlyArgLeuLysLeuGlnGlyLys370375 380ArgGluPheProValAlaIleLysThrLeuLysValGlyTyrThrGlu385390395400LysGlnArgArgAspPheLeuGlyGluAlaSerIle MetGlyGlnPhe405410415AspHisProAsnIleIleHisLeuGluGlyValValThrLysSerLys420425 430ProValMetIleValThrGluTyrMetGluAsnGlySerLeuAspThr435440445PheLeuLysLysAsnAspGlyGlnPheThrValIleGlnLeuValGly45 0455460MetLeuArgGlyIleAlaSerGlyMetLysTyrLeuSerAspMetGly465470475480TyrValHisArgAspLe uAlaAlaArgAsnIleLeuIleAsnSerAsn485490495LeuValCysLysValSerAspPheGlyLeuSerArgValLeuGluAsp500 505510AspProGluAlaAlaTyrThrThrArgGlyGlyLysIleProIleArg515520525TrpThrAlaProGluAlaIleAlaPheArgLysPhe ThrSerAlaSer530535540AspValTrpSerTyrGlyIleValMetTrpGluValMetSerTyrGly545550555560GluArgProTyrTrpGluMetThrAsnGlnAspValIleLysAlaVal565570575GluGluGlyTyrArgLeuProSerProMetAspCysProAlaAlaLeu 580585590TyrGlnLeuMetLeuAspCysTrpGlnLysAspArgAsnSerArgPro595600605LysPheAspGluIleVa lSerMetLeuAspLysLeuIleArgAsnPro610615620SerSerLeuLysThrLeuValAsnAlaSerSerArgValSerAsnLeu625630 635640LeuValGluHisSerProValGlySerGlyAlaTyrArgSerValGly645650655GluTrpLeuGluAlaIleLysMetGlyArgTyr 
ThrGluIlePheMet660665670GluAsnGlyTyrSerSerMetAspSerValAlaGlnValThrLeuGlu675680685AspLeuArgArgLeuGlyValThrLeuValGlyHisGlnLysLysIle690695700MetAsnSerLeuGlnGluMetLysValGlnLeuValAsnGlyMetVal705 710715720ProLeu(2) INFORMATION FOR SEQ ID NO:5:(i) SEQUENCE CHARACTERISTICS:(A) LENGTH: 2820 base pairs(B) TYPE: nucleic acid(C) STRANDEDNESS: both(D) TOPOLOGY: linear(ix) FEATURE:(A) NAME/KEY: CDS (B) LOCATION: 2..2548(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:CGGAGAGAGCCAGTTTGCCAAGATTGACACCATTGCTGCTGATGAG46GlyGluSerGlnPheAlaLysIleAspThrIleAlaAlaAspGlu15 1015AGCTTCACCCAGGTGGACATTGGTGACAGGATCATGAAGCTGAATACA94SerPheThrGlnValAspIleGlyAspArgIleMetLysLeuAsnThr20 2530GAGGTGCGGGACGTGGGGCCTCTCAGCAAGAAAGGGTTTTACTTGGCT142GluValArgAspValGlyProLeuSerLysLysGlyPheTyrLeuAla35 4045TTCCAGGACGTCGGTGCCTGCATTGCTTTGGTGTCTGTTCGTGTCTTC190PheGlnAspValGlyAlaCysIleAlaLeuValSerValArgValPhe50 5560TATAAGAAGTGCCCACTGACAGTTCGAAACCTGGCACAGTTTCCAGAC238TyrLysLysCysProLeuThrValArgAsnLeuAlaGlnPheProAsp65 7075ACCATTACTGGGGCTGATACATCCTCTCTGGTGGAGGTTCGTGGCTCC286ThrIleThrGlyAlaAspThrSerSerLeuValGluValArgGlySer8085 9095TGTGTCAACAACTCGGAAGAGAAGGACGTGCCAAAAATGTACTGCGGG334CysValAsnAsnSerGluGluLysAspValProLysMetTyrCysGly100 105110GCAGATGGTGAATGGCTGGTACCCATTGGCAACTGTCTGTGCAATGCT382AlaAspGlyGluTrpLeuValProIleGlyAsnCysLeuCysAsnAla115 120125GGCTATGAAGAACGCAATGGTGAATGCCAAGCTTGCAAAATCGGATAC430GlyTyrGluGluArgAsnGlyGluCysGlnAlaCysLysIleGlyTyr130 135140TACAAGGCGCTCTCAACAGATGTTGCATGTGCCAAATGCCCGCCTCAC478TyrLysAlaLeuSerThrAspValAlaCysAlaLysCysProProHis145 150155AGCTACTCCATCTGGGAAGGCTCTACCTCCTGCACCTGTGATCGGGGC526SerTyrSerIleTrpGluGlySerThrSerCysThrCysAspArgGly160165 170175TTCTTCCGAGCAGAAAATGATGCTGCATCCATGCCCTGCACTCGCCCT574PhePheArgAlaGluAsnAspAlaAlaSerMetProCysThrArgPro180 185190CCATCCGCACCCCAGAACCTGATTTCCAACGTCAACGAGACGTCAGTG622ProSerAlaProGlnAsnLeuIleSerAsnValAsnGluThrSerVal195 200205AACTTGGAGTGGAGCGCCCCACAGAACAAGGGAGGACGGGACGACATC670AsnLeuGluTrpSerAlaProGlnAsnLysGlyGlyArgAspAspIle210 
215220TCCTACAACGTGGTGTGCAAGCGCTGCGGGGCAGGGGAGCCCAGCCAC718SerTyrAsnValValCysLysArgCysGlyAlaGlyGluProSerHis225 230235TGCCGGTCCTGTGGCAGTGGTGTACATTTCAGCCCCCAGCAGAACGGG766CysArgSerCysGlySerGlyValHisPheSerProGlnGlnAsnGly240245 250255CTGAAAACCACGAAGGTTTCCATCACTGACCTCCTGGCACACACCAAC814LeuLysThrThrLysValSerIleThrAspLeuLeuAlaHisThrAsn260 265270TACACCTTTGAGGTCTGGGCAGTGAATGGAGTGTCCAAGCACAACCCC862TyrThrPheGluValTrpAlaValAsnGlyValSerLysHisAsnPro275 280285AGCCAGGACCAAGCTGTGTCGGTCACTGTGACAACTAACCAAGCAGCT910SerGlnAspGlnAlaValSerValThrValThrThrAsnGlnAlaAla290 295300CCATCCCCAATTGCATTGATCCAGGCTAAAGAGATAACGAGGCACAGC958ProSerProIleAlaLeuIleGlnAlaLysGluIleThrArgHisSer305 310315GTTGCCTTGGCCTGGCTGGAACCTGACAGGCCCAATGGAGTCATCCTG1006ValAlaLeuAlaTrpLeuGluProAspArgProAsnGlyValIleLeu320325 330335GAGTACGAAGTCAAGTACTACGAAAAGGACCAAAACGAGCGCACGTAT1054GluTyrGluValLysTyrTyrGluLysAspGlnAsnGluArgThrTyr340 345350CGCATTGTGAAGACAGCCTCCAGGAATACTGACATCAAAGGTTTGAAC1102ArgIleValLysThrAlaSerArgAsnThrAspIleLysGlyLeuAsn355 360365CCCCTGACTTCATATGTATTTCATGTGCGGGCCAGGACAGCAGCAGGA1150ProLeuThrSerTyrValPheHisValArgAlaArgThrAlaAlaGly370 375380TACGGAGACTTCAGTGGGCCGTTTGAGTTCACAACTAACACAGTTCCT1198TyrGlyAspPheSerGlyProPheGluPheThrThrAsnThrValPro385 390395TCCCCCATCATTGGCGATGGTACCAATCCCACAGTGCTGCTTGTTTCA1246SerProIleIleGlyAspGlyThrAsnProThrValLeuLeuValSer400405 410415GTGGCTGGCAGTGTTGTTCTTGTGGTCATTCTCATTGCAGCCTTTGTC1294ValAlaGlySerValValLeuValValIleLeuIleAlaAlaPheVal420 425430ATCAGCAGGAGGCGCAGCAAATACAGTAAAGCTAAGCAAGAGGCAGAT1342IleSerArgArgArgSerLysTyrSerLysAlaLysGlnGluAlaAsp435 440445GAGGAGAAACATTTGAACCAAGGTGTCAGAACATATGTGGATCCTTTT1390GluGluLysHisLeuAsnGlnGlyValArgThrTyrValAspProPhe450 455460ACATATGAGGATCCAAATCAAGCTGTGAGGGAATTTGCCAAAGAAATT1438ThrTyrGluAspProAsnGlnAlaValArgGluPheAlaLysGluIle465 470475GATGCCTCCTGCATAAAGATTGAGAAAGTTATTGGTGTGGGGGAATTT1486AspAlaSerCysIleLysIleGluLysValIleGlyValGlyGluPhe480485 490495GGTGAAGTATGCAGTGGACGTCTCAAAGTTCCAGGAAAAAGAGAAATC1534GlyGluValCysSerGlyArgLeuLysValProGlyLysArgGluIle500 
505510TGTGTGGCTATCAAGACTCTGAAAGCTGGTTACACTGACAAACAACGG1582CysValAlaIleLysThrLeuLysAlaGlyTyrThrAspLysGlnArg515 520525AGAGACTTCCTGAGTGAGGCCAGCATCATGGGACAATTTGACCACCCC1630ArgAspPheLeuSerGluAlaSerIleMetGlyGlnPheAspHisPro530 535540AATATCATCCACTTGGAAGGCGTTGTTACTAAATGTAAACCAGTAATG1678AsnIleIleHisLeuGluGlyValValThrLysCysLysProValMet545 550555ATCATAACTGAGTACATGGAGAATGGCTCCTTGGATGCCTTCCTCCGG1726IleIleThrGluTyrMetGluAsnGlySerLeuAspAlaPheLeuArg560565 570575AAGAATGATGGCAGATTTACAGTAATCCAGTTGGTGGGGATGCTTCGT1774LysAsnAspGlyArgPheThrValIleGlnLeuValGlyMetLeuArg580 585590GGCATCGGCTCAGGAATGAAGTATCTGTCTGACATGAGCTATGTGCAT1822GlyIleGlySerGlyMetLysTyrLeuSerAspMetSerTyrValHis595 600605CGGGATCTAGCTGCTCGAAACATACTGGTCAACAGCAACTTGGTCTGC1870ArgAspLeuAlaAlaArgAsnIleLeuValAsnSerAsnLeuValCys610 615620AAAGTGTCTGACTTTGGCATGTCCCGTGTCCTGGAAGATGACCCTGAG1918LysValSerAspPheGlyMetSerArgValLeuGluAspAspProGlu625 630635GCAGCTTATACCACACGGGGTGGCAAGATCCCTATCCGATGGACTGCA1966AlaAlaTyrThrThrArgGlyGlyLysIleProIleArgTrpThrAla640645 650655CCAGAGGCAATTGCCTACCGTAAATTTACATCGGCTAGTGACGTGTGG2014ProGluAlaIleAlaTyrArgLysPheThrSerAlaSerAspValTrp660 665670AGCTATGGCATCGTCATGTGGGAAGTGATGTCCTATGGAGAGAGACCT2062SerTyrGlyIleValMetTrpGluValMetSerTyrGlyGluArgPro675 680685TACTGGGATATGTCCAATCAAGACGTTATTAAAGCCATTGAGGAAGGG2110TyrTrpAspMetSerAsnGlnAspValIleLysAlaIleGluGluGly690 695700TATCGGTTGCCACCCCCAATGGACTGCCCCATTGCTCTCCATCAGCTG2158TyrArgLeuProProProMetAspCysProIleAlaLeuHisGlnLeu705 710715ATGTTAGACTGCTGGCAGAAGGAACGCAGCGACAGACCTAAATTTGGA2206MetLeuAspCysTrpGlnLysGluArgSerAspArgProLysPheGly720725 730735CAGATTGTCAACATGCTGGACAAACTCATCCGCAACCCTAACAGCCTG2254GlnIleValAsnMetLeuAspLysLeuIleArgAsnProAsnSerLeu740 745750AAGAGGACAGGCAGCGAGAGCTCCAGACCCAGCACAGCCCTGCTGGAT2302LysArgThrGlySerGluSerSerArgProSerThrAlaLeuLeuAsp755 760765CCCAGCTCCCCGGAGTTCTCGGCGGTTGTTTCTGTCAGTGACTGGCTC2350ProSerSerProGluPheSerAlaValValSerValSerAspTrpLeu770 775780CAAGCCATTAAAATGGAGCGATACAAGGATAACTTCACAGCTGCTGGC2398GlnAlaIleLysMetGluArgTyrLysAspAsnPheThrAlaAlaGly785 
790795TATACCACCCTAGAGGCTGTGGTGCATATGAACCAGGACGACCTGGCC2446TyrThrThrLeuGluAlaValValHisMetAsnGlnAspAspLeuAla800805 810815AGGATCGGGATCACTGCCATCACACACCAGAACAAGATCTTGAGCAGC2494ArgIleGlyIleThrAlaIleThrHisGlnAsnLysIleLeuSerSer820 825830GTTCAAGCCATGCGCAGCCAAATGCAACAGATGCACGGCAGGATGGTG2542ValGlnAlaMetArgSerGlnMetGlnGlnMetHisGlyArgMetVal835 840845CCCGTCTGAGCCAGTACTGAATAAACTCAAAACTCTTGAAATTAGTTTACCTCATC2598ProValCATGCACTTTAATTGAAGAACTGCACTTTTTTTACTTCGTCTCCTCGCCCGTTGAAATAA2658AGATCTGCAGCATTGCTTGATGTACAGATTGTGGAAACCGAGCGTGTGTTGGGAGGGGGG2718CCTCCAGAAATGACAAGCCGTCATTTTAAACCAGACCTGGAACAAATTGTTTCTTGGAAC2778ATACTTCTCTGTTGATCAACGATATGTAAAATACATGTATCC 2820(2) INFORMATION FOR SEQ ID NO:6:(i) SEQUENCE CHARACTERISTICS:(A) LENGTH: 849 amino acids(B) TYPE: amino acid(D) TOPOLOGY: linear(ii) MOLECULE TYPE: protein(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:GlyGluSerGlnPheAlaLysIleAspThrIleAlaAlaAs pGluSer151015PheThrGlnValAspIleGlyAspArgIleMetLysLeuAsnThrGlu202530 ValArgAspValGlyProLeuSerLysLysGlyPheTyrLeuAlaPhe354045GlnAspValGlyAlaCysIleAlaLeuValSerValArgValPheTyr50 5560LysLysCysProLeuThrValArgAsnLeuAlaGlnPheProAspThr65707580IleThrGlyAlaAspThrSerS erLeuValGluValArgGlySerCys859095ValAsnAsnSerGluGluLysAspValProLysMetTyrCysGlyAla10010 5110AspGlyGluTrpLeuValProIleGlyAsnCysLeuCysAsnAlaGly115120125TyrGluGluArgAsnGlyGluCysGlnAlaCysLysIleGl yTyrTyr130135140LysAlaLeuSerThrAspValAlaCysAlaLysCysProProHisSer145150155160Tyr SerIleTrpGluGlySerThrSerCysThrCysAspArgGlyPhe165170175PheArgAlaGluAsnAspAlaAlaSerMetProCysThrArgProPro 180185190SerAlaProGlnAsnLeuIleSerAsnValAsnGluThrSerValAsn195200205LeuGluTrpSerAlaProGlnA snLysGlyGlyArgAspAspIleSer210215220TyrAsnValValCysLysArgCysGlyAlaGlyGluProSerHisCys225230235 240ArgSerCysGlySerGlyValHisPheSerProGlnGlnAsnGlyLeu245250255LysThrThrLysValSerIleThrAspLeuLeuAlaHi sThrAsnTyr260265270ThrPheGluValTrpAlaValAsnGlyValSerLysHisAsnProSer275280285Gln AspGlnAlaValSerValThrValThrThrAsnGlnAlaAlaPro290295300SerProIleAlaLeuIleGlnAlaLysGluIleThrArgHisSerVal3053 
10315320AlaLeuAlaTrpLeuGluProAspArgProAsnGlyValIleLeuGlu325330335TyrGluValLysTyrTyrG luLysAspGlnAsnGluArgThrTyrArg340345350IleValLysThrAlaSerArgAsnThrAspIleLysGlyLeuAsnPro355360 365LeuThrSerTyrValPheHisValArgAlaArgThrAlaAlaGlyTyr370375380GlyAspPheSerGlyProPheGluPheThrThrAsnThrValProSe r385390395400ProIleIleGlyAspGlyThrAsnProThrValLeuLeuValSerVal405410415 AlaGlySerValValLeuValValIleLeuIleAlaAlaPheValIle420425430SerArgArgArgSerLysTyrSerLysAlaLysGlnGluAlaAspGlu4 35440445GluLysHisLeuAsnGlnGlyValArgThrTyrValAspProPheThr450455460TyrGluAspProAsnGlnAlaValArgG luPheAlaLysGluIleAsp465470475480AlaSerCysIleLysIleGluLysValIleGlyValGlyGluPheGly48549 0495GluValCysSerGlyArgLeuLysValProGlyLysArgGluIleCys500505510ValAlaIleLysThrLeuLysAlaGlyTyrThrAspLy sGlnArgArg515520525AspPheLeuSerGluAlaSerIleMetGlyGlnPheAspHisProAsn530535540IleIleHis LeuGluGlyValValThrLysCysLysProValMetIle545550555560IleThrGluTyrMetGluAsnGlySerLeuAspAlaPheLeuArgLys 565570575AsnAspGlyArgPheThrValIleGlnLeuValGlyMetLeuArgGly580585590IleGlySerGlyMetLysT yrLeuSerAspMetSerTyrValHisArg595600605AspLeuAlaAlaArgAsnIleLeuValAsnSerAsnLeuValCysLys610615 620ValSerAspPheGlyMetSerArgValLeuGluAspAspProGluAla625630635640AlaTyrThrThrArgGlyGlyLysIleProIleArgTrpTh rAlaPro645650655GluAlaIleAlaTyrArgLysPheThrSerAlaSerAspValTrpSer660665670 TyrGlyIleValMetTrpGluValMetSerTyrGlyGluArgProTyr675680685TrpAspMetSerAsnGlnAspValIleLysAlaIleGluGluGlyTyr690 695700ArgLeuProProProMetAspCysProIleAlaLeuHisGlnLeuMet705710715720LeuAspCysTrpGlnLysGluA rgSerAspArgProLysPheGlyGln725730735IleValAsnMetLeuAspLysLeuIleArgAsnProAsnSerLeuLys74074 5750ArgThrGlySerGluSerSerArgProSerThrAlaLeuLeuAspPro755760765SerSerProGluPheSerAlaValValSerValSerAspTr pLeuGln770775780AlaIleLysMetGluArgTyrLysAspAsnPheThrAlaAlaGlyTyr785790795800Thr ThrLeuGluAlaValValHisMetAsnGlnAspAspLeuAlaArg805810815IleGlyIleThrAlaIleThrHisGlnAsnLysIleLeuSerSerVal 820825830GlnAlaMetArgSerGlnMetGlnGlnMetHisGlyArgMetValPro835840845Val(2) INFORMATION FOR SEQ ID NO:7:(i) SEQUENCE CHARACTERISTICS:(A) 
LENGTH: 3776 base pairs(B) TYPE: nucleic acid(C) STRANDEDNESS: both(D) TOPOLOGY: linear(ix) FEATURE:(A) NAME/KEY: CDS(B) LOCATION: 290..3208(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:CGGCTCTGACTTTGTGTTAACGGTTTATGGACTGGTTCCAAAGAGCTCAA AGGTACCAAA60ACACTCCAAGCAACCTCTGAACCATTCAAGCAAGTAGTGTGTGTTTATTGGATATGGTGG120AGTCTACAGAGAATCTTCATGGATTCTAATGCTGACATCAGTGCAAGAAGAGTGTCAGGA180ATGGATTGGCTCTGGCTGGTTTGC TTCTTTCATCTAGTCACTTCACTAGAAGACCTGCAT240CCTGACCAACCGGAAAGGTGAGCAGGATGAGGCCATTGGTGGTGCTGTCATGACT295MetThr 1GAAATACTTCTGGATACAACTGGAGAAACCTCAGAGATTGGCTGGACC343GluIleLeuLeuAspThrThrGlyGluThrSerGluIleGlyTrpThr 51015TCTCACCCTCCTGATGGGTGGGAAGAAGTAAGTGTCCGGGATGATAAG391SerHisProProAspGlyTrpGluGluValSerValArgAspAspLys20 2530GAGCGCCAGATCCGAACCTTTCAAGTTTGTAACATGGATGAACCAGGT439GluArgGlnIleArgThrPheGlnValCysAsnMetAspGluProGly35 404550CAGAATAACTGGTTGCGTACTCACTTCATAGAGCGACGTGGAGCCCAC487GlnAsnAsnTrpLeuArgThrHisPheIleGluArgArgGlyAlaHis 556065CGAGTCCATGTCCGCCTTCATTTCTCAGTGAGGGACTGTGCCAGCATG535ArgValHisValArgLeuHisPheSerValArgAspCysAlaSerMet 707580CGTACTGTGGCCTCTACTTGCAAAGAGACTTTCACACTCTACTACCAC583ArgThrValAlaSerThrCysLysGluThrPheThrLeuTyrTyrHis 859095CAGTCAGATGTCGACATAGCCTCTCAGGAACTGCCAGAGTGGCATGAA631GlnSerAspValAspIleAlaSerGlnGluLeuProGluTrpHisGlu100 105110GGCCCCTGGACCAAGGTGGATACTATTGCAGCTGATGAAAGCTTTTCC679GlyProTrpThrLysValAspThrIleAlaAlaAspGluSerPheSer1151 20125130CAGGTGGACAGAACTGGGAAGGTGGTAAGGATGAATGTTAAAGTACGC727GlnValAspArgThrGlyLysValValArgMetAsnValLysValArg 135140145AGCTTTGGGCCACTCACAAAGCATGGCTTCTACCTGGCCTTCCAGGAC775SerPheGlyProLeuThrLysHisGlyPheTyrLeuAlaPheGlnAsp 150155160TCAGGAGCCTGTATGTCCCTGGTGGCAGTCCAAGTCTTTTTCTACAAG823SerGlyAlaCysMetSerLeuValAlaValGlnValPhePheTyrLys1 65170175TGTCCAGCTGTGGTGAAAGGATTTGCCTCCTTCCCTGAAACTTTTGCT871CysProAlaValValLysGlyPheAlaSerPheProGluThrPheAla180 185190GGAGGAGAGAGGACCTCACTGGTGGAGTCACTAGGGACGTGTGTAGCA919GlyGlyGluArgThrSerLeuValGluSerLeuGlyThrCysValAla1952 
00205210AATGCTGAAGAGGCAAGCACAACTGGGTCATCAGGTGTTCGGTTGCAC967AsnAlaGluGluAlaSerThrThrGlySerSerGlyValArgLeuHis 215220225TGCAATGGAGAAGGAGAGTGGATGGTGGCCACTGGACGATGCTCTTGC1015CysAsnGlyGluGlyGluTrpMetValAlaThrGlyArgCysSerCys 230235240AAGGCTGGTTACCAATCTGTTGACAATGAGCAAGCTTGTCAAGCTTGT1063LysAlaGlyTyrGlnSerValAspAsnGluGlnAlaCysGlnAlaCys2 45250255CCCATTGGTTCCTTTAAAGCATCTGTGGGAGATGACCCTTGCCTTCTC1111ProIleGlySerPheLysAlaSerValGlyAspAspProCysLeuLeu260 265270TGCCCTGCCCACAGCCATGCTCCACTGCCACTGCCAGGTTCCATTGAA1159CysProAlaHisSerHisAlaProLeuProLeuProGlySerIleGlu2752 80285290TGTGTGTGTCAGAGTCACTACTACCGATCTGCTTCTGACAATTCTGAT1207CysValCysGlnSerHisTyrTyrArgSerAlaSerAspAsnSerAsp 295300305GCTCCCTGCACTGGCATCCCCTCTGCTCCCCGTGACCTCAGTTATGAA1255AlaProCysThrGlyIleProSerAlaProArgAspLeuSerTyrGlu 310315320ATTGTTGGCTCCAACGTGCTCCTGACCTGGCGCCTCCCCAAGGACTTG1303IleValGlySerAsnValLeuLeuThrTrpArgLeuProLysAspLeu3 25330335GGTGGCCGCAAGGATGTCTTCTTCAATGTCATCTGCAAGGAATGCCCA1351GlyGlyArgLysAspValPhePheAsnValIleCysLysGluCysPro340 345350ACAAGGTCAGCAGGGACATGTGTGCGCTGTGGGGACAATGTACAGTTT1399ThrArgSerAlaGlyThrCysValArgCysGlyAspAsnValGlnPhe3553 60365370GAACCACGCCAAGTGGGCCTGACAGAAAGTCGTGTTCAAGTCTCCAAC1447GluProArgGlnValGlyLeuThrGluSerArgValGlnValSerAsn 375380385CTATTGGCCCGTGTGCAGTACACTTTTGAGATCCAGGCTGTCAATTTG1495LeuLeuAlaArgValGlnTyrThrPheGluIleGlnAlaValAsnLeu 390395400GTGACTGAGTTGAGTTCAGAAGCACCCCAGTATGCTACCATCAACGTT1543ValThrGluLeuSerSerGluAlaProGlnTyrAlaThrIleAsnVal4 05410415AGCACCAGCCAGTCAGTGCCCTCCGCAATCCCTATGATGCATCAGGTG1591SerThrSerGlnSerValProSerAlaIleProMetMetHisGlnVal420 425430AGTCGTGCTACCAGTAGCATCACACTGTCTTGGCCTCAGCCAGACCAG1639SerArgAlaThrSerSerIleThrLeuSerTrpProGlnProAspGln4354 40445450CCCAATGGGGTTATCCTGGATTACCAGCTACGGTACTTTGACAAGGCA1687ProAsnGlyValIleLeuAspTyrGlnLeuArgTyrPheAspLysAla 455460465GAAGATGAGGATAATTCATTTACTTTGACTAGTGAAACTAACATGGCC1735GluAspGluAspAsnSerPheThrLeuThrSerGluThrAsnMetAla 470475480ACTATATTAAATCTGAGTCCAGGCAAGATCTATGTCTTCCAAGTACGA1783ThrIleLeuAsnLeuSerProGlyLysIleTyrValPheGlnValArg4 
85490495GCTAGAACAGCAGTGGGTTATGGCCCATACAGTGGAAAGATGTATTTC1831AlaArgThrAlaValGlyTyrGlyProTyrSerGlyLysMetTyrPhe500 505510CAGACTTTAATGGCAGGAGAGCACTCGGAGATGGCACAGGACCGACTG1879GlnThrLeuMetAlaGlyGluHisSerGluMetAlaGlnAspArgLeu5155 20525530CCACTTATTGTGGGCTCAGCACTTGGTGGTCTGGCATTCTTGGTAATT1927ProLeuIleValGlySerAlaLeuGlyGlyLeuAlaPheLeuValIle 535540545GCTGCCATTGCCATTCTTGCCATCATCTTCAAGAGTAAAAGGCGAGAG1975AlaAlaIleAlaIleLeuAlaIleIlePheLysSerLysArgArgGlu 550555560ACTCCATACACAGACCGCCTGCAGCAGTATATCAGTACACGAGGACTT2023ThrProTyrThrAspArgLeuGlnGlnTyrIleSerThrArgGlyLeu5 65570575GGAGTGAAGTATTACATTGATCCTTCCACGTATGAAGATCCCAATGAA2071GlyValLysTyrTyrIleAspProSerThrTyrGluAspProAsnGlu580 585590GCTATTCGAGAGTTTGCCAAAGAGATAGATGTGTCCTTCATCAAAATT2119AlaIleArgGluPheAlaLysGluIleAspValSerPheIleLysIle5956 00605610GAGGAGGTCATTGGATCAGGAGAATTTGGAGAGGTGTGCTTTGGGCGC2167GluGluValIleGlySerGlyGluPheGlyGluValCysPheGlyArg 615620625CTAAAACACCCAGGGAAACGTGAATACACAGTAGCTATTAAAACCCTG2215LeuLysHisProGlyLysArgGluTyrThrValAlaIleLysThrLeu 630635640AAGTCAGGTTATACTGATGAACAGCGTCGAGAGTTCCTGAGCGAGGCC2263LysSerGlyTyrThrAspGluGlnArgArgGluPheLeuSerGluAla6 45650655AGCATCATGGGGCAATTTGAGCATCCCAATGTCATCCACCTGGAGGGC2311SerIleMetGlyGlnPheGluHisProAsnValIleHisLeuGluGly660 665670GTGGTCACCAAAAGCCGACCAGTCATGATTGTCACAGAATTCATGGAG2359ValValThrLysSerArgProValMetIleValThrGluPheMetGlu6756 80685690AATGGATCACTGGATTCCTTCCTCAGGGAGAAGGAGGGACAGTTCAGT2407AsnGlySerLeuAspSerPheLeuArgGluLysGluGlyGlnPheSer 695700705GTGTTACAGCTGGTGGGAATGCTACGAGGGATTGCAGCAGGCATGCGC2455ValLeuGlnLeuValGlyMetLeuArgGlyIleAlaAlaGlyMetArg 710715720TACCTTTCAGACATGAACTATGTGCATCGTGATCTCGCAGCACGTAAC2503TyrLeuSerAspMetAsnTyrValHisArgAspLeuAlaAlaArgAsn7 25730735ATCTTAGTCAACAGTAACCTTGTATGCAAGGTGTCAGACTTTGGTTTG2551IleLeuValAsnSerAsnLeuValCysLysValSerAspPheGlyLeu740 745750TCTCGCTTTCTGGAAGATGATGCTTCAAATCCCACTTATACTGGAGCT2599SerArgPheLeuGluAspAspAlaSerAsnProThrTyrThrGlyAla7557 60765770CTGGGTTGCAAAATCCCCATCCGTTGGACTGCCCCTGAAGCTGTCCAG2647LeuGlyCysLysIleProIleArgTrpThrAlaProGluAlaValGln 
775780785TATCGCAAGTTCACCTCCTCCAGTGATGTCTGGAGCTATGGCATTGTC2695TyrArgLysPheThrSerSerSerAspValTrpSerTyrGlyIleVal 790795800ATGTGGGAGGTGATGTCCTATGGTGAGAGACCTTACTGGGACATGTCC2743MetTrpGluValMetSerTyrGlyGluArgProTyrTrpAspMetSer8 05810815AACCAGGATGTAATTAATGCCATTGACCAGGACTATCGCCTGCCACCA2791AsnGlnAspValIleAsnAlaIleAspGlnAspTyrArgLeuProPro820 825830CCCCCAGACTGCCCAACTGTTTTGCATCTGCTGATGCTTGACTGCTGG2839ProProAspCysProThrValLeuHisLeuLeuMetLeuAspCysTrp8358 40845850CAGAAGGATCGAGTCCAGAGACCAAAATTTGAACAAATAGTCAGTGCC2887GlnLysAspArgValGlnArgProLysPheGluGlnIleValSerAla 855860865CTAGATAAAATGATCCGCAAGCCATCTGCTCTCAAAGCCACTGGCACT2935LeuAspLysMetIleArgLysProSerAlaLeuLysAlaThrGlyThr 870875880GGGAGCAGCAGACCATCTCAGCCTCTCCTGAGCAACTCCCCTCCAGAT2983GlySerSerArgProSerGlnProLeuLeuSerAsnSerProProAsp8 85890895TTTCCTTCACTCAGCAATGCCCACGAGTGGTTGGATGCCATCAAGATG3031PheProSerLeuSerAsnAlaHisGluTrpLeuAspAlaIleLysMet900 905910GGTCGTTACAAGGAGAATTTTGACCAGGCTGGTCTGATTACATTTGAT3079GlyArgTyrLysGluAsnPheAspGlnAlaGlyLeuIleThrPheAsp9159 20925930GTCATATCACGCATGACTCTGGAAGATCTCCAGCGTATTGGAATCACC3127ValIleSerArgMetThrLeuGluAspLeuGlnArgIleGlyIleThr 935940945CTGGTTGGTCACCAGAAAAAGATTCTAAACAGCATCCAGCTCATGAAA3175LeuValGlyHisGlnLysLysIleLeuAsnSerIleGlnLeuMetLys 950955960GTTCATTTGAACCAGCTTGAACCAGTTGAAGTGTGATGCTTTAAGTCTCTATT3228ValHisLeuAsnGlnLeuGluProValGluVal9659 70TCACCAGACTCAAATTCTGAAAGAGTCCTGAGGGGATTCAGAGGGATTGTCACTGTATGA3288AAAGGAAATGGCAAGATGCTCCTTGAAGACTTACTGCACCTAGAGAGTAGACATTACACA3348TTCCATTCCACCAGCAAAAAGAGAATCTTGCCATCATTT AAAAGCAGAGTTAAATAGCTG3408GTGGTTAAATATGACTGGCATCATACACTAGGAGTAGGTCAGGGAGGGAAAGTTATAGTA3468ATGCAGAGTGGAGCTGGTATAATAGTTTGGACAGACCACAAGCACCTGCTAGCTCTTCTC3528CACTAAATAAAAA ATCAGACAATTCTCCAGTGCCATCAGCAGGCTTTATCTGTGACTGGG3588AACAAAGAAATCACAATTTTTCCAAGAGAGTATCAGCACATTGTGAGAGTTATCACTCAG3648TTGGAAATGGACATCACTTGCTATGCCAGATTTGTGAGAAACTGGAGTTCCACTGAG TGC3708ACCATATGTGGTAAACAATAAGGTACATCACCTCGTAATTTTTACAGAGGTTGAGAGTAA3768AGGGCCCA3776(2) INFORMATION FOR SEQ ID NO:8:(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 973 amino 
acids(B) TYPE: amino acid(D) TOPOLOGY: linear(ii) MOLECULE TYPE: protein(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:MetThrGluIleLeuLeuAspThrThrGlyGluThrSerGluIleGly1510 15TrpThrSerHisProProAspGlyTrpGluGluValSerValArgAsp202530AspLysGluArgGlnIleArgThrPheGlnValCysAsnMetAspGlu354045ProGlyGlnAsnAsnTrpLeuArgThrHisPheIleGluArgArgGly505560AlaHisArgValHisValAr gLeuHisPheSerValArgAspCysAla65707580SerMetArgThrValAlaSerThrCysLysGluThrPheThrLeuTyr85 9095TyrHisGlnSerAspValAspIleAlaSerGlnGluLeuProGluTrp100105110HisGluGlyProTrpThrLysValAspThr IleAlaAlaAspGluSer115120125PheSerGlnValAspArgThrGlyLysValValArgMetAsnValLys130135140V alArgSerPheGlyProLeuThrLysHisGlyPheTyrLeuAlaPhe145150155160GlnAspSerGlyAlaCysMetSerLeuValAlaValGlnValPhePhe 165170175TyrLysCysProAlaValValLysGlyPheAlaSerPheProGluThr180185190PheAlaGlyGl yGluArgThrSerLeuValGluSerLeuGlyThrCys195200205ValAlaAsnAlaGluGluAlaSerThrThrGlySerSerGlyValArg210215 220LeuHisCysAsnGlyGluGlyGluTrpMetValAlaThrGlyArgCys225230235240SerCysLysAlaGlyTyrGlnSerValAspAsn GluGlnAlaCysGln245250255AlaCysProIleGlySerPheLysAlaSerValGlyAspAspProCys260265 270LeuLeuCysProAlaHisSerHisAlaProLeuProLeuProGlySer275280285IleGluCysValCysGlnSerHisTyrTyrArgSerAlaSerAspAsn 290295300SerAspAlaProCysThrGlyIleProSerAlaProArgAspLeuSer305310315320TyrGluIleValGl ySerAsnValLeuLeuThrTrpArgLeuProLys325330335AspLeuGlyGlyArgLysAspValPhePheAsnValIleCysLysGlu340 345350CysProThrArgSerAlaGlyThrCysValArgCysGlyAspAsnVal355360365GlnPheGluProArgGlnValGlyLeuThrGlu SerArgValGlnVal370375380SerAsnLeuLeuAlaArgValGlnTyrThrPheGluIleGlnAlaVal385390395 400AsnLeuValThrGluLeuSerSerGluAlaProGlnTyrAlaThrIle405410415AsnValSerThrSerGlnSerValProSerAlaIleProMetMetHis420425430GlnValSerArgAlaThrSerSerIleThrLeuSerTrpProGlnPro435440445AspGlnProAsnGl yValIleLeuAspTyrGlnLeuArgTyrPheAsp450455460LysAlaGluAspGluAspAsnSerPheThrLeuThrSerGluThrAsn465470 475480MetAlaThrIleLeuAsnLeuSerProGlyLysIleTyrValPheGln485490495ValArgAlaArgThrAlaValGlyTyrGly 
ProTyrSerGlyLysMet500505510TyrPheGlnThrLeuMetAlaGlyGluHisSerGluMetAlaGlnAsp515520 525ArgLeuProLeuIleValGlySerAlaLeuGlyGlyLeuAlaPheLeu530535540ValIleAlaAlaIleAlaIleLeuAlaIleIlePheLysSerLysArg545 550555560ArgGluThrProTyrThrAspArgLeuGlnGlnTyrIleSerThrArg565570575GlyLeuGlyVa lLysTyrTyrIleAspProSerThrTyrGluAspPro580585590AsnGluAlaIleArgGluPheAlaLysGluIleAspValSerPheIle595 600605LysIleGluGluValIleGlySerGlyGluPheGlyGluValCysPhe610615620GlyArgLeuLysHisProGlyLysArgGluTyrThrVal AlaIleLys625630635640ThrLeuLysSerGlyTyrThrAspGluGlnArgArgGluPheLeuSer645650 655GluAlaSerIleMetGlyGlnPheGluHisProAsnValIleHisLeu660665670GluGlyValValThrLysSerArgProValMetIleValThrGluPhe675680685MetGluAsnGlySerLeuAspSerPheLeuArgGluLysGluGlyGln690695700PheSerValLeuGlnLeuVa lGlyMetLeuArgGlyIleAlaAlaGly705710715720MetArgTyrLeuSerAspMetAsnTyrValHisArgAspLeuAlaAla725 730735ArgAsnIleLeuValAsnSerAsnLeuValCysLysValSerAspPhe740745750GlyLeuSerArgPheLeuGluAspAspAla SerAsnProThrTyrThr755760765GlyAlaLeuGlyCysLysIleProIleArgTrpThrAlaProGluAla770775780V alGlnTyrArgLysPheThrSerSerSerAspValTrpSerTyrGly785790795800IleValMetTrpGluValMetSerTyrGlyGluArgProTyrTrpAsp 805810815MetSerAsnGlnAspValIleAsnAlaIleAspGlnAspTyrArgLeu820825830ProProProPr oAspCysProThrValLeuHisLeuLeuMetLeuAsp835840845CysTrpGlnLysAspArgValGlnArgProLysPheGluGlnIleVal850855 860SerAlaLeuAspLysMetIleArgLysProSerAlaLeuLysAlaThr865870875880GlyThrGlySerSerArgProSerGlnProLeu LeuSerAsnSerPro885890895ProAspPheProSerLeuSerAsnAlaHisGluTrpLeuAspAlaIle900905 910LysMetGlyArgTyrLysGluAsnPheAspGlnAlaGlyLeuIleThr915920925PheAspValIleSerArgMetThrLeuGluAspLeuGlnArgIleGly 930935940IleThrLeuValGlyHisGlnLysLysIleLeuAsnSerIleGlnLeu945950955960MetLysValHisLe uAsnGlnLeuGluProValGluVal965970(2) INFORMATION FOR SEQ ID NO:9:(i) SEQUENCE CHARACTERISTICS:(A) LENGTH: 3546 base pairs(B) TYPE: nucleic acid(C) STRANDEDNESS: both(D) TOPOLOGY: linear(ix) FEATURE: (A) NAME/KEY: CDS(B) LOCATION: 2..2920(xi) SEQUENCE DESCRIPTION: SEQ ID 
NO:9:CGGGGTCTCCTCGAGGGCGCGGCGGCCGCCGGGCAGCAGCAGGAGC46GlyValSerSerArgAlaArgArgProProGlySerSerArgSer1 51015AGCAGGAGGGGGGTGACCTCGGAGCTGGCATGGACAACCCATCCGGAG94SerArgArgGlyValThrSerGluLeuAlaTrpThrThrHisProGlu 202530ACGGGGTGGGAAGAGGTCAGTGGTTACGACGAGGCTATGAACCCCATC142ThrGlyTrpGluGluValSerGlyTyrAspGluAlaMetAsnProIle 354045CGCACATACCAGGTGTGCAACGTGCGGGAGGCCAACCAGAACAACTGG190ArgThrTyrGlnValCysAsnValArgGluAlaAsnGlnAsnAsnTrp 505560CTTCGCACCAAGTTCATTCAGCGCCAGGACGTCCAGCGTGTCTACGTG238LeuArgThrLysPheIleGlnArgGlnAspValGlnArgValTyrVal65 7075GAGCTGAAATTCACTGTGCGGGACTGCAACAGCATCCCCAACATCCCT286GluLeuLysPheThrValArgAspCysAsnSerIleProAsnIlePro80 859095GGTTCCTGCAAAGAGACCTTCAACCTCTTCTATTATGAGTCAGATACG334GlySerCysLysGluThrPheAsnLeuPheTyrTyrGluSerAspThr 100105110GATTCTGCCTCTGCCAATAGCCCTTTCTGGATGGAGAACCCCTATATC382AspSerAlaSerAlaAsnSerProPheTrpMetGluAsnProTyrIle 115120125AAAGTGGATACAATTGCTCCGGATGAGAGCTTCTCCAAACTGGAGTCC430LysValAspThrIleAlaProAspGluSerPheSerLysLeuGluSer 130135140GGCCGTGTGAACACCAAGGTGCGCAGCTTTGGGCCGCTCTCCAAGAAT478GlyArgValAsnThrLysValArgSerPheGlyProLeuSerLysAsn145 150155GGCTTTTATCTGGCTTTCCAGGACCTGGGGGCCTGCATGTCCCTTATC526GlyPheTyrLeuAlaPheGlnAspLeuGlyAlaCysMetSerLeuIle160 165170175TCCGTCCGGGCTTTCTACAAGAAATGTTCCAACACCATCGCTGGCTTT574SerValArgAlaPheTyrLysLysCysSerAsnThrIleAlaGlyPhe 180185190GCTATCTTCCCGGAGACCCTAACGGGGGCTGAGCCCACGTCGCTGGTC622AlaIlePheProGluThrLeuThrGlyAlaGluProThrSerLeuVal 195200205ATTGCGCCGGGCACCTGCATCCCCAACGCAGTGGAAGTGTCTGTGCCC670IleAlaProGlyThrCysIleProAsnAlaValGluValSerValPro 210215220CTGAAGCTGTACTGCAACGGTGATGGCGAGTGGATGGTGCCTGTGGGA718LeuLysLeuTyrCysAsnGlyAspGlyGluTrpMetValProValGly225 230235GCGTGCACGTGTGCTGCTGGGTACGAGCCAGCCATGAAGGATACCCAG766AlaCysThrCysAlaAlaGlyTyrGluProAlaMetLysAspThrGln240 245250255TGCCAAGCATGCGGCCCGGGGACGTTCAAATCCAAGCAGGGCGAGGGC814CysGlnAlaCysGlyProGlyThrPheLysSerLysGlnGlyGluGly 260265270CCCTGCTCCCCCTGCCCTCCCAACAGCCGCACCACCGCGGGGGCAGCC862ProCysSerProCysProProAsnSerArgThrThrAlaGlyAlaAla 
275280285ACAGTCTGCATATGTCGCAGCGGCTTCTTCCGAGCAGACGCGGACCCC910ThrValCysIleCysArgSerGlyPhePheArgAlaAspAlaAspPro 290295300GCAGACAGCGCCTGCACCAGTGTGCCCTCAGCCCCACGCAGCGTCATC958AlaAspSerAlaCysThrSerValProSerAlaProArgSerValIle305 310315TCCAACGTGAATGAGACGTCGTTGGTGCTGGAGTGGAGCGAGCCGCAG1006SerAsnValAsnGluThrSerLeuValLeuGluTrpSerGluProGln320 325330335GACGCGGGCGGGCGGGATGACCTGCTCTACAACGTCATCTGCAAGAAG1054AspAlaGlyGlyArgAspAspLeuLeuTyrAsnValIleCysLysLys 340345350TGCAGCGTGGAGCGGCGGCTGTGCAGCCGCTGCGACGACAACGTGGAG1102CysSerValGluArgArgLeuCysSerArgCysAspAspAsnValGlu 355360365TTCGTGCCGCGCCAGCTGGGCCTCACTGGCCTCACTGAGCGACGCATC1150PheValProArgGlnLeuGlyLeuThrGlyLeuThrGluArgArgIle 370375380TACATCAGCAAGGTGATGGCCCACCCCCAGTACACCTTCGAGATCCAG1198TyrIleSerLysValMetAlaHisProGlnTyrThrPheGluIleGln385 390395GCGGTGAATGGCATCTCCAGCAAGAGCCCCTACCCTCCCCATTTTGCC1246AlaValAsnGlyIleSerSerLysSerProTyrProProHisPheAla400 405410415TCCGTCAACATCACGACCAACCAGGCAGCCCCATCTGCCGTGCCCACC1294SerValAsnIleThrThrAsnGlnAlaAlaProSerAlaValProThr 420425430ATGCATCTGCACAGCAGCACCGGGAACAGCATGACACTGTCATGGACT1342MetHisLeuHisSerSerThrGlyAsnSerMetThrLeuSerTrpThr 435440445CCCCCGGAAAGGCCCAACGGCATCATTCTCGACTATGAAATCAAGTAC1390ProProGluArgProAsnGlyIleIleLeuAspTyrGluIleLysTyr 450455460TCCGAGAAGCAAGGCCAGGGTGACGGCATTGCCAACACTGTCACCAGC1438SerGluLysGlnGlyGlnGlyAspGlyIleAlaAsnThrValThrSer465 470475CAGAAGAACTCGGTGCGGCTGGACGGACTGAAGGCCAATGCTCGGTAC1486GlnLysAsnSerValArgLeuAspGlyLeuLysAlaAsnAlaArgTyr480 485490495ATGGTGCAGGTCCGGGCGCGCACAGTGGCTGGATACGGCCGCTACAGC1534MetValGlnValArgAlaArgThrValAlaGlyTyrGlyArgTyrSer 500505510CTCCCCACCGAGTTCCAGACGACTGCGGAGGATGGCTCCACCAGCAAG1582LeuProThrGluPheGlnThrThrAlaGluAspGlySerThrSerLys 515520525ACTTTCCAGGAGCTTCCTCTCATCGTGGGTTCAGCCACCGCGGGACTG1630ThrPheGlnGluLeuProLeuIleValGlySerAlaThrAlaGlyLeu 530535540CTGTTTGTCATCGTGGTGGTCATCATCGCTATTGTCTGCTTCAGGAAG1678LeuPheValIleValValValIleIleAlaIleValCysPheArgLys545 550555CAGCGCAACAGCACAGATCCCGAGTACACAGAGAAGCTGCAGCAATAT1726GlnArgAsnSerThrAspProGluTyrThrGluLysLeuGlnGlnTyr560 
565570575GTCACTCCTGGGATGAAGGTCTACATTGACCCCTTCACCTATGAAGAC1774ValThrProGlyMetLysValTyrIleAspProPheThrTyrGluAsp 580585590CCAAATGAAGCTGTCCGGGAATTCGCCAAAGAGATTGATATCTCCTGT1822ProAsnGluAlaValArgGluPheAlaLysGluIleAspIleSerCys 595600605GTCAAAATTGAGGAGGTCATTGGAGCAGGAGAGTTTGGTGAGGTGTGC1870ValLysIleGluGluValIleGlyAlaGlyGluPheGlyGluValCys 610615620CGTGGGCGCCTGAAGCTGCCTGGCCGCCGTGAGATCTTTGTGGCCATC1918ArgGlyArgLeuLysLeuProGlyArgArgGluIlePheValAlaIle625 630635AAGACACTGAAGGTGGGCTACACAGAGAGGCAGCGGCGGGACTTCCTG1966LysThrLeuLysValGlyTyrThrGluArgGlnArgArgAspPheLeu640 645650655AGTGAGGCCAGCATCATGGGCCAGTTCGACCACCCCAACATCATCCAC2014SerGluAlaSerIleMetGlyGlnPheAspHisProAsnIleIleHis 660665670CTGGAGGGCGTGGTGACCAAGAGCCGCCCTGTCATGATCATCACAGAG2062LeuGluGlyValValThrLysSerArgProValMetIleIleThrGlu 675680685TTCATGGAGAACTGCGCTCTCGACTCCTTCCTCCGGCTGAATGATGGG2110PheMetGluAsnCysAlaLeuAspSerPheLeuArgLeuAsnAspGly 690695700CAGTTCACGGTCATCCAGCTGGTGGGGATGCTGCGAGGCATCGCTGCT2158GlnPheThrValIleGlnLeuValGlyMetLeuArgGlyIleAlaAla705 710715GGCATGAAGTACCTCTCAGAGATGAACTACGTGCACCGAGACCTGGCT2206GlyMetLysTyrLeuSerGluMetAsnTyrValHisArgAspLeuAla720 725730735GCCCGCAACATCCTGGTCAACAGCAACTTGGTCTGCAAAGTGTCTGAC2254AlaArgAsnIleLeuValAsnSerAsnLeuValCysLysValSerAsp 740745750TTCGGGCTCTCCCGCTTTTTGGAGGATGATCCAGCCGACCCCACCTAC2302PheGlyLeuSerArgPheLeuGluAspAspProAlaAspProThrTyr 755760765ACCAGCTCCCTGGGAGGCAAGATCCCCATCAGGTGGACAGCTCCTGAG2350ThrSerSerLeuGlyGlyLysIleProIleArgTrpThrAlaProGlu 770775780GCCATCGCCTACCGCAAATTCACGTCGGCCAGCGACGTGTGGAGCTAC2398AlaIleAlaTyrArgLysPheThrSerAlaSerAspValTrpSerTyr785 790795GGCATCGTCATGTGGGAAGTGATGTCCTACGGGGAGCGACCCTACTGG2446GlyIleValMetTrpGluValMetSerTyrGlyGluArgProTyrTrp800 805810815GACATGTCCAACCAGGATGTGATCAACGCGGTGGAGCAGGATTACCGC2494AspMetSerAsnGlnAspValIleAsnAlaValGluGlnAspTyrArg 820825830CTGCCACCCCCCATGGACTGCCCCACAGCACTGCACCAGCTGATGCTG2542LeuProProProMetAspCysProThrAlaLeuHisGlnLeuMetLeu 835840845GACTGCTGGGTGCGGGACCGCAACCTGCGGCCCAAGTTTGCACAGATT2590AspCysTrpValArgAspArgAsnLeuArgProLysPheAlaGlnIle 
850855860GTCAACACGCTGGACAAGCTGATCCGCAATGCTGCCAGCCTGAAGGTC2638ValAsnThrLeuAspLysLeuIleArgAsnAlaAlaSerLeuLysVal865 870875ATCGCCAGCGTCCAGTCCGGTGTCTCCCAGCCGCTCCTGGACCGCACC2686IleAlaSerValGlnSerGlyValSerGlnProLeuLeuAspArgThr880 885890895GTGCCCGATTACACCACCTTCACCACCGTGGGAGACTGGCTGGATGCC2734ValProAspTyrThrThrPheThrThrValGlyAspTrpLeuAspAla 900905910ATCAAAATGGGACGGTACAAGGAGAACTTCGTCAACGCCGGCTTCGCC2782IleLysMetGlyArgTyrLysGluAsnPheValAsnAlaGlyPheAla 915920925TCCTTTGACCTGGTGGCACAGATGACAGCAGAGGACCTGCTAAGGATA2830SerPheAspLeuValAlaGlnMetThrAlaGluAspLeuLeuArgIle 930935940GGAGTGACGCTAGCAGGGCACCAGAAGAAGATCCTGAGCAGCATTCAG2878GlyValThrLeuAlaGlyHisGlnLysLysIleLeuSerSerIleGln945 950955GACATGAGGCTGCAGATGAACCAGACGCTGCCGGTTCAGGTT2920AspMetArgLeuGlnMetAsnGlnThrLeuProValGlnVal960965 970TGACCGCAGGGACTCTGCATTGGAACGGACTGAGGGAACCTGCCAACCAGGTTCTGTTTG2980CGGTGCAGCCCGGCTTCCCGATTTCCCCTTCCCGTGGCGCTCCTCTGCCTCGGACGCTCG3040CCGGGGACAGGCTGGGCCGGGCC ACCCTTCCCTGGATCAGAGGCACTCGTGCCGGGAGGG3100AGCCCGGCTTTTCGTCCCGTGTCCCGCAGCGGCGAGGCAGTGAACGCAGTCTTCATATTG3160AAGATGGATTATGGGACGGAGATGGCGCATCCGCTTCCCGCCCTGTCTCAGTGCTCATCA3220GTTTGAAGAGATGTTCTGCTTCTTGGATTTCTTTACACCCCGGTTTTCCCCCCTCGAGTC3280CTCACTTCCCCCTATCCCTGAGGCCACAGACTGTTGACCCGTCCGCTGAGTCCGTCAGAC3340GCTCCGAAGCCTTCCCCGAGCCCGGTCCCCGCGTGGAGACG GCGCCAGGGACGGGGCTAC3400GGCCCCAGACAATCACTCCACCCCTCCGCACGAGGGTCCTCACTGGGACGTGTCTGAAGG3460GGAAAGGCTCTGCTCCCTTTTTGGCTTTGCACGCCAGAACCCGAACCCCGTGAGATTTAC3520TATGCAGGGAGTTAGG CAAAAAAAAG3546(2) INFORMATION FOR SEQ ID NO:10:(i) SEQUENCE CHARACTERISTICS:(A) LENGTH: 973 amino acids(B) TYPE: amino acid(D) TOPOLOGY: linear(ii) MOLECULE TYPE: protein(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:GlyValSerSerArgAlaA rgArgProProGlySerSerArgSerSer151015ArgArgGlyValThrSerGluLeuAlaTrpThrThrHisProGluThr20 2530GlyTrpGluGluValSerGlyTyrAspGluAlaMetAsnProIleArg354045ThrTyrGlnValCysAsnValArgGluAlaAsnGlnAs nAsnTrpLeu505560ArgThrLysPheIleGlnArgGlnAspValGlnArgValTyrValGlu65707580 
LeuLysPheThrValArgAspCysAsnSerIleProAsnIleProGly859095SerCysLysGluThrPheAsnLeuPheTyrTyrGluSerAspThrAsp 100105110SerAlaSerAlaAsnSerProPheTrpMetGluAsnProTyrIleLys115120125ValAspThrIleAlaProA spGluSerPheSerLysLeuGluSerGly130135140ArgValAsnThrLysValArgSerPheGlyProLeuSerLysAsnGly14515015 5160PheTyrLeuAlaPheGlnAspLeuGlyAlaCysMetSerLeuIleSer165170175ValArgAlaPheTyrLysLysCysSerAsnThrIl eAlaGlyPheAla180185190IlePheProGluThrLeuThrGlyAlaGluProThrSerLeuValIle195200205 AlaProGlyThrCysIleProAsnAlaValGluValSerValProLeu210215220LysLeuTyrCysAsnGlyAspGlyGluTrpMetValProValGlyAla225 230235240CysThrCysAlaAlaGlyTyrGluProAlaMetLysAspThrGlnCys245250255GlnAlaCysGlyProG lyThrPheLysSerLysGlnGlyGluGlyPro260265270CysSerProCysProProAsnSerArgThrThrAlaGlyAlaAlaThr27528 0285ValCysIleCysArgSerGlyPhePheArgAlaAspAlaAspProAla290295300AspSerAlaCysThrSerValProSerAlaProArgSerValIl eSer305310315320AsnValAsnGluThrSerLeuValLeuGluTrpSerGluProGlnAsp325330335AlaGlyGlyArgAspAspLeuLeuTyrAsnValIleCysLysLysCys340345350SerValGluArgArgLeuCysSerArgCysAspAspAsnValGluPhe 355360365ValProArgGlnLeuGlyLeuThrGlyLeuThrGluArgArgIleTyr370375380IleSerLysValMetAlaHisProG lnTyrThrPheGluIleGlnAla385390395400ValAsnGlyIleSerSerLysSerProTyrProProHisPheAlaSer405 410415ValAsnIleThrThrAsnGlnAlaAlaProSerAlaValProThrMet420425430HisLeuHisSerSerThrGlyAsnSerMetThrLe uSerTrpThrPro435440445ProGluArgProAsnGlyIleIleLeuAspTyrGluIleLysTyrSer450455460GluLys GlnGlyGlnGlyAspGlyIleAlaAsnThrValThrSerGln465470475480LysAsnSerValArgLeuAspGlyLeuLysAlaAsnAlaArgTyrMet 485490495ValGlnValArgAlaArgThrValAlaGlyTyrGlyArgTyrSerLeu500505510ProThrGluPheGlnT hrThrAlaGluAspGlySerThrSerLysThr515520525PheGlnGluLeuProLeuIleValGlySerAlaThrAlaGlyLeuLeu530535 540PheValIleValValValIleIleAlaIleValCysPheArgLysGln545550555560ArgAsnSerThrAspProGluTyrThrGluLysLeuGl nGlnTyrVal565570575ThrProGlyMetLysValTyrIleAspProPheThrTyrGluAspPro580585590AsnGluAlaValArgGluPheAlaLysGluIleAspIleSerCysVal595600605LysIleGluGluValIleGlyAlaGlyGluPheGlyGluValCysArg610 
615620GlyArgLeuLysLeuProGlyArgArgGluIlePheValAlaIleLys625630635640ThrLeuLysValGlyTyrT hrGluArgGlnArgArgAspPheLeuSer645650655GluAlaSerIleMetGlyGlnPheAspHisProAsnIleIleHisLeu660 665670GluGlyValValThrLysSerArgProValMetIleIleThrGluPhe675680685MetGluAsnCysAlaLeuAspSerPheLeuArgLeuAs nAspGlyGln690695700PheThrValIleGlnLeuValGlyMetLeuArgGlyIleAlaAlaGly705710715720 MetLysTyrLeuSerGluMetAsnTyrValHisArgAspLeuAlaAla725730735ArgAsnIleLeuValAsnSerAsnLeuValCysLysValSerAspPhe 740745750GlyLeuSerArgPheLeuGluAspAspProAlaAspProThrTyrThr755760765SerSerLeuGlyGlyLysI leProIleArgTrpThrAlaProGluAla770775780IleAlaTyrArgLysPheThrSerAlaSerAspValTrpSerTyrGly78579079 5800IleValMetTrpGluValMetSerTyrGlyGluArgProTyrTrpAsp805810815MetSerAsnGlnAspValIleAsnAlaValGluGl nAspTyrArgLeu820825830ProProProMetAspCysProThrAlaLeuHisGlnLeuMetLeuAsp835840845 CysTrpValArgAspArgAsnLeuArgProLysPheAlaGlnIleVal850855860AsnThrLeuAspLysLeuIleArgAsnAlaAlaSerLeuLysValIle865 870875880AlaSerValGlnSerGlyValSerGlnProLeuLeuAspArgThrVal885890895ProAspTyrThrThrP heThrThrValGlyAspTrpLeuAspAlaIle900905910LysMetGlyArgTyrLysGluAsnPheValAsnAlaGlyPheAlaSer91592 0925PheAspLeuValAlaGlnMetThrAlaGluAspLeuLeuArgIleGly930935940ValThrLeuAlaGlyHisGlnLysLysIleLeuSerSerIleGl nAsp945950955960MetArgLeuGlnMetAsnGlnThrLeuProValGlnVal965970(2) INFORMATION FOR SEQ ID NO:11:(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 4097 base pairs(B) TYPE: nucleic acid(C) STRANDEDNESS: both(D) TOPOLOGY: linear(ix) FEATURE:(A) NAME/KEY: CDS(B) LOCATION: 10..3042(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:CGGCTTCTGATGCCCGGCCCGGAGCGCACCATGGGGCCGTTGTGGTT C48MetProGlyProGluArgThrMetGlyProLeuTrpPhe1510TGCTGTTTGCCCCTCGCCCTCTTGCCTCTGCTCGCCGCCGTGGAAGAG 96CysCysLeuProLeuAlaLeuLeuProLeuLeuAlaAlaValGluGlu152025ACGCTGATGGACTCCACAACGGCCACAGCAGAGCTGGGCTGGATGGTG144T hrLeuMetAspSerThrThrAlaThrAlaGluLeuGlyTrpMetVal30354045CATCCTCCCTCAGGGTGGGAAGAGGTGAGTGGATACGATGAGAACATG 192HisProProSerGlyTrpGluGluValSerGlyTyrAspGluAsnMet505560AACACCATCCGCACCTACCAGGTGTGCAACGTCTTTGAATCCAGCCAA 
240AsnThrIleArgThrTyrGlnValCysAsnValPheGluSerSerGln657075AACAACTGGCTGCGGACCAAGTACATCCGGAGGCGAGGAGCGCACCGC 288AsnAsnTrpLeuArgThrLysTyrIleArgArgArgGlyAlaHisArg808590ATCCACGTGGAGATGAAATTCTCCGTTCGGGACTGCAGCAGCATCCCC 336IleHisValGluMetLysPheSerValArgAspCysSerSerIlePro95100105AACGTCCCGGGCTCCTGTAAGGAGACTTTTAACCTCTATTACTACGAA384A snValProGlySerCysLysGluThrPheAsnLeuTyrTyrTyrGlu110115120125TCAGACTTTGACTCTGCCACCAAGACTTTTCCTAACTGGATGGAAAAC 432SerAspPheAspSerAlaThrLysThrPheProAsnTrpMetGluAsn130135140CCTTGGATGAAGGTAGATACAATTGCTGCCGACGAGAGCTTCTCGCAG 480ProTrpMetLysValAspThrIleAlaAlaAspGluSerPheSerGln145150155GTGGACCTTGGTGGGCGGGTGATGAAGATTAACACCGAGGTGCGCAGT 528ValAspLeuGlyGlyArgValMetLysIleAsnThrGluValArgSer160165170TTTGGGCCTGTCTCCAAAAACGGTTTCTACCTGGCCTTCCAGGACTAC 576PheGlyProValSerLysAsnGlyPheTyrLeuAlaPheGlnAspTyr175180185GGGGGCTGCATGTCCTTGATTGCAGTCCGTGTCTTTTACCGCAAGTGT624G lyGlyCysMetSerLeuIleAlaValArgValPheTyrArgLysCys190195200205CCCCGTGTGATCCAGAACGGGGCGGTCTTCCAGGAAACCCTCTCGGGA 672ProArgValIleGlnAsnGlyAlaValPheGlnGluThrLeuSerGly210215220GCGGAGAGCACATCTCTGGTGGCAGCCCGGGGGACGTGCATCAGCAAT 720AlaGluSerThrSerLeuValAlaAlaArgGlyThrCysIleSerAsn225230235GCGGAGGAGGTGGATGTGCCCATCAAGCTGTACTGCAATGGGGATGGC 768AlaGluGluValAspValProIleLysLeuTyrCysAsnGlyAspGly240245250GAGTGGCTGGTGCCCATCGGCCGCTGCATGTGCAGGCCGGGCTATGAG 816GluTrpLeuValProIleGlyArgCysMetCysArgProGlyTyrGlu255260265TCGGTGGAGAATGGGACCGTCTGCAGAGGCTGCCCATCAGGGACCTTC864S erValGluAsnGlyThrValCysArgGlyCysProSerGlyThrPhe270275280285AAGGCCAGCCAAGGAGATGAAGGATGTGTCCATTGTCCAATTAACAGC 912LysAlaSerGlnGlyAspGluGlyCysValHisCysProIleAsnSer290295300CGGACGACTTCGGAAGGGGCCACGAACTGCGTGTGCCGAAACGGATAT 960ArgThrThrSerGluGlyAlaThrAsnCysValCysArgAsnGlyTyr305310315TACCGGGCAGATGCTGACCCCGTCGACATGCCATGCACCACCATCCCA 1008TyrArgAlaAspAlaAspProValAspMetProCysThrThrIlePro320325330TCTGCCCCCCAGGCCGTGATCTCCAGCGTGAATGAAACCTCCCTGATG 1056SerAlaProGlnAlaValIleSerSerValAsnGluThrSerLeuMet335340345CTGGAGTGGACCCCACCACGAGACTCAGGGGGCCGGGAGGATCTGGTA1104L 
euGluTrpThrProProArgAspSerGlyGlyArgGluAspLeuVal350355360365TACAACATCATCTGCAAGAGCTGTGGGTCAGGCCGTGGGGCGTGCACG 1152TyrAsnIleIleCysLysSerCysGlySerGlyArgGlyAlaCysThr370375380CGCTGTGGGGACAACGTGCAGTTTGCCCCACGCCAGCTGGGCCTGACG 1200ArgCysGlyAspAsnValGlnPheAlaProArgGlnLeuGlyLeuThr385390395GAGCCTCGCATCTACATCAGCGACCTGCTGGCCCACACGCAGTACACC 1248GluProArgIleTyrIleSerAspLeuLeuAlaHisThrGlnTyrThr400405410TTTGAGATCCAGGCTGTGAATGGGGTCACCGACCAGAGCCCCTTCTCC 1296PheGluIleGlnAlaValAsnGlyValThrAspGlnSerProPheSer415420425CCACAGTTTGCATCAGTGAATATCACCACCAACCAGGCTGCTCCTTCA1344P roGlnPheAlaSerValAsnIleThrThrAsnGlnAlaAlaProSer430435440445GCCGTGTCCATAATGCACCAGGTCAGCCGCACTGTGGACAGCATTACC 1392AlaValSerIleMetHisGlnValSerArgThrValAspSerIleThr450455460CTCTCGTGGTCTCAACCTGACCAGCCCAATGGAGTCATCCTGGATTAT 1440LeuSerTrpSerGlnProAspGlnProAsnGlyValIleLeuAspTyr465470475GAGCTGCAATACTATGAGAAGAACCTGAGTGAGTTAAATTCAACAGCA 1488GluLeuGlnTyrTyrGluLysAsnLeuSerGluLeuAsnSerThrAla480485490GTGAAGAGCCCCACCAACACTGTGACAGTGCAAAACCTCAAAGCTGGC 1536ValLysSerProThrAsnThrValThrValGlnAsnLeuLysAlaGly495500505ACCATCTATGTCTTCCAAGTGCGAGCACGTACCGTGGCTGGGTATGGC1584T hrIleTyrValPheGlnValArgAlaArgThrValAlaGlyTyrGly510515520525CGGTATAGTGGCAAGATGTACTTCCAGACCATGACTGAAGCCGAGTAC 1632ArgTyrSerGlyLysMetTyrPheGlnThrMetThrGluAlaGluTyr530535540CAGACCAGTGTCCAGGAGAAGCTGCCACTCATCATTGGCTCCTCTGCA 1680GlnThrSerValGlnGluLysLeuProLeuIleIleGlySerSerAla545550555GCAGGACTGGTGTTTCTCATTGCTGTTGTCGTCATCATTATTGTCTGC 1728AlaGlyLeuValPheLeuIleAlaValValValIleIleIleValCys560565570AACAGAAGACGGGGCTTTGAACGTGCTGACTCTGAGTACACTGACAAG 1776AsnArgArgArgGlyPheGluArgAlaAspSerGluTyrThrAspLys575580585CTGCAGCACTATACCAGTGGCCACAGTACGTACCGTGGTCCCCCGCCA1824L euGlnHisTyrThrSerGlyHisSerThrTyrArgGlyProProPro590595600605GGCCTGGGGGTCCGCTCTCTCTTCGTGACTCCAGGGATGAAGATTTAT 1872GlyLeuGlyValArgSerLeuPheValThrProGlyMetLysIleTyr610615620ATCGATCCATTTACCTACGAAGATCCCAATGAGGCTGTCAGGGAATTT 1920IleAspProPheThrTyrGluAspProAsnGluAlaValArgGluPhe625630635GCAAAAGAAATTGATATCTCCTGTGTGAAAATCGAGCAGGTGATTGGG 
1968AlaLysGluIleAspIleSerCysValLysIleGluGlnValIleGly640645650GCAGGGGAGTTTGGTGAGGTGTGCAGTGGGCATCTCAAGCTTCCTGGC 2016AlaGlyGluPheGlyGluValCysSerGlyHisLeuLysLeuProGly655660665AAAAGAGAGATCTTTGTGGCCATCAAGACCCTGAAGTCTGGTTACACA2064L ysArgGluIlePheValAlaIleLysThrLeuLysSerGlyTyrThr670675680685GAGAAGCAGAGACGGGACTTCCTGAGTGAAGCCAGCATCATGGGGCAG 2112GluLysGlnArgArgAspPheLeuSerGluAlaSerIleMetGlyGln690695700TTTGACCACCCCAATGTCATCCACCTGGAAGGGGTGGTGACCAAGAGT 2160PheAspHisProAsnValIleHisLeuGluGlyValValThrLysSer705710715TCCCCAGTCATGATCATTACAGAGTTCATGGAGAATGGCTCGTTGGAC 2208SerProValMetIleIleThrGluPheMetGluAsnGlySerLeuAsp720725730TCCTTCTTGAGGCAAAATGATGGGCAGTTCACAGTGATCCAGCTGGTG 2256SerPheLeuArgGlnAsnAspGlyGlnPheThrValIleGlnLeuVal735740745GGCATGTTGCGTGGCATTGCAGCAGGCATGAAGTACCTGGCTGATATG2304G lyMetLeuArgGlyIleAlaAlaGlyMetLysTyrLeuAlaAspMet750755760765AACTACGTGCACCGGGACCTGGCTGCCCGCAACATCCTGGTCAACAGC 2352AsnTyrValHisArgAspLeuAlaAlaArgAsnIleLeuValAsnSer770775780AACCTGGTCTGCAAGGTGTCCGACTTCGGCCTCTCCCGTTTCCTGGAG 2400AsnLeuValCysLysValSerAspPheGlyLeuSerArgPheLeuGlu785790795GATGACACCTCTGATCCCACTTACACCAGCGCACTGGGTGGAAAGATC 2448AspAspThrSerAspProThrTyrThrSerAlaLeuGlyGlyLysIle800805810CCAATACGGTGGACAGCGCCTGAGGCAATTCAGTACCGAAAATTCACA 2496ProIleArgTrpThrAlaProGluAlaIleGlnTyrArgLysPheThr815820825TCAGCCAGCGATGTGTGGAGCTATGGAATAGTCATGTGGGAGGTGATG2544S erAlaSerAspValTrpSerTyrGlyIleValMetTrpGluValMet830835840845TCGTACGGCGAGCGGCCTTACTGGGACATGACCAATCAAGATGTGATA 2592SerTyrGlyGluArgProTyrTrpAspMetThrAsnGlnAspValIle850855860AATGCTATTGAGCAGGACTATCGGCTACCACCCCCTATGGATTGTCCA 2640AsnAlaIleGluGlnAspTyrArgLeuProProProMetAspCysPro865870875AATGCCCTGCACCAGCTAATGCTTGACTGCTGGCAGAAGGATCGAAAC 2688AsnAlaLeuHisGlnLeuMetLeuAspCysTrpGlnLysAspArgAsn880885890CACAGACCCAAATTTGGACAGATTGTCAACACTTTAGACAAAATGATC 2736HisArgProLysPheGlyGlnIleValAsnThrLeuAspLysMetIle895900905CGAAATCCTAATAGTCTGAAAGCCATGGCACCTCTCTCCTCTGGGGTT2784A rgAsnProAsnSerLeuLysAlaMetAlaProLeuSerSerGlyVal910915920925AACCTCCCTCTACTTGACCGCACAATCCCAGATTATACCAGCTTCAAC 
2832AsnLeuProLeuLeuAspArgThrIleProAspTyrThrSerPheAsn930935940ACTGTGGATGAATGGCTGGATGCCATCAAGATGAGCCAGTACAAGGAG 2880ThrValAspGluTrpLeuAspAlaIleLysMetSerGlnTyrLysGlu945950955AGCTTTGCCAGTGCTGGCTTCACCACCTTTGATATAGTATCTCAGATG 2928SerPheAlaSerAlaGlyPheThrThrPheAspIleValSerGlnMet960965970ACTGTAGAGGACATTCTACGAGTTGGGGTCACTTTAGCAGGACACCAG 2976ThrValGluAspIleLeuArgValGlyValThrLeuAlaGlyHisGln975980985AAGAAAATTCTGAACAGTATCCAGGTGATGAGAGCACAGATGAACCAA3024L ysLysIleLeuAsnSerIleGlnValMetArgAlaGlnMetAsnGln99099510001005ATTCAGTCTGTGGAGGTTTGATAGCAACACGTCCTCGTGCTCCACTTC 3072IleGlnSerValGluVal1010CTTGAGGCCCTGCTCCCCTCTGCCCCTGTGTGTCTGAGCTCCAGTTCTTGAGTGTTCTGC3132GTGGATCAGAGACAGGCAGCTGCTCTGAGGATCATGGCAACAGGAAGAAATGCCCTATCA 3192TTGACAACGAGAAGTCATCAAGAGGTGAAACAATGGAAAACAATGGAAAAAGGGAACAAG3252TAAAGACAGCTATTTTGAAAACCGAAAACAAACAGTGAATTATTTTTAAATAATAATAAA3312GCAATTGCAGTCTTGAAAAGGGCTCCAAGACCAAT GGGAGTCTCCAAAGGAAGAGAATAG3372AGCAGCTTCATCTATTTCCTCTTACACAAGGGTTGCTGCAGCTGGGCCCAGACACTTCTG3432GAGTAACGAGACTTTTCAAGAAGATGAATGCAAAGAATGGTCACAAGAAGCACTTCTCTT3492TCTCACATGG GATGGCAGCTCTGGGAATGCCCGGCAGTCCTTCCTGAAAGCCCTGTTGGC3552AAATCGAAGAGGAGAGCCGAAGCTCTTTGGTGCTGTGGAACCAAGTGCATCTCAGAAATT3612GTTGGACTTCTACAAAAGCTGAAGACATTCTTTTTTTTTAAACAAGTAAACTG ATACTAG3672AAGAGGCTGTTTCCGTCAAATGAGAAGGAATCTGTAACACTGGCCCGGGGGGGGTGGGGA3732ATGGGGGAAATCAGTCCTTTTTACATCTCTTTATTTTCTCTTGTCATGGAACAGTTTTGT3792GAGTGACAGTTTCCTAAGGGTCCGTCCA TCCACCCTCCAATGGCATCATTGTTTCATACA3852TATCATATGCACAAGACTTATAGTGATGTCCTCACTCGATGCCAATGATCTTTCCCCAGA3912AGACTTCCCAAGTACAGTATGTAGTAGATTTTGATTACAAATGCTGACGTGTACCTTTAT3972TT TTCGGTTGTCGTTGTTGGGAGATTCGTCCTTTTACCTTGCTTTGTTAACACCAATTTG4032TGAGTTTGGGGTTGGAATTTTTTTGGTCGATTGGGGTTGTTTTTTTTTTTTTTTTTTTTT4092AACCG 4097(2) INFORMATION FOR SEQ ID NO:12:(i) SEQUENCE CHARACTERISTICS:(A) LENGTH: 1011 amino acids(B) TYPE: amino acid(D) TOPOLOGY: linear(ii) MOLECULE TYPE: protein(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:MetProGlyProGluArgThrMetGlyProLeuTrpPheCysCy sLeu151015ProLeuAlaLeuLeuProLeuLeuAlaAlaValGluGluThrLeuMet202530Asp 
SerThrThrAlaThrAlaGluLeuGlyTrpMetValHisProPro354045SerGlyTrpGluGluValSerGlyTyrAspGluAsnMetAsnThrIle50 5560ArgThrTyrGlnValCysAsnValPheGluSerSerGlnAsnAsnTrp65707580LeuArgThrLysTyrIleArgArgA rgGlyAlaHisArgIleHisVal859095GluMetLysPheSerValArgAspCysSerSerIleProAsnValPro100105 110GlySerCysLysGluThrPheAsnLeuTyrTyrTyrGluSerAspPhe115120125AspSerAlaThrLysThrPheProAsnTrpMetGluAsnProTr pMet130135140LysValAspThrIleAlaAlaAspGluSerPheSerGlnValAspLeu145150155160GlyGly ArgValMetLysIleAsnThrGluValArgSerPheGlyPro165170175ValSerLysAsnGlyPheTyrLeuAlaPheGlnAspTyrGlyGlyCys1 80185190MetSerLeuIleAlaValArgValPheTyrArgLysCysProArgVal195200205IleGlnAsnGlyAlaValPheGlnG luThrLeuSerGlyAlaGluSer210215220ThrSerLeuValAlaAlaArgGlyThrCysIleSerAsnAlaGluGlu225230235 240ValAspValProIleLysLeuTyrCysAsnGlyAspGlyGluTrpLeu245250255ValProIleGlyArgCysMetCysArgProGlyTyrGluSe rValGlu260265270AsnGlyThrValCysArgGlyCysProSerGlyThrPheLysAlaSer275280285GlnGly AspGluGlyCysValHisCysProIleAsnSerArgThrThr290295300SerGluGlyAlaThrAsnCysValCysArgAsnGlyTyrTyrArgAla305310 315320AspAlaAspProValAspMetProCysThrThrIleProSerAlaPro325330335GlnAlaValIleSerSerValA snGluThrSerLeuMetLeuGluTrp340345350ThrProProArgAspSerGlyGlyArgGluAspLeuValTyrAsnIle355360 365IleCysLysSerCysGlySerGlyArgGlyAlaCysThrArgCysGly370375380AspAsnValGlnPheAlaProArgGlnLeuGlyLeuThrGluProArg 385390395400IleTyrIleSerAspLeuLeuAlaHisThrGlnTyrThrPheGluIle405410415Gln AlaValAsnGlyValThrAspGlnSerProPheSerProGlnPhe420425430AlaSerValAsnIleThrThrAsnGlnAlaAlaProSerAlaValSer435 440445IleMetHisGlnValSerArgThrValAspSerIleThrLeuSerTrp450455460SerGlnProAspGlnProAsnGlyValIleL euAspTyrGluLeuGln465470475480TyrTyrGluLysAsnLeuSerGluLeuAsnSerThrAlaValLysSer485490 495ProThrAsnThrValThrValGlnAsnLeuLysAlaGlyThrIleTyr500505510ValPheGlnValArgAlaArgThrValAlaGlyTyrGlyAr gTyrSer515520525GlyLysMetTyrPheGlnThrMetThrGluAlaGluTyrGlnThrSer530535540ValGlnGluLys LeuProLeuIleIleGlySerSerAlaAlaGlyLeu545550555560ValPheLeuIleAlaValValValIleIleIleValCysAsnArgArg5 
65570575ArgGlyPheGluArgAlaAspSerGluTyrThrAspLysLeuGlnHis580585590TyrThrSerGlyHisSerThrT yrArgGlyProProProGlyLeuGly595600605ValArgSerLeuPheValThrProGlyMetLysIleTyrIleAspPro610615 620PheThrTyrGluAspProAsnGluAlaValArgGluPheAlaLysGlu625630635640IleAspIleSerCysValLysIleGluGlnValIleGlyAlaGl yGlu645650655PheGlyGluValCysSerGlyHisLeuLysLeuProGlyLysArgGlu660665670Ile PheValAlaIleLysThrLeuLysSerGlyTyrThrGluLysGln675680685ArgArgAspPheLeuSerGluAlaSerIleMetGlyGlnPheAspHis690 695700ProAsnValIleHisLeuGluGlyValValThrLysSerSerProVal705710715720MetIleIleThrGluPheMetGluA snGlySerLeuAspSerPheLeu725730735ArgGlnAsnAspGlyGlnPheThrValIleGlnLeuValGlyMetLeu740745 750ArgGlyIleAlaAlaGlyMetLysTyrLeuAlaAspMetAsnTyrVal755760765HisArgAspLeuAlaAlaArgAsnIleLeuValAsnSerAsnLe uVal770775780CysLysValSerAspPheGlyLeuSerArgPheLeuGluAspAspThr785790795800SerAsp ProThrTyrThrSerAlaLeuGlyGlyLysIleProIleArg805810815TrpThrAlaProGluAlaIleGlnTyrArgLysPheThrSerAlaSer8 20825830AspValTrpSerTyrGlyIleValMetTrpGluValMetSerTyrGly835840845GluArgProTyrTrpAspMetThrA snGlnAspValIleAsnAlaIle850855860GluGlnAspTyrArgLeuProProProMetAspCysProAsnAlaLeu865870875 880HisGlnLeuMetLeuAspCysTrpGlnLysAspArgAsnHisArgPro885890895LysPheGlyGlnIleValAsnThrLeuAspLysMetIleAr gAsnPro900905910AsnSerLeuLysAlaMetAlaProLeuSerSerGlyValAsnLeuPro915920925LeuLeu AspArgThrIleProAspTyrThrSerPheAsnThrValAsp930935940GluTrpLeuAspAlaIleLysMetSerGlnTyrLysGluSerPheAla945950 955960SerAlaGlyPheThrThrPheAspIleValSerGlnMetThrValGlu965970975AspIleLeuArgValGlyValT hrLeuAlaGlyHisGlnLysLysIle980985990LeuAsnSerIleGlnValMetArgAlaGlnMetAsnGlnIleGlnSer9951000 1005ValGluVal1010(2) INFORMATION FOR SEQ ID NO:13:(i) SEQUENCE CHARACTERISTICS:(A) LENGTH: 3591 base pairs(B) TYPE: nucleic acid(C) STRANDEDNESS: both(D) TOPOLOGY: linear(ix) FEATURE:(A) NAME/KEY: CDS(B) LOCATION: 2..2965( xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:CGGGGTCTCCTCGAGGGCGCGGCGGCCGCCGGGCAGCAGCAGGAGC46GlyValSerSerArgAlaArgArgProProGlySerSerArgSer1510 
15AGCAGGAGGGGGGTGACCTCGGAGCTGGCATGGACAACCCATCCGGAG94SerArgArgGlyValThrSerGluLeuAlaTrpThrThrHisProGlu2025 30ACGGGGTGGGAAGAGGTCAGTGGTTACGACGAGGCTATGAACCCCATC142ThrGlyTrpGluGluValSerGlyTyrAspGluAlaMetAsnProIle3540 45CGCACATACCAGGTGTGCAACGTGCGGGAGGCCAACCAGAACAACTGG190ArgThrTyrGlnValCysAsnValArgGluAlaAsnGlnAsnAsnTrp5055 60CTTCGCACCAAGTTCATTCAGCGCCAGGACGTCCAGCGTGTCTACGTG238LeuArgThrLysPheIleGlnArgGlnAspValGlnArgValTyrVal6570 75GAGCTGAAATTCACTGTGCGGGACTGCAACAGCATCCCCAACATCCCT286GluLeuLysPheThrValArgAspCysAsnSerIleProAsnIlePro808590 95GGTTCCTGCAAAGAGACCTTCAACCTCTTCTATTATGAGTCAGATACG334GlySerCysLysGluThrPheAsnLeuPheTyrTyrGluSerAspThr100105 110GATTCTGCCTCTGCCAATAGCCCTTTCTGGATGGAGAACCCCTATATC382AspSerAlaSerAlaAsnSerProPheTrpMetGluAsnProTyrIle115120 125AAAGTGGATACAATTGCTCCGGATGAGAGCTTCTCCAAACTGGAGTCC430LysValAspThrIleAlaProAspGluSerPheSerLysLeuGluSer130135 140GGCCGTGTGAACACCAAGGTGCGCAGCTTTGGGCCGCTCTCCAAGAAT478GlyArgValAsnThrLysValArgSerPheGlyProLeuSerLysAsn145150 155GGCTTTTATCTGGCTTTCCAGGACCTGGGGGCCTGCATGTCCCTTATC526GlyPheTyrLeuAlaPheGlnAspLeuGlyAlaCysMetSerLeuIle160165170 175TCCGTCCGGGCTTTCTACAAGAAATGTTCCAACACCATCGCTGGCTTT574SerValArgAlaPheTyrLysLysCysSerAsnThrIleAlaGlyPhe180185 190GCTATCTTCCCGGAGACCCTAACGGGGGCTGAGCCCACGTCGCTGGTC622AlaIlePheProGluThrLeuThrGlyAlaGluProThrSerLeuVal195200 205ATTGCGCCGGGCACCTGCATCCCCAACGCAGTGGAAGTGTCTGTGCCC670IleAlaProGlyThrCysIleProAsnAlaValGluValSerValPro210215 220CTGAAGCTGTACTGCAACGGTGATGGCGAGTGGATGGTGCCTGTGGGA718LeuLysLeuTyrCysAsnGlyAspGlyGluTrpMetValProValGly225230 235GCGTGCACGTGTGCTGCTGGGTACGAGCCAGCCATGAAGGATACCCAG766AlaCysThrCysAlaAlaGlyTyrGluProAlaMetLysAspThrGln240245250 255TGCCAAGCATGCGGCCCGGGGACGTTCAAATCCAAGCAGGGCGAGGGC814CysGlnAlaCysGlyProGlyThrPheLysSerLysGlnGlyGluGly260265 270CCCTGCTCCCCCTGCCCTCCCAACAGCCGCACCACCGCGGGGGCAGCC862ProCysSerProCysProProAsnSerArgThrThrAlaGlyAlaAla275280 285ACAGTCTGCATATGTCGCAGCGGCTTCTTCCGAGCAGACGCGGACCCC910ThrValCysIleCysArgSerGlyPhePheArgAlaAspAlaAspPro290295 
300GCAGACAGCGCCTGCACCAGTGTGCCCTCAGCCCCACGCAGCGTCATC958AlaAspSerAlaCysThrSerValProSerAlaProArgSerValIle305310 315TCCAACGTGAATGAGACGTCGTTGGTGCTGGAGTGGAGCGAGCCGCAG1006SerAsnValAsnGluThrSerLeuValLeuGluTrpSerGluProGln320325330 335GACGCGGGCGGGCGGGATGACCTGCTCTACAACGTCATCTGCAAGAAG1054AspAlaGlyGlyArgAspAspLeuLeuTyrAsnValIleCysLysLys340345 350TGCAGCGTGGAGCGGCGGCTGTGCAGCCGCTGCGACGACAACGTGGAG1102CysSerValGluArgArgLeuCysSerArgCysAspAspAsnValGlu355360 365TTCGTGCCGCGCCAGCTGGGCCTCACTGGCCTCACTGAGCGACGCATC1150PheValProArgGlnLeuGlyLeuThrGlyLeuThrGluArgArgIle370375 380TACATCAGCAAGGTGATGGCCCACCCCCAGTACACCTTCGAGATCCAG1198TyrIleSerLysValMetAlaHisProGlnTyrThrPheGluIleGln385390 395GCGGTGAATGGCATCTCCAGCAAGAGCCCCTACCCTCCCCATTTTGCC1246AlaValAsnGlyIleSerSerLysSerProTyrProProHisPheAla400405410 415TCCGTCAACATCACGACCAACCAGGCAGCCCCATCTGCCGTGCCCACC1294SerValAsnIleThrThrAsnGlnAlaAlaProSerAlaValProThr420425 430ATGCATCTGCACAGCAGCACCGGGAACAGCATGACACTGTCATGGACT1342MetHisLeuHisSerSerThrGlyAsnSerMetThrLeuSerTrpThr435440 445CCCCCGGAAAGGCCCAACGGCATCATTCTCGACTATGAAATCAAGTAC1390ProProGluArgProAsnGlyIleIleLeuAspTyrGluIleLysTyr450455 460TCCGAGAAGCAAGGCCAGGGTGACGGCATTGCCAACACTGTCACCAGC1438SerGluLysGlnGlyGlnGlyAspGlyIleAlaAsnThrValThrSer465470 475CAGAAGAACTCGGTGCGGCTGGACGGACTGAAGGCCAATGCTCGGTAC1486GlnLysAsnSerValArgLeuAspGlyLeuLysAlaAsnAlaArgTyr480485490 495ATGGTGCAGGTCCGGGCGCGCACAGTGGCTGGATACGGCCGCTACAGC1534MetValGlnValArgAlaArgThrValAlaGlyTyrGlyArgTyrSer500505 510CTCCCCACCGAGTTCCAGACGACTGCGGAGGATGGCTCCACCAGCAAG1582LeuProThrGluPheGlnThrThrAlaGluAspGlySerThrSerLys515520 525ACTTTCCAGGAGCTTCCTCTCATCGTGGGTTCAGCCACCGCGGGACTG1630ThrPheGlnGluLeuProLeuIleValGlySerAlaThrAlaGlyLeu530535 540CTGTTTGTCATCGTGGTGGTCATCATCGCTATTGTCTGCTTCAGGAAA1678LeuPheValIleValValValIleIleAlaIleValCysPheArgLys545550 555GGGATGGTTACTGAACAACTCCTCTCGTCTCCTTTGGGCAGGAAGCAG1726GlyMetValThrGluGlnLeuLeuSerSerProLeuGlyArgLysGln560565570 575CGCAACAGCACAGATCCCGAGTACACAGAGAAGCTGCAGCAATATGTC1774ArgAsnSerThrAspProGluTyrThrGluLysLeuGlnGlnTyrVal580585 
590ACTCCTGGGATGAAGGTCTACATTGACCCCTTCACCTATGAAGACCCA1822ThrProGlyMetLysValTyrIleAspProPheThrTyrGluAspPro595600 605AATGAAGCTGTCCGGGAATTCGCCAAAGAGATTGATATCTCCTGTGTC1870AsnGluAlaValArgGluPheAlaLysGluIleAspIleSerCysVal610615 620AAAATTGAGGAGGTCATTGGAGCAGGAGAGTTTGGTGAGGTGTGCCGT1918LysIleGluGluValIleGlyAlaGlyGluPheGlyGluValCysArg625630 635GGGCGCCTGAAGCTGCCTGGCCGCCGTGAGATCTTTGTGGCCATCAAG1966GlyArgLeuLysLeuProGlyArgArgGluIlePheValAlaIleLys640645650 655ACACTGAAGGTGGGCTACACAGAGAGGCAGCGGCGGGACTTCCTGAGT2014ThrLeuLysValGlyTyrThrGluArgGlnArgArgAspPheLeuSer660665 670GAGGCCAGCATCATGGGCCAGTTCGACCACCCCAACATCATCCACCTG2062GluAlaSerIleMetGlyGlnPheAspHisProAsnIleIleHisLeu675680 685GAGGGCGTGGTGACCAAGAGCCGCCCTGTCATGATCATCACAGAGTTC2110GluGlyValValThrLysSerArgProValMetIleIleThrGluPhe690695 700ATGGAGAACTGCGCTCTCGACTCCTTCCTCCGGCTGAATGATGGGCAG2158MetGluAsnCysAlaLeuAspSerPheLeuArgLeuAsnAspGlyGln705710 715TTCACGGTCATCCAGCTGGTGGGGATGCTGCGAGGCATCGCTGCTGGC2206PheThrValIleGlnLeuValGlyMetLeuArgGlyIleAlaAlaGly720725730 735ATGAAGTACCTCTCAGAGATGAACTACGTGCACCGAGACCTGGCTGCC2254MetLysTyrLeuSerGluMetAsnTyrValHisArgAspLeuAlaAla740745 750CGCAACATCCTGGTCAACAGCAACTTGGTCTGCAAAGTGTCTGACTTC2302ArgAsnIleLeuValAsnSerAsnLeuValCysLysValSerAspPhe755760 765GGGCTCTCCCGCTTTTTGGAGGATGATCCAGCCGACCCCACCTACACC2350GlyLeuSerArgPheLeuGluAspAspProAlaAspProThrTyrThr770775 780AGCTCCCTGGGAGGCAAGATCCCCATCAGGTGGACAGCTCCTGAGGCC2398SerSerLeuGlyGlyLysIleProIleArgTrpThrAlaProGluAla785790 795ATCGCCTACCGCAAATTCACGTCGGCCAGCGACGTGTGGAGCTACGGC2446IleAlaTyrArgLysPheThrSerAlaSerAspValTrpSerTyrGly800805810 815ATCGTCATGTGGGAAGTGATGTCCTACGGGGAGCGACCCTACTGGGAC2494IleValMetTrpGluValMetSerTyrGlyGluArgProTyrTrpAsp820825 830ATGTCCAACCAGGATGTGATCAACGCGGTGGAGCAGGATTACCGCCTG2542MetSerAsnGlnAspValIleAsnAlaValGluGlnAspTyrArgLeu835840 845CCACCCCCCATGGACTGCCCCACAGCACTGCACCAGCTGATGCTGGAC2590ProProProMetAspCysProThrAlaLeuHisGlnLeuMetLeuAsp850855 860TGCTGGGTGCGGGACCGCAACCTGCGGCCCAAGTTTGCACAGATTGTC2638CysTrpValArgAspArgAsnLeuArgProLysPheAlaGlnIleVal865870 
875AACACGCTGGACAAGCTGATCCGCAATGCTGCCAGCCTGAAGGTCATC2686AsnThrLeuAspLysLeuIleArgAsnAlaAlaSerLeuLysValIle880885890 895GCCAGCGTCCAGTCCGGTGTCTCCCAGCCGCTCCTGGACCGCACCGTG2734AlaSerValGlnSerGlyValSerGlnProLeuLeuAspArgThrVal900905 910CCCGATTACACCACCTTCACCACCGTGGGAGACTGGCTGGATGCCATC2782ProAspTyrThrThrPheThrThrValGlyAspTrpLeuAspAlaIle915920 925AAAATGGGACGGTACAAGGAGAACTTCGTCAACGCCGGCTTCGCCTCC2830LysMetGlyArgTyrLysGluAsnPheValAsnAlaGlyPheAlaSer930935 940TTTGACCTGGTGGCACAGATGACAGCAGAGGACCTGCTAAGGATAGGA2878PheAspLeuValAlaGlnMetThrAlaGluAspLeuLeuArgIleGly945950 955GTGACGCTAGCAGGGCACCAGAAGAAGATCCTGAGCAGCATTCAGGAC2926ValThrLeuAlaGlyHisGlnLysLysIleLeuSerSerIleGlnAsp960965970 975ATGAGGCTGCAGATGAACCAGACGCTGCCGGTTCAGGTTTGACCGCAGG2975MetArgLeuGlnMetAsnGlnThrLeuProValGlnVal980985GACTCTGCATT GGAACGGACTGAGGGAACCTGCCAACCAGGTTCTGTTTGCGGTGCAGCC3035CGGCTTCCCGATTTCCCCTTCCCGTGGCGCTCCTCTGCCTCGGACGCTCGCCGGGGACAG3095GCTGGGCCGGGCCACCCTTCCCTGGATCAGAGGCACTCGTGCCGGGAGGGAGCCC GGCTT3155TTCGTCCCGTGTCCCGCAGCGGCGAGGCAGTGAACGCAGTCTTCATATTGAAGATGGATT3215ATGGGACGGAGATGGCGCATCCGCTTCCCGCCCTGTCTCAGTGCTCATCAGTTTGAAGAG3275ATGTTCTGCTTCTTGGATTTCTTTACACCC CGGTTTTCCCCCCTCGAGTCCTCACTTCCC3335CCTATCCCTGAGGCCACAGACTGTTGACCCGTCCGCTGAGTCCGTCAGACGCTCCGAAGC3395CTTCCCCGAGCCCGGTCCCCGCGTGGAGACGGCGCCAGGGACGGGGCTACGGCCCCAGAC3455AATC ACTCCACCCCTCCGCACGAGGGTCCTCACTGGGACGTGTCTGAAGGGGAAAGGCTC3515TGCTCCCTTTTTGGCTTTGCACGCCAGAACCCGAACCCCGTGAGATTTACTATGCAGGGA3575GTTAGGCAAAAAAAAG 3591(2) INFORMATION FOR SEQ ID NO:14:(i) SEQUENCE CHARACTERISTICS:(A) LENGTH: 988 amino acids(B) TYPE: amino acid(D) TOPOLOGY: linear(ii) MOLECULE TYPE: protein(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:GlyValSerSerArgAlaArgArgProProGlySerSerArgSer Ser151015ArgArgGlyValThrSerGluLeuAlaTrpThrThrHisProGluThr202530GlyTr pGluGluValSerGlyTyrAspGluAlaMetAsnProIleArg354045ThrTyrGlnValCysAsnValArgGluAlaAsnGlnAsnAsnTrpLeu50 5560ArgThrLysPheIleGlnArgGlnAspValGlnArgValTyrValGlu65707580LeuLysPheThrValArgAspCysAsn 
SerIleProAsnIleProGly859095SerCysLysGluThrPheAsnLeuPheTyrTyrGluSerAspThrAsp100105 110SerAlaSerAlaAsnSerProPheTrpMetGluAsnProTyrIleLys115120125ValAspThrIleAlaProAspGluSerPheSerLysLeuGluSer Gly130135140ArgValAsnThrLysValArgSerPheGlyProLeuSerLysAsnGly145150155160PheTyrLe uAlaPheGlnAspLeuGlyAlaCysMetSerLeuIleSer165170175ValArgAlaPheTyrLysLysCysSerAsnThrIleAlaGlyPheAla180 185190IlePheProGluThrLeuThrGlyAlaGluProThrSerLeuValIle195200205AlaProGlyThrCysIleProAsnAla ValGluValSerValProLeu210215220LysLeuTyrCysAsnGlyAspGlyGluTrpMetValProValGlyAla225230235 240CysThrCysAlaAlaGlyTyrGluProAlaMetLysAspThrGlnCys245250255GlnAlaCysGlyProGlyThrPheLysSerLysGlnGlyGlu GlyPro260265270CysSerProCysProProAsnSerArgThrThrAlaGlyAlaAlaThr275280285ValCysIl eCysArgSerGlyPhePheArgAlaAspAlaAspProAla290295300AspSerAlaCysThrSerValProSerAlaProArgSerValIleSer305310 315320AsnValAsnGluThrSerLeuValLeuGluTrpSerGluProGlnAsp325330335AlaGlyGlyArgAspAspLeuLeu TyrAsnValIleCysLysLysCys340345350SerValGluArgArgLeuCysSerArgCysAspAspAsnValGluPhe355360 365ValProArgGlnLeuGlyLeuThrGlyLeuThrGluArgArgIleTyr370375380IleSerLysValMetAlaHisProGlnTyrThrPheGluIleGlnAla38 5390395400ValAsnGlyIleSerSerLysSerProTyrProProHisPheAlaSer405410415ValAs nIleThrThrAsnGlnAlaAlaProSerAlaValProThrMet420425430HisLeuHisSerSerThrGlyAsnSerMetThrLeuSerTrpThrPro435 440445ProGluArgProAsnGlyIleIleLeuAspTyrGluIleLysTyrSer450455460GluLysGlnGlyGlnGlyAspGlyIleAlaAsn ThrValThrSerGln465470475480LysAsnSerValArgLeuAspGlyLeuLysAlaAsnAlaArgTyrMet485490 495ValGlnValArgAlaArgThrValAlaGlyTyrGlyArgTyrSerLeu500505510ProThrGluPheGlnThrThrAlaGluAspGlySerThrSer LysThr515520525PheGlnGluLeuProLeuIleValGlySerAlaThrAlaGlyLeuLeu530535540PheValIleValVa lValIleIleAlaIleValCysPheArgLysGly545550555560MetValThrGluGlnLeuLeuSerSerProLeuGlyArgLysGlnArg565 570575AsnSerThrAspProGluTyrThrGluLysLeuGlnGlnTyrValThr580585590ProGlyMetLysValTyrIleAsp ProPheThrTyrGluAspProAsn595600605GluAlaValArgGluPheAlaLysGluIleAspIleSerCysValLys610615 
620IleGluGluValIleGlyAlaGlyGluPheGlyGluValCysArgGly625630635640ArgLeuLysLeuProGlyArgArgGluIlePheValAlaIleLys Thr645650655LeuLysValGlyTyrThrGluArgGlnArgArgAspPheLeuSerGlu660665670AlaSe rIleMetGlyGlnPheAspHisProAsnIleIleHisLeuGlu675680685GlyValValThrLysSerArgProValMetIleIleThrGluPheMet690 695700GluAsnCysAlaLeuAspSerPheLeuArgLeuAsnAspGlyGlnPhe705710715720ThrValIleGlnLeuValGlyMetLeu ArgGlyIleAlaAlaGlyMet725730735LysTyrLeuSerGluMetAsnTyrValHisArgAspLeuAlaAlaArg740745 750AsnIleLeuValAsnSerAsnLeuValCysLysValSerAspPheGly755760765LeuSerArgPheLeuGluAspAspProAlaAspProThrTyrThr Ser770775780SerLeuGlyGlyLysIleProIleArgTrpThrAlaProGluAlaIle785790795800AlaTyrAr gLysPheThrSerAlaSerAspValTrpSerTyrGlyIle805810815ValMetTrpGluValMetSerTyrGlyGluArgProTyrTrpAspMet820 825830SerAsnGlnAspValIleAsnAlaValGluGlnAspTyrArgLeuPro835840845ProProMetAspCysProThrAlaLeu HisGlnLeuMetLeuAspCys850855860TrpValArgAspArgAsnLeuArgProLysPheAlaGlnIleValAsn865870875 880ThrLeuAspLysLeuIleArgAsnAlaAlaSerLeuLysValIleAla885890895SerValGlnSerGlyValSerGlnProLeuLeuAspArgThr ValPro900905910AspTyrThrThrPheThrThrValGlyAspTrpLeuAspAlaIleLys915920925MetGlyAr gTyrLysGluAsnPheValAsnAlaGlyPheAlaSerPhe930935940AspLeuValAlaGlnMetThrAlaGluAspLeuLeuArgIleGlyVal945950 955960ThrLeuAlaGlyHisGlnLysLysIleLeuSerSerIleGlnAspMet965970975ArgLeuGlnMetAsnGlnThrLeu ProValGlnVal980985(2) INFORMATION FOR SEQ ID NO:15:(i) SEQUENCE CHARACTERISTICS:(A) LENGTH: 3254 base pairs(B) TYPE: nucleic acid(C) STRANDEDNESS: both(D) TOPOLOGY: linear(ix) FEATURE:(A) NAME/KEY: CDS (B) LOCATION: 32..2980(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:CGCTCTGCTCGCGCGCTGCTGCCCCGCCGACATGGACCGCCGCCGCCTGCCG52MetAspArgArgArgLeuPro 15CTGCTGCTGCTCTGCGCTGCCCTCGGCTCCGCCGGGCGTCTGAGCGCC100LeuLeuLeuLeuCysAlaAlaLeuGlySerAlaGlyArgLeuSerAla1015 20CGCCCCGGCAACGAAGTTAATCTGCTGGATTCAAAAACAATTCAAGGG148ArgProGlyAsnGluValAsnLeuLeuAspSerLysThrIleGlnGly2530 35GAGCTGGGCTGGATCTCCTACCCATCACATGGGTGGGAAGAGATTAGT196GluLeuGlyTrpIleSerTyrProSerHisGlyTrpGluGluIleSer404550 
55GGTGTTGATGAGCATTATACTCCAATCAGAACTTACCAAGAGAGCAAT244GlyValAspGluHisTyrThrProIleArgThrTyrGlnGluSerAsn6065 70GTTATGGATCACAGTCAAAACAATTGGCTGCGAACAAACTGGATTCCA292ValMetAspHisSerGlnAsnAsnTrpLeuArgThrAsnTrpIlePro7580 85CGCAATTCAGCGCAGAAGATATATGTGGAGCTCAAGTTTACCTTGAGG340ArgAsnSerAlaGlnLysIleTyrValGluLeuLysPheThrLeuArg9095 100GACTGCAATAGTATCCCTCTAGTTCTGGGCACTTGCAAAGAGACTTTC388AspCysAsnSerIleProLeuValLeuGlyThrCysLysGluThrPhe105110 115AATCTGTATTACATGGAATCCGATGATGACCATTTGGCAAAGTTCAGA436AsnLeuTyrTyrMetGluSerAspAspAspHisLeuAlaLysPheArg120125130 135GAGCACCAATTTACGAAGATTGACACCATGGCGGCTGATGAGAGCTTC484GluHisGlnPheThrLysIleAspThrMetAlaAlaAspGluSerPhe140145 150ACCCAGATGGATCTTGGGGACCGGATTCTCAAGCTGAATACCGAAGTC532ThrGlnMetAspLeuGlyAspArgIleLeuLysLeuAsnThrGluVal155160 165CGCGAGGTGGGACCTGTTAGTAAGAAGGGCTTTTACTTGGCTTTCCAA580ArgGluValGlyProValSerLysLysGlyPheTyrLeuAlaPheGln170175 180GATGTAGGTGCATGTGTTGCCTTAGTCTCGGTGCGAGTGTACTTCAAG628AspValGlyAlaCysValAlaLeuValSerValArgValTyrPheLys185190 195AAGTGCCCTTTCACTGTCAAGAACCTCGCCATGTTTCCAGATACAGTT676LysCysProPheThrValLysAsnLeuAlaMetPheProAspThrVal200205210 215CCTATGGACTCCCAGTCCCTGGTGGAGGTGCGGGGTTCTTGTGTCAAT724ProMetAspSerGlnSerLeuValGluValArgGlySerCysValAsn220225 230CATTCCAAGGAGGAAGAGCCACCCAAGATGTACTGCAGCACGGAAGGA772HisSerLysGluGluGluProProLysMetTyrCysSerThrGluGly235240 245GAATGGCTAGTGCCCATAGGGAAGTGCTTGTGTAATGCTGGCTATGAA820GluTrpLeuValProIleGlyLysCysLeuCysAsnAlaGlyTyrGlu250255 260GAGAGAGGCTTTGCGTGCCAAGCTTGTCGACCTGGGTTCTATAAAGCT868GluArgGlyPheAlaCysGlnAlaCysArgProGlyPheTyrLysAla265270 275TCTGCTGGCAATGTGAAGTGTGCCAAATGCCCACCTCACAGCTCTACC916SerAlaGlyAsnValLysCysAlaLysCysProProHisSerSerThr280285290 295TATGAAGATGCATCTCTGAACTGCAGGTGTGAAAAGAATTACTTTCGC964TyrGluAspAlaSerLeuAsnCysArgCysGluLysAsnTyrPheArg300305 310TCTGAGAAAGACCCTCCATCCATGGCTTGCACCAGACCACCATCTGCT1012SerGluLysAspProProSerMetAlaCysThrArgProProSerAla315320 325CCAAGAAACGTTATTTCTAACATCAATGAGACATCTGTTATTCTGGAC1060ProArgAsnValIleSerAsnIleAsnGluThrSerValIleLeuAsp330335 
340TGGAGCTGGCCTCTTGATACAGGAGGTCGAAAAGATGTCACTTTCAAC1108TrpSerTrpProLeuAspThrGlyGlyArgLysAspValThrPheAsn345350 355ATCATTTGCAAAAAATGTGGAGGAAGCAGCAAGATATGTGAGCCTTGC1156IleIleCysLysLysCysGlyGlySerSerLysIleCysGluProCys360365370 375AGTGACAACGTACGGTTCTTACCCCGTCAGACTGGCCTCACCAACACC1204SerAspAsnValArgPheLeuProArgGlnThrGlyLeuThrAsnThr380385 390ACGGTGACAGTAGTGGACCTTTTGGCACATACCAATTACACTTTTGAG1252ThrValThrValValAspLeuLeuAlaHisThrAsnTyrThrPheGlu395400 405ATTGATGCAGTCAACGGGGTATCTGACTTGAGTACACTTTCGAGACAA1300IleAspAlaValAsnGlyValSerAspLeuSerThrLeuSerArgGln410415 420TTTGCTGCTGTCAGCATCACGACTAATCAGGCTGCGCCATCCCCCATC1348PheAlaAlaValSerIleThrThrAsnGlnAlaAlaProSerProIle425430 435ACAGTGATAAGGAACGACCGGACATCCAGGAACAGCGTGTCTCTGTCT1396ThrValIleArgAsnAspArgThrSerArgAsnSerValSerLeuSer440445450 455TGGCAGGAGCCTGAGCACCCAAATGGAATCATCTTGGACTACGAGGTC1444TrpGlnGluProGluHisProAsnGlyIleIleLeuAspTyrGluVal460465 470AAATACTACGAAAAGCAGGAACAAGAGACAAGCTATACTATTCTGAGA1492LysTyrTyrGluLysGlnGluGlnGluThrSerTyrThrIleLeuArg475480 485GCCAAAAGCACTAACGTTACTATCAGCGGCCTCAAACCTGATACCACC1540AlaLysSerThrAsnValThrIleSerGlyLeuLysProAspThrThr490495 500TACGTCTTCCAAATTCGAGCCCGAACTGCAGCTAGATATGGGACAAGC1588TyrValPheGlnIleArgAlaArgThrAlaAlaArgTyrGlyThrSer505510 515AGCCGCAAGTTTGAATTTGAAACCAGTCCAGATTCATTCTCCATTTCC1636SerArgLysPheGluPheGluThrSerProAspSerPheSerIleSer520525530 535AGTGAAAATAGCCAGGTCGTTATGATTGCCATTTCAGCTGCAGTTGCC1684SerGluAsnSerGlnValValMetIleAlaIleSerAlaAlaValAla540545 550ATCATTCTCCTCACGGTTGTTGTGTACGTCTTGATTGGGAGATTCTGC1732IleIleLeuLeuThrValValValTyrValLeuIleGlyArgPheCys555560 565GGATACAAGAAGTCTAAACATGGTACCGATGAGAAAAGACTACATTTT1780GlyTyrLysLysSerLysHisGlyThrAspGluLysArgLeuHisPhe570575 580GGGAATGGCCACTTAAAACTCCCAGGCCTGAGAACTTATGTAGATCCA1828GlyAsnGlyHisLeuLysLeuProGlyLeuArgThrTyrValAspPro585590 595CATACGTACGAAGATCCCAATCAAGCTGTACATGAATTTGCCAAGGAA1876HisThrTyrGluAspProAsnGlnAlaValHisGluPheAlaLysGlu600605610 615CTAGATGCTTCTAATATATCAATTGATAAAGTTGTTGGAGCAGGGGAA1924LeuAspAlaSerAsnIleSerIleAspLysValValGlyAlaGlyGlu620625 
630TTTGGAGAAGTGTGCAGTGGGCGCCTGAAGCTGCCTTCTAAAAAGGAA1972PheGlyGluValCysSerGlyArgLeuLysLeuProSerLysLysGlu635640 645ATTTCAGTGGCCATCAAAACTCTGAAAGCTGGCTACACAGAAAAACAG2020IleSerValAlaIleLysThrLeuLysAlaGlyTyrThrGluLysGln650655 660AGAAGGGATTTCCTGGGAGAAGCAAGCATCATGGGGCAGTTTGACCAC2068ArgArgAspPheLeuGlyGluAlaSerIleMetGlyGlnPheAspHis665670 675CCCAACATCATCCGACTGGAGGGCGTTGTGACTAAAAGTAAACCAGTT2116ProAsnIleIleArgLeuGluGlyValValThrLysSerLysProVal680685690 695ATGATTGTTACTGAATACATGGAAAACGGTTCCTTGGACAGCTTCCTA2164MetIleValThrGluTyrMetGluAsnGlySerLeuAspSerPheLeu700705 710CGGAAACATGATGCCCAGTTCACAGTCATTCAGCTAGTAGGCATGCTT2212ArgLysHisAspAlaGlnPheThrValIleGlnLeuValGlyMetLeu715720 725CGTGGGATCGCATCTGGCATGAAATATTTGTCAGATATGGGTTATGTC2260ArgGlyIleAlaSerGlyMetLysTyrLeuSerAspMetGlyTyrVal730735 740CACCGAGATCTAGCTGCTCGTAATATACTCATCAATAGTAACTTGGTG2308HisArgAspLeuAlaAlaArgAsnIleLeuIleAsnSerAsnLeuVal745750 755TGCAAAGTCTCAGATTTTGGTCTTTCTCGTGTATTGGAAGATGACCCA2356CysLysValSerAspPheGlyLeuSerArgValLeuGluAspAspPro760765770 775GAAGCTGCTTACACAACAAGGGGGGGCAAGATTCCCATCCGATGGACG2404GluAlaAlaTyrThrThrArgGlyGlyLysIleProIleArgTrpThr780785 790TCACCAGAAGCCATTGCATACCGGAAGTTCACATCAGCCAGTGATGCG2452SerProGluAlaIleAlaTyrArgLysPheThrSerAlaSerAspAla795800 805TGGAGCTATGGGATTGTCCTCTGGGAGGTGATGTCTTATGGAGAAAGG2500TrpSerTyrGlyIleValLeuTrpGluValMetSerTyrGlyGluArg810815 820CCGTACTGGGAGATGTCCTTCCAGGACGTAATTAAAGCCGTTGATGAA2548ProTyrTrpGluMetSerPheGlnAspValIleLysAlaValAspGlu825830 835GGGTATCGCTTGCCACCTCCTATGGACTGCCCAGCTGCCTTGTATCAG2596GlyTyrArgLeuProProProMetAspCysProAlaAlaLeuTyrGln840845850 855CTGATGCTGGACTGCTGGCAGAAAGACAGAAACAACAGACCCAAGTTT2644LeuMetLeuAspCysTrpGlnLysAspArgAsnAsnArgProLysPhe860865 870GAGCAGATTGTCAGCATCCTGGATAAGCTGATCCGTAATCCCAGCAGT2692GluGlnIleValSerIleLeuAspLysLeuIleArgAsnProSerSer875880 885CTGAAAATAATCACCAATGCGGCAGCAAGGCCATCAAATCTTCTCCTG2740LeuLysIleIleThrAsnAlaAlaAlaArgProSerAsnLeuLeuLeu890895 900GACCAAAGTAACATTGACATTTCAGCGTTCCGCACGGCAGGTGATTGG2788AspGlnSerAsnIleAspIleSerAlaPheArgThrAlaGlyAspTrp905910 
915CTCAATGGTTTTCGAACAGGACAGTGCAAAGGCATTTTCACGGGTGTG2836LeuAsnGlyPheArgThrGlyGlnCysLysGlyIlePheThrGlyVal920925930 935GAGTACAGCTCCTGTGATACAATAGCCAAGATTTCCACTGATGACATG2884GluTyrSerSerCysAspThrIleAlaLysIleSerThrAspAspMet940945 950AAGAAAGTTGGTGTTACAGTTGTGGGGCCTCAAAAGAAGATTGTTAGC2932LysLysValGlyValThrValValGlyProGlnLysLysIleValSer955960 965AGTATCAAAACTCTAGAAACTCATACGAAGAACAGCCCTGTTCCTGTG2980SerIleLysThrLeuGluThrHisThrLysAsnSerProValProVal970975 980TAAGGTACCAAAATGATGTTGCTGAGGACAGAAAAAAAAGAAAAGTCGCATCAAAGTGCA3040AAAGCGATGGCTGATAAACGGCACGGTTTAAAGGAGTTCTTTGCAGCAGTTTTGGAAACA3100TAATGGTTGAAATTTCAAACCCACTG AGACACTCAAATACTGAGTATAAATGCCTTAAAA3160ATAGGAGCGAACTTGTTTTCTATCTGTTAATCCTGAAGGGTGGGTGCTCTTAACTGACTG3220TTAATGCAGATAGTAAATTTCAAAAAAAAAAACG3254 (2) INFORMATION FOR SEQ ID NO:16:(i) SEQUENCE CHARACTERISTICS:(A) LENGTH: 983 amino acids(B) TYPE: amino acid(D) TOPOLOGY: linear(ii) MOLECULE TYPE: protein(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:MetAspArgArgArgLeuProLeuLeuLeuLeuCysAlaAlaLeuGly1 51015SerAlaGlyArgLeuSerAlaArgProGlyAsnGluValAsnLeuLeu202530AspSerLysThrIleGlnGlyGl uLeuGlyTrpIleSerTyrProSer354045HisGlyTrpGluGluIleSerGlyValAspGluHisTyrThrProIle5055 60ArgThrTyrGlnGluSerAsnValMetAspHisSerGlnAsnAsnTrp65707580LeuArgThrAsnTrpIleProArgAsnSerAlaGlnLysIleTyr Val859095GluLeuLysPheThrLeuArgAspCysAsnSerIleProLeuValLeu100105110GlyT hrCysLysGluThrPheAsnLeuTyrTyrMetGluSerAspAsp115120125AspHisLeuAlaLysPheArgGluHisGlnPheThrLysIleAspThr130 135140MetAlaAlaAspGluSerPheThrGlnMetAspLeuGlyAspArgIle145150155160LeuLysLeuAsnThrGluValArgGl uValGlyProValSerLysLys165170175GlyPheTyrLeuAlaPheGlnAspValGlyAlaCysValAlaLeuVal180185 190SerValArgValTyrPheLysLysCysProPheThrValLysAsnLeu195200205AlaMetPheProAspThrValProMetAspSerGlnSerLeuVal Glu210215220ValArgGlySerCysValAsnHisSerLysGluGluGluProProLys225230235240MetTyrC ysSerThrGluGlyGluTrpLeuValProIleGlyLysCys245250255LeuCysAsnAlaGlyTyrGluGluArgGlyPheAlaCysGlnAlaCys26 0265270ArgProGlyPheTyrLysAlaSerAlaGlyAsnValLysCysAlaLys275280285CysProProHisSerSerThrTyrGl 
uAspAlaSerLeuAsnCysArg290295300CysGluLysAsnTyrPheArgSerGluLysAspProProSerMetAla305310315 320CysThrArgProProSerAlaProArgAsnValIleSerAsnIleAsn325330335GluThrSerValIleLeuAspTrpSerTrpProLeuAspThr GlyGly340345350ArgLysAspValThrPheAsnIleIleCysLysLysCysGlyGlySer355360365SerLysI leCysGluProCysSerAspAsnValArgPheLeuProArg370375380GlnThrGlyLeuThrAsnThrThrValThrValValAspLeuLeuAla385390 395400HisThrAsnTyrThrPheGluIleAspAlaValAsnGlyValSerAsp405410415LeuSerThrLeuSerArgGlnPh eAlaAlaValSerIleThrThrAsn420425430GlnAlaAlaProSerProIleThrValIleArgAsnAspArgThrSer435440 445ArgAsnSerValSerLeuSerTrpGlnGluProGluHisProAsnGly450455460IleIleLeuAspTyrGluValLysTyrTyrGluLysGlnGluGlnGlu4 65470475480ThrSerTyrThrIleLeuArgAlaLysSerThrAsnValThrIleSer485490495GlyL euLysProAspThrThrTyrValPheGlnIleArgAlaArgThr500505510AlaAlaArgTyrGlyThrSerSerArgLysPheGluPheGluThrSer515 520525ProAspSerPheSerIleSerSerGluAsnSerGlnValValMetIle530535540AlaIleSerAlaAlaValAlaIleIleLeuLe uThrValValValTyr545550555560ValLeuIleGlyArgPheCysGlyTyrLysLysSerLysHisGlyThr565570 575AspGluLysArgLeuHisPheGlyAsnGlyHisLeuLysLeuProGly580585590LeuArgThrTyrValAspProHisThrTyrGluAspProAsn GlnAla595600605ValHisGluPheAlaLysGluLeuAspAlaSerAsnIleSerIleAsp610615620LysValValGlyA laGlyGluPheGlyGluValCysSerGlyArgLeu625630635640LysLeuProSerLysLysGluIleSerValAlaIleLysThrLeuLys64 5650655AlaGlyTyrThrGluLysGlnArgArgAspPheLeuGlyGluAlaSer660665670IleMetGlyGlnPheAspHisPr oAsnIleIleArgLeuGluGlyVal675680685ValThrLysSerLysProValMetIleValThrGluTyrMetGluAsn690695 700GlySerLeuAspSerPheLeuArgLysHisAspAlaGlnPheThrVal705710715720IleGlnLeuValGlyMetLeuArgGlyIleAlaSerGlyMetLys Tyr725730735LeuSerAspMetGlyTyrValHisArgAspLeuAlaAlaArgAsnIle740745750LeuI leAsnSerAsnLeuValCysLysValSerAspPheGlyLeuSer755760765ArgValLeuGluAspAspProGluAlaAlaTyrThrThrArgGlyGly770 775780LysIleProIleArgTrpThrSerProGluAlaIleAlaTyrArgLys785790795800PheThrSerAlaSerAspAlaTrpSe rTyrGlyIleValLeuTrpGlu805810815ValMetSerTyrGlyGluArgProTyrTrpGluMetSerPheGlnAsp820825 
830ValIleLysAlaValAspGluGlyTyrArgLeuProProProMetAsp835840845CysProAlaAlaLeuTyrGlnLeuMetLeuAspCysTrpGlnLys Asp850855860ArgAsnAsnArgProLysPheGluGlnIleValSerIleLeuAspLys865870875880LeuIleA rgAsnProSerSerLeuLysIleIleThrAsnAlaAlaAla885890895ArgProSerAsnLeuLeuLeuAspGlnSerAsnIleAspIleSerAla90 0905910PheArgThrAlaGlyAspTrpLeuAsnGlyPheArgThrGlyGlnCys915920925LysGlyIlePheThrGlyValGluTy rSerSerCysAspThrIleAla930935940LysIleSerThrAspAspMetLysLysValGlyValThrValValGly945950955 960ProGlnLysLysIleValSerSerIleLysThrLeuGluThrHisThr965970975LysAsnSerProValProVal980(2) INFORMATION FOR SEQ ID NO:17:(i) SEQUENCE CHARACTERISTICS:(A) LENGTH: 4049 base pairs(B) TYPE: nucleic acid(C) STRANDEDNESS: both(D) TOPOLOGY: linear(ix) FEATURE:(A) NAME/KEY: CDS(B) LOCATION: 10..2994(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:CGGCTTCTGATGCCCGGCCCGGAGCGCACC ATGGGGCCGTTGTGGTTC48MetProGlyProGluArgThrMetGlyProLeuTrpPhe1510TGCTGTTTGCCCCTCGCCCTCTTGCCTCTGCTCGC CGCCGTGGAAGAG96CysCysLeuProLeuAlaLeuLeuProLeuLeuAlaAlaValGluGlu152025ACGCTGATGGACTCCACAACGGCCACAGCAGAGCTGGGCTG GATGGTG144ThrLeuMetAspSerThrThrAlaThrAlaGluLeuGlyTrpMetVal30354045CATCCTCCCTCAGGGTGGGAAGAGGTGAGTGGATA CGATGAGAACATG192HisProProSerGlyTrpGluGluValSerGlyTyrAspGluAsnMet505560AACACCATCCGCACCTACCAGGTGTGCAACGT CTTTGAATCCAGCCAA240AsnThrIleArgThrTyrGlnValCysAsnValPheGluSerSerGln657075AACAACTGGCTGCGGACCAAGTACATCCGGAG GCGAGGAGCGCACCGC288AsnAsnTrpLeuArgThrLysTyrIleArgArgArgGlyAlaHisArg808590ATCCACGTGGAGATGAAATTCTCCGTTCGGGACTG CAGCAGCATCCCC336IleHisValGluMetLysPheSerValArgAspCysSerSerIlePro95100105AACGTCCCGGGCTCCTGTAAGGAGACTTTTAACCTCTATTA CTACGAA384AsnValProGlySerCysLysGluThrPheAsnLeuTyrTyrTyrGlu110115120125TCAGACTTTGACTCTGCCACCAAGACTTTTCCTAA CTGGATGGAAAAC432SerAspPheAspSerAlaThrLysThrPheProAsnTrpMetGluAsn130135140CCTTGGATGAAGGTAGATACAATTGCTGCCGA CGAGAGCTTCTCGCAG480ProTrpMetLysValAspThrIleAlaAlaAspGluSerPheSerGln145150155GTGGACCTTGGTGGGCGGGTGATGAAGATTAA 
CACCGAGGTGCGCAGT528ValAspLeuGlyGlyArgValMetLysIleAsnThrGluValArgSer160165170TTTGGGCCTGTCTCCAAAAACGGTTTCTACCTGGC CTTCCAGGACTAC576PheGlyProValSerLysAsnGlyPheTyrLeuAlaPheGlnAspTyr175180185GGGGGCTGCATGTCCTTGATTGCAGTCCGTGTCTTTTACCG CAAGTGT624GlyGlyCysMetSerLeuIleAlaValArgValPheTyrArgLysCys190195200205CCCCGTGTGATCCAGAACGGGGCGGTCTTCCAGGA AACCCTCTCGGGA672ProArgValIleGlnAsnGlyAlaValPheGlnGluThrLeuSerGly210215220GCGGAGAGCACATCTCTGGTGGCAGCCCGGGG GACGTGCATCAGCAAT720AlaGluSerThrSerLeuValAlaAlaArgGlyThrCysIleSerAsn225230235GCGGAGGAGGTGGATGTGCCCATCAAGCTGTA CTGCAATGGGGATGGC768AlaGluGluValAspValProIleLysLeuTyrCysAsnGlyAspGly240245250GAGTGGCTGGTGCCCATCGGCCGCTGCATGTGCAG GCCGGGCTATGAG816GluTrpLeuValProIleGlyArgCysMetCysArgProGlyTyrGlu255260265TCGGTGGAGAATGGGACCGTCTGCAGAGGCTGCCCATCAGG GACCTTC864SerValGluAsnGlyThrValCysArgGlyCysProSerGlyThrPhe270275280285AAGGCCAGCCAAGGAGATGAAGGATGTGTCCATTG TCCAATTAACAGC912LysAlaSerGlnGlyAspGluGlyCysValHisCysProIleAsnSer290295300CGGACGACTTCGGAAGGGGCCACGAACTGCGT GTGCCGAAACGGATAT960ArgThrThrSerGluGlyAlaThrAsnCysValCysArgAsnGlyTyr305310315TACCGGGCAGATGCTGACCCCGTCGACATGCC ATGCACCACCATCCCA1008TyrArgAlaAspAlaAspProValAspMetProCysThrThrIlePro320325330TCTGCCCCCCAGGCCGTGATCTCCAGCGTGAATGA AACCTCCCTGATG1056SerAlaProGlnAlaValIleSerSerValAsnGluThrSerLeuMet335340345CTGGAGTGGACCCCACCACGAGACTCAGGGGGCCGGGAGGA TCTGGTA1104LeuGluTrpThrProProArgAspSerGlyGlyArgGluAspLeuVal350355360365TACAACATCATCTGCAAGAGCTGTGGGTCAGGCCG TGGGGCGTGCACG1152TyrAsnIleIleCysLysSerCysGlySerGlyArgGlyAlaCysThr370375380CGCTGTGGGGACAACGTGCAGTTTGCCCCACG CCAGCTGGGCCTGACG1200ArgCysGlyAspAsnValGlnPheAlaProArgGlnLeuGlyLeuThr385390395GAGCCTCGCATCTACATCAGCGACCTGCTGGC CCACACGCAGTACACC1248GluProArgIleTyrIleSerAspLeuLeuAlaHisThrGlnTyrThr400405410TTTGAGATCCAGGCTGTGAATGGGGTCACCGACCA GAGCCCCTTCTCC1296PheGluIleGlnAlaValAsnGlyValThrAspGlnSerProPheSer415420425CCACAGTTTGCATCAGTGAATATCACCACCAACCAGGCTGC TCCTTCA1344ProGlnPheAlaSerValAsnIleThrThrAsnGlnAlaAlaProSer430435440445GCCGTGTCCATAATGCACCAGGTCAGCCGCACTGT 
GGACAGCATTACC1392AlaValSerIleMetHisGlnValSerArgThrValAspSerIleThr450455460CTCTCGTGGTCTCAACCTGACCAGCCCAATGG AGTCATCCTGGATTAT1440LeuSerTrpSerGlnProAspGlnProAsnGlyValIleLeuAspTyr465470475GAGCTGCAATACTATGAGAAGAACCTGAGTGA GTTAAATTCAACAGCA1488GluLeuGlnTyrTyrGluLysAsnLeuSerGluLeuAsnSerThrAla480485490GTGAAGAGCCCCACCAACACTGTGACAGTGCAAAA CCTCAAAGCTGGC1536ValLysSerProThrAsnThrValThrValGlnAsnLeuLysAlaGly495500505ACCATCTATGTCTTCCAAGTGCGAGCACGTACCGTGGCTGG GTATGGC1584ThrIleTyrValPheGlnValArgAlaArgThrValAlaGlyTyrGly510515520525CGGTATAGTGGCAAGATGTACTTCCAGACCATGAC TGAAGCCGAGTAC1632ArgTyrSerGlyLysMetTyrPheGlnThrMetThrGluAlaGluTyr530535540CAGACCAGTGTCCAGGAGAAGCTGCCACTCAT CATTGGCTCCTCTGCA1680GlnThrSerValGlnGluLysLeuProLeuIleIleGlySerSerAla545550555GCAGGACTGGTGTTTCTCATTGCTGTTGTCGT CATCATTATTGTCTGC1728AlaGlyLeuValPheLeuIleAlaValValValIleIleIleValCys560565570AACAGAAGACGGGGCTTTGAACGTGCTGACTCTGA GTACACTGACAAG1776AsnArgArgArgGlyPheGluArgAlaAspSerGluTyrThrAspLys575580585CTGCAGCACTATACCAGTGGCCACATGACTCCAGGGATGAA GATTTAT1824LeuGlnHisTyrThrSerGlyHisMetThrProGlyMetLysIleTyr590595600605ATCGATCCATTTACCTACGAAGATCCCAATGAGGC TGTCAGGGAATTT1872IleAspProPheThrTyrGluAspProAsnGluAlaValArgGluPhe610615620GCAAAAGAAATTGATATCTCCTGTGTGAAAAT CGAGCAGGTGATTGGG1920AlaLysGluIleAspIleSerCysValLysIleGluGlnValIleGly625630635GCAGGGGAGTTTGGTGAGGTGTGCAGTGGGCA TCTCAAGCTTCCTGGC1968AlaGlyGluPheGlyGluValCysSerGlyHisLeuLysLeuProGly640645650AAAAGAGAGATCTTTGTGGCCATCAAGACCCTGAA GTCTGGTTACACA2016LysArgGluIlePheValAlaIleLysThrLeuLysSerGlyTyrThr655660665GAGAAGCAGAGACGGGACTTCCTGAGTGAAGCCAGCATCAT GGGGCAG2064GluLysGlnArgArgAspPheLeuSerGluAlaSerIleMetGlyGln670675680685TTTGACCACCCCAATGTCATCCACCTGGAAGGGGT GGTGACCAAGAGT2112PheAspHisProAsnValIleHisLeuGluGlyValValThrLysSer690695700TCCCCAGTCATGATCATTACAGAGTTCATGGA GAATGGCTCGTTGGAC2160SerProValMetIleIleThrGluPheMetGluAsnGlySerLeuAsp705710715TCCTTCTTGAGGCAAAATGATGGGCAGTTCAC AGTGATCCAGCTGGTG2208SerPheLeuArgGlnAsnAspGlyGlnPheThrValIleGlnLeuVal720725730GGCATGTTGCGTGGCATTGCAGCAGGCATGAAGTA 
CCTGGCTGATATG2256GlyMetLeuArgGlyIleAlaAlaGlyMetLysTyrLeuAlaAspMet735740745AACTACGTGCACCGGGACCTGGCTGCCCGCAACATCCTGGT CAACAGC2304AsnTyrValHisArgAspLeuAlaAlaArgAsnIleLeuValAsnSer750755760765AACCTGGTCTGCAAGGTGTCCGACTTCGGCCTCTC CCGTTTCCTGGAG2352AsnLeuValCysLysValSerAspPheGlyLeuSerArgPheLeuGlu770775780GATGACACCTCTGATCCCACTTACACCAGCGC ACTGGGTGGAAAGATC2400AspAspThrSerAspProThrTyrThrSerAlaLeuGlyGlyLysIle785790795CCAATACGGTGGACAGCGCCTGAGGCAATTCA GTACCGAAAATTCACA2448ProIleArgTrpThrAlaProGluAlaIleGlnTyrArgLysPheThr800805810TCAGCCAGCGATGTGTGGAGCTATGGAATAGTCAT GTGGGAGGTGATG2496SerAlaSerAspValTrpSerTyrGlyIleValMetTrpGluValMet815820825TCGTACGGCGAGCGGCCTTACTGGGACATGACCAATCAAGA TGTGATA2544SerTyrGlyGluArgProTyrTrpAspMetThrAsnGlnAspValIle830835840845AATGCTATTGAGCAGGACTATCGGCTACCACCCCC TATGGATTGTCCA2592AsnAlaIleGluGlnAspTyrArgLeuProProProMetAspCysPro850855860AATGCCCTGCACCAGCTAATGCTTGACTGCTG GCAGAAGGATCGAAAC2640AsnAlaLeuHisGlnLeuMetLeuAspCysTrpGlnLysAspArgAsn865870875CACAGACCCAAATTTGGACAGATTGTCAACAC TTTAGACAAAATGATC2688HisArgProLysPheGlyGlnIleValAsnThrLeuAspLysMetIle880885890CGAAATCCTAATAGTCTGAAAGCCATGGCACCTCT CTCCTCTGGGGTT2736ArgAsnProAsnSerLeuLysAlaMetAlaProLeuSerSerGlyVal895900905AACCTCCCTCTACTTGACCGCACAATCCCAGATTATACCAG CTTCAAC2784AsnLeuProLeuLeuAspArgThrIleProAspTyrThrSerPheAsn910915920925ACTGTGGATGAATGGCTGGATGCCATCAAGATGAG CCAGTACAAGGAG2832ThrValAspGluTrpLeuAspAlaIleLysMetSerGlnTyrLysGlu930935940AGCTTTGCCAGTGCTGGCTTCACCACCTTTGA TATAGTATCTCAGATG2880SerPheAlaSerAlaGlyPheThrThrPheAspIleValSerGlnMet945950955ACTGTAGAGGACATTCTACGAGTTGGGGTCAC TTTAGCAGGACACCAG2928ThrValGluAspIleLeuArgValGlyValThrLeuAlaGlyHisGln960965970AAGAAAATTCTGAACAGTATCCAGGTGATGAGAGC ACAGATGAACCAA2976LysLysIleLeuAsnSerIleGlnValMetArgAlaGlnMetAsnGln975980985ATTCAGTCTGTGGAGGTTTGATAGCAACACGTCCTCGTGCTCCACT TC3024IleGlnSerValGluVal990995CTTGAGGCCCTGCTCCCCTCTGCCCCTGTGTGTCTGAGCTCCAGTTCTTGAGTGTTCTGC3084GTGGATCAGAGACAGGCAGCTGCTCTGAGGATCATGGCAACAGGAA 
GAAATGCCCTATCA3144TTGACAACGAGAAGTCATCAAGAGGTGAAACAATGGAAAACAATGGAAAAAGGGAACAAG3204TAAAGACAGCTATTTTGAAAACCGAAAACAAACAGTGAATTATTTTTAAATAATAATAAA3264GCAATTGCAGTCTTGAAAAG GGCTCCAAGACCAATGGGAGTCTCCAAAGGAAGAGAATAG3324AGCAGCTTCATCTATTTCCTCTTACACAAGGGTTGCTGCAGCTGGGCCCAGACACTTCTG3384GAGTAACGAGACTTTTCAAGAAGATGAATGCAAAGAATGGTCACAAGAAGCACTTCTCTT34 44TCTCACATGGGATGGCAGCTCTGGGAATGCCCGGCAGTCCTTCCTGAAAGCCCTGTTGGC3504AAATCGAAGAGGAGAGCCGAAGCTCTTTGGTGCTGTGGAACCAAGTGCATCTCAGAAATT3564GTTGGACTTCTACAAAAGCTGAAGACATTCTTTTTTTTT AAACAAGTAAACTGATACTAG3624AAGAGGCTGTTTCCGTCAAATGAGAAGGAATCTGTAACACTGGCCCGGGGGGGGTGGGGA3684ATGGGGGAAATCAGTCCTTTTTACATCTCTTTATTTTCTCTTGTCATGGAACAGTTTTGT3744GAGTGACAGTTTC CTAAGGGTCCGTCCATCCACCCTCCAATGGCATCATTGTTTCATACA3804TATCATATGCACAAGACTTATAGTGATGTCCTCACTCGATGCCAATGATCTTTCCCCAGA3864AGACTTCCCAAGTACAGTATGTAGTAGATTTTGATTACAAATGCTGACGTGTACCTT TAT3924TTTTCGGTTGTCGTTGTTGGGAGATTCGTCCTTTTACCTTGCTTTGTTAACACCAATTTG3984TGAGTTTGGGGTTGGAATTTTTTTGGTCGATTGGGGTTGTTTTTTTTTTTTTTTTTTTTT4044AACCG 4049(2) INFORMATION FOR SEQ ID NO:18:(i) SEQUENCE CHARACTERISTICS:(A) LENGTH: 995 amino acids(B) TYPE: amino acid(D) TOPOLOGY: linear(ii) MOLECULE TYPE: protein(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:MetProGlyProGluArgThrMetGlyProLe uTrpPheCysCysLeu151015ProLeuAlaLeuLeuProLeuLeuAlaAlaValGluGluThrLeuMet2025 30AspSerThrThrAlaThrAlaGluLeuGlyTrpMetValHisProPro354045SerGlyTrpGluGluValSerGlyTyrAspGluAsnMetAsnThrIle 505560ArgThrTyrGlnValCysAsnValPheGluSerSerGlnAsnAsnTrp65707580LeuArgThrLysT yrIleArgArgArgGlyAlaHisArgIleHisVal859095GluMetLysPheSerValArgAspCysSerSerIleProAsnValPro100 105110GlySerCysLysGluThrPheAsnLeuTyrTyrTyrGluSerAspPhe115120125AspSerAlaThrLysThrPheProAsnTrpMe tGluAsnProTrpMet130135140LysValAspThrIleAlaAlaAspGluSerPheSerGlnValAspLeu145150155 160GlyGlyArgValMetLysIleAsnThrGluValArgSerPheGlyPro165170175ValSerLysAsnGlyPheTyrLeuAlaPheGlnAspTyrGlyGlyCys180185190MetSerLeuIleAlaValArgValPheTyrArgLysCysProArgVal195200205IleGlnAsnGlyA 
laValPheGlnGluThrLeuSerGlyAlaGluSer210215220ThrSerLeuValAlaAlaArgGlyThrCysIleSerAsnAlaGluGlu225230 235240ValAspValProIleLysLeuTyrCysAsnGlyAspGlyGluTrpLeu245250255ValProIleGlyArgCysMetCysArgPr oGlyTyrGluSerValGlu260265270AsnGlyThrValCysArgGlyCysProSerGlyThrPheLysAlaSer275280 285GlnGlyAspGluGlyCysValHisCysProIleAsnSerArgThrThr290295300SerGluGlyAlaThrAsnCysValCysArgAsnGlyTyrTyrArgAla305 310315320AspAlaAspProValAspMetProCysThrThrIleProSerAlaPro325330335GlnAlaValI leSerSerValAsnGluThrSerLeuMetLeuGluTrp340345350ThrProProArgAspSerGlyGlyArgGluAspLeuValTyrAsnIle355 360365IleCysLysSerCysGlySerGlyArgGlyAlaCysThrArgCysGly370375380AspAsnValGlnPheAlaProArgGlnLeuGlyLeuTh rGluProArg385390395400IleTyrIleSerAspLeuLeuAlaHisThrGlnTyrThrPheGluIle405410 415GlnAlaValAsnGlyValThrAspGlnSerProPheSerProGlnPhe420425430AlaSerValAsnIleThrThrAsnGlnAlaAlaProSerAlaValSer435440445IleMetHisGlnValSerArgThrValAspSerIleThrLeuSerTrp450455460SerGlnProAspGlnProA snGlyValIleLeuAspTyrGluLeuGln465470475480TyrTyrGluLysAsnLeuSerGluLeuAsnSerThrAlaValLysSer485 490495ProThrAsnThrValThrValGlnAsnLeuLysAlaGlyThrIleTyr500505510ValPheGlnValArgAlaArgThrValAl aGlyTyrGlyArgTyrSer515520525GlyLysMetTyrPheGlnThrMetThrGluAlaGluTyrGlnThrSer530535540 ValGlnGluLysLeuProLeuIleIleGlySerSerAlaAlaGlyLeu545550555560ValPheLeuIleAlaValValValIleIleIleValCysAsnArgArg 565570575ArgGlyPheGluArgAlaAspSerGluTyrThrAspLysLeuGlnHis580585590TyrThrSerG lyHisMetThrProGlyMetLysIleTyrIleAspPro595600605PheThrTyrGluAspProAsnGluAlaValArgGluPheAlaLysGlu61061 5620IleAspIleSerCysValLysIleGluGlnValIleGlyAlaGlyGlu625630635640PheGlyGluValCysSerGlyHisLeuLysLe uProGlyLysArgGlu645650655IlePheValAlaIleLysThrLeuLysSerGlyTyrThrGluLysGln660665 670ArgArgAspPheLeuSerGluAlaSerIleMetGlyGlnPheAspHis675680685ProAsnValIleHisLeuGluGlyValValThrLysSerSerProVal 690695700MetIleIleThrGluPheMetGluAsnGlySerLeuAspSerPheLeu705710715720ArgGlnAsnAspG lyGlnPheThrValIleGlnLeuValGlyMetLeu725730735ArgGlyIleAlaAlaGlyMetLysTyrLeuAlaAspMetAsnTyrVal740 
745750HisArgAspLeuAlaAlaArgAsnIleLeuValAsnSerAsnLeuVal755760765CysLysValSerAspPheGlyLeuSerArgPh eLeuGluAspAspThr770775780SerAspProThrTyrThrSerAlaLeuGlyGlyLysIleProIleArg785790795 800TrpThrAlaProGluAlaIleGlnTyrArgLysPheThrSerAlaSer805810815AspValTrpSerTyrGlyIleValMetTrpGluValMetSerTyrGly820825830GluArgProTyrTrpAspMetThrAsnGlnAspValIleAsnAlaIle835840845GluGlnAspTyrA rgLeuProProProMetAspCysProAsnAlaLeu850855860HisGlnLeuMetLeuAspCysTrpGlnLysAspArgAsnHisArgPro865870 875880LysPheGlyGlnIleValAsnThrLeuAspLysMetIleArgAsnPro885890895AsnSerLeuLysAlaMetAlaProLeuSe rSerGlyValAsnLeuPro900905910LeuLeuAspArgThrIleProAspTyrThrSerPheAsnThrValAsp915920 925GluTrpLeuAspAlaIleLysMetSerGlnTyrLysGluSerPheAla930935940SerAlaGlyPheThrThrPheAspIleValSerGlnMetThrValGlu945 950955960AspIleLeuArgValGlyValThrLeuAlaGlyHisGlnLysLysIle965970975LeuAsnSerI leGlnValMetArgAlaGlnMetAsnGlnIleGlnSer980985990ValGluVal995(2) INFORMATION FOR SEQ ID NO:19:(i) SEQUENCE CHARACTERISTICS:(A) LENGTH: 3125 base pairs(B) TYPE: nucleic acid(C) STRANDEDNESS: both(D) TOPOLOGY: linear(ix) FEATURE:(A) NAME/KEY: CDS(B) LOCATION: 2..2233(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:CCTCAAATTCACCCTGAGGGACTGTAACAGCCTTCCAGGAGGACTT46LeuLysPheTh rLeuArgAspCysAsnSerLeuProGlyGlyLeu151015GGGACTTGCAAGGAGACTTTTAACATGTACTACTTTGAGTCAGATGAT94GlyThrCys LysGluThrPheAsnMetTyrTyrPheGluSerAspAsp202530GAAGATGGGAGGAACATCAGAGAGAATCAGTACATCAAGATAGATACC142GluAsp GlyArgAsnIleArgGluAsnGlnTyrIleLysIleAspThr354045ATTGCTGCTGATGAGAGCTTCACGGAGTTGGACCTCGGCGACAGAGTT190IleAla AlaAspGluSerPheThrGluLeuAspLeuGlyAspArgVal505560ATGAAGTTAAACACAGAAGTGAGAGATGTTGGGCCTCTAACAAAAAAA238MetLysLeu AsnThrGluValArgAspValGlyProLeuThrLysLys657075GGATTTTACCTTGCTTTCCAGGATGTGGGCGCCTGCATTGCCCTGGTC286GlyPheTyrLeuAla PheGlnAspValGlyAlaCysIleAlaLeuVal80859095TCTGTGCGTGTGTACTACAAGAAATGCCCATCAGTGATCCGCAACCTG334SerValArg ValTyrTyrLysLysCysProSerValIleArgAsnLeu100105110GCACGCTTTCCAGATACCATCACAGGAGCAGATTCCTCGCAGCTGCTA382AlaArg 
PheProAspThrIleThrGlyAlaAspSerSerGlnLeuLeu115120125GAAGTGTCAGGCGTCTGTGTCAACCACTCAGTGACTGATGAGGCACCA430GluVal SerGlyValCysValAsnHisSerValThrAspGluAlaPro130135140AAGATGCACTGCAGTTCAGAGGGAGAATGGCTGGTGCCCATTGGGAAG478LysMetHis CysSerSerGluGlyGluTrpLeuValProIleGlyLys145150155TGTTTGTGCAAGGCAGGGTACGAGGAGAAGAACAACACCTGCCAAGCA526CysLeuCysLysAla GlyTyrGluGluLysAsnAsnThrCysGlnAla160165170175CCTTCTCCAGTCAGTAGTGTGAAAAAAGGGAAGATAACTAAAAATAGC574ProSerPro ValSerSerValLysLysGlyLysIleThrLysAsnSer180185190ATCTCCCTTTCCTGGCAGGAGCCAGATCGACCCAACGGCATCATCCTG622IleSer LeuSerTrpGlnGluProAspArgProAsnGlyIleIleLeu195200205GAATACGAAATCAAATATTTTGAAAAGGACCAGGAGACAAGCTACACC670GluTyr GluIleLysTyrPheGluLysAspGlnGluThrSerTyrThr210215220ATCATCAAATCCAAAGAGACCGCAATTACGGCAGATGGCTTGAAACCA718IleIleLys SerLysGluThrAlaIleThrAlaAspGlyLeuLysPro225230235GGCTCAGCGTACGTCTTCCAGATCCGAGCCCGGACAGCTGCTGGCTAC766GlySerAlaTyrVal PheGlnIleArgAlaArgThrAlaAlaGlyTyr240245250255GGTGGCTTCAGTCGAAGATTTGAGTTTGAAACCAGCCCAGTGTTAGCT814GlyGlyPhe SerArgArgPheGluPheGluThrSerProValLeuAla260265270GCATCCAGTGACCAGAGCCAGATTCCTATAATTGTTGTGTCTGTAACA862AlaSer SerAspGlnSerGlnIleProIleIleValValSerValThr275280285GTGGGAGTTATTCTGCTGGCTGTTGTTATCGGTTTCCTTCTCAGTGGA910ValGly ValIleLeuLeuAlaValValIleGlyPheLeuLeuSerGly290295300AGTTGCTGCGATCATGGCTGTGGGTGGGCTTCTTCTCTGCGTGCTGTT958SerCysCys AspHisGlyCysGlyTrpAlaSerSerLeuArgAlaVal305310315GCCTATCCGAGCCTAATATGGCGCTGTGGCTACAGCAAGGCTAAACAA1006AlaTyrProSerLeu IleTrpArgCysGlyTyrSerLysAlaLysGln320325330335GACCCAGAAGAAGAAAAGATGCATTTTCATAATGGCCACATTAAACTG1054AspProGlu GluGluLysMetHisPheHisAsnGlyHisIleLysLeu340345350CCTGGTGTAAGAACCTACATTGATCCCCACACCTATGAGGACCCTAAT1102ProGly ValArgThrTyrIleAspProHisThrTyrGluAspProAsn355360365CAAGCTGTCCACGAGTTTGCCAAGGAAATAGAAGCTTCGTGCATAACC1150GlnAla ValHisGluPheAlaLysGluIleGluAlaSerCysIleThr370375380ATCGAGAGAGTTATCGGAGCTGGTGAATTTGGAGAAGTCTGCAGTGGA1198IleGluArg ValIleGlyAlaGlyGluPheGlyGluValCysSerGly385390395CGGCTGAAACTGCAGGGAAAACGCGAGTTTCCAGTGGCTATCAAAACC1246ArgLeuLysLeuGln 
GlyLysArgGluPheProValAlaIleLysThr400405410415CTGAAGGTGGGCTACACAGAGAAGCAAAGGCGAGATTTCCTGGGAGAA1294LeuLysVal GlyTyrThrGluLysGlnArgArgAspPheLeuGlyGlu420425430GCGAGCATCATGGGGCAGTTCGACCACCCCAACATCATCCACCTGGAA1342AlaSer IleMetGlyGlnPheAspHisProAsnIleIleHisLeuGlu435440445GGTGTCGTCACAAAAAGCAAACCTGTAATGATAGTAACGGAATACATG1390GlyVal ValThrLysSerLysProValMetIleValThrGluTyrMet450455460GAAAATGGTTCTCTGGATACATTTTTAAAGAAGAACGATGGGCAGTTC1438GluAsnGly SerLeuAspThrPheLeuLysLysAsnAspGlyGlnPhe465470475ACGGTCATTCAGCTGGTCGGGATGCTGCGAGGCATCGCATCAGGGATG1486ThrValIleGlnLeu ValGlyMetLeuArgGlyIleAlaSerGlyMet480485490495AAGTACCTGTCTGACATGGGTTACGTACACAGAGACCTCGCTGCCAGG1534LysTyrLeu SerAspMetGlyTyrValHisArgAspLeuAlaAlaArg500505510AATATCCTCATCAACAGCAACTTAGTCTGCAAGGTGTCTGACTTTGGC1582AsnIle LeuIleAsnSerAsnLeuValCysLysValSerAspPheGly515520525CTCTCCAGAGTCCTAGAAGATGATCCTGAAGCAGCGTACACAACCAGG1630LeuSer ArgValLeuGluAspAspProGluAlaAlaTyrThrThrArg530535540GGAGGGAAGATCCCCATCCGATGGACGGCACCTGAAGCAATCGCCTTC1678GlyGlyLys IleProIleArgTrpThrAlaProGluAlaIleAlaPhe545550555CGCAAATTCACGTCGGCCAGCGATGTGTGGAGCTACGGCATTGTGATG1726ArgLysPheThrSer AlaSerAspValTrpSerTyrGlyIleValMet560565570575TGGGAAGTGATGTCCTATGGCGAGAGACCTTACTGGGAAATGACAAAC1774TrpGluVal MetSerTyrGlyGluArgProTyrTrpGluMetThrAsn580585590CAAGATGTGATTAAAGCCGTGGAGGAAGGCTATCGCCTGCCAAGTCCC1822GlnAsp ValIleLysAlaValGluGluGlyTyrArgLeuProSerPro595600605ATGGACTGCCCTGCTGCTCTCTACCAGTTGATGCTTGACTGCTGGCAG1870MetAsp CysProAlaAlaLeuTyrGlnLeuMetLeuAspCysTrpGln610615620AAAGACCGCAACAGCAGGCCCAAGTTTGATGAAATTGTCAGCATGTTG1918LysAspArg AsnSerArgProLysPheAspGluIleValSerMetLeu625630635GACAAGCTCATCCGTAACCCAAGCAGCTTGAAGACGTTGGTTAATGCA1966AspLysLeuIleArg AsnProSerSerLeuLysThrLeuValAsnAla640645650655TCGAGCAGAGTATCAAATTTGTTGGTAGAACACAGTCCAGTGGGGAGC2014SerSerArg ValSerAsnLeuLeuValGluHisSerProValGlySer660665670GGTGCCTACAGGTCAGTGGGTGAGTGGCTGGAAGCCATCAAAATGGGT2062GlyAla TyrArgSerValGlyGluTrpLeuGluAlaIleLysMetGly675680685CGATACACCGAGATTTTCATGGAGAATGGATACAGTTCGATGGATTCT2110ArgTyr 
ThrGluIlePheMetGluAsnGlyTyrSerSerMetAspSer690695700GTGGCTCAGGTGACCCTAGAGGATTTGAGGCGGCTGGGAGTGACACTT2158ValAlaGln ValThrLeuGluAspLeuArgArgLeuGlyValThrLeu705710715GTTGGTCACCAGAAGAAGATAATGAACAGCCTTCAAGAGATGAAGGTC2206ValGlyHisGlnLys LysIleMetAsnSerLeuGlnGluMetLysVal720725730735CAGTTGGTGAATGGGATGGTGCCATTGTAACTCGGTTTTTAAGTCAC2253GlnLeuVal AsnGlyMetValProLeu740TTCCTCGAGTGGTCGGTCCTGCACTTTGTATACTAGCTCTGAGATTTATTTTGACTAAAG2313AAGAAAAAAGGGAAATTCAGTGGTTTCTGTAACTGAAGGACGCTGGCTTCTGCCACAGCA2373 TTTATAAAGCAGTGTTTGACTGAAGTTTTCATTTTCTTCCTATTTGTGTCCTCATTCTCA2433TGAAGTAAATGTAACATGCATGGAACATGGAAATGGATCTACTGTACATGAGGTTACCCA2493ATTTCTTGCGCTTCAGCATGACAACAGCAAGCCTTCCCACCAC ATGTTGTCTATACATGG2553GAGATATATATATATGCATATATATATATAGCACCTTTATATACTGAATTACAGCAGCAG2613CACATGTTAATACTTCCAAGGACTTACTTGACTAGAGAAGTTTTGCAGCCATTGTGGGCT2673CACACAAGCTGCGGTTTA CTGAAGTTTACTTCAAGTCTTACTTGTCTACAGAAGTGTATT2733GAAGAGCAATATGATTAGATTATTTCTGGATAGATATTTTGTTTTGTAAATTTAAAAAAT2793CGTGTTACACAGCGTTAAGTTATAGAGACTAGTGTATAAACATGTTGCTTGCTCAATGGC 2853AAATACAATACAGGGTGTATATTTTTTTCTCTCTGTGTTGCAAAGTTCTTTTAGTTTGCT2913CTTCTGTGAGGATAATACGTTATGATGTATATACTGTACAGTTTGCTACACATCAGGTAC2973AAGATTGGGGCTTTCTCAATGTTTTGTTCTTTTTCC CTCTTTTGTTTCATTTTGTCTTCC3033TTTTGTGTTAACCACTATGCTTTGTATTTTTGCTGCTGTTTGGTTTGAGGCAACATATAA3093AGCTTTCAGGTGTTTTGATTATAAAAAAAAAG3125(2) INFORMATION FOR SEQ ID NO:20:(i) SEQUENCE CHARACTERISTICS:(A) LENGTH: 744 amino acids(B) TYPE: amino acid(D) TOPOLOGY: linear(ii) MOLECULE TYPE: protein(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:LeuLysPheThrLeuArgAspCysAsnSerLeuProGlyGlyLeuGly15 1015ThrCysLysGluThrPheAsnMetTyrTyrPheGluSerAspAspGlu202530AspGlyArgAsnIleArgGluAsnGlnTyrI leLysIleAspThrIle354045AlaAlaAspGluSerPheThrGluLeuAspLeuGlyAspArgValMet505560Lys LeuAsnThrGluValArgAspValGlyProLeuThrLysLysGly65707580PheTyrLeuAlaPheGlnAspValGlyAlaCysIleAlaLeuValSer 859095ValArgValTyrTyrLysLysCysProSerValIleArgAsnLeuAla100105110ArgPheProAsp ThrIleThrGlyAlaAspSerSerGlnLeuLeuGlu115120125ValSerGlyValCysValAsnHisSerValThrAspGluAlaProLys130135 
140MetHisCysSerSerGluGlyGluTrpLeuValProIleGlyLysCys145150155160LeuCysLysAlaGlyTyrGluGluLysAsnAsnT hrCysGlnAlaPro165170175SerProValSerSerValLysLysGlyLysIleThrLysAsnSerIle180185 190SerLeuSerTrpGlnGluProAspArgProAsnGlyIleIleLeuGlu195200205TyrGluIleLysTyrPheGluLysAspGlnGluThrSerTyrThrIle 210215220IleLysSerLysGluThrAlaIleThrAlaAspGlyLeuLysProGly225230235240SerAlaTyrValPhe GlnIleArgAlaArgThrAlaAlaGlyTyrGly245250255GlyPheSerArgArgPheGluPheGluThrSerProValLeuAlaAla260 265270SerSerAspGlnSerGlnIleProIleIleValValSerValThrVal275280285GlyValIleLeuLeuAlaValValIleGlyPheL euLeuSerGlySer290295300CysCysAspHisGlyCysGlyTrpAlaSerSerLeuArgAlaValAla30531031532 0TyrProSerLeuIleTrpArgCysGlyTyrSerLysAlaLysGlnAsp325330335ProGluGluGluLysMetHisPheHisAsnGlyHisIleLysLeuPro 340345350GlyValArgThrTyrIleAspProHisThrTyrGluAspProAsnGln355360365AlaValHisGluPhe AlaLysGluIleGluAlaSerCysIleThrIle370375380GluArgValIleGlyAlaGlyGluPheGlyGluValCysSerGlyArg385390 395400LeuLysLeuGlnGlyLysArgGluPheProValAlaIleLysThrLeu405410415LysValGlyTyrThrGluLysGlnArgArgA spPheLeuGlyGluAla420425430SerIleMetGlyGlnPheAspHisProAsnIleIleHisLeuGluGly43544044 5ValValThrLysSerLysProValMetIleValThrGluTyrMetGlu450455460AsnGlySerLeuAspThrPheLeuLysLysAsnAspGlyGlnPheThr465 470475480ValIleGlnLeuValGlyMetLeuArgGlyIleAlaSerGlyMetLys485490495TyrLeuSerAsp MetGlyTyrValHisArgAspLeuAlaAlaArgAsn500505510IleLeuIleAsnSerAsnLeuValCysLysValSerAspPheGlyLeu515 520525SerArgValLeuGluAspAspProGluAlaAlaTyrThrThrArgGly530535540GlyLysIleProIleArgTrpThrAlaProGluAlaIleA laPheArg545550555560LysPheThrSerAlaSerAspValTrpSerTyrGlyIleValMetTrp565570 575GluValMetSerTyrGlyGluArgProTyrTrpGluMetThrAsnGln580585590AspValIleLysAlaValGluGluGlyTyrArgLeuProSerProMet 595600605AspCysProAlaAlaLeuTyrGlnLeuMetLeuAspCysTrpGlnLys610615620AspArgAsnSerArgProLys PheAspGluIleValSerMetLeuAsp625630635640LysLeuIleArgAsnProSerSerLeuLysThrLeuValAsnAlaSer645 650655SerArgValSerAsnLeuLeuValGluHisSerProValGlySerGly660665670AlaTyrArgSerValGlyGluTrpLeuGluA 
laIleLysMetGlyArg675680685TyrThrGluIlePheMetGluAsnGlyTyrSerSerMetAspSerVal690695700Ala GlnValThrLeuGluAspLeuArgArgLeuGlyValThrLeuVal705710715720GlyHisGlnLysLysIleMetAsnSerLeuGlnGluMetLysValGln 725730735LeuValAsnGlyMetValProLeu740(2) INFORMATION FOR SEQ ID NO:21:(i) SEQUENCE CHARACTERISTICS:(A) LENGTH: 3056 base pairs(B) TYPE: nucleic acid(C) STRANDEDNESS: both (D) TOPOLOGY: linear(ix) FEATURE:(A) NAME/KEY: CDS(B) LOCATION: 2..2131(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:CCTCAAATTCACCCTGAGGGACTGTAACAGCCTTCCAGGAGGACTT46LeuLysPheThrLeuArgAspCysAsnSerLeu ProGlyGlyLeu151015GGGACTTGCAAGGAGACTTTTAACATGTACTACTTTGAGTCAGATGAT94GlyThrCysLysGluThrPheAsnMetTyrT yrPheGluSerAspAsp202530GAAGATGGGAGGAACATCAGAGAGAATCAGTACATCAAGATAGATACC142GluAspGlyArgAsnIleArgGluAsnG lnTyrIleLysIleAspThr354045ATTGCTGCTGATGAGAGCTTCACGGAGTTGGACCTCGGCGACAGAGTT190IleAlaAlaAspGluSerPheThrGluL euAspLeuGlyAspArgVal505560ATGAAGTTAAACACAGAAGTGAGAGATGTTGGGCCTCTAACAAAAAAA238MetLysLeuAsnThrGluValArgAspValG lyProLeuThrLysLys657075GGATTTTACCTTGCTTTCCAGGATGTGGGCGCCTGCATTGCCCTGGTC286GlyPheTyrLeuAlaPheGlnAspValGlyAlaCysI leAlaLeuVal80859095TCTGTGCGTGTGTACTACAAGAAATGCCCATCAGTGATCCGCAACCTG334SerValArgValTyrTyrLysLysCysProS erValIleArgAsnLeu100105110GCACGCTTTCCAGATACCATCACAGGAGCAGATTCCTCGCAGCTGCTA382AlaArgPheProAspThrIleThrGlyA laAspSerSerGlnLeuLeu115120125GAAGTGTCAGGCGTCTGTGTCAACCACTCAGTGACTGATGAGGCACCA430GluValSerGlyValCysValAsnHisS erValThrAspGluAlaPro130135140AAGATGCACTGCAGTTCAGAGGGAGAATGGCTGGTGCCCATTGGGAAG478LysMetHisCysSerSerGluGlyGluTrpL euValProIleGlyLys145150155TGTTTGTGCAAGGCAGGGTACGAGGAGAAGAACAACACCTGCCAAGCA526CysLeuCysLysAlaGlyTyrGluGluLysAsnAsnT hrCysGlnAla160165170175CCTTCTCCAGTCAGTAGTGTGAAAAAAGGGAAGATAACTAAAAATAGC574ProSerProValSerSerValLysLysGlyL ysIleThrLysAsnSer180185190ATCTCCCTTTCCTGGCAGGAGCCAGATCGACCCAACGGCATCATCCTG622IleSerLeuSerTrpGlnGluProAspA rgProAsnGlyIleIleLeu195200205GAATACGAAATCAAATATTTTGAAAAGGACCAGGAGACAAGCTACACC670GluTyrGluIleLysTyrPheGluLysA 
spGlnGluThrSerTyrThr210215220ATCATCAAATCCAAAGAGACCGCAATTACGGCAGATGGCTTGAAACCA718IleIleLysSerLysGluThrAlaIleThrA laAspGlyLeuLysPro225230235GGCTCAGCGTACGTCTTCCAGATCCGAGCCCGGACAGCTGCTGGCTAC766GlySerAlaTyrValPheGlnIleArgAlaArgThrA laAlaGlyTyr240245250255GGTGGCTTCAGTCGAAGATTTGAGTTTGAAACCAGCCCAGTGTTAGCT814GlyGlyPheSerArgArgPheGluPheGluT hrSerProValLeuAla260265270GCATCCAGTGACCAGAGCCAGATTCCTATAATTGTTGTGTCTGTAACA862AlaSerSerAspGlnSerGlnIleProI leIleValValSerValThr275280285GTGGGAGTTATTCTGCTGGCTGTTGTTATCGGTTTCCTTCTCAGTGGA910ValGlyValIleLeuLeuAlaValValI leGlyPheLeuLeuSerGly290295300AGGCGCTGTGGCTACAGCAAGGCTAAACAAGACCCAGAAGAAGAAAAG958ArgArgCysGlyTyrSerLysAlaLysGlnA spProGluGluGluLys305310315ATGCATTTTCATAATGGCCACATTAAACTGCCTGGTGTAAGAACCTAC1006MetHisPheHisAsnGlyHisIleLysLeuProGlyV alArgThrTyr320325330335ATTGATCCCCACACCTATGAGGACCCTAATCAAGCTGTCCACGAGTTT1054IleAspProHisThrTyrGluAspProAsnG lnAlaValHisGluPhe340345350GCCAAGGAAATAGAAGCTTCGTGCATAACCATCGAGAGAGTTATCGGA1102AlaLysGluIleGluAlaSerCysIleT hrIleGluArgValIleGly355360365GCTGGTGAATTTGGAGAAGTCTGCAGTGGACGGCTGAAACTGCAGGGA1150AlaGlyGluPheGlyGluValCysSerG lyArgLeuLysLeuGlnGly370375380AAACGCGAGTTTCCAGTGGCTATCAAAACCCTGAAGGTGGGCTACACA1198LysArgGluPheProValAlaIleLysThrL euLysValGlyTyrThr385390395GAGAAGCAAAGGCGAGATTTCCTGGGAGAAGCGAGCATCATGGGGCAG1246GluLysGlnArgArgAspPheLeuGlyGluAlaSerI leMetGlyGln400405410415TTCGACCACCCCAACATCATCCACCTGGAAGGTGTCGTCACAAAAAGC1294PheAspHisProAsnIleIleHisLeuGluG lyValValThrLysSer420425430AAACCTGTAATGATAGTAACGGAATACATGGAAAATGGTTCTCTGGAT1342LysProValMetIleValThrGluTyrM etGluAsnGlySerLeuAsp435440445ACATTTTTAAAGAAGAACGATGGGCAGTTCACGGTCATTCAGCTGGTC1390ThrPheLeuLysLysAsnAspGlyGlnP heThrValIleGlnLeuVal450455460GGGATGCTGCGAGGCATCGCATCAGGGATGAAGTACCTGTCTGACATG1438GlyMetLeuArgGlyIleAlaSerGlyMetL ysTyrLeuSerAspMet465470475GGTTACGTACACAGAGACCTCGCTGCCAGGAATATCCTCATCAACAGC1486GlyTyrValHisArgAspLeuAlaAlaArgAsnIleL euIleAsnSer480485490495AACTTAGTCTGCAAGGTGTCTGACTTTGGCCTCTCCAGAGTCCTAGAA1534AsnLeuValCysLysValSerAspPheGlyL 
euSerArgValLeuGlu500505510GATGATCCTGAAGCAGCGTACACAACCAGGGGAGGGAAGATCCCCATC1582AspAspProGluAlaAlaTyrThrThrA rgGlyGlyLysIleProIle515520525CGATGGACGGCACCTGAAGCAATCGCCTTCCGCAAATTCACGTCGGCC1630ArgTrpThrAlaProGluAlaIleAlaP heArgLysPheThrSerAla530535540AGCGATGTGTGGAGCTACGGCATTGTGATGTGGGAAGTGATGTCCTAT1678SerAspValTrpSerTyrGlyIleValMetT rpGluValMetSerTyr545550555GGCGAGAGACCTTACTGGGAAATGACAAACCAAGATGTGATTAAAGCC1726GlyGluArgProTyrTrpGluMetThrAsnGlnAspV alIleLysAla560565570575GTGGAGGAAGGCTATCGCCTGCCAAGTCCCATGGACTGCCCTGCTGCT1774ValGluGluGlyTyrArgLeuProSerProM etAspCysProAlaAla580585590CTCTACCAGTTGATGCTTGACTGCTGGCAGAAAGACCGCAACAGCAGG1822LeuTyrGlnLeuMetLeuAspCysTrpG lnLysAspArgAsnSerArg595600605CCCAAGTTTGATGAAATTGTCAGCATGTTGGACAAGCTCATCCGTAAC1870ProLysPheAspGluIleValSerMetL euAspLysLeuIleArgAsn610615620CCAAGCAGCTTGAAGACGTTGGTTAATGCATCGAGCAGAGTATCAAAT1918ProSerSerLeuLysThrLeuValAsnAlaS erSerArgValSerAsn625630635TTGTTGGTAGAACACAGTCCAGTGGGGAGCGGTGCCTACAGGTCAGTG1966LeuLeuValGluHisSerProValGlySerGlyAlaT yrArgSerVal640645650655GGTGAGTGGCTGGAAGCCATCAAAATGGGTCGATACACCGAGATTTTC2014GlyGluTrpLeuGluAlaIleLysMetGlyA rgTyrThrGluIlePhe660665670ATGGAGAATGGATACAGTTCGATGGATTCTGTGGCTCAGGTGACCCTA2062MetGluAsnGlyTyrSerSerMetAspS erValAlaGlnValThrLeu675680685GAGGACGAATCACCTTGTGAAAAGTGGAGCCTCACCCTCCACCCCCTC2110GluAspGluSerProCysGluLysTrpS erLeuThrLeuHisProLeu690695700TTTCCAACTGGATATCAGACTTGAAGGAAACCTTTCCAGTGGACCAGACCT2161PheProThrGlyTyrGlnThr705 710GCTCTTTAAACTTGTGGACCACCTAGTGACTTTGAGTGTGTCTGGAGCTCTTTCAATCCA2221CTGCAAGAATAACTTTACCAGGACAGTACTCAAGAATAGATAGATCCATGACATGAGTTT2281CAGTCTGATATTTGACTGGACCAATT ACTAACAAAATGTGGACTGCATACTTACACCTTT2341TGAAAGATCTGTACTCACCGAATCTCAGGACACCCTGTTGTTTGTTATTAGATGAAGAAC2401TCTGAATATTTGTAATAATATGTGATGTGTTGCTTTGCATTGTATTTTTTTCTTATAAAA2461 TAAAATAAATTATTTATTAAAAGTTATACTGGGATGAAGACCATTTAAGAGTTCACCTGC2521TCTAGATGCTTATTCTTAACCTGAAACCTCAGTTCCGGATAGTGATACTGCACACGCTTG2581TGAACAAACCCATTCTCGTGTCATAACCAAACAGGATGGGAGTA 
ATGAATAAGAGCAGAT2641GAACTCTTAAAAGAAAGATCCTAATCTCATGCAAAGGTCCCTTGCAAGTGGATTCCTCTC2701TCCCTAGCGTCTTCTAAAGGTCTTTGAGGTTATTCTTTCCCCTCTTTCAAACTGACAGCT2761AACTCTGTGAGTAGTGTCA GTCTGCATGGGCCAGTGTAGAACTGCACCATGTTGAAGAAG2821AGTGCTGCAATATGGCTGGGGTGGGAGATGAAATGCAAAGTAATCTCTGGTAGGCTGATG2881GCTTCCAGCCATGGAGGTATTTCAGGAACCTGGCCCTTTTGCTTGCATGAGTAATGAATG 2941GAGTGGTGAGGAGTGTTGTATTTTATGTGGCAATCCAGTCCTAGTCTACACTGTGTTTGA3001CAAATTGGTCCATGGTGTATAAGTAGTTCTATTTGTAAATAAAATGTTTTAAATG3056(2) INFORMATION FOR SEQ ID NO:22:(i) SEQUENCE CHARACTERISTICS:(A) LENGTH: 710 amino acids(B) TYPE: amino acid(D) TOPOLOGY: linear(ii) MOLECULE TYPE: protein(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:LeuLysPheThrLeuArgAspCysAsnSerLeuProGlyGlyLeuGly15101 5ThrCysLysGluThrPheAsnMetTyrTyrPheGluSerAspAspGlu202530AspGlyArgAsnIleArgGluAsnGlnTyrIleLysIleAspThrIle 354045AlaAlaAspGluSerPheThrGluLeuAspLeuGlyAspArgValMet505560LysLeuAsnThrGluValArgAsp ValGlyProLeuThrLysLysGly65707580PheTyrLeuAlaPheGlnAspValGlyAlaCysIleAlaLeuValSer85 9095ValArgValTyrTyrLysLysCysProSerValIleArgAsnLeuAla100105110ArgPheProAspThrIleThrGlyAlaAspSerS erGlnLeuLeuGlu115120125ValSerGlyValCysValAsnHisSerValThrAspGluAlaProLys130135140MetHis CysSerSerGluGlyGluTrpLeuValProIleGlyLysCys145150155160LeuCysLysAlaGlyTyrGluGluLysAsnAsnThrCysGlnAlaPro 165170175SerProValSerSerValLysLysGlyLysIleThrLysAsnSerIle180185190SerLeuSerTrpGln GluProAspArgProAsnGlyIleIleLeuGlu195200205TyrGluIleLysTyrPheGluLysAspGlnGluThrSerTyrThrIle210215 220IleLysSerLysGluThrAlaIleThrAlaAspGlyLeuLysProGly225230235240SerAlaTyrValPheGlnIleArgAlaArgThrAlaA laGlyTyrGly245250255GlyPheSerArgArgPheGluPheGluThrSerProValLeuAlaAla26026527 0SerSerAspGlnSerGlnIleProIleIleValValSerValThrVal275280285GlyValIleLeuLeuAlaValValIleGlyPheLeuLeuSerGlyArg290 295300ArgCysGlyTyrSerLysAlaLysGlnAspProGluGluGluLysMet305310315320HisPheHisAsnGlyHis IleLysLeuProGlyValArgThrTyrIle325330335AspProHisThrTyrGluAspProAsnGlnAlaValHisGluPheAla340 
345350LysGluIleGluAlaSerCysIleThrIleGluArgValIleGlyAla355360365GlyGluPheGlyGluValCysSerGlyArgLeuLysL euGlnGlyLys370375380ArgGluPheProValAlaIleLysThrLeuLysValGlyTyrThrGlu385390395400 LysGlnArgArgAspPheLeuGlyGluAlaSerIleMetGlyGlnPhe405410415AspHisProAsnIleIleHisLeuGluGlyValValThrLysSerLys 420425430ProValMetIleValThrGluTyrMetGluAsnGlySerLeuAspThr435440445PheLeuLysLysAsnAsp GlyGlnPheThrValIleGlnLeuValGly450455460MetLeuArgGlyIleAlaSerGlyMetLysTyrLeuSerAspMetGly4654704 75480TyrValHisArgAspLeuAlaAlaArgAsnIleLeuIleAsnSerAsn485490495LeuValCysLysValSerAspPheGlyLeuSerA rgValLeuGluAsp500505510AspProGluAlaAlaTyrThrThrArgGlyGlyLysIleProIleArg515520525 TrpThrAlaProGluAlaIleAlaPheArgLysPheThrSerAlaSer530535540AspValTrpSerTyrGlyIleValMetTrpGluValMetSerTyrGly545 550555560GluArgProTyrTrpGluMetThrAsnGlnAspValIleLysAlaVal565570575GluGluGlyTyrArg LeuProSerProMetAspCysProAlaAlaLeu580585590TyrGlnLeuMetLeuAspCysTrpGlnLysAspArgAsnSerArgPro5956 00605LysPheAspGluIleValSerMetLeuAspLysLeuIleArgAsnPro610615620SerSerLeuLysThrLeuValAsnAlaSerSerArgValSerA snLeu625630635640LeuValGluHisSerProValGlySerGlyAlaTyrArgSerValGly64565065 5GluTrpLeuGluAlaIleLysMetGlyArgTyrThrGluIlePheMet660665670GluAsnGlyTyrSerSerMetAspSerValAlaGlnValThrLeuGlu 675680685AspGluSerProCysGluLysTrpSerLeuThrLeuHisProLeuPhe690695700ProThrGlyTyrGlnThr705 710(2) INFORMATION FOR SEQ ID NO:23:(i) SEQUENCE CHARACTERISTICS:(A) LENGTH: 19 amino acids(B) TYPE: amino acid(D) TOPOLOGY: linear(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:ArgIleCysThrProAspValSerGlyThrValGlySerArgProAla1 51015AlaAspHis(2) INFORMATION FOR SEQ ID NO:24:(i) SEQUENCE CHARACTERISTICS:(A) LENGTH: 13 amino acids(B) TYPE: amino acid(D) TOPOLOGY: linear(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:CysLeuGluTh rHisThrLysAsnSerProValProVal1510(2) INFORMATION FOR SEQ ID NO:25:(i) SEQUENCE CHARACTERISTICS:(A) LENGTH: 12 amino acids(B) TYPE: amino acid(D) TOPOLOGY: linear(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:LysMet GlnGlnMetHisGlyArgMetValProVal1510(2) INFORMATION FOR SEQ ID NO:26:(i) 
SEQUENCE CHARACTERISTICS:(A) LENGTH: 12 amino acids(B) TYPE: amino acid(D) TOPOLOGY: linear(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:LysValHisLeuAsnGlnLeuGluProValGluVal1510
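The feature tables in the listing can be cross-checked arithmetically: each CDS location stated for SEQ ID NOS: 17, 19 and 21 should span three nucleotides per residue of the paired protein sequence (SEQ ID NOS: 18, 20 and 22). The Python sketch below is illustrative only and is not part of the patent. The names `codons_in_cds` and `translate` are hypothetical helpers, the numeric values are copied from the listing, and the translation assumes the standard genetic code.

```python
"""Cross-check CDS feature ranges against the stated protein lengths."""

# Values copied from the listing: nucleotide SEQ ID NO ->
# (CDS start, CDS end, stated length in amino acids of the paired protein).
LISTING = {
    17: (10, 2994, 995),  # protein is SEQ ID NO:18
    19: (2, 2233, 744),   # protein is SEQ ID NO:20
    21: (2, 2131, 710),   # protein is SEQ ID NO:22
}


def codons_in_cds(start: int, end: int) -> int:
    """Return the number of codons in an inclusive, 1-based CDS range."""
    span = end - start + 1
    if span % 3 != 0:
        raise ValueError(f"CDS span {span} is not a multiple of 3")
    return span // 3


# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str, cds_start: int = 1) -> str:
    """Translate DNA to one-letter amino acids from a 1-based start position.

    Trailing bases that do not fill a complete codon are ignored.
    """
    seq = dna.upper()[cds_start - 1:]
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# First 31 nucleotides of SEQ ID NO:19 as printed in the listing.
SEQ19_PREFIX = "CCTCAAATTCACCCTGAGGGACTGTAACAGC"

if __name__ == "__main__":
    for seq_id, (start, end, stated) in LISTING.items():
        print(seq_id, codons_in_cds(start, end), stated)
    print(translate(SEQ19_PREFIX, cds_start=2))
```

Under these assumptions, translating the SEQ ID NO:19 prefix from position 2 gives `LKFTLRDCNS`, which corresponds to the LeuLysPheThrLeuArgAspCysAsnSer residues that open SEQ ID NO:20.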
Claims
  • 1. An isolated nucleic acid molecule encoding an Eph-related protein tyrosine kinase, having a nucleotide sequence selected from the group consisting of Cek7 (SEQ ID NO: 3), Cek9 (SEQ ID NO: 7), Cek10 (SEQ ID NO: 9), Cek5⁺ (SEQ ID NO: 11), Cek10⁺ (SEQ ID NO: 13), Cek7⁺ (SEQ ID NO: 19) and Cek7' (SEQ ID NO: 21) as shown in FIG. 1.
  • 2. A composition of matter, comprising a vector containing the nucleic acid of claim 1.
  • 3. The composition of claim 2, wherein said vector is for the expression of a recombinant Eph-related protein tyrosine kinase in a eucaryotic host.
  • 4. A composition of matter, comprising a host cell containing the vector of claim 2.
Government Interests

This invention was funded in part by NIH Grants HD 26351 and CA 56721. Accordingly, the United States government has certain rights in the invention.

Non-Patent Literature Citations (11)
Entry
Wicks et al. (1992) Proc. Natl. Acad. Sci. 89, 1611-1615.
Pasquale (1991) Cell Regul. 2, 523-534.
Frohman et al. (1988) Proc. Natl. Acad. Sci. 85, 8998-9002.
Roux et al. (1980) BioTechniques 8(1) 48-57.
Sambrook et al. in "Molecular Cloning: A Laboratory Manual" 2nd Ed. CSH Press, (1989) pp. 11.2-11.57.
Lhotak, Vladimir, et al. "Characterization of Elk, a Brain-Specific Receptor Tyrosine Kinase." Mol. Cell. Biol. 11(5):2496-2502 (1991).
Gilardi-Hebenstreit, Pascale et al. "An Eph-Related Receptor Protein Tyrosine Kinase Gene Segmentally Expressed in the Developing Mouse Hindbrain." Oncogene 7:2499-2506 (1992).
Nieto, M. Angela et al. "A Receptor Protein Tyrosine Kinase Implicated in the Segmental Patterning of the Hindbrain and Mesoderm." Development 116:1137-1150 (1992).
Letwin, Kenneth et al. "Novel Protein-Tyrosine Kinase cDNAs Related to fps/fes and eph Cloned Using Anti-Phosphotyrosine Antibody." Oncogene 3:621-627 (1988).
Maisonpierre, Peter C. et al. "Ehk-1 and Ehk-2: Two Novel Members of the Eph Receptor-like Tyrosine Kinase Family with Distinctive Structures and Neuronal Expression." Oncogene 8:3277-3288 (1993).
Bohme, Beatrix, et al., "PCR Mediated Detection of a New Human Receptor-Tyrosine-Kinase, HEK 2." Oncogene 8:2857-2862 (1993).