Novel nucleic acids and polypeptides

Information

  • Patent Application
  • 20030224379
  • Publication Number
    20030224379
  • Date Filed
    September 12, 2002
  • Date Published
    December 04, 2003
Abstract
The present invention provides novel nucleic acids, novel polypeptide sequences encoded by these nucleic acids and uses thereof.
Description


BACKGROUND OF THE INVENTION

[0002] 1. Technical Field


[0003] The present invention provides novel polynucleotides and proteins encoded by such polynucleotides, along with uses for these polynucleotides and proteins, for example in therapeutic, diagnostic and research methods.


[0004] 2. Background


[0005] Technology aimed at the discovery of protein factors (including e.g., cytokines, such as lymphokines, interferons, circulating soluble factors, chemokines, and interleukins) has matured rapidly over the past decade. The now routine hybridization cloning and expression cloning techniques clone novel polynucleotides “directly” in the sense that they rely on information directly related to the discovered protein (i.e., partial DNA/amino acid sequence of the protein in the case of hybridization cloning; activity of the protein in the case of expression cloning). More recent “indirect” cloning techniques such as signal sequence cloning, which isolates DNA sequences based on the presence of a now well-recognized secretory leader sequence motif, as well as various PCR-based or low stringency hybridization-based cloning techniques, have advanced the state of the art by making available large numbers of DNA/amino acid sequences for proteins that are known to have biological activity, for example, by virtue of their secreted nature in the case of leader sequence cloning, by virtue of their cell or tissue source in the case of PCR-based techniques, or by virtue of structural similarity to other genes of known biological activity.


[0006] Identified polynucleotide and polypeptide sequences have numerous applications in, for example, diagnostics, forensics, gene mapping, identification of mutations responsible for genetic disorders or other traits, assessment of biodiversity, and production of many other types of data and products dependent on DNA and amino acid sequences.



SUMMARY OF THE INVENTION

[0007] The compositions of the present invention include novel isolated polypeptides, novel isolated polynucleotides encoding such polypeptides, including recombinant DNA molecules, cloned genes or degenerate variants thereof, especially naturally occurring variants such as allelic variants, antisense polynucleotide molecules, and antibodies that specifically recognize one or more epitopes present on such polypeptides, as well as hybridomas producing such antibodies.


[0008] The compositions of the present invention additionally include vectors, including expression vectors, containing the polynucleotides of the invention, cells genetically engineered to contain such polynucleotides and cells genetically engineered to express such polynucleotides.


[0009] The present invention relates to a collection or library of at least one novel nucleic acid sequence assembled from expressed sequence tags (ESTs) isolated mainly by sequencing by hybridization (SBH), and in some cases, sequences obtained from one or more public databases. The invention relates also to the proteins encoded by such polynucleotides, along with therapeutic, diagnostic and research utilities for these polynucleotides and proteins. These nucleic acid sequences are designated as SEQ ID NO: 1-337, or 675-836 and are provided in the Sequence Listing. In the nucleic acids provided in the Sequence Listing, A is adenine; C is cytosine; G is guanine; T is thymine; and N is any of the four bases or unknown. In the amino acids provided in the Sequence Listing, * corresponds to the stop codon.


[0010] The nucleic acid sequences of the present invention also include nucleic acid sequences that hybridize to the complement of SEQ ID NO: 1-337, or 675-836 under stringent hybridization conditions; nucleic acid sequences which are allelic variants or species homologues of any of the nucleic acid sequences recited above; or nucleic acid sequences that encode a peptide comprising a specific domain or truncation of the peptides encoded by SEQ ID NO: 1-337, or 675-836. Also included is a polynucleotide comprising a nucleotide sequence having at least 90% identity to an identifying sequence of SEQ ID NO: 1-337, or 675-836 or a degenerate variant or fragment thereof. The identifying sequence can be 100 base pairs in length.
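As an illustration of the 90% identity threshold over a 100-base identifying sequence, the following minimal Python sketch computes percent identity between two pre-aligned sequences of equal length. The function names and the ungapped, position-by-position comparison are assumptions made for illustration only; no particular algorithm is specified in this paragraph, and practical comparisons generally rely on an alignment program.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of positions that match between two pre-aligned sequences.

    Assumption: both sequences are already aligned and have equal length;
    gaps and alignment scoring are deliberately out of scope.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def meets_identity_threshold(seq_a: str, seq_b: str, threshold: float = 90.0) -> bool:
    """True if the two sequences share at least `threshold` percent identity."""
    return percent_identity(seq_a, seq_b) >= threshold


if __name__ == "__main__":
    # Hypothetical 100-base identifying sequence and a candidate with 10 mismatches.
    identifying = "A" * 100
    candidate = "A" * 90 + "C" * 10
    print(percent_identity(identifying, candidate))
    print(meets_identity_threshold(identifying, candidate))
```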


[0011] The nucleic acid sequences of the present invention also include the sequence information from the nucleic acid sequences of SEQ ID NO: 1-337, or 675-836. The sequence information can be a segment of any one of SEQ ID NO: 1-337, or 675-836 that uniquely identifies or represents the sequence information of SEQ ID NO: 1-337, or 675-836.


[0012] A collection as used in this application can be a collection of only one polynucleotide. The collection of sequence information or identifying information of each sequence can be provided on a nucleic acid array. In one embodiment, segments of sequence information are provided on a nucleic acid array to detect the polynucleotide that contains the segment. The array can be designed to detect full-match or mismatch to the polynucleotide that contains the segment. The collection can also be provided in a computer-readable format.


[0013] This invention also includes the reverse or direct complement of any of the nucleic acid sequences recited above; cloning or expression vectors containing the nucleic acid sequences; and host cells or organisms transformed with these expression vectors. Nucleic acid sequences (or their reverse or direct complements) according to the invention have numerous applications in a variety of techniques known to those skilled in the art of molecular biology, such as use as hybridization probes, use as primers for PCR, use in an array, use in computer-readable media, use in sequencing full-length genes, use for chromosome and gene mapping, use in the recombinant production of protein, and use in the generation of anti-sense DNA or RNA, their chemical analogs and the like.


[0014] In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-337, or 675-836 or novel segments or parts of the nucleic acids of the invention are used as primers in expression assays that are well known in the art. In a particularly preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-337, or 675-836 or novel segments or parts of the nucleic acids provided herein are used in diagnostics for identifying expressed genes or, as well known in the art and exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for physical mapping of the human genome.


[0015] The isolated polynucleotides of the invention include, but are not limited to, a polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO: 1-337, or 675-836; a polynucleotide comprising any of the full length protein coding sequences of SEQ ID NO: 1-337, or 675-836; and a polynucleotide comprising any of the nucleotide sequences of the mature protein coding sequences of SEQ ID NO: 1-337, or 675-836. The polynucleotides of the present invention also include, but are not limited to, a polynucleotide that hybridizes under stringent hybridization conditions to (a) the complement of any one of the nucleotide sequences set forth in SEQ ID NO: 1-337, or 675-836; (b) a nucleotide sequence encoding any one of the amino acid sequences set forth in SEQ ID NO: 1-337, or 675-836; (c) a polynucleotide which is an allelic variant of any polynucleotides recited above; (d) a polynucleotide which encodes a species homologue (e.g. orthologs) of any of the proteins recited above; or (e) a polynucleotide that encodes a polypeptide comprising a specific domain or truncation of any of the polypeptides comprising an amino acid sequence set forth in SEQ ID NO: 338-674, or 837-998, or Tables 3, 5, 6, 8, or 9.


[0016] The isolated polypeptides of the invention include, but are not limited to, a polypeptide comprising any of the amino acid sequences set forth in the Sequence Listing; or the corresponding full length or mature protein. Polypeptides of the invention also include polypeptides with biological activity that are encoded by (a) any of the polynucleotides having a nucleotide sequence set forth in SEQ ID NO: 1-337, or 675-836; or (b) polynucleotides that hybridize to the complement of the polynucleotides of (a) under stringent hybridization conditions. Biologically active variants of any of the polypeptide sequences in the Sequence Listing, and “substantial equivalents” thereof (e.g., with at least about 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99% amino acid sequence identity) that preferably retain biological activity are also contemplated. The polypeptides of the invention may be wholly or partially chemically synthesized but are preferably produced by recombinant means using the genetically engineered cells (e.g. host cells) of the invention.


[0017] The invention also provides compositions comprising a polypeptide of the invention. Polypeptide compositions of the invention may further comprise an acceptable carrier, such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.


[0018] The invention also provides host cells transformed or transfected with a polynucleotide of the invention.


[0019] The invention also relates to methods for producing a polypeptide of the invention comprising growing a culture of the host cells of the invention in a suitable culture medium under conditions permitting expression of the desired polypeptide, and purifying the polypeptide from the culture or from the host cells. Preferred embodiments include those in which the protein produced by such processes is a mature form of the protein.


[0020] Polynucleotides according to the invention have numerous applications in a variety of techniques known to those skilled in the art of molecular biology. These techniques include use as hybridization probes, use as oligomers, or primers, for PCR, use for chromosome and gene mapping, use in the recombinant production of protein, and use in generation of anti-sense DNA or RNA, their chemical analogs and the like. For example, when the expression of an mRNA is largely restricted to a particular cell or tissue type, polynucleotides of the invention can be used as hybridization probes to detect the presence of the particular cell or tissue mRNA in a sample using, e.g., in situ hybridization.


[0021] In other exemplary embodiments, the polynucleotides are used in diagnostics as expressed sequence tags for identifying expressed genes or, as well known in the art and exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for physical mapping of the human genome.


[0022] The polypeptides according to the invention can be used in a variety of conventional procedures and methods that are currently applied to other proteins. For example, a polypeptide of the invention can be used to generate an antibody that specifically binds the polypeptide. Such antibodies, particularly monoclonal antibodies, are useful for detecting or quantitating the polypeptide in tissue. The polypeptides of the invention can also be used as molecular weight markers, and as a food supplement.


[0023] Methods are also provided for preventing, treating, or ameliorating a medical condition which comprises the step of administering to a mammalian subject a therapeutically effective amount of a composition comprising a polypeptide of the present invention and a pharmaceutically acceptable carrier.


[0024] In particular, the polypeptides and polynucleotides of the invention can be utilized, for example, in methods for the prevention and/or treatment of disorders involving aberrant protein expression or biological activity.


[0025] The present invention further relates to methods for detecting the presence of the polynucleotides or polypeptides of the invention in a sample. Such methods can, for example, be utilized as part of prognostic and diagnostic evaluation of disorders as recited herein and for the identification of subjects exhibiting a predisposition to such conditions. The invention provides a method for detecting the polynucleotides of the invention in a sample, comprising contacting the sample with a compound that binds to and forms a complex with the polynucleotide of interest for a period sufficient to form the complex and under conditions sufficient to form a complex and detecting the complex such that if a complex is detected, the polynucleotide of interest is detected. The invention also provides a method for detecting the polypeptides of the invention in a sample comprising contacting the sample with a compound that binds to and forms a complex with the polypeptide under conditions and for a period sufficient to form the complex and detecting the formation of the complex such that if a complex is formed, the polypeptide is detected.


[0026] The invention also provides kits comprising polynucleotide probes and/or monoclonal antibodies, and optionally quantitative standards, for carrying out methods of the invention. Furthermore, the invention provides methods for evaluating the efficacy of drugs, and monitoring the progress of patients, involved in clinical trials for the treatment of disorders as recited above.


[0027] The invention also provides methods for the identification of compounds that modulate (i.e., increase or decrease) the expression or activity of the polynucleotides and/or polypeptides of the invention. Such methods can be utilized, for example, for the identification of compounds that can ameliorate symptoms of disorders as recited herein. Such methods can include, but are not limited to, assays for identifying compounds and other substances that interact with (e.g., bind to) the polypeptides of the invention. The invention provides a method for identifying a compound that binds to the polypeptides of the invention comprising contacting the compound with a polypeptide of the invention in a cell for a time sufficient to form a polypeptide/compound complex, wherein the complex drives expression of a reporter gene sequence in the cell; and detecting the complex by detecting the reporter gene sequence expression such that if expression of the reporter gene is detected the compound that binds to a polypeptide of the invention is identified.


[0028] The methods of the invention also provide methods for treatment which involve the administration of the polynucleotides or polypeptides of the invention to individuals exhibiting symptoms or tendencies. In addition, the invention encompasses methods for treating diseases or disorders as recited herein comprising administering compounds and other substances that modulate the overall activity of the target gene products. Compounds and other substances can affect such modulation either on the level of target gene/protein expression or target protein activity.


[0029] The polypeptides of the present invention and the polynucleotides encoding them are also useful for the same functions known to one of skill in the art as the polypeptides and polynucleotides to which they have homology (set forth in Table 2); for which they have a signature region (as set forth in Table 3); or for which they have homology to a gene family (as set forth in Table 4). If no homology is set forth for a sequence, then the polypeptides and polynucleotides of the present invention are useful for a variety of applications, as described herein, including use in arrays for detection.



DETAILED DESCRIPTION OF THE INVENTION


Definitions

[0030] It must be noted that as used herein and in the appended claims, the singular forms “a”, “an” and “the” include plural references unless the context clearly dictates otherwise.


[0031] The term “active” refers to those forms of the polypeptide which retain the biologic and/or immunologic activities of any naturally occurring polypeptide. According to the invention, the terms “biologically active” or “biological activity” refer to a protein or peptide having structural, regulatory or biochemical functions of a naturally occurring molecule. Likewise “immunologically active” or “immunological activity” refers to the capability of the natural, recombinant or synthetic polypeptide to induce a specific immune response in appropriate animals or cells and to bind with specific antibodies.


[0032] The term “activated cells” as used in this application refers to those cells which are engaged in extracellular or intracellular membrane trafficking, including the export of secretory or enzymatic molecules as part of a normal or disease process.


[0033] The terms “complementary” or “complementarity” refer to the natural binding of polynucleotides by base pairing. For example, the sequence 5′-AGT-3′ binds to the complementary sequence 3′-TCA-5′. Complementarity between two single-stranded molecules may be “partial” such that only certain portion(s) of the nucleic acids bind or it may be “complete” such that total complementarity exists between the single stranded molecules. The degree of complementarity between the nucleic acid strands has significant effects on the efficiency and strength of the hybridization between the nucleic acid strands.
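The base-pairing example above (5′-AGT-3′ pairing with 3′-TCA-5′) and the distinction between partial and complete complementarity can be sketched in Python as follows. This is a minimal illustration, not a method taken from this application; the function names are hypothetical, and treating N as pairing only with N is a simplifying assumption.

```python
# Watson-Crick pairing for DNA. Assumption: N (any or unknown base) is paired with N.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C", "N": "N"}


def complement(seq: str) -> str:
    """Base-by-base complement; for a 5'->3' input the result reads 3'->5'."""
    return "".join(COMPLEMENT[base] for base in seq.upper())


def reverse_complement(seq: str) -> str:
    """Complementary strand rewritten in the conventional 5'->3' direction."""
    return complement(seq)[::-1]


def fraction_complementary(strand_5to3: str, strand_3to5: str) -> float:
    """Fraction of positions that base-pair (1.0 means complete complementarity)."""
    if len(strand_5to3) != len(strand_3to5) or not strand_5to3:
        raise ValueError("strands must be non-empty and of equal length")
    pairs = sum(
        COMPLEMENT[a] == b
        for a, b in zip(strand_5to3.upper(), strand_3to5.upper())
    )
    return pairs / len(strand_5to3)


if __name__ == "__main__":
    print(complement("AGT"))                     # partner strand, read 3'->5'
    print(reverse_complement("AGT"))             # same partner, read 5'->3'
    print(fraction_complementary("AGT", "TCA"))  # complete complementarity
    print(fraction_complementary("AGT", "TCG"))  # partial complementarity
```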


[0034] The term “embryonic stem cells (ES)” refers to a cell that can give rise to many differentiated cell types in an embryo or an adult, including the germ cells. The term “germ line stem cells (GSCs)” refers to stem cells derived from primordial stem cells that provide a steady and continuous source of germ cells for the production of gametes. The term “primordial germ cells (PGCs)” refers to a small population of cells set aside from other cell lineages particularly from the yolk sac, mesenteries, or gonadal ridges during embryogenesis that have the potential to differentiate into germ cells and other cells. PGCs are the source from which GSCs and ES cells are derived. The PGCs, the GSCs and the ES cells are capable of self-renewal. Thus these cells not only populate the germ line and give rise to a plurality of terminally differentiated cells that comprise the adult specialized organs, but are able to regenerate themselves.


[0035] The term “expression modulating fragment,” EMF, means a series of nucleotides which modulates the expression of an operably linked ORF or another EMF.


[0036] As used herein, a sequence is said to “modulate the expression of an operably linked sequence” when the expression of the sequence is altered by the presence of the EMF. EMFs include, but are not limited to, promoters, and promoter modulating sequences (inducible elements). One class of EMFs are nucleic acid fragments which induce the expression of an operably linked ORF in response to a specific regulatory factor or physiological event.


[0037] The terms “nucleotide sequence” or “nucleic acid” or “polynucleotide” or “oligonucleotide” are used interchangeably and refer to a heteropolymer of nucleotides or the sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic or synthetic origin which may be single-stranded or double-stranded and may represent the sense or the antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or RNA-like material. In the sequences herein A is adenine, C is cytosine, T is thymine, G is guanine and N is A, C, G, or T (U) or unknown. It is contemplated that where the polynucleotide is RNA, the T (thymine) in the sequences provided herein is substituted with U (uracil). Generally, nucleic acid segments provided by this invention may be assembled from fragments of the genome and short oligonucleotide linkers, or from a series of oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic acid which is capable of being expressed in a recombinant transcriptional unit comprising regulatory elements derived from a microbial or viral operon, or a eukaryotic gene.
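The symbol convention above (A, C, G, T and N, with T replaced by U when the polynucleotide is RNA) can be illustrated with a short Python sketch. The function name and the decision to reject any other symbol are assumptions made for illustration.

```python
DNA_SYMBOLS = set("ACGTN")  # symbols used for nucleic acids in the sequences herein


def to_rna(dna: str) -> str:
    """Return the RNA form of a DNA sequence by substituting U for T.

    Assumption: any symbol outside A, C, G, T, N is treated as an error.
    """
    seq = dna.upper()
    unexpected = set(seq) - DNA_SYMBOLS
    if unexpected:
        raise ValueError(f"unexpected symbols: {sorted(unexpected)}")
    return seq.replace("T", "U")


if __name__ == "__main__":
    print(to_rna("ATGNTT"))
```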


[0038] The terms “oligonucleotide fragment” or a “polynucleotide fragment”, “portion,” or “segment” or “probe” or “primer” are used interchangeably and refer to a sequence of nucleotide residues which are at least about 5 nucleotides, more preferably at least about 7 nucleotides, more preferably at least about 9 nucleotides, more preferably at least about 11 nucleotides and most preferably at least about 17 nucleotides. The fragment is preferably less than about 500 nucleotides, preferably less than about 200 nucleotides, more preferably less than about 100 nucleotides, more preferably less than about 50 nucleotides and most preferably less than 30 nucleotides. Preferably the probe is from about 6 nucleotides to about 200 nucleotides, preferably from about 15 to about 50 nucleotides, more preferably from about 17 to 30 nucleotides and most preferably from about 20 to 25 nucleotides. Preferably the fragments can be used in polymerase chain reaction (PCR), various hybridization procedures or microarray procedures to identify or amplify identical or related parts of mRNA or DNA molecules. A fragment or segment may uniquely identify each polynucleotide sequence of the present invention. Preferably the fragment comprises a sequence substantially similar to any one of SEQ ID NO: 1-337, or 675-836.


[0039] Probes may, for example, be used to determine whether specific mRNA molecules are present in a cell or tissue or to isolate similar nucleic acid sequences from chromosomal DNA as described by Walsh et al. (Walsh, P. S. et al., 1992, PCR Methods Appl 1:241-250). They may be labeled by nick translation, Klenow fill-in reaction, PCR, or other methods well known in the art. Probes of the present invention, their preparation and/or labeling are elaborated in Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, NY; or Ausubel, F. M. et al., 1989, Current Protocols in Molecular Biology, John Wiley & Sons, New York N.Y., both of which are incorporated herein by reference in their entirety.


[0040] The nucleic acid sequences of the present invention also include the sequence information from the nucleic acid sequences of SEQ ID NO: 1-337, or 675-836. The sequence information can be a segment of any one of SEQ ID NO: 1-337, or 675-836 that uniquely identifies or represents the sequence information of that sequence of SEQ ID NO: 1-337, or 675-836, or those segments identified in Tables 3, 5, 6, 8, or 9. One such segment can be a twenty-mer nucleic acid sequence because the probability that a twenty-mer is fully matched in the human genome is 1 in 300. In the human genome, there are three billion base pairs in one set of chromosomes. Because 4^20 possible twenty-mers exist, there are 300 times more twenty-mers than there are base pairs in a set of human chromosomes. Using the same analysis, the probability for a seventeen-mer to be fully matched in the human genome is approximately 1 in 5. When these segments are used in arrays for expression studies, fifteen-mer segments can be used. The probability that the fifteen-mer is fully matched in the expressed sequences is also approximately one in five because expressed sequences comprise less than approximately 5% of the entire genome sequence.
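The arithmetic behind these estimates can be reproduced with a short Python sketch. The genome size (three billion base pairs) and the 5% expressed fraction are the figures stated above; the function names are hypothetical, and the model simply divides the number of possible k-mers by the number of base pairs, as the paragraph does.

```python
GENOME_BP = 3_000_000_000    # base pairs in one set of human chromosomes (from the text)
EXPRESSED_FRACTION = 0.05    # upper bound on the expressed share of the genome (from the text)


def possible_kmers(k: int) -> int:
    """Number of distinct sequences of length k over the four bases."""
    return 4 ** k


def kmers_per_base_pair(k: int, target_bp: float) -> float:
    """How many times more possible k-mers exist than base pairs in the target.

    A value of x corresponds to a chance of roughly 1 in x that a given
    k-mer is fully matched somewhere in the target.
    """
    return possible_kmers(k) / target_bp


if __name__ == "__main__":
    expressed_bp = EXPRESSED_FRACTION * GENOME_BP
    print(f"20-mer vs genome:    1 in {kmers_per_base_pair(20, GENOME_BP):.1f}")
    print(f"17-mer vs genome:    1 in {kmers_per_base_pair(17, GENOME_BP):.1f}")
    print(f"15-mer vs expressed: 1 in {kmers_per_base_pair(15, expressed_bp):.1f}")
```

Under these assumptions the ratios come out at about 366 for a twenty-mer, about 5.7 for a seventeen-mer, and about 7.2 for a fifteen-mer against the expressed sequences, which is in line with the rounded figures of 1 in 300 and approximately 1 in 5 given above.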


[0041] Similarly, when using sequence information for detecting a single mismatch, a segment can be a twenty-five mer. The probability that the twenty-five mer would appear in a human genome with a single mismatch is calculated by multiplying the probability for a full match (1÷4^25) times the increased probability for mismatch at each nucleotide position (3×25). The probability that an eighteen mer with a single mismatch can be detected in an array for expression studies is approximately one in five. The probability that a twenty-mer with a single mismatch can be detected in a human genome is approximately one in five.
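The single-mismatch calculation described above can be written out as a minimal Python sketch. The factor 3×k counts the three alternative bases at each of the k positions, as in the text; the function names are hypothetical, and scaling by the number of base pairs in the target is the same simplifying assumption used for the full-match estimate.

```python
GENOME_BP = 3_000_000_000    # base pairs in one set of human chromosomes
EXPRESSED_FRACTION = 0.05    # approximate upper bound on the expressed share


def full_match_probability(k: int) -> float:
    """Probability that a random k-mer equals a given k-mer exactly."""
    return 1 / 4 ** k


def single_mismatch_probability(k: int) -> float:
    """Full-match probability times the 3*k possible single-base substitutions."""
    return full_match_probability(k) * (3 * k)


def expected_single_mismatch_sites(k: int, target_bp: float) -> float:
    """Expected number of single-mismatch occurrences across the target."""
    return single_mismatch_probability(k) * target_bp


if __name__ == "__main__":
    print(single_mismatch_probability(25))
    print(expected_single_mismatch_sites(20, GENOME_BP))
    print(expected_single_mismatch_sites(18, EXPRESSED_FRACTION * GENOME_BP))
```

With these inputs the twenty-mer value against the whole genome is about 0.16 and the eighteen-mer value against the expressed sequences is about 0.12, the same order as the approximate one-in-five figures stated above.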


[0042] The term “open reading frame,” ORF, means a series of nucleotide triplets coding for amino acids without any termination codons and is a sequence translatable into protein.
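The definition above (a run of nucleotide triplets with no termination codon) can be illustrated with a minimal Python sketch. It assumes the standard genetic code, in which TAA, TAG and TGA are the termination codons, and it does not require a start codon because the definition above does not mention one. The function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # termination codons of the standard genetic code


def codons(seq: str, frame: int = 0) -> list[str]:
    """Split a sequence into complete triplets starting at the given frame offset."""
    seq = seq.upper().replace("U", "T")
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


def is_open(seq: str, frame: int = 0) -> bool:
    """True if no triplet in the chosen frame is a termination codon."""
    return all(codon not in STOP_CODONS for codon in codons(seq, frame))


def longest_open_stretch(seq: str, frame: int = 0) -> str:
    """Longest run of consecutive triplets in the frame that contains no stop codon."""
    best: list[str] = []
    current: list[str] = []
    for codon in codons(seq, frame):
        if codon in STOP_CODONS:
            current = []            # a stop codon closes the current run
        else:
            current.append(codon)
            if len(current) > len(best):
                best = list(current)
    return "".join(best)


if __name__ == "__main__":
    print(is_open("ATGGCCAAA"))
    print(is_open("ATGTAAGCC"))
    print(longest_open_stretch("ATGTAAGCCAAATTT"))
```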


[0043] The terms “operably linked” or “operably associated” refer to functionally related nucleic acid sequences. For example, a promoter is operably associated or operably linked with a coding sequence if the promoter controls the transcription of the coding sequence. While operably linked nucleic acid sequences can be contiguous and in the same reading frame, certain genetic elements e.g. repressor genes are not contiguously linked to the coding sequence but still control transcription/translation of the coding sequence.


[0044] The term “pluripotent” refers to the capability of a cell to differentiate into a number of differentiated cell types that are present in an adult organism. A pluripotent cell is restricted in its differentiation capability in comparison to a totipotent cell.


[0045] The terms “polypeptide” or “peptide” or “amino acid sequence” refer to an oligopeptide, peptide, polypeptide or protein sequence or fragment thereof and to naturally occurring or synthetic molecules. A polypeptide “fragment,” “portion,” or “segment” is a stretch of amino acid residues of at least about 5 amino acids, preferably at least about 7 amino acids, more preferably at least about 9 amino acids and most preferably at least about 17 or more amino acids. The peptide preferably is not greater than about 200 amino acids, more preferably less than 150 amino acids and most preferably less than 100 amino acids. Preferably the peptide is from about 5 to about 200 amino acids. To be active, any polypeptide must have sufficient length to display biological and/or immunological activity.


[0046] The term “naturally occurring polypeptide” refers to polypeptides produced by cells that have not been genetically engineered and specifically contemplates various polypeptides arising from post-translational modifications of the polypeptide including, but not limited to, acetylation, carboxylation, glycosylation, phosphorylation, lipidation and acylation.


[0047] The term “translated protein coding portion” means a sequence which encodes for the full-length protein which may include any leader sequence or any processing sequence.


[0048] The term “mature protein coding sequence” means a sequence which encodes a peptide or protein without a signal or leader sequence. The “mature protein portion” means that portion of the protein which does not include a signal or leader sequence. The peptide may have been produced by processing in the cell which removes any leader/signal sequence. The mature protein portion may or may not include the initial methionine residue. The methionine residue may be removed from the protein during processing in the cell. The peptide may be produced synthetically or the protein may have been produced using a polynucleotide only encoding for the mature protein coding sequence.


[0049] The term “derivative” refers to polypeptides chemically modified by such techniques as ubiquitination, labeling (e.g., with radionuclides or various enzymes), covalent polymer attachment such as pegylation (derivatization with polyethylene glycol) and insertion or substitution by chemical synthesis of amino acids such as ornithine, which do not normally occur in human proteins.


[0050] The term “variant” (or “analog”) refers to any polypeptide differing from naturally occurring polypeptides by amino acid insertions, deletions, and substitutions, created using, e.g., recombinant DNA techniques. Guidance in determining which amino acid residues may be replaced, added or deleted without abolishing activities of interest, may be found by comparing the sequence of the particular polypeptide with that of homologous peptides and minimizing the number of amino acid sequence changes made in regions of high homology (conserved regions) or by replacing amino acids with consensus sequence.


[0051] Alternatively, recombinant variants encoding these same or similar polypeptides may be synthesized or selected by making use of the “redundancy” in the genetic code. Various codon substitutions, such as the silent changes which produce various restriction sites, may be introduced to optimize cloning into a plasmid or viral vector or expression in a particular prokaryotic or eukaryotic system. Mutations in the polynucleotide sequence may be reflected in the polypeptide or domains of other peptides added to the polypeptide to modify the properties of any part of the polypeptide, to change characteristics such as ligand-binding affinities, interchain affinities, or degradation/turnover rate.


[0052] Preferably, amino acid “substitutions” are the result of replacing one amino acid with another amino acid having similar structural and/or chemical properties, i.e., conservative amino acid replacements. “Conservative” amino acid substitutions may be made on the basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues involved. For example, nonpolar (hydrophobic) amino acids include alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; polar neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and glutamine; positively charged (basic) amino acids include arginine, lysine, and histidine; and negatively charged (acidic) amino acids include aspartic acid and glutamic acid. “Insertions” or “deletions” are preferably in the range of about 1 to 20 amino acids, more preferably 1 to 10 amino acids. The variation allowed may be experimentally determined by systematically making insertions, deletions, or substitutions of amino acids in a polypeptide molecule using recombinant DNA techniques and assaying the resulting recombinant variants for activity.
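The four residue groupings listed above can be encoded as a small lookup, sketched here in Python with the standard one-letter amino acid codes. Treating a substitution as conservative exactly when both residues fall in the same listed group is a simplification adopted for illustration; the names below are hypothetical, and the text itself allows other similarity criteria.

```python
# Residue groups as listed in the paragraph above, in one-letter codes.
RESIDUE_CLASSES = {
    "nonpolar": set("ALIVPFWM"),      # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": set("GSTCYNQ"),  # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),              # Arg, Lys, His
    "acidic": set("DE"),              # Asp, Glu
}


def residue_class(residue: str) -> str:
    """Return the name of the group that contains the given residue."""
    code = residue.upper()
    for name, members in RESIDUE_CLASSES.items():
        if code in members:
            return name
    raise ValueError(f"not one of the twenty listed amino acids: {residue!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues belong to the same group (illustrative criterion only)."""
    return residue_class(original) == residue_class(replacement)


if __name__ == "__main__":
    print(is_conservative("L", "I"))  # leucine -> isoleucine, both nonpolar
    print(is_conservative("K", "E"))  # lysine -> glutamic acid, basic vs acidic
```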


[0053] Alternatively, where alteration of function is desired, insertions, deletions or non-conservative alterations can be engineered to produce altered polypeptides. Such alterations can, for example, alter one or more of the biological functions or biochemical characteristics of the polypeptides of the invention. For example, such alterations may change polypeptide characteristics such as ligand-binding affinities, interchain affinities, or degradation/turnover rate. Further, such alterations can be selected so as to generate polypeptides that are better suited for expression, scale up and the like in the host cells chosen for expression. For example, cysteine residues can be deleted or substituted with another amino acid residue in order to eliminate disulfide bridges.


[0054] The terms “purified” or “substantially purified” as used herein denotes that the indicated nucleic acid or polypeptide is present in the substantial absence of other biological macromolecules, e.g., polynucleotides, proteins, and the like. In one embodiment, the polynucleotide or polypeptide is purified such that it constitutes at least 95% by weight, more preferably at least 99% by weight, of the indicated biological macromolecules present (but water, buffers, and other small molecules, especially molecules having a molecular weight of less than 1000 daltons, can be present).


[0055] The term “isolated” as used herein refers to a nucleic acid or polypeptide separated from at least one other component (e.g., nucleic acid or polypeptide) present with the nucleic acid or polypeptide in its natural source. In one embodiment, the nucleic acid or polypeptide is found in the presence of (if anything) only a solvent, buffer, ion, or other component normally present in a solution of the same. The terms “isolated” and “purified” do not encompass nucleic acids or polypeptides present in their natural source.


[0056] The term “recombinant,” when used herein to refer to a polypeptide or protein, means that a polypeptide or protein is derived from recombinant (e.g., microbial, insect, or mammalian) expression systems. “Microbial” refers to recombinant polypeptides or proteins made in bacterial or fungal (e.g., yeast) expression systems. As a product, “recombinant microbial” defines a polypeptide or protein essentially free of native endogenous substances and unaccompanied by associated native glycosylation. Polypeptides or proteins expressed in most bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or proteins expressed in yeast will have a glycosylation pattern in general different from those expressed in mammalian cells.


[0057] The term “recombinant expression vehicle or vector” refers to a plasmid or phage or virus or vector, for expressing a polypeptide from a DNA (RNA) sequence. An expression vehicle can comprise a transcriptional unit comprising an assembly of (1) a genetic element or elements having a regulatory role in gene expression, for example, promoters or enhancers, (2) a structural or coding sequence which is transcribed into mRNA and translated into protein, and (3) appropriate transcription initiation and termination sequences. Structural units intended for use in yeast or eukaryotic expression systems preferably include a leader sequence enabling extracellular secretion of translated protein by a host cell. Alternatively, where recombinant protein is expressed without a leader or transport sequence, it may include an amino terminal methionine residue. This residue may or may not be subsequently cleaved from the expressed recombinant protein to provide a final product.


[0058] The term “recombinant expression system” means host cells which have stably integrated a recombinant transcriptional unit into chromosomal DNA or carry the recombinant transcriptional unit extrachromosomally. Recombinant expression systems as defined herein will express heterologous polypeptides or proteins upon induction of the regulatory elements linked to the DNA segment or synthetic gene to be expressed. This term also means host cells which have stably integrated a recombinant genetic element or elements having a regulatory role in gene expression, for example, promoters or enhancers. Recombinant expression systems as defined herein will express polypeptides or proteins endogenous to the cell upon induction of the regulatory elements linked to the endogenous DNA segment or gene to be expressed. The cells can be prokaryotic or eukaryotic.


[0059] The term “secreted” includes a protein that is transported across or through a membrane, including transport as a result of signal sequences in its amino acid sequence when it is expressed in a suitable host cell. “Secreted” proteins include without limitation proteins secreted wholly (e.g., soluble proteins) or partially (e.g., receptors) from the cell in which they are expressed. “Secreted” proteins also include without limitation proteins that are transported across the membrane of the endoplasmic reticulum. “Secreted” proteins are also intended to include proteins containing non-typical signal sequences (e.g., Interleukin-1 Beta, see Krasney, P. A. and Young, P. R. (1992) Cytokine 4(2): 134-143) and factors released from damaged cells (e.g., Interleukin-1 Receptor Antagonist, see Arend, W. P. et al. (1998) Annu. Rev. Immunol. 16:27-55).


[0060] Where desired, an expression vector may be designed to contain a “signal or leader sequence” which will direct the polypeptide through the membrane of a cell. Such a sequence may be naturally present on the polypeptides of the present invention or provided from heterologous protein sources by recombinant DNA techniques.


[0061] The term “stringent” is used to refer to conditions that are commonly understood in the art as stringent. Stringent conditions can include highly stringent conditions (i.e., hybridization to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1 mM EDTA at 65° C., and washing in 0.1×SSC/0.1% SDS at 68° C.), and moderately stringent conditions (i.e., washing in 0.2×SSC/0.1% SDS at 42° C.). Other exemplary hybridization conditions are described herein in the examples.


[0062] In instances of hybridization of deoxyoligonucleotides, additional exemplary stringent hybridization conditions include washing in 6×SSC/0.05% sodium pyrophosphate at 37° C. (for 14-base oligonucleotides), 48° C. (for 17-base oligonucleotides), 55° C. (for 20-base oligonucleotides), and 60° C. (for 23-base oligonucleotides).
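
The length-to-temperature pairs listed above can be captured as a simple lookup. This is a minimal sketch; the names `OLIGO_WASH_TEMP_C` and `wash_temperature_c` are hypothetical, and no interpolation between the listed lengths is attempted because the text does not describe one.

```python
# Wash temperatures (degrees C) for washing in 6xSSC/0.05% sodium
# pyrophosphate, keyed by oligonucleotide length in bases.
OLIGO_WASH_TEMP_C = {14: 37, 17: 48, 20: 55, 23: 60}


def wash_temperature_c(oligo_length):
    """Return the listed wash temperature for an oligo length, or None.

    Only the four lengths given in the text are covered.
    """
    return OLIGO_WASH_TEMP_C.get(oligo_length)
```

A caller would treat a `None` result as "no exemplary condition listed" rather than guessing a temperature.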


[0063] As used herein, “substantially equivalent” or “substantially similar” can refer both to nucleotide and amino acid sequences, for example a mutant sequence, that varies from a reference sequence by one or more substitutions, deletions, or additions, the net effect of which does not result in an adverse functional dissimilarity between the reference and subject sequences. Typically, such a substantially equivalent sequence varies from one of those listed herein by no more than about 35% (i.e., the number of individual residue substitutions, additions, and/or deletions in a substantially equivalent sequence, as compared to the corresponding reference sequence, divided by the total number of residues in the substantially equivalent sequence is about 0.35 or less). Such a sequence is said to have 65% sequence identity to the listed sequence. In one embodiment, a substantially equivalent, e.g., mutant, sequence of the invention varies from a listed sequence by no more than 30% (70% sequence identity); in a variation of this embodiment, by no more than 25% (75% sequence identity); in a further variation of this embodiment, by no more than 20% (80% sequence identity); in a further variation of this embodiment, by no more than 10% (90% sequence identity); and in a further variation of this embodiment, by no more than 5% (95% sequence identity). Substantially equivalent, e.g., mutant, amino acid sequences according to the invention preferably have at least 80% sequence identity with a listed amino acid sequence, more preferably at least 85% sequence identity, more preferably at least 90% sequence identity, more preferably at least 95% sequence identity, more preferably at least 98% sequence identity, and most preferably at least 99% sequence identity. Substantially equivalent nucleotide sequences of the invention can have lower percent sequence identities, taking into account, for example, the redundancy or degeneracy of the genetic code.
Preferably, the nucleotide sequence has at least about 65% identity, more preferably at least about 75% identity, more preferably at least about 80% sequence identity, more preferably at least 85% sequence identity, more preferably at least 90% sequence identity, more preferably at least about 95% sequence identity, more preferably at least 98% sequence identity, and most preferably at least 99% sequence identity. For the purposes of the present invention, sequences having substantially equivalent biological activity and substantially equivalent expression characteristics are considered substantially equivalent. For the purposes of determining equivalence, truncation of the mature sequence (e.g., via a mutation which creates a new stop codon) should be disregarded. Sequence identity may be determined, e.g., using the Jotun Hein method (Hein, J. (1990) Methods Enzymol. 183:626-645). Identity between sequences can also be determined by other methods known in the art, e.g. by varying hybridization conditions.
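
The variation ratio defined in paragraph [0063] can be illustrated with a short Python sketch. It assumes the two sequences have already been aligned to equal length with `-` marking gaps; the alignment step itself (e.g., the Jotun Hein method) is not implemented here, and the function names are hypothetical.

```python
def variation_fraction(aligned_reference, aligned_subject, gap="-"):
    """Differences (substitutions, additions, deletions) divided by the
    number of residues in the subject (substantially equivalent) sequence."""
    if len(aligned_reference) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    differences = 0
    subject_residues = 0
    for ref_char, sub_char in zip(aligned_reference, aligned_subject):
        if sub_char != gap:
            subject_residues += 1
        if ref_char == gap and sub_char == gap:
            continue  # a gap in both rows carries no information
        if ref_char != sub_char:
            differences += 1  # substitution, addition, or deletion
    if subject_residues == 0:
        raise ValueError("subject sequence has no residues")
    return differences / subject_residues


def percent_identity(aligned_reference, aligned_subject):
    """Percent identity as the text uses it: 100 * (1 - variation)."""
    return 100.0 * (1.0 - variation_fraction(aligned_reference, aligned_subject))


def is_substantially_equivalent(aligned_reference, aligned_subject,
                                max_variation=0.35):
    """Apply the 'no more than about 35%' threshold from the text."""
    return variation_fraction(aligned_reference, aligned_subject) <= max_variation
```

For example, a ten-residue subject with a single substitution has a variation of 0.1 and therefore 90% identity under this definition. The sketch considers only the sequence-level criterion, not the functional-equivalence criterion also stated in the text.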


[0064] The term “totipotent” refers to the capability of a cell to differentiate into all of the cell types of an adult organism.


[0065] The term “transformation” means introducing DNA into a suitable host cell so that the DNA is replicable, either as an extrachromosomal element, or by chromosomal integration. The term “transfection” refers to the taking up of an expression vector by a suitable host cell, whether or not any coding sequences are in fact expressed. The term “infection” refers to the introduction of nucleic acids into a suitable host cell by use of a virus or viral vector.


[0066] As used herein, an “uptake modulating fragment,” UMF, means a series of nucleotides which mediate the uptake of a linked DNA fragment into a cell. UMFs can be readily identified using known UMFs as a target sequence or target motif with the computer-based systems described below. The presence and activity of a UMF can be confirmed by attaching the suspected UMF to a marker sequence. The resulting nucleic acid molecule is then incubated with an appropriate host under appropriate conditions and the uptake of the marker sequence is determined. As described above, a UMF will increase the frequency of uptake of a linked marker sequence.


[0067] Each of the above terms is meant to encompass all that is described for each, unless the context dictates otherwise.



NUCLEIC ACIDS OF THE INVENTION

[0068] Nucleotide sequences of the invention are set forth in the Sequence Listing.


[0069] The isolated polynucleotides of the invention include a polynucleotide comprising the nucleotide sequences of SEQ ID NO: 1-337, or 675-836; a polynucleotide encoding any one of the peptide sequences of SEQ ID NO: 1-337, or 675-836; and a polynucleotide comprising the nucleotide sequence encoding the mature protein coding sequence of the polynucleotides of any one of SEQ ID NO: 1-337, or 675-836. The polynucleotides of the present invention also include, but are not limited to, a polynucleotide that hybridizes under stringent conditions to (a) the complement of any of the nucleotide sequences of SEQ ID NO: 1-337, or 675-836; (b) nucleotide sequences encoding any one of the amino acid sequences set forth in the Sequence Listing, or Table 8; (c) a polynucleotide which is an allelic variant of any polynucleotide recited above; (d) a polynucleotide which encodes a species homologue of any of the proteins recited above; or (e) a polynucleotide that encodes a polypeptide comprising a specific domain or truncation of the polypeptides of SEQ ID NO: 338-674, or 837-998 (for example, as set forth in Tables 3, 5, 6, 8, or 9). Domains of interest may depend on the nature of the encoded polypeptide; e.g., domains in receptor-like polypeptides include ligand-binding, extracellular, transmembrane, or cytoplasmic domains, or combinations thereof; domains in immunoglobulin-like proteins include the variable immunoglobulin-like domains; domains in enzyme-like polypeptides include catalytic and substrate binding domains; and domains in ligand polypeptides include receptor-binding domains.


[0070] The polynucleotides of the invention include naturally occurring or wholly or partially synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The polynucleotides may include the entire coding region of the cDNA or may represent a portion of the coding region of the cDNA.


[0071] The present invention also provides genes corresponding to the cDNA sequences disclosed herein. The corresponding genes can be isolated in accordance with known methods using the sequence information disclosed herein. Such methods include the preparation of probes or primers from the disclosed sequence information for identification and/or amplification of genes in appropriate genomic libraries or other sources of genomic materials. Further 5′ and 3′ sequence can be obtained using methods known in the art. For example, full length cDNA or genomic DNA that corresponds to any of the polynucleotides of SEQ ID NO: 1-337, or 675-836 can be obtained by screening appropriate cDNA or genomic DNA libraries under suitable hybridization conditions using any of the polynucleotides of SEQ ID NO: 1-337, or 675-836 or a portion thereof as a probe. Alternatively, the polynucleotides of SEQ ID NO: 1-337, or 675-836 may be used as the basis for suitable primer(s) that allow identification and/or amplification of genes in appropriate genomic DNA or cDNA libraries.


[0072] The nucleic acid sequences of the invention can be assembled from ESTs and sequences (including cDNA and genomic sequences) obtained from one or more public databases, such as dbEST, gbpri, and UniGene. The EST sequences can provide identifying sequence information, representative fragment or segment information, or novel segment information for the full-length gene.


[0073] The polynucleotides of the invention also provide polynucleotides including nucleotide sequences that are substantially equivalent to the polynucleotides recited above. Polynucleotides according to the invention can have, e.g., at least about 65%, at least about 70%, at least about 75%, at least about 80%, 81%, 82%, 83%, 84%, more typically at least about 85%, 86%, 87%, 88%, 89%, more typically at least about 90%, 91%, 92%, 93%, 94%, and even more typically at least about 95%, 96%, 97%, 98%, 99% sequence identity to a polynucleotide recited above.


[0074] Included within the scope of the nucleic acid sequences of the invention are nucleic acid sequence fragments that hybridize under stringent conditions to any of the nucleotide sequences of SEQ ID NO: 1-337, or 675-836, or complements thereof, which fragment is greater than about 5 nucleotides, preferably 7 nucleotides, more preferably greater than 9 nucleotides and most preferably greater than 17 nucleotides. Fragments of, e.g. 15, 17, or 20 nucleotides or more that are selective for (i.e. specifically hybridize to) any one of the polynucleotides of the invention are contemplated. Probes capable of specifically hybridizing to a polynucleotide can differentiate polynucleotide sequences of the invention from other polynucleotide sequences in the same family of genes or can differentiate human genes from genes of other species, and are preferably based on unique nucleotide sequences.
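
As a rough illustration of choosing fragments based on unique nucleotide sequences, the following Python sketch lists fixed-length fragments of a target that do not occur verbatim in other sequences of the same family. Exact-match absence is only a crude stand-in for specific hybridization, and the function name and default length of 17 are illustrative assumptions.

```python
def selective_fragments(target, others, length=17):
    """Return fragments of `target` (in order, without duplicates) that do
    not appear as exact substrings of any sequence in `others`."""
    target = target.upper()
    others = [seq.upper() for seq in others]
    seen = set()
    result = []
    for start in range(len(target) - length + 1):
        fragment = target[start:start + length]
        if fragment in seen:
            continue
        seen.add(fragment)
        if not any(fragment in other for other in others):
            result.append(fragment)
    return result
```

A practical probe design would additionally consider mismatched hybridization, melting temperature, and both strands; none of that is modeled here.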


[0075] The sequences falling within the scope of the present invention are not limited to these specific sequences, but also include allelic and species variations thereof. Allelic and species variations can be routinely determined by comparing the sequence provided in SEQ ID NO: 1-337, or 675-836, a representative fragment thereof, or a nucleotide sequence at least 90% identical, preferably 95% identical, to SEQ ID NO: 1-337, or 675-836 with a sequence from another isolate of the same species. Furthermore, to accommodate codon variability, the invention includes nucleic acid molecules coding for the same amino acid sequences as do the specific ORFs disclosed herein. In other words, in the coding region of an ORF, substitution of one codon for another codon that encodes the same amino acid is expressly contemplated.
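
The codon-substitution point can be checked mechanically. The sketch below assumes the standard genetic code and DNA-alphabet ORFs whose length is a multiple of three; the function names and the example ORFs are hypothetical and are not sequences of the invention.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf):
    """Translate a DNA ORF codon by codon ('*' marks a stop codon)."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def encode_same_protein(orf_a, orf_b):
    """True if two ORFs differ only by synonymous codon choices."""
    return translate(orf_a) == translate(orf_b)


# GCT/GCC both encode Ala and CTG/TTA both encode Leu.
same = encode_same_protein("ATGGCTCTGTAA", "ATGGCCTTATAA")
```

Here the two example ORFs differ at the nucleotide level but translate to the same peptide, which is the situation the paragraph expressly contemplates.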


[0076] The nearest neighbor or homology results for the nucleic acids of the present invention, including SEQ ID NO: 1-337, or 675-836, can be obtained by searching a database using an algorithm or a program. Preferably, a BLAST (Basic Local Alignment Search Tool) program is used to search for local sequence alignments (Altschul, S. F. J. Mol. Evol. 36:290-300 (1993) and Altschul, S. F. et al. J. Mol. Biol. 215:403-410 (1990)). Alternatively, a FASTA version 3 search against Genpept, using the FASTXY algorithm, may be performed.


[0077] Species homologs (or orthologs) of the disclosed polynucleotides and proteins are also provided by the present invention. Species homologs may be isolated and identified by making suitable probes or primers from the sequences provided herein and screening a suitable nucleic acid source from the desired species.


[0078] The invention also encompasses allelic variants of the disclosed polynucleotides or proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide which also encode proteins which are identical, homologous or related to that encoded by the polynucleotides.


[0079] The nucleic acid sequences of the invention are further directed to sequences which encode variants of the described nucleic acids. These amino acid sequence variants may be prepared by methods known in the art by introducing appropriate nucleotide changes into a native or variant polynucleotide. There are two variables in the construction of amino acid sequence variants: the location of the mutation and the nature of the mutation. Nucleic acids encoding the amino acid sequence variants are preferably constructed by mutating the polynucleotide to encode an amino acid sequence that does not occur in nature. These nucleic acid alterations can be made at sites that differ in the nucleic acids from different species (variable positions) or in highly conserved regions (constant regions). Sites at such locations will typically be modified in series, e.g., by substituting first with conservative choices (e.g., hydrophobic amino acid to a different hydrophobic amino acid) and then with more distant choices (e.g., hydrophobic amino acid to a charged amino acid), and then deletions or insertions may be made at the target site. Amino acid sequence deletions generally range from about 1 to 30 residues, preferably about 1 to 10 residues, and are typically contiguous. Amino acid insertions include amino- and/or carboxyl-terminal fusions ranging in length from one to one hundred or more residues, as well as intrasequence insertions of single or multiple amino acid residues. Intrasequence insertions may range generally from about 1 to 10 amino residues, preferably from 1 to 5 residues. Examples of terminal insertions include the heterologous signal sequences necessary for secretion or for intracellular targeting in different host cells and sequences such as FLAG or poly-histidine sequences useful for purifying the expressed protein.


[0080] In a preferred method, polynucleotides encoding the novel amino acid sequences are changed via site-directed mutagenesis. This method uses oligonucleotide sequences to alter a polynucleotide to encode the desired amino acid variant, as well as sufficient adjacent nucleotides on both sides of the changed amino acid to form a stable duplex on either side of the site being changed. In general, the techniques of site-directed mutagenesis are well known to those of skill in the art and this technique is exemplified by publications such as Edelman et al., DNA 2:183 (1983). A versatile and efficient method for producing site-specific changes in a polynucleotide sequence was published by Zoller and Smith, Nucleic Acids Res. 10:6487-6500 (1982). PCR may also be used to create amino acid sequence variants of the novel nucleic acids. When small amounts of template DNA are used as starting material, primers that differ slightly in sequence from the corresponding region in the template DNA can generate the desired amino acid variant. PCR amplification results in a population of product DNA fragments that differ from the polynucleotide template encoding the polypeptide at the position specified by the primer. The product DNA fragments replace the corresponding region in the plasmid and this gives a polynucleotide encoding the desired amino acid variant.
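
The structure of a mutagenic oligonucleotide as described above (the changed codon flanked on both sides by template-matching nucleotides) can be sketched as follows. The function name, the default flank of 12 nucleotides, and the example template are assumptions for illustration; the text does not specify how many flanking nucleotides are sufficient.

```python
def mutagenic_oligo(template, codon_start, new_codon, flank=12):
    """Build an oligonucleotide that replaces one codon of `template`.

    template:    coding-strand DNA sequence
    codon_start: 0-based index of the first base of the codon to change
    new_codon:   three-base replacement codon
    flank:       template-matching bases kept on each side of the change
    """
    template = template.upper()
    new_codon = new_codon.upper()
    if len(new_codon) != 3:
        raise ValueError("new_codon must be exactly 3 bases")
    if codon_start - flank < 0 or codon_start + 3 + flank > len(template):
        raise ValueError("not enough flanking sequence in the template")
    left = template[codon_start - flank:codon_start]
    right = template[codon_start + 3:codon_start + 3 + flank]
    return left + new_codon + right


# Hypothetical template; the GAA codon at index 15 is changed to GCA.
example_template = "ATGGCTAGCAAAGGAGAAGAACTTTTCACTGGA"
oligo = mutagenic_oligo(example_template, 15, "GCA", flank=6)
```

The resulting oligonucleotide matches the template everywhere except the substituted codon, so each flank can pair with the template on its side of the change.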


[0081] A further technique for generating amino acid variants is the cassette mutagenesis technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis techniques well known in the art, such as, for example, the techniques in Sambrook et al., supra, and Current Protocols in Molecular Biology, Ausubel et al. Due to the inherent degeneracy of the genetic code, other DNA sequences which encode substantially the same or a functionally equivalent amino acid sequence may be used in the practice of the invention for the cloning and expression of these novel nucleic acids. Such DNA sequences include those which are capable of hybridizing to the appropriate novel nucleic acid sequence under stringent conditions.


[0082] Polynucleotides encoding preferred polypeptide truncations of the invention could be used to generate polynucleotides encoding chimeric or fusion proteins comprising one or more domains of the invention and heterologous protein sequences.


[0083] The polynucleotides of the invention additionally include the complement of any of the polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA, amplified, or synthetic) or RNA. Methods and algorithms for obtaining such polynucleotides are well known to those of skill in the art and can include, for example, methods for determining hybridization conditions that can routinely isolate polynucleotides of the desired sequence identities.


[0084] In accordance with the invention, polynucleotide sequences comprising the mature protein coding sequences corresponding to any one of SEQ ID NO: 1-337, or 675-836, or functional equivalents thereof, may be used to generate recombinant DNA molecules that direct the expression of that nucleic acid, or a functional equivalent thereof, in appropriate host cells. Also included are the cDNA inserts of any of the clones identified herein.


[0085] A polynucleotide according to the invention can be joined to any of a variety of other nucleotide sequences by well-established recombinant DNA techniques (see Sambrook J et al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, NY). Useful nucleotide sequences for joining to polynucleotides include an assortment of vectors, e.g., plasmids, cosmids, lambda phage derivatives, phagemids, and the like, that are well known in the art. Accordingly, the invention also provides a vector including a polynucleotide of the invention and a host cell containing the polynucleotide. In general, the vector contains an origin of replication functional in at least one organism, convenient restriction endonuclease sites, and a selectable marker for the host cell. Vectors according to the invention include expression vectors, replication vectors, probe generation vectors, and sequencing vectors. A host cell according to the invention can be a prokaryotic or eukaryotic cell and can be a unicellular organism or part of a multicellular organism.


[0086] The present invention further provides recombinant constructs comprising a nucleic acid having any of the nucleotide sequences of SEQ ID NO: 1-337, or 675-836 or a fragment thereof or any other polynucleotides of the invention. In one embodiment, the recombinant constructs of the present invention comprise a vector, such as a plasmid or viral vector, into which a nucleic acid having any of the nucleotide sequences of SEQ ID NO: 1-337, or 675-836 or a fragment thereof is inserted, in a forward or reverse orientation. In the case of a vector comprising one of the ORFs of the present invention, the vector may further comprise regulatory sequences, including for example, a promoter, operably linked to the ORF. Large numbers of suitable vectors and promoters are known to those of skill in the art and are commercially available for generating the recombinant constructs of the present invention. The following vectors are provided by way of example: Bacterial: pBs, phagescript, PsiX174, pBluescript SK, pBs KS, pNH8a, pNH16a, pNH18a, pNH46a (Stratagene), pTrc99A, pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia); Eukaryotic: pWLneo, pSV2cat, pOG44, PXTI, pSG (Stratagene) pSVK3, pBPV, pMSG, pSVL (Pharmacia).


[0087] The isolated polynucleotide of the invention may be operably linked to an expression control sequence such as the pMT2 or pED expression vectors disclosed in Kaufman et al., Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein recombinantly. Many suitable expression control sequences are known in the art. General methods of expressing recombinant proteins are also known and are exemplified in R. Kaufman, Methods in Enzymology 185, 537-566 (1990). As defined herein “operably linked” means that the isolated polynucleotide of the invention and an expression control sequence are situated within a vector or cell in such a way that the protein is expressed by a host cell which has been transformed (transfected) with the ligated polynucleotide/expression control sequence.


[0088] Promoter regions can be selected from any desired gene using CAT (chloramphenicol acetyltransferase) vectors or other vectors with selectable markers. Two appropriate vectors are pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt, lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of the appropriate vector and promoter is well within the level of ordinary skill in the art. Generally, recombinant expression vectors will include origins of replication and selectable markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli and S. cerevisiae TRP1 gene, and a promoter derived from a highly expressed gene to direct transcription of a downstream structural sequence. Such promoters can be derived from operons encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid phosphatase, or heat shock proteins, among others. The heterologous structural sequence is assembled in appropriate phase with translation initiation and termination sequences, and preferably, a leader sequence capable of directing secretion of translated protein into the periplasmic space or extracellular medium. Optionally, the heterologous sequence can encode a fusion protein including an amino terminal identification peptide imparting desired characteristics, e.g., stabilization or simplified purification of expressed recombinant product. Useful expression vectors for bacterial use are constructed by inserting a structural DNA sequence encoding a desired protein together with suitable translation initiation and termination signals in operable reading phase with a functional promoter. The vector will comprise one or more phenotypic selectable markers and an origin of replication to ensure maintenance of the vector and to, if desirable, provide amplification within the host.
Suitable prokaryotic hosts for transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and various species within the genera Pseudomonas, Streptomyces, and Staphylococcus, although others may also be employed as a matter of choice.


[0089] As a representative but non-limiting example, useful expression vectors for bacterial use can comprise a selectable marker and bacterial origin of replication derived from commercially available plasmids comprising genetic elements of the well known cloning vector pBR322 (ATCC 37017). Such commercial vectors include, for example, pKK223-3 (Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM 1 (Promega Biotech, Madison, Wis., USA). These pBR322 “backbone” sections are combined with an appropriate promoter and the structural sequence to be expressed. Following transformation of a suitable host strain and growth of the host strain to an appropriate cell density, the selected promoter is induced or derepressed by appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an additional period. Cells are typically harvested by centrifugation, disrupted by physical or chemical means, and the resulting crude extract retained for further purification.


[0090] Polynucleotides of the invention can also be used to induce immune responses. For example, as described in Fan et al., Nat. Biotech 17, 870-872 (1999), incorporated herein by reference, nucleic acid sequences encoding a polypeptide may be used to generate antibodies against the encoded polypeptide following topical administration of naked plasmid DNA or following injection, and preferably intra-muscular injection of the DNA. The nucleic acid sequences are preferably inserted in a recombinant expression vector and may be in the form of naked DNA.



Antisense

[0091] Another aspect of the invention pertains to isolated antisense nucleic acid molecules that are hybridizable to or complementary to the nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO: 1-337, or 675-836, or fragments, analogs or derivatives thereof. An “antisense” nucleic acid comprises a nucleotide sequence that is complementary to a “sense” nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. In specific aspects, antisense nucleic acid molecules are provided that comprise a sequence complementary to at least about 10, 25, 50, 100, 250 or 500 nucleotides or an entire coding strand, or to only a portion thereof. Nucleic acid molecules encoding fragments, homologs, derivatives and analogs of a protein of any of SEQ ID NO: 1-337, or 675-836 or antisense nucleic acids complementary to a nucleic acid sequence of SEQ ID NO: 1-337, or 675-836 are additionally provided.


[0092] In one embodiment, an antisense nucleic acid molecule is antisense to a “coding region” of the coding strand of a nucleotide sequence of the invention. The term “coding region” refers to the region of the nucleotide sequence comprising codons which are translated into amino acid residues. In another embodiment, the antisense nucleic acid molecule is antisense to a “noncoding region” of the coding strand of a nucleotide sequence of the invention. The term “noncoding region” refers to 5′ and 3′ sequences that flank the coding region that are not translated into amino acids (i.e., also referred to as 5′ and 3′ untranslated regions).


[0093] Given the coding strand sequences encoding a nucleic acid disclosed herein (e.g., SEQ ID NO: 1-337, or 675-836), antisense nucleic acids of the invention can be designed according to the rules of Watson and Crick or Hoogsteen base pairing. The antisense nucleic acid molecule can be complementary to the entire coding region of an mRNA, but more preferably is an oligonucleotide that is antisense to only a portion of the coding or noncoding region of an mRNA. For example, the antisense oligonucleotide can be complementary to the region surrounding the translation start site of an mRNA. An antisense oligonucleotide can be, for example, about 5, 10, 15, 20, 25, 30, 35, 40, 45 or 50 nucleotides in length. An antisense nucleic acid of the invention can be constructed using chemical synthesis or enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used.
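
Designing an antisense oligonucleotide by Watson-Crick rules amounts to taking the reverse complement of the targeted stretch of mRNA. The following Python sketch shows this for a region around a translation start site; the function name and the example mRNA are hypothetical, the output is written as unmodified DNA, and Hoogsteen pairing and modified nucleotides are not modeled.

```python
# Watson-Crick partners, written as a DNA oligo (U or T in the target pairs with A).
_COMPLEMENT = str.maketrans("ACGUT", "TGCAA")


def antisense_oligo(mrna, start, length):
    """Return a DNA oligo (5'->3') complementary to mrna[start:start+length]."""
    target = mrna.upper()[start:start + length]
    if start < 0 or len(target) != length:
        raise ValueError("target region falls outside the mRNA")
    # Complement each base, then reverse so the oligo reads 5'->3'.
    return target.translate(_COMPLEMENT)[::-1]


# Hypothetical mRNA fragment with its AUG start codon at index 6.
example_mrna = "GCCACCAUGGCUAGC"
start_site = example_mrna.find("AUG")
oligo = antisense_oligo(example_mrna, start_site - 3, 9)
```

In this example the nine-base oligo spans three bases upstream of the start codon, the start codon itself, and three bases downstream.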


[0094] Examples of modified nucleotides that can be used to generate the antisense nucleic acid include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine, 5′-methoxycarboxymethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the antisense nucleic acid can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described further in the following subsection).


[0095] The antisense nucleic acid molecules of the invention are typically administered to a subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or genomic DNA encoding a protein according to the invention to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of an antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions in the major groove of the double helix. An example of a route of administration of antisense nucleic acid molecules of the invention includes direct injection at a tissue site. Alternatively, antisense nucleic acid molecules can be modified to target selected cells and then administered systemically. For example, for systemic administration, antisense molecules can be modified such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using the vectors described herein. To achieve sufficient intracellular concentrations of antisense molecules, vector constructs in which the antisense nucleic acid molecule is placed under the control of a strong pol II or pol III promoter are preferred.


[0096] In yet another embodiment, the antisense nucleic acid molecule of the invention is an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15: 6625-6641). The antisense nucleic acid molecule can also comprise a 2′-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res 15: 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett 215: 327-330).



Ribozymes and PNA Moieties

[0097] In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a single-stranded nucleic acid, such as an mRNA, to which they have a complementary region. Thus, ribozymes (e.g., hammerhead ribozymes (described in Haseloff and Gerlach (1988) Nature 334:585-591)) can be used to catalytically cleave mRNA transcripts to thereby inhibit translation of an mRNA. A ribozyme having specificity for a nucleic acid of the invention can be designed based upon the nucleotide sequence of a DNA disclosed herein (i.e., SEQ ID NO: 1-337, or 675-836). For example, a derivative of Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide sequence of the active site is complementary to the nucleotide sequence to be cleaved in an mRNA. See, e.g., Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, mRNA of the invention can be used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel et al., (1993) Science 261:1411-1418.


[0098] Alternatively, gene expression can be inhibited by targeting nucleotide sequences complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple helical structures that prevent transcription of the gene in target cells. See generally, Helene (1991) Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and Maher (1992) Bioessays 14: 807-15.


[0099] In various embodiments, the nucleic acids of the invention can be modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic acids can be modified to generate peptide nucleic acids (see Hyrup et al. (1996) Bioorg Med Chem 4: 5-23). As used herein, the terms “peptide nucleic acids” or “PNAs” refer to nucleic acid mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under conditions of low ionic strength. The synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis protocols as described in Hyrup et al. (1996) above; Perry-O'Keefe et al. (1996) PNAS 93: 14670-675.


[0100] PNAs of the invention can be used in therapeutic and diagnostic applications. For example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of gene expression by, e.g., inducing transcription or translation arrest or inhibiting replication. PNAs of the invention can also be used, e.g., in the analysis of single base pair mutations in a gene by, e.g., PNA directed PCR clamping; as artificial restriction enzymes when used in combination with other enzymes, e.g., S1 nucleases (Hyrup B. (1996) above); or as probes or primers for DNA sequence and hybridization (Hyrup et al. (1996), above; Perry-O'Keefe (1996), above).


[0101] In another embodiment, PNAs of the invention can be modified, e.g., to enhance their stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug delivery known in the art. For example, PNA-DNA chimeras can be generated that may combine the advantageous properties of PNA and DNA. Such chimeras allow DNA recognition enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA portion while the PNA portion would provide high binding affinity and specificity. PNA-DNA chimeras can be linked using linkers of appropriate lengths selected in terms of base stacking, number of bonds between the nucleobases, and orientation (Hyrup (1996) above). The synthesis of PNA-DNA chimeras can be performed as described in Hyrup (1996) above and Finn et al. (1996) Nucl Acids Res 24: 3357-63. For example, a DNA chain can be synthesized on a solid support using standard phosphoramidite coupling chemistry, and modified nucleoside analogs, e.g., 5′-(4-methoxytrityl)amino-5′-deoxy-thymidine phosphoramidite, can be used between the PNA and the 5′ end of DNA (Mag et al. (1989) Nucl Acid Res 17: 5973-88). PNA monomers are then coupled in a stepwise manner to produce a chimeric molecule with a 5′ PNA segment and a 3′ DNA segment (Finn et al. (1996) above). Alternatively, chimeric molecules can be synthesized with a 5′ DNA segment and a 3′ PNA segment. See, Petersen et al. (1995) Bioorg Med Chem Lett 5: 1119-1124.


[0102] In other embodiments, the oligonucleotide may include other appended groups such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. U.S.A. 86:6553-6556; Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT Publication No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition, oligonucleotides can be modified with hybridization-triggered cleavage agents (see, e.g., Krol et al., 1988, BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon, 1988, Pharm. Res. 5: 539-549). To this end, the oligonucleotide may be conjugated to another molecule, e.g., a peptide, a hybridization-triggered cross-linking agent, a transport agent, a hybridization-triggered cleavage agent, etc.



Hosts

[0103] The present invention further provides host cells genetically engineered to contain the polynucleotides of the invention. For example, such host cells may contain nucleic acids of the invention introduced into the host cell using known transformation, transfection or infection methods. The present invention still further provides host cells genetically engineered to express the polynucleotides of the invention, wherein such polynucleotides are in operative association with a regulatory sequence heterologous to the host cell which drives expression of the polynucleotides in the cell.


[0104] Knowledge of nucleic acid sequences allows for modification of cells to permit, or increase, expression of endogenous polypeptide. Cells can be modified (e.g., by homologous recombination) to provide increased polypeptide expression by replacing, in whole or in part, the naturally occurring promoter with all or part of a heterologous promoter so that the cells express the polypeptide at higher levels. The heterologous promoter is inserted in such a manner that it is operatively linked to the encoding sequences. See, for example, PCT International Publication No. WO94/12650, PCT International Publication No. WO92/20808, and PCT International Publication No. WO91/09955. It is also contemplated that, in addition to heterologous promoter DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or intron DNA may be inserted along with the heterologous promoter DNA. If linked to the coding sequence, amplification of the marker DNA by standard selection methods results in co-amplification of the desired protein coding sequences in the cells.


[0105] The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a bacterial cell. Introduction of the recombinant construct into the host cell can be effected by calcium phosphate transfection, DEAE-dextran mediated transfection, or electroporation (Davis, L. et al., Basic Methods in Molecular Biology (1986)). The host cells containing one of the polynucleotides of the invention can be used in conventional manners to produce the gene product encoded by the isolated fragment (in the case of an ORF) or can be used to produce a heterologous protein under the control of the EMF.


[0106] Any host/vector system can be used to express one or more of the ORFs of the present invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells, COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and B. subtilis. The most preferred cells are those which do not normally express the particular polypeptide or protein or which express the polypeptide or protein at a low natural level. Mature proteins can be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate promoters. Cell-free translation systems can also be employed to produce such proteins using RNAs derived from the DNA constructs of the present invention. Appropriate cloning and expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook, et al., in Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, N.Y. (1989), the disclosure of which is hereby incorporated by reference.


[0107] Various mammalian cell culture systems can also be employed to express recombinant protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney fibroblasts, described by Gluzman, Cell 23:175 (1981). Other cell lines capable of expressing a compatible vector are, for example, C127 cells, monkey COS cells, Chinese Hamster Ovary (CHO) cells, human kidney 293 cells, human epidermal A431 cells, human Colo205 cells, 3T3 cells, CV-1 cells, other transformed primate cell lines, normal diploid cells, cell strains derived from in vitro culture of primary tissue, primary explants, HeLa cells, mouse L cells, BHK, HL-60, U937, HaK or Jurkat cells. Mammalian expression vectors will comprise an origin of replication, a suitable promoter and also any necessary ribosome binding sites, polyadenylation site, splice donor and acceptor sites, transcriptional termination sequences, and 5′ flanking nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for example, SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide the required nontranscribed genetic elements. Recombinant polypeptides and proteins produced in bacterial culture are usually isolated by initial extraction from cell pellets, followed by one or more salting-out, aqueous ion exchange or size exclusion chromatography steps. Protein refolding steps can be used, as necessary, in completing configuration of the mature protein. Finally, high performance liquid chromatography (HPLC) can be employed for final purification steps. Microbial cells employed in expression of proteins can be disrupted by any convenient method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing agents.


[0108] Alternatively, it may be possible to produce the protein in lower eukaryotes such as yeast or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains include Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, Candida, or any yeast strain capable of expressing heterologous proteins. Potentially suitable bacterial strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, or any bacterial strain capable of expressing heterologous proteins. If the protein is made in yeast or bacteria, it may be necessary to modify the protein produced therein, for example by phosphorylation or glycosylation of the appropriate sites, in order to obtain the functional protein. Such covalent attachments may be accomplished using known chemical or enzymatic methods.


[0109] In another embodiment of the present invention, cells and tissues may be engineered to express an endogenous gene comprising the polynucleotides of the invention under the control of inducible regulatory elements, in which case the regulatory sequences of the endogenous gene may be replaced by homologous recombination. As described herein, gene targeting can be used to replace a gene's existing regulatory region with a regulatory sequence isolated from a different gene or a novel regulatory sequence synthesized by genetic engineering methods. Such regulatory sequences may be comprised of promoters, enhancers, scaffold-attachment regions, negative regulatory elements, transcriptional initiation sites, and regulatory protein binding sites or combinations of said sequences. Alternatively, sequences which affect the structure or stability of the RNA or protein produced may be replaced, removed, added, or otherwise modified by targeting. These sequences include polyadenylation signals, mRNA stability elements, splice sites, leader sequences for enhancing or modifying transport or secretion properties of the protein, or other sequences which alter or improve the function or stability of protein or RNA molecules.


[0110] The targeting event may be a simple insertion of the regulatory sequence, placing the gene under the control of the new regulatory sequence, e.g., inserting a new promoter or enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple deletion of a regulatory element, such as the deletion of a tissue-specific negative regulatory element. Alternatively, the targeting event may replace an existing element; for example, a tissue-specific enhancer can be replaced by an enhancer that has broader or different cell-type specificity than the naturally occurring elements. Here, the naturally occurring sequences are deleted and new sequences are added. In all cases, the identification of the targeting event may be facilitated by the use of one or more selectable marker genes that are contiguous with the targeting DNA, allowing for the selection of cells in which the exogenous DNA has integrated into the host cell genome. The identification of the targeting event may also be facilitated by the use of one or more marker genes exhibiting the property of negative selection, such that the negatively selectable marker is linked to the exogenous DNA, but configured such that the negatively selectable marker flanks the targeting sequence, and such that a correct homologous recombination event with sequences in the host cell genome does not result in the stable integration of the negatively selectable marker. Markers useful for this purpose include the Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-guanine phosphoribosyl-transferase (gpt) gene.


[0111] The gene targeting or gene activation techniques which can be used in accordance with this aspect of the invention are more particularly described in U.S. Pat. No. 5,272,071 to Chappel; U.S. Pat. No. 5,578,461 to Sherwin et al.; International Application No. PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No. PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by reference herein in its entirety.



Polypeptides of the Invention

[0112] The isolated polypeptides of the invention include, but are not limited to, a polypeptide comprising: the amino acid sequences set forth as any one of SEQ ID NO: 338-674, or 837-998 or an amino acid sequence encoded by any one of the nucleotide sequences SEQ ID NO: 1-337, or 675-836 or the corresponding full length or mature protein. Polypeptides of the invention also include polypeptides preferably with biological or immunological activity that are encoded by: (a) a polynucleotide having any one of the nucleotide sequences set forth in SEQ ID NO: 1-337, or 675-836 or (b) polynucleotides encoding any one of the amino acid sequences set forth as SEQ ID NO: 338-674, or 837-998 or (c) polynucleotides that hybridize to the complement of the polynucleotides of either (a) or (b) under stringent hybridization conditions. The invention also provides biologically active or immunologically active variants of any of the amino acid sequences set forth as SEQ ID NO: 338-674, or 837-998 or the corresponding full length or mature protein; and “substantial equivalents” thereof (e.g., with at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, 86%, 87%, 88%, 89%, at least about 90%, 91%, 92%, 93%, 94%, typically at least about 95%, 96%, 97%, more typically at least about 98%, or most typically at least about 99% amino acid identity) that retain biological activity. Polypeptides encoded by allelic variants may have a similar, increased, or decreased activity compared to polypeptides comprising SEQ ID NO: 338-674, or 837-998.
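The percent amino acid identity thresholds recited above can be made concrete with a minimal sketch. It assumes the two sequences have already been aligned to equal length with "-" marking gaps, and it assumes one common convention (identical residues divided by aligned columns); other conventions exist and the source does not specify one. The function name and example sequences are hypothetical.

```python
# Illustrative sketch only: assumes pre-aligned, equal-length sequences with
# "-" for gaps, and defines identity as identical residues / aligned columns.
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned amino acid sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences carry a gap.
    columns = [(x, y) for x, y in zip(aligned_a.upper(), aligned_b.upper())
               if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    reference = "MKTAYIAKQR"  # hypothetical sequences, not from the listing
    variant = "MKTAYLAKQR"
    print(percent_identity(reference, variant))
```

Under this convention, a variant would meet the "at least about 90%" threshold when the returned value is 90.0 or higher.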


[0113] Fragments of the proteins of the present invention which are capable of exhibiting biological activity are also encompassed by the present invention. Fragments of the protein may be in linear form or they may be cyclized using known methods, for example, as described in H. U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in R. S. McDowell, et al., J. Amer. Chem. Soc. 114, 9245-9253 (1992), both of which are incorporated herein by reference. Such fragments may be fused to carrier molecules such as immunoglobulins for many purposes, including increasing the valency of protein binding sites. Fragments are also identified in Tables 3, 5, 6, 8, or 9.


[0114] The present invention also provides both full-length and mature forms (for example, without a signal sequence or precursor sequence) of the disclosed proteins. The protein coding sequence is identified in the sequence listing by translation of the disclosed nucleotide sequences. The predicted signal sequence is set forth in Table 6. The mature form of such protein may be obtained and confirmed by expression of a full-length polynucleotide in a suitable mammalian cell or other host cell and sequencing of the cleaved product. One of skill in the art will recognize that the actual cleavage site may be different than that predicted in Table 6. The sequence of the mature form of the protein is also determinable from the amino acid sequence of the full-length form. Where proteins of the present invention are membrane bound, soluble forms of the proteins are also provided. In such forms, part or all of the regions causing the proteins to be membrane bound are deleted so that the proteins are fully secreted from the cell in which they are expressed (See, e.g., Sakal et al., Prep. Biochem. Biotechnol. (2000), 30(2), pp. 107-23, incorporated herein by reference).


[0115] Protein compositions of the present invention may further comprise an acceptable carrier, such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.


[0116] The present invention further provides isolated polypeptides encoded by the nucleic acid fragments of the present invention or by degenerate variants of the nucleic acid fragments of the present invention. By “degenerate variant” is intended nucleotide fragments which differ from a nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide sequence but, due to the degeneracy of the genetic code, encode an identical polypeptide sequence. Preferred nucleic acid fragments of the present invention are the ORFs that encode proteins.
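The notion of a degenerate variant can be illustrated with a short sketch that translates two different nucleotide sequences using the standard genetic code and checks that they encode the same polypeptide. The function names and the example ORFs are hypothetical and are not drawn from the sequence listing.

```python
# Illustrative sketch only: uses the standard genetic code; the ORFs are made up.
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Map each codon (in TCAG order) to its one-letter amino acid; "*" is a stop.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_degenerate_variant(orf_a: str, orf_b: str) -> bool:
    """True if the sequences differ at the nucleotide level but encode the same polypeptide."""
    return orf_a.upper() != orf_b.upper() and translate(orf_a) == translate(orf_b)


if __name__ == "__main__":
    orf = "ATGGCTAAATGA"      # Met-Ala-Lys-stop
    variant = "ATGGCCAAGTGA"  # synonymous codons for Ala and Lys
    print(translate(orf), is_degenerate_variant(orf, variant))
```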


[0117] A variety of methodologies known in the art can be utilized to obtain any one of the isolated polypeptides or proteins of the present invention. At the simplest level, the amino acid sequence can be synthesized using commercially available peptide synthesizers. The synthetically-constructed protein sequences, by virtue of sharing primary, secondary or tertiary structural and/or conformational characteristics with proteins may possess biological properties in common therewith, including protein activity. This technique is particularly useful in producing small peptides and fragments of larger polypeptides. Fragments are useful, for example, in generating antibodies against the native polypeptide. Thus, they may be employed as biologically active or immunological substitutes for natural, purified proteins in screening of therapeutic compounds and in immunological processes for the development of antibodies.


[0118] The polypeptides and proteins of the present invention can alternatively be purified from cells which have been altered to express the desired polypeptide or protein. As used herein, a cell is said to be altered to express a desired polypeptide or protein when the cell, through genetic manipulation, is made to produce a polypeptide or protein which it normally does not produce or which the cell normally produces at a lower level. One skilled in the art can readily adapt procedures for introducing and expressing either recombinant or synthetic sequences into eukaryotic or prokaryotic cells in order to generate a cell which produces one of the polypeptides or proteins of the present invention.


[0119] The invention also relates to methods for producing a polypeptide comprising growing a culture of host cells of the invention in a suitable culture medium, and purifying the protein from the cells or the culture in which the cells are grown. For example, the methods of the invention include a process for producing a polypeptide in which a host cell containing a suitable expression vector that includes a polynucleotide of the invention is cultured under conditions that allow expression of the encoded polypeptide. The polypeptide can be recovered from the culture, conveniently from the culture medium, or from a lysate prepared from the host cells and further purified. Preferred embodiments include those in which the protein produced by such process is a full length or mature form of the protein.


[0120] In an alternative method, the polypeptide or protein is purified from bacterial cells which naturally produce the polypeptide or protein. One skilled in the art can readily follow known methods for isolating polypeptides and proteins in order to obtain one of the isolated polypeptides or proteins of the present invention. These include, but are not limited to, immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography, and immuno-affinity chromatography. See, e.g., Scopes, Protein Purification: Principles and Practice, Springer-Verlag (1994); Sambrook, et al., in Molecular Cloning: A Laboratory Manual; Ausubel et al., Current Protocols in Molecular Biology. Polypeptide fragments that retain biological/immunological activity include fragments comprising greater than about 100 amino acids, or greater than about 200 amino acids, and fragments that encode specific protein domains.


[0121] The purified polypeptides can be used in in vitro binding assays which are well known in the art to identify molecules which bind to the polypeptides. These molecules include, but are not limited to, small molecules, molecules from combinatorial libraries, antibodies or other proteins. The molecules identified in the binding assay are then tested for antagonist or agonist activity in in vivo tissue culture or animal models that are well known in the art. In brief, the molecules are titrated into a plurality of cell cultures or animals and then tested for either cell/animal death or prolonged survival of the animal/cells.


[0122] In addition, the peptides of the invention or molecules capable of binding to the peptides may be complexed with toxins, e.g., ricin or cholera, or with other compounds that are toxic to cells. The toxin-binding molecule complex is then targeted to a tumor or other cell by the specificity of the binding molecule for SEQ ID NO: 338-674, or 837-998.


[0123] The protein of the invention may also be expressed as a product of transgenic animals, e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are characterized by somatic or germ cells containing a nucleotide sequence encoding the protein.


[0124] The proteins provided herein also include proteins characterized by amino acid sequences similar to those of purified proteins but into which modifications are naturally provided or deliberately engineered. For example, modifications, in the peptide or DNA sequence, can be made by those skilled in the art using known techniques. Modifications of interest in the protein sequences may include the alteration, substitution, replacement, insertion or deletion of a selected amino acid residue in the coding sequence. For example, one or more of the cysteine residues may be deleted or replaced with another amino acid to alter the conformation of the molecule. Techniques for such alteration, substitution, replacement, insertion or deletion are well known to those skilled in the art (see, e.g., U.S. Pat. No. 4,518,584). Preferably, such alteration, substitution, replacement, insertion or deletion retains the desired activity of the protein. Regions of the protein that are important for the protein function can be determined by various methods known in the art including the alanine-scanning method, which involves systematic substitution of single or strings of amino acids with alanine, followed by testing the resulting alanine-containing variant for biological activity. This type of analysis determines the importance of the substituted amino acid(s) in biological activity. Regions of the protein that are important for protein function may be determined by the eMATRIX program.
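The design step of alanine scanning, i.e., enumerating the variants to be tested, can be sketched as follows. This minimal example covers only single-residue substitutions and does not model the subsequent activity assay; the function name, the mutation labels, and the example peptide are hypothetical.

```python
# Illustrative sketch only: enumerates single-residue alanine substitutions.
# The activity testing described in the text is an experimental step and is not modeled.
from typing import List, Tuple


def alanine_scan(protein: str) -> List[Tuple[str, str]]:
    """Return (label, variant) pairs, one per non-alanine position.

    Labels use the conventional form <original residue><1-based position>A.
    """
    protein = protein.upper()
    variants = []
    for i, residue in enumerate(protein):
        if residue == "A":
            continue  # substituting alanine with alanine is uninformative
        variants.append((f"{residue}{i + 1}A", protein[:i] + "A" + protein[i + 1:]))
    return variants


if __name__ == "__main__":
    for label, variant in alanine_scan("MCKAW"):  # hypothetical peptide
        print(label, variant)
```

Each variant produced this way would then be expressed and assayed, and a loss of activity would suggest that the substituted residue is important for function.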


[0125] Other fragments and derivatives of the sequences of proteins which would be expected to retain protein activity in whole or in part and are useful for screening or other immunological methodologies may also be easily made by those skilled in the art given the disclosures herein. Such modifications are encompassed by the present invention.


[0126] The protein may also be produced by operably linking the isolated polynucleotide of the invention to suitable control sequences in one or more insect expression vectors, and employing an insect expression system. Materials and methods for baculovirus/insect cell expression systems are commercially available in kit form from, e.g., Invitrogen, San Diego, Calif., U.S.A. (the MaxBac™ kit), and such methods are well known in the art, as described in Summers and Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987), incorporated herein by reference. As used herein, an insect cell capable of expressing a polynucleotide of the present invention is “transformed.”


[0127] The protein of the invention may be prepared by culturing transformed host cells under culture conditions suitable to express the recombinant protein. The resulting expressed protein may then be purified from such culture (i.e., from culture medium or cell extracts) using known purification processes, such as gel filtration and ion exchange chromatography. The purification of the protein may also include an affinity column containing agents which will bind to the protein; one or more column steps over such affinity resins as concanavalin A-agarose, heparin-toyopearl™ or Cibacron blue 3GA Sepharose™; one or more steps involving hydrophobic interaction chromatography using such resins as phenyl ether, butyl ether, or propyl ether; or immunoaffinity chromatography.


[0128] Alternatively, the protein of the invention may also be expressed in a form which will facilitate purification. For example, it may be expressed as a fusion protein, such as those of maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX), or with a His tag. Kits for expression and purification of such fusion proteins are commercially available from New England BioLabs (Beverly, Mass.), Pharmacia (Piscataway, N.J.) and Invitrogen, respectively. The protein can also be tagged with an epitope and subsequently purified by using a specific antibody directed to such epitope. One such epitope (“FLAG®”) is commercially available from Kodak (New Haven, Conn.).


[0129] Finally, one or more reverse-phase high performance liquid chromatography (RP-HPLC) steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant methyl or other aliphatic groups, can be employed to further purify the protein. Some or all of the foregoing purification steps, in various combinations, can also be employed to provide a substantially homogeneous isolated recombinant protein. The protein thus purified is substantially free of other mammalian proteins and is defined in accordance with the present invention as an “isolated protein.”


[0130] The polypeptides of the invention include analogs (variants). This embraces fragments, as well as peptides in which one or more amino acids have been deleted, inserted, or substituted. Also, analogs of the polypeptides of the invention embrace fusions of the polypeptides or modifications of the polypeptides of the invention, wherein the polypeptide or analog is fused to another moiety or moieties, e.g., targeting moiety or another therapeutic agent. Such analogs may exhibit improved properties such as activity and/or stability. Examples of moieties which may be fused to the polypeptide or an analog include, for example, targeting moieties which provide for the delivery of polypeptide to pancreatic cells, e.g., antibodies to pancreatic cells, antibodies to immune cells such as T-cells, monocytes, dendritic cells, granulocytes, etc., as well as receptors and ligands expressed on pancreatic or immune cells. Other moieties which may be fused to the polypeptide include therapeutic agents which are used for treatment, for example, immunosuppressive drugs such as cyclosporin, FK506, azathioprine, CD3 antibodies and steroids. Also, polypeptides may be fused to immune modulators, and other cytokines such as alpha or beta interferon.



Determining Polypeptide and Polynucleotide Identity and Similarity

[0131] Preferred methods to determine identity and/or similarity are designed to give the largest match between the sequences tested. Methods to determine identity and similarity are codified in computer programs including, but not limited to, the GCG program package, including GAP (Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics Computer Group, University of Wisconsin, Madison, Wis.), BLASTP, BLASTN, BLASTX, FASTA (Altschul, S. F. et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul S. F. et al., Nucleic Acids Res. vol. 25, pp. 3389-3402, herein incorporated by reference), eMatrix software (Wu et al., J. Comp. Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), eMotif software (Nevill-Manning et al., ISMB-97, Vol. 4, pp. 202-209, herein incorporated by reference), Pfam software (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1), pp. 320-322 (1998), herein incorporated by reference), the Kyte-Doolittle hydrophobicity prediction algorithm (J. Mol. Biol., 157, pp. 105-31 (1982)), the GeneAtlas software (Molecular Simulations Inc. (MSI), San Diego, Calif.) (Sanchez and Sali (1998) Proc. Natl. Acad. Sci., 95, 13597-13602; Kitson D. H. et al., (2000) “Remote homology detection using structural modeling - an evaluation” Submitted; Fischer and Eisenberg (1996) Protein Sci. 5, 947-955), and the Neural Network SignalP V1.1 program (from Center for Biological Sequence Analysis, The Technical University of Denmark), each incorporated herein by reference. Polypeptide sequences were examined by a proprietary algorithm, SeqLoc, that separates the proteins into three sets of locales: intracellular, membrane, or secreted. This prediction is based upon three characteristics of each polypeptide: the percentage of cysteine residues, the Kyte-Doolittle scores for the first 20 amino acids of each protein, and the Kyte-Doolittle scores used to calculate the longest hydrophobic stretch of the protein. Values of predicted proteins are compared against the values from a set of 592 proteins of known cellular localization from the Swissprot database (http://www.expasy.ch/sprot). Predictions are based upon maximum likelihood estimation.
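
By way of illustration only, the three sequence characteristics named above (cysteine content, Kyte-Doolittle score of the first 20 residues, and longest hydrophobic stretch) could be computed as in the following Python sketch. This is not the proprietary SeqLoc algorithm, whose details are not given here. The function names, the use of a mean score for the N-terminal window, the treatment of unknown residues as scoring 0, and the definition of a hydrophobic stretch as a run of residues with a positive score are all assumptions made for the example; only the hydropathy values are taken from Kyte and Doolittle (1982).

```python
# Kyte-Doolittle hydropathy values (Kyte and Doolittle, 1982).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def percent_cysteine(seq):
    """Percentage of residues in seq that are cysteine."""
    seq = seq.upper()
    return 100.0 * seq.count("C") / len(seq) if seq else 0.0


def mean_hydropathy(seq):
    """Mean Kyte-Doolittle score; unknown residues score 0 (assumption)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(KD.get(aa, 0.0) for aa in seq) / len(seq)


def longest_hydrophobic_stretch(seq):
    """Length of the longest run of residues with a positive score (assumption)."""
    best = run = 0
    for aa in seq.upper():
        run = run + 1 if KD.get(aa, 0.0) > 0 else 0
        best = max(best, run)
    return best


def localization_features(seq, n_term=20):
    """Hypothetical feature vector built from the three characteristics."""
    return {
        "pct_cys": percent_cysteine(seq),
        "n_term_hydropathy": mean_hydropathy(seq[:n_term]),
        "longest_hydrophobic": longest_hydrophobic_stretch(seq),
    }
```

A classifier such as the maximum likelihood estimation mentioned above would then compare these feature values with those of reference proteins of known localization; that comparison step is not sketched here.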


[0132] The presence of transmembrane region(s) was detected using the TMpred program (http://www.ch.embnet.org/software/TMPRED_form.html).


[0133] The BLAST programs are publicly available from the National Center for Biotechnology Information (NCBI) and other sources (BLAST Manual, Altschul, S., et al. NCBI NLM NIH Bethesda, Md. 20894; Altschul, S., et al., J. Mol. Biol. 215:403-410 (1990)).



Chimeric and Fusion Proteins

[0134] The invention also provides chimeric or fusion proteins. As used herein, a “chimeric protein” or “fusion protein” comprises a polypeptide of the invention operatively linked to another polypeptide. Within a fusion protein the polypeptide according to the invention can correspond to all or a portion of a protein according to the invention. In one embodiment, a fusion protein comprises at least one biologically active portion of a protein according to the invention. In another embodiment, a fusion protein comprises at least two biologically active portions of a protein according to the invention. Within the fusion protein, the term “operatively linked” is intended to indicate that the polypeptide according to the invention and the other polypeptide are fused in-frame to each other. The polypeptide can be fused to the N-terminus or C-terminus, or to the middle.


[0135] For example, in one embodiment a fusion protein comprises a polypeptide according to the invention operably linked to the extracellular domain of a second protein.


[0136] In another embodiment, the fusion protein is a GST-fusion protein in which the polypeptide sequences of the invention are fused to the C-terminus of the GST (i.e., glutathione S-transferase) sequences.


[0137] In another embodiment, the fusion protein is an immunoglobulin fusion protein in which the polypeptide sequences according to the invention comprise one or more domains fused to sequences derived from a member of the immunoglobulin protein family. The immunoglobulin fusion proteins of the invention can be incorporated into pharmaceutical compositions and administered to a subject to inhibit an interaction between a ligand and a protein of the invention on the surface of a cell, to thereby suppress signal transduction in vivo. The immunoglobulin fusion proteins can be used to affect the bioavailability of a cognate ligand. Inhibition of the ligand/protein interaction may be useful therapeutically for both the treatment of proliferative and differentiative disorders, e.g., cancer as well as modulating (e.g., promoting or inhibiting) cell survival. Moreover, the immunoglobulin fusion proteins of the invention can be used as immunogens to produce antibodies in a subject, to purify ligands, and in screening assays to identify molecules that inhibit the interaction of a polypeptide of the invention with a ligand.


[0138] A chimeric or fusion protein of the invention can be produced by standard recombinant DNA techniques. For example, DNA fragments coding for the different polypeptide sequences are ligated together in-frame in accordance with conventional techniques, e.g., by employing blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to avoid undesirable joining, and enzymatic ligation. In another embodiment, the fusion gene can be synthesized by conventional techniques including automated DNA synthesizers. Alternatively, PCR amplification of gene fragments can be carried out using anchor primers that give rise to complementary overhangs between two consecutive gene fragments that can subsequently be annealed and reamplified to generate a chimeric gene sequence (see, for example, Ausubel et al. (eds.) Current Protocols in Molecular Biology, John Wiley & Sons, 1992). Moreover, many expression vectors are commercially available that already encode a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the invention can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the protein of the invention.
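
The requirement that the joined fragments be in-frame can be checked computationally before any construct is made. The following Python sketch is offered only as an illustration under stated assumptions: the function names `translate` and `fuse_in_frame` are hypothetical, the standard genetic code is assumed, a terminal stop codon on the upstream fragment is removed so that translation reads through into the downstream fragment, and sequences are assumed to contain only A, C, G and T.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}
STOPS = {"TAA", "TAG", "TGA"}


def translate(dna):
    """Translate a DNA coding sequence; '*' marks a stop codon."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]]
                   for i in range(0, len(dna) - 2, 3))


def fuse_in_frame(upstream, downstream, linker=""):
    """Join two coding sequences (and an optional linker) in-frame."""
    upstream = upstream.upper()
    downstream = downstream.upper()
    linker = linker.upper()
    # Drop a terminal stop codon so translation continues into the partner.
    if upstream[-3:] in STOPS:
        upstream = upstream[:-3]
    parts = (("upstream", upstream), ("linker", linker),
             ("downstream", downstream))
    for name, part in parts:
        if len(part) % 3 != 0:
            raise ValueError(name + " length is not a multiple of 3")
    fused = upstream + linker + downstream
    # A stop codon anywhere except the final position truncates the fusion.
    if "*" in translate(fused)[:-1]:
        raise ValueError("fusion contains an internal stop codon")
    return fused
```

For example, `fuse_in_frame("ATGAAATAA", "GGTTGA")` returns `"ATGAAAGGTTGA"`, which translates to `MKG*`, whereas an upstream fragment of five nucleotides raises a `ValueError` because the downstream reading frame would be shifted.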



Gene Therapy

[0139] Mutations in the polynucleotides of the invention may result in loss of normal function of the encoded protein. The invention thus provides gene therapy to restore normal activity of the polypeptides of the invention, or to treat disease states involving polypeptides of the invention. Delivery of a functional gene encoding polypeptides of the invention to appropriate cells is effected ex vivo, in situ, or in vivo by use of vectors, and more particularly viral vectors (e.g., adenovirus, adeno-associated virus, or a retrovirus), or ex vivo by use of physical DNA transfer methods (e.g., liposomes or chemical treatments). See, for example, Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998). For additional reviews of gene therapy technology see Friedmann, Science, 244: 1275-1281 (1989); Verma, Scientific American: 68-84 (1990); and Miller, Nature, 357: 455-460 (1992). Introduction of any one of the nucleotides of the present invention or a gene encoding the polypeptides of the present invention can also be accomplished with extrachromosomal substrates (transient expression) or artificial chromosomes (stable expression). Cells may also be cultured ex vivo in the presence of proteins of the present invention in order to proliferate or to produce a desired effect on or activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes. Alternatively, it is contemplated that in other human disease states, preventing the expression of or inhibiting the activity of polypeptides of the invention will be useful in treating the disease states. It is contemplated that antisense therapy or gene therapy could be applied to negatively regulate the expression of polypeptides of the invention.


[0140] Other methods inhibiting expression of a protein include the introduction of antisense molecules to the nucleic acids of the present invention, their complements, or their translated RNA sequences, by methods known in the art. Further, the polypeptides of the present invention can be inhibited by using targeted deletion methods, or the insertion of a negative regulatory element such as a silencer, which is tissue specific.
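
As a simple illustration of the sequence relationship underlying an antisense molecule, the Python sketch below derives the reverse complement of a sense sequence. It is a minimal example only: the function name `antisense_dna` is hypothetical, the output is expressed as DNA, uracil in an RNA input is treated like thymine, and no account is taken of the target-site selection or chemical modification that a practical antisense design would involve.

```python
# Map each base to its complement; U (RNA) pairs with A like T does.
COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def antisense_dna(sense):
    """Return the DNA antisense strand, written 5' to 3', of a sense sequence."""
    return sense.translate(COMPLEMENT)[::-1]
```

For instance, the sense sequence `ATGGCC` (or its RNA form `AUGGCC`) gives the antisense sequence `GGCCAT`.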


[0141] The present invention still further provides cells genetically engineered in vivo to express the polynucleotides of the invention, wherein such polynucleotides are in operative association with a regulatory sequence heterologous to the host cell which drives expression of the polynucleotides in the cell. These methods can be used to increase or decrease the expression of the polynucleotides of the present invention.


[0142] Knowledge of DNA sequences provided by the invention allows for modification of cells to permit, increase, or decrease, expression of endogenous polypeptide. Cells can be modified (e.g., by homologous recombination) to provide increased polypeptide expression by replacing, in whole or in part, the naturally occurring promoter with all or part of a heterologous promoter so that the cells express the protein at higher levels. The heterologous promoter is inserted in such a manner that it is operatively linked to the desired protein encoding sequences. See, for example, PCT International Publication No. WO 94/12650, PCT International Publication No. WO 92/20808, and PCT International Publication No. WO 91/09955. It is also contemplated that, in addition to heterologous promoter DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or intron DNA may be inserted along with the heterologous promoter DNA. If linked to the desired protein coding sequence, amplification of the marker DNA by standard selection methods results in co-amplification of the desired protein coding sequences in the cells.


[0143] In another embodiment of the present invention, cells and tissues may be engineered to express an endogenous gene comprising the polynucleotides of the invention under the control of inducible regulatory elements, in which case the regulatory sequences of the endogenous gene may be replaced by homologous recombination. As described herein, gene targeting can be used to replace a gene's existing regulatory region with a regulatory sequence isolated from a different gene or a novel regulatory sequence synthesized by genetic engineering methods. Such regulatory sequences may be comprised of promoters, enhancers, scaffold-attachment regions, negative regulatory elements, transcriptional initiation sites, regulatory protein binding sites or combinations of said sequences. Alternatively, sequences which affect the structure or stability of the RNA or protein produced may be replaced, removed, added, or otherwise modified by targeting. These sequences include polyadenylation signals, mRNA stability elements, splice sites, leader sequences for enhancing or modifying transport or secretion properties of the protein, or other sequences which alter or improve the function or stability of protein or RNA molecules.


[0144] The targeting event may be a simple insertion of the regulatory sequence, placing the gene under the control of the new regulatory sequence, e.g., inserting a new promoter or enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple deletion of a regulatory element, such as the deletion of a tissue-specific negative regulatory element. Alternatively, the targeting event may replace an existing element; for example, a tissue-specific enhancer can be replaced by an enhancer that has broader or different cell-type specificity than the naturally occurring elements. Here, the naturally occurring sequences are deleted and new sequences are added. In all cases, the identification of the targeting event may be facilitated by the use of one or more selectable marker genes that are contiguous with the targeting DNA, allowing for the selection of cells in which the exogenous DNA has integrated into the cell genome. The identification of the targeting event may also be facilitated by the use of one or more marker genes exhibiting the property of negative selection, such that the negatively selectable marker is linked to the exogenous DNA, but configured such that the negatively selectable marker flanks the targeting sequence, and such that a correct homologous recombination event with sequences in the host cell genome does not result in the stable integration of the negatively selectable marker. Markers useful for this purpose include the Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-guanine phosphoribosyl-transferase (gpt) gene.


[0145] The gene targeting or gene activation techniques which can be used in accordance with this aspect of the invention are more particularly described in U.S. Pat. No. 5,272,071 to Chappel; U.S. Pat. No. 5,578,461 to Sherwin et al.; International Application No. PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No. PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by reference herein in its entirety.



Transgenic Animals

[0146] In preferred methods to determine biological functions of the polypeptides of the invention in vivo, one or more genes provided by the invention are either overexpressed or inactivated in the germ line of animals using homologous recombination [Capecchi, Science 244:1288-1292 (1989)]. Animals in which the gene is overexpressed, under the regulatory control of exogenous or endogenous promoter elements, are known as transgenic animals. Animals in which an endogenous gene has been inactivated by homologous recombination are referred to as “knockout” animals. Knockout animals, preferably non-human mammals, can be prepared as described in U.S. Pat. No. 5,557,032, incorporated herein by reference. Transgenic animals are useful to determine the roles polypeptides of the invention play in biological processes, and preferably in disease states. Transgenic animals are useful as model systems to identify compounds that modulate lipid metabolism. Transgenic animals, preferably non-human mammals, are produced using methods as described in U.S. Pat. No. 5,489,743 and PCT Publication No. WO94/28122, incorporated herein by reference.


[0147] Transgenic animals can be prepared wherein all or part of a promoter of the polynucleotides of the invention is either activated or inactivated to alter the level of expression of the polypeptides of the invention. Inactivation can be carried out using homologous recombination methods described above. Activation can be achieved by supplementing or even replacing the homologous promoter to provide for increased protein expression. The homologous promoter can be supplemented by insertion of one or more heterologous enhancer elements known to confer promoter activation in a particular tissue.


[0148] The polynucleotides of the present invention also make possible the development, through, e.g., homologous recombination or knock out strategies, of animals that fail to express polypeptides of the invention or that express a variant polypeptide. Such animals are useful as models for studying the in vivo activities of polypeptide as well as for studying modulators of the polypeptides of the invention.


[0149] In preferred methods to determine biological functions of the polypeptides of the invention in vivo, one or more genes provided by the invention are either overexpressed or inactivated in the germ line of animals using homologous recombination [Capecchi, Science 244:1288-1292 (1989)]. Animals in which the gene is overexpressed, under the regulatory control of exogenous or endogenous promoter elements, are known as transgenic animals. Animals in which an endogenous gene has been inactivated by homologous recombination are referred to as “knockout” animals. Knockout animals, preferably non-human mammals, can be prepared as described in U.S. Pat. No. 5,557,032, incorporated herein by reference. Transgenic animals are useful to determine the roles polypeptides of the invention play in biological processes, and preferably in disease states. Transgenic animals are useful as model systems to identify compounds that modulate lipid metabolism. Transgenic animals, preferably non-human mammals, are produced using methods as described in U.S. Pat. No. 5,489,743 and PCT Publication No. WO94/28122, incorporated herein by reference.


[0150] Transgenic animals can be prepared wherein all or part of a promoter of the polynucleotides of the invention is either activated or inactivated to alter the level of expression of the polypeptides of the invention. Inactivation can be carried out using homologous recombination methods described above. Activation can be achieved by supplementing or even replacing the homologous promoter to provide for increased protein expression. The homologous promoter can be supplemented by insertion of one or more heterologous enhancer elements known to confer promoter activation in a particular tissue.



Uses and Biological Activity

[0151] The polynucleotides and proteins of the present invention are expected to exhibit one or more of the uses or biological activities (including those associated with assays cited herein) identified herein. Uses or activities described for proteins of the present invention may be provided by administration or use of such proteins or of polynucleotides encoding such proteins (such as, for example, in gene therapies or vectors suitable for introduction of DNA). The mechanism underlying the particular condition or pathology will dictate whether the polypeptides of the invention, the polynucleotides of the invention or modulators (activators or inhibitors) thereof would be beneficial to the subject in need of treatment. Thus, “therapeutic compositions of the invention” include compositions comprising isolated polynucleotides (including recombinant DNA molecules, cloned genes and degenerate variants thereof) or polypeptides of the invention (including full length protein, mature protein and truncations or domains thereof), or compounds and other substances that modulate the overall activity of the target gene products, either at the level of target gene/protein expression or target protein activity. Such modulators include polypeptides and analogs (variants), including fragments and fusion proteins; antibodies and other binding proteins; chemical compounds that directly or indirectly activate or inhibit the polypeptides of the invention (identified, e.g., via drug screening assays as described herein); antisense polynucleotides and polynucleotides suitable for triple helix formation; and in particular antibodies or other binding partners that specifically recognize one or more epitopes of the polypeptides of the invention.


[0152] The polypeptides of the present invention may likewise be involved in cellular activation or in one of the other physiological pathways described herein.



Research Uses and Utilities

[0153] The polynucleotides provided by the present invention can be used by the research community for various purposes. The polynucleotides can be used to express recombinant protein for analysis, characterization or therapeutic use; as markers for tissues in which the corresponding protein is preferentially expressed (either constitutively or at a particular stage of tissue differentiation or development or in disease states); as molecular weight markers on gels; as chromosome markers or tags (when labeled) to identify chromosomes or to map related gene positions; to compare with endogenous DNA sequences in patients to identify potential genetic disorders; as probes to hybridize and thus discover novel, related DNA sequences; as a source of information to derive PCR primers for genetic fingerprinting; as a probe to “subtract-out” known sequences in the process of discovering other novel polynucleotides; for selecting and making oligomers for attachment to a “gene chip” or other support, including for examination of expression patterns; to raise anti-protein antibodies using DNA immunization techniques; and as an antigen to raise anti-DNA antibodies or elicit another immune response. Where the polynucleotide encodes a protein which binds or potentially binds to another protein (such as, for example, in a receptor-ligand interaction), the polynucleotide can also be used in interaction trap assays (such as, for example, that described in Gyuris et al., Cell 75:791-803 (1993)) to identify polynucleotides encoding the other protein with which binding occurs or to identify inhibitors of the binding interaction.


[0154] The polypeptides provided by the present invention can similarly be used in assays to determine biological activity, including in a panel of multiple proteins for high-throughput screening; to raise antibodies or to elicit another immune response; as a reagent (including the labeled reagent) in assays designed to quantitatively determine levels of the protein (or its receptor) in biological fluids; as markers for tissues in which the corresponding polypeptide is preferentially expressed (either constitutively or at a particular stage of tissue differentiation or development or in a disease state); and, of course, to isolate correlative receptors or ligands. Proteins involved in these binding interactions can also be used to screen for peptide or small molecule inhibitors or agonists of the binding interaction.


[0155] Any or all of these research utilities are capable of being developed into reagent grade or kit format for commercialization as research products.


[0156] Methods for performing the uses listed above are well known to those skilled in the art. References disclosing such methods include without limitation “Molecular Cloning: A Laboratory Manual”, 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E. F. Fritsch and T. Maniatis eds., 1989, and “Methods in Enzymology: Guide to Molecular Cloning Techniques”, Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987.



Nutritional Uses

[0157] Polynucleotides and polypeptides of the present invention can also be used as nutritional sources or supplements. Such uses include without limitation use as a protein or amino acid supplement, use as a carbon source, use as a nitrogen source and use as a source of carbohydrate. In such cases the polypeptide or polynucleotide of the invention can be added to the feed of a particular organism or can be administered as a separate solid or liquid preparation, such as in the form of powder, pills, solutions, suspensions or capsules. In the case of microorganisms, the polypeptide or polynucleotide of the invention can be added to the medium in or on which the microorganism is cultured.



Cytokine and Cell Proliferation/Differentiation Activity

[0158] A polypeptide of the present invention may exhibit activity relating to cytokine, cell proliferation (either inducing or inhibiting) or cell differentiation (either inducing or inhibiting) activity or may induce production of other cytokines in certain cell populations. A polynucleotide of the invention can encode a polypeptide exhibiting such attributes. Many protein factors discovered to date, including all known cytokines, have exhibited activity in one or more factor-dependent cell proliferation assays, and hence the assays serve as a convenient confirmation of cytokine activity. The activity of therapeutic compositions of the present invention is evidenced by any one of a number of routine factor dependent cell proliferation assays for cell lines including, without limitation, 32D, DA2, DA1G, T10, B9, B9/11, BaF3, MC9/G, M+(preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1, Mo7e, CMK, HUVEC, and Caco. Therapeutic compositions of the invention can be used in the following:


[0159] Assays for T-cell or thymocyte proliferation include without limitation those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Bertagnolli et al., J. Immunol. 145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology 133:327-341, 1991; Bertagnolli, et al., J. Immunol. 149:3778-3783, 1992; Bowman et al., J. Immunol. 152:1756-1761, 1994.


[0160] Assays for cytokine production and/or proliferation of spleen cells, lymph node cells or thymocytes include, without limitation, those described in: Polyclonal T cell stimulation, Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and Measurement of mouse and human interferon-γ, Schreiber, R. D. In Current Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994.


[0161] Assays for proliferation and differentiation of hematopoietic and lymphopoietic cells include, without limitation, those described in: Measurement of Human and Murine Interleukin 2 and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and Sons, Toronto. 1991; deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al., Nature 336:690-692, 1988; Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938, 1983; Measurement of mouse and human interleukin 6-Nordan, R. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci. U.S.A. 83:1857-1861, 1986; Measurement of human Interleukin 11-Bennett, F., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp. 6.15.1 John Wiley and Sons, Toronto. 1991; Measurement of mouse and human Interleukin 9-Ciarletta, A., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991.


[0162] Assays for T-cell clone responses to antigens (which will identify, among others, proteins that affect APC-T cell interactions as well as direct T-cell effects by measuring proliferation and cytokine production) include, without limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function; Chapter 6, Cytokines and their cellular receptors; Chapter 7, Immunologic studies in Humans); Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095, 1980; Weinberger et al., Eur. J. Immun. 11:405-411, 1981; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988.



Stem Cell Growth Factor Activity

[0163] A polypeptide of the present invention may exhibit stem cell growth factor activity and be involved in the proliferation, differentiation and survival of pluripotent and totipotent stem cells including primordial germ cells, embryonic stem cells, hematopoietic stem cells and/or germ line stem cells. Administration of the polypeptide of the invention to stem cells in vivo or ex vivo is expected to maintain and expand cell populations in a totipotential or pluripotential state which would be useful for re-engineering damaged or diseased tissues, transplantation, manufacture of bio-pharmaceuticals and the development of bio-sensors. The ability to produce large quantities of human cells has important working applications for the production of human proteins which currently must be obtained from non-human sources or donors, implantation of cells to treat diseases such as Parkinson's, Alzheimer's and other neurodegenerative diseases; tissues for grafting such as bone marrow, skin, cartilage, tendons, bone, muscle (including cardiac muscle), blood vessels, cornea, neural cells, gastrointestinal cells and others; and organs for transplantation such as kidney, liver, pancreas (including islet cells), heart and lung.


[0164] It is contemplated that multiple different exogenous growth factors and/or cytokines may be administered in combination with the polypeptide of the invention to achieve the desired effect, including any of the growth factors listed herein, other stem cell maintenance factors, and specifically including stem cell factor (SCF), leukemia inhibitory factor (LIF), Flt-3 ligand (Flt-3L), any of the interleukins, recombinant soluble IL-6 receptor fused to IL-6, macrophage inflammatory protein 1-alpha (MIP-1-alpha), G-CSF, GM-CSF, thrombopoietin (TPO), platelet factor 4 (PF-4), platelet-derived growth factor (PDGF), neural growth factors and basic fibroblast growth factor (bFGF).


[0165] Since totipotent stem cells can give rise to virtually any mature cell type, expansion of these cells in culture will facilitate the production of large quantities of mature cells. Techniques for culturing stem cells are known in the art and administration of polypeptides of the invention, optionally with other growth factors and/or cytokines, is expected to enhance the survival and proliferation of the stem cell populations. This can be accomplished by direct administration of the polypeptide of the invention to the culture medium. Alternatively, stromal cells transfected with a polynucleotide that encodes the polypeptide of the invention can be used as a feeder layer for the stem cell populations in culture or in vivo. Stromal support cells for feeder layers may include embryonic bone marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or cultured embryonic fibroblasts (see U.S. Pat. No. 5,690,926).


[0166] Stem cells themselves can be transfected with a polynucleotide of the invention to induce autocrine expression of the polypeptide of the invention. This will allow for generation of undifferentiated totipotential/pluripotential stem cell lines that are useful as is or that can then be differentiated into the desired mature cell types. These stable cell lines can also serve as a source of undifferentiated totipotential/pluripotential mRNA to create cDNA libraries and templates for polymerase chain reaction experiments. These studies would allow for the isolation and identification of differentially expressed genes in stem cell populations that regulate stem cell proliferation and/or maintenance.


[0167] Expansion and maintenance of totipotent stem cell populations will be useful in the treatment of many pathological conditions. For example, polypeptides of the present invention may be used to manipulate stem cells in culture to give rise to neuroepithelial cells that can be used to augment or replace cells damaged by illness, autoimmune disease, accidental damage or genetic disorders. The polypeptide of the invention may be useful for inducing the proliferation of neural cells and for the regeneration of nerve and brain tissue, i.e. for the treatment of central and peripheral nervous system diseases and neuropathies, as well as mechanical and traumatic disorders which involve degeneration, death or trauma to neural cells or nerve tissue. In addition, the expanded stem cell populations can also be genetically altered for gene therapy purposes and to decrease host rejection of replacement tissues after grafting or implantation.


[0168] Expression of the polypeptide of the invention and its effect on stem cells can also be manipulated to achieve controlled differentiation of the stem cells into more differentiated cell types. A broadly applicable method of obtaining pure populations of a specific differentiated cell type from undifferentiated stem cell populations involves the use of a cell-type specific promoter driving a selectable marker. The selectable marker allows only cells of the desired type to survive. For example, stem cells can be induced to differentiate into cardiomyocytes (Wobus et al., Differentiation, 48: 173-182, (1991); Klug et al., J. Clin. Invest., 98(1): 216-224, (1998)) or skeletal muscle cells (Browder, L. W. In: Principles of Tissue Engineering eds. Lanza et al., Academic Press (1997)). Alternatively, directed differentiation of stem cells can be accomplished by culturing the stem cells in the presence of a differentiation factor such as retinoic acid and an antagonist of the polypeptide of the invention which would inhibit the effects of endogenous stem cell factor activity and allow differentiation to proceed.


[0169] In vitro cultures of stem cells can be used to determine if the polypeptide of the invention exhibits stem cell growth factor activity. Stem cells are isolated from any one of various cell sources (including hematopoietic stem cells and embryonic stem cells) and cultured on a feeder layer, as described by Thomson et al., Proc. Natl. Acad. Sci. U.S.A., 92: 7844-7848 (1995), in the presence of the polypeptide of the invention alone or in combination with other growth factors or cytokines. The ability of the polypeptide of the invention to induce stem cell proliferation is determined by colony formation on semi-solid support, e.g., as described by Bernstein et al., Blood, 77: 2316-2321 (1991).



Hematopoiesis Regulating Activity

[0170] A polypeptide of the present invention may be involved in regulation of hematopoiesis and, consequently, in the treatment of myeloid or lymphoid cell disorders. Even marginal biological activity in support of colony forming cells or of factor-dependent cell lines indicates involvement in regulating hematopoiesis, e.g. in supporting the growth and proliferation of erythroid progenitor cells alone or in combination with other cytokines, thereby indicating utility, for example, in treating various anemias or for use in conjunction with irradiation/chemotherapy to stimulate the production of erythroid precursors and/or erythroid cells; in supporting the growth and proliferation of myeloid cells such as granulocytes and monocytes/macrophages (i.e., traditional CSF activity) useful, for example, in conjunction with chemotherapy to prevent or treat consequent myelo-suppression; in supporting the growth and proliferation of megakaryocytes and consequently of platelets thereby allowing prevention or treatment of various platelet disorders such as thrombocytopenia, and generally for use in place of or complementary to platelet transfusions; and/or in supporting the growth and proliferation of hematopoietic stem cells which are capable of maturing to any and all of the above-mentioned hematopoietic cells and therefore find therapeutic utility in various stem cell disorders (such as those usually treated with transplantation, including, without limitation, aplastic anemia and paroxysmal nocturnal hemoglobinuria), as well as in repopulating the stem cell compartment post irradiation/chemotherapy, either in vivo or ex vivo (i.e., in conjunction with bone marrow transplantation or with peripheral progenitor cell transplantation (homologous or heterologous)) as normal cells or genetically manipulated for gene therapy.


[0171] Therapeutic compositions of the invention can be used in the following:


[0172] Suitable assays for proliferation and differentiation of various hematopoietic lines are cited above.


[0173] Assays for embryonic stem cell differentiation (which will identify, among others, proteins that influence embryonic differentiation hematopoiesis) include, without limitation, those described in: Johansson et al. Cellular Biology 15:141-151, 1995; Keller et al., Molecular and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915, 1993.


[0174] Assays for stem cell survival and differentiation (which will identify, among others, proteins that regulate lympho-hematopoiesis) include, without limitation, those described in: Methylcellulose colony forming assays, Freshney, M. G. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y. 1994; Hirayama et al., Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive hematopoietic colony forming cells with high proliferative potential, McNiece, I. K. and Briddell, R. A. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 23-39, Wiley-Liss, Inc., New York, N.Y. 1994; Neben et al., Experimental Hematology 22:353-359, 1994; Cobblestone area forming cell assay, Ploemacher, R. E. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 1-21, Wiley-Liss, Inc., New York, N.Y. 1994; Long term bone marrow cultures in the presence of stromal cells, Spooncer, E., Dexter, M. and Allen, T. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss, Inc., New York, N.Y. 1994; Long term culture initiating cell assay, Sutherland, H. J. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 139-162, Wiley-Liss, Inc., New York, N.Y. 1994.



Tissue Growth Activity

[0175] A polypeptide of the present invention also may be involved in bone, cartilage, tendon, ligament and/or nerve tissue growth or regeneration, as well as in wound healing and tissue repair and replacement, and in healing of burns, incisions and ulcers.


[0176] A polypeptide of the present invention which induces cartilage and/or bone growth in circumstances where bone is not normally formed, has application in the healing of bone fractures and cartilage damage or defects in humans and other animals. Compositions of a polypeptide, antibody, binding partner, or other modulator of the invention may have prophylactic use in closed as well as open fracture reduction and also in the improved fixation of artificial joints. De novo bone formation induced by an osteogenic agent contributes to the repair of congenital, trauma induced, or oncologic resection induced craniofacial defects, and also is useful in cosmetic plastic surgery.


[0177] A polypeptide of this invention may also be involved in attracting bone-forming cells, stimulating growth of bone-forming cells, or inducing differentiation of progenitors of bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative disorders, or periodontal disease, such as through stimulation of bone and/or cartilage repair or by blocking inflammation or processes of tissue destruction (collagenase activity, osteoclast activity, etc.) mediated by inflammatory processes may also be possible using the composition of the invention.


[0178] Another category of tissue regeneration activity that may involve the polypeptide of the present invention is tendon/ligament formation. Induction of tendon/ligament-like tissue or other tissue formation in circumstances where such tissue is not normally formed, has application in the healing of tendon or ligament tears, deformities and other tendon or ligament defects in humans and other animals. Such a preparation employing a tendon/ligament-like tissue inducing protein may have prophylactic use in preventing damage to tendon or ligament tissue, as well as use in the improved fixation of tendon or ligament to bone or other tissues, and in repairing defects to tendon or ligament tissue. De novo tendon/ligament-like tissue formation induced by a composition of the present invention contributes to the repair of congenital, trauma induced, or other tendon or ligament defects of other origin, and is also useful in cosmetic plastic surgery for attachment or repair of tendons or ligaments. The compositions of the present invention may provide an environment to attract tendon- or ligament-forming cells, stimulate growth of tendon- or ligament-forming cells, induce differentiation of progenitors of tendon- or ligament-forming cells, or induce growth of tendon/ligament cells or progenitors ex vivo for return in vivo to effect tissue repair. The compositions of the invention may also be useful in the treatment of tendinitis, carpal tunnel syndrome and other tendon or ligament defects. The compositions may also include an appropriate matrix and/or sequestering agent as a carrier as is well known in the art.


[0179] The compositions of the present invention may also be useful for proliferation of neural cells and for regeneration of nerve and brain tissue, i.e. for the treatment of central and peripheral nervous system diseases and neuropathies, as well as mechanical and traumatic disorders, which involve degeneration, death or trauma to neural cells or nerve tissue. More specifically, a composition may be used in the treatment of diseases of the peripheral nervous system, such as peripheral nerve injuries, peripheral neuropathy and localized neuropathies, and central nervous system diseases, such as Alzheimer's, Parkinson's disease, Huntington's disease, amyotrophic lateral sclerosis, and Shy-Drager syndrome. Further conditions which may be treated in accordance with the present invention include mechanical and traumatic disorders, such as spinal cord disorders, head trauma and cerebrovascular diseases such as stroke. Peripheral neuropathies resulting from chemotherapy or other medical therapies may also be treatable using a composition of the invention.


[0180] Compositions of the invention may also be useful to promote better or faster closure of non-healing wounds, including without limitation pressure ulcers, ulcers associated with vascular insufficiency, surgical and traumatic wounds, and the like.


[0181] Compositions of the present invention may also be involved in the generation or regeneration of other tissues, such as organs (including, for example, pancreas, liver, intestine, kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular (including vascular endothelium) tissue, or for promoting the growth of cells comprising such tissues. Part of the desired effects may be by inhibition or modulation of fibrotic scarring, which may allow normal tissue to regenerate. A polypeptide of the present invention may also exhibit angiogenic activity.


[0182] A composition of the present invention may also be useful for gut protection or regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues, and conditions resulting from systemic cytokine damage.


[0183] A composition of the present invention may also be useful for promoting or inhibiting differentiation of tissues described above from precursor tissues or cells; or for inhibiting the growth of tissues described above.


[0184] Therapeutic compositions of the invention can be used in the following:


[0185] Assays for tissue generation activity include, without limitation, those described in: International Patent Publication No. WO95/16035 (bone, cartilage, tendon); International Patent Publication No. WO95/05846 (nerve, neuronal); International Patent Publication No. WO91/07491 (skin, endothelium).


[0186] Assays for wound healing activity include, without limitation, those described in: Winter, Epidermal Wound Healing, pps. 71-112 (Maibach, H. I. and Rovee, D. T., eds.), Year Book Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. Invest. Dermatol 71:382-84 (1978).



Immune Stimulating or Suppressing Activity

[0187] A polypeptide of the present invention may also exhibit immune stimulating or immune suppressing activity, including without limitation the activities for which assays are described herein. A polynucleotide of the invention can encode a polypeptide exhibiting such activities. A protein may be useful in the treatment of various immune deficiencies and disorders (including severe combined immunodeficiency (SCID)), e.g., in regulating (up or down) growth and proliferation of T and/or B lymphocytes, as well as affecting the cytolytic activity of NK cells and other cell populations. These immune deficiencies may be genetic or be caused by viral (e.g., HIV) as well as bacterial or fungal infections, or may result from autoimmune disorders. More specifically, infectious diseases caused by viral, bacterial, fungal or other infection may be treatable using a protein of the present invention, including infections by HIV, hepatitis viruses, herpes viruses, mycobacteria, Leishmania spp., malaria spp. and various fungal infections such as candidiasis. Of course, in this regard, proteins of the present invention may also be useful where a boost to the immune system generally may be desirable, i.e., in the treatment of cancer.


[0188] Autoimmune disorders which may be treated using a protein of the present invention include, for example, connective tissue disease, multiple sclerosis, systemic lupus erythematosus, rheumatoid arthritis, autoimmune pulmonary inflammation, Guillain-Barre syndrome, autoimmune thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis, graft-versus-host disease and autoimmune inflammatory eye disease. Such a protein (or antagonists thereof, including antibodies) of the present invention may also be useful in the treatment of allergic reactions and conditions (e.g., anaphylaxis, serum sickness, drug reactions, food allergies, insect venom allergies, mastocytosis, allergic rhinitis, hypersensitivity pneumonitis, urticaria, angioedema, eczema, atopic dermatitis, allergic contact dermatitis, erythema multiforme, Stevens-Johnson syndrome, allergic conjunctivitis, atopic keratoconjunctivitis, vernal keratoconjunctivitis, giant papillary conjunctivitis and contact allergies), such as asthma (particularly allergic asthma) or other respiratory problems. Other conditions, in which immune suppression is desired (including, for example, organ transplantation), may also be treatable using a protein (or antagonists thereof) of the present invention. The therapeutic effects of the polypeptides or antagonists thereof on allergic reactions can be evaluated by in vivo animal models such as the cumulative contact enhancement test (Lastbom et al., Toxicology 125: 59-66, 1998), skin prick test (Hoffmann et al., Allergy 54: 446-54, 1999), guinea pig skin sensitization test (Vohr et al., Arch. Toxicol. 73: 501-9), and murine local lymph node assay (Kimber et al., J. Toxicol. Environ. Health 53: 563-79).


[0189] Using the proteins of the invention it may also be possible to modulate immune responses, in a number of ways. Down regulation may be in the form of inhibiting or blocking an immune response already in progress or may involve preventing the induction of an immune response. The functions of activated T cells may be inhibited by suppressing T cell responses or by inducing specific tolerance in T cells, or both. Immunosuppression of T cell responses is generally an active, non-antigen-specific, process which requires continuous exposure of the T cells to the suppressive agent. Tolerance, which involves inducing non-responsiveness or anergy in T cells, is distinguishable from immunosuppression in that it is generally antigen-specific and persists after exposure to the tolerizing agent has ceased. Operationally, tolerance can be demonstrated by the lack of a T cell response upon reexposure to specific antigen in the absence of the tolerizing agent.


[0190] Down regulating or preventing one or more antigen functions (including without limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing high level lymphokine synthesis by activated T cells, will be useful in situations of tissue, skin and organ transplantation and in graft-versus-host disease (GVHD). For example, blockage of T cell function should result in reduced tissue destruction in tissue transplantation. Typically, in tissue transplants, rejection of the transplant is initiated through its recognition as foreign by T cells, followed by an immune reaction that destroys the transplant. The administration of a therapeutic composition of the invention may prevent cytokine synthesis by immune cells, such as T cells, and thus acts as an immunosuppressant. Moreover, a lack of costimulation may also be sufficient to anergize the T cells, thereby inducing tolerance in a subject. Induction of long-term tolerance by B lymphocyte antigen-blocking reagents may avoid the necessity of repeated administration of these blocking reagents. To achieve sufficient immunosuppression or tolerance in a subject, it may also be necessary to block the function of a combination of B lymphocyte antigens.


[0191] The efficacy of particular therapeutic compositions in preventing organ transplant rejection or GVHD can be assessed using animal models that are predictive of efficacy in humans. Examples of appropriate systems which can be used include allogeneic cardiac grafts in rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been used to examine the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as described in Lenschow et al., Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad. Sci USA, 89:11102-11105 (1992). In addition, murine models of GVHD (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989, pp. 846-847) can be used to determine the effect of therapeutic compositions of the invention on the development of that disease.


[0192] Blocking antigen function may also be therapeutically useful for treating autoimmune diseases. Many autoimmune disorders are the result of inappropriate activation of T cells that are reactive against self-tissue and which promote the production of cytokines and autoantibodies involved in the pathology of the diseases. Preventing the activation of autoreactive T cells may reduce or eliminate disease symptoms. Administration of reagents which block stimulation of T cells can be used to inhibit T cell activation and prevent production of autoantibodies or T cell-derived cytokines which may be involved in the disease process. Additionally, blocking reagents may induce antigen-specific tolerance of autoreactive T cells which could lead to long-term relief from the disease. The efficacy of blocking reagents in preventing or alleviating autoimmune disorders can be determined using a number of well-characterized animal models of human autoimmune diseases. Examples include murine experimental autoimmune encephalitis, systemic lupus erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine autoimmune collagen arthritis, diabetes mellitus in NOD mice and BB rats, and murine experimental myasthenia gravis (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989, pp. 840-856). Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a means of up regulating immune responses, may also be useful in therapy. Upregulation of immune responses may be in the form of enhancing an existing immune response or eliciting an initial immune response. For example, enhancing an immune response may be useful in cases of viral infection, including systemic viral diseases such as influenza, the common cold, and encephalitis.


[0193] Alternatively, anti-viral immune responses may be enhanced in an infected patient by removing T cells from the patient, costimulating the T cells in vitro with viral antigen-pulsed APCs either expressing a peptide of the present invention or together with a stimulatory form of a soluble peptide of the present invention and reintroducing the in vitro activated T cells into the patient. Another method of enhancing anti-viral immune responses would be to isolate infected cells from a patient, transfect them with a nucleic acid encoding a protein of the present invention as described herein such that the cells express all or a portion of the protein on their surface, and reintroduce the transfected cells into the patient. The infected cells would now be capable of delivering a costimulatory signal to, and thereby activate, T cells in vivo.


[0194] A polypeptide of the present invention may provide the necessary stimulation signal to T cells to induce a T cell mediated immune response against the transfected tumor cells. In addition, tumor cells which lack MHC class I or MHC class II molecules, or which fail to reexpress sufficient amounts of MHC class I or MHC class II molecules, can be transfected with nucleic acid encoding all or a portion (e.g., a cytoplasmic-domain truncated portion) of an MHC class I alpha chain protein and β2 microglobulin protein or an MHC class II alpha chain protein and an MHC class II beta chain protein to thereby express MHC class I or MHC class II proteins on the cell surface. Expression of the appropriate class I or class II MHC in conjunction with a peptide having the activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a T cell mediated immune response against the transfected tumor cell. Optionally, a gene encoding an antisense construct which blocks expression of an MHC class II associated protein, such as the invariant chain, can also be cotransfected with a DNA encoding a peptide having the activity of a B lymphocyte antigen to promote presentation of tumor associated antigens and induce tumor specific immunity. Thus, the induction of a T cell mediated immune response in a human subject may be sufficient to overcome tumor-specific tolerance in the subject.


[0195] The activity of a protein of the invention may, among other means, be measured by the following methods:


[0196] Suitable assays for thymocyte or splenocyte cytotoxicity include, without limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. Natl. Acad. Sci. USA 78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et al., J. Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998; Bertagnolli et al., Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol. 153:3079-3092, 1994.


[0197] Assays for T-cell-dependent immunoglobulin responses and isotype switching (which will identify, among others, proteins that modulate T-cell dependent antibody responses and that affect Th1/Th2 profiles) include, without limitation, those described in: Maliszewski, J. Immunol. 144:3028-3033, 1990; and Assays for B cell function: In vitro antibody production, Mond, J. J. and Brunswick, M. In Current Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto. 1994.


[0198] Mixed lymphocyte reaction (MLR) assays (which will identify, among others, proteins that generate predominantly Th1 and CTL responses) include, without limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992.


[0199] Dendritic cell-dependent assays (which will identify, among others, proteins expressed by dendritic cells that activate naive T-cells) include, without limitation, those described in: Guery et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of Experimental Medicine 173:549-559, 1991; Macatonia et al., Journal of Immunology 154:5071-5079, 1995; Porgador et al., Journal of Experimental Medicine 182:255-260, 1995; Nair et al., Journal of Virology 67:4062-4069, 1993; Huang et al., Science 264:961-965, 1994; Macatonia et al., Journal of Experimental Medicine 169:1255-1264, 1989; Bhardwaj et al., Journal of Clinical Investigation 94:797-807, 1994; and Inaba et al., Journal of Experimental Medicine 172:631-640, 1990.


[0200] Assays for lymphocyte survival/apoptosis (which will identify, among others, proteins that prevent apoptosis after superantigen induction and proteins that regulate lymphocyte homeostasis) include, without limitation, those described in: Darzynkiewicz et al., Cytometry 13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993; Gorczyca et al., Cancer Research 53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991; Zacharchuk, Journal of Immunology 145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897, 1993; Gorczyca et al., International Journal of Oncology 1:639-648, 1992.


[0201] Assays for proteins that influence early steps of T-cell commitment and development include, without limitation, those described in: Antica et al., Blood 84:111-117, 1994; Fine et al., Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995; Toki et al., Proc. Nat. Acad Sci. USA 88:7548-7551, 1991.



Activin/Inhibin Activity

[0202] A polypeptide of the present invention may also exhibit activin- or inhibin-related activities. A polynucleotide of the invention may encode a polypeptide exhibiting such characteristics. Inhibins are characterized by their ability to inhibit the release of follicle stimulating hormone (FSH), while activins are characterized by their ability to stimulate the release of follicle stimulating hormone (FSH). Thus, a polypeptide of the present invention, alone or in heterodimers with a member of the inhibin family, may be useful as a contraceptive based on the ability of inhibins to decrease fertility in female mammals and decrease spermatogenesis in male mammals. Administration of sufficient amounts of other inhibins can induce infertility in these mammals. Alternatively, the polypeptide of the invention, as a homodimer or as a heterodimer with other protein subunits of the inhibin group, may be useful as a fertility inducing therapeutic, based upon the ability of activin molecules to stimulate FSH release from cells of the anterior pituitary. See, for example, U.S. Pat. No. 4,798,885. A polypeptide of the invention may also be useful for advancement of the onset of fertility in sexually immature mammals, so as to increase the lifetime reproductive performance of domestic animals such as, but not limited to, cows, sheep and pigs.


[0203] The activity of a polypeptide of the invention may, among other means, be measured by the following methods.


[0204] Assays for activin/inhibin activity include, without limitation, those described in: Vale et al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale et al., Nature 321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., Proc. Natl. Acad. Sci. USA 83:3091-3095, 1986.



Chemotactic/Chemokinetic Activity

[0205] A polypeptide of the present invention may be involved in chemotactic or chemokinetic activity for mammalian cells, including, for example, monocytes, fibroblasts, neutrophils, T-cells, mast cells, eosinophils, epithelial and/or endothelial cells. A polynucleotide of the invention can encode a polypeptide exhibiting such attributes. Chemotactic and chemokinetic receptor activation can be used to mobilize or attract a desired cell population to a desired site of action. Chemotactic or chemokinetic compositions (e.g. proteins, antibodies, binding partners, or modulators of the invention) provide particular advantages in treatment of wounds and other trauma to tissues, as well as in treatment of localized infections. For example, attraction of lymphocytes, monocytes or neutrophils to tumors or sites of infection may result in improved immune responses against the tumor or infecting agent.


[0206] A protein or peptide has chemotactic activity for a particular cell population if it can stimulate, directly or indirectly, the directed orientation or movement of such cell population. Preferably, the protein or peptide has the ability to directly stimulate directed movement of cells. Whether a particular protein has chemotactic activity for a population of cells can be readily determined by employing such protein or peptide in any known assay for cell chemotaxis.


[0207] Therapeutic compositions of the invention can be used in the following:


[0208] Assays for chemotactic activity (which will identify proteins that induce or prevent chemotaxis) consist of assays that measure the ability of a protein to induce the migration of cells across a membrane as well as the ability of a protein to induce the adhesion of one cell population to another cell population. Suitable assays for movement and adhesion include, without limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 6.12, Measurement of alpha and beta Chemokines 6.12.1-6.12.28); Taub et al. J. Clin. Invest. 95:1370-1376, 1995; Lind et al. APMIS 103:140-146, 1995; Muller et al Eur. J. Immunol. 25:1744-1748; Gruber et al. J. of Immunol. 152:5860-5867, 1994; Johnston et al. J. of Immunol. 153:1762-1768, 1994.



Hemostatic and Thrombolytic Activity

[0209] A polypeptide of the invention may also be involved in hemostasis or thrombolysis or thrombosis. A polynucleotide of the invention can encode a polypeptide exhibiting such attributes. Compositions may be useful in treatment of various coagulation disorders (including hereditary disorders, such as hemophilias) or to enhance coagulation and other hemostatic events in treating wounds resulting from trauma, surgery or other causes. A composition of the invention may also be useful for dissolving or inhibiting formation of thromboses and for treatment and prevention of conditions resulting therefrom (such as, for example, infarction of cardiac and central nervous system vessels (e.g., stroke)).


[0210] Therapeutic compositions of the invention can be used in the following:


[0211] Assays for hemostatic and thrombolytic activity include, without limitation, those described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis Res. 45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub, Prostaglandins 35:467-474, 1988.



Cancer Diagnosis and Therapy

[0212] Polypeptides of the invention may be involved in cancer cell generation, proliferation or metastasis. Detection of the presence or amount of polynucleotides or polypeptides of the invention may be useful for the diagnosis and/or prognosis of one or more types of cancer. For example, the presence or increased expression of a polynucleotide/polypeptide of the invention may indicate a hereditary risk of cancer, a precancerous condition, or an ongoing malignancy. Conversely, a defect in the gene or absence of the polypeptide may be associated with a cancer condition. Identification of single nucleotide polymorphisms associated with cancer or a predisposition to cancer may also be useful for diagnosis or prognosis.


[0213] Cancer treatments promote tumor regression by inhibiting tumor cell proliferation, inhibiting angiogenesis (growth of new blood vessels that is necessary to support tumor growth) and/or prohibiting metastasis by reducing tumor cell motility or invasiveness. Therapeutic compositions of the invention may be effective in adult and pediatric oncology including in solid phase tumors/malignancies, locally advanced tumors, human soft tissue sarcomas, metastatic cancer, including lymphatic metastases, blood cell malignancies including multiple myeloma, acute and chronic leukemias, and lymphomas, head and neck cancers including mouth cancer, larynx cancer and thyroid cancer, lung cancers including small cell carcinoma and non-small cell cancers, breast cancers including small cell carcinoma and ductal carcinoma, gastrointestinal cancers including esophageal cancer, stomach cancer, colon cancer, colorectal cancer and polyps associated with colorectal neoplasia, pancreatic cancers, liver cancer, urologic cancers including bladder cancer and prostate cancer, malignancies of the female genital tract including ovarian carcinoma, uterine (including endometrial) cancers, and solid tumor in the ovarian follicle, kidney cancers including renal cell carcinoma, brain cancers including intrinsic brain tumors, neuroblastoma, astrocytic brain tumors, gliomas, metastatic tumor cell invasion in the central nervous system, bone cancers including osteomas, skin cancers including malignant melanoma, tumor progression of human skin keratinocytes, squamous cell carcinoma, basal cell carcinoma, hemangiopericytoma and Kaposi's sarcoma.


[0214] Polypeptides, polynucleotides, or modulators of polypeptides of the invention (including inhibitors and stimulators of the biological activity of the polypeptide of the invention) may be administered to treat cancer. Therapeutic compositions can be administered in therapeutically effective dosages alone or in combination with adjuvant cancer therapy such as surgery, chemotherapy, radiotherapy, thermotherapy, and laser therapy, and may provide a beneficial effect, e.g. reducing tumor size, slowing rate of tumor growth, inhibiting metastasis, or otherwise improving overall clinical condition, without necessarily eradicating the cancer.


[0215] The composition can also be administered in therapeutically effective amounts as a portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the polypeptide or modulator of the invention with one or more anti-cancer drugs in addition to a pharmaceutically acceptable carrier for delivery. The use of anti-cancer cocktails as a cancer treatment is routine. Anti-cancer drugs that are well known in the art and can be used as a treatment in combination with the polypeptide or modulator of the invention include: Actinomycin D, Aminoglutethimide, Asparaginase, Bleomycin, Busulfan, Carboplatin, Carmustine, Chlorambucil, Cisplatin (cis-DDP), Cyclophosphamide, Cytarabine HCl (Cytosine arabinoside), Dacarbazine, Dactinomycin, Daunorubicin HCl, Doxorubicin HCl, Estramustine phosphate sodium, Etoposide (V16-213), Floxuridine, 5-Fluorouracil (5-Fu), Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide, Interferon Alpha-2a, Interferon Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog), Lomustine, Mechlorethamine HCl (nitrogen mustard), Melphalan, Mercaptopurine, Mesna, Methotrexate (MTX), Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin, Procarbazine HCl, Streptozocin, Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine sulfate, Vincristine sulfate, Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, Mitoguazone, Pentostatin, Semustine, Teniposide, and Vindesine sulfate.


[0216] In addition, therapeutic compositions of the invention may be used for prophylactic treatment of cancer. There are hereditary conditions and/or environmental situations (e.g. exposure to carcinogens) known in the art that predispose an individual to developing cancers. Under these circumstances, it may be beneficial to treat these individuals with therapeutically effective doses of the polypeptide of the invention to reduce the risk of developing cancers.


[0217] In vitro models can be used to determine the effective doses of the polypeptide of the invention as a potential cancer treatment. These in vitro models include proliferation assays of cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney, (1987) Culture of Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, N.Y. Ch 18 and Ch 21), tumor systems in nude mice as described in Giovanella et al., J. Natl. Can. Inst., 52: 921-30 (1974), mobility and invasive potential of tumor cells in Boyden Chamber assays as described in Pilkington et al., Anticancer Res., 17: 4107-9 (1997), and angiogenesis assays such as induction of vascularization of the chick chorioallantoic membrane or induction of vascular endothelial cell migration as described in Ribatti et al., Intl. J. Dev. Biol., 40: 1189-97 (1999) and Li et al., Clin. Exp. Metastasis, 17:423-9 (1999), respectively. Suitable tumor cell lines are available, e.g. from American Type Culture Collection catalogs.



Receptor/Ligand Activity

[0218] A polypeptide of the present invention may also demonstrate activity as a receptor, a receptor ligand, or an inhibitor or agonist of receptor/ligand interactions. A polynucleotide of the invention can encode a polypeptide exhibiting such characteristics. Examples of such receptors and ligands include, without limitation, cytokine receptors and their ligands, receptor kinases and their ligands, receptor phosphatases and their ligands, receptors involved in cell-cell interactions and their ligands (including without limitation, cellular adhesion molecules (such as selectins, integrins and their ligands)) and receptor/ligand pairs involved in antigen presentation, antigen recognition and development of cellular and humoral immune responses. Receptors and ligands are also useful for screening of potential peptide or small molecule inhibitors of the relevant receptor/ligand interaction. A protein of the present invention (including, without limitation, fragments of receptors and ligands) may itself be useful as an inhibitor of receptor/ligand interactions.


[0219] The activity of a polypeptide of the invention may, among other means, be measured by the following methods:


[0220] Suitable assays for receptor-ligand activity include without limitation those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 7.28, Measurement of Cellular Adhesion under static conditions 7.28.1-7.28.22), Takai et al., Proc. Natl. Acad. Sci. USA 84:6864-6868, 1987; Bierer et al., J. Exp. Med. 168:1145-1156, 1988; Rosenstein et al., J. Exp. Med. 169:149-160 1989; Stoltenborg et al., J. Immunol. Methods 175:59-68, 1994; Stitt et al., Cell 80:661-670, 1995.


[0221] By way of example, the polypeptides of the invention may be used as a receptor for a ligand(s) thereby transmitting the biological activity of that ligand(s). Ligands may be identified through binding assays, affinity chromatography, dihybrid screening assays, BIAcore assays, gel overlay assays, or other methods known in the art.


[0222] Studies characterizing drugs or proteins as agonists, antagonists, partial agonists or partial antagonists require the use of other proteins as competing ligands. The polypeptides of the present invention or ligand(s) thereof may be labeled by being coupled to radioisotopes, colorimetric molecules or toxin molecules by conventional methods. (“Guide to Protein Purification” Murray P. Deutscher (ed) Methods in Enzymology Vol. 182 (1990) Academic Press, Inc. San Diego). Examples of radioisotopes include, but are not limited to, tritium and carbon-14. Examples of colorimetric molecules include, but are not limited to, fluorescent molecules such as fluorescamine or rhodamine, or other colorimetric molecules. Examples of toxins include, but are not limited to, ricin.



Drug Screening

[0223] This invention is particularly useful for screening chemical compounds by using the novel polypeptides or binding fragments thereof in any of a variety of drug screening techniques. The polypeptides or fragments employed in such a test may either be free in solution, affixed to a solid support, borne on a cell surface or located intracellularly. One method of drug screening utilizes eukaryotic or prokaryotic host cells which are stably transformed with recombinant nucleic acids expressing the polypeptide or a fragment thereof. Drugs are screened against such transformed cells in competitive binding assays. Such cells, either in viable or fixed form, can be used for standard binding assays. One may measure, for example, the formation of complexes between polypeptides of the invention or fragments and the agent being tested or examine the diminution in complex formation between the novel polypeptides and an appropriate cell line, which are well known in the art.


[0224] Sources for test compounds that may be screened for ability to bind to or modulate (i.e., increase or decrease) the activity of polypeptides of the invention include (1) inorganic and organic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries comprised of either random or mimetic peptides, oligonucleotides or organic molecules.


[0225] Chemical libraries may be readily synthesized or purchased from a number of commercial sources, and may include structural analogs of known compounds or compounds that are identified as “hits” or “leads” via natural product screening.


[0226] The sources of natural product libraries are microorganisms (including bacteria and fungi), animals, plants or other vegetation, or marine organisms, and libraries of mixtures for screening may be created by: (1) fermentation and extraction of broths from soil, plant or marine microorganisms or (2) extraction of the organisms themselves. Natural product libraries include polypeptides, non-ribosomal peptides, and (non-naturally occurring) variants thereof. For a review, see Science 282:63-68 (1998).


[0227] Combinatorial libraries are composed of large numbers of peptides, oligonucleotides or organic compounds and can be readily prepared by traditional automated synthesis methods, PCR, cloning or proprietary synthetic methods. Of particular interest are peptide and oligonucleotide combinatorial libraries. Still other libraries of interest include peptide, protein, peptidomimetic, multiparallel synthetic collection, recombinatorial, and polypeptide libraries. For a review of combinatorial chemistry and libraries created therefrom, see Myers, Curr. Opin. Biotechnol. 8:701-707 (1997). For reviews and examples of peptidomimetic libraries, see Al-Obeidi et al., Mol. Biotechnol, 9(3):205-23 (1998); Hruby et al., Curr Opin Chem Biol, 1(1):114-19 (1997); Dorner et al., Bioorg Med Chem, 4(5):709-15 (1996) (alkylated dipeptides).


[0228] Identification of modulators through use of the various libraries described herein permits modification of the candidate “hit” (or “lead”) to optimize the capacity of the “hit” to bind a polypeptide of the invention. The molecules identified in the binding assay are then tested for antagonist or agonist activity in in vivo tissue culture or animal models that are well known in the art. In brief, the molecules are titrated into a plurality of cell cultures or animals and then tested for either cell/animal death or prolonged survival of the animal/cells.


[0229] The binding molecules thus identified may be complexed with toxins, e.g., ricin or cholera, or with other compounds that are toxic to cells such as radioisotopes. The toxin-binding molecule complex is then targeted to a tumor or other cell by the specificity of the binding molecule for a polypeptide of the invention. Alternatively, the binding molecules may be complexed with imaging agents for targeting and imaging purposes.



Assay for Receptor Activity

[0230] The invention also provides methods to detect specific binding of a polypeptide, e.g. a ligand or a receptor. The art provides numerous assays particularly useful for identifying previously unknown binding partners for receptor polypeptides of the invention. For example, expression cloning using mammalian or bacterial cells, or dihybrid screening assays can be used to identify polynucleotides encoding binding partners. As another example, affinity chromatography with the appropriate immobilized polypeptide of the invention can be used to isolate polypeptides that recognize and bind polypeptides of the invention. There are a number of different libraries used for the identification of compounds, and in particular small molecules, that modulate (i.e., increase or decrease) biological activity of a polypeptide of the invention. Ligands for receptor polypeptides of the invention can also be identified by adding exogenous ligands, or cocktails of ligands, to two cell populations that are genetically identical except for the expression of the receptor of the invention: one cell population expresses the receptor of the invention whereas the other does not. The responses of the two cell populations to the addition of ligand(s) are then compared. Alternatively, an expression library can be co-expressed with the polypeptide of the invention in cells and assayed for an autocrine response to identify potential ligand(s). As still another example, BIAcore assays, gel overlay assays, or other methods known in the art can be used to identify binding partner polypeptides, including (1) organic and inorganic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries comprised of random peptides, oligonucleotides or organic molecules.


[0231] The role of downstream intracellular signaling molecules in the signaling cascade of the polypeptide of the invention can be determined. For example, a chimeric protein in which the cytoplasmic domain of the polypeptide of the invention is fused to the extracellular portion of a protein whose ligand has been identified is produced in a host cell. The cell is then incubated with the ligand specific for the extracellular portion of the chimeric protein, thereby activating the chimeric receptor. Known downstream proteins involved in intracellular signaling can then be assayed for expected modifications, e.g., phosphorylation. Other methods known to those in the art can also be used to identify signaling molecules involved in receptor activity.



Anti-Inflammatory Activity

[0232] Compositions of the present invention may also exhibit anti-inflammatory activity. The anti-inflammatory activity may be achieved by providing a stimulus to cells involved in the inflammatory response, by inhibiting or promoting cell-cell interactions (such as, for example, cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the inflammatory process, by inhibiting or promoting cell extravasation, or by stimulating or suppressing production of other factors which more directly inhibit or promote an inflammatory response. Compositions with such activities can be used to treat inflammatory conditions (including chronic or acute conditions), including without limitation inflammation associated with infection (such as septic shock, sepsis or systemic inflammatory response syndrome (SIRS)), ischemia-reperfusion injury, endotoxin lethality, arthritis, complement-mediated hyperacute rejection, nephritis, cytokine or chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease, or inflammation resulting from overproduction of cytokines such as TNF or IL-1. Compositions of the invention may also be useful to treat anaphylaxis and hypersensitivity to an antigenic substance or material. Compositions of this invention may be utilized to prevent or treat conditions such as, but not limited to, sepsis, acute pancreatitis, endotoxin shock, cytokine induced shock, rheumatoid arthritis, chronic inflammatory arthritis, pancreatic cell damage from diabetes mellitus type 1, graft versus host disease, inflammatory bowel disease, inflammation associated with pulmonary disease, or other autoimmune disease or inflammatory disease, or may be utilized as an antiproliferative agent, such as for acute or chronic myelogenous leukemia, or in the prevention of premature labor secondary to intrauterine infections.



Leukemias

[0233] Leukemias and related disorders may be treated or prevented by administration of a therapeutic that promotes or inhibits function of the polynucleotides and/or polypeptides of the invention. Such leukemias and related disorders include but are not limited to acute leukemia, acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic, promyelocytic, myelomonocytic, monocytic, erythroleukemia, chronic leukemia, chronic myelocytic (granulocytic) leukemia and chronic lymphocytic leukemia (for a review of such disorders, see Fishman et al., 1985, Medicine, 2d Ed., J. B. Lippincott Co., Philadelphia).



Nervous System Disorders

[0234] Nervous system disorders, involving cell types which can be tested for efficacy of intervention with compounds that modulate the activity of the polynucleotides and/or polypeptides of the invention, and which can be treated upon thus observing an indication of therapeutic utility, include but are not limited to nervous system injuries, and diseases or disorders which result in either a disconnection of axons, a diminution or degeneration of neurons, or demyelination. Nervous system lesions which may be treated in a patient (including human and non-human mammalian patients) according to the invention include but are not limited to the following lesions of either the central (including spinal cord, brain) or peripheral nervous systems:


[0235] (i) traumatic lesions, including lesions caused by physical injury or associated with surgery, for example, lesions which sever a portion of the nervous system, or compression injuries;


[0236] (ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous system results in neuronal injury or death, including cerebral infarction or ischemia, or spinal cord infarction or ischemia;


[0237] (iii) infectious lesions, in which a portion of the nervous system is destroyed or injured as a result of infection, for example, by an abscess or associated with infection by human immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme disease, tuberculosis, syphilis;


[0238] (iv) degenerative lesions, in which a portion of the nervous system is destroyed or injured as a result of a degenerative process including but not limited to degeneration associated with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or amyotrophic lateral sclerosis;


[0239] (v) lesions associated with nutritional diseases or disorders, in which a portion of the nervous system is destroyed or injured by a nutritional disorder or disorder of metabolism including but not limited to, vitamin B12 deficiency, folic acid deficiency, Wernicke disease, tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary degeneration of the corpus callosum), and alcoholic cerebellar degeneration;


[0240] (vi) neurological lesions associated with systemic diseases including but not limited to diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus, carcinoma, or sarcoidosis;


[0241] (vii) lesions caused by toxic substances including alcohol, lead, or particular neurotoxins; and


[0242] (viii) demyelinated lesions in which a portion of the nervous system is destroyed or injured by a demyelinating disease including but not limited to multiple sclerosis, human immunodeficiency virus-associated myelopathy, transverse myelopathy of various etiologies, progressive multifocal leukoencephalopathy, and central pontine myelinolysis.


[0243] Therapeutics which are useful according to the invention for treatment of a nervous system disorder may be selected by testing for biological activity in promoting the survival or differentiation of neurons. For example, and not by way of limitation, therapeutics which elicit any of the following effects may be useful according to the invention:


[0244] (i) increased survival time of neurons in culture;


[0245] (ii) increased sprouting of neurons in culture or in vivo;


[0246] (iii) increased production of a neuron-associated molecule in culture or in vivo, e.g., choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or


[0247] (iv) decreased symptoms of neuron dysfunction in vivo.


[0248] Such effects may be measured by any method known in the art. In preferred, non-limiting embodiments, increased survival of neurons may be measured by the method set forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting of neurons may be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol. 70:65-82) or Brown et al. (1981, Ann. Rev. Neurosci. 4:17-42); increased production of neuron-associated molecules may be measured by bioassay, enzymatic assay, antibody binding, Northern blot assay, etc., depending on the molecule to be measured; and motor neuron dysfunction may be measured by assessing the physical manifestation of motor neuron disorder, e.g., weakness, motor neuron conduction velocity, or functional disability.


[0249] In specific embodiments, motor neuron disorders that may be treated according to the invention include but are not limited to disorders such as infarction, infection, exposure to toxin, trauma, surgical damage, degenerative disease or malignancy that may affect motor neurons as well as other components of the nervous system, as well as disorders that selectively affect neurons such as amyotrophic lateral sclerosis, and including but not limited to progressive spinal muscular atrophy, progressive bulbar palsy, primary lateral sclerosis, infantile and juvenile muscular atrophy, progressive bulbar paralysis of childhood (Fazio-Londe syndrome), poliomyelitis and the post polio syndrome, and Hereditary Motorsensory Neuropathy (Charcot-Marie-Tooth Disease).



Other Activities

[0250] A polypeptide of the invention may also exhibit one or more of the following additional activities or effects: inhibiting the growth, infection or function of, or killing, infectious agents, including, without limitation, bacteria, viruses, fungi and other parasites; effecting (suppressing or enhancing) bodily characteristics, including, without limitation, height, weight, hair color, eye color, skin, fat to lean ratio or other tissue pigmentation, or organ or body part size or shape (such as, for example, breast augmentation or diminution, change in bone form or shape); effecting biorhythms or circadian cycles or rhythms; effecting the fertility of male or female subjects; effecting the metabolism, catabolism, anabolism, processing, utilization, storage or elimination of dietary fat, lipid, protein, carbohydrate, vitamins, minerals, co-factors or other nutritional factors or component(s); effecting behavioral characteristics, including, without limitation, appetite, libido, stress, cognition (including cognitive disorders), depression (including depressive disorders) and violent behaviors; providing analgesic effects or other pain reducing effects; promoting differentiation and growth of embryonic stem cells in lineages other than hematopoietic lineages; hormonal or endocrine activity; in the case of enzymes, correcting deficiencies of the enzyme and treating deficiency-related diseases; treatment of hyperproliferative disorders (such as, for example, psoriasis); immunoglobulin-like activity (such as, for example, the ability to bind antigens or complement); and the ability to act as an antigen in a vaccine composition to raise an immune response against such protein or another material or entity which is cross-reactive with such protein.



Identification of Polymorphisms

[0251] The demonstration of polymorphisms makes possible the identification of such polymorphisms in human subjects and the pharmacogenetic use of this information for diagnosis and treatment. Such polymorphisms may be associated with, e.g., differential predisposition or susceptibility to various disease states (such as disorders involving inflammation or immune response) or a differential response to drug administration, and this genetic information can be used to tailor preventive or therapeutic treatment appropriately. For example, the existence of a polymorphism associated with a predisposition to inflammation or autoimmune disease makes possible the diagnosis of this condition in humans by identifying the presence of the polymorphism.


[0252] Polymorphisms can be identified in a variety of ways known in the art which all generally involve obtaining a sample from a patient, analyzing DNA from the sample, optionally involving isolation or amplification of the DNA, and identifying the presence of the polymorphism in the DNA. For example, PCR may be used to amplify an appropriate fragment of genomic DNA which may then be sequenced. Alternatively, the DNA may be subjected to allele-specific oligonucleotide hybridization (in which appropriate oligonucleotides are hybridized to the DNA under conditions permitting detection of a single base mismatch) or to a single nucleotide extension assay (in which an oligonucleotide that hybridizes immediately adjacent to the position of the polymorphism is extended with one or more labeled nucleotides). In addition, traditional restriction fragment length polymorphism analysis (using restriction enzymes that provide differential digestion of the genomic DNA depending on the presence or absence of the polymorphism) may be performed. Arrays with nucleotide sequences of the present invention can be used to detect polymorphisms. The array can comprise modified nucleotide sequences of the present invention in order to detect the nucleotide sequences of the present invention. In the alternative, any one of the nucleotide sequences of the present invention can be placed on the array to detect changes from those sequences.
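
The sequencing route described above (amplify a genomic fragment by PCR, sequence it, and identify the polymorphism) can be illustrated with a minimal sketch. It assumes the amplified fragment has already been sequenced and aligned to a reference of equal length; the function name and both sequences are hypothetical placeholders and are not sequences of the invention.

```python
from typing import List, Tuple


def find_single_base_differences(reference: str, sample: str) -> List[Tuple[int, str, str]]:
    """Return (1-based position, reference base, sample base) for each mismatch.

    Assumes the two sequences are already aligned and of equal length;
    insertions and deletions are outside the scope of this sketch.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be pre-aligned and of equal length")
    ref, smp = reference.upper(), sample.upper()
    return [(i + 1, r, s) for i, (r, s) in enumerate(zip(ref, smp)) if r != s]


# Hypothetical fragments for illustration only.
reference_fragment = "ATGGCCTTAGC"
patient_fragment = "ATGGCTTTAGC"

differences = find_single_base_differences(reference_fragment, patient_fragment)
for position, ref_base, sample_base in differences:
    print(f"position {position}: {ref_base} -> {sample_base}")
```

A position-by-position comparison of this kind corresponds only to the sequencing approach; allele-specific hybridization, single nucleotide extension and restriction fragment length analysis detect the same difference by other means.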


[0253] Alternatively a polymorphism resulting in a change in the amino acid sequence could also be detected by detecting a corresponding change in amino acid sequence of the protein, e.g., by an antibody specific to the variant sequence.



Arthritis and Inflammation

[0254] The immunosuppressive effects of the compositions of the invention against rheumatoid arthritis are determined in an experimental animal model system. The experimental model system is adjuvant induced arthritis in rats, and the protocol is described by J. Holoshitz, et al., 1983, Science, 219:56, or by B. Waksman et al., 1963, Int. Arch. Allergy Appl. Immunol., 23:129. Induction of the disease can be caused by a single injection, generally intradermally, of a suspension of killed Mycobacterium tuberculosis in complete Freund's adjuvant (CFA). The route of injection can vary, but rats may be injected at the base of the tail with an adjuvant mixture. The polypeptide is administered in phosphate buffered saline (PBS) at a dose of about 1-5 mg/kg. The control consists of administering PBS only.


[0255] The procedure for testing the effects of the test compound would consist of intradermally injecting killed Mycobacterium tuberculosis in CFA, followed by immediately administering the test compound and subsequent treatment every other day until day 24. At 14, 15, 18, 20, 22, and 24 days after injection of Mycobacterium CFA, an overall arthritis score may be obtained as described by J. Holoshitz above. An analysis of the data would reveal that the test compound would have a dramatic effect on the swelling of the joints as measured by a decrease of the arthritis score.
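
The treatment and scoring schedule of this protocol can be sketched as follows. The sketch assumes that "immediately" corresponds to day 0 and that "every other day until day 24" means days 0, 2, ..., 24; the helper names are hypothetical, and the scores shown are arbitrary placeholders rather than experimental results.

```python
from statistics import mean
from typing import Dict, List

# Assumption: day 0 is the day of the Mycobacterium/CFA injection.
TREATMENT_DAYS: List[int] = list(range(0, 25, 2))   # 0, 2, ..., 24
SCORING_DAYS: List[int] = [14, 15, 18, 20, 22, 24]  # days named in the protocol


def mean_score_by_day(scores: Dict[int, List[float]]) -> Dict[int, float]:
    """Average the per-animal arthritis scores recorded on each scoring day."""
    return {day: mean(scores[day]) for day in SCORING_DAYS if day in scores}


def score_reduction(control: Dict[int, List[float]],
                    treated: Dict[int, List[float]]) -> Dict[int, float]:
    """Control mean minus treated mean for every day scored in both groups."""
    c, t = mean_score_by_day(control), mean_score_by_day(treated)
    return {day: c[day] - t[day] for day in SCORING_DAYS if day in c and day in t}


# Placeholder scores (not data): two animals per group, two scoring days.
control_scores = {14: [4.0, 6.0], 24: [8.0, 10.0]}
treated_scores = {14: [2.0, 4.0], 24: [3.0, 5.0]}

reduction = score_reduction(control_scores, treated_scores)
print(reduction)
```

A positive value for a given day indicates a lower mean arthritis score in the treated group than in the PBS-only control group on that day.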



Therapeutic Methods

[0256] The compositions (including polypeptide fragments, analogs, variants and antibodies or other binding partners or modulators including antisense polynucleotides) of the invention have numerous applications in a variety of therapeutic methods. Examples of therapeutic applications include, but are not limited to, those exemplified herein.



Example

[0257] One embodiment of the invention is the administration of an effective amount of the polypeptides or other composition of the invention to individuals affected by a disease or disorder that can be modulated by regulating the peptides of the invention. While the mode of administration is not particularly important, parenteral administration is preferred. An exemplary mode of administration is to deliver an intravenous bolus. The dosage of the polypeptides or other composition of the invention will normally be determined by the prescribing physician. It is to be expected that the dosage will vary according to the age, weight, condition and response of the individual patient. Typically, the amount of polypeptide administered per dose will be in the range of about 0.01 μg/kg to 100 mg/kg of body weight, with the preferred dose being about 0.1 μg/kg to 10 mg/kg of patient body weight. For parenteral administration, polypeptides of the invention will be formulated in an injectable form combined with a pharmaceutically acceptable parenteral vehicle. Such vehicles are well known in the art and examples include water, saline, Ringer's solution, dextrose solution, and solutions containing small amounts of human serum albumin. The vehicle may contain minor amounts of additives that maintain the isotonicity and stability of the polypeptide or other active ingredient. The preparation of such solutions is within the skill of the art.
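
The per-dose ranges stated above scale linearly with body weight, which can be shown with a short arithmetic sketch. The function name and the 70 kg body weight are assumptions chosen for illustration; the actual dosage remains a matter for the prescribing physician, as stated above.

```python
from typing import Tuple

UG_PER_MG = 1000.0

# Ranges from the text, expressed in micrograms per kg of body weight.
TYPICAL_RANGE_UG_PER_KG = (0.01, 100 * UG_PER_MG)   # 0.01 ug/kg to 100 mg/kg
PREFERRED_RANGE_UG_PER_KG = (0.1, 10 * UG_PER_MG)   # 0.1 ug/kg to 10 mg/kg


def dose_range_ug(weight_kg: float, preferred: bool = False) -> Tuple[float, float]:
    """Return the (lowest, highest) amount per dose in micrograms for a body weight."""
    if weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = PREFERRED_RANGE_UG_PER_KG if preferred else TYPICAL_RANGE_UG_PER_KG
    return low * weight_kg, high * weight_kg


# Hypothetical 70 kg patient, for illustration only.
typical = dose_range_ug(70.0)
preferred = dose_range_ug(70.0, preferred=True)
print(f"typical:   {typical[0]:.2f} ug to {typical[1] / UG_PER_MG:.0f} mg")
print(f"preferred: {preferred[0]:.2f} ug to {preferred[1] / UG_PER_MG:.0f} mg")
```

Working in micrograms throughout avoids mixing units across the two ends of each range, which differ by several orders of magnitude.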



Pharmaceutical Formulations and Routes of Administration

[0258] A protein or other composition of the present invention (from whatever source derived, including without limitation from recombinant and non-recombinant sources and including antibodies and other binding partners of the polypeptides of the invention) may be administered to a patient in need, by itself, or in pharmaceutical compositions where it is mixed with suitable carriers or excipient(s) at doses to treat or ameliorate a variety of disorders. Such a composition may optionally contain (in addition to protein or other active ingredient and a carrier) diluents, fillers, salts, buffers, stabilizers, solubilizers, and other materials well known in the art. The term “pharmaceutically acceptable” means a non-toxic material that does not interfere with the effectiveness of the biological activity of the active ingredient(s). The characteristics of the carrier will depend on the route of administration. The pharmaceutical composition of the invention may also contain cytokines, lymphokines, or other hematopoietic factors such as M-CSF, GM-CSF, TNF, IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, IL-13, IL-14, IL-15, IFN, TNF0, TNF1, TNF2, G-CSF, Meg-CSF, thrombopoietin, stem cell factor, and erythropoietin. In further compositions, proteins of the invention may be combined with other agents beneficial to the treatment of the disease or disorder in question. These agents include various growth factors such as epidermal growth factor (EGF), platelet-derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β), insulin-like growth factor (IGF), as well as cytokines described herein.


[0259] The pharmaceutical composition may further contain other agents which either enhance the activity of the protein or other active ingredient or complement its activity or use in treatment. Such additional factors and/or agents may be included in the pharmaceutical composition to produce a synergistic effect with protein or other active ingredient of the invention, or to minimize side effects. Conversely, protein or other active ingredient of the present invention may be included in formulations of the particular clotting factor, cytokine, lymphokine, other hematopoietic factor, thrombolytic or anti-thrombotic factor, or anti-inflammatory agent to minimize side effects of the clotting factor, cytokine, lymphokine, other hematopoietic factor, thrombolytic or anti-thrombotic factor, or anti-inflammatory agent (such as IL-1Ra, IL-1 Hy1, IL-1 Hy2, anti-TNF, corticosteroids, immunosuppressive agents). A protein of the present invention may be active in multimers (e.g., heterodimers or homodimers) or complexes with itself or other proteins. As a result, pharmaceutical compositions of the invention may comprise a protein of the invention in such multimeric or complexed form.


[0260] As an alternative to being included in a pharmaceutical composition of the invention including a first protein, a second protein or a therapeutic agent may be concurrently administered with the first protein (e.g., at the same time, or at differing times provided that therapeutic concentrations of the combination of agents are achieved at the treatment site). Techniques for formulation and administration of the compounds of the instant application may be found in “Remington's Pharmaceutical Sciences,” Mack Publishing Co., Easton, Pa., latest edition. A therapeutically effective dose further refers to that amount of the compound sufficient to result in amelioration of symptoms, e.g., treatment, healing, prevention or amelioration of the relevant medical condition, or an increase in rate of treatment, healing, prevention or amelioration of such conditions. When applied to an individual active ingredient, administered alone, a therapeutically effective dose refers to that ingredient alone. When applied to a combination, a therapeutically effective dose refers to combined amounts of the active ingredients that result in the therapeutic effect, whether administered in combination, serially or simultaneously.


[0261] In practicing the method of treatment or use of the present invention, a therapeutically effective amount of protein or other active ingredient of the present invention is administered to a mammal having a condition to be treated. Protein or other active ingredient of the present invention may be administered in accordance with the method of the invention either alone or in combination with other therapies such as treatments employing cytokines, lymphokines or other hematopoietic factors. When co-administered with one or more cytokines, lymphokines or other hematopoietic factors, protein or other active ingredient of the present invention may be administered either simultaneously with the cytokine(s), lymphokine(s), other hematopoietic factor(s), thrombolytic or anti-thrombotic factors, or sequentially. If administered sequentially, the attending physician will decide on the appropriate sequence of administering protein or other active ingredient of the present invention in combination with cytokine(s), lymphokine(s), other hematopoietic factor(s), thrombolytic or anti-thrombotic factors.



Routes of Administration

[0262] Suitable routes of administration may, for example, include oral, rectal, transmucosal, or intestinal administration; parenteral delivery, including intramuscular, subcutaneous, intramedullary injections, as well as intrathecal, direct intraventricular, intravenous, intraperitoneal, intranasal, or intraocular injections. Administration of protein or other active ingredient of the present invention used in the pharmaceutical composition or to practice the method of the present invention can be carried out in a variety of conventional ways, such as oral ingestion, inhalation, topical application or cutaneous, subcutaneous, intraperitoneal, parenteral or intravenous injection. Intravenous administration to the patient is preferred.


[0263] Alternately, one may administer the compound in a local rather than systemic manner, for example, via injection of the compound directly into arthritic joints or fibrotic tissue, often in a depot or sustained release formulation. In order to prevent the scarring process that frequently occurs as a complication of glaucoma surgery, the compounds may be administered topically, for example, as eye drops. Furthermore, one may administer the drug in a targeted drug delivery system, for example, in a liposome coated with a specific antibody, targeting, for example, arthritic or fibrotic tissue. The liposomes will be targeted to and taken up selectively by the afflicted tissue.


[0264] The polypeptides of the invention are administered by any route that delivers an effective dosage to the desired site of action. The determination of a suitable route of administration and an effective dosage for a particular indication is within the level of skill in the art. Preferably for wound treatment, one administers the therapeutic compound directly to the site. Suitable dosage ranges for the polypeptides of the invention can be extrapolated from these dosages or from similar studies in appropriate animal models. Dosages can then be adjusted as necessary by the clinician to provide maximal therapeutic benefit.



Compositions/Formulations

[0265] Pharmaceutical compositions for use in accordance with the present invention thus may be formulated in a conventional manner using one or more physiologically acceptable carriers comprising excipients and auxiliaries which facilitate processing of the active compounds into preparations which can be used pharmaceutically. These pharmaceutical compositions may be manufactured in a manner that is itself known, e.g., by means of conventional mixing, dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping or lyophilizing processes. Proper formulation is dependent upon the route of administration chosen. When a therapeutically effective amount of protein or other active ingredient of the present invention is administered orally, protein or other active ingredient of the present invention will be in the form of a tablet, capsule, powder, solution or elixir. When administered in tablet form, the pharmaceutical composition of the invention may additionally contain a solid carrier such as a gelatin or an adjuvant. The tablet, capsule, and powder contain from about 5 to 95% protein or other active ingredient of the present invention, and preferably from about 25 to 90% protein or other active ingredient of the present invention. When administered in liquid form, a liquid carrier such as water, petroleum, oils of animal or plant origin such as peanut oil, mineral oil, soybean oil, or sesame oil, or synthetic oils may be added. The liquid form of the pharmaceutical composition may further contain physiological saline solution, dextrose or other saccharide solution, or glycols such as ethylene glycol, propylene glycol or polyethylene glycol. When administered in liquid form, the pharmaceutical composition contains from about 0.5 to 90% by weight of protein or other active ingredient of the present invention, and preferably from about 1 to 50% protein or other active ingredient of the present invention.


[0266] When a therapeutically effective amount of protein or other active ingredient of the present invention is administered by intravenous, cutaneous or subcutaneous injection, protein or other active ingredient of the present invention will be in the form of a pyrogen-free, parenterally acceptable aqueous solution. The preparation of such parenterally acceptable protein or other active ingredient solutions, having due regard to pH, isotonicity, stability, and the like, is within the skill in the art. A preferred pharmaceutical composition for intravenous, cutaneous, or subcutaneous injection should contain, in addition to protein or other active ingredient of the present invention, an isotonic vehicle such as Sodium Chloride Injection, Ringer's Injection, Dextrose Injection, Dextrose and Sodium Chloride Injection, Lactated Ringer's Injection, or other vehicle as known in the art. The pharmaceutical composition of the present invention may also contain stabilizers, preservatives, buffers, antioxidants, or other additives known to those of skill in the art. For injection, the agents of the invention may be formulated in aqueous solutions, preferably in physiologically compatible buffers such as Hanks's solution, Ringer's solution, or physiological saline buffer. For transmucosal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art.


[0267] For oral administration, the compounds can be formulated readily by combining the active compounds with pharmaceutically acceptable carriers well known in the art. Such carriers enable the compounds of the invention to be formulated as tablets, pills, dragees, capsules, liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a patient to be treated. Pharmaceutical preparations for oral use can be obtained from a solid excipient, optionally grinding a resulting mixture, and processing the mixture of granules, after adding suitable auxiliaries, if desired, to obtain tablets or dragee cores. Suitable excipients are, in particular, fillers such as sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose preparations such as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin, gum tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired, disintegrating agents may be added, such as the cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt thereof such as sodium alginate. Dragee cores are provided with suitable coatings. For this purpose, concentrated sugar solutions may be used, which may optionally contain gum arabic, talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments may be added to the tablets or dragee coatings for identification or to characterize different combinations of active compound doses.


[0268] Pharmaceutical preparations which can be used orally include push-fit capsules made of gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol or sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler such as lactose, binders such as starches, and/or lubricants such as talc or magnesium stearate and, optionally, stabilizers. In soft capsules, the active compounds may be dissolved or suspended in suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition, stabilizers may be added. All formulations for oral administration should be in dosages suitable for such administration. For buccal administration, the compositions may take the form of tablets or lozenges formulated in conventional manner.


[0269] For administration by inhalation, the compounds for use according to the present invention are conveniently delivered in the form of an aerosol spray presentation from pressurized packs or a nebuliser, with the use of a suitable propellant, e.g., dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined by providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin for use in an inhaler or insufflator may be formulated containing a powder mix of the compound and a suitable powder base such as lactose or starch. The compounds may be formulated for parenteral administration by injection, e.g., by bolus injection or continuous infusion. Formulations for injection may be presented in unit dosage form, e.g., in ampules or in multi-dose containers, with an added preservative. The compositions may take such forms as suspensions, solutions or emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending, stabilizing and/or dispersing agents.


[0270] Pharmaceutical formulations for parenteral administration include aqueous solutions of the active compounds in water-soluble form. Additionally, suspensions of the active compounds may be prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or triglycerides, or liposomes. Aqueous injection suspensions may contain substances which increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or dextran. Optionally, the suspension may also contain suitable stabilizers or agents which increase the solubility of the compounds to allow for the preparation of highly concentrated solutions. Alternatively, the active ingredient may be in powder form for constitution with a suitable vehicle, e.g., sterile pyrogen-free water, before use.


[0271] The compounds may also be formulated in rectal compositions such as suppositories or retention enemas, e.g., containing conventional suppository bases such as cocoa butter or other glycerides. In addition to the formulations described previously, the compounds may also be formulated as a depot preparation. Such long acting formulations may be administered by implantation (for example subcutaneously or intramuscularly) or by intramuscular injection. Thus, for example, the compounds may be formulated with suitable polymeric or hydrophobic materials (for example as an emulsion in an acceptable oil) or ion exchange resins, or as sparingly soluble derivatives, for example, as a sparingly soluble salt.


[0272] A pharmaceutical carrier for the hydrophobic compounds of the invention is a co-solvent system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible organic polymer, and an aqueous phase. The co-solvent system may be the VPD co-solvent system. VPD is a solution of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant polysorbate 80, and 65% w/v polyethylene glycol 300, made up to volume in absolute ethanol. The VPD co-solvent system (VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in water solution. This co-solvent system dissolves hydrophobic compounds well, and itself produces low toxicity upon systemic administration. Naturally, the proportions of a co-solvent system may be varied considerably without destroying its solubility and toxicity characteristics. Furthermore, the identity of the co-solvent components may be varied: for example, other low-toxicity nonpolar surfactants may be used instead of polysorbate 80; the fraction size of polyethylene glycol may be varied; other biocompatible polymers may replace polyethylene glycol, e.g. polyvinyl pyrrolidone; and other sugars or polysaccharides may substitute for dextrose. Alternatively, other delivery systems for hydrophobic pharmaceutical compounds may be employed. Liposomes and emulsions are well known examples of delivery vehicles or carriers for hydrophobic drugs. Certain organic solvents such as dimethylsulfoxide also may be employed, although usually at the cost of greater toxicity. Additionally, the compounds may be delivered using a sustained-release system, such as semipermeable matrices of solid hydrophobic polymers containing the therapeutic agent. Various types of sustained-release materials have been established and are well known by those skilled in the art. Sustained-release capsules may, depending on their chemical nature, release the compounds for a few weeks up to over 100 days. 
Depending on the chemical nature and the biological stability of the therapeutic reagent, additional strategies for protein or other active ingredient stabilization may be employed.
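The 1:1 dilution of VPD with 5% dextrose in water described above can be worked through numerically. The following is a minimal sketch, assuming that volumes are additive on mixing (an idealization that the text does not state); the function name `dilute_one_to_one` is hypothetical and used only for illustration.

```python
# Concentrations in % w/v, as stated for VPD in the text.
VPD = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}
# Diluent: 5% dextrose in water.
D5W = {"dextrose": 5.0}


def dilute_one_to_one(stock, diluent):
    """Return % w/v of each component after mixing equal volumes.

    Assumes additive volumes (an idealization), so every component
    ends up at half of its starting concentration.
    """
    mixed = {}
    for solution in (stock, diluent):
        for name, pct in solution.items():
            mixed[name] = mixed.get(name, 0.0) + pct / 2.0
    return mixed


vpd_5w = dilute_one_to_one(VPD, D5W)
for name, pct in vpd_5w.items():
    print(f"{name}: {pct:.1f}% w/v")
```

Under this assumption each VPD component is halved in VPD:5W, and dextrose is present at half of its concentration in the diluent.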


[0273] The pharmaceutical compositions also may comprise suitable solid or gel phase carriers or excipients. Examples of such carriers or excipients include but are not limited to calcium carbonate, calcium phosphate, various sugars, starches, cellulose derivatives, gelatin, and polymers such as polyethylene glycols. Many of the active ingredients of the invention may be provided as salts with pharmaceutically compatible counter ions. Such pharmaceutically acceptable base addition salts are those salts which retain the biological effectiveness and properties of the free acids and which are obtained by reaction with inorganic or organic bases such as sodium hydroxide, magnesium hydroxide, ammonia, trialkylamine, dialkylamine, monoalkylamine, dibasic amino acids, sodium acetate, potassium benzoate, triethanol amine and the like.


[0274] The pharmaceutical composition of the invention may be in the form of a complex of the protein(s) or other active ingredient(s) of present invention along with protein or peptide antigens. The protein and/or peptide antigen will deliver a stimulatory signal to both B and T lymphocytes. B lymphocytes will respond to antigen through their surface immunoglobulin receptor. T lymphocytes will respond to antigen through the T cell receptor (TCR) following presentation of the antigen by MHC proteins. MHC and structurally related proteins including those encoded by class I and class II MHC genes on host cells will serve to present the peptide antigen(s) to T lymphocytes. The antigen components could also be supplied as purified MHC-peptide complexes alone or with co-stimulatory molecules that can directly signal T cells. Alternatively antibodies able to bind surface immunoglobulin and other molecules on B cells as well as antibodies able to bind the TCR and other molecules on T cells can be combined with the pharmaceutical composition of the invention.


[0275] The pharmaceutical composition of the invention may be in the form of a liposome in which protein of the present invention is combined, in addition to other pharmaceutically acceptable carriers, with amphipathic agents such as lipids which exist in aggregated form as micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous solution. Suitable lipids for liposomal formulation include, without limitation, mono glycerides, diglycerides, sulfatides, lysolecithins, phospholipids, saponin, bile acids, and the like. Preparation of such liposomal formulations is within the level of skill in the art, as disclosed, for example, in U.S. Pat. Nos. 4,235,871; 4,501,728; 4,837,028; and 4,737,323, all of which are incorporated herein by reference.


[0276] The amount of protein or other active ingredient of the present invention in the pharmaceutical composition of the present invention will depend upon the nature and severity of the condition being treated, and on the nature of prior treatments which the patient has undergone. Ultimately, the attending physician will decide the amount of protein or other active ingredient of the present invention with which to treat each individual patient. Initially, the attending physician will administer low doses of protein or other active ingredient of the present invention and observe the patient's response. Larger doses of protein or other active ingredient of the present invention may be administered until the optimal therapeutic effect is obtained for the patient, and at that point the dosage is not increased further. It is contemplated that the various pharmaceutical compositions used to practice the method of the present invention should contain about 0.01 μg to about 100 mg (preferably about 0.1 μg to about 10 mg, more preferably about 0.1 μg to about 1 mg) of protein or other active ingredient of the present invention per kg body weight. For compositions of the present invention which are useful for bone, cartilage, tendon or ligament regeneration, the therapeutic method includes administering the composition topically, systemically, or locally as an implant or device. When administered, the therapeutic composition for use in this invention is, of course, in a pyrogen-free, physiologically acceptable form. Further, the composition may desirably be encapsulated or injected in a viscous form for delivery to the site of bone, cartilage or tissue damage. Topical administration may be suitable for wound healing and tissue repair. 
Therapeutically useful agents other than a protein or other active ingredient of the invention which may also optionally be included in the composition as described above, may alternatively or additionally, be administered simultaneously or sequentially with the composition in the methods of the invention. Preferably for bone and/or cartilage formation, the composition would include a matrix capable of delivering the protein-containing or other active ingredient-containing composition to the site of bone and/or cartilage damage, providing a structure for the developing bone and cartilage and optimally capable of being resorbed into the body. Such matrices may be formed of materials presently in use for other implanted medical applications.


[0277] The choice of matrix material is based on biocompatibility, biodegradability, mechanical properties, cosmetic appearance and interface properties. The particular application of the compositions will define the appropriate formulation. Potential matrices for the compositions may be biodegradable and chemically defined calcium sulfate, tricalcium phosphate, hydroxyapatite, polylactic acid, polyglycolic acid and polyanhydrides. Other potential materials are biodegradable and biologically well-defined, such as bone or dermal collagen. Further matrices are comprised of pure proteins or extracellular matrix components. Other potential matrices are nonbiodegradable and chemically defined, such as sintered hydroxyapatite, bioglass, aluminates, or other ceramics. Matrices may be comprised of combinations of any of the above-mentioned types of material, such as polylactic acid and hydroxyapatite or collagen and tricalcium phosphate. The bioceramics may be altered in composition, such as in calcium-aluminate-phosphate and processing to alter pore size, particle size, particle shape, and biodegradability. Presently preferred is a 50:50 (mole weight) copolymer of lactic acid and glycolic acid in the form of porous particles having diameters ranging from 150 to 800 microns. In some applications, it will be useful to utilize a sequestering agent, such as carboxymethyl cellulose or autologous blood clot, to prevent the protein compositions from disassociating from the matrix.


[0278] A preferred family of sequestering agents is cellulosic materials such as alkylcelluloses (including hydroxyalkylcelluloses), including methylcellulose, ethylcellulose, hydroxyethylcellulose, hydroxypropylcellulose, hydroxypropyl-methylcellulose, and carboxymethylcellulose, the most preferred being cationic salts of carboxymethylcellulose (CMC). Other preferred sequestering agents include hyaluronic acid, sodium alginate, poly(ethylene glycol), polyoxyethylene oxide, carboxyvinyl polymer and poly(vinyl alcohol). The amount of sequestering agent useful herein is 0.5-20 wt %, preferably 1-10 wt % based on total formulation weight, which represents the amount necessary to prevent desorption of the protein from the polymer matrix and to provide appropriate handling of the composition, yet not so much that the progenitor cells are prevented from infiltrating the matrix, thereby providing the protein the opportunity to assist the osteogenic activity of the progenitor cells. In further compositions, proteins or other active ingredients of the invention may be combined with other agents beneficial to the treatment of the bone and/or cartilage defect, wound, or tissue in question. These agents include various growth factors such as epidermal growth factor (EGF), platelet derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β), and insulin-like growth factor (IGF).


[0279] The therapeutic compositions are also presently valuable for veterinary applications. Particularly domestic animals and thoroughbred horses, in addition to humans, are desired patients for such treatment with proteins or other active ingredients of the present invention. The dosage regimen of a protein-containing pharmaceutical composition to be used in tissue regeneration will be determined by the attending physician considering various factors which modify the action of the proteins, e.g., amount of tissue weight desired to be formed, the site of damage, the condition of the damaged tissue, the size of a wound, type of damaged tissue (e.g., bone), the patient's age, sex, and diet, the severity of any infection, time of administration and other clinical factors. The dosage may vary with the type of matrix used in the reconstitution and with inclusion of other proteins in the pharmaceutical composition. For example, the addition of other known growth factors, such as IGF I (insulin-like growth factor I), to the final composition, may also affect the dosage. Progress can be monitored by periodic assessment of tissue/bone growth and/or repair, for example, X-rays, histomorphometric determinations and tetracycline labeling.


[0280] Polynucleotides of the present invention can also be used for gene therapy. Such polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a mammalian subject. Polynucleotides of the invention may also be administered by other known methods for introduction of nucleic acid into a cell or organism (including, without limitation, in the form of viral vectors or naked DNA). Cells may also be cultured ex vivo in the presence of proteins of the present invention in order to proliferate or to produce a desired effect on or activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes.



Effective Dosage

[0281] Pharmaceutical compositions suitable for use in the present invention include compositions wherein the active ingredients are contained in an effective amount to achieve their intended purpose. More specifically, a therapeutically effective amount means an amount effective to prevent development of or to alleviate the existing symptoms of the subject being treated. Determination of the effective amount is well within the capability of those skilled in the art, especially in light of the detailed disclosure provided herein. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from appropriate in vitro assays. For example, a dose can be formulated in animal models to achieve a circulating concentration range that includes the IC50 as determined in cell culture (i.e., the concentration of the test compound which achieves a half-maximal inhibition of the protein's biological activity). Such information can be used to more accurately determine useful doses in humans.


[0282] A therapeutically effective dose refers to that amount of the compound that results in amelioration of symptoms or a prolongation of survival in a patient. Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio between LD50 and ED50. Compounds which exhibit high therapeutic indices are preferred. The data obtained from these cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized. The exact formulation, route of administration and dosage can be chosen by the individual physician in view of the patient's condition. See, e.g., Fingl et al., 1975, in “The Pharmacological Basis of Therapeutics”, Ch. 1, p. 1. Dosage amount and interval may be adjusted individually to provide plasma levels of the active moiety which are sufficient to maintain the desired effects, or minimal effective concentration (MEC). The MEC will vary for each compound but can be estimated from in vitro data. Dosages necessary to achieve the MEC will depend on individual characteristics and route of administration. However, HPLC assays or bioassays can be used to determine plasma concentrations.
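The therapeutic index defined above is a simple ratio, and a short sketch makes the arithmetic explicit. The function name and the numeric values below are hypothetical and chosen only for illustration; they are not data from this disclosure.

```python
def therapeutic_index(ld50, ed50):
    """Return the ratio LD50 / ED50.

    Both doses must be expressed in the same units (e.g., mg/kg).
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical values in mg/kg, for illustration only.
example_ti = therapeutic_index(ld50=500.0, ed50=5.0)
print(example_ti)
```

A larger ratio indicates a wider margin between the dose that is effective and the dose that is lethal in half of the population, which is why compounds with high therapeutic indices are preferred.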


[0283] Dosage intervals can also be determined using the MEC value. Compounds should be administered using a regimen which maintains plasma levels above the MEC for 10-90% of the time, preferably between 30-90% and most preferably between 50-90%. In cases of local administration or selective uptake, the effective local concentration of the drug may not be related to plasma concentration.
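The fraction of a dosing interval spent above the MEC can be estimated once a model of plasma concentration is chosen. The sketch below assumes a single bolus with first-order elimination and no carry-over between doses; this one-compartment model, the function name, and the example numbers are all assumptions introduced for illustration and are not specified in the text.

```python
import math


def fraction_of_interval_above_mec(c0, mec, half_life_h, interval_h):
    """Fraction of a dosing interval with plasma level above the MEC.

    Assumes C(t) = c0 * exp(-k * t) with k = ln(2) / half_life_h,
    i.e. a single bolus with first-order elimination and no carry-over
    from earlier doses (a simplification).
    """
    if min(c0, mec, half_life_h, interval_h) <= 0:
        raise ValueError("all arguments must be positive")
    if c0 <= mec:
        return 0.0  # the level never exceeds the MEC
    k = math.log(2) / half_life_h
    time_above_h = math.log(c0 / mec) / k
    return min(1.0, time_above_h / interval_h)


# Hypothetical numbers: peak is 4x the MEC, half-life 6 h, dosing every 24 h.
frac = fraction_of_interval_above_mec(
    c0=4.0, mec=1.0, half_life_h=6.0, interval_h=24.0
)
print(f"{frac:.0%} of the interval above the MEC")
```

In practice the concentration-time curve would come from measured plasma levels rather than from an assumed model, and shortening the interval or raising the dose increases the fraction.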


[0284] An exemplary dosage regimen for polypeptides or other compositions of the invention will be in the range of about 0.01 μg/kg to 100 mg/kg of body weight daily, with the preferred dose being about 0.1 μg/kg to 25 mg/kg of patient body weight daily, varying in adults and children. Dosing may be once daily, or equivalent doses may be delivered at longer or shorter intervals.
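Because the ranges above are stated per kilogram of body weight, the total daily amount scales linearly with the weight of the subject. The following sketch shows that conversion; the function name and the 70 kg body weight are hypothetical, and the output is an arithmetic illustration rather than a dosing recommendation.

```python
# Ranges quoted in the text, expressed in micrograms per kg per day.
UG_PER_MG = 1000.0
EXEMPLARY_RANGE_UG_PER_KG = (0.01, 100.0 * UG_PER_MG)
PREFERRED_RANGE_UG_PER_KG = (0.1, 25.0 * UG_PER_MG)


def daily_dose_range_ug(body_weight_kg, range_ug_per_kg):
    """Scale a per-kg daily dose range to a total daily dose in micrograms."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = range_ug_per_kg
    return (low * body_weight_kg, high * body_weight_kg)


# Hypothetical 70 kg adult and the preferred range.
low_ug, high_ug = daily_dose_range_ug(70.0, PREFERRED_RANGE_UG_PER_KG)
print(f"{low_ug:g} ug to {high_ug / UG_PER_MG:g} mg per day")
```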


[0285] The amount of composition administered will, of course, be dependent on the subject being treated, on the subject's age and weight, the severity of the affliction, the manner of administration and the judgment of the prescribing physician.



Packaging

[0286] The compositions may, if desired, be presented in a pack or dispenser device which may contain one or more unit dosage forms containing the active ingredient. The pack may, for example, comprise metal or plastic foil, such as a blister pack. The pack or dispenser device may be accompanied by instructions for administration. Compositions comprising a compound of the invention formulated in a compatible pharmaceutical carrier may also be prepared, placed in an appropriate container, and labeled for treatment of an indicated condition.



Antibodies

[0287] Also included in the invention are antibodies to proteins, or fragments of proteins of the invention. The term “antibody” as used herein refers to immunoglobulin molecules and immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that contain an antigen-binding site that specifically binds (immunoreacts with) an antigen. Such antibodies include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab, Fab′, and F(ab′)2 fragments, and an Fab expression library. In general, an antibody molecule obtained from humans relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ from one another by the nature of the heavy chain present in the molecule. Certain classes have subclasses as well, such as IgG1, IgG2, and others. Furthermore, in humans, the light chain may be a kappa chain or a lambda chain. Reference herein to antibodies includes a reference to all such classes, subclasses and types of human antibody species.


[0288] An isolated related protein of the invention may be intended to serve as an antigen, or a portion or fragment thereof, and additionally can be used as an immunogen to generate antibodies that immunospecifically bind the antigen, using standard techniques for polyclonal and monoclonal antibody preparation. The full-length protein can be used or, alternatively, the invention provides antigenic peptide fragments of the antigen for use as immunogens. An antigenic peptide fragment comprises at least 6 amino acid residues of the amino acid sequence of the full length protein, such as an amino acid sequence shown in SEQ ID NO: 338-674, or 837-998, or Tables 3, 5, 6, 8, or 9, and encompasses an epitope thereof such that an antibody raised against the peptide forms a specific immune complex with the full length protein or with any fragment that contains the epitope. Preferably, the antigenic peptide comprises at least 10 amino acid residues, or at least 15 amino acid residues, or at least 20 amino acid residues, or at least 30 amino acid residues. Preferred epitopes encompassed by the antigenic peptide are regions of the protein that are located on its surface; commonly these are hydrophilic regions.


[0289] In certain embodiments of the invention, at least one epitope encompassed by the antigenic peptide is a surface region of the protein, e.g., a hydrophilic region. A hydrophobicity analysis of the human related protein sequence will indicate which regions of a related protein are particularly hydrophilic and, therefore, are likely to contain surface residues useful for targeting antibody production. As a means for targeting antibody production, hydropathy plots showing regions of hydrophilicity and hydrophobicity may be generated by any method well known in the art, including, for example, the Kyte-Doolittle or the Hopp-Woods methods, either with or without Fourier transformation. See, e.g., Hopp and Woods, 1981, Proc. Nat. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle, 1982, J. Mol. Biol. 157: 105-132, each of which is incorporated herein by reference in its entirety. Antibodies that are specific for one or more domains within an antigenic protein, or derivatives, fragments, analogs or homologs thereof, are also provided herein.
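A hydropathy profile of the kind described above can be sketched as a sliding-window average over per-residue scores. The example below uses the published Kyte-Doolittle hydropathy values without Fourier transformation; the function names, the window length of 6 residues, and the toy sequence are hypothetical choices for illustration and are not taken from the sequence listing.

```python
# Kyte-Doolittle hydropathy index (Kyte and Doolittle, 1982).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Mean Kyte-Doolittle score for each full window along the sequence."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KD[aa] for aa in seq]  # KeyError on non-standard residues
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def most_hydrophilic_window(sequence, window=7):
    """Return (start_index, peptide, mean_score) of the lowest-scoring window."""
    profile = hydropathy_profile(sequence, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, sequence[start:start + window], profile[start]


# Hypothetical toy sequence, not taken from the sequence listing.
toy = "LLVVIIAKDERKDEGLLVV"
print(most_hydrophilic_window(toy, window=6))
```

On the Kyte-Doolittle scale, lower (more negative) window averages indicate more hydrophilic stretches, so the lowest-scoring window is a candidate surface region. A real analysis would also weigh secondary structure and accessibility, which this sketch ignores.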


[0290] A protein of the invention, or a derivative, fragment, analog, homolog or ortholog thereof, may be utilized as an immunogen in the generation of antibodies that immunospecifically bind these protein components.


[0291] The term “specific for” indicates that the variable regions of the antibodies of the invention recognize and bind polypeptides of the invention exclusively (i.e., able to distinguish the polypeptide of the invention from other similar polypeptides despite sequence identity, homology, or similarity found in the family of polypeptides), but may also interact with other proteins (for example, S. aureus protein A or other antibodies in ELISA techniques) through interactions with sequences outside the variable region of the antibodies, and in particular, in the constant region of the molecule. Screening assays to determine binding specificity of an antibody of the invention are well known and routinely practiced in the art. For a comprehensive discussion of such assays, see Harlow et al. (Eds), Antibodies A Laboratory Manual; Cold Spring Harbor Laboratory; Cold Spring Harbor, N.Y. (1988), Chapter 6. Antibodies that recognize and bind fragments of the polypeptides of the invention are also contemplated, provided that the antibodies are first and foremost specific for, as defined above, full-length polypeptides of the invention. As with antibodies that are specific for full length polypeptides of the invention, antibodies of the invention that recognize fragments are those which can distinguish polypeptides from the same family of polypeptides despite inherent sequence identity, homology, or similarity found in the family of proteins.


[0292] Antibodies of the invention are useful for, for example, therapeutic purposes (by modulating activity of a polypeptide of the invention), diagnostic purposes to detect or quantitate a polypeptide of the invention, as well as purification of a polypeptide of the invention. Kits comprising an antibody of the invention for any of the purposes described herein are also comprehended. In general, a kit of the invention also includes a control antigen for which the antibody is immunospecific. The invention further provides a hybridoma that produces an antibody according to the invention. Antibodies of the invention are useful for detection and/or purification of the polypeptides of the invention.


[0293] Monoclonal antibodies binding to the protein of the invention may be useful diagnostic agents for the immunodetection of the protein. Neutralizing monoclonal antibodies binding to the protein may also be useful therapeutics for both conditions associated with the protein and also in the treatment of some forms of cancer where abnormal expression of the protein is involved. In the case of cancerous cells or leukemic cells, neutralizing monoclonal antibodies against the protein may be useful in detecting and preventing the metastatic spread of the cancerous cells, which may be mediated by the protein.


[0294] The labeled antibodies of the present invention can be used for in vitro, in vivo, and in situ assays to identify cells or tissues in which a fragment of the polypeptide of interest is expressed. The antibodies may also be used directly in therapies or other diagnostics. The present invention further provides the above-described antibodies immobilized on a solid support. Examples of such solid supports include plastics such as polycarbonate, complex carbohydrates such as agarose and Sepharose®, and acrylic resins such as polyacrylamide and latex beads. Techniques for coupling antibodies to such solid supports are well known in the art (Weir, D. M. et al., “Handbook of Experimental Immunology” 4th Ed., Blackwell Scientific Publications, Oxford, England, Chapter 10 (1986); Jacoby, W. D. et al., Meth. Enzym. 34 Academic Press, N.Y. (1974)). The immobilized antibodies of the present invention can be used for in vitro, in vivo, and in situ assays as well as for immuno-affinity purification of the proteins of the present invention.


[0295] Various procedures known within the art may be used for the production of polyclonal or monoclonal antibodies directed against a protein of the invention, or against derivatives, fragments, analogs, homologs or orthologs thereof (see, for example, Antibodies: A Laboratory Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., incorporated herein by reference). Some of these antibodies are discussed below.



Polyclonal Antibodies

[0296] For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit, goat, mouse or other mammal) may be immunized by one or more injections with the native protein, a synthetic variant thereof, or a derivative of the foregoing. An appropriate immunogenic preparation can contain, for example, the naturally occurring immunogenic protein, a chemically synthesized polypeptide representing the immunogenic protein, or a recombinantly expressed immunogenic protein. Furthermore, the protein may be conjugated to a second protein known to be immunogenic in the mammal being immunized. Examples of such immunogenic proteins include but are not limited to keyhole limpet hemocyanin, serum albumin, bovine thyroglobulin, and soybean trypsin inhibitor. The preparation can further include an adjuvant. Various adjuvants used to increase the immunological response include, but are not limited to, Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide), surface-active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, dinitrophenol, etc.), adjuvants usable in humans such as Bacille Calmette-Guerin and Corynebacterium parvum, or similar immunostimulatory agents. Additional examples of adjuvants that can be employed include MPL-TDM adjuvant (monophosphoryl Lipid A, synthetic trehalose dicorynomycolate).


[0297] The polyclonal antibody molecules directed against the immunogenic protein can be isolated from the mammal (e.g., from the blood) and further purified by well known techniques, such as affinity chromatography using protein A or protein G, which provide primarily the IgG fraction of immune serum. Subsequently, or alternatively, the specific antigen which is the target of the immunoglobulin sought, or an epitope thereof, may be immobilized on a column to purify the immune specific antibody by immunoaffinity chromatography. Purification of immunoglobulins is discussed, for example, by D. Wilkinson (The Scientist, published by The Scientist, Inc., Philadelphia Pa., Vol. 14, No. 8 (Apr. 17, 2000), pp. 25-28).



Monoclonal Antibodies

[0298] The term “monoclonal antibody” (MAb) or “monoclonal antibody composition”, as used herein, refers to a population of antibody molecules that contain only one molecular species of antibody molecule consisting of a unique light chain gene product and a unique heavy chain gene product. In particular, the complementarity determining regions (CDRs) of the monoclonal antibody are identical in all the molecules of the population. MAbs thus contain an antigen-binding site capable of immunoreacting with a particular epitope of the antigen characterized by a unique binding affinity for it.


[0299] Monoclonal antibodies can be prepared using hybridoma methods, such as those described by Kohler and Milstein, Nature, 256, 495 (1975). In a hybridoma method, a mouse, hamster, or other appropriate host animal, is typically immunized with an immunizing agent to elicit lymphocytes that produce or are capable of producing antibodies that will specifically bind to the immunizing agent. Alternatively, the lymphocytes can be immunized in vitro.


[0300] The immunizing agent will typically include the protein antigen, a fragment thereof or a fusion protein thereof. Generally, either peripheral blood lymphocytes are used if cells of human origin are desired, or spleen cells or lymph node cells are used if non-human mammalian sources are desired. The lymphocytes are then fused with an immortalized cell line using a suitable fusing agent, such as polyethylene glycol, to form a hybridoma cell (Goding, Monoclonal Antibodies: Principles and Practice, Academic Press, (1986) pp. 59-103). Immortalized cell lines are usually transformed mammalian cells, particularly myeloma cells of rodent, bovine and human origin. Usually, rat or mouse myeloma cell lines are employed. The hybridoma cells can be cultured in a suitable culture medium that preferably contains one or more substances that inhibit the growth or survival of the unfused, immortalized cells. For example, if the parental cells lack the enzyme hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the culture medium for the hybridomas typically will include hypoxanthine, aminopterin, and thymidine (“HAT medium”), which substances prevent the growth of HGPRT-deficient cells.


[0301] Preferred immortalized cell lines are those that fuse efficiently, support stable high level expression of antibody by the selected antibody-producing cells, and are sensitive to a medium such as HAT medium. More preferred immortalized cell lines are murine myeloma lines, which can be obtained, for instance, from the Salk Institute Cell Distribution Center, San Diego, Calif. and the American Type Culture Collection, Manassas, Va. Human myeloma and mouse-human heteromyeloma cell lines also have been described for the production of human monoclonal antibodies (Kozbor, J. Immunol., 133:3001 (1984); Brodeur et al., Monoclonal Antibody Production Techniques and Applications, Marcel Dekker, Inc., New York, (1987) pp. 51-63).


[0302] The culture medium in which the hybridoma cells are cultured can then be assayed for the presence of monoclonal antibodies directed against the antigen. Preferably, the binding specificity of monoclonal antibodies produced by the hybridoma cells is determined by immunoprecipitation or by an in vitro binding assay, such as radioimmunoassay (RIA) or enzyme-linked immunosorbent assay (ELISA). Such techniques and assays are known in the art. The binding affinity of the monoclonal antibody can, for example, be determined by the Scatchard analysis of Munson and Pollard, Anal. Biochem., 107, 220 (1980). Preferably, antibodies having a high degree of specificity and a high binding affinity for the target antigen are isolated.
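By way of illustration only, the classical linear Scatchard transformation can be sketched as follows. This is a minimal sketch assuming a single class of non-interacting binding sites; it is not the computerized fitting procedure of Munson and Pollard. The ratio of bound to free ligand (B/F) is regressed on bound ligand (B), the slope is taken as -1/Kd and the x-intercept as Bmax. The function name `scatchard_fit` and the synthetic data are hypothetical.

```python
def scatchard_fit(bound, free):
    """Fit a least-squares line through the points (B, B/F).

    Under a single-site model, B/F = Bmax/Kd - B/Kd, so the slope is -1/Kd
    and the x-intercept is Bmax. Returns (Kd, Bmax).
    """
    xs = list(bound)
    ys = [b / f for b, f in zip(bound, free)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope
    bmax = -intercept / slope
    return kd, bmax


# Synthetic, noise-free data generated from B = Bmax * F / (Kd + F).
# These values are illustrative assumptions, not measurements.
KD_TRUE, BMAX_TRUE = 2.0, 10.0
free_conc = [0.5, 1.0, 2.0, 4.0, 8.0]
bound_conc = [BMAX_TRUE * f / (KD_TRUE + f) for f in free_conc]

kd_est, bmax_est = scatchard_fit(bound_conc, free_conc)
print(kd_est, bmax_est)
```

With real binding data the points scatter around the line, and curvature in the plot may indicate more than one class of binding site, in which case a linear fit of this kind is not appropriate.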


[0303] After the desired hybridoma cells are identified, the clones can be subcloned by limiting dilution procedures and grown by standard methods. Suitable culture media for this purpose include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 medium. Alternatively, the hybridoma cells can be grown in vivo as ascites in a mammal.


[0304] The monoclonal antibodies secreted by the subclones can be isolated or purified from the culture medium or ascites fluid by conventional immunoglobulin purification procedures such as, for example, protein A-Sepharose, hydroxylapatite chromatography, gel electrophoresis, dialysis, or affinity chromatography.


[0305] The monoclonal antibodies can also be made by recombinant DNA methods, such as those described in U.S. Pat. No. 4,816,567. DNA encoding the monoclonal antibodies of the invention can be readily isolated and sequenced using conventional procedures (e.g., by using oligonucleotide probes that are capable of binding specifically to genes encoding the heavy and light chains of murine antibodies). The hybridoma cells of the invention serve as a preferred source of such DNA. Once isolated, the DNA can be placed into expression vectors, which are then transfected into host cells such as simian COS cells, Chinese hamster ovary (CHO) cells, or myeloma cells that do not otherwise produce immunoglobulin protein, to obtain the synthesis of monoclonal antibodies in the recombinant host cells. The DNA also can be modified, for example, by substituting the coding sequence for human heavy and light chain constant domains in place of the homologous murine sequences (U.S. Pat. No. 4,816,567; Morrison, Nature 368, 812-13 (1994)) or by covalently joining to the immunoglobulin coding sequence all or part of the coding sequence for a non-immunoglobulin polypeptide. Such a non-immunoglobulin polypeptide can be substituted for the constant domains of an antibody of the invention, or can be substituted for the variable domains of one antigen-combining site of an antibody of the invention to create a chimeric bivalent antibody.



Humanized Antibodies

[0306] The antibodies directed against the protein antigens of the invention can further comprise humanized antibodies or human antibodies. These antibodies are suitable for administration to humans without engendering an immune response by the human against the administered immunoglobulin. Humanized forms of antibodies are chimeric immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab′, F(ab′)2 or other antigen-binding subsequences of antibodies) that are principally comprised of the sequence of a human immunoglobulin, and contain minimal sequence derived from a non-human immunoglobulin. Humanization can be performed following the method of Winter and co-workers (Jones et al., Nature, 321, 522-525 (1986); Riechmann et al., Nature, 332, 323-327 (1988); Verhoeyen et al., Science, 239, 1534-1536 (1988)), by substituting rodent CDRs or CDR sequences for the corresponding sequences of a human antibody. (See also U.S. Pat. No. 5,225,539). In some instances, Fv framework residues of the human immunoglobulin are replaced by corresponding non-human residues. Humanized antibodies can also comprise residues that are found neither in the recipient antibody nor in the imported CDR or framework sequences. In general, the humanized antibody will comprise substantially all of at least one, and typically two, variable domains, in which all or substantially all of the CDR regions correspond to those of a non-human immunoglobulin and all or substantially all of the framework regions are those of a human immunoglobulin consensus sequence. The humanized antibody optimally also will comprise at least a portion of an immunoglobulin constant region (Fc), typically that of a human immunoglobulin (Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op. Struct. Biol., 2, 593-596 (1992)).



Human Antibodies

[0307] Fully human antibodies relate to antibody molecules in which essentially the entire sequences of both the light chain and the heavy chain, including the CDRs, arise from human genes. Such antibodies are termed “human antibodies”, or “fully human antibodies” herein. Human monoclonal antibodies can be prepared by the trioma technique; the human B-cell hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV hybridoma technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Human monoclonal antibodies may be utilized in the practice of the present invention and may be produced by using human hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci USA 80, 2026-2030) or by transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In: Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96).


[0308] In addition, human antibodies can also be produced using additional techniques, including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227, 381 (1991); Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the endogenous immunoglobulin genes have been partially or completely inactivated. Upon challenge, human antibody production is observed, which closely resembles that seen in humans in all respects, including gene rearrangement, assembly, and antibody repertoire. This approach is described, for example, in U.S. Pat. Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126; 5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10, 779-783 (1992)); Lonberg et al. (Nature 368, 856-859 (1994)); Morrison (Nature 368, 812-13 (1994)); Fishwild et al, (Nature Biotechnology 14, 845-51 (1996)); Neuberger (Nature Biotechnology 14, 826 (1996)); and Lonberg and Huszar (Intern. Rev. Immunol. 13, 65-93 (1995)).


[0309] Human antibodies may additionally be produced using transgenic nonhuman animals that are modified so as to produce fully human antibodies rather than the animal's endogenous antibodies in response to challenge by an antigen. (See PCT publication WO94/02602). The endogenous genes encoding the heavy and light immunoglobulin chains in the nonhuman host have been incapacitated, and active loci encoding human heavy and light chain immunoglobulins are inserted into the host's genome. The human genes are incorporated, for example, using yeast artificial chromosomes containing the requisite human DNA segments. An animal which provides all the desired modifications is then obtained as progeny by crossbreeding intermediate transgenic animals containing fewer than the full complement of the modifications. The preferred embodiment of such a nonhuman animal is a mouse, and is termed the Xenomouse™ as disclosed in PCT publications WO 96/33735 and WO 96/34096. This animal produces B cells that secrete fully human immunoglobulins. The antibodies can be obtained directly from the animal after immunization with an immunogen of interest, as, for example, a preparation of a polyclonal antibody, or alternatively from immortalized B cells derived from the animal, such as hybridomas producing monoclonal antibodies. Additionally, the genes encoding the immunoglobulins with human variable regions can be recovered and expressed to obtain the antibodies directly, or can be further modified to obtain analogs of antibodies such as, for example, single chain Fv molecules.


[0310] An example of a method of producing a nonhuman host, exemplified as a mouse, lacking expression of an endogenous immunoglobulin heavy chain is disclosed in U.S. Pat. No. 5,939,598. It can be obtained by a method including deleting the J segment genes from at least one endogenous heavy chain locus in an embryonic stem cell to prevent rearrangement of the locus and to prevent formation of a transcript of a rearranged immunoglobulin heavy chain locus, the deletion being effected by a targeting vector containing a gene encoding a selectable marker; and producing from the embryonic stem cell a transgenic mouse whose somatic and germ cells contain the gene encoding the selectable marker.


[0311] A method for producing an antibody of interest, such as a human antibody, is disclosed in U.S. Pat. No. 5,916,771. It includes introducing an expression vector that contains a nucleotide sequence encoding a heavy chain into one mammalian host cell in culture, introducing an expression vector containing a nucleotide sequence encoding a light chain into another mammalian host cell, and fusing the two cells to form a hybrid cell. The hybrid cell expresses an antibody containing the heavy chain and the light chain.


[0312] In a further improvement on this procedure, a method for identifying a clinically relevant epitope on an immunogen, and a correlative method for selecting an antibody that binds immunospecifically to the relevant epitope with high affinity, are disclosed in PCT publication WO 99/53049.



Fab Fragments and Single Chain Antibodies

[0313] According to the invention, techniques can be adapted for the production of single-chain antibodies specific to an antigenic protein of the invention (see e.g., U.S. Pat. No. 4,946,778). In addition, methods can be adapted for the construction of Fab expression libraries (see e.g., Huse, et al., 1989 Science 246, 1275-1281) to allow rapid and effective identification of monoclonal Fab fragments with the desired specificity for a protein or derivatives, fragments, analogs or homologs thereof. Antibody fragments that contain the idiotypes to a protein antigen may be produced by techniques known in the art including, but not limited to: (i) an F(ab′)2 fragment produced by pepsin digestion of an antibody molecule; (ii) an Fab fragment generated by reducing the disulfide bridges of an F(ab′)2 fragment; (iii) an Fab fragment generated by the treatment of the antibody molecule with papain and a reducing agent and (iv) Fv fragments.



Bispecific Antibodies

[0314] Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that have binding specificities for at least two different antigens. In the present case, one of the binding specificities is for an antigenic protein of the invention. The second binding target is any other antigen, and advantageously is a cell-surface protein or receptor or receptor subunit.


[0315] Methods for making bispecific antibodies are known in the art. Traditionally, the recombinant production of bispecific antibodies is based on the co-expression of two immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different specificities (Milstein and Cuello, Nature, 305, 537-539 (1983)). Because of the random assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas) produce a potential mixture of ten different antibody molecules, of which only one has the correct bispecific structure. The purification of the correct molecule is usually accomplished by affinity chromatography steps. Similar procedures are disclosed in WO 93/08829, published May 13, 1993, and in Traunecker et al., 1991 EMBO J., 10, 3655-3659.
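The figure of ten different antibody molecules can be illustrated by a simple enumeration. The sketch below assumes, for illustration only, that each heavy chain can pair with either light chain and that any two heavy/light arms can assemble into one molecule; the chain labels H1, H2, L1 and L2 are hypothetical, with H1/L1 and H2/L2 taken as the cognate pairs.

```python
from itertools import combinations_with_replacement, product

heavy_chains = ["H1", "H2"]
light_chains = ["L1", "L2"]

# Each arm of an antibody is one heavy chain paired with one light chain.
arms = [h + l for h, l in product(heavy_chains, light_chains)]

# A molecule is an unordered pair of arms (the two arms may be identical).
molecules = list(combinations_with_replacement(arms, 2))

# Only the molecule with both cognate arms has the desired bispecific structure.
bispecific = [m for m in molecules if set(m) == {"H1L1", "H2L2"}]

print(len(arms), len(molecules), bispecific)
```

Four possible arms give 4 × 5 / 2 = 10 unordered pairs, of which a single pair carries both correctly matched binding sites.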


[0316] Antibody variable domains with the desired binding specificities (antibody-antigen combining sites) can be fused to immunoglobulin constant domain sequences. The fusion preferably is with an immunoglobulin heavy-chain constant domain, comprising at least part of the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant region (CH1) containing the site necessary for light-chain binding present in at least one of the fusions. DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the immunoglobulin light chain, are inserted into separate expression vectors, and are co-transfected into a suitable host organism. For further details of generating bispecific antibodies see, for example, Suresh et al., Methods in Enzymology, 121, 210 (1986).


[0317] According to another approach described in WO 96/27011, the interface between a pair of antibody molecules can be engineered to maximize the percentage of heterodimers that are recovered from recombinant cell culture. The preferred interface comprises at least a part of the CH3 region of an antibody constant domain. In this method, one or more small amino acid side chains from the interface of the first antibody molecule are replaced with larger side chains (e.g. tyrosine or tryptophan). Compensatory “cavities” of identical or similar size to the large side chain(s) are created on the interface of the second antibody molecule by replacing large amino acid side chains with smaller ones (e.g. alanine or threonine). This provides a mechanism for increasing the yield of the heterodimer over other unwanted end-products such as homodimers.


[0318] Bispecific antibodies can be prepared as full-length antibodies or antibody fragments (e.g. F(ab′)2 bispecific antibodies). Techniques for generating bispecific antibodies from antibody fragments have been described in the literature. For example, bispecific antibodies can be prepared using chemical linkage. Brennan et al., Science 229, 81 (1985) describe a procedure wherein intact antibodies are proteolytically cleaved to generate F(ab′)2 fragments. These fragments are reduced in the presence of the dithiol complexing agent sodium arsenite to stabilize vicinal dithiols and prevent intermolecular disulfide formation. The Fab′ fragments generated are then converted to thionitrobenzoate (TNB) derivatives. One of the Fab′-TNB derivatives is then reconverted to the Fab′-thiol by reduction with mercaptoethylamine and is mixed with an equimolar amount of the other Fab′-TNB derivative to form the bispecific antibody. The bispecific antibodies produced can be used as agents for the selective immobilization of enzymes.


[0319] Additionally, Fab′ fragments can be directly recovered from E. coli and chemically coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175, 217-225 (1992) describe the production of a fully humanized bispecific antibody F(ab′)2 molecule. Each Fab′ fragment was separately secreted from E. coli and subjected to directed chemical coupling in vitro to form the bispecific antibody. The bispecific antibody thus formed was able to bind to cells overexpressing the ErbB2 receptor and normal human T cells, as well as trigger the lytic activity of human cytotoxic lymphocytes against human breast tumor targets.


[0320] Various techniques for making and isolating bispecific antibody fragments directly from recombinant cell culture have also been described. For example, bispecific antibodies have been produced using leucine zippers. Kostelny et al., J. Immunol. 148(5), 1547-1553 (1992). The leucine zipper peptides from the Fos and Jun proteins were linked to the Fab′ portions of two different antibodies by gene fusion. The antibody homodimers were reduced at the hinge region to form monomers and then re-oxidized to form the antibody heterodimers. This method can also be utilized for the production of antibody homodimers. The “diabody” technology described by Hollinger et al., Proc. Natl. Acad. Sci. USA 90, 6444-6448 (1993) has provided an alternative mechanism for making bispecific antibody fragments. The fragments comprise a heavy-chain variable domain (VH) connected to a light-chain variable domain (VL) by a linker which is too short to allow pairing between the two domains on the same chain. Accordingly, the VH and VL domains of one fragment are forced to pair with the complementary VL and VH domains of another fragment, thereby forming two antigen-binding sites. Another strategy for making bispecific antibody fragments by the use of single-chain Fv (sFv) dimers has also been reported. See, Gruber et al., J. Immunol. 152, 5368 (1994).


[0321] Antibodies with more than two valencies are contemplated. For example, trispecific antibodies can be prepared. Tutt et al., J. Immunol. 147, 60 (1991).


[0322] Exemplary bispecific antibodies can bind to two different epitopes, at least one of which originates in the protein antigen of the invention. Alternatively, an anti-antigenic arm of an immunoglobulin molecule can be combined with an arm which binds to a triggering molecule on a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3, CD28, or B7), or Fc receptors for IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16) so as to focus cellular defense mechanisms to the cell expressing the particular antigen. Bispecific antibodies can also be used to direct cytotoxic agents to cells which express a particular antigen. These antibodies possess an antigen-binding arm and an arm which binds a cytotoxic agent or a radionuclide chelator, such as EOTUBE, DTPA, DOTA, or TETA. Another bispecific antibody of interest binds the protein antigen described herein and further binds tissue factor (TF).



Heteroconjugate Antibodies

[0323] Heteroconjugate antibodies are also within the scope of the present invention. Heteroconjugate antibodies are composed of two covalently joined antibodies. Such antibodies have, for example, been proposed to target immune system cells to unwanted cells (U.S. Pat. No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO 92/200373; EP 03089). It is contemplated that the antibodies can be prepared in vitro using known methods in synthetic protein chemistry, including those involving crosslinking agents. For example, immunotoxins can be constructed using a disulfide exchange reaction or by forming a thioether bond. Examples of suitable reagents for this purpose include iminothiolate and methyl-4-mercaptobutyrimidate and those disclosed, for example, in U.S. Pat. No. 4,676,980.



Effector Function Engineering

[0324] It can be desirable to modify the antibody of the invention with respect to effector function, so as to enhance, e.g., the effectiveness of the antibody in treating cancer. For example, cysteine residue(s) can be introduced into the Fc region, thereby allowing interchain disulfide bond formation in this region. The homodimeric antibody thus generated can have improved internalization capability and/or increased complement-mediated cell killing and antibody-dependent cellular cytotoxicity (ADCC). See Caron et al., J. Exp Med., 176, 1191-1195 (1992) and Shopes, J. Immunol., 148, 2918-2922 (1992). Homodimeric antibodies with enhanced anti-tumor activity can also be prepared using heterobifunctional cross-linkers as described in Wolff et al. Cancer Research, 53, 2560-2565 (1993). Alternatively, an antibody can be engineered that has dual Fc regions and can thereby have enhanced complement lysis and ADCC capabilities. See Stevenson et al., Anti-Cancer Drug Design, 3, 219-230 (1989).



Immunoconjugates

[0325] The invention also pertains to immunoconjugates comprising an antibody conjugated to a cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active toxin of bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive isotope (i.e., a radioconjugate).


[0326] Chemotherapeutic agents useful in the generation of such immunoconjugates have been described above. Enzymatically active toxins and fragments thereof that can be used include diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A chain (from Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain, alpha-sarcin, Aleurites fordii proteins, dianthin proteins, Phytolacca americana proteins (PAPI, PAPII, and PAP-S), Momordica charantia inhibitor, curcin, crotin, Saponaria officinalis inhibitor, gelonin, mitogellin, restrictocin, phenomycin, enomycin, and the trichothecenes. A variety of radionuclides are available for the production of radioconjugated antibodies. Examples include 212Bi, 131I, 131In, 90Y, and 186Re.


[0327] Conjugates of the antibody and cytotoxic agent are made using a variety of bifunctional protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithio) propionate (SPDP), iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl adipimidate HCl), active esters (such as disuccinimidyl suberate), aldehydes (such as glutaraldehyde), bis-azido compounds (such as bis (p-azidobenzoyl) hexanediamine), bis-diazonium derivatives (such as bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates (such as tolylene 2,6-diisocyanate), and bis-active fluorine compounds (such as 1,5-difluoro-2,4-dinitrobenzene). For example, a ricin immunotoxin can be prepared as described in Vitetta et al., Science, 238: 1098 (1987). Carbon-14-labeled 1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid (MX-DTPA) is an exemplary chelating agent for conjugation of radionuclide to the antibody. See WO94/11026.


[0328] In another embodiment, the antibody can be conjugated to a “receptor” (such as streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate is administered to the patient, followed by removal of unbound conjugate from the circulation using a clearing agent and then administration of a “ligand” (e.g., avidin) that is in turn conjugated to a cytotoxic agent.



Computer Readable Sequences

[0329] In one application of this embodiment, a nucleotide sequence of the present invention can be recorded on computer readable media. As used herein, “computer readable media” refers to any medium which can be read and accessed directly by a computer. Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as magnetic/optical storage media. A skilled artisan can readily appreciate how any of the presently known computer readable media can be used to create a manufacture comprising a computer readable medium having recorded thereon a nucleotide sequence of the present invention. As used herein, “recorded” refers to a process for storing information on a computer readable medium. A skilled artisan can readily adopt any of the presently known methods for recording information on a computer readable medium to generate manufactures comprising the nucleotide sequence information of the present invention.


[0330] A variety of data storage structures are available to a skilled artisan for creating a computer readable medium having recorded thereon a nucleotide sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. A skilled artisan can readily adapt any number of data processor structuring formats (e.g. text file or database) in order to obtain computer readable medium having recorded thereon the nucleotide sequence information of the present invention.
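As one hedged illustration of recording sequence information in an ASCII text file, the sketch below writes and re-reads a record in the widely used FASTA layout (a header line beginning with ">" followed by the sequence wrapped at a fixed width). The function names, file name and placeholder sequence are assumptions made for this example; the placeholder is not a sequence of the invention.

```python
import os
import tempfile


def write_fasta(path, records, width=60):
    """Record (name, sequence) pairs in an ASCII file, FASTA style."""
    with open(path, "w", encoding="ascii") as handle:
        for name, seq in records:
            handle.write(">" + name + "\n")
            for i in range(0, len(seq), width):
                handle.write(seq[i:i + width] + "\n")


def read_fasta(path):
    """Read the file back into a list of (name, sequence) pairs."""
    records = []
    name, chunks = None, []
    with open(path, encoding="ascii") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if name is not None:
                    records.append((name, "".join(chunks)))
                name, chunks = line[1:], []
            else:
                chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


# Placeholder record for illustration only.
example = [("placeholder_seq_1", "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG" * 3)]

with tempfile.TemporaryDirectory() as tmp:
    fasta_path = os.path.join(tmp, "sequences.fasta")
    write_fasta(fasta_path, example)
    round_trip = read_fasta(fasta_path)

print(round_trip == example)
```

A plain text layout of this kind can be read by most sequence analysis software, whereas a database application may be preferable when many sequences must be indexed and queried.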


[0331] By providing any of the nucleotide sequences SEQ ID NO: 1-337, or 675-836 or a representative fragment thereof; or a nucleotide sequence at least 95% identical to any of the nucleotide sequences of SEQ ID NO: 1-337, or 675-836 in computer readable form, a skilled artisan can routinely access the sequence information for a variety of purposes. Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium. The examples which follow demonstrate how software which implements the BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993)) search algorithms on a Sybase system is used to identify open reading frames (ORFs) within a nucleic acid sequence. Such ORFs may be protein-encoding fragments and may be useful in producing commercially important proteins such as enzymes used in fermentation reactions and in the production of commercially useful metabolites.
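The notion of identifying ORFs in a recorded sequence can be sketched as below. This is a deliberately simple scan of the three forward reading frames for an ATG codon followed in frame by a stop codon; it does not implement the BLAST or BLAZE algorithms named above, and the function name, the `min_codons` parameter and the test sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) for each ATG...stop ORF on the forward strand.

    start is the 0-based index of the A in ATG, end is exclusive and includes
    the stop codon, and min_codons counts the start and stop codons.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


# Illustrative input only: ATG AAA TTT TAG begins at index 2 (frame 2).
print(find_orfs("CCATGAAATTTTAGGG"))
```

A fuller treatment would also scan the reverse complement strand and would typically apply a much larger minimum length before treating an ORF as a candidate protein-encoding fragment.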


[0332] As used herein, “a computer-based system” refers to the hardware means, software means, and data storage means used to analyze the nucleotide sequence information of the present invention. The minimum hardware means of the computer-based systems of the present invention comprises a central processing unit (CPU), input means, output means, and data storage means. A skilled artisan can readily appreciate that any one of the currently available computer-based systems are suitable for use in the present invention. As stated above, the computer-based systems of the present invention comprise a data storage means having stored therein a nucleotide sequence of the present invention and the necessary hardware means and software means for supporting and implementing a search means. As used herein, “data storage means” refers to memory which can store nucleotide sequence information of the present invention, or a memory access means which can access manufactures having recorded thereon the nucleotide sequence information of the present invention.


[0333] As used herein, “search means” refers to one or more programs which are implemented on the computer-based system to compare a target sequence or target structural motif with the sequence information stored within the data storage means. Search means are used to identify fragments or regions of a known sequence which match a particular target sequence or target motif. A variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting search means is available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTX (NCBI). A skilled artisan can readily recognize that any one of the available algorithms or implementing software packages for conducting homology searches can be adapted for use in the present computer-based systems. As used herein, a “target sequence” can be any nucleic acid or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. The most preferred sequence length of a target sequence is from about 10 to 300 amino acids, more preferably from about 30 to 100 nucleotide residues. However, it is well recognized that searches for commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.
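The relationship between target length and chance occurrence can be made concrete with a simple estimate. The sketch below assumes a database of independent, equally probable residues, an idealization that the specification does not state and that real databases do not satisfy; the function name and parameters are illustrative only.

```python
def expected_random_hits(target_len: int, db_len: int, alphabet_size: int = 4) -> float:
    """Expected number of exact chance matches of one target in a random database.

    Assumes independent, uniformly distributed residues; treat the result
    as an order-of-magnitude guide only.
    """
    if target_len < 1 or db_len < target_len:
        return 0.0
    positions = db_len - target_len + 1  # possible start positions
    return positions * alphabet_size ** (-target_len)
```

With alphabet_size=4 the estimate applies to nucleotide targets and with alphabet_size=20 to amino acid targets. Each added residue lowers the expected number of chance matches by a factor equal to the alphabet size, which is why longer targets are less likely to occur at random.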


[0334] As used herein, “a target structural motif,” or “target motif,” refers to any rationally selected sequence or combination of sequences in which the sequence(s) are chosen based on a three-dimensional configuration which is formed upon the folding of the target motif. There are a variety of target motifs known in the art. Protein target motifs include, but are not limited to, enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited to, promoter sequences, hairpin structures and inducible expression elements (protein binding sequences).



Triple Helix Formation

[0335] In addition, the fragments of the present invention, as broadly described, can be used to control gene expression through triple helix formation or antisense DNA or RNA, both of which methods are based on the binding of a polynucleotide sequence to DNA or RNA. Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in length and are designed to be complementary to a region of the gene involved in transcription (triple helix-see Lee et al., Nucl. Acids Res. 6, 3073 (1979); Cooney et al., Science 241, 456 (1988); and Dervan et al., Science 251, 1360 (1991)) or to the mRNA itself (antisense-Okano, J. Neurochem. 56:560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca Raton, Fla. (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into polypeptide. Both techniques have been demonstrated to be effective in model systems. Information contained in the sequences of the present invention is necessary for the design of an antisense or triple helix oligonucleotide.
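As an illustration of how sequence information feeds into antisense design, the following sketch derives a DNA oligonucleotide complementary to a chosen window of an mRNA and enforces the preferred 20 to 40 base length. The function name and the choice of a simple reverse complement are assumptions made for illustration; practical design would also weigh factors such as secondary structure and specificity, which are not modeled here.

```python
# Complement table; U (RNA) and T (DNA) both pair with A.
_COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def antisense_oligo(mrna: str, start: int, length: int = 20) -> str:
    """Return a DNA oligo (5'->3') complementary to mrna[start:start+length]."""
    if not 20 <= length <= 40:
        raise ValueError("length outside the preferred 20-40 base range")
    target = mrna[start:start + length]
    if len(target) < length:
        raise ValueError("target region runs past the end of the sequence")
    # Complement each base, then reverse so the result reads 5'->3'.
    return target.translate(_COMPLEMENT)[::-1].upper()
```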



Diagnostic Assays and Kits

[0336] The present invention further provides methods to identify the presence or expression of one of the ORFs of the present invention, or homolog thereof, in a test sample, using a nucleic acid probe or antibodies of the present invention, optionally conjugated or otherwise associated with a suitable label.


[0337] In general, methods for detecting a polynucleotide of the invention can comprise contacting a sample with a compound that binds to and forms a complex with the polynucleotide for a period sufficient to form the complex, and detecting the complex, so that if a complex is detected, a polynucleotide of the invention is detected in the sample. Such methods can also comprise contacting a sample under stringent hybridization conditions with nucleic acid primers that anneal to a polynucleotide of the invention under such conditions, and amplifying annealed polynucleotides, so that if a polynucleotide is amplified, a polynucleotide of the invention is detected in the sample.


[0338] In general, methods for detecting a polypeptide of the invention can comprise contacting a sample with a compound that binds to and forms a complex with the polypeptide for a period sufficient to form the complex, and detecting the complex, so that if a complex is detected, a polypeptide of the invention is detected in the sample.


[0339] In detail, such methods comprise incubating a test sample with one or more of the antibodies or one or more of the nucleic acid probes of the present invention and assaying for binding of the nucleic acid probes or antibodies to components within the test sample.


[0340] Conditions for incubating a nucleic acid probe or antibody with a test sample vary. Incubation conditions depend on the format employed in the assay, the detection methods employed, and the type and nature of the nucleic acid probe or antibody used in the assay. One skilled in the art will recognize that any one of the commonly available hybridization, amplification or immunological assay formats can readily be adapted to employ the nucleic acid probes or antibodies of the present invention. Examples of such assays can be found in Chard, T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier Science Publishers, Amsterdam, The Netherlands (1986); Bullock, G. R. et al., Techniques in Immunocytochemistry, Academic Press, Orlando, Fla. Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice and Theory of Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology, Elsevier Science Publishers, Amsterdam, The Netherlands (1985). The test samples of the present invention include cells, protein or membrane extracts of cells, or biological fluids such as sputum, blood, serum, plasma, or urine. The test sample used in the above-described method will vary based on the assay format, nature of the detection method and the tissues, cells or extracts used as the sample to be assayed. Methods for preparing protein extracts or membrane extracts of cells are well known in the art and can readily be adapted in order to obtain a sample which is compatible with the system utilized.


[0341] In another embodiment of the present invention, kits are provided which contain the necessary reagents to carry out the assays of the present invention. Specifically, the invention provides a compartment kit to receive, in close confinement, one or more containers which comprise: (a) a first container comprising one of the probes or antibodies of the present invention; and (b) one or more other containers comprising one or more of the following: wash reagents, and reagents capable of detecting the presence of a bound probe or antibody.


[0342] In detail, a compartment kit includes any kit in which reagents are contained in separate containers. Such containers include small glass containers, plastic containers or strips of plastic or paper. Such containers allow one to efficiently transfer reagents from one compartment to another compartment such that the samples and reagents are not cross-contaminated, and the agents or solutions of each container can be added in a quantitative fashion from one compartment to another. Such containers will include a container which will accept the test sample, a container which contains the antibodies used in the assay, containers which contain wash reagents (such as phosphate buffered saline, Tris-buffers, etc.), and containers which contain the reagents used to detect the bound antibody or probe. Types of detection reagents include labeled nucleic acid probes, labeled secondary antibodies, or in the alternative, if the primary antibody is labeled, the enzymatic, or antibody binding reagents which are capable of reacting with the labeled antibody. One skilled in the art will readily recognize that the disclosed probes and antibodies of the present invention can be readily incorporated into one of the established kit formats which are well known in the art.



Medical Imaging

[0343] The novel polypeptides and binding partners of the invention are useful in medical imaging of sites expressing the molecules of the invention (e.g., where the polypeptide of the invention is involved in the immune response, for imaging sites of inflammation or infection). See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such methods involve chemical attachment of a labeling or imaging agent, administration of the labeled polypeptide to a subject in a pharmaceutically acceptable carrier, and imaging the labeled polypeptide in vivo at the target site.



Screening Assays

[0344] Using the isolated proteins and polynucleotides of the invention, the present invention further provides methods of obtaining and identifying agents which bind to a polypeptide encoded by an ORF corresponding to any of the nucleotide sequences set forth in SEQ ID NO: 1-337, or 675-836, or bind to a specific domain of the polypeptide encoded by the nucleic acid. In detail, said method comprises the steps of:


[0345] (a) contacting an agent with an isolated protein encoded by an ORF of the present invention, or nucleic acid of the invention; and


[0346] (b) determining whether the agent binds to said protein or said nucleic acid.


[0347] In general, therefore, such methods for identifying compounds that bind to a polynucleotide of the invention can comprise contacting a compound with a polynucleotide of the invention for a time sufficient to form a polynucleotide/compound complex, and detecting the complex, so that if a polynucleotide/compound complex is detected, a compound that binds to a polynucleotide of the invention is identified.


[0348] Likewise, in general, such methods for identifying compounds that bind to a polypeptide of the invention can comprise contacting a compound with a polypeptide of the invention for a time sufficient to form a polypeptide/compound complex, and detecting the complex, so that if a polypeptide/compound complex is detected, a compound that binds to a polypeptide of the invention is identified.


[0349] Methods for identifying compounds that bind to a polypeptide of the invention can also comprise contacting a compound with a polypeptide of the invention in a cell for a time sufficient to form a polypeptide/compound complex, wherein the complex drives expression of a reporter gene sequence in the cell, and detecting the complex by detecting reporter gene sequence expression, so that if a polypeptide/compound complex is detected, a compound that binds a polypeptide of the invention is identified.


[0350] Compounds identified via such methods can include compounds which modulate the activity of a polypeptide of the invention (that is, increase or decrease its activity, relative to activity observed in the absence of the compound). Alternatively, compounds identified via such methods can include compounds which modulate the expression of a polynucleotide of the invention (that is, increase or decrease expression relative to expression levels observed in the absence of the compound). Compounds, such as compounds identified via the methods of the invention, can be tested using standard assays well known to those of skill in the art for their ability to modulate activity/expression.


[0351] The agents screened in the above assay can be, but are not limited to, peptides, carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be selected and screened at random or rationally selected or designed using protein modeling techniques.


[0352] For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and the like are selected at random and are assayed for their ability to bind to the protein encoded by the ORF of the present invention. Alternatively, agents may be rationally selected or designed. As used herein, an agent is said to be “rationally selected or designed” when the agent is chosen based on the configuration of the particular protein. For example, one skilled in the art can readily adapt currently available procedures to generate peptides, pharmaceutical agents and the like, capable of binding to a specific peptide sequence, in order to generate rationally designed antipeptide peptides (see, for example, Hruby et al., “Application of Synthetic Peptides: Antisense Peptides,” in Synthetic Peptides, A User's Guide, W. H. Freeman, N.Y. (1992), pp. 289-307, and Kasprzak et al., Biochemistry 28:9230-8 (1989)), or pharmaceutical agents, or the like.


[0353] In addition to the foregoing, one class of agents of the present invention, as broadly described, can be used to control gene expression through binding to one of the ORFs or EMFs of the present invention. As described above, such agents can be randomly screened or rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design sequence specific or element specific agents, modulating the expression of either a single ORF or multiple ORFs which rely on the same EMF for expression control. One class of DNA binding agents are agents which contain base residues which hybridize or form a triple helix by binding to DNA or RNA. Such agents can be based on the classic phosphodiester, ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric derivatives which have base attachment capacity.


[0354] Agents suitable for use in these methods preferably contain 20 to 40 bases and are designed to be complementary to a region of the gene involved in transcription (triple helix—see Lee et al., Nucl. Acids Res. 6, 3073 (1979); Cooney et al., Science 241, 456 (1988); and Dervan et al., Science 251, 1360 (1991)) or to the mRNA itself (antisense-Okano, J. Neurochem. 56, 560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca Raton, Fla. (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into polypeptide. Both techniques have been demonstrated to be effective in model systems. Information contained in the sequences of the present invention is necessary for the design of an antisense or triple helix oligonucleotide and other DNA binding agents.


[0355] Agents which bind to a protein encoded by one of the ORFs of the present invention can be used as a diagnostic agent. Agents which bind to a protein encoded by one of the ORFs of the present invention can be formulated using known techniques to generate a pharmaceutical composition.



Use of Nucleic Acids as Probes

[0356] Another aspect of the subject invention is to provide for polypeptide-specific nucleic acid hybridization probes capable of hybridizing with naturally occurring nucleotide sequences. The hybridization probes of the subject invention may be derived from any of the nucleotide sequences SEQ ID NO: 1-337, or 675-836. Because the corresponding gene is only expressed in a limited number of tissues, a hybridization probe derived from any of the nucleotide sequences SEQ ID NO: 1-337, or 675-836 can be used as an indicator of the presence of RNA of a cell type of such a tissue in a sample.


[0357] Any suitable hybridization technique can be employed, such as, for example, in situ hybridization. PCR as described in U.S. Pat. Nos. 4,683,195 and 4,965,188 provides additional uses for oligonucleotides based upon the nucleotide sequences. Such probes used in PCR may be of recombinant origin, may be chemically synthesized, or a mixture of both. The probe will comprise a discrete nucleotide sequence for the detection of identical sequences or a degenerate pool of possible sequences for identification of closely related genomic sequences.
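A degenerate pool of possible sequences can be enumerated from a single degenerate probe definition. The sketch below assumes the probe is written with the standard IUPAC nucleotide ambiguity codes; the function name is illustrative, and the approach simply lists every concrete sequence the degenerate definition represents.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(probe: str) -> list[str]:
    """List every concrete sequence represented by a degenerate probe."""
    choices = [IUPAC[base] for base in probe.upper()]
    return ["".join(combo) for combo in product(*choices)]
```

The pool size is the product of the number of alternatives at each position, so it grows quickly with the number of ambiguous positions.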


[0358] Other means for producing specific hybridization probes for nucleic acids include the cloning of nucleic acid sequences into vectors for the production of mRNA probes. Such vectors are known in the art and are commercially available and may be used to synthesize RNA probes in vitro by means of the addition of the appropriate RNA polymerase, such as T7 or SP6 RNA polymerase, and the appropriate radioactively labeled nucleotides. The nucleotide sequences may be used to construct hybridization probes for mapping their respective genomic sequences. The nucleotide sequence provided herein may be mapped to a chromosome or specific regions of a chromosome using well-known genetic and/or chromosomal mapping techniques. These techniques include in situ hybridization, linkage analysis against known chromosomal markers, hybridization screening with libraries or flow-sorted chromosomal preparations specific to known chromosomes, and the like. The technique of fluorescent in situ hybridization of chromosome spreads has been described, among other places, in Verma et al (1988) Human Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York N.Y.


[0359] Fluorescent in situ hybridization of chromosomal preparations and other physical chromosome mapping techniques may be correlated with additional genetic map data. Examples of genetic map data can be found in the 1994 Genome Issue of Science (265:1981f). Correlation between the location of a nucleic acid on a physical chromosomal map and a specific disease (or predisposition to a specific disease) may help delimit the region of DNA associated with that genetic disease. The nucleotide sequences of the subject invention may be used to detect differences in gene sequences between normal, carrier or affected individuals.



Preparation of Support Bound Oligonucleotides

[0360] Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared by, for example, directly synthesizing the oligonucleotide by chemical means, as is commonly practiced using an automated oligonucleotide synthesizer.


[0361] Support bound oligonucleotides may be prepared by any of the methods known to those of skill in the art using any suitable support such as glass, polystyrene or Teflon. One strategy is to precisely spot oligonucleotides synthesized by standard synthesizers. Immobilization can be achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin. Microbiol. 28(6), 1469-72); using UV light (Nagata et al., 1985; Dahlen et al., 1987; Morrissey & Collins, (1989) Mol. Cell Probes 3(2) 189-207) or by covalent binding of base modified DNA (Keller et al., 1988; 1989); all references being specifically incorporated herein.


[0362] Another strategy that may be employed is the use of the strong biotin-streptavidin interaction as a linker. For example, Broude et al. (1994) Proc. Natl. Acad. Sci. USA 91(8), 3072-6, describe the use of biotinylated probes, although these are duplex probes, that are immobilized on streptavidin-coated magnetic beads. Streptavidin-coated beads may be purchased from Dynal, Oslo. Of course, this same linking chemistry is applicable to coating any surface with streptavidin. Biotinylated probes may be purchased from various sources, such as, e.g., Operon Technologies (Alameda, Calif.).


[0363] Nunc Laboratories (Naperville, Ill.) also sells suitable material that could be used. Nunc Laboratories has developed a method, termed CovaLink NH, by which DNA can be covalently bound to the microwell surface. CovaLink NH is a polystyrene surface grafted with secondary amino groups (>NH) that serve as bridgeheads for further covalent coupling. CovaLink Modules may be purchased from Nunc Laboratories. DNA molecules may be bound to CovaLink exclusively at the 5′-end by a phosphoramidate bond, allowing immobilization of more than 1 pmol of DNA (Rasmussen et al., (1991) Anal. Biochem. 198(1) 138-42).


[0364] The use of CovaLink NH strips for covalent binding of DNA molecules at the 5′-end has been described (Rasmussen et al., (1991)). In this technology, a phosphoramidate bond is employed (Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is beneficial as immobilization using only a single covalent bond is preferred. The phosphoramidate bond joins the DNA to the CovaLink NH secondary amino groups that are positioned at the end of 2 nm long spacer arms covalently grafted onto the polystyrene surface. To link an oligonucleotide to CovaLink NH via a phosphoramidate bond, the oligonucleotide terminus must have a 5′-end phosphate group. It is, perhaps, even possible for biotin to be covalently bound to CovaLink and then streptavidin used to bind the probes.


[0365] More specifically, the linkage method includes dissolving DNA in water (7.5 ng/μl) and denaturing for 10 min. at 95° C. and cooling on ice for 10 min. Ice-cold 0.1 M 1-methylimidazole, pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM 1-MeIm7. A ss DNA solution is then dispensed into CovaLink NH strips (75 μl/well) standing on ice.


[0366] Carbodiimide, 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC), dissolved in 10 mM 1-MeIm7, is made fresh and 25 μl added per well. The strips are incubated for 5 hours at 50° C. After incubation the strips are washed using, e.g., Nunc-Immuno Wash; first the wells are washed 3 times, then they are soaked with washing solution for 5 min., and finally they are washed 3 times (wherein the washing solution is 0.4 N NaOH, 0.25% SDS heated to 50° C.).
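The quantities in the linkage protocol imply a particular composition in each well, which the following sketch works out. It assumes that 7.5 ng/μl is the DNA concentration before the 1-methylimidazole stock is added, that volumes are additive, and that the buffer stays at 10 mM because the EDC is itself dissolved in 10 mM 1-MeIm7; none of these assumptions is stated explicitly in the protocol, and the function name is illustrative.

```python
def covalink_well_composition(
    dna_stock_ng_per_ul: float = 7.5,   # DNA in water before buffer addition (assumed)
    meim_stock_mM: float = 100.0,       # 0.1 M 1-methylimidazole stock
    meim_final_mM: float = 10.0,        # target buffer concentration
    dna_volume_ul: float = 75.0,        # ss DNA solution dispensed per well
    edc_stock_mM: float = 200.0,        # 0.2 M EDC
    edc_volume_ul: float = 25.0,        # EDC solution added per well
) -> dict:
    """Estimate per-well amounts, assuming additive volumes."""
    # Fraction of the DNA/buffer mixture contributed by the buffer stock.
    stock_fraction = meim_final_mM / meim_stock_mM
    dna_ng_per_ul = dna_stock_ng_per_ul * (1.0 - stock_fraction)
    total_ul = dna_volume_ul + edc_volume_ul
    return {
        "dna_ng_per_well": dna_ng_per_ul * dna_volume_ul,
        "total_volume_ul": total_ul,
        "edc_final_mM": edc_stock_mM * edc_volume_ul / total_ul,
        "meim_final_mM": meim_final_mM,
    }
```

Under these assumptions each well holds roughly 0.5 μg of DNA in 100 μl, with EDC diluted fourfold to about 50 mM.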


[0367] It is contemplated that a further suitable method for use with the present invention is that described in PCT Patent Application WO 90/03382 (Southern & Maskos), incorporated herein by reference. This method of preparing an oligonucleotide bound to a support involves attaching a nucleoside 3′-reagent through the phosphate group by a covalent phosphodiester link to aliphatic hydroxyl groups carried by the support. The oligonucleotide is then synthesized on the supported nucleoside and protecting groups removed from the synthetic oligonucleotide chain under standard conditions that do not cleave the oligonucleotide from the support. Suitable reagents include nucleoside phosphoramidite and nucleoside hydrogen phosphonate.


[0368] An on-chip strategy for the preparation of DNA probe arrays may be employed. For example, addressable laser-activated photodeprotection may be employed in the chemical synthesis of oligonucleotides directly on a glass surface, as described by Fodor et al. (1991) Science 251(4995), 767-73, incorporated herein by reference. Probes may also be immobilized on nylon supports as described by Van Ness et al. (1991) Nucleic Acids Res., 19(12) 3345-50; or linked to Teflon using the method of Duncan & Cavalier (1988) Anal. Biochem. 169(1), 104-8; all references being specifically incorporated herein.


[0369] To link an oligonucleotide to a nylon support, as described by Van Ness et al. (1991), requires activation of the nylon surface via alkylation and selective activation of the 5′-amine of oligonucleotides with cyanuric chloride.


[0370] One particular way to prepare support bound oligonucleotides is to utilize the light-generated synthesis described by Pease et al. ((1994) Proc. Natl. Acad. Sci., USA 91(11), 5022-6, incorporated herein by reference). These authors used current photolithographic techniques to generate arrays of immobilized oligonucleotide probes (DNA chips). These methods, in which light is used to direct the synthesis of oligonucleotide probes in high-density, miniaturized arrays, utilize photolabile 5′-protected N-acyl-deoxynucleoside phosphoramidites, surface linker chemistry and versatile combinatorial synthesis strategies. A matrix of 256 spatially defined oligonucleotide probes may be generated in this manner.



Preparation of Nucleic Acid Fragments

[0371] The nucleic acids may be obtained from any appropriate source, such as cDNAs, genomic DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC inserts, and RNA, including mRNA without any amplification steps. For example, Sambrook et al. (1989) describes three protocols for the isolation of high molecular weight DNA from mammalian cells (p. 9.14-9.23).


[0372] DNA fragments may be prepared as clones in M13, plasmid or lambda vectors and/or prepared directly from genomic DNA or cDNA by PCR or other amplification methods. Samples may be prepared or dispensed in multiwell plates. About 100-1000 ng of DNA samples may be prepared in 2-500 μl of final volume.


[0373] The nucleic acids would then be fragmented by any of the methods known to those of skill in the art including, for example, using restriction enzymes as described at 9.24-9.28 of Sambrook et al. (1989), shearing by ultrasound and NaOH treatment.


[0374] Low pressure shearing is also appropriate, as described by Schriefer et al. ((1990) Nucleic Acids Res. 18(24), 7455-6, incorporated herein by reference). In this method, DNA samples are passed through a small French pressure cell at a variety of low to intermediate pressures. A lever device allows controlled application of low to intermediate pressures to the cell. The results of these studies indicate that low-pressure shearing is a useful alternative to sonic and enzymatic DNA fragmentation methods.


[0375] One particularly suitable way for fragmenting DNA is contemplated to be that using the two base recognition endonuclease, CviJI, described by Fitzgerald et al. (1992) Nucleic Acids Res. 20(14) 3753-62. These authors described an approach for the rapid fragmentation and fractionation of DNA into particular sizes that they contemplated to be suitable for shotgun cloning and sequencing.


[0376] The restriction endonuclease CviJI normally cleaves the recognition sequence PuGCPy between the G and C to leave blunt ends. Atypical reaction conditions, which alter the specificity of this enzyme (CviJI**), yield a quasi-random distribution of DNA fragments from the small molecule pUC19 (2688 base pairs). Fitzgerald et al. (1992) quantitatively evaluated the randomness of this fragmentation strategy, using a CviJI** digest of pUC19 that was size fractionated by a rapid gel filtration method and directly ligated, without end repair, to a lac Z minus M13 cloning vector. Sequence analysis of 76 clones showed that CviJI** restricts PyGCPy and PuGCPu, in addition to PuGCPy sites, and that new sequence data is accumulated at a rate consistent with random fragmentation.
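The difference between the normal and relaxed specificities can be illustrated with an in silico digest. The sketch below treats the input as a linear sequence (pUC19 itself is circular), assumes complete digestion at every matching site, and cuts between the G and C of each site; the function and pattern names are illustrative and do not correspond to any published tool.

```python
import re

# Lookaheads let overlapping recognition sites be found.
# Normal CviJI: PuGCPy. Relaxed (CviJI**): PuGCPy, PuGCPu and PyGCPy.
NORMAL_SITE = re.compile(r"(?=[AG]GC[CT])")
RELAXED_SITE = re.compile(r"(?=[AG]GC[ACGT]|[CT]GC[CT])")


def digest(seq: str, relaxed: bool = False) -> list[str]:
    """Cut a linear sequence between G and C of every recognized site."""
    seq = seq.upper()
    pattern = RELAXED_SITE if relaxed else NORMAL_SITE
    # The site starts one base before the G, so the blunt cut is 2 bases in.
    cuts = [m.start() + 2 for m in pattern.finditer(seq)]
    fragments, prev = [], 0
    for cut in cuts:
        fragments.append(seq[prev:cut])
        prev = cut
    fragments.append(seq[prev:])
    return fragments
```

Under the relaxed pattern three of the four NGCN classes are cut, with only PyGCPu left intact, which is consistent with the much denser, quasi-random fragmentation described above.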


[0377] As reported in the literature, advantages of this approach compared to sonication and agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 μg instead of 2-5 μg); and fewer steps are involved (no preligation, end repair, chemical extraction, or agarose gel electrophoresis and elution are needed).


[0378] Irrespective of the manner in which the nucleic acid fragments are obtained or prepared, it is important to denature the DNA to give single stranded pieces available for hybridization. This is achieved by incubating the DNA solution for 2-5 minutes at 80-90° C. The solution is then cooled quickly to 2° C. to prevent renaturation of the DNA fragments before they are contacted with the chip. Phosphate groups must also be removed from genomic DNA by methods known in the art.



Preparation of DNA Arrays

[0379] Arrays may be prepared by spotting DNA samples on a support such as a nylon membrane. Spotting may be performed by using arrays of metal pins (the positions of which correspond to an array of wells in a microtiter plate) to repeatedly transfer about 20 nl of a DNA solution to a nylon membrane. By offset printing, a density of dots higher than the density of the wells is achieved. One to 25 dots may be accommodated in 1 mm², depending on the type of label used. By avoiding spotting in some preselected number of rows and columns, separate subsets (subarrays) may be formed. Samples in one subarray may be the same genomic segment of DNA (or the same gene) from different individuals, or may be different, overlapped genomic clones. Each of the subarrays may represent replica spotting of the same samples. In one example, a selected gene segment may be amplified from 64 patients. For each patient, the amplified gene segment may be in one 96-well plate (all 96 wells containing the same sample). A plate for each of the 64 patients is prepared. By using a 96-pin device, all samples may be spotted on one 8×12 cm membrane. Subarrays may contain 64 samples, one from each patient. Where the 96 subarrays are identical, the dot span may be 1 mm² and there may be a 1 mm space between subarrays.
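The geometry of the 64-patient example can be sketched as a coordinate calculation. The code below assumes an 8×12 pin layout, an 8×8 arrangement of the 64 patient dots within each subarray filled in row-major order, and a 1 mm dot pitch with a 1 mm gap between subarrays; the 8×8 arrangement and fill order are assumptions made for illustration and are not specified in the text.

```python
def spot_position_mm(
    pin_row: int,
    pin_col: int,
    patient: int,
    dots_per_side: int = 8,   # 8 x 8 = 64 dots per subarray (assumed layout)
    dot_pitch: float = 1.0,   # each dot occupies a 1 mm x 1 mm cell
    gap: float = 1.0,         # space between neighbouring subarrays
) -> tuple[float, float]:
    """Centre (x, y) in mm of one patient's dot within one pin's subarray."""
    if not (0 <= pin_row < 8 and 0 <= pin_col < 12):
        raise ValueError("pin position outside the 8 x 12 device")
    if not 0 <= patient < dots_per_side ** 2:
        raise ValueError("patient index outside the subarray")
    subarray_pitch = dots_per_side * dot_pitch + gap  # 9 mm with the defaults
    row, col = divmod(patient, dots_per_side)         # row-major fill (assumed)
    x = pin_col * subarray_pitch + col * dot_pitch + dot_pitch / 2
    y = pin_row * subarray_pitch + row * dot_pitch + dot_pitch / 2
    return x, y
```

With these defaults the subarray pitch is 9 mm, which equals the well spacing of a standard 96-well plate, and the full pattern spans about 107 × 71 mm, fitting within the 8×12 cm membrane.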


[0380] Another approach is to use membranes or plates (available from NUNC, Naperville, Ill.) which may be partitioned by physical spacers e.g. a plastic grid molded over the membrane, the grid being similar to the sort of membrane applied to the bottom of multiwell plates, or hydrophobic strips. A fixed physical spacer is not preferred for imaging by exposure to flat phosphor-storage screens or x-ray films.


[0381] The present invention is illustrated in the following examples. Upon consideration of the present disclosure, one of skill in the art will appreciate that many other embodiments and variations may be made in the scope of the present invention. Accordingly, it is intended that the broader aspects of the present invention not be limited to the disclosure of the following examples. The present invention is not to be limited in scope by the exemplified embodiments which are intended as illustrations of single aspects of the invention, and compositions and methods which are functionally equivalent are within the scope of the invention. Indeed, numerous modifications and variations in the practice of the invention are expected to occur to those skilled in the art upon consideration of the present preferred embodiments. Consequently, the only limitations which should be placed upon the scope of the invention are those which appear in the appended claims.


[0382] All references cited within the body of the instant specification are hereby incorporated by reference in their entirety.







EXAMPLES


Example 1

[0383] Novel Nucleic Acid Sequences Obtained from Various Libraries


[0384] A plurality of novel nucleic acids were obtained from cDNA libraries prepared from various human tissues and in some cases isolated from a genomic library derived from human chromosome using standard PCR, SBH sequence signature analysis and Sanger sequencing techniques. The inserts of the library were amplified with PCR using primers specific for the vector sequences which flank the inserts. Clones from cDNA libraries were spotted on nylon membrane filters and screened with oligonucleotide probes (e.g., 7-mers) to obtain signature sequences. The clones were clustered into groups of similar or identical sequences. Representative clones were selected for sequencing.
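The idea of grouping clones by probe signature can be illustrated in simplified form. The sketch below is not the SBH signature analysis actually used; it assumes an idealized, noise-free signature (the set of probes found as exact substrings of a clone) and a greedy grouping by Jaccard similarity, with all names and the threshold chosen for illustration.

```python
def signature(clone: str, probes: list[str]) -> frozenset:
    """Idealized signature: the probes that occur exactly in the clone."""
    return frozenset(p for p in probes if p in clone)


def jaccard(a: frozenset, b: frozenset) -> float:
    """Similarity of two signatures (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_clones(clones: dict, probes: list[str], threshold: float = 0.8) -> list:
    """Greedily group clone names whose signatures resemble a cluster's first member."""
    clusters = []  # list of (representative signature, [clone names])
    for name, seq in clones.items():
        sig = signature(seq.upper(), probes)
        for representative, members in clusters:
            if jaccard(sig, representative) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((sig, [name]))
    return [members for _, members in clusters]
```

One representative per resulting group could then be chosen for sequencing, mirroring the selection step described above.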


[0385] In some cases, the 5′ sequence of the amplified inserts was then deduced using a typical Sanger sequencing protocol. PCR products were purified and subjected to fluorescent dye terminator cycle sequencing. Single pass gel sequencing was done using a 377 Applied Biosystems (ABI) sequencer to obtain the novel nucleic acid sequences.



Example 2

[0386] Assemblage of Novel Nucleic Acids


[0387] The contigs or nucleic acids of the present invention, designated as SEQ ID NO: 675-836 were assembled using an EST sequence as a seed. Then a recursive algorithm was used to extend the seed EST into an extended assemblage, by pulling additional sequences from different databases (i.e., Hyseq's database containing EST sequences, dbEST, gb pri, and UniGene, and exons from public domain genomic sequences predicted by GenScan) that belong to this assemblage. The algorithm terminated when there were no additional sequences from the above databases that would extend the assemblage. Further, inclusion of component sequences into the assemblage was based on a BLASTN hit to the extending assemblage with BLAST score greater than 300 and percent identity greater than 95%.
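The seed-extension procedure can be sketched as a loop that runs until no further sequences qualify. The code below is a minimal illustration, not the proprietary implementation: `find_hits` is a hypothetical callable standing in for a BLASTN search of the databases against the current assemblage, and only the two stated thresholds (score greater than 300, identity greater than 95%) are taken from the text.

```python
from typing import Callable, Iterable, NamedTuple, Set


class Hit(NamedTuple):
    seq_id: str
    score: float     # BLAST score of the hit against the assemblage
    identity: float  # percent identity of the hit


def extend_assemblage(
    seed_id: str,
    find_hits: Callable[[Set[str]], Iterable[Hit]],  # hypothetical search function
    min_score: float = 300.0,
    min_identity: float = 95.0,
) -> Set[str]:
    """Grow an assemblage from a seed until no new qualifying hits appear."""
    members = {seed_id}
    while True:
        new = {
            hit.seq_id
            for hit in find_hits(members)
            if hit.score > min_score and hit.identity > min_identity
        } - members
        if not new:  # termination: nothing left that extends the assemblage
            return members
        members |= new
```

The loop is written iteratively rather than recursively, but it reaches the same fixed point: the set of sequences reachable from the seed through qualifying hits.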


[0388] Table 8 sets forth the novel predicted polypeptides (including proteins) encoded by the novel polynucleotides (SEQ ID NO: 675-836) of the present invention, and their corresponding translation start and stop nucleotide locations to each of SEQ ID NO: 675-836. Table 8 also indicates the method by which the polypeptide was predicted. Method A refers to a polypeptide obtained by using a software program called FASTY (available from http://fasta.bioch.virginia.edu) which selects a polypeptide based on a comparison of the translated novel polynucleotide to known polynucleotides (W. R. Pearson, Methods in Enzymology, 183:63-98 (1990), herein incorporated by reference). Method B refers to a polypeptide obtained by using a software program called GenScan for human/vertebrate sequences (available from Stanford University, Office of Technology Licensing) that predicts the polypeptide based on a probabilistic model of gene structure/compositional properties (C. Burge and S. Karlin, J. Mol. Biol., 268:78-94 (1997), incorporated herein by reference). Method C refers to a polypeptide obtained by using a Hyseq proprietary software program that translates the novel polynucleotide and its complementary strand into six possible amino acid sequences (forward and reverse frames) and chooses the polypeptide with the longest open reading frame.
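The idea behind Method C (six-frame translation followed by selection of the longest open reading frame) can be sketched in Python as follows. The Hyseq program itself is proprietary and its details are not given here, so this is only an illustration under stated assumptions: the standard genetic code is used, an open reading frame is taken to run from the first methionine after a stop codon up to the next stop codon or the end of the sequence, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def reverse_complement(dna):
    """Return the complementary strand read 5' to 3'."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons; unknown codons (e.g. containing N) become X."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3))


def longest_orf(dna):
    """Return the longest Met-initiated peptide over all six reading frames."""
    dna = dna.upper()
    best = ""
    for strand in (dna, reverse_complement(dna)):
        for frame in range(3):
            # Stop codons ('*') split the frame into candidate segments.
            for segment in translate(strand[frame:]).split("*"):
                start = segment.find("M")   # assumed ORF definition: begin at first Met
                if start != -1 and len(segment) - start > len(best):
                    best = segment[start:]
    return best


peptide = longest_orf("CCATGGCTTGGTAAGG")
```

For the short example sequence, the third forward frame reads ATG GCT TGG TAA, so the selected peptide is `MAW`.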



Example 3

[0389] Novel Nucleic Acids


[0390] The novel nucleic acids of the present invention SEQ ID NO: 1-337 were assembled from Hyseq's proprietary EST sequences as described in Example 1 and human genome sequences that are available from the public databases (http://www.ncbi.nlm.nih.gov/). Exons were predicted from human genome sequences using GenScan (http://genes.mit.edu/GENSCANinfo.html); HMMgene (http://www.cbs.dtu.dk/services/HMMgene/hmmgene1 1.html); and GenMark.hmm (http://genemark.biology.gatech.edu/GeneMark/whmm info.html). The Hyseq proprietary EST sequences and the predicted exons were assembled based on a BLASTN hit to the extending assemblage with a BLAST score greater than 300 and percent identity greater than 95%. Then, the predicted genes were analyzed using the Neural Network SignalP V1.1 program (from Center for Biological Sequence Analysis, The Technical University of Denmark) for the presence of a signal peptide. These sequences were further analyzed for the presence of transmembrane region(s) using the TMpred program (http://www.ch.embnet.org/software/TMPRED form.html).


[0391] Table 1 shows the various tissue sources of SEQ ID NO: 1-337.


[0392] The homologs for polypeptides SEQ ID NO: 338-674, which correspond to nucleotide sequences SEQ ID NO: 1-337, were obtained by BLASTP version 2.0 a1 19MP-WashU searches against Genpept release 124 and Geneseq (Derwent) release 200117 using the BLAST algorithm. The results showing homologs for SEQ ID NO: 338-674 from Genpept 124 are shown in Table 2.


[0393] Using the eMatrix software package (Stanford University, Stanford, Calif.) (Wu et al., J. Comp. Biol., Vol. 6, 219-235 (1999), http://motif.stanford.edu/ematrix-search/, herein incorporated by reference), all the polypeptide sequences were examined to determine whether they had identifiable signature regions. Scoring matrices of the eMatrix software package are derived from the BLOCKS, PRINTS, PFAM, PRODOM, and DOMO databases. Table 3 shows the accession number of the homologous eMatrix signature found in the indicated polypeptide sequence, its description, and the results obtained, which include accession number subtype; raw score; p-value; and the position of the signature in the amino acid sequence.


[0394] Using the Pfam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were examined for domains with homology to certain peptide domains. Table 4 shows the name of the Pfam model found, the description, the e-value and the Pfam score for the identified model within the sequence. Further description of the Pfam models can be found at http://pfam.wustl.edu/.


[0395] The GeneAtlas™ software package (Molecular Simulations Inc. (MSI), San Diego, Calif.) was used to predict the three-dimensional structure models for the polypeptides encoded by SEQ ID NO: 1-337 (i.e. SEQ ID NO: 338-674). Models were generated by (1) PSI-BLAST, which is a multiple alignment sequence profile-based searching method developed by Altschul et al. (Nucl. Acids. Res. 25, 3389-3408 (1997)), (2) High Throughput Modeling (HTM) (Molecular Simulations Inc. (MSI), San Diego, Calif.), which is an automated sequence and structure searching procedure (http://www.msi.com/), and (3) SeqFold™, which is a fold recognition method described by Fischer and Eisenberg (J. Mol. Biol. 209, 779-791 (1998)). This analysis was carried out, in part, by comparing the polypeptides of the invention with the known NMR (nuclear magnetic resonance) and x-ray crystal three-dimensional structures as templates. Table 5 shows: “PDB ID”, the Protein DataBase (PDB) identifier given to the template structure; “Chain ID”, the identifier of the subcomponent of the PDB template structure; “Compound Information”, information on the PDB template structure and/or its subcomponents; “PDB Function Annotation”, the function of the PDB template as annotated by the PDB files (http://www.rcsb.org/PDB/); the start and end amino acid positions of the protein sequence aligned; the PSI-BLAST score; the verify score; the SeqFold score; and the Potential(s) of Mean Force (PMF). The verify score, produced by GeneAtlas™ software (MSI), is based on the Profile-3D threading program developed in Dr. David Eisenberg's laboratory (U.S. Pat. No. 5,436,850 and Luthy, Bowie, and Eisenberg, Nature, 356:83-85 (1992)) and on a publication by R. Sanchez and A. Sali, Proc. Natl. Acad. Sci. USA, 95:13597-13602. GeneAtlas normalizes the verify score for proteins with different lengths so that a unified cutoff can be used to select good models, as follows:


Verify score(normalized)=(raw score−½ high score)/(½ high score)
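As a worked illustration of the normalization formula above, the following Python sketch computes the normalized verify score. The function name is hypothetical, and the text does not define how the "high score" is obtained, so it is simply passed in as a number here; the example values are invented.

```python
def normalized_verify_score(raw_score, high_score):
    """Compute (raw score - 1/2 high score) / (1/2 high score)."""
    half_high = 0.5 * high_score
    if half_high == 0:
        raise ValueError("high_score must be non-zero")
    return (raw_score - half_high) / half_high


# Invented example values: a raw score of 75 against a high score of 100.
example = normalized_verify_score(75.0, 100.0)
```

With these numbers the result is (75 − 50) / 50 = 0.5. A raw score equal to the high score gives 1.0, and a raw score equal to half the high score gives 0, which matches the 0 to 1.0 range described for good models.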


[0396] The PMF score, produced by GeneAtlas™ software (MSI), is a composite scoring function that depends in part on the compactness of the model, sequence identity in the alignment used to build the model, and pairwise and surface mean force potentials (MFP). As given in Table 5, a verify score between 0 and 1.0, with 1 being the best, represents a good model. Similarly, a PMF score between 0 and 1.0, with 1 being the best, represents a good model. A SeqFold™ score of more than 50 is considered significant. A good model may also be determined by one of skill in the art based on all the information in Table 5 taken in totality.
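The score ranges given in this paragraph can be expressed as a small Python check. This is only a sketch: the record type `ModelScores` and the function `criteria_met` are hypothetical and not part of the GeneAtlas™ software, and the check reports each criterion separately because a good model may also be judged from all the information in Table 5 taken in totality.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelScores:
    """Scores for one structural model; None means the score was not reported."""
    verify: Optional[float] = None    # normalized verify score
    pmf: Optional[float] = None       # Potential(s) of Mean Force score
    seqfold: Optional[float] = None   # SeqFold score


def criteria_met(scores):
    """Report which of the stated score criteria a model satisfies."""
    return {
        # Verify and PMF scores between 0 and 1.0 represent a good model.
        "verify": scores.verify is not None and 0.0 <= scores.verify <= 1.0,
        "pmf": scores.pmf is not None and 0.0 <= scores.pmf <= 1.0,
        # A SeqFold score of more than 50 is considered significant.
        "seqfold": scores.seqfold is not None and scores.seqfold > 50.0,
    }


# Invented example values for illustration only.
report = criteria_met(ModelScores(verify=0.62, pmf=1.2, seqfold=51.0))
```

Here the example model meets the verify and SeqFold criteria but not the PMF criterion.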


[0397] Table 6 shows the position of the signal peptide in each of the polypeptides and the maximum score and mean score associated with that signal peptide using the Neural Network SignalP V1.1 program (from Center for Biological Sequence Analysis, The Technical University of Denmark). The process for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the publication “Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites” Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference, were obtained for the polypeptide sequences.


[0398] Table 7 correlates nucleotide sequences of the invention to a specific chromosomal location when assignable.


[0399] Table 9 shows the number of transmembrane regions, their location(s), and TMPred score obtained, for each of the SEQ ID NO: 338-674 that had a TMPred score of 800 or greater, using the TMpred program (http://www.ch.embnet.org/software/TMPRED form.html).


[0400] Table 10 is a correlation table of the novel polynucleotide sequences SEQ ID NO: 1-337, their corresponding polypeptide sequences SEQ ID NO: 338-674, their corresponding priority contig nucleotide sequences SEQ ID NO: 675-836, their corresponding priority contig polypeptide sequences SEQ ID NO: 837-998, and the U.S. serial number of the priority application (all of which are herein incorporated in their entirety), in which the contig sequence was filed.

TABLE 1

Tissue Origin | RNA/Tissue Source | Library Name | SEQ ID NO:
adult brain | GIBCO | AB3001 | 17 21-22 34 82 263
adult brain | GIBCO | ABD003 | 23 29 34 62 82 107-108 120 157-158 191 197 206 233 263 327 336
adult brain | Clontech | ABR001 | 29 55 71 95 125 258 286
adult brain | Clontech | ABR006 | 2 29 50 80 85 101 130 143 152 155 161 163 165-166 186 188-191 197 215 244-254 266-267 276 280
adult brain | Clontech | ABR008 | 42 46 52 56-63 77-78 80-81 85-88 93 95 110 112-117 120 138 150-152 156 158 166 174 186 194 197 211 215 220 241 245 273 280 286 296-297 322-323 334 336
adult brain | Invitrogen | ABR014 | 82 286 322
adult brain | Invitrogen | ABR015 | 62 82
adult brain | Invitrogen | ABR016 | 62 304
adult brain | Invitrogen | ABT004 | 42-43 61 71-72 102 164 172 273 276 286
cultured preadipocytes | Stratagene | ADP001 | 9 43 45 79 136-138 263
adrenal gland | Clontech | ADR002 | 39 52 54 64-65 71 121 125 164 170 258 298 309 320
adult heart | GIBCO | AHR001 | 9 12-13 34-35 45 71-72 82 85 99 110 113 120 127 150 158 163 186 266 275 311 330
adult kidney | GIBCO | AKD001 | 9 11 14 26 31 34 46 52-54 66 72 82 150 158 164 174 195 233 257 263 281 284 286
adult kidney | Invitrogen | AKT002 | 9 29 39-40 46 54 108 121 125 158 164 173-174 206 227 249 257-258 284 300 311
adult lung | GIBCO | ALG001 | 31 108
lymph node | Clontech | ALN001 | 82 251 336
young liver | GIBCO | ALV001 | 79 94 263
adult liver | Invitrogen | ALV002 | 31 45 73 118 139-140 143 158 164 174 216 233 263 277 315
adult liver | Clontech | ALV003 | 3
adult ovary | Invitrogen | AOV001 | 24 26 29 32 34 39 44-46 50-51 54 82 85 102 108 121 125 132 140 158 162 164 210 215 217 227 233-234 238 258 269 285-286 297 306
adult placenta | Clontech | APL001 | 82 215 274
placenta | Invitrogen | APL002 | 79 286
adult spleen | GIBCO | ASP001 | 81-82 140 170 263 286
adult testis | GIBCO | ATS001 | 25 38 82 311
adult bladder | Invitrogen | BLD001 | 81 85 94
bone marrow | Clontech | BMD001 | 5 7-8 15 17 36 45-47 82 104 161 215
bone marrow | GF | BMD002 | 10 31 45-47 52 58 63 93 99 104 110 134-135 142-143 153 181-184 191-192 221 228 311
adult colon | Invitrogen | CLN001 | 32 229 257 263-264
adult cervix | BioChain | CVX001 | 2 34 46 54 58 69 71 82 94 119-120 161 164 167 174 191 233 266
endothelial cells | Stratagene | EDT001 | 39-40 45-46 75 77 82-83 108 121 143 164 194 199 216 285-286 311
fetal brain | Clontech | FBR001 | 247 266
fetal brain | Clontech | FBR004 | 265-266
fetal brain | Clontech | FBR006 | 19 42 51 58 69 80 87 94-97 104 110 112 140 143 154-156 164 174 186 191 196-197 199 220 230 245 267-274 282 286 297 300 311 335
fetal brain | Invitrogen | FBT002 | 44 54 58 141 143 164 286
fetal heart | Invitrogen | FHR001 | 47 51 66 80 104 114 116 127 143 197 200-201 212-214 227 259 263 280 286 318
fetal kidney | Clontech | FKD001 | 39 103 121
fetal kidney | Clontech | FKD002 | 59 81 88 110 162 186 202-203 221 223 247 273 280 284
fetal lung | Clontech | FLG001 | 20 102
fetal lung | Invitrogen | FLG003 | 98-99 126
fetal liver-spleen | Columbia University | FLS001 | 1-4 16 18 21 32 39-40 45-47 49 54 68-69 71 79 82 108 110 121 127-128 138 161 171 174-176 210 215-216 218 234 267 280 286 328 330
fetal liver-spleen | Columbia University | FLS002 | 3 11 32 39-40 45 47 68 79 82 90 121 128-131 138 140 142-143 158 161 164 169 174-175 210 215-217 227 232 256 278 285-286 305 307 314 319 328 330
fetal liver-spleen | Columbia University | FLS003 | 3 21 127-128 140 162 178-180 218-219 227
fetal liver | Invitrogen | FLV001 | 41 75 94 164 286 309
fetal liver | Clontech | FLV002 | 3 9 59 114
fetal liver | Clontech | FLV004 | 3 10 69 88 143 186 204 222-223 267 311 314 336
fetal muscle | Invitrogen | FMS001 | 75 164 263 286 311
fetal muscle | Invitrogen | FMS002 | 82 140 150 193-195 224-225 311
fetal skin | Invitrogen | FSK001 | 66 79 100 114 126 138 158-159 211 249 254 263 286 311 329
fetal skin | Invitrogen | FSK002 | 9 63 88 94 106 110 114 116 138 143 150 167-168 197 205-207 226-232 259 267 311 329
umbilical cord | BioChain | FUC001 | 66 82 95 120 127 162 216 273 286 330
fetal brain | GIBCO | HFB001 | 6 19-20 26-27 32 34 45 62 82 138 141 157 163-164 216 238 263 311
macrophage | Invitrogen | HMP001 | 281 325
infant brain | Columbia University | IB2002 | 27 29 48 52 74 91 106-108 138 158 163 236 240 245 263 322 336
infant brain | Columbia University | IB2003 | 29 37 46 71 74 79 118 141 237-241 251 266 269
infant brain | Columbia University | IBM002 | 94 263
infant brain | Columbia University | IBS001 | 29 37 46 164
lung, fibroblast | Stratagene | LFB001 | 82
lung tumor | Invitrogen | LGT002 | 9 18 28 30 32-33 45 59-60 72 75-77 79 82 92 118 120-124 140 143 164 257 273 284-286 311 316
lymphocytes | ATCC | LPC001 | 21 27 46 89 93-94 116 132 143 160 169 228 233-235 286 333 336
leukocyte | GIBCO | LUC001 | 9-10 27 30 32 41 46-47 54 58 79 82 89 94 109 143 160 186 233 258 263 308 311
leukocyte | Clontech | LUC003 | 233 299
melanoma from-cell-line-ATCC-#CRL-1424 | Clontech | MEL004 | 54 105 282-283 286
mammary gland | Invitrogen | MMG001 | 16 29-31 34 45 54 66-68 78-79 82 85 118 174 211 217 256 263 273 284 313 336
induced neuron-cells | Stratagene | NTD001 | 50 266 311
retinoic acid-induced-neuronal-cells | Stratagene | NTR001 | 2 54 216 255 286
neuronal cells | Stratagene | NTU001 | 256 286
pituitary gland | Clontech | PIT004 | 50 58 261 286 311
placenta | Clontech | PLA003 | 116 186 206 208-209 228 232 244 329
prostate | Clontech | PRT001 | 59 72 90 217 221 233 262
rectum | Invitrogen | REC001 | 88 101 133 217 249 263
salivary gland | Clontech | SAL001 | 216
small intestine | Clontech | SIN001 | 27 37 40 66 69 72 82 108 143 152 162 177 184-186 191 206 242-243 247 264
skeletal muscle | Clontech | SKM001 | 108 269
spinal cord | Clontech | SPC001 | 27 29 34 88 102 125 138 158 170 174 195 233 241 254 258-259
adult spleen | Clontech | SPLc01 | 66 81 110 140 161 228 236 260
stomach | Clontech | STO001 | 59 108
thalamus | Clontech | THA002 | 32 46 61 79 110
thymus | Clontech | THM001 | 68 89 102-103 233 331
thymus | Clontech | THMc02 | 29 39 69 71 79 81 89 93-94 104 110-111 121 186 211 235-236 299 318 331
thyroid gland | Clontech | THR001 | 9 27 39 58 69-71 77 84 88 121 158 167 197 233 254 263 273 311
trachea | Clontech | TRC001 | 10 63 69 81 227
uterus | Clontech | UTR001 | 125 233 266 286 311


[0401]

TABLE 2

SEQ ID NO: | Accession No. | Species | Description | Score | % Identity
338 | gi201734 | Mus musculus | t complex protein-10 | 76 | 44
338 | gi53992 | Mus musculus | Tcp-10 | 76 | 44
338 | gi201727 | Mus musculus | t complex protein-10 | 76 | 44
339 | AAB88481 | Homo sapiens | Human membrane or secretory protein clone PSEC0251. | 254 | 73
339 | gi57115 | Rattus norvegicus | ribosomal protein L31 (AA 1-125) | 175 | 67
339 | gi14198321 | Mus musculus | ribosomal protein L31 | 175 | 67
340 | gi3093754 | Neurospora crassa | AR2 | 78 | 28
340 | gi3776090 | Mus musculus | wolframin | 76 | 29
340 | gi3777585 | Mus musculus | transmembrane protein | 76 | 29
341 | gi13507259 | Homo sapiens | amnionless mRNA, complete cds. | 1167 | 99
341 | gi13649780 | Mus musculus | amnionless precursor protein | 840 | 71
341 | AAY66714 | Homo sapiens | Membrane-bound protein PRO1028. | 1167 | 99
342 | gi13183881 | Homo sapiens | Fanconi anemia complementation group D2 protein (FANCD2) mRNA, complete cds, alternatively spliced. | 657 | 90
342 | gi13324523 | Homo sapiens | Fanconi anemia complementation group D2 protein (FANCD2) gene, exons 43, 44, and complete cds, alternatively spliced. | 657 | 90
342 | gi10434106 | Homo sapiens | cDNA FLJ12551 fis, clone NT2RM4000700. | 175 | 100
343 | gi4200216 | Homo sapiens | H. sapiens gene from PAC 1026E2, partial. | 475 | 100
343 | gi14141674 | Rattus norvegicus | BMP/retinoic acid-inducible neural-specific protein | 151 | 54
343 | gi3041877 | Homo sapiens | IB3089A (IB3089A) mRNA, complete cds. | 151 | 54
344 | gi14193307 | Candidatus Carsonella ruddii | ATP synthase beta subunit | 61 | 35
344 | gi2688677 | Borrelia burgdorferi | oligopeptide ABC transporter, permease protein (oppC-2) | 65 | 28
344 | gi14193323 | Candidatus Carsonella ruddii | ATP synthase beta subunit | 59 | 31
345 | gi14250140 | Homo sapiens | clone MGC: 14809, mRNA, complete cds. | 173 | 100
345 | gi561639 | Homo sapiens | IgE receptor beta chain (HTm4) mRNA, complete cds. | 173 | 100
345 | AAW06503 | Homo sapiens | HTm4 protein. | 173 | 100
346 | AAY27669 | Homo sapiens | Human secreted protein encoded by gene No. 103. | 255 | 100
346 | gi3719255 | Mus musculus | Clq/MBL/SPA receptor ClqRp | 50 | 35
346 | gi5714405 | Mus musculus | Clq/MBL/SP-A phagocytic receptor ClqRp | 50 | 35
347 | gi12580867 | Picea abies | 60S ribosomal protein L13E | 83 | 33
347 | gi3127821 | Drosophila subobscura | Sex-Peptide | 66 | 41
347 | gi3549864 | Drosophila subobscura | Sex-peptide | 66 | 41
348 | gi10176829 | Arabidopsis thaliana | gene_id: MBB18.16˜ | 79 | 32
349 | gi7380324 | Neisseria meningitidis Z2491 | ClpB protein | 91 | 32
349 | gi7226713 | Neisseria meningitidis MC58 | clpB protein | 91 | 32
349 | gi9658311 | Vibrio cholerae | integrase-related protein | 61 | 34
350 | gi3986168 | Lentinula edodes | SHP1 | 55 | 31
350 | gi12805659 | Mus musculus | Similar to syndecan 4 | 53 | 34
351 | gi9789476 | Mus musculus | claudin-19 | 98 | 41
351 | gi3335182 | Mus musculus | claudin-1 | 98 | 32
351 | gi12805093 | Mus musculus | claudin 1 | 98 | 32
352 | AAB37990 | Homo sapiens | Human secreted protein encoded by gene 7 clone HWLHH15. | 303 | 98
352 | gi312188 | Bovine herpesvirus 1 | glycoprotein gD | 85 | 29
352 | gi5668989 | Bovine herpesvirus type 1.1 | glycoprotein D precursor | 76 | 29
353 | gi7239364 | Homo sapiens | acetylcholinesterase collagen-like tail subunit (COLQ) gene, exon 17; and complete cds, alternatively spliced. | 136 | 29
353 | gi3599478 | Acanthamoeba castellanii | Myosin-IA | 137 | 35
353 | gi3858883 | Acanthamoeba castellanii | myosin I heavy chain kinase | 133 | 30
354 | gi5901822 | Drosophila melanogaster | EG: 118B3.2 | 160 | 70
354 | AAB29877 | Homo sapiens | Human secreted protein BLAST search protein SEQ ID NO: 135. | 127 | 52
354 | AAB29878 | Homo sapiens | Human secreted protein BLAST search protein SEQ ID NO: 136. | 121 | 41
355 | AAB53400 | Homo sapiens | Human colon cancer antigen protein sequence SEQ ID NO: 940. | 220 | 91
355 | gi1177469 | Homo sapiens | gene for interleukin-10. | 37 | 46
355 | AAB62192 | Homo sapiens | Human interleukin-10 (IL-10) protein. | 37 | 46
356 | gi2589210 | Mus musculus | calcium-sensing receptor related protein 3 | 105 | 35
356 | gi3130157 | Takifugu rubripes | pheromone receptor | 106 | 34
356 | gi2589208 | Mus musculus | calcium-sensing receptor related protein 2 | 99 | 33
357 | gi3130189 | Takifugu rubripes | pheromone receptor | 212 | 63
357 | gi2589208 | Mus musculus | calcium-sensing receptor related protein 2 | 205 | 50
357 | gi2589210 | Mus musculus | calcium-sensing receptor related protein 3 | 203 | 48
358 | AAB43892 | Homo sapiens | Human cancer associated protein sequence SEQ ID NO: 1337. | 253 | 83
358 | gi6456100 | Mus musculus | F-box protein FBL10 | 247 | 83
358 | gi14250563 | Homo sapiens | clone IMAGE: 3163445, mRNA, partial cds. | 253 | 83
359 | AAB13343 | Homo sapiens | Human cortexin-like protein. | 204 | 53
359 | AAB38538 | Homo sapiens | Human secreted protein sequence encoded by gene 17 SEQ ID NO: 75. | 57 | 39
359 | AAB34316 | Homo sapiens | Human secreted protein sequence encoded by gene 18 SEQ ID NO: 77. | 54 | 34
360 | AAB24074 | Homo sapiens | Human PRO1153 protein sequence SEQ ID NO: 49. | 136 | 42
360 | AAY66735 | Homo sapiens | Membrane-bound protein PRO1153. | 136 | 42
360 | AAB65258 | Homo sapiens | Human PRO1153 (UNQ583) protein sequence SEQ ID NO: 351. | 136 | 42
361 | AAB70534 | Homo sapiens | Human PRO4 protein sequence SEQ ID NO: 8. | 395 | 100
361 | AAY13377 | Homo sapiens | Amino acid sequence of protein PRO257. | 395 | 100
361 | AAB80245 | Homo sapiens | Human PRO257 protein. | 395 | 100
362 | gi4731216 | Boophilus microplus | NADH dehydrogenase subunit 2 | 52 | 25
362 | gi6180101 | Cafeteria roenbergensis | NADH dehydrogenase subunit 2 | 71 | 48
362 | gi5869819 | Globodera pallida | NADH-ubiquinone oxidoreductase subunit 1 | 82 | 35
363 | AAB08944 | Homo sapiens | Human secreted protein sequence encoded by gene 19 SEQ ID NO: 101. | 206 | 83
363 | AAB08909 | Homo sapiens | Human secreted protein sequence encoded by gene 19 SEQ ID NO: 66. | 159 | 80
363 | gi14029247 | Gnorimosphaeroma oregonense | cytochrome oxidase subunit I | 66 | 53
364 | gi13195147 | Mus musculus | HCH | 953 | 77
364 | gi1339910 | Homo sapiens | Human DOCK180 protein mRNA, complete cds. | 203 | 32
364 | AAW03515 | Homo sapiens | Human DOCK180 protein. | 203 | 32
365 | gi10433539 | Homo sapiens | cDNA FLJ12133 fis, clone MAMMA1000278. | 224 | 35
365 | AAW64461 | Homo sapiens | Human secreted protein from clone B121. | 218 | 35
365 | gi4406644 | Homo sapiens | clone 25130 mRNA sequence, complete cds. | 223 | 41
366 | gi1537002 | Hepatitis C virus | envelope glycoprotein E2/NS1 | 61 | 32
366 | gi3153687 | Hepatitis C virus | genome polyprotein | 60 | 41
366 | AAB45374 | Homo sapiens | Human secreted protein sequence encoded by gene 36 SEQ ID NO: 126. | 58 | 50
367 | gi2935614 | Homo sapiens | PAC clone RP1-102K2 from 22q12.1-qter, complete sequence. | 1306 | 100
367 | gi386988 | Homo sapiens | Human oncostatin M gene, exon 3. | 1306 | 100
367 | AAR33380 | Homo sapiens | Cytokine hOSM. | 1306 | 100
368 | AAB87396 | Homo sapiens | Human gene 8 encoded secreted protein HMAM121, SEQ ID NO: 137. | 440 | 89
368 | AAY95967 | Homo sapiens | Human TANGO 240. | 436 | 88
368 | AAB88402 | Homo sapiens | Human membrane or secretory protein clone PSEC0152. | 434 | 88
369 | AAB24476 | Homo sapiens | Human secreted protein sequence encoded by gene 40 SEQ ID NO: 101. | 241 | 69
369 | gi452414 | Mus musculus | mPit-1R | 69 | 31
369 | gi7769944 | Leishmania major | L354.10 | 87 | 25
370 | gi36853 | Homo sapiens | Human mRNA for T-cell receptor alpha-chain HAVP02 (V(a)11.1-J(a)I). | 585 | 100
370 | gi2358022 | Homo sapiens | T-cell receptor alpha delta locus from bases 1 to 250529 (section 1 of 5) of the Complete Nucleotide Sequence. | 585 | 100
370 | gi404055 | Macaca mulatta | T-cell receptor alpha chain | 568 | 97
371 | gi9963895 | Homo sapiens | HT021 (HT021) mRNA, complete cds. | 255 | 94
371 | AAW54455 | Homo sapiens | Mouse novel secreted protein isolated from clone BF290_li. | 255 | 94
371 | AAB59017 | Homo sapiens | Breast and ovarian cancer associated antigen protein sequence SEQ ID 725. | 255 | 94
372 | gi2055228 | Glycine max | SRC1 | 76 | 26
372 | gi204144 | Rattus norvegicus | profilaggrin | 97 | 25
372 | gi3820941 | Hepatitis B virus | core antigen | 71 | 24
373 | gi1234787 | Xenopus laevis | up-regulated by thyroid hormone in tadpoles; expressed specifically in the tail and only at metamorphosis; membrane bound or extracellular protein; C-terminal basic region | 1115 | 58
373 | gi10435980 | Homo sapiens | cDNA FLJ13840 fis, clone THYRO1000783, moderately similar to Xenopus laevis tail-specific thyroid hormone up-regulated (gene 5) mRNA. | 699 | 72
373 | gi4868122 | Mus musculus | hedgehog-interacting protein | 405 | 33
374 | gi1181494 | Paramecium bursaria Chlorella virus 1 | a331L | 61 | 46
374 | AAY91469 | Homo sapiens | Human secreted protein sequence encoded by gene 19 SEQ ID NO: 142. | 57 | 40
374 | AAY91617 | Homo sapiens | Human secreted protein sequence encoded by gene 19 SEQ ID NO: 290. | 57 | 40
375 | gi12007419 | Mus musculus | B4 olfactory receptor | 285 | 60
375 | gi12007420 | Mus musculus | B5 olfactory receptor | 285 | 60
375 | gi12007421 | Mus musculus | B6 olfactory receptor | 285 | 60
376 | AAB20695 | Homo sapiens | Polymeric immunoglobulin receptor binding domain peptide SEQ ID NO: 11. | 60 | 55
376 | gi1181346 | Paramecium bursaria Chlorella virus 1 | a183L | 56 | 28
376 | gi14030701 | Arabidopsis thaliana | At2g28370/T1B3.11 | 72 | 27
378 | gi1296632 | Homo sapiens | H. sapiens gene encoding G protein coupled receptor. | 104 | 37
378 | gi1124905 | Homo sapiens | H. sapiens P2Y4 gene. | 104 | 37
378 | AAW23606 | Homo sapiens | Human P2Y4 receptor polypeptide. | 104 | 37
379 | gi4877582 | Homo sapiens | lipoma HMGIC fusion partner (LHFP) mRNA, complete cds. | 110 | 25
379 | AAY87336 | Homo sapiens | Human signal peptide containing protein HSPP-113 SEQ ID NO: 113. | 110 | 25
380 | AAY27721 | Homo sapiens | Human secreted protein encoded by gene No. 29. | 1118 | 88
380 | AAB87068 | Homo sapiens | Human secreted protein TANGO 365, SEQ ID NO: 46. | 621 | 99
380 | AAB87146 | Homo sapiens | Human secreted protein TANGO 365 A5V variant, SEQ ID NO: 161. | 617 | 98
381 | gi7208423 | Caulobacter crescentus | CpaA | 65 | 36
381 | gi13424575 | Caulobacter crescentus | pilus assembly protein CpaA | 65 | 36
382 | AAY28917 | Homo sapiens | Human regulatory protein HRGP-3. | 267 | 100
382 | AAB53312 | Homo sapiens | Human colon cancer antigen protein sequence SEQ ID NO: 852. | 267 | 100
382 | gi11526789 | Homo sapiens | inorganic pyrophosphatase 2 (PPA2) mRNA, complete cds, nuclear gene for mitochondrial product. | 258 | 98
383 | gi13938575 | Homo sapiens | Similar to RIKEN cDNA 2610511E22 gene, clone MGC: 4251, mRNA, complete cds. | 655 | 89
383 | AAY91458 | Homo sapiens | Human secreted protein sequence encoded by gene 8 SEQ ID NO: 131. | 655 | 89
383 | AAY91598 | Homo sapiens | Human secreted protein sequence encoded by gene 8 SEQ ID NO: 271. | 655 | 89
384 | gi2065210 | Mus musculus | Pro-Pol-dUTPase polyprotein | 1026 | 82
384 | gi3860513 | Mus famulus | reverse transcriptase | 482 | 84
384 | gi4379237 | Mus musculus | reverse transcriptase | 477 | 83
385 | gi14190365 | Arabidopsis thaliana | AT5g17300/MKP11_15 | 64 | 32
385 | gi11275913 | Protophormia atriceps | cytochrome oxidase subunit 1 | 55 | 44
385 | AAY29337 | Homo sapiens | Human secreted protein clone gg894_13 alternate reading frame protein. | 63 | 28
386 | AAY20840 | Homo sapiens | Human neurofilament-H wild type protein fragment 1. | 67 | 38
386 | gi10584099 | Halobacterium sp. NRC-1 | Vng6036h | 61 | 28
386 | gi7739781 | Rattus norvegicus | CCN family protein COP-1 | 80 | 26
387 | gi14042550 | Homo sapiens | cDNA FLJ14779 fis, clone NT2RP4000398, moderately similar to ZINC FINGER PROTEIN 140. | 242 | 66
387 | gi456269 | Mus musculus domesticus | zinc finger protein 30 | 242 | 70
387 | gi5080758 | Homo sapiens | chromosome 19, BAC 331191 (CIT-B-471f3), complete sequence. | 244 | 69
388 | AAB47106 | Homo sapiens | Second splice variant of MAPP. | 223 | 97
388 | AAB47105 | Homo sapiens | First splice variant of MAPP. | 200 | 90
388 | AAW25722 | Homo sapiens | Human partial beta meltrin protein fragment 2. | 184 | 66
389 | AAB90649 | Homo sapiens | Human secreted protein, SEQ ID NO: 192. | 563 | 92
389 | AAB90565 | Homo sapiens | Human secreted protein, SEQ ID NO: 103. | 472 | 100
389 | AAB90651 | Homo sapiens | Human secreted protein, SEQ ID NO: 194. | 203 | 97
390 | AAY87335 | Homo sapiens | Human signal peptide containing protein HSPP-112 SEQ ID NO: 112. | 623 | 99
390 | gi2292988 | Rattus norvegicus | Inter-alpha-inhibitor H4 heavy chain | 87 | 32
390 | AAY90288 | Homo sapiens | Human peptidase, HPEP-5 protein sequence. | 63 | 36
391 | AAY92710 | Homo sapiens | Human membrane-associated protein Zsig24. | 230 | 100
391 | AAY87250 | Homo sapiens | Human signal peptide containing protein HSPP-27 SEQ ID NO: 27. | 230 | 100
391 | AAG00627 | Homo sapiens | Human secreted protein, SEQ ID NO: 4708. | 93 | 100
392 | gi10441465 | Homo sapiens | actin filament associated protein (AFAP) mRNA, complete cds. | 274 | 90
392 | gi13129531 | Gallus gallus | actin filament-associated protein | 204 | 71
392 | gi13129529 | Gallus gallus | neural actin filament protein | 204 | 71
393 | AAB64802 | Homo sapiens | Human secreted protein sequence encoded by gene 30 SEQ ID NO: 88. | 58 | 41
393 | gi1711217 | Caenorhabditis elegans | F58A3.1b | 77 | 30
393 | gi1711215 | Caenorhabditis elegans | F58A3.1a | 77 | 30
394 | AAB12121 | Homo sapiens | Hydrophobic domain protein from clone HP02962 isolated from KB cells. | 153 | 68
394 | AAY30812 | Homo sapiens | Human secreted protein encoded from gene 2. | 149 | 65
394 | AAB88452 | Homo sapiens | Human membrane or secretory protein clone PSEC0241. | 144 | 66
395 | gi13623237 | Homo sapiens | clone MGC: 10671, mRNA, complete cds. | 146 | 57
395 | gi13310191 | multiple sclerosis associated retrovirus element | recombinant envelope protein | 126 | 35
395 | gi4262296 | Homo sapiens | endogenous retrovirus W envelope protein mRNA, partial cds. | 117 | 35
396 | gi10437485 | Homo sapiens | cDNA: FLJ21394 fis, clone COL03536. | 65 | 30
396 | AAG02270 | Homo sapiens | Human secreted protein, SEQ ID NO: 6351. | 59 | 44
397 | AAY20292 | Homo sapiens | Human apolipoprotein E wild type protein fragment 2. | 63 | 51
397 | AAB32406 | Homo sapiens | Human secreted protein sequence encoded by gene 5 SEQ ID NO: 92. | 62 | 36
397 | gi12667610 | uncultured sulfate-reducing bacterium UMTRAdsr648-22 | dissimilatory sulfite reductase subunit A | 72 | 39
398 | gi12053099 | Homo sapiens | mRNA; cDNA DKFZp434A171 (from clone DKFZp434A171); complete cds. | 172 | 65
398 | gi3002799 | Pseudomonas pseudoalcaligenes | 2-aminomuconic acid semialdehyde dehydrogenase | 118 | 29
398 | gi5821145 | Homo sapiens | mRNA for RNA binding protein, partial cds, clone: R11. | 120 | 22
399 | gi14249823 | Homo sapiens | cholecystokinin, clone MGC: 10571, mRNA, complete cds. | 356 | 100
399 | gi179996 | Homo sapiens | Human cholecystokinin (CCK) gene, exon 3. | 356 | 100
399 | AAB24381 | Homo sapiens | Human procholecystokinin amino acid sequence SEQ ID NO: 1. | 356 | 100
400 | gi1870554 | Saguinus oedipus | T-cell receptor beta | 79 | 32
400 | gi1150925 | Bovine herpesvirus 1 | glycoprotein B | 65 | 38
400 | gi159250 | Holothuria tubulosa | sperm specific protein phi-0 | 60 | 30
401 | gi4097231 | Ureaplasma urealyticum | multiple banded antigen | 395 | 23
401 | gi560649 | Neocallimastix patriciarum, Peptide, 860 aa | Xylanase B, XYLB {EC 3.2.1.8} | 330 | 20
401 | gi600118 | Zea mays | extensin-like protein | 331 | 35
402 | AAB12140 | Homo sapiens | Hydrophobic domain protein isolated from WERI-RB cells. | 172 | 51
402 | AAY25806 | Homo sapiens | Human secreted protein fragment encoded from gene 23. | 130 | 46
402 | gi5901846 | Drosophila melanogaster | BcDNA.GH12144 | 124 | 39
403 | AAB66267 | Homo sapiens | Human TANGO 272 SEQ ID NO: 14. | 1329 | 97
403 | gi2289904 | Mus musculus | DRPLA | 125 | 28
403 | gi1549217 | Mus musculus | DRPLA protein | 124 | 28
404 | gi4705 | Saccharomyces cerevisiae | Ty protein | 58 | 51
404 | gi11139690 | Ovis aries | muscle specific calpain 3 | 54 | 41
404 | AAY41363 | Homo sapiens | Human secreted protein encoded by gene 56 clone HNGFE55. | 54 | 55
405 | gi13926111 | Homo sapiens | 2P domain potassium channel Talk-2 (KCNK17) mRNA, complete cds. | 1430 | 100
405 | AAY90354 | Homo sapiens | Human TWIK-3 protein. | 1426 | 99
405 | gi13507377 | Homo sapiens | potassium channel TASK-4 mRNA, complete cds. | 1364 | 99
406 | gi514916 | Bos taurus | tau protein | 91 | 36
406 | gi437055 | Macaca mulatta | mucin | 95 | 28
406 | gi2754696 | Gallus gallus | high molecular mass nuclear antigen | 103 | 28
407 | gi3127175 | Homo sapiens | sulfonylurea receptor 2A (SUR2) gene, alternatively spliced product, exon 38a and complete cds. | 713 | 98
407 | gi3127176 | Homo sapiens | sulfonylurea receptor 2B (SUR2) gene, alternatively spliced product, exon 38b and complete cds. | 713 | 98
407 | gi5814019 | Oryctolagus cuniculus | cardiac ventricle sulfonyl urea receptor | 678 | 93
408 | AAB24035 | Homo sapiens | Human PRO4397 protein sequence SEQ ID NO: 42. | 1894 | 100
408 | AAY93951 | Homo sapiens | Amino acid sequence of a Brainiac-5 polypeptide. | 1241 | 100
408 | AAY06462 | Homo sapiens | Human Brainiac-3. | 553 | 48
409 | AAW88708 | Homo sapiens | Secreted protein encoded by gene 175 clone HEMAM41. | 747 | 87
409 | gi159655 | Ascaris suum | collagen | 94 | 36
409 | gi289662 | Caenorhabditis elegans | col-36 collagen | 109 | 41
410 | gi975893 | Homo sapiens | Human apolipoprotein apoC-IV (APOC4) gene, complete cds. | 693 | 100
410 | AAG03772 | Homo sapiens | Human secreted protein, SEQ ID NO: 7853. | 669 | 96
410 | gi1185465 | Oryctolagus cuniculus | Apolipoprotein C-IV | 379 | 55
411 | AAY57878 | Homo sapiens | Human transmembrane protein HTMPN-2. | 101 | 86
411 | gi4406500 | Carassius auratus | gonadotropin releasing hormone receptor type A | 72 | 31
412 | AAY59682 | Homo sapiens | Secreted Protein 108-009-5-0-A2-FL. | 488 | 100
412 | AAY01635 | Homo sapiens | Human PS214 derived polypeptide. | 488 | 100
412 | AAY64650 | Homo sapiens | Human luman homology protein. | 488 | 100
413 | gi13442978 | Mus musculus | D-glucuronyl C5-epimerase | 1001 | 94
413 | gi11935177 | Mus musculus | heparin/heparan sulfate:glucuronic acid C5 epimerase | 1001 | 94
413 | gi13654639 | Bos taurus | D-glucuronyl C5 epimerase | 972 | 92
414 | AAG00122 | Homo sapiens | Human secreted protein, SEQ ID NO: 4203. | 102 | 100
414 | gi4583535 | Homo sapiens | integrin alpha 2 subunit (ITGA2) DNA, 5’ UTR and promoter region. | 99 | 95
414 | AAW70542 | Homo sapiens | Integrin alpha-2 chain. | 102 | 100
415 | AAY01387 | Homo sapiens | Secreted protein encoded by gene 5 clone HTLFE42. | 60 | 40
415 | gi3406819 | Mus musculus | growth factor receptor | 58 | 38
415 | AAG02139 | Homo sapiens | Human secreted protein, SEQ ID NO: 6220. | 53 | 40
416 | AAB12150 | Homo sapiens | Hydrophobic domain protein isolated from HT-1080 cells. | 683 | 100
416 | gi13096862 | Mus musculus | RIKEN cDNA 9430096L06 gene | 634 | 90
416 | AAB29651 | Homo sapiens | Human membrane-associated protein HUMAP-8. | 502 | 100
417 | AAY41428 | Homo sapiens | Fragment of human secreted protein encoded by gene 17. | 107 | 43
417 | AAY41324 | Homo sapiens | Human secreted protein encoded by gene 17 clone HNFIY77. | 108 | 40
417 | AAB67576 | Homo sapiens | Amino acid sequence of a human hydrolytic enzyme HYENZ8. | 108 | 40
418 | gi7209315 | Homo sapiens | mRNA for FLJ00007 protein, partial cds. | 1024 | 79
418 | AAY99428 | Homo sapiens | Human PRO1431 (UNQ737) amino acid sequence SEQ ID NO: 315. | 430 | 93
418 | gi6599145 | Homo sapiens | mRNA; cDNA DKFZp434L127 (from clone DKFZp434L127); partial cds. | 320 | 33

419
gi297172


Rattus rattus


ribosomal protein S7
432
93


419
gi2811284


Mus musculus


ribosomal protein S7
432
93


419
gi12804027


Homo sapiens


ribosomal protein S7, clone MGC: 10268,
432
93





mRNA, complete cds.


420
AAB68888


Homo sapiens


Human RECAP polypeptide, SEQ ID NO: 18.
277
64


420
AAB08944


Homo sapiens


Human secreted protein sequence encoded by
74
72





gene 19 SEQ ID NO: 101.


420
AAY76198


Homo sapiens


Human secreted protein encoded by gene 75.
67
59


421
gi4096055


Homo sapiens


chromosome 19, cosmid R28379, complete
136
100





sequence.


421
gi9950071


Pseudomonas


probable permease of ABC transporter
81
39






aeruginosa




421
gi2113989


Mycobacterium


ccsA
79
34






tuberculosis




422
gi10438804


Homo sapiens


cDNA: FLJ22419 fis, clone HRC08593.
262
92


422
gi10436785


Homo sapiens


cDNA FLJ14342 fis, clone THYRO1000569,
98
42





highly similar to Mus musculus hematopoietic





zinc finger protein mRNA.


422
gi6690339


Mus musculus


hematopoietic zinc finger protein
96
40


423
gi9963845


Homo sapiens


HT017 mRNA, complete cds.
558
38


423
AAW09405


Homo sapiens


Pineal gland specific gene-1 protein.
558
38


423
AAB69185


Homo sapiens


Human hISLR-iso protein SEQ ID NO: 7.
558
38


424
gi475542


Rattus


glutamate receptor delta-1 subunit
505
98






norvegicus




424
gi220418


Mus musculus


glutamate receptor channel subunit delta-1
505
98


424
gi56286


Rattus


glutamate receptor subtype delta-1
482
98






norvegicus




425
AAB61880


Homo sapiens


Human cytokine receptor Zcytor14.
163
28


425
AAB61881


Homo sapiens


Human variant Zcytor14 protein Zcytor14-1.
137
32


425
AAB87606


Homo sapiens


Human PRO20040.
143
28


426
gi13195147


Mus musculus


HCH
413
86


426
gi1339910


Homo sapiens


Human DOCK180 protein mRNA, complete
373
78





cds.


426
AAW03515


Homo sapiens


Human DOCK180 protein.
366
76


427
gi12724402


Lactococcus


prophage pi3 protein 41
58
36






lactis
subsp.





lactis


427
gi155287


Vibrio


disulfide isomerase
73
29






cholerae




428
gi6822060


Arabidopsis


peptide transport-like protein
93
31






thaliana




428
gi206311


Rattus


protein phosphatase-2Bc
58
30






norvegicus




429
gi14042519


Homo sapiens


cDNA FLJ14763 fis, clone NT2RP3003621.
2026
99


429
gi13097630


Homo sapiens


clone MGC: 10791, mRNA, complete cds.
2026
99


429
gi13591620


Homo sapiens


kremen mRNA for kringle-containing
860
49





transmembrane protein, complete cds.


430
gi13161409


Mus musculus


family 4 cytochrome P450
437
73


430
gi7331756


Caenorhabditis


contains similarity to Pfam family PF00067
139
37






elegans


(Cytochrome P450), score = 356.1, E = 3.6e−103,





N = 1


430
gi3876203


Caenorhabditis


contains similarity to Pfam domain: PF00067
135
37






elegans


(Cytochrome P450), Score = 347.4, E-





value = 5.1e−101, N = 1


431
AAB08862


Homo sapiens


Amino acid sequence of a human secretory
958
100





protein.


431
gi12654587


Homo sapiens


clone MGC: 2463, mRNA, complete cds.
953
99


431
AAB12163


Homo sapiens


Hydrophobic domain protein from clone
953
99





HP10671 isolated from Thymus cells.


432
gi4877582


Homo sapiens


lipoma HMGIC fusion partner (LHFP) mRNA,
195
30





complete cds.


432
AAY87336


Homo sapiens


Human signal peptide containing protein HSPP-
195
30





113 SEQ ID NO: 113.


432
gi7529641


Schizosaccharomyces


calcium permease family membrane transporter
110
28






pombe



433
gi3598974


Rattus


protein tyrosine phosphatase TD14
105
38






norvegicus




433
gi6625751


Mink enteritis


capsid protein VP2
50
34




virus


433
gi5442034


Mus musculus


calmodulin-dependent protein kinase II beta M
66
37





isoform


434
AAB33892


Homo sapiens


Human secreted protein BLAST search protein
43
60





SEQ ID NO: 107.


434
AAB54248


Homo sapiens


Human pancreatic cancer antigen protein
62
42





sequence SEQ ID NO: 700.


434
gi683548


Chironomus


gamma protein constant region
62
38






pallidivittatus




435
gi41077


Escherichia


cal protein precursor (aa 1-51)
63
42






coli




435
gi2995968


Leontopithecus


NADH dehydrogenase subunit 4
76
28






rosalia




435
gi2995972


Leontopithecus


NADH dehydrogenase subunit 4
76
28






chrysomelas




436
gi1196439


Homo sapiens


(clone H 4.4) latent transforming growth factor-
291
98





beta binding protein (LTBP-1L) gene, partial





cds.


436
gi207286


Rattus


TGF-beta masking protein large subunit
226
77






norvegicus




436
gi3493176


Mus musculus


latent TGF beta binding protein
217
73


437
AAY57951


Homo sapiens


Human transmembrane protein HTMPN-75.
77
33


437
gi642017


Hordeum


phospholipid transfer protein precursor
72
30






vulgare




437
gi11037708


Triticum


lipid transfer protein precursor
72
34






aestivum




438
AAY20852


Homo sapiens


Human neurofilament-H mutant protein
108
38





fragment 11.


438
gi1888411


Homo sapiens


mRNA encoding chimaeric transcript of
80
30





collagen type 1 alpha 1 and platelet derived





growth factor beta, 314 bp.


438
AAW18664


Homo sapiens


Fragmented human NF-H gene + 1 frameshift
100
38





mutant product.


439
AAB08912


Homo sapiens


Human secreted protein sequence encoded by
251
100





gene 22 SEQ ID NO: 69.


439
gi12248917


Homo sapiens


mRNA for spinesin, complete cds.
251
100


439
AAB11699


Homo sapiens


Human serine protease BSSP2 (hBSSP2), SEQ
251
100





ID NO: 10.


440
gi13990776


Gallus gallus


immunoglobulin lambda chain
67
43


440
gi1086714


Caenorhabditis


coded for by C. elegans cDNA yk74c8.5;
55
45






elegans


Similar to small type-II membrane antigen


440
gi1469906


Gallus gallus


beta-1,4-galactosyltransferase
56
46


441
AAY17526


Homo sapiens


Human secreted protein clone AM349 2 protein.
1131
100


441
AAY02361


Homo sapiens


Polypeptide identified by the signal sequence
1131
100





trap method.


441
AAW52834


Homo sapiens


Secreted protein encoded by clone AM349_2.
664
100


442
gi5579130
Hepatitis E
non-structural polyprotein
71
37




virus


442
gi330005
Hepatitis E
poly-proline hinge
58
35




virus


442
gi7768740


Homo sapiens


genomic DNA, chromosome 21q, section
82
29





89/105.


443
AAY86234


Homo sapiens


Human secreted protein HNTNC20, SEQ ID
476
60





NO: 149.


443
AAB24074


Homo sapiens


Human PRO1153 protein sequence SEQ ID
111
46





NO: 49.


443
AAY66735


Homo sapiens


Membrane-bound protein PRO1153.
111
46


444
gi12836893


Gallus gallus


IPR328-like protein
165
30


444
gi13357180


Homo sapiens


calcium channel gamma subunit 8 (CACNG8)
125
28





mRNA, partial cds.


444
gi4558766


Homo sapiens


neuronal voltage gated calcium channel gamma-
158
30





3 subunit mRNA, complete cds.


445
AAY79384


Homo sapiens


Human G protein coupled receptor SLGP 7
396
100





transmembrane region.


445
gi11225483


Homo sapiens


ETL protein (ETL) mRNA, complete cds.
396
100


445
AAB61144


Homo sapiens


Human NOV14 protein.
396
100


446
gi13195147


Mus musculus


HCH
209
77


446
gi1339910


Homo sapiens


Human DOCK180 protein mRNA, complete
95
43





cds.


446
AAW03515


Homo sapiens


Human DOCK180 protein.
95
43


447
gi10438431


Homo sapiens


cDNA: FLJ22155 fis, clone HRC00205.
518
34


447
gi10437336


Homo sapiens


cDNA: FLJ21267 fis, clone COL01717.
506
36


447
AAY07754


Homo sapiens


Human secreted protein fragment encoded from
291
37





gene 11.


448
gi1552496


Homo sapiens


Human germline T-cell receptor beta chain
614
100





Dopamine-beta-hydroxylase-like, TRY1, TRY2,





TRY3, TCRBV27S1P, TCRBV22S1A2N1T,





TCRBV9S1A1T, TCRBV7S1A1N2T,





TCRBV5S1A1T, TCRBV13S3, TCRBV6S7P,





TCRBV7S3A2T, TCRBV13S2A1T,





TCRBV9S2A2PT, TCRBV7S2A1N4T,





TCRBV13S9/13S2A1T, TCRBV6S5A1N1,





TCRBV30S1P, TCRBV31S1, TCRBV13S5,





TCRBV6S1A1N1, TCRBV32S1P,





TCRBV5S5P, TCRBV1S1A1N1,





TCRBV12S2A1T, TCRBV21S1, TCRBV8S4P,





TCRBV12S3, TCRBV21S3A2N2T,





TCRBV8S5P, TCRBV13S1 genes from bases 1





to 267156 (section 1 of 3).


448
gi33560


Homo sapiens


Human mRNA for T-cell receptor V beta gene
609
100





segment V-beta-9, clone IGRb20.


448
gi37634


Homo sapiens




H. sapiens rearranged TCR Vbeta 9.1 mRNA for

609
100





T cell receptor.


449
gi13960126


Homo sapiens


Similar to leucine-rich neuronal protein, clone
162
80





MGC: 4126, mRNA, complete cds.


449
gi14043281


Homo sapiens


clone IMAGE: 3528313, mRNA, partial cds.
133
64


449
gi3135309


Homo sapiens


chromosome 7q22 sequence, complete
133
64





sequence.


450
AAB61141


Homo sapiens


Human NOV11 protein.
370
86


450
gi4760778


Mus musculus


Ten-m2
369
100


450
gi5712201


Rattus


neurestin alpha
369
100






norvegicus




451
AAW88628


Homo sapiens


Secreted protein encoded by gene 95 clone
78
30





HPWAN23.


451
AAY57923


Homo sapiens


Human transmembrane protein HTMPN-47.
78
30


451
gi7109072


Plasmodium


PfEMP1 protein
78
37






falciparum




452
gi1061424


Homo sapiens


Human PMS2 related (hPMSR3) gene, complete
194
48





cds.


452
gi5738553


Homo sapiens


mRNA for zinc finger protein, clone cZNF41.5,
175
48





partial.


452
gi5738547


Homo sapiens


mRNA for zinc finger protein, clone cZNF41.2,
174
71





partial.


453
gi14161140


Streptococcus


M protein
75
35






pyogenes




453
gi472917


Enterococcus


v-type Na-ATPase
64
37






hirae




453
AAW00946


Homo sapiens


Human c-Fos protein.
63
40


454
gi6088092


Mesocricetus


cytochrome P450
92
47






auratus




454
AAY91348


Homo sapiens


Human secreted protein sequence encoded by
130
40





gene 3 SEQ ID NO: 69.


454
gi4249595


Mus musculus


CYP2C40
115
34


455
gi12053357


Homo sapiens


mRNA; cDNA DKFZp586G2122 (from clone
488
67





DKFZp586G2122); complete cds.


455
AAY27649


Homo sapiens


Human secreted protein encoded by gene No.
62
35





83.


455
gi9755390


Arabidopsis


F17F8.22
81
46






thaliana




456
gi6273399


Homo sapiens


melanoma-associated antigen MG50 mRNA,
359
95





partial cds.


456
AAW81030


Homo sapiens


Melanoma associated antigen MG50.
359
95


456
AAY70469


Homo sapiens


Human p53 target molecule, PRG2 protein.
359
95


457
AAB24074


Homo sapiens


Human PRO1153 protein sequence SEQ ID
1023
99





NO: 49.


457
AAY66735


Homo sapiens


Membrane-bound protein PRO1153.
1023
99


457
AAB65258


Homo sapiens


Human PRO1153 (UNQ583) protein sequence
1023
99





SEQ ID NO: 351.


458
gi1364247


Sus scrofa


Ca(2+)-transport ATPase (AA 989-1042); non-
57
38





muscle isoform (1 is 3rd base in codon)


458
AAB65991


Homo sapiens


Human secreted protein BLAST search protein
73
34





SEQ ID NO: 131.


458
AAB65992


Homo sapiens


Human secreted protein BLAST search protein
73
34





SEQ ID NO: 132.


459
gi2150146


Mus musculus


sulfonylurea receptor 2A
634
73


459
gi8843832


Rattus


sulphonylurea receptor 2b
375
73






norvegicus




459
gi3127175


Homo sapiens


sulfonylurea receptor 2A (SUR2) gene,
372
74





alternatively spliced product, exon 38a and





complete cds.


460
gi4467773


Helicobacter


cytotoxin associated protein A
60
34






pylori




460
gi7248699


Helicobacter


cytotoxin associated protein CagA
60
34






pylori




460
gi5851989


Helicobacter


cytotoxin associated protein A
59
31






pylori




461
gi13278675


Homo sapiens


clone MGC: 11170, mRNA, complete cds.
77
41


461
gi6457690


Deinococcus


2-oxo acid dehydrogenase, E2 component
90
31






radiodurans




461
gi179521


Homo sapiens


Human bullous pemphigoid (BP180) mRNA,
72
36





partial cds.


462
AAB52176


Homo sapiens


Human secreted protein BLAST search protein
468
95





SEQ ID NO: 132.


462
AAR27651


Homo sapiens


Human calcium channel 27980/13.
117
26


462
gi179764


Homo sapiens


Human neuronal DHP-sensitive, voltage-
117
26





dependent, calcium channel alpha-1D subunit





mRNA, complete cds.


463
gi13623421


Homo sapiens


Similar to RIKEN cDNA 5730589L02 gene,
495
98





clone MGC: 13124, mRNA, complete cds.


463
gi12803383


Homo sapiens


clone MGC: 2099, mRNA, complete cds.
189
100


463
gi13111983


Homo sapiens


clone MGC: 4221, mRNA, complete cds.
189
100


464
AAW75100


Homo sapiens


Human secreted protein encoded by gene 44
121
83





clone HE8CJ26.


464
gi11275978


Homo sapiens


NOTCH 2 (N2) mRNA, complete cds.
125
87


464
AAY06816


Homo sapiens


Human Notch2 (humN2) protein sequence.
125
87


465
gi2696709


Mus musculus


RST
258
43


465
gi2687858


Pseudopleuronectes


renal organic anion transporter
236
40






americanus


465
gi4586315


Homo sapiens


ORCTL3 mRNA for organic-cation transporter
232
37





like 3, complete cds.


466
gi11463949


Homo sapiens


hUGTrel7 mRNA for UDP-glucuronic acid,
256
100





complete cds.


466
AAB60119


Homo sapiens


Human transport protein TPPT-39.
175
63


466
AAB56473


Homo sapiens


Human prostate cancer antigen protein sequence
175
63





SEQ ID NO: 1051.


467
AAB88377


Homo sapiens


Human membrane or secretory protein clone
370
94





PSEC0113.


467
gi12656637


Mus musculus


equilibrative nucleoside transporter 3
109
25


467
gi3877156


Caenorhabditis


F44D12.9
92
32






elegans




468
gi9828006


Leishmania


probable ctg26 alternate open reading frame
60
40






major




468
gi4096496


Homo sapiens


Human pre-B cell Ig heavy chain mRNA, third
55
47





complementarity-determining region, clone





PBT-55, partial cds.


468
gi3005708


Homo sapiens


clone 23619 phosphoprotein mRNA, partial cds.
66
33


469
gi1339910


Homo sapiens


Human DOCK180 protein mRNA, complete
121
54





cds.


469
AAW03515


Homo sapiens


Human DOCK180 protein.
121
54


469
gi13195147


Mus musculus


HCH
107
61


470
gi11036344


Pichia


NADH dehydrogenase subunit 4L
69
38






canadensis




470
gi10175432


Bacillus


D-alanine aminotransferase
87
35






halodurans




470
gi10639223


Thermoplasma


ethanolamine permease related protein
88
27






acidophilum




471
AAB90654


Homo sapiens


Human secreted protein, SEQ ID NO: 197.
58
29


471
AAY36085


Homo sapiens


Extended human secreted protein sequence,
56
34





SEQ ID NO: 470.


471
gi3617829


Gallus gallus


gallinacin 1 prepropeptide
55
42


472
gi14189735


Homo sapiens


ATP-binding cassette transporter family A
251
43





member 12 (ABCA12) mRNA, complete cds.


472
gi14209834


Mus musculus


ATP-binding cassette transporter sub-family A
199
39





member 7


472
gi9211112


Homo sapiens


macrophage ABC transporter (ABCA7) mRNA,
196
40





complete cds.


473
gi8919747
Cottontail
e8
65
36




rabbit




papillomavirus


473
gi8919568
Cottontail
E8
64
36




rabbit




papillomavirus


473
gi5679184


Xanthomonas


HrcU homolog
80
25






campestris
pv.





glycines


474
AAY30817


Homo sapiens


Human secreted protein encoded from gene 7.
569
98


474
gi3411233


Mus musculus


IER5
107
37


474
AAG02396


Homo sapiens


Human secreted protein, SEQ ID NO: 6477.
85
61


475
AAY99353


Homo sapiens


Human PRO1415 (UNQ731) amino acid
1435
99





sequence SEQ ID NO: 50.


475
AAB88426


Homo sapiens


Human membrane or secretory protein clone
1428
99





PSEC0199.


475
gi11230635


Homo sapiens


CD30 gene for cytokine receptor CD30, exons
106
29





1-8.


476
gi6636340


Rattus


myosin heavy chain Myr 8
157
61






norvegicus




476
gi10863773


Rattus


myosin heavy chain Myr 8b
157
61






norvegicus




476
AAB51865


Homo sapiens


Human secreted protein sequence encoded by
71
31





gene 39 SEQ ID NO: 98.


477
gi213109


Discopyge


synaptic vesicle protein
75
36






ommata




477
gi1679584


Cavia


membrane cofactor protein precursor
80
37






porcellus




477
gi1655471


Cavia


membrane cofactor protein(GMP1-full)
80
37






porcellus




478
gi14330016


Mus musculus


bM401L17.2.1 (cholinergic receptor, nicotinic,
164
50





alpha polypeptide 4 (isoform 1))


478
gi9886085


Mus musculus


nicotinic acetylcholine receptor alpha 4 subunit
164
50


478
gi14330017


Mus musculus


bM401L17.2.2 (cholinergic receptor, nicotinic,
164
50





alpha polypeptide 4 (isoform 2))


479
gi409995


Rattus sp.


mucin
137
47


479
gi4995986
Human
13.6% identical to DR8 gene of strain U1102 of
135
32




herpesvirus 6
HHV-6


479
gi2388546


Homo sapiens


Human Xq28 BAC RP11-159I8 (Roswell Park
118
37





Cancer Institute Human BAC Library), Cosmid





LL0XNC01-3C3 (LLNL X Chromosome





Library), and BAC GS1-92B2 (Genome





Systems Human BAC Library) complete





sequence.


480
AAY58174


Homo sapiens


Human embryogenesis protein, EMPRO.
872
96


480
gi3879940


Caenorhabditis


Similarity to Mouse H(beta)58 protein
650
67






elegans


(SW: HB58_MOUSE)


480
gi3342000


Homo sapiens


H beta 58 homolog
666
70


481
gi13359817


Escherichia


high-affinity choline transport
1021
100






coli
O157:H7



481
gi1657512


Escherichia


high-affinity choline transport protein
1021
100






coli




481
gi1786506


Escherichia


high-affinity choline transport
1021
100






coli
K12



482
gi10584129
Halobacterium
Vng6071c
81
27




sp. NRC-1


482
gi10584473
Halobacterium
Vng6455c
81
27




sp. NRC-1


482
gi12723038


Lactococcus


UNKNOWN PROTEIN
58
28






lactis
subsp.





lactis


483
gi13364609


Escherichia


fumarate reductase FrdD
515
96






coli
O157:H7



483
gi145266


Escherichia


g13 protein
515
96






coli




483
gi1790594


Escherichia


fumarate reductase, anaerobic, membrane
515
96






coli
K12

anchor polypeptide


484
gi1160319


Escherichia


aldohexuronate transport system
928
96






coli




484
gi13363448


Escherichia


transport protein of hexuronates
928
96






coli
O157:H7



484
gi2367193


Escherichia


transport of hexuronates
928
96






coli
K12



485
gi395270


Escherichia


FepE
402
100






coli




485
gi1786802


Escherichia


ferric enterobactin (enterochelin) transport
402
100






coli
K12



485
gi1778503


Escherichia


ferric enterobactin transport protein
402
100






coli




486
gi145521


Escherichia


methyl-accepting chemotaxis protein II
411
73






coli




486
gi1736539


Escherichia


Methyl-accepting chemotaxis protein II (MCP-
411
73






coli


II) (Aspartate chemoreceptor protein).


486
gi1788195


Escherichia


methyl-accepting chemotaxis protein II,
411
73






coli
K12

aspartate sensor receptor


487
gi14456429


Equus caballus


galanin receptor 1
69
28


487
gi3282259


Cucumaria


ND4L
69
30






pseudocurata




487
gi3282257


Cucumaria


ND4L
68
30






miniata




488
gi3702702
bacteriophage
Vpf77
65
30




Vf33


488
gi3702711
bacteriophage
Vpf77
65
30




Vf12


488
gi1742947
Alcaligenes sp.
urf-1 (merE)
64
31


489
gi263516


Azospirillum


NifB {N-terminal}
58
39






brasilense
,





Sp7, Peptide




Partial, 70 aa


489
gi9622741


Conus catus


four-loop conotoxin precursor
57
33


489
gi149569
Lactobacillus
lactacin F
56
40




sp.


490
gi896286


Leishmania


NH2 terminus uncertain
123
19






tarentolae




490
gi4155384


Helicobacter


IRON(III) DICITRATE TRANSPORT
120
27






pylori
J99

SYSTEM PERMEASE PROTEIN


490
gi1542807


Asterina


NADH-dehydrogenase subunit 4L
98
27






pectinifera




491
AAB88433


Homo sapiens


Human membrane or secretory protein clone
299
55





PSEC0210.


491
gi6996444


Homo sapiens


CTL2 gene.
299
55


491
AAB24284


Homo sapiens


Human H38087 (clone GTB6) protein sequence
295
54





SEQ ID NO: 7.


492
gi6807868


Homo sapiens


mRNA; cDNA DKFZp434G0625 (from clone
324
68





DKFZp434G0625); partial cds.


492
AAY13373


Homo sapiens


Amino acid sequence of protein PRO235.
209
62


492
AAB33420


Homo sapiens


Human PRO235 protein UNQ209 SEQ ID
209
62





NO: 31.


493
gi10434911


Homo sapiens


cDNA FLJ13068 fis, clone NT2RP3001739,
573
100





weakly similar to HYPOTHETICAL 72.5 KD





PROTEIN C2F7.10 IN CHROMOSOME I.


493
gi7022673


Homo sapiens


cDNA FLJ10562 fis, clone NT2RP2002701.
109
43


493
AAY87090


Homo sapiens


Human secreted protein sequence SEQ ID
109
43





NO: 129.


494
AAB63630


Homo sapiens


Human gastric cancer associated antigen protein
165
55





sequence SEQ ID NO: 992.


494
AAB63629


Homo sapiens


Human gastric cancer associated antigen protein
170
55





sequence SEQ ID NO: 991.


494
AAR06471


Homo sapiens


Derived protein from clone ICA525 (ATCC
172
55





40704).


495
gi13543949


Homo sapiens


Similar to RIKEN cDNA 2810432L12 gene,
2104
100





clone MGC: 12992, mRNA, complete cds.


495
AAY87340


Homo sapiens


Human signal peptide containing protein HSPP-
2104
100





117 SEQ ID NO: 117.


495
gi3876730


Caenorhabditis


F35C11.4
181
27






elegans




496
gi5001993


Dissostichus


chimeric AFGP/trypsinogen-like serine protease
199
49






mawsoni


precursor


496
gi295736


Dictyostelium


spore coat protein sp96
189
48






discoideum




496
gi2114321


Equine


membrane glycoprotein
186
39






herpesvirus
1



497
AAB66272


Homo sapiens


Human TANGO 378 SEQ ID NO: 29.
664
89


497
gi6006811


Mus musculus


serpentine receptor
261
40


497
AAB01247


Homo sapiens


Human HE6 receptor.
263
38


498
gi13623515


Homo sapiens


clone MGC: 12705, mRNA, complete cds.
94
87


498
gi1017781
bacteriophage
Rz1 protein precursor
44
41




lambda


498
gi6599136


Homo sapiens


mRNA; cDNA DKFZp434F216 (from clone
94
87





DKFZp434F216); partial cds.


499
AAC84384


Homo sapiens


Human A236 polypeptide coding sequence.
693
100



aa1


499
gi10438797


Homo sapiens


cDNA: FLJ22415 fis, clone HRC08561.
692
100


499
AAY41692


Homo sapiens


Human PRO 363 protein sequence.
692
100


500
gi8515813


Rattus


RSD-6
84
25






norvegicus




500
gi12657809
Simian
gag protein
83
25




immunodeficiency




virus


500
gi9454456
Human
pol protein
60
35




immunodeficiency




virus type 1


501
AAY71056


Homo sapiens


Human membrane transport protein, MTRP-1.
143
76


501
gi13096889


Mus musculus


Similar to ATPase, class II, type 9B
142
68


501
gi13905302


Mus musculus


Similar to ATPase, class II, type 9A
119
63


502
gi2384752


Paracentrotus


transcription factor; PaxA
56
47






lividus




502
gi6601486


Ovis aries


pulmonary surfactant protein B
76
30


502
AAR41266


Homo sapiens


vWF fragment Arg441-Tyr508, deltaCys474-
56
47





Pro488.


503
AAY99420


Homo sapiens


Human PRO1486 (UNQ755) amino acid
1082
100





sequence SEQ ID NO: 287.


503
AAW88747


Homo sapiens


Secreted protein encoded by gene 45 clone
1069
99





HCESF40.


503
gi6942096


Mus musculus


CBLN3
942
94


504
gi11558496


Sus scrofa


sodium iodide symporter
170
51


504
gi12642414


Mus musculus


sodium iodide symporter NIS
184
39


504
gi14290145


Mus musculus


sodium iodide symporter
184
39


505
AAY66645


Homo sapiens


Membrane-bound protein PRO1310.
554
100


505
AAB65168


Homo sapiens


Human PRO1310 protein sequence SEQ ID
554
100





NO: 62.


505
gi2921092


Mus musculus


carboxypeptidase X2
281
58


507
gi58442
Human
8.0 K protein (AA 1-74)
56
44




adenovirus




type 41


507
gi388253


Trifolium


ribulose bisphosphate carboxylase
54
32






repens




507
gi1345574


Sinapis alba


small subunit ribulose 1,5-bisphosphate
57
36





carboxylase (AA 1-82)


508
gi3047402


Homo sapiens


monocarboxylate transporter 2 (hMCT2)
539
34





mRNA, complete cds.


508
gi7688756


Mus musculus


monocarboxylate transporter 4
296
48


508
gi3834395


Homo sapiens


monocarboxylate transporter 2 (MCT2) mRNA,
528
33





complete cds.


509
gi6136782


Mus musculus


synaptotagmin V
595
91


509
gi14210264


Rattus


synaptotagmin 5
592
91






norvegicus




509
gi6136792


Mus musculus


synaptotagmin X
268
43


510
AAB53400


Homo sapiens


Human colon cancer antigen protein Sequence
493
100


510
gi6760350


Homo sapiens


cytomegalovirus partial fusion receptor mRNA,
348
98





partial cds.


510
gi603380


Saccharomyces


Yer140wp
106
30






cerevisiae




511
AAB12136


Homo sapiens


Hydrophobic domain protein from clone
1142
100





HP10625 isolated from Liver cells.


511
AAB24036


Homo sapiens


Human PRO4407 protein sequence SEQ ID
1142
100





NO: 47.


511
AAY57952


Homo sapiens


Human transmembrane protein HTMPN-76.
1142
100


512
gi2654984
Hepatitis GB
polyprotein
50
38




virus C


512
gi861305


Caenorhabditis


similar to C. elegans protein F59B2.2
75
32






elegans




512
AAW75055


Homo sapiens


Fragment of human secreted protein encoded by
52
38





gene 18.


513
gi2696709


Mus musculus


RST
95
47


513
gi1293672


Mus musculus


kidney-specific transport protein
93
40


513
gi7707622


Homo sapiens


hOAT4 mRNA for organic anion transporter 4,
93
37





complete cds.


514
gi17829


Brassica napus


LEA76 peptide (AA 1-280)
137
27


514
gi11994339


Arabidopsis


embryonic abundant protein LEA-like
119
28






thaliana




514
gi3873646


Caenorhabditis


AC3.3
123
27






elegans




515
AAB74753


Homo sapiens


Human secreted protein sequence encoded by
38
54





gene 21 SEQ ID NO: 62.


515
gi2369777


Drosophila


sex-peptide
39
53






mauritiana




515
gi2369804


Drosophila


sex-peptide
39
53






simulans




516
gi13959739
Caprine
envelope glycoprotein
87
33




arthritis-




encephalitis




virus


516
gi5732606
Hepatitis B
precore/core mutant protein
74
33




virus


516
gi4033542
Hepatitis B
truncated pre-core-protein
72
34




virus


517
gi1336041


Homo sapiens


Human olfactory receptor (OLF1) gene,
482
50





complete cds.


517
gi1246530


Gallus gallus


olfactory receptor 2
474
50


517
gi1246534


Gallus gallus


olfactory receptor 4
474
50


518
AAY36243


Homo sapiens


Human secreted protein encoded by gene 20.
64
48


518
gi409995
Rattus sp.
mucin
65
57


518
gi11141770


Bos taurus


Toll-like receptor 4
80
29


519
gi8918871
Plasmid F
96 pct identical to gp: AB021078_30
288
98


519
gi4512467
Plasmid ColIb-
100 pct identical to 25 residues of 79 aa protein
256
93




P9
sp: YPF8_ECOLI


519
gi47517
Synechocystis
ATPase subunit epsilon
72
45




sp. PCC 6803


520
gi5139695


Cucumis


expressed in cucumber hypocotyls
85
28






sativus




520
gi3406819


Mus musculus


growth factor receptor
63
47


520
AAG03497


Homo sapiens


Human secreted protein, SEQ ID NO: 7578.
61
51


521
AAB18985


Homo sapiens


Amino acid sequence of a human
251
35





transmembrane protein.


521
gi6013381


Rattus


TM6P1
246
33






norvegicus




521
AAE00330


Homo sapiens


Human membrane-bound protein-60 (Zsig60).
251
35


523
gi1046315


Plasmodium


merozoite surface protein-1
88
34






vivax




523
gi2213834


Plasmodium


merozoite surface protein 1
85
29






vivax




523
gi537916


Lilium


meiotin-1
87
32






longiflorum




524
AAY91618


Homo sapiens


Human secreted protein sequence encoded by
63
29





gene 20 SEQ ID NO: 291.


524
AAG02988


Homo sapiens


Human secreted protein, SEQ ID NO: 7069.
58
29


525
gi220411


Mus musculus


N-methyl-D-aspartate receptor channel subunit
159
100





epsilon 1


525
gi286234


Rattus


N-methyl-D-aspartate receptor subunit
159
100






norvegicus




525
gi2155310


Rattus


N-methyl-D-aspartate receptor NMDAR2A
159
100






norvegicus


subunit; NMDA receptor NMDAR2A subunit


526
AAB66267


Homo sapiens


Human TANGO 272 SEQ ID NO: 14.
697
50


526
AAY72712


Homo sapiens


HTLIH44 clone human attractin-like protein.
570
47


526
AAY72715


Homo sapiens


HFICU08 clone human attractin-like protein.
565
47


527
gi2384746


Mus musculus


testicular condensing enzyme
681
52


527
gi4633135


Mus musculus


condensing enzyme
681
52


527
gi12652723


Homo sapiens


clone MGC: 3295, mRNA, complete cds.
276
29


528
gi12224992


Homo sapiens


mRNA; cDNA DKFZp667O2416 (from clone
877
100





DKFZp667O2416).


528
gi4929647


Homo sapiens


CGI-89 protein mRNA, complete cds.
603
61


528
gi12652585


Homo sapiens


CGI-89 protein, clone MGC: 845, mRNA,
602
60





complete cds.


529
AAY36047


Homo sapiens


Extended human secreted protein sequence,
61
57





SEQ ID NO: 432.


529
AAG01318


Homo sapiens


Human secreted protein, SEQ ID NO: 5399.
59
44


529
AAW74979


Homo sapiens


Human secreted protein encoded by gene 105
58
35





clone HSVAF07.


530
gi12314108


Homo sapiens


Human DNA sequence from clone RP1-23013
634
100





on chromosome 6q22.1-22.33 Contains part of a





gene for a novel protein, STSs and GSSs,





complete sequence.


530
gi10434835


Homo sapiens


cDNA FLJ13018 fis, clone NT2RP3000685.
435
68


530
gi1491712


Homo sapiens




H. sapiens mRNA for novel protein.

95
56


532
gi861305


Caenorhabditis


similar to C. elegans protein F59B2.2
124
30






elegans




532
gi10177114


Arabidopsis


amino acid transporter protein-like
91
34






thaliana




532
gi2576363


Arabidopsis


amino acid transport protein
79
29






thaliana




533
AAY28678


Homo sapiens


Human cw272_7 secreted protein.
324
38


533
gi13185723


Homo sapiens


n 1755 can be A, G, C, or T
248
30


533
AAB70537


Homo sapiens


Human PRO7 protein sequence SEQ ID NO: 14.
248
30


534
gi10186503


Homo sapiens


sialic acid-specific acetylesterase II mRNA,
932
100





complete cds, alternatively spliced.


534
gi6808138


Homo sapiens


mRNA; cDNA DKFZp761A051 (from clone
923
100





DKFZp761A051); partial cds.


534
gi10242345


Homo sapiens


sialic acid-specific 9-O-acetylesterase I mRNA,
753
100





complete cds.


535
gi7328084


Homo sapiens


mRNA; cDNA DKFZp761L0812 (from clone
225
82





DKFZp761L0812); partial cds.


535
gi7576817


Plasmodium


merozoite surface protein 2
94
38






falciparum




535
gi3261822


Mycobacterium


PE_PGRS
103
36






tuberculosis




536
gi3165565


Caenorhabditis


contains similarity to transmembrane domains
129
25






elegans


found in HMG CoA reductases and drosophila





patched protein (SW: P18502)


536
gi1825729


Caenorhabditis


similar to drosophila membrane protein
125
26






elegans


PATCHED SP: P18502 (PID: g129645)


536
gi15120
enterobacteria
unidentified reading frame
67
31




phage P1


537
gi13452508


Mus musculus


claudin 14
438
40


537
gi12597447


Homo sapiens


claudin 14 (CLDN14) mRNA, complete cds.
438
39


537
gi7768724


Homo sapiens


genomic DNA. chromosome 21q, section
438
39





70/105.


538
AAR12603


Homo sapiens


SIB 121 intestinal mucin.
148
53


538
AAW36946


Homo sapiens


Protein encoded by 5’ fragment of clone M8_2.
92
35


538
AAY91378


Homo sapiens


Human secreted protein sequence encoded by
86
45





gene 33 SEQ ID NO: 99.


539
gi13561518


Homo sapiens


GalNAc-4-sulfotransferase 2 mRNA, complete
213
97





cds, alternatively spliced.


539
gi12711481


Homo sapiens


N-acetylgalactosamine 4-O-sulfotransferase 2
187
97





GalNAc4ST-2 mRNA, complete cds.


539
AAY86315


Homo sapiens


Human secreted protein HNTMX29, SEQ ID
63
27





NO: 230.


540
gi3150438
Human
pol-env
264
51




endogenous




retrovirus K


540
gi3150441
Human
envelope protein
258
50




endogenous




retrovirus K


540
gi5802817


Homo sapiens


endogenous retrovirus HERV-K104 long
258
51





terminal repeat, complete sequence; and Gag





protein (gag) and envelope protein (env) genes,





complete cds.


541
AAY91625


Homo sapiens


Human secreted protein sequence encoded by
547
97





gene 22 SEQ ID NO: 298.


541
AAU00437


Homo sapiens


Human dendritic cell membrane protein FIRE.
547
97


541
AAW30638


Homo sapiens


Partial human 7-transmembrane receptor
374
66





HAPO167 protein.


542
AAY96963


Homo sapiens


Wound healing tissue peptidoglycan recognition
1811
92





protein-like protein.


542
AAY96962


Homo sapiens


Keratinocyte peptidoglycan recognition protein-
768
62





like protein.


542
AAY76124


Homo sapiens


Human secreted protein encoded by gene 1.
768
62


543
AAB72286


Homo sapiens


Human ADAMTS-9 amino acid sequence.
1009
100


543
AAB72301


Homo sapiens


Human ADAMTS-9 alternative amino acid
1009
100





sequence.


543
AAB90617


Homo sapiens


Human secreted protein, SEQ ID NO: 155.
358
39


544
gi4323581


Homo sapiens


senescence-associated epithelial membrane
150
100





protein (SEMP1) mRNA, complete cds.


544
gi4559278


Homo sapiens


claudin-1 (CLDN1) mRNA, complete cds.
150
100


544
gi13383364


Homo sapiens


claudin-1 (CLDN1) gene, exon 4 and complete
150
100





cds.


545
AAW93960


Homo sapiens


Human 53BP2: IP-2 protein fragment.
59
45


545
AAY19607


Homo sapiens


SEQ ID NO 325 from WO9922243.
57
64


545
AAY07942


Homo sapiens


Human secreted protein fragment encoded from
55
42





gene 91.


546
gi4406172
Human
latent membrane protein-1
159
37




herpesvirus 4


546
gi475574
Human
latent membrane protein 1
153
39




herpesvirus 4




type 2


546
gi2736358


Caenorhabditis


Contains similarity to Pfam domain: PF00069
155
51






elegans


(pkinase), Score = 214.7, E-value = 4.3e−61,





N = 1


547
gi552087


Drosophila


crumbs protein
127
45






melanogaster




547
AAY66747


Homo sapiens


Membrane-bound protein PRO1158.
67
46


547
AAB87559


Homo sapiens


Human PRO1158.
67
46


548
AAB39181


Homo sapiens


Human secreted protein sequence encoded by
57
41





gene 3 SEQ ID NO: 61.


549
AAW71565


Homo sapiens


Hepatocyte nuclear factor 4 alpha polypeptide
44
36





(exon 2 product).


549
gi2804240


Rattus


histidase
56
42






norvegicus




549
gi149163
Plasmid pJHC-
streptomycin-spectinomycin resistance protein
65
71




MW1


550
gi10435833


Homo sapiens


cDNA FLJ13729 fis, clone PLACE3000121,
233
100





weakly similar to VESICULAR TRAFFIC





CONTROL PROTEIN SEC15.


550
gi6807998


Homo sapiens


mRNA; cDNA DKFZp761I2124 (from clone
195
80





DKFZp761I2124); partial cds.


550
gi7023795


Homo sapiens


cDNA FLJ11251 fis, clone PLACE1008813.
195
80


551
gi5668598


Homo sapiens


Wiskott-Aldrich syndrome protein interacting
156
33





protein (WASPIP) mRNA, partial cds.


551
gi1314755


Mus musculus


Wiskott-Aldrich Syndrome Protein
140
33


551
gi4096355


Mus musculus


Wiskott-Aldrich syndrome protein (WASP)
140
33


552
gi4886381
Human
E5 protein
54
36




papillomavirus




type 16


552
AAB28331


Homo sapiens


Human secreted protein BLAST search protein
54
36





SEQ ID NO: 115.


552
gi4886413
Human
E5 protein
53
26




papillomavirus




type 16


553
gi12276062


Homo sapiens


group XII secreted phospholipase A2 mRNA,
354
100





complete cds.


553
gi12276193


Homo sapiens


FKSG38 (FKSG38) mRNA, complete cds.
354
100


553
AAY88271


Homo sapiens


Human TANGO 180 protein.
354
100


554
gi4885010


Conus textile


O-superfamily conotoxin TxO5 precursor
73
26


554
gi6409400


Conus textile


conotoxin scaffold VI/VII precursor
71
25


554
AAW78192


Homo sapiens


Human secreted protein encoded by gene 67
67
39





clone HTOFC34.


555
AAB38330


Homo sapiens


Human secreted protein encoded by gene 10
214
97





clone HTEBV72.


555
gi2335059


Mus musculus


IgG receptor
76
52


555
gi969034


Mus musculus


Fc gamma receptor IIb1
76
52


556
gi13311009


Homo sapiens


NYD-SP16 mRNA, complete cds.
488
100


556
gi3287162
Human
vpu
69
26




immunodeficiency




virus type 1


556
gi1303982


Bacillus


YqkE
59
40






subtilis




557
gi13938651


Mus musculus


Similar to conserved membrane protein at 44E
502
83


557
gi14194169


Arabidopsis


At1g05960/T21E18_20
124
30






thaliana




557
gi265786
Homo sapiens
betacellulin [human, mRNA, 1271 nt].
75
57


558
gi310100


Rattus


developmentally regulated protein
539
80






norvegicus




558
AAW52812


Homo sapiens


Human induced tumor protein.
227
37


558
AAY07771


Homo sapiens


Human secreted protein fragment encoded from
221
40





gene 28.


559
AAY71294


Homo sapiens


Human orphan G protein-coupled receptor
1711
100





hRUP3.


559
AAB02828


Homo sapiens


Human G protein coupled receptor hRUP3
1711
100





protein SEQ ID NO: 8.


559
gi1204095


Takifugu


dopamine receptor
237
28






rubripes




560
gi3041879


Mus musculus


LNXp80
556
54


560
gi3041881


Mus musculus


LNXp70
556
54


560
gi13183073


Homo sapiens


multi-PDZ-domain-containing protein mRNA,
539
56





complete cds.


561
AAB08872


Homo sapiens


Amino acid sequence of a human secretory
77
93





protein.


561
gi5734537


Methanothermobacter


transmembrane protein 9.0 kDa
62
43






thermautotrophicus


561
gi13357178


Homo sapiens


calcium channel gamma subunit 7 (CACNG7)
78
38





mRNA, complete cds.


562
gi5070458
tomato yellow
BV2 protein
60
33




leaf curl virus


562
gi9944667
Amsacta
AMV144
60
26




moorei




entomopoxvirus


562
gi293853


Mus musculus


betacellulin
48
25


563
gi10799398


Homo sapiens


chromosome 19, BAC BC349142 (CTC-
1513
100





518B2), complete sequence.


563
gi6063386


Homo sapiens


kallikrein-like protein 4 KLK-L4 gene, complete
1513
100





cds.


563
gi4884462


Homo sapiens


mRNA; cDNA DKFZp586J1923 (from clone
912
98





DKFZp586J1923); partial cds.


564
AAB90602


Homo sapiens


Human secreted protein, SEQ ID NO: 140.
704
100


564
AAB90662


Homo sapiens


Human secreted protein, SEQ ID NO: 205.
704
100


564
AAB90571


Homo sapiens


Human secreted protein, SEQ ID NO: 109.
700
99


565
AAB53436


Homo sapiens


Human colon cancer antigen protein sequence
82
33





SEQ ID NO: 976.


565
AAG02279


Homo sapiens


Human secreted protein, SEQ ID NO: 6360.
82
61


565
gi3879077


Caenorhabditis


R10E11.9
81
35






elegans




566
gi581191


Escherichia


unidentified reading frame (AA 1-79)
64
36






coli




566
gi929915
synthetic
insulin C chain
61
58




construct


566
AAP60248


Homo sapiens


Human proinsulin.
61
58


567
AAB08854


Homo sapiens


Amino acid sequence of a human secretory
787
100





protein.


567
AAY87268


Homo sapiens


Human signal peptide containing protein HSPP-
787
100





45 SEQ ID NO: 45.


567
AAY66723


Homo sapiens


Membrane-bound protein PRO1100.
787
100


568
gi14211714


Homo sapiens


naked cuticle-1 (NKD1) mRNA, complete cds.
193
92


568
AAB08216


Homo sapiens


A protein related to Drosophila naked cuticle
193
92





polypeptide.


568
gi13487305


Mus musculus


Nkd
151
62


569
gi3452275


Pleuronectes


aminopeptidase N
215
28






americanus




569
gi2766187


Gallus gallus


aminopeptidase Ey
178
32


569
gi3776238


Rattus


aminopeptidase N
151
29






norvegicus




570
AAB58305


Homo sapiens


Lung cancer associated polypeptide sequence
273
100





SEQ ID 643.


570
gi5830684
variola minor
A20L protein
57
24




virus


570
gi297302
Variola virus
A19L
57
24


571
AAB38019


Homo sapiens


Human secreted protein encoded by gene 27
583
99





clone HPJBF63.


571
AAB38010


Homo sapiens


Human secreted protein encoded by gene 27
576
98





clone HOUHD63.


571
gi167020


Hordeum


C-hordein storage protein
47
27






vulgare




572
AAY91385


Homo sapiens


Human secreted protein sequence encoded by
969
100





gene 40 SEQ ID NO: 106.


572
gi4126441


Homo sapiens


CD22 gene variant 6, partial cds.
68
34


572
gi201798


Mus musculus


T-cell receptor beta
95
29


573
gi9971734


Galleria


heavy-chain fibroin
121
34






mellonella




573
gi3002791


Homo sapiens


macrophage receptor MARCO mRNA,
81
28





complete cds.


573
gi5231092


Homo sapiens


macrophage receptor (MARCO) gene, exon 17
81
28





and complete cds.


574
gi409995
Rattus sp.
mucin
173
64


574
gi4063042


Cryptosporidium


GP900; mucin-like glycoprotein
134
38






parvum




574
gi5732924


Toxocara canis


excretory/secretory mucin MUC-4
112
29


575
gi1841555


Homo sapiens


HLA class III region containing NOTCH4 gene,
422
100





partial sequence, homeobox PBX2 (HPBX)





gene, receptor for advanced glycosylation end





products (RAGE) gene, complete cds, and 6





unidentified cds, complete sequence.


575
AAB25697


Homo sapiens


Human secreted protein sequence encoded by
122
40





gene 33 SEQ ID NO: 86.


575
AAB25755


Homo sapiens


Human secreted protein sequence encoded by
122
40





gene 33 SEQ ID NO: 144.


576
gi5732924


Toxocara canis


excretory/secretory mucin MUC-4
114
34


576
gi5732920


Toxocara canis


excretory/secretory mucin MUC-2
113
32


576
gi409995
Rattus sp.
mucin
95
29


577
gi12656447


Plasmodium


erythrocyte membrane protein 1
73
33






falciparum




577
AAG04067


Homo sapiens


Human secreted protein, SEQ ID NO: 8148.
73
51


577
gi4200249


Homo sapiens




H. sapiens gene from PAC 747L4.

76
32


578
gi12003279


Perilla


15 kD oleosin-like protein 1
77
36






frutescens




578
gi409424


Homo sapiens


Human carboxyl ester lipase like protein
59
32





(CELL) mRNA, complete cds.


578
gi609286


Xenopus


xsna
79
30






laevis




579
gi1841555


Homo sapiens


HLA class III region containing NOTCH4 gene,
80
42





partial sequence, homeobox PBX2 (HPBX)





gene, receptor for advanced glycosylation end





products (RAGE) gene, complete cds, and 6





unidentified cds, complete sequence.


579
AAB18976


Homo sapiens


Amino acid sequence of a human
69
40





transmembrane protein.


579
AAW73192


Homo sapiens


Human vesicle trafficking protein.
43
38


580
gi13241972


Mus musculus


SugarCrisp
841
56


580
gi13241970


Gallus gallus


SugarCrisp
840
59


580
gi2943716


Homo sapiens


mRNA for 25 kDa trypsin inhibitor, complete
840
63





cds.


581
gi4584539


Arabidopsis


extensin-like protein
138
34






thaliana




581
gi306316
Herpesvirus
EBNA-2
171
38




papio


581
gi1632787
Human
BYRF1, encodes EBNA-2 (Dambaugh et al,
142
35




herpesvirus 4
1984; Dillner et al, 1984)


582
gi13185723


Homo sapiens


n 1755 can be A, G, C, or T
373
100


582
AAB70537


Homo sapiens


Human PRO7 protein sequence SEQ ID NO: 14.
373
100


582
gi13185725


Homo sapiens


n 1755 can be A, G, C, or T.
373
100


583
gi202752


Rattus


adenylyl cyclase type II
261
59






norvegicus




583
AAB02006


Homo sapiens


Adenylyl cyclase type II-C2 C2 alpha domain.
261
59


583
gi2204110


Bos taurus


adenylyl cyclase type VII
138
50


584
gi10433645


Homo sapiens


cDNA FLJ12221 fis, clone MAMMA1001091.
1086
69


584
gi10440418


Homo sapiens


mRNA for FLJ00044 protein, partial cds.
1086
69


584
AAB56941


Homo sapiens


Human prostate cancer antigen protein sequence
126
28





SEQ ID NO: 1519.


585
AAY99402


Homo sapiens


Human PRO1382 (UNQ718) amino acid
492
98





sequence SEQ ID NO: 220.


585
AAY32937


Homo sapiens


Human cerebellin-2 protein sequence.
300
70


585
gi5702371


Mus musculus


precerebellin-1
284
66


586
AAB44681


Homo sapiens


Human secreted protein sequence encoded by
361
63





gene 41 SEQ ID NO: 146.


586
gi1293734


Saccharomyces


O3635p
279
34






cerevisiae




586
gi13877141


Homo sapiens


FKSG89
162
33


587
AAY34120


Homo sapiens


Human potassium channel K+ Hnov4.
1597
99


587
gi206044


Rattus


potassium channel Kv3.2b
1582
98






norvegicus




587
gi206914


Rattus


K+ channel protein
1582
98






norvegicus




588
gi3790674


Caenorhabditis


contains similarity to a vac1/fab1-type domain
449
54






elegans




589
AAB53626


Homo sapiens


Human colon cancer antigen protein sequence
55
47





SEQ ID NO: 1166.


589
gi1049106


Homo sapiens


Human dystonin isoform 2 mRNA, partial cds.
63
100


589
gi470480


Homo sapiens


Human clone JL8 immunoglobulin kappa chain
58
34





(IgK) mRNA, VKIII-JK3 region, partial cds.


590
AAY44985


Homo sapiens


Human epidermal protein-2.
82
37


590
gi11073


Drosophila


Mst84Da
75
37






melanogaster




590
gi8571115


Homo sapiens


human endogenous retrovirus HRES-1 p8
75
40





protein (p8) and p15 protein (p15) genes,





complete cds.


591
gi13676322


Homo sapiens


chromosome 1 open reading frame 2, clone
230
31





MGC: 1298, mRNA, complete cds.


591
gi13938585


Homo sapiens


clone MGC: 4509, mRNA, complete cds.
230
31


591
gi2564916


Homo sapiens


clk2 kinase (CLK2), propin1, cote1,
229
31





glucocerebrosidase (GBA), and metaxin genes,





complete cds; metaxin pseudogene and





glucocerebrosidase pseudogene; and





thrombospondin3 (THBS3) gene, partial cds.


592
gi56463


Rattus


gp210 (AA 1-1886)
363
79






norvegicus




592
gi6650678


Mus musculus


nuclear pore membrane glycoprotein POM210
358
78


592
gi1703554


Caenorhabditis


strong similarity to rat integral membrane
143
32






elegans


glycoprotein GP120 precursor (SP: P11654)


593
AAB73355


Homo sapiens


Human mesangial cell meg-1 protein.
317
52


593
gi4191594


Homo sapiens


protein serine/threonine phosphatase 4
292
52





regulatory subunit 1 (PP4R1) mRNA, complete





cds.


593
gi10120321


Salmo trutta


MHC class II alpha chain
58
30


594
gi11320944


Homo sapiens


peptide deformylase-like protein mRNA,
1300
100





complete cds.


594
gi13195254


Homo sapiens


polypeptide deformylase-like protein (PDF)
1300
100





mRNA, complete cds.


594
gi11320968


Lycopersicon


peptide deformylase-like protein
346
40






esculentum




595
gi13279254


Homo sapiens


Similar to RIKEN cDNA 2610207I16 gene,
417
94





clone MGC: 10940, mRNA, complete cds.


595
gi5869811


Glomus


Fox2 protein
187
30






mosseae




595
gi432977


Homo sapiens


Human sterol carrier protein 2 mRNA, complete
174
32





cds.


596
gi10803406


Homo sapiens


mRNA for cadherin-19 (CDH19 gene).
863
100


596
AAY41725


Homo sapiens


Human PRO941 protein sequence.
863
100


596
AAB44281


Homo sapiens


Human PRO941 (UNQ478) protein sequence
863
100





SEQ ID NO: 264.


597
AAG02731


Homo sapiens


Human secreted protein, SEQ ID NO: 6812.
67
38


597
gi1841964


Toxocara canis


TcH SLdT.460
63
37


597
gi3986598


Ginglymostoma


antigen receptor
58
47






cirratum




598
gi575501


Homo sapiens


thyrotropin beta-subunit (TSHB) gene, exon 3.
739
99


598
gi339998


Homo sapiens


Human thyrotropin beta (TSH-beta) subunit
739
99





gene, exons 2 and 3.


598
gi340002


Homo sapiens


Human thyrotropin beta subunit gene, exons 2
739
99





and 3.


599
AAB53436


Homo sapiens


Human colon cancer antigen protein sequence
368
97





SEQ ID NO: 976.


599
AAB25691


Homo sapiens


Human secreted protein sequence encoded by
168
93





gene 27 SEQ ID NO: 80.


599
AAY01428


Homo sapiens


Secreted protein encoded by gene 46 clone
81
42





HAQBT52.


600
AAB54178


Homo sapiens


Human pancreatic cancer antigen protein
1025
99





sequence SEQ ID NO: 630.


600
gi7321824


Drosophila


out at first
510
38






melanogaster




600
gi2443448


Drosophila


out at first
508
39






virilis




601
AAW75178


Homo sapiens


Human secreted protein encoded by gene 69
45
47





clone HPEBD70.


601
gi6466876
Kashmir bee
RNA polymerase
72
43




virus


601
gi6646671
cloudy wing
RNA polymerase
72
43




virus


602
AAB88377


Homo sapiens


Human membrane or secretory protein clone
379
91





PSEC0113.


602
gi190506


Homo sapiens


Human PRB1 locus salivary proline-rich protein
111
32





mRNA, clone cP5, complete cds.


602
gi190475


Homo sapiens


Human salivary proline-rich protein 1 gene,
84
34





segment 2.


603
gi1235645


Cladomyrma


cytochrome oxidase subunit II
57
50






cryptata




603
gi4981606


Thermotoga


oligopeptide ABC transporter, permease protein
43
31






maritima




603
gi6681644
Yaba monkey
similar to vaccinia A14.5L
55
45




tumor virus


604
gi7020918


Homo sapiens


cDNA FLJ20668 fis, clone KAIA585.
461
66


604
AAB54305


Homo sapiens


Human pancreatic cancer antigen protein
62
33





sequence SEQ ID NO: 757.


604
AAY41352


Homo sapiens


Human secreted protein encoded by gene 45
58
21





clone HTXFH55.


605
AAY54054


Homo sapiens


Angiostatin-binding domain of ABP-1,
137
39





designated Big-3.


605
gi9887326


Homo sapiens


angiomotin mRNA, complete cds.
155
37


605
AAY54052


Homo sapiens


An angiogenesis-associated protein which binds
155
37





plasminogen.


606
gi11072097


Homo sapiens


MLL/GAS7 fusion protein (MLL/GAS7)
83
25





mRNA, partial cds.


606
gi7331837


Caenorhabditis


contains similarity to human X-linked deafness
60
25






elegans


dystonia protein (GB: U66035)


606
AAG02452


Homo sapiens


Human secreted protein, SEQ ID NO: 6533.
59
44


607
gi854065
Human
U88
305
47




herpesvirus 6


607
gi9757150


Leishmania


extremely cysteine/valine rich protein
284
50






major




607
gi10434098


Homo sapiens


cDNA FLJ12547 fis, clone NT2RM4000634.
219
38


608
AAY48278


Homo sapiens


Human prostate cancer-associated protein 64.
98
89


608
AAB58446


Homo sapiens


Lung cancer associated polypeptide sequence
98
89





SEQ ID 784.


608
AAG00214


Homo sapiens


Human secreted protein, SEQ ID NO: 4295.
98
89


610
AAB61421


Homo sapiens


Human TANGO 300 protein.
1583
99


610
AAB23618


Homo sapiens


Human secreted protein SEQ ID NO: 36.
1581
99


610
AAB87592


Homo sapiens


Human PRO1925.
1354
98


611
gi6841194


Homo sapiens


HSPC272
421
66


611
gi12248392


Mus musculus


transcriptional inhibitory factor
90
28


611
gi2853265


Rattus


jun dimerization protein 2
90
28






norvegicus




612
gi9964124


Helicobacter


HP0519-like protein
54
45






pylori




612
gi6970424
Human
start codon is not identified
59
29




papillomavirus




type 69


613
gi14330385


Homo sapiens


mRNA for sodium/calcium exchanger, SCL8A3,
178
92





alternative splice form B (SCL8A3 gene).


613
gi14330383


Homo sapiens


mRNA for sodium/calcium exchanger SCL8A3,
193
60





alternative splice form A (SCL8A3 gene).


613
gi1552526


Rattus


sodium-calcium exchanger form 3
178
92






norvegicus




614
gi58028
synthetic
suef protein
148
32




construct


614
gi2447210


Paramecium


a312aR
67
35






bursaria






Chlorella virus 1


615
gi8100892
Human
protease
76
30




immunodeficiency




virus type 1


615
gi14281259
Human
HIV Protease
71
28




immunodeficiency




virus


615
gi10504617
Human
protease
71
31




immunodeficiency




virus type 1


616
gi4128041


Homo sapiens


claudin-9 (CLDN9) gene.
146
37


616
AAB64401


Homo sapiens


Amino acid sequence of human intracellular
146
37





signalling molecule INTRA33.


616
gi4325296


Mus musculus


claudin-9
143
36


617
AAY05376


Homo sapiens


Human HCMV inducible gene protein, SEQ ID
974
90





NO: 20.


617
AAB60496


Homo sapiens


Human cell cycle and proliferation protein
974
90





CCYPR-44, SEQ ID NO: 44.


617
gi13879501


Mus musculus


RIKEN cDNA 4933419D20 gene
348
41


618
AAY25451


Homo sapiens


Human secreted protein 2 derived from extended
123
53





cDNA.


618
AAY35882


Homo sapiens


Extended human secreted protein sequence,
123
53





SEQ ID NO: 19.


618
AAY66636


Homo sapiens


Membrane-bound protein PRO180.
126
47


619
gi14042279


Homo sapiens


cDNA FLJ14627 fis, clone NT2RP2000289.
208
82


619
AAW78193


Homo sapiens


Human secreted protein encoded by gene 68
103
46





clone H2CBJ08.


620
gi10579884
Halobacterium
Vng0244h
68
32




sp. NRC-1


621
AAY19740


Homo sapiens


SEQ ID NO: 458 from WO9922243.
60
36


621
gi5911915


Homo sapiens


mRNA; cDNA DKFZp586M0622 (from clone
68
31





DKFZp586M0622); partial cds.


621
gi4574260


Haemophilus


outer membrane protein 26
70
29






influenzae




622
gi13543049


Mus musculus


Similar to RIKEN cDNA 0610030G03 gene
1147
87


622
gi5263332


Arabidopsis


F8K7.23
123
24






thaliana




622
gi6552728


Arabidopsis


T26F17.1
123
24






thaliana




623
gi14290586


Homo sapiens


Similar to RIKEN cDNA 2810403L02 gene,
1809
100





clone IMAGE: 3868486, mRNA, partial cds.


623
gi11493522


Homo sapiens


PRO1512
1512
100


623
AAB58871


Homo sapiens


Breast and ovarian cancer associated antigen
1412
92





protein sequence SEQ ID 579.


624
gi2114213


Homo sapiens


immunoglobulin lambda gene locus DNA,
788
100





clone: 123E1 upstream contig.


624
gi2114308


Homo sapiens


immunoglobulin lambda gene locus DNA,
788
100





clone: 123E1.


624
gi693811
Homo sapiens
Vpre-B = VPre-B protein [human, chromosome 22, Genomic, 1100 nt].
788
100


625
gi14250299


Homo sapiens


Similar to RIKEN cDNA C030006K11 gene,
686
87





clone MGC: 18180, mRNA, complete cds.


625
gi7230571


Mus musculus


lim homeodomain-containing transcription
87
26





factor


625
gi587461


Mesocricetus


lmx1.1
83
25






auratus




626
AAB24074


Homo sapiens


Human PRO1153 protein sequence SEQ ID
130
34





NO: 49.


626
AAY66735


Homo sapiens


Membrane-bound protein PRO1153.
130
34


626
AAB65258


Homo sapiens


Human PRO1153 (UNQ583) protein sequence
130
34





SEQ ID NO: 351.


627
gi405956


Escherichia


yeeE
1138
93






coli




627
gi405954


Escherichia


exonuclease I
1014
86






coli




627
gi1736685


Escherichia


Exodeoxyribonuclease I (EC 3.1.11.1)
1014
86






coli


(Exonuclease I) (DNA





deoxyribophosphodiesterase) (DRPase).


628
gi295196


Salmonella


level of amino acid identity between E. coli and
699
86






typhimurium




S. typhimurium strongly suggests authentic gene



628
gi405956


Escherichia


yeeE
96
36






coli




628
AAG01568


Homo sapiens


Human secreted protein, SEQ ID NO: 5649.
65
25


629
AAW67894


Homo sapiens


Human secreted protein encoded by gene 2
60
28





clone HBMCF37.


629
AAY87145


Homo sapiens


Human secreted protein sequence SEQ ID
60
28





NO: 184.


629
AAY87182


Homo sapiens


Human secreted protein sequence SEQ ID
60
28





NO: 221.


630
gi216539


Escherichia


BasS
825
98






coli




630
gi1790551


Escherichia


sensor protein for basR
825
98






coli
K12



630
gi536956


Escherichia


basS
825
98






coli




631
gi1786804


Escherichia


ferric enterobactin transport protein
1021
100






coli
K12



631
gi1778505


Escherichia


ferric enterobactin transport protein
1021
100






coli




631
gi13360086


Escherichia


ferric enterobactin transport protein
1020
99






coli
O157: H7



632
gi349227


Escherichia


transmembrane protein
1114
100






coli




632
gi466681


Escherichia


dppC
1114
100






coli




632
gi13363896


Escherichia


dipeptide transport system permease protein 2
1114
100






coli
O157: H7



633
gi4063042


Cryptosporidium


GP900; mucin-like glycoprotein
359
57






parvum




633
gi2827460


Cercopithecus


hepatitis A virus cellular receptor 1 short form
324
56






aethiops




633
gi2827462


Cercopithecus


hepatitis A virus cellular receptor 1 long form
324
56






aethiops




634
gi13959789


Homo sapiens


lung alpha/beta hydrolase protein 1 mRNA,
203
88





complete cds.


634
gi13784946


Mus musculus


alpha/beta hydrolase-1
175
77


634
gi7545019


Neurospora


apocytochrome b
47
41






crassa




635
AAB87774


Homo sapiens


Human T2R44 amino acid sequence SEQ ID
364
91





NO: 70.


635
AAB87780


Homo sapiens


Human T2R50 amino acid sequence SEQ ID
363
89





NO: 76.


635
AAB87745


Homo sapiens


Human T2R15 amino acid sequence SEQ ID
343
85





NO: 28.


636
gi2275592


Homo sapiens


T cell receptor beta locus, TCRBV8S5P to
534
100





TCRBV21S2A2 region.


636
gi2275570


Homo sapiens


T cell receptor beta locus, TCRBV6S4A1 to
534
100





TCRBV8S1 region.


636
gi2218039


Homo sapiens


Human germline T-cell receptor beta chain
534
100





TCRBV13S1, TCRBV6S8A2T,





TCRBV5S6A3N2T, TCRBV13S6A2T,





TCRBV6S9P, TCRBV5S3A2T, TCRBV13S8P,





TCRBV6S3A1N1T, TCRBV5S2,





TCRBV6S6A2T, TCRBV5S7P, TCRBV13S4,





TCRBV6S2A1N1T, TCRBV5S4A2T,





TCRBV6S4A1, TCRBV23S1A2T,





TCRBV12S1A1N2, TCRBV21S2A2,





TCRBV8S1, TCRBV8S2A1T, TCRBV8S3,





TCRBV16S1A1N1, TCRBV24S1A3T,





TCRBV25S1A2PT, TCRBV26S1P,





TCRBV18S1, TCRBV17S1A1T, TCRBV2S1,





TCRBV10S1P genes from bases 257519 to





472940 (section 2 of 3).


637
AAB49502


Homo sapiens


Clone HYASC03.
310
98


637
gi7020468


Homo sapiens


cDNA FLJ20396 fis, clone KAT00561.
145
39


637
AAB18980


Homo sapiens


Amino acid sequence of a human
145
39





transmembrane protein.


638
AAY38432


Homo sapiens


Human secreted protein encoded by gene No. 3.
81
46


638
AAY73420


Homo sapiens


Human secreted protein clone ye22_1 protein
75
33





sequence SEQ ID NO: 62.


638
AAY20298


Homo sapiens


Human apolipoprotein E mutant protein
77
30





fragment 11.


639
gi9948048


Pseudomonas


probable transporter (membrane subunit)
557
63






aeruginosa




639
gi7227389


Neisseria


sodium/dicarboxylate symporter family protein
492
58






meningitidis






MC58


639
gi9657417


Vibrio


sodium/dicarboxylate symporter
474
55






cholerae




640
gi13111711


Homo sapiens


solute carrier family 2 (facilitated glucose
1273
60





transporter), member 5, clone MGC: 1619,





mRNA, complete cds.


640
gi12804761


Homo sapiens


solute carrier family 2 (facilitated glucose
1273
60





transporter), member 5, clone MGC: 3654,





mRNA, complete cds.


640
gi183298


Homo sapiens


Human glucose transport-like 5 (GLUT5)
1273
60





mRNA, complete cds.


641
gi14336709


Homo sapiens


16p13.3 sequence section 3 of 8.
358
57


641
gi9621664


Homo sapiens


RHBDL gene for rhomboid-related protein.
358
57


641
gi3287191


Homo sapiens


mRNA for rhomboid-related protein, complete
358
57





CDS.


642
AAY45023


Homo sapiens


Human sensory transduction G-protein coupled
968
100





receptor-B3.


642
gi13785657


Mus musculus


candidate taste receptor T1R1
786
77


642
gi13785659


Mus musculus


candidate taste receptor T1R2
303
36


643
gi871498


Oryza sativa


DNA binding protein
86
35


643
gi7160630


Bordetella


pertactin (P.68)
86
39






bronchiseptica




643
gi9049498


Bordetella


pertactin
86
39






bronchiseptica




644
gi5911988


Homo sapiens


mRNA; cDNA DKFZp434H2235 (from clone
164
73





DKFZp434H2235); partial cds.


644
gi5262574


Homo sapiens


mRNA; cDNA DKFZp434G173 (from clone
164
73





DKFZp434G173); complete cds.


644
AAW89030


Homo sapiens


Polypeptide fragment encoded by gene 165.
147
64


645
gi10437864


Homo sapiens


cDNA: FLJ21709 fis, clone COL10077.
429
74


645
AAY91433


Homo sapiens


Human secreted protein sequence encoded by
412
76





gene 33 SEQ ID NO: 154.


645
gi14042074


Homo sapiens


cDNA FLJ14508 fis, clone NT2RM1000421,
411
80





weakly similar to RIBONUCLEASE





INHIBITOR.


646
gi9280561


Mus musculus


elafin-like protein I
66
30


646
AAY99453


Homo sapiens


Human PRO1784 (UNQ846) amino acid
77
31





sequence SEQ ID NO: 390.


646
gi10176740


Arabidopsis


RING zinc finger protein-like
76
33






thaliana




647
AAY19485


Homo sapiens


Amino acid sequence of a human secreted
53
52





protein.


648
gi6900006


Ceratitis


chorion protein s18
95
31






capitata




648
gi1491621
Bovine
UL36
104
35




herpesvirus 1


648
gi2653311
Bovine
very large virion protein (tegument)
104
35




herpesvirus




type 1.1


649
gi4877582


Homo sapiens


lipoma HMGIC fusion partner (LHFP) mRNA,
72
34





complete cds.


649
AAY87336


Homo sapiens


Human signal peptide containing protein HSPP-
72
34





113 SEQ ID NO: 113.


649
gi9658445


Vibrio


AzlC family protein
49
38






cholerae




650
gi6899191


Ureaplasma


amino acid antiporter
67
33






urealyticum




650
gi5708228


Rhodopseudomonas


LH2alpha7
62
35






acidophila


650
gi7211354


Saimiri


olfactory receptor
77
34






boliviensis




651
AAB19403


Homo sapiens


Amino acid sequence of a human secreted
712
89





protein.


651
gi387048


Cricetus


DHFR-coamplified protein
230
47






cricetus




651
gi3261597


Mycobacterium


lprA
77
29






tuberculosis




652
gi12718841


Mus musculus


Skullin
310
38


652
gi4191356


Mus musculus


claudin-6
308
38


652
gi13543081


Mus musculus


claudin-6
308
38


653
gi801882


Vibrio


FkuB
83
31






alginolyticus




653
gi2795895


Homo sapiens


clone 23819 white protein homolog mRNA,
71
30





partial cds.


653
gi5777942


Equus caballus


IL-1ra
52
25


654
gi9872


Plasmodium


ATPase I
116
41






falciparum




654
gi7688148


Homo sapiens


Novel human gene mapping to chromosome I.
119
42


654
gi3451312


Schizosaccharomyces


membrane atpase
116
41






pombe



655
gi6682873


Homo sapiens


rec mRNA, complete cds.
200
90


655
gi7230612


Rattus


small rec
197
87






norvegicus




655
gi4959442


Drosophila


DNZDHHC/NEW1 zinc finger protein 11
93
41






melanogaster




656
gi2204110


Bos taurus


adenylyl cyclase type VII
233
69


656
gi602412


Mus musculus


adenylyl cyclase type VII
209
66


656
AAB02011


Homo sapiens


Type VII adenylyl cyclase.
209
66


657
gi3297936


Rattus


rhomboid-related protein
267
71






norvegicus




657
gi9621664


Homo sapiens


RHBDL gene for rhomboid-related protein.
266
71


657
gi14336709


Homo sapiens


16p13.3 sequence section 3 of 8.
266
71


658
gi10437529


Homo sapiens


cDNA: FLJ21432 fis, clone COL04219.
145
25


658
AAY76136


Homo sapiens


Human secreted protein encoded by gene 13.
113
28


658
gi4929559


Homo sapiens


CGI-45 protein mRNA, complete cds.
113
28


659
gi2429362


Santalum


proline rich protein
137
34






album




659
gi5139695


Cucumis


expressed in cucumber hypocotyls
127
28






sativus




659
gi7671460


Arabidopsis


AtAGP4
111
37






thaliana




660
gi3165565


Caenorhabditis


contains similarity to transmembrane domains
94
23






elegans


found in HMG CoA reductases and drosophila





patched protein (SW: P18502)


660
gi160281


Plasmodium


erythrocyte binding protein
64
35






falciparum




660
AAY28686


Homo sapiens


Human yb39_1 secreted protein.
57
43


662
AAY71948


Homo sapiens


Human ion channel protein (ICP).
1195
99


662
AAY71949


Homo sapiens


Human alternative ion channel protein (ICP).
1195
99


662
AAR27654


Homo sapiens


Human calcium channel 27980/16.
149
27


663
gi478889


Rana


transcription factor RcC/EBP-1
82
33






catesbeiana




663
gi4098456


Sus scrofa


follicle-stimulating hormone beta subunit
60
38


663
AAR56767


Homo sapiens


Human FSH beta subunit fragment with
58
33





residues −18 to 35.


664
gi5578778


Homo sapiens


mRNA for G18.2 protein (G18.2 gene, located
73
41





in the class III region of the major





histocompatibility complex).


664
gi213591


Pseudopleuronectes


HPLC6
65
43






americanus




664
gi11345434


Thermus


competence factor ComEA
79
43






thermophilus




665
gi13111831


Homo sapiens


clone IMAGE: 3451448, mRNA, partial cds.
606
60


665
AAW78128


Homo sapiens


Human secreted protein encoded by gene 3
606
60





clone HOSBI96.


665
AAB18993


Homo sapiens


Amino acid sequence of a human
606
60





transmembrane protein.


666
gi14249886


Homo sapiens


clone MGC: 15763, mRNA, complete cds.
196
77


666
gi217554


Bos taurus


endothelin receptor
50
32


666
gi3299894


Equus caballus


endothelin-B receptor
50
32


667
AAW52812


Homo sapiens


Human induced tumor protein.
123
38


667
gi8895091


Homo sapiens


Diff33 protein homolog mRNA, complete cds.
123
38


667
AAY95015


Homo sapiens


Human secreted protein vc61_1, SEQ ID
123
38





NO: 70.


668
gi32093


Homo sapiens




H. sapiens HGMIP07J gene for olfactory
849
54





receptor.


668
AAF61132


Homo sapiens


Human OLFXY cDNA.
802
49



aa1


668
AAB46999


Homo sapiens


Human OLFXY protein.
799
49


669
gi9081843


Prunus dulcis


self-incompatibility associated ribonuclease
79
44


669
gi6539444


Prunus avium


S6-RNase
79
44


669
gi6539438


Prunus avium


S1-RNase
78
44


670
AAB66272


Homo sapiens


Human TANGO 378 SEQ ID NO: 29.
581
100


670
AAB61166


Homo sapiens


Human BBSR seven transmembrane receptor
168
39





protein.


670
gi6006811


Mus musculus


serpentine receptor
168
41


671
AAY66750


Homo sapiens


Membrane-bound protein PRO1287.
785
98


671
AAB87561


Homo sapiens


Human PRO1287.
785
98


671
AAB65273


Homo sapiens


Human PRO1287 (UNQ656) protein sequence
785
98





SEQ ID NO: 381.


672
AAY99421


Homo sapiens


Human PRO1433 (UNQ738) amino acid
915
48





sequence SEQ ID NO: 292.


672
gi13537297


Homo sapiens


GS1999full mRNA, complete cds.
879
51


672
AAY94889


Homo sapiens


Human protein clone HP02485.
723
43


673
gi10435844


Homo sapiens


cDNA FLJ13737 fis, clone PLACE3000157.
93
28


673
gi205752


Rattus


Nopp140
95
27






norvegicus




673
AAY53800


Homo sapiens


Amino acids 145-197 of the mature human
63
40





chromogranin A (CgA) protein.


674
gi7717312


Homo sapiens


chromosome 21 segment HS21C049.
422
97


674
AAB18666


Homo sapiens


A human regulator of intracellular
115
92





phosphorylation.


674
gi11342496
Bacteriophage
holin
77
27




phi-Ea1h










[0402]

3








TABLE 3

| SEQ ID NO: | Accession No. | Description | Results* |
|---|---|---|---|
| 339 | BL01144 | Ribosomal protein L31e proteins. | BL01144 25.07 6.684e−17 83-135 |
| 342 | PF01325 | Iron dependent repressor. | PF01325B 20.91 5.680e−09 34-56 |
| 354 | BL00019 | Actinin-type actin-binding domain proteins. | BL00019D 15.33 3.948e−14 41-71 |
| 357 | BL00979 | G-protein coupled receptors family 3 proteins. | BL00979M 14.39 6.532e−11 30-81 |
| 367 | BL00590 | LIF/OSM family proteins. | BL00590B 17.36 3.045e−19 183-201 |
| 375 | PR00245 | OLFACTORY RECEPTOR SIGNATURE | PR00245A 18.03 1.419e−18 57-79 |
| 376 | PR00927 | ADENINE NUCLEOTIDE TRANSLOCATOR 1 SIGNATURE | PR00927A 7.98 9.667e−09 14-27 |
| 378 | PR00237 | RHODOPSIN-LIKE GPCR SUPERFAMILY SIGNATURE | PR00237B 13.50 2.250e−09 58-80 |
| | | | PR00237G 19.63 9.372e−09 143-170 |
| 379 | PR00698 | C. ELEGANS SRG FAMILY INTEGRAL MEMBRANE PROTEIN SIGNATURE | PR00698E 14.43 8.714e−09 97-123 |
| 384 | PF00075 | RNase H. | PF00075A 14.44 4.429e−09 231-248 |
| 387 | PD01066 | PROTEIN ZINC FINGER ZINC-FINGER METAL-BINDING NU. | PD01066 19.43 9.727e−36 58-97 |
| 388 | PR00907 | THROMBOMODULIN SIGNATURE | PR00907E 11.70 2.969e−10 49-72 |
| 399 | PD01115 | PRECURSOR AMPHIBIAN SKIN SIGNAL. | PD01115A 12.27 9.750e−12 1-24 |
| 403 | BL00970 | Nuclear transition protein 2 proteins. | BL00970B 10.09 8.966e−10 83-109 |
| 405 | PF01007 | Inward rectifier potassium channel. | PF01007B 17.48 1.000e−08 95-139 |
| 419 | BL00948 | Ribosomal protein S7e proteins. | BL00948A 14.13 5.034e−20 68-91 |
| 423 | PR00019 | LEUCINE-RICH REPEAT SIGNATURE | PR00019B 11.36 4.150e−10 70-84 |
| | | | PR00019B 11.36 9.100e−10 94-108 |
| | | | PR00019A 11.19 8.000e−09 73-87 |
| 425 | BL00476 | Fatty acid desaturases family 1 proteins. | BL00476B 18.34 4.938e−09 252-296 |
| 429 | BL01253 | Type I fibronectin domain proteins. | BL01253C 15.89 6.654e−18 78-117 |
| 434 | PR00049 | WILM'S TUMOUR PROTEIN SIGNATURE | PR00049D 0.00 6.034e−09 7-22 |
| 436 | PR00591 | SOMATOSTATIN RECEPTOR TYPE 5 SIGNATURE | PR00591B 7.56 4.750e−09 117-132 |
| 438 | PR00709 | AVIDIN SIGNATURE | PR00709A 4.60 1.170e−09 16-35 |
| 439 | BL01253 | Type I fibronectin domain proteins. | BL01253F 14.35 5.050e−14 78-117 |
| 445 | BL00649 | G-protein coupled receptors family 2 proteins. | BL00649C 17.82 6.339e−12 4-30 |
| 452 | PD01066 | PROTEIN ZINC FINGER ZINC-FINGER METAL-BINDING NU. | PD01066 19.43 6.362e−29 129-168 |
| 454 | PR00463 | E-CLASS P450 GROUP I SIGNATURE | PR00463B 17.50 3.314e−13 135-157 |
| | | | PR00463A 11.40 8.568e−10 111-131 |
| 459 | BL00211 | ABC transporters family proteins. | BL00211B 13.37 2.286e−13 222-254 |
| | | | BL00211A 12.23 9.550e−09 160-172 |
| 474 | PR00049 | WILM'S TUMOUR PROTEIN SIGNATURE | PR00049D 0.00 8.780e−09 78-93 |
| 479 | PF00624 | Flocculin repeat proteins. | PF00624J 6.21 7.070e−09 40-95 |
| | | | PF00624F 11.04 9.056e−09 68-104 |
| 481 | BL01303 | BCCT family of transporters proteins. | BL01303A 14.33 5.629e−31 89-122 |
| | | | BL01303B 10.14 2.250e−18 142-161 |
| 482 | PR00075 | FATTY ACID DESATURASE FAMILY 1 SIGNATURE | PR00075A 16.97 9.565e−09 9-30 |
| 486 | BL00538 | Bacterial chemotaxis sensory transducers proteins. | BL00538C 10.61 1.000e−40 152-191 |
| | | | BL00538A 23.61 3.647e−39 96-144 |
| 488 | BL00077 | Heme-copper oxidase catalytic subunit, copper B binding regio. | BL00077C 18.98 9.697e−09 9-60 |
| 494 | PR00550 | HYPERGLYCEMIC HORMONE SIGNATURE | PR00550C 11.31 9.426e−10 29-40 |
| 496 | DM01283 | A-BINDING PROTEIN CHLOROPHYLL. | DM01283A 14.91 9.600e−10 35-71 |
| 497 | BL00649 | G-protein coupled receptors family 2 proteins. | BL00649B 20.68 5.061e−11 23-69 |
| | | | BL00649C 17.82 4.955e−10 82-108 |
| 499 | BL00312 | Glycophorin A proteins. | BL00312B 9.22 9.911e−09 2-31 |
| 501 | PR00957 | GENE 66 (IR5) PROTEIN SIGNATURE | PR00957A 7.65 3.473e−09 158-176 |
| 502 | BL00479 | Phorbol esters/diacylglycerol binding domain proteins. | BL00479A 19.86 1.220e−10 59-82 |
| 503 | PR00007 | COMPLEMENT C1Q DOMAIN SIGNATURE | PR00007B 14.16 7.698e−13 116-136 |
| | | | PR00007D 9.64 9.654e−11 193-204 |
| | | | PR00007A 19.33 2.552e−10 89-116 |
| | | | PR00007C 15.60 3.656e−10 163-185 |
| 505 | PR00925 | NONHISTONE CHROMOSOMAL PROTEIN HMG17 FAMILY SIGNATURE | PR00925B 3.73 5.982e−10 78-91 |
| 517 | BL00237 | G-protein coupled receptors proteins. | BL00237A 27.68 7.000e−14 67-107 |
| 526 | PR00011 | TYPE III EGF-LIKE SIGNATURE | PR00011B 13.08 5.576e−13 76-95 |
| | | | PR00011D 14.03 6.943e−13 76-95 |
| | | | PR00011B 13.08 9.542e−13 33-52 |
| | | | PR00011D 14.03 3.211e−12 33-52 |
| | | | PR00011A 14.06 6.516e−12 33-52 |
| | | | PR00011A 14.06 8.548e−12 76-95 |
| | | | PR00011D 14.03 3.213e−11 162-181 |
| | | | PR00011B 13.08 2.174e−10 162-181 |
| | | | PR00011D 14.03 2.523e−10 119-138 |
| | | | PR00011B 13.08 2.356e−09 119-138 |
| | | | PR00011B 13.08 5.685e−09 205-224 |
| | | | PR00011A 14.06 6.425e−09 119-138 |
| | | | PR00011A 14.06 6.671e−09 162-181 |
| | | | PR00011D 14.03 9.870e−09 205-224 |
| 531 | PR00251 | BACTERIAL OPSIN SIGNATURE | PR00251G 16.33 4.000e−09 176-195 |
| 541 | BL00649 | G-protein coupled receptors family 2 proteins. | BL00649C 17.82 6.073e−13 21-47 |
| 546 | BL00242 | Integrins alpha chain proteins. | BL00242E 9.03 8.154e−09 82-111 |
| 551 | DM00215 | PROLINE-RICH PROTEIN 3. | DM00215 19.43 8.071e−10 122-155 |
| 555 | PR00806 | VINCULIN SIGNATURE | PR00806C 11.07 8.839e−09 13-31 |
| 559 | BL00237 | G-protein coupled receptors proteins. | BL00237A 27.68 9.129e−15 71-111 |
| | | | BL00237C 13.19 1.346e−13 218-245 |
| | | | BL00237D 11.23 9.308e−11 271-288 |
| 563 | BL00495 | Apple domain proteins. | BL00495N 11.04 8.239e−14 204-239 |
| | | | BL00495O 13.75 9.000e−14 236-265 |
| 580 | PR00838 | VENOM ALLERGEN 5 SIGNATURE | PR00838G 16.07 9.760e−12 165-185 |
| | | | PR00838D 8.73 1.563e−10 87-106 |
| 581 | PR00049 | WILM'S TUMOUR PROTEIN SIGNATURE | PR00049D 0.00 7.344e−13 205-220 |
| | | | PR00049D 0.00 9.262e−13 206-221 |
| | | | PR00049D 0.00 4.000e−12 207-222 |
| | | | PR00049D 0.00 4.000e−12 208-223 |
| | | | PR00049D 0.00 7.655e−11 202-217 |
| | | | PR00049D 0.00 7.958e−11 204-219 |
| | | | PR00049D 0.00 8.336e−11 203-218 |
| | | | PR00049D 0.00 1.214e−10 209-224 |
| | | | PR00049D 0.00 1.214e−10 210-225 |
| | | | PR00049D 0.00 3.746e−09 211-226 |
| 585 | BL01113 | C1q domain proteins. | BL01113A 17.99 3.106e−10 22-49 |
| 586 | PR00828 | FORMIN SIGNATURE | PR00828H 8.87 4.081e−09 390-412 |
| 587 | PR00169 | POTASSIUM CHANNEL SIGNATURE | PR00169H 8.09 5.696e−30 225-252 |
| | | | PR00169E 9.10 8.773e−28 127-154 |
| | | | PR00169G 9.39 6.684e−27 196-219 |
| | | | PR00169C 16.31 8.714e−25 59-83 |
| | | | PR00169F 7.19 6.192e−24 156-180 |
| | | | PR00169D 12.86 2.385e−20 85-106 |
| 590 | PR00451 | CHITIN-BINDING DOMAIN SIGNATURE | PR00451A 6.49 1.871e−09 88-97 |
| 594 | PF01327 | Polypeptide deformylase. | PF01327D 18.82 2.440e−20 197-229 |
| | | | PF01327A 18.58 2.187e−09 92-127 |
| 595 | PD02796 | PROTEIN STEROL CARRIER LIPID-TRAN. | PD02796B 20.92 6.507e−23 157-204 |
| 596 | BL00232 | Cadherins extracellular repeat proteins domain proteins. | BL00232A 27.72 7.218e−12 38-71 |
| 598 | BL00261 | Glycoprotein hormones beta chain proteins. | BL00261B 25.64 1.000e−40 72-116 |
| | | | BL00261A 23.97 3.500e−34 22-56 |
| 599 | PR00796 | VIRAL SPIKE GLYCOPROTEIN PRECURSOR SIGNATURE | PR00796I 8.96 7.638e−11 32-58 |
| 602 | PR00209 | ALPHA/BETA GLIADIN FAMILY SIGNATURE | PR00209B 4.88 8.594e−09 129-148 |
| 605 | PR00833 | POLLEN ALLERGEN POA PI SIGNATURE | PR00833H 2.30 6.625e−10 61-76 |
| 622 | PR00779 | INOSITOL 1,4,5-TRISPHOSPHATE-BINDING PROTEIN RECEPTOR SIGNATURE | PR00779H 8.81 6.909e−09 18-40 |
| 624 | DM00031 | IMMUNOGLOBULIN V REGION. | DM00031B 15.41 4.508e−15 84-118 |
| 628 | PD01736 | PROTEIN TRANSMEMBRANE INTERGENIC REGION RECQ-PLD. | PD01736B 8.42 9.250e−09 118-130 |
| 630 | PF00512 | Signal carboxyl-terminal domain proteins. | PF00512 13.94 3.571e−14 150-169 |
| 631 | PF01032 | FecCD transport family. | PF01032B 9.12 7.300e−15 132-147 |
| 632 | BL00713 | Sodium:dicarboxylate symporter family proteins. | BL00713D 20.98 6.063e−09 24-62 |
| 633 | DM00784 | APILLOMA VIRUS E4 PROTEIN. | DM00784B 17.87 7.492e−09 67-92 |
| 639 | BL00713 | Sodium:dicarboxylate symporter family proteins. | BL00713C 19.76 1.964e−09 100-139 |
| 640 | BL00216 | Sugar transport proteins. | BL00216B 27.64 8.000e−25 108-158 |
| 642 | BL00979 | G-protein coupled receptors family 3 proteins. | BL00979M 14.39 5.114e−12 126-177 |
| 643 | BL00402 | Binding-protein-dependent transport systems inner membrane co. | BL00402A 5.93 7.000e−09 55-69 |
| 645 | PR00237 | RHODOPSIN-LIKE GPCR SUPERFAMILY SIGNATURE | PR00237F 13.57 8.342e−09 24-49 |
| 662 | PR00170 | SODIUM CHANNEL SIGNATURE | PR00170G 7.74 3.374e−09 37-66 |
| 668 | BL00237 | G-protein coupled receptors proteins. | BL00237A 27.68 5.974e−12 83-123 |










[0403]

4









TABLE 4

| SEQ ID NO: | Pfam Model | Description | E-value | Pfam Score |
|---|---|---|---|---|
| 339 | Ribosomal_L31e | Ribosomal protein L31e | 0.00061 | 16.6 |
| 357 | 7tm_3 | 7 transmembrane receptor (metabotropic glutamate family) | 0.0073 | −95.1 |
| 367 | LIF_OSM | LIF/OSM family | 8e−145 | 494.5 |
| 370 | ig | Immunoglobulin domain | 1.5e−05 | 23.0 |
| 375 | 7tm_1 | 7 transmembrane receptor (rhodopsin family) | 3.8e−06 | 21.7 |
| 378 | 7tm_1 | 7 transmembrane receptor (rhodopsin family) | 0.064 | 8.3 |
| 380 | DUF6 | Integral membrane protein DUF6 | 1.4e−05 | 32.0 |
| 384 | rvt | Reverse transcriptase (RNA-dependent DNA Polymerase) | 3e−15 | 61.0 |
| 387 | KRAB | KRAB box | 2e−42 | 154.4 |
| 399 | Gastrin | Gastrin/cholecystokinin family | 7.5e−22 | 83.9 |
| 401 | Cornifin | | 0.0031 | 5.4 |
| 405 | ion_trans | Ion transport protein | 0.0034 | 24.0 |
| 408 | Galactosyl_T | Galactosyltransferase | 2.9e−28 | 107.3 |
| 419 | Ribosomal_S7e | Ribosomal protein S7e | 6.9e−17 | 69.5 |
| 423 | LRR | Leucine Rich Repeat | 1.8e−15 | 64.8 |
| 429 | kringle | Kringle domain | 1.2e−17 | 72.1 |
| 430 | p450 | Cytochrome P450 | 0.034 | 10.6 |
| 439 | trypsin | Trypsin | 1.9e−06 | 23.0 |
| 444 | PMP22_Claudin | PMP-22/EMP/MP20/Claudin family | 0.002 | −5.3 |
| 448 | ig | Immunoglobulin domain | 1.7e−08 | 32.5 |
| 452 | KRAB | KRAB box | 6.4e−22 | 86.3 |
| 454 | p450 | Cytochrome P450 | 8.3e−13 | 48.0 |
| 459 | ABC_tran | ABC transporter | 0.0016 | −23.4 |
| 478 | neur_chan | Neurotransmitter-gated ion-channel | 4.8e−15 | 54.0 |
| 481 | BCCT | BCCT family transporter | 8.5e−22 | 85.8 |
| 483 | Fumarate_red_D | | 3.4e−64 | 226.7 |
| 486 | HAMP | | 1.1e−11 | 52.2 |
| 497 | 7tm_2 | 7 transmembrane receptor (Secretin family) | 0.0039 | −87.5 |
| 503 | C1q | C1q domain | 2.2e−45 | 164.2 |
| 508 | MCT | Monocarboxylate transporter | 4.4e−59 | 209.7 |
| 517 | 7tm_1 | 7 transmembrane receptor (rhodopsin family) | 5.4e−22 | 72.0 |
| 526 | EGF | EGF-like domain | 0.00021 | 28.1 |
| 527 | DUF6 | Integral membrane protein DUF6 | 0.043 | 13.8 |
| 528 | zf-DHHC | DHHC zinc finger domain | 1.2e−32 | 121.9 |
| 533 | CUB | CUB domain | 6.9e−32 | 119.4 |
| 537 | PMP22_Claudin | PMP-22/EMP/MP20/Claudin family | 7.6e−31 | 115.9 |
| 541 | 7tm_2 | 7 transmembrane receptor (Secretin family) | 3.7e−05 | −46.4 |
| 543 | tsp_1 | Thrombospondin type 1 domain | 0.028 | 12.1 |
| 559 | 7tm_1 | 7 transmembrane receptor (rhodopsin family) | 9.3e−40 | 128.4 |
| 560 | PDZ | PDZ domain (Also known as DHR or GLGF). | 2.1e−42 | 154.3 |
| 563 | trypsin | Trypsin | 9.8e−99 | 313.5 |
| 569 | Peptidase_M1 | Peptidase family M1 | 3.7e−11 | 32.8 |
| 572 | ig | Immunoglobulin domain | 1.2e−06 | 26.5 |
| 580 | SCP | SCP-like extracellular protein | 2.9e−21 | 80.4 |
| 585 | C1q | C1q domain | 5.4e−08 | 35.2 |
| 587 | ion_trans | Ion transport protein | 3.9e−31 | 116.9 |
| 594 | Pep_deformylase | Polypeptide deformylase | 2.1e−20 | 81.2 |
| 595 | SCP2 | SCP-2 sterol transfer family | 5.2e−23 | 89.9 |
| 596 | cadherin | Cadherin domain | 2.9e−08 | 40.9 |
| 598 | Cys_knot | Cystine-knot domain | 3.3e−52 | 186.9 |
| 624 | ig | Immunoglobulin domain | 2.6e−09 | 35.1 |
| 630 | HAMP | | 1.1e−08 | 42.3 |
| 631 | FecCD_family | FecCD transport family | 7.4e−44 | 159.1 |
| 632 | BPD_transp | Binding-protein-dependent transport systems inner membrane component | 6e−05 | 29.9 |
| 636 | ig | Immunoglobulin domain | 8.8e−13 | 46.2 |
| 639 | SDF | Sodium:dicarboxylate symporter family | 3.4e−58 | 206.8 |
| 640 | sugar_tr | Sugar (and other) transporter | 2e−99 | 343.7 |
| 642 | 7tm_3 | 7 transmembrane receptor (metabotropic glutamate family) | 2.1e−06 | −21.8 |
| 652 | PMP22_Claudin | PMP-22/EMP/MP20/Claudin family | 4.1e−08 | 40.4 |
| 655 | zf-DHHC | DHHC zinc finger domain | 0.0085 | −6.4 |
| 657 | Rhomboid | Rhomboid family | 0.072 | −20.3 |
| 668 | 7tm_1 | 7 transmembrane receptor (rhodopsin family) | 7.1e−30 | 97.1 |










[0404]

5















TABLE 5








SEQ












ID
PDB
Chain
Start
End
PSI-
Verify
PMF
SeqFold


NO:
ID
ID
AA
AA
BLAST
Score
Score
Score
Compound
PDB annotation

























354
1bhd
A
42
87
8.5e−18
0.00
0.04

UTROPHIN; CHAIN: A, B;
STRUCTURAL PROTEIN CALPONIN












HOMOLOGY, ACTIN BINDING,












STRUCTURAL PROTEIN


354
1bkr
A
41
89
1.7e−20
−0.24
0.28

SPECTRIN BETA CHAIN;
ACTIN-BINDING CALPONIN











CHAIN: A;
HOMOLOGY (CH) DOMAIN;












FILAMENTOUS ACTIN-BINDING












DOMAIN, CYTOSKELETON


354
1dxx
A
26
76
  1e−09
−0.48
0.41

DYSTROPHIN; CHAIN: A,
STRUCTURAL PROTEIN











B, C, D;
DYSTROPHIN, MUSCULAR












DYSTROPHY, CALPONIN












HOMOLOGY DOMAIN, 2 ACTIN-












BINDING, UTROPHIN


354
1dxx
A
42
89
1.5e−16
−0.35
0.11

DYSTROPHIN; CHAIN: A,
STRUCTURAL PROTEIN











B, C, D;
DYSTROPHIN, MUSCULAR












DYSTROPHY, CALPONIN












HOMOLOGY DOMAIN, 2 ACTIN-












BINDING, UTROPHIN


354
1qag
A
42
87
8.5e−18
−0.59
0.09

UTROPHIN ACTIN
STRUCTURAL PROTEIN CALPONIN











BINDING REGION; CHAIN:
HOMOLOGY DOMAIN, DOMAIN











A, B;
SWAPPING, ACTIN BINDING, 2












UTROPHIN, DYSTROPHIN,












STRUCTURAL PROTEIN


358
1fqv
A
28
67
0.005
−0.85
0.43

SKP2; CHAIN: A, C, E, G, I,
LIGASE CYCLIN A/CDK2-











K, M, O; SKP1; CHAIN: B,
ASSOCIATED PROTEIN P45; CYCLIN











D, F, H, J, L, N, P;
A/CDK2-ASSOCIATED PROTEIN P19;












SKP1, SKP2, F-BOX, LRR, LEUCINE-












RICH REPEAT, SCF, UBIQUITIN, 2












E3, UBIQUITIN PROTEIN LIGASE


361
1sfp

38
78
0.0015
−0.73
0.71

ASFP; CHAIN: NULL;
SPERMADHESIN ACIDIC SEMINAL












PROTEIN; SPERMADHESIN, BOVINE












SEMINAL PLASMA PROTEIN,












ACIDIC 2 SEMINAL FLUID PROTEIN,












ASFP, CUB DOMAIN, X-RAY












CRYSTAL 3 STRUCTURE, GROWTH












FACTOR


361
1spp
B
24
78
0.0015
−0.08
0.10

MAJOR SEMINAL PLASMA
COMPLEX (SEMINAL PLASMA











GLYCOPROTEIN PSP-I;
PROTEIN/SPP) SEMINAL PLASMA











CHAIN: A; MAJOR
PROTEINS, SPERMADHESINS, CUB











SEMINAL PLASMA
DOMAIN 2 ARCHITECTURE,











GLYCOPROTEIN PSP-II;
COMPLEX (SEMINAL PLASMA











CHAIN: B
PROTEIN/SPP)


367
1evs
A
29
212
1.8e−78
1.02
1.00

ONCOSTATIN M; CHAIN:
CYTOKINE 4-HELIX BUNDLE, GP130











A;
BINDING CYTOKINE


367
1evs
A
29
212
5.1e−76
1.13
1.00

ONCOSTATIN M; CHAIN:
CYTOKINE 4-HELIX BUNDLE, GP130











A;
BINDING CYTOKINE


370
1ac6
A
27
119
1.3e−15


56.47
T-CELL RECEPTOR
RECEPTOR RECEPTOR, V ALPHA











ALPHA; CHAIN: A, B;
DOMAIN, SITE-DIRECTED












MUTAGENESIS, 2 THREE-












DIMENSIONAL STRUCTURE,












GLYCOPROTEIN, SIGNAL


370
1ao7
D
26
119
7.5e−21


51.88
HLA-A 0201; CHAIN: A;
COMPLEX (MHC/VIRAL











BETA-2 MICROGLOBULIN;
PEPTIDE/RECEPTOR) HLA-A2











CHAIN: B; TAX PEPTIDE;
HEAVY CHAIN; CLASS I MHC, T-











CHAIN: C; T CELL
CELL RECEPTOR, VIRAL PEPTIDE, 2











RECEPTOR ALPHA;
COMPLEX (MHC/VIRAL











CHAIN: D; T CELL
PEPTIDE/RECEPTOR











RECEPTOR BETA; CHAIN:











E;


370
1aqk
L
28
117
5.1e−48
0.35
0.89

FAB B7-15A2; CHAIN: L, H;
IMMUNOGLOBULIN HUMAN FAB,












ANTI-TETANUS TOXOID, HIGH












AFFINITY, CRYSTAL 2 PACKING












MOTIF, PROGRAMMING












PROPENSITY TO CRYSTALLIZE, 3












IMMUNOGLOBULIN


370
1b6d
A
25
114
1.2e−44
0.16
0.60

IMMUNOGLOBULIN;
IMMUNOGLOBULIN











CHAIN: A, B;
IMMUNOGLOBULIN, KAPPA LIGHT-












CHAIN DIMER HEADER


370
1bjl
L
25
114
5.1e−46
0.37
0.63

FAB FRAGMENT; CHAIN:
COMPLEX (ANTIBODY/ANTIGEN)











L, H, J, K; VASCULAR
FAB-12; VEGF; COMPLEX











ENDOTHELIAL GROWTH
(ANTIBODY/ANTIGEN),











FACTOR; CHAIN: V, W;
ANGIOGENIC FACTOR


370
1bjm
A
27
116
5.1e−45
0.13
0.83

LOC-LAMBDA 1 TYPE
IMMUNOGLOBULIN BENCE-JONES











LIGHT-CHAIN DIMER;
PROTEIN; 1BJM 8 BENCE JONES,











1BJM 6 CHAIN: A, B; 1BJM 7
ANTIBODY, MULTIPLE












QUATERNARY STRUCTURES 1BJM












13


370
1bww
A
23
114
1.7e−45
0.27
0.31

IG KAPPA CHAIN V-I
IMMUNE SYSTEM REIV,











REGION REI; CHAIN: A, B;
STABILIZED IMMUNOGLOBULIN












FRAGMENT, BENCE-JONES 2












PROTEIN, IMMUNE SYSTEM


370
1dee
A
25
114
3.4e−47
0.25
0.48

IGM RF 2A2; CHAIN: A, C,
IMMUNE SYSTEM FAB-IBP











E; IGM RF 2A2; CHAIN: B,
COMPLEX CRYSTAL STRUCTURE











D, F; IMMUNOGLOBULIN
2.7A RESOLUTION BINDING 2











G BINDING PROTEIN A;
OUTSIDE THE ANTIGEN











CHAIN: G, H;
COMBINING SITE SUPERANTIGEN












FAB VH3 3 SPECIFICITY


370
1dfb
L
25
119
8.5e−47
0.50
0.64

IMMUNOGLOBULIN 3D6











FAB 1DFB 3


370
1fgv
L
25
114
1.4e−45
0.21
0.53

IMMUNOGLOBULIN FV











FRAGMENT OF A











HUMANIZED VERSION OF











THE ANTI-CD18 1FGV 3











ANTIBODY ’H52’ (HUH52-











AA FV) 1FGV 4


370
2fb4
L
26
117
1.2e−44
0.35
0.82

IMMUNOGLOBULIN











IMMUNOGLOBULIN FAB











2FB4 4


370
2fgw
L
25
114
1.7e−45
0.32
0.77

IMMUNOGLOBULIN FAB











FRAGMENT OF A











HUMANIZED VERSION OF











THE ANTI-CD18 2FGW 3











ANTIBODY ’H52’ (HUH52-











OZ FAB) 2FGW 4


373
1cru
A
169
400
1.5e−46
0.13
0.28

SOLUBLE QUINOPROTEIN
OXIDOREDUCTASE BETA-











GLUCOSE
PROPELLER, SUPERBARREL,











DEHYDROGENASE;
COMPLEX WITH THE COFACTOR











CHAIN: A, B;
PQQ 2 AND THE INHIBITOR












METHYLHYDRAZINE,












OXIDOREDUCTASE


373
1cru
A
186
404
7.5e−49
0.01
0.27

SOLUBLE QUINOPROTEIN
OXIDOREDUCTASE BETA-











GLUCOSE
PROPELLER, SUPERBARREL,











DEHYDROGENASE;
COMPLEX WITH THE COFACTOR











CHAIN: A, B;
PQQ 2 AND THE INHIBITOR












METHYLHYDRAZINE,












OXIDOREDUCTASE


384
1c0t
A
174
431
1.7e−65
−0.24
0.25

HIV-1 REVERSE
TRANSFERASE HIV-1 REVERSE











TRANSCRIPTASE (A-
TRANSCRIPTASE, AIDS, NON-











CHAIN); CHAIN: A; HIV-1
NUCLEOSIDE INHIBITOR, 2 DRUG











REVERSE
DESIGN











TRANSCRIPTASE (B-











CHAIN); CHAIN: B;


384
1c0t
B
176
431
  1e−62
−0.31
0.23

HIV-1 REVERSE
TRANSFERASE HIV-1 REVERSE











TRANSCRIPTASE (A-
TRANSCRIPTASE, AIDS, NON-











CHAIN); CHAIN: A; HIV-1
NUCLEOSIDE INHIBITOR, 2 DRUG











REVERSE
DESIGN











TRANSCRIPTASE (B-











CHAIN); CHAIN: B;


384
1c1c
B
175
431
  1e−74
−0.12
0.39

HIV-1 REVERSE
TRANSFERASE HIV-1 REVERSE











TRANSCRIPTASE (A-
TRANSCRIPTASE, AIDS, NON-











CHAIN); CHAIN: A; HIV-1
NUCLEOSIDE INHIBITOR, 2 DRUG











REVERSE
DESIGN











TRANSCRIPTASE (B-











CHAIN); CHAIN: B;


384
1c9r
A
171
431
  1e−70
−0.08
0.94

HIV-1 REVERSE
TRANSFERASE/IMMUNE











TRANSCRIPTASE (CHAIN
SYSTEM/DNA HIV-1 RT; HIV-1 RT;











A); CHAIN: A; HIV-1
HIV, REVERSE TRANSCRIPTASE,











REVERSE
MET184ILE, 3TC, PROTEIN-DNA 2











TRANSCRIPTASE (CHAIN
COMPLEX, DRUG RESISTANCE,











B); CHAIN: B; ANTIBODY
M184I, TRANSFERASE/IMMUNE 3











(LIGHT CHAIN); CHAIN: L;
SYSTEM/DNA











ANTIBODY (HEAVY











CHAIN); CHAIN: H; DNA











(5’-CHAIN: T; DNA (5’-











CHAIN: P;


384
1c9r
B
171
431
1.7e−79
−0.14
0.59

HIV-1 REVERSE
TRANSFERASE/IMMUNE











TRANSCRIPTASE (CHAIN
SYSTEM/DNA HIV-1 RT; HIV-1 RT;











A); CHAIN: A; HIV-1
HIV, REVERSE TRANSCRIPTASE,











REVERSE
MET184ILE, 3TC, PROTEIN-DNA 2











TRANSCRIPTASE (CHAIN
COMPLEX, DRUG RESISTANCE,











B); CHAIN: B; ANTIBODY
M184I, TRANSFERASE/IMMUNE 3











(LIGHT CHAIN); CHAIN: L;
SYSTEM/DNA











ANTIBODY (HEAVY











CHAIN); CHAIN: H; DNA











(5’-CHAIN: T; DNA (5’-











CHAIN: P;


384
1mml

154
396
5.1e−50


116.10
MMLV REVERSE
REVERSE TRANSCRIPTASE











TRANSCRIPTASE; 1MML 4











CHAIN: NULL; 1MML 5


384
1rth
A
171
431
3.4e−86
−0.12
0.74

HIV-1 REVERSE
NUCLEOTIDYLTRANSFERASE HIV-











TRANSCRIPTASE; 1RTH 4
1 RT; 1RTH 6 HIV-1 REVERSE











CHAIN: A, B; 1RTH 5
TRANSCRIPTASE 1RTH 15


384
1rth
B
173
431
  1e−75
−0.09
0.23

HIV-1 REVERSE
NUCLEOTIDYLTRANSFERASE HIV-











TRANSCRIPTASE; 1RTH 4
1 RT; 1RTH 6 HIV-1 REVERSE











CHAIN: A, B; 1RTH 5
TRANSCRIPTASE 1RTH 15


384
1vrt
A
174
431
1.7e−85
−0.26
0.40

HIV-1 REVERSE
NUCLEOTIDYLTRANSFERASE HIV-











TRANSCRIPTASE; 1VRT 4
1 RT; 1VRT 6 HIV-1 REVERSE











CHAIN: A, B; 1VRT 5
TRANSCRIPTASE 1VRT 15


384
1vrt
B
175
431
1.7e−75
−0.20
0.11

HIV-1 REVERSE
NUCLEOTIDYLTRANSFERASE HIV-











TRANSCRIPTASE; 1VRT 4
1 RT; 1VRT 6 HIV-1 REVERSE











CHAIN: A, B; 1VRT 5
TRANSCRIPTASE 1VRT 15


384
3hvt
B
172
431
3.4e−74
−0.14
0.00

NUCLEOTIDYLTRANSFERASE











REVERSE











TRANSCRIPTASE











(E.C.2.7.7.49) 3HVT 3


388
1aut
L
47
75
0.00068
−0.18
0.42

ACTIVATED PROTEIN C;
COMPLEX (BLOOD











CHAIN: C, L; D-PHE-PRO-
COAGULATION/INHIBITOR)











MAI; CHAIN: P;
AUTOPROTHROMBIN IIA;












HYDROLASE, SERINE












PROTEINASE), PLASMA CALCIUM












BINDING, 2 GLYCOPROTEIN,












COMPLEX (BLOOD












COAGULATION/INHIBITOR)


388
1diy
A
46
77
0.00068
0.69
0.25

PROSTAGLANDIN H2
OXIDOREDUCTASE ARACHIDONIC











SYNTHASE-1; CHAIN: A;
ACID, MEMBRANE PROTEIN,












PEROXIDASE, DIOXYGENASE


388
1fsb

46
75
0.0034
1.08
0.34

P-SELECTIN; CHAIN:
CELL ADHESION PROTEIN EGF-











NULL;
LIKE DOMAIN, CELL ADHESION












PROTEIN, TRANSMEMBRANE, 2












GLYCOPROTEIN


395
1mgl
A
260
376
3.4e−28
−0.94
0.06

HTLV-1 GP21
LEUKEMIA VIRUS TYPE 1 HUMAN T











ECTODOMAIN/MALTOSE-
CELL LEUKEMIA VIRUS TYPE 1,











BINDING PROTEIN CHAIN:
HTLV-1, ENVELOPE 2 PROTEIN,











A;
MEMBRANE FUSION, MALTOSE-












BINDING PROTEIN CHIMERA


395
2ebo
A
304
376
5.1e−22
−0.56
0.21

EBOLA VIRUS ENVELOPE
ENVELOPE GLYCOPROTEIN











GLYCOPROTEIN; CHAIN:
ENVELOPE GLYCOPROTEIN,











A, B, C;
FILOVIRUS, EBOLA VIRUS, GP2,












COAT 2 PROTEIN


423
1a9n
A
27
164
2.5e−18
0.22
0.69

U2 RNA HAIRPIN IV;
COMPLEX (NUCLEAR











CHAIN: Q, R; U2 A′; CHAIN:
PROTEIN/RNA) COMPLEX











A, C; U2 B″; CHAIN: B, D;
(NUCLEAR PROTEIN/RNA), RNA,












SNRNP, RIBONUCLEOPROTEIN


423
1a9n
A
54
188
  5e−24
0.30
0.48

U2 RNA HAIRPIN IV;
COMPLEX (NUCLEAR











CHAIN: Q, R; U2 A′; CHAIN:
PROTEIN/RNA) COMPLEX











A, C; U2 B″; CHAIN: B, D;
(NUCLEAR PROTEIN/RNA), RNA,












SNRNP, RIBONUCLEOPROTEIN


423
1a9n
C
27
164
7.5e−18
0.38
0.96

U2 RNA HAIRPIN IV;
COMPLEX (NUCLEAR











CHAIN: Q, R; U2 A′; CHAIN:
PROTEIN/RNA) COMPLEX











A, C; U2 B″; CHAIN: B, D;
(NUCLEAR PROTEIN/RNA), RNA,












SNRNP, RIBONUCLEOPROTEIN


423
1a9n
C
54
188
1.5e−23
0.46
0.53

U2 RNA HAIRPIN IV;
COMPLEX (NUCLEAR











CHAIN: Q, R; U2 A′; CHAIN:
PROTEIN/RNA) COMPLEX











A, C; U2 B″; CHAIN: B, D;
(NUCLEAR PROTEIN/RNA), RNA,












SNRNP, RIBONUCLEOPROTEIN


423
1d0b
A
70
237
1.7e−21
−0.00
0.41

INTERNALIN B; CHAIN: A;
CELL ADHESION LEUCINE RICH












REPEAT, CALCIUM BINDING, CELL












ADHESION


423
1dce
A
98
218
1.2e−09
−0.43
0.30

RAB
TRANSFERASE CRYSTAL











GERANYLGERANYLTRANSFERASE
STRUCTURE, RAB











ALPHA
GERANYLGERANYLTRANSFERASE,











SUBUNIT; CHAIN: A, C;
2.0 A 2 RESOLUTION, N-











RAB
FORMYLMETHIONINE, ALPHA











GERANYLGERANYLTRANSFERASE
SUBUNIT, BETA SUBUNIT











BETA SUBUNIT;











CHAIN: B, D;


423
1ds9
A
55
178
2.5e−17
−0.29
0.06

OUTER ARM DYNEIN;
CONTRACTILE PROTEIN LEUCINE-











CHAIN: A;
RICH REPEAT, BETA-BETA-ALPHA












CYLINDER, DYNEIN, 2












CHLAMYDOMONAS, FLAGELLA


423
2bnh

34
183
  1e−21
0.28
−0.03

RIBONUCLEASE
ACETYLATION RNASE INHIBITOR,











INHIBITOR; CHAIN: NULL;
RIBONUCLEASE/ANGIOGENIN












INHIBITOR ACETYLATION,












LEUCINE-RICH REPEATS


429
1a0h
A
30
150
2.5e−29
0.37
0.65

MEIZOTHROMBIN; CHAIN:
COMPLEX (SERINE











A, B, D, E; D-PHE-PRO-
PROTEASE/INHIBITOR) DESF1;











ARG; CHAIN: C, F;
PPACK; SERINE PROTEASE,












COAGULATION, THROMBIN,












PROTHROMBIN, 2












MEIZOTHROMBIN, COMPLEX












(SERINE PROTEASE/INHIBITOR)


429
1a0h
A
30
169
6.8e−10
0.28
0.76

MEIZOTHROMBIN; CHAIN:
COMPLEX (SERINE











A, B, D, E; D-PHE-PRO-
PROTEASE/INHIBITOR) DESF1;











ARG; CHAIN: C, F;
PPACK; SERINE PROTEASE,












COAGULATION, THROMBIN,












PROTHROMBIN, 2












MEIZOTHROMBIN, COMPLEX












(SERINE PROTEASE/INHIBITOR)


429
1a0h
A
30
201
2.5e−29


82.71
MEIZOTHROMBIN; CHAIN:
COMPLEX (SERINE











A, B, D, E; D-PHE-PRO-
PROTEASE/INHIBITOR) DESF1;











ARG; CHAIN: C, F;
PPACK; SERINE PROTEASE,












COAGULATION, THROMBIN,












PROTHROMBIN, 2












MEIZOTHROMBIN, COMPLEX












(SERINE PROTEASE/INHIBITOR)


429
1b2i
A
32
120
7.5e−26


72.58
PLASMINOGEN; CHAIN: A;
HYDROLASE SERINE PROTEASE,












FIBRINOLYSIS, LYSINE-BINDING












DOMAIN, 2 PLASMINOGEN,












KRINGLE 2, HYDROLASE


429
1b2i
A
34
119
7.5e−26
0.90
0.81

PLASMINOGEN; CHAIN: A;
HYDROLASE SERINE PROTEASE,












FIBRINOLYSIS, LYSINE-BINDING












DOMAIN, 2 PLASMINOGEN,












KRINGLE 2, HYDROLASE


429
1cea
A
35
119
1e−24


68.58
PLASMINOGEN; 1CEA 7
SERINE PROTEASE K1PG; 1CEA 10











CHAIN: A, B; 1CEA 8


429
1kdu

35
120
2.5e−28


71.09
PLASMINOGEN











ACTIVATION











PLASMINOGEN











ACTIVATOR (UROKINASE-











TYPE, KRINGLE DOMAIN)











1KDU 3 (U-PA K) (NMR,











MINIMIZED AVERAGE











STRUCTURE) 1KDU 4


429
1kdu

36
119
2.5e−28
0.91
0.96

PLASMINOGEN











ACTIVATION











PLASMINOGEN











ACTIVATOR (UROKINASE-











TYPE, KRINGLE DOMAIN)











1KDU 3 (U-PA K) (NMR,











MINIMIZED AVERAGE











STRUCTURE) 1KDU 4


429
1krn

35
119
  5e−22


76.76
PLASMINOGEN; CHAIN:
SERINE PROTEASE KRINGLE,











NULL;
BLOOD, PLASMINOGEN, SERINE












PROTEASE


429
1pml
A
34
119
1.3e−28
0.89
1.00

HYDROLASE(SERINE











PROTEASE) TISSUE











PLASMINOGEN











ACTIVATOR KRINGLE 2











(E.C.3.4.21.68) 1PML 3


429
1pml
A
34
121
1.3e−28


86.47
HYDROLASE(SERINE











PROTEASE) TISSUE











PLASMINOGEN











ACTIVATOR KRINGLE 2











(E.C.3.4.21.68) 1PML 3


429
1pml
C
34
119
  1e−28
0.94
0.96

HYDROLASE(SERINE











PROTEASE) TISSUE











PLASMINOGEN











ACTIVATOR KRINGLE 2











(E.C.3.4.21.68) 1PML 3


429
1pml
C
34
120
  1e−28


86.67
HYDROLASE(SERINE











PROTEASE) TISSUE











PLASMINOGEN











ACTIVATOR KRINGLE 2











(E.C.3.4.21.68) 1PML 3


429
1sfp

218
329
2.5e−17
1.13
0.99

ASFP; CHAIN: NULL;
SPERMADHESIN ACIDIC SEMINAL












PROTEIN; SPERMADHESIN, BOVINE












SEMINAL PLASMA PROTEIN,












ACIDIC 2 SEMINAL FLUID PROTEIN,












ASFP, CUB DOMAIN, X-RAY












CRYSTAL 3 STRUCTURE, GROWTH












FACTOR


429
1sfp

238
327
3.4e−07
0.62
0.09

ASFP; CHAIN: NULL;
SPERMADHESIN ACIDIC SEMINAL












PROTEIN; SPERMADHESIN, BOVINE












SEMINAL PLASMA PROTEIN,












ACIDIC 2 SEMINAL FLUID PROTEIN,












ASFP, CUB DOMAIN, X-RAY












CRYSTAL 3 STRUCTURE, GROWTH












FACTOR


429
1spp
A
218
323
2.5e−16
0.67
0.11

MAJOR SEMINAL PLASMA
COMPLEX (SEMINAL PLASMA











GLYCOPROTEIN PSP-I;
PROTEIN/SPP) SEMINAL PLASMA











CHAIN: A; MAJOR
PROTEINS, SPERMADHESINS, CUB











SEMINAL PLASMA
DOMAIN 2 ARCHITECTURE,











GLYCOPROTEIN PSP-II;
COMPLEX (SEMINAL PLASMA











CHAIN: B
PROTEIN/SPP)


429
1spp
B
218
323
2.5e−15
0.62
−0.07

MAJOR SEMINAL PLASMA
COMPLEX (SEMINAL PLASMA











GLYCOPROTEIN PSP-I;
PROTEIN/SPP) SEMINAL PLASMA











CHAIN: A; MAJOR
PROTEINS, SPERMADHESINS, CUB











SEMINAL PLASMA
DOMAIN 2 ARCHITECTURE,











GLYCOPROTEIN PSP-II;
COMPLEX (SEMINAL PLASMA











CHAIN: B
PROTEIN/SPP)


429
1spp
B
245
328
6.8e−06
0.38
0.09

MAJOR SEMINAL PLASMA
COMPLEX (SEMINAL PLASMA











GLYCOPROTEIN PSP-I;
PROTEIN/SPP) SEMINAL PLASMA











CHAIN: A; MAJOR
PROTEINS, SPERMADHESINS, CUB











SEMINAL PLASMA
DOMAIN 2 ARCHITECTURE,











GLYCOPROTEIN PSP-II;
COMPLEX (SEMINAL PLASMA











CHAIN: B
PROTEIN/SPP)


429
1urk

1
123
2.2e−23


69.71
PLASMINOGEN











ACTIVATION











PLASMINOGEN











ACTIVATOR (UROKINASE-











TYPE) (AMINO TERMINAL











FRAGMENT) (NMR, 15











STRUCTURES)


429
2hpp
P
36
119
  5e−25


66.47
HYDROLASE(SERINE











PROTEINASE) ALPHA-











THROMBIN (E.C.3.4.21.5)











COMPLEX WITH 2HPP 3 D-











PHE-PRO-ARG-











CHLOROMETHYLKETONE











(PPACK)











CHLOROMETHYLKETONE











2HPP 4 REPLACED BY A











METHYLENE GROUP AND











BOVINE PROTHROMBIN











2HPP 5 FRAGMENT 2 2HPP 6


429
2hpp
P
36
119
  5e−25
0.71
0.39

HYDROLASE(SERINE











PROTEINASE) ALPHA-











THROMBIN (E.C.3.4.21.5)











COMPLEX WITH 2HPP 3 D-











PHE-PRO-ARG-











CHLOROMETHYLKETONE











(PPACK)











CHLOROMETHYLKETONE











2HPP 4 REPLACED BY A











METHYLENE GROUP AND











BOVINE PROTHROMBIN











2HPP 5 FRAGMENT 2 2HPP 6


429
2hpq
P
36
119
1.2e−24


60.25
HYDROLASE(SERINE











PROTEINASE) ALPHA-











THROMBIN (E.C.3.4.21.5)











COMPLEX WITH 2HPQ 3 D-











PHE-PRO-ARG-











CHLOROMETHYLKETONE











(PPACK)











CHLOROMETHYLKETONE











2HPQ 4 REPLACED BY A











METHYLENE GROUP AND











HUMAN PROTHROMBIN











2HPQ 5 FRAGMENT 2











2HPQ 6


429
2pf1

20
119
2.3e−25
0.85
0.71

HYDROLASE(SERINE











PROTEINASE)











PROTHROMBIN











FRAGMENT 1











(RESIDUES 1-156)











2PF1 3


429
2pf1

5
131
2.3e−25


58.78
HYDROLASE(SERINE











PROTEINASE)











PROTHROMBIN











FRAGMENT 1











(RESIDUES 1-156)











2PF1 3


429
2pf2

35
119
2.5e−25
0.82
0.77

HYDROLASE(SERINE











PROTEASE)











PROTHROMBIN











FRAGMENT 1











(RESIDUES 1-156)











COMPLEX WITH











2PF2 3 CALCIUM 2PF2 4


429
3kiv

35
119
  5e−27


76.56
APOLIPOPROTEIN; CHAIN:
KRINGLE KRINGLE, LYSINE











NULL;
BINDING SITE,












APOLIPOPROTEIN(A)


429
3kiv

35
119
  5e−27
0.76
0.87

APOLIPOPROTEIN; CHAIN:
KRINGLE KRINGLE, LYSINE











NULL;
BINDING SITE,












APOLIPOPROTEIN(A)


429
5hpg
A
35
122
  1e−26


77.19
PLASMINOGEN; CHAIN: A,
SERINE PROTEASE SERINE











B;
PROTEASE, KRINGLE 5, HUMAN












PLASMINOGEN, FIBRINOLYSIS


429
5hpg
A
35
122
  1e−26
0.65
0.70

PLASMINOGEN; CHAIN: A,
SERINE PROTEASE SERINE











B;
PROTEASE, KRINGLE 5, HUMAN












PLASMINOGEN, FIBRINOLYSIS


429
9wga
A
21
168
1.7e−13
0.15
−0.12

LECTIN (AGGLUTININ)











WHEAT GERM











AGGLUTININ (ISOLECTIN











2) 9WGA 3


429
9wga
A
51
234
3.4e−10
0.16
−0.19

LECTIN (AGGLUTININ)











WHEAT GERM











AGGLUTININ (ISOLECTIN











2) 9WGA 3


448
1a14
L
20
126
3.4e−25


55.31
NEURAMINIDASE; CHAIN:
COMPLEX (ANTIBODY/ANTIGEN)











N; SINGLE CHAIN
COMPLEX (ANTIBODY/ANTIGEN),











ANTIBODY; CHAIN: H, L;
SINGLE-CHAIN ANTIBODY, 2












GLYCOSYLATED PROTEIN


448
1a2y
A
20
126
5.1e−27


54.70
MONOCLONAL
COMPLEX











ANTIBODY D1.3; CHAIN:
(IMMUNOGLOBULIN/HYDROLASE)











A, B; LYSOZYME; CHAIN:
COMPLEX











C;
(IMMUNOGLOBULIN/HYDROLASE),












IMMUNOGLOBULIN V 2 REGION,












SIGNAL, HYDROLASE,












GLYCOSIDASE, BACTERIOLYTIC 3












ENZYME, EGG WHITE


448
1a7q
L
20
136
1.5e−25


52.86
MONOCLONAL
IMMUNOGLOBULIN











ANTIBODY D1.3; CHAIN:
IMMUNOGLOBULIN, VARIANT











L, H;


448
1ao7
E
22
142
3.4e−46
−0.08
0.06

HLA-A 0201; CHAIN: A;
COMPLEX (MHC/VIRAL











BETA-2 MICROGLOBULIN;
PEPTIDE/RECEPTOR) HLA-A2











CHAIN: B; TAX PEPTIDE;
HEAVY CHAIN; CLASS I MHC, T-











CHAIN: C; T CELL
CELL RECEPTOR, VIRAL PEPTIDE, 2











RECEPTOR ALPHA;
COMPLEX (MHC/VIRAL











CHAIN: D; T CELL
PEPTIDE/RECEPTOR











RECEPTOR BETA; CHAIN:











E;


448
1ap2
A
20
128
5.1e−30


51.96
MONOCLONAL
IMMUNOGLOBULIN VARIABLE











ANTIBODY C219; CHAIN:
DOMAIN; SINGLE CHAIN FV,











A, B, C, D;
MONOCLONAL ANTIBODY, C219, P-












GLYCOPROTEIN, 2












IMMUNOGLOBULIN


448
1ar1
D
20
136
3.4e−26


52.90
CYTOCHROME C
COMPLEX











OXIDASE; CHAIN: A, B;
(OXIDOREDUCTASE/ANTIBODY)











ANTIBODY FV
CYTOCHROME AA3, COMPLEX IV,











FRAGMENT; CHAIN: C, D;
FERROCYTOCHROME C, COMPLEX












(OXIDOREDUCTASE/ANTIBODY),












ELECTRON TRANSPORT, 2












TRANSMEMBRANE, CYTOCHROME












OXIDASE, ANTIBODY COMPLEX


448
1b0w
A
20
127
5.1e−27


55.43
BENCE-JONES KAPPA 1
IMMUNE SYSTEM BENCE-JONES;











PROTEIN BRE; CHAIN: A,
IMMUNOGLOBULIN, AMYLOID,











B, C;
IMMUNE SYSTEM


448
1bd2
E
22
160
1.7e−48
−0.10
0.07

HLA-A 0201; CHAIN: A;
COMPLEX (MHC/VIRAL











BETA-2 MICROGLOBULIN;
PEPTIDE/RECEPTOR) HLA A2











CHAIN: B; TAX PEPTIDE;
HEAVY CHAIN; COMPLEX











CHAIN: C; T CELL
(MHC/VIRAL PEPTIDE/RECEPTOR)











RECEPTOR ALPHA;











CHAIN: D; T CELL











RECEPTOR BETA; CHAIN:











E;


448
1bec

23
143
1.7e−46
0.27
0.30

14.3.D T CELL ANTIGEN
RECEPTOR T CELL RECEPTOR 1BEC











RECEPTOR; 1BEC 6
14











CHAIN: NULL; 1BEC 6


448
1bfv
L
20
127
1.7e−25


51.18
FV4155; CHAIN: L, H;
IMMUNOGLOBULIN












IMMUNOGLOBULIN, FV












FRAGMENT, STEROID HORMONE, 2












FINE SPECIFICITY


448
1bvk
A
20
127
1.2e−29


57.64
HULYS11; CHAIN: A, B, D,
COMPLEX (HUMANIZED











E; LYSOZYME; CHAIN: C,
ANTIBODY/HYDROLASE)











F;
MURAMIDASE; HUMANIZED












ANTIBODY, ANTIBODY COMPLEX,












FV, ANTI-LYSOZYME, 2 COMPLEX












(HUMANIZED












ANTIBODY/HYDROLASE)


448
1bwm
A
23
138
3.4e−45
0.06
0.12

ALPHA-BETA T CELL
IMMUNE SYSTEM











RECEPTOR (TCR) (D10);
IMMUNOGLOBULIN,











CHAIN: A;
IMMUNORECEPTOR, IMMUNE












SYSTEM


448
1bww
A
18
126
  1e−28


52.26
IG KAPPA CHAIN V-I
IMMUNE SYSTEM REIV,











REGION RE1; CHAIN: A, B;
STABILIZED IMMUNOGLOBULIN












FRAGMENT, BENCE-JONES 2












PROTEIN, IMMUNE SYSTEM


448
1d9k
B
23
138
3.4e−45
0.12
0.34

T-CELL RECEPTOR D10
IMMUNE SYSTEM MHC I-AK; MHC











(ALPHA CHAIN); CHAIN:
I-AK; T-CELL RECEPTOR, MHC











A, E; T-CELL RECEPTOR
CLASS II, D10, I-AK











D10 (BETA CHAIN);











CHAIN: B, F; MHC I-AK A











CHAIN (ALPHA CHAIN);











CHAIN: C, G; MHC I-AK B











CHAIN (BETA CHAIN);











CHAIN: D, H;











CONALBUMIN PEPTIDE;











CHAIN: P, Q;


448
1dlf
L
20
127
8.5e−26


54.31
ANTI-DANSYL
IMMUNOGLOBULIN ANTI-DANSYL











IMMUNOGLOBULIN
FV FRAGMENT FV FRAGMENT,











IGG2A(S); CHAIN: L, H;
IMMUNOGLOBULIN


448
1dsf
L
20
129
5.1e−23


53.77
ANTICANCER ANTIBODY
IMMUNOGLOBULIN B1DSFV;











B1; CHAIN: L, H;
MONOCLONAL ANTIBODY,












ANTITUMOR, IMMUNOGLOBULIN


448
1f11
A
20
159
6.8e−34
−0.00
0.07

F124 IMMUNOGLOBULIN
IMMUNE SYSTEM











(KAPPA LIGHT CHAIN);
IMMUNOGLOBULIN, ANTIBODY,











CHAIN: A, C; F124
FAB, HEPATITIS B, PRES2











IMMUNOGLOBULIN (IGG1











HEAVY CHAIN); CHAIN: B,











D;


448
1fgv
L
20
136
1.7e−31


55.11
IMMUNOGLOBULIN FV











FRAGMENT OF A











HUMANIZED VERSION OF











THE ANTI-CD18 1FGV 3











ANTIBODY ’H52’ (HUH52-











AAFV) 1FGV 4


448
1fvc
A
20
128
6.8e−31


54.26
IMMUNOGLOBULIN FV











FRAGMENT OF











HUMANIZED ANTIBODY











4D5, VERSION 8 1FVC 3


448
1fyt
E
22
160
6.8e−44
0.05
0.10

HLA CLASS II
IMMUNE SYSTEM HLA-DR1, DRA;











HISTOCOMPATIBILITY
HLA-DR1, DRB1 0101; TCR HA1.7











ANTIGEN, DR CHAIN: A;
ALPHA CHAIN; TCR HA1.7 BETA











HLA CLASS II
CHAIN; PROTEIN-PROTEIN











HISTOCOMPATIBILITY
COMPLEX, IMMUNOGLOBULIN











ANTIGEN, DR-1 CHAIN: B;
FOLD











HEMAGGLUTININ HA1











PEPTIDE CHAIN; CHAIN:











C; T-CELL RECEPTOR











ALPHA CHAIN; CHAIN: D;











T-CELL RECEPTOR BETA











CHAIN; CHAIN: E;


448
1igm
L
20
134
5.1e−30


57.34
IMMUNOGLOBULIN











IMMUNOGLOBULIN M











(IG-M) FV FRAGMENT











1IGM 3


448
1ivl
A
20
126
  1e−24


60.97
IMMUNOGLOBULIN











IMMUNOGLOBULIN VL











DOMAIN (VARIABLE











DOMAIN OF KAPPA LIGHT











1IVL 3 CHAIN) OF











DESIGNED ANTIBODY











M29B 1IVL 4


448
1jhl
L
20
127
3.4e−28


57.95
COMPLEX(ANTIBODY-











ANTIGEN) FV FRAGMENT











(IGG1, KAPPA) (LIGHT











AND HEAVY VARIABLE











DOMAINS 1JHL 3 NON-











COVALENTLY











ASSOCIATED) OF











MONOCLONAL ANTI-HEN











EGG 1JHL 4 LYSOZYME











ANTIBODY D11.15











COMPLEX WITH











PHEASANT EGG 1JHL 5











LYSOZYME 1JHL 6


448
1kb5
B
21
136
1.7e−33


50.75
KB5-C20 T-CELL ANTIGEN
COMPLEX











RECEPTOR; CHAIN: A, B;
(IMMUNOGLOBULIN/RECEPTOR)











ANTIBODY DESIRE-1;
TCR VALPHA VBETA DOMAIN; T-











CHAIN: L, H;
CELL RECEPTOR, STRAND SWITCH,












FAB, ANTICLONOTYPIC, 2












(IMMUNOGLOBULIN/RECEPTOR)


448
1maj

20
127
6.8e−24


50.59
IMMUNOGLOBULIN











MURINE ANTIBODY 26-10











VL DOMAIN (NMR, 15











ENERGY MINIMIZED











1MAJ 3 STRUCTURES)











1MAJ 4


448
1nfd
B
20
143
  1e−45
0.08
0.27

N15 ALPHA-BETA T-CELL
COMPLEX











RECEPTOR; CHAIN: A, B,
(IMMUNORECEPTOR/












IMMUNOGLOBULIN)











C, D; H57 FAB; CHAIN: E, F,
COMPLEX











G, H
(IMMUNORECEPTOR/












IMMUNOGLOBULIN)


448
1nmb
L
20
128
8.5e−27


58.89
N9 NEURAMINIDASE;
COMPLEX











1NMB 4 CHAIN: N; 1NMB 5
(HYDROLASE/IMMUNOGLOBULIN)











FAB NC10; 1NMB 9 CHAIN:











L, H; 1NMB 10


448
1rvf
L
20
130
5.1e−26


54.02
HUMAN RHINOVIRUS 14
COMPLEX (COAT











COAT PROTEIN; CHAIN: 1,
PROTEIN/IMMUNOGLOBULIN)











2,3,4; FAB 17-IA; CHAIN:
POLYPROTEIN, COAT PROTEIN,











L, H
CORE PROTEIN, RNA-DIRECTED












RNA 2 POLYMERASE, HYDROLASE,












THIOL PROTEASE,












MYRISTYLATION, 3 COMPLEX












(COAT












PROTEIN/IMMUNOGLOBULIN)


448
1sbs
L
20
159
1.2e−33
0.10
0.33

MONOCLONAL
MONOCLONAL ANTIBODY











ANTIBODY 3A2; CHAIN: H,
MONOCLONAL ANTIBODY, FAB-











L;
FRAGMENT, REPRODUCTION


448
1tcr
B
20
143
  1e−45
0.06
0.17

ALPHA, BETA T-CELL
RECEPTOR TCR; T-CELL,











RECEPTOR CHAIN: A, B;
RECEPTOR, TRANSMEMBRANE,












GLYCOPROTEIN, SIGNAL


448
1wtl
A
20
127
6.8e−28


54.08
IMMUNOGLOBULIN WAT,











A VARIABLE DOMAIN











FROM IMMUNOGLOBULIN











LIGHT-CHAIN 1WTL 3











(BENCE-JONES PROTEIN)











1WTL 4


448
2rhe

21
130
1.7e−24


52.52
IMMUNOGLOBULIN











BENCE-JONES PROTEIN











(LAMBDA, VARIABLE











DOMAIN) 2RHE 4


449
1fo1
A
1
53
0.00012
−0.34
0.12

NUCLEAR RNA EXPORT
RNA BINDING PROTEIN TAP











FACTOR 1; CHAIN: A, B;
(NFX1); RIBONUCLEOPROTEIN












(RNP, RBD OR RRM) AND LEUCINE-












RICH-REPEAT 2 (LRR)


454
1dt6
A
60
248
8.5e−52
−0.41
0.05

CYTOCHROME P450 2C5;
OXIDOREDUCTASE











CHAIN: A;
PROGESTERONE 21-












HYDROXYLASE, CYPIIC5 P450 1,












MEMBRANE PROTEIN,












PROGESTERONE 21-












HYDROXYLASE, BENZO(A) 2












PYRENE HYDROXYLASE,












ESTRADIOL 2-HYDROXYLASE,












P450, CYP2C5


459
1b0u
A
129
254
5.1e−24
0.37
0.36

HISTIDINE PERMEASE;
TRANSPORT PROTEIN ABC











CHAIN: A;
TRANSPORTER, HISP; ABC












TRANSPORTER, HISTIDINE












PERMEASE, TRANSPORT PROTEIN


459
1f2u
A
141
175
0.0025
−0.78
0.09

RAD50 ABC-ATPASE;
REPLICATION DNA DOUBLE-











CHAIN: A, C; RAD50 ABC-
STRAND BREAK REPAIR, ABC-











ATPASE; CHAIN: B, D;
ATPASE


459
1f2u
A
160
213
0.0048
−0.91
0.12

RAD50 ABC-ATPASE;
REPLICATION DNA DOUBLE-











CHAIN: A, C; RAD50 ABC-
STRAND BREAK REPAIR, ABC-











ATPASE; CHAIN: B, D;
ATPASE


459
1g29
1
142
253
3.4e−21
−0.28
0.34

MALTOSE TRANSPORT
SUGAR BINDING PROTEIN MALK;











PROTEIN MALK; CHAIN: 1,
ATPASE, ACTIVE TRANSPORT,











2;
MALTOSE UPTAKE AND












REGULATION


459
1gky

158
184
0.0027
−0.82
0.28

TRANSFERASE











GUANYLATE KINASE











(E.C.2.7.4.8) COMPLEX











WITH 1GKY 3 GUANOSINE











MONOPHOSPHATE 1GKY 4


461
1e3y
A
103
159
0.0025
0.21
0.52

FADD PROTEIN; CHAIN: A;
APOPTOSIS FAS-ASSOCIATING












DEATH DOMAIN-CONTAINING












PROTEIN; DEATH DOMAIN,












ADAPTER MOLECULE, FAS












RECEPTOR DEATH INDUCING 2












SIGNALLING COMPLEX


461
1fad
A
103
150
0.00075
0.14
1.00

FADD PROTEIN; CHAIN: A;
APOPTOSIS APOPTOSIS, FADD,












DEATH DOMAIN


461
1lrv

19
202
0.0018
0.32
0.03

LEUCINE-RICH REPEAT
LEUCINE-RICH REPEATS LRV;











VARIANT; CHAIN: NULL;
LEUCINE-RICH REPEATS,












REPETITIVE STRUCTURE, IRON












SULFUR 2 PROTEINS, NITROGEN












FIXATION


483
1fum
D
2
100
1.7e−44
−0.68
1.00

FUMARATE REDUCTASE
OXIDOREDUCTASE COMPLEX II;











FLAVOPROTEIN SUBUNIT;
COMPLEX II; COMPLEX II;











CHAIN: A, M; FUMARATE
COMPLEX II; FUMARATE











REDUCTASE IRON-
REDUCTASE, COMPLEX II,











SULFUR PROTEIN; CHAIN:
SUCCINATE DEHYDROGENASE, 2











B, N; FUMARATE
RESPIRATION, OXIDOREDUCTASE











REDUCTASE 15 KD











HYDROPHOBIC PROTEIN;











CHAIN: C, O; FUMARATE











REDUCTASE 13 KD











HYDROPHOBIC PROTEIN;











CHAIN: D, P;


483
1fum
D
2
117
1.7e−44


168.49
FUMARATE REDUCTASE
OXIDOREDUCTASE COMPLEX II;











FLAVOPROTEIN SUBUNIT;
COMPLEX II; COMPLEX II;











CHAIN: A, M; FUMARATE
COMPLEX II; FUMARATE











REDUCTASE IRON-
REDUCTASE, COMPLEX II,











SULFUR PROTEIN; CHAIN:
SUCCINATE DEHYDROGENASE, 2











B, N; FUMARATE
RESPIRATION, OXIDOREDUCTASE











REDUCTASE 15 KD











HYDROPHOBIC PROTEIN;











CHAIN: C, O; FUMARATE











REDUCTASE 13 KD











HYDROPHOBIC PROTEIN;











CHAIN: D, P;


486
1qu7
A
154
214
2.5e−09
−0.49
0.90

METHYL-ACCEPTING
SIGNALING PROTEIN SERINE,











CHEMOTAXIS PROTEIN I;
CHEMOTAXIS, FOUR HELICAL-











CHAIN: A, B;
BUNDLE


486
2asr

38
71
  5e−10
−0.81
0.51

CHEMOTAXIS











ASPARTATE RECEPTOR











(LIGAND BINDING











DOMAIN) 2ASR 3


486
2lig
A
26
71
2.5e−14
−0.79
0.47

ASPARTATE RECEPTOR;
CHEMOTAXIS











2LIG 4 CHAIN: A, B; 2LIG 5


490
1c17
M
130
265
0.001


73.12
ATP SYNTHASE SUBUNIT
MEMBRANE PROTEIN MEMBRANE











C; CHAIN: A, B, C, D, E, F,
PROTEIN, HELIX, COMPLEX











G, H, I, J, K, L; ATP











SYNTHASE SUBUNIT A;











CHAIN: M;


496
1d0s
A
2
91
1.3e−09
0.26
−0.20

NICOTINATE
TRANSFERASE DINUCLEOTIDE-











MONONUCLEOTIDE: 5, 6-
BINDING MOTIF,











CHAIN: A;
PHOSPHORIBOSYL TRANSFERASE


496
1eut

24
125
  1e−09
0.40
−0.20

SIALIDASE; CHAIN: NULL;
HYDROLASE NEURAMINIDASE;












HYDROLASE, GLYCOSIDASE


496
2pro
A
10
136
  1e−18
0.12
−0.20

ALPHA-LYTIC PROTEASE;
PRO REGION PRO REGION,











CHAIN: A, B, C;
FOLDASE, PROTEIN FOLDING,












SERINE PROTEASE


503
1c28
A
71
204
1.7e−34
0.72
0.89

30 KD ADIPOCYTE
SERUM PROTEIN ACRP30 C1Q TNF











COMPLEMENT-RELATED
TRIMER ALL-BETA, SERUM











PROTEIN CHAIN: A, B, C;
PROTEIN


503
1c28
A
73
203
6.8e−33
0.52
0.98

30 KD ADIPOCYTE
SERUM PROTEIN ACRP30 C1Q TNF











COMPLEMENT-RELATED
TRIMER ALL-BETA, SERUM











PROTEIN CHAIN: A, B, C;
PROTEIN


503
1c28
A
77
204
1.7e−34


64.45
30 KD ADIPOCYTE
SERUM PROTEIN ACRP30 C1Q TNF











COMPLEMENT-RELATED
TRIMER ALL-BETA, SERUM











PROTEIN CHAIN: A, B, C;
PROTEIN


503
1c28
B
73
203
  1e−30
0.76
0.83

30 KD ADIPOCYTE
SERUM PROTEIN ACRP30 C1Q TNF











COMPLEMENT-RELATED
TRIMER ALL-BETA, SERUM











PROTEIN CHAIN: A, B, C;
PROTEIN


503
1c28
B
81
196
  1e−30


53.80
30 KD ADIPOCYTE
SERUM PROTEIN ACRP30 C1Q TNF











COMPLEMENT-RELATED
TRIMER ALL-BETA, SERUM











PROTEIN CHAIN: A, B, C;
PROTEIN


503
1c28
C
73
203
8.5e−28
0.56
0.37

30 KD ADIPOCYTE
SERUM PROTEIN ACRP30 C1Q TNF











COMPLEMENT-RELATED
TRIMER ALL-BETA, SERUM











PROTEIN CHAIN: A, B, C;
PROTEIN


514
4hb1

290
328
0.00051
0.28
0.53

DHP1; CHAIN: NULL;
DESIGNED HELICAL BUNDLE












DESIGNED HELICAL BUNDLE


526
1aut
L
97
202
2.5e−13


51.11
ACTIVATED PROTEIN C;
COMPLEX (BLOOD











CHAIN: C, L; D-PHE-PRO-
COAGULATION/INHIBITOR)











MAI; CHAIN: P;
AUTOPROTHROMBIN IIA;












HYDROLASE, SERINE












PROTEINASE), PLASMA CALCIUM












BINDING, 2 GLYCOPROTEIN,












COMPLEX (BLOOD












COAGULATION/INHIBITOR)


526
1dan
L
114
245
  5e−16


53.38
BLOOD COAGULATION
BLOOD COAGULATION, SERINE











FACTOR VIIA; CHAIN: L,
PROTEASE, COMPLEX, CO-FACTOR,











H; SOLUBLE TISSUE
2 RECEPTOR ENZYME, INHIBITOR,











FACTOR; CHAIN: T, U; D-
GLA, EGF, 3 COMPLEX (SERINE











PHE-PHE-ARG-
PROTEASE/COFACTOR/LIGAND)











CHLOROMETHYLKETONE











(DFFRCMK) WITH CHAIN:











C;


526
1dan
L
151
232
8.5e−12
0.07
0.30

BLOOD COAGULATION
BLOOD COAGULATION, SERINE











FACTOR VIIA; CHAIN: L,
PROTEASE, COMPLEX, CO-FACTOR,











H; SOLUBLE TISSUE
2 RECEPTOR ENZYME, INHIBITOR,











FACTOR; CHAIN: T, U; D-
GLA, EGF, 3 COMPLEX (SERINE











PHE-PHE-ARG-
PROTEASE/COFACTOR/LIGAND)











CHLOROMETHYLKETONE











(DFFRCMK) WITH CHAIN:











C;


526
1dan
L
32
154
2.5e−15
0.23
−0.13

BLOOD COAGULATION
BLOOD COAGULATION, SERINE











FACTOR VIIA; CHAIN: L,
PROTEASE, COMPLEX, CO-FACTOR,











H; SOLUBLE TISSUE
2 RECEPTOR ENZYME, INHIBITOR,











FACTOR; CHAIN: T, U; D-
GLA, EGF, 3 COMPLEX (SERINE











PHE-PHE-ARG-
PROTEASE/COFACTOR/LIGAND)











CHLOROMETHYLKETONE











(DFFRCMK) WITH CHAIN:











C;


526
1dan
L
82
197
  5e−16
0.45
−0.12

BLOOD COAGULATION
BLOOD COAGULATION, SERINE











FACTOR VIIA; CHAIN: L,
PROTEASE, COMPLEX, CO-FACTOR,











H; SOLUBLE TISSUE
2 RECEPTOR ENZYME, INHIBITOR,











FACTOR; CHAIN: T, U; D-
GLA, EGF, 3 COMPLEX (SERINE











PHE-PHE-ARG-
PROTEASE/COFACTOR/LIGAND)











CHLOROMETHYLKETONE











(DFFRCMK) WITH CHAIN:











C;


526
1dva
L
151
232
8.5e−12
−0.02
0.63

DES-GLA FACTOR VIIA
HYDROLASE/HYDROLASE











(HEAVY CHAIN); CHAIN:
INHIBITOR PROTEIN-PEPTIDE











H, I; DES-GLA FACTOR
COMPLEX











VIIA (LIGHT CHAIN);











CHAIN: L, M; (DPN)-PHE-











ARG; CHAIN: C, D;











PEPTIDE E-76; CHAIN: X,











Y;


526
1dx5
I
107
225
2.5e−15
0.30
0.04

THROMBIN LIGHT CHAIN;
SERINE PROTEINASE











CHAIN: A, B, C, D;
COAGULATION FACTOR II;











THROMBIN HEAVY
COAGULATION FACTOR II;











CHAIN; CHAIN: M, N, O, P;
FETOMODULIN, TM, CD141











THROMBOMODULIN;
ANTIGEN; EGR-CMK SERINE











CHAIN: I, J, K, L;
PROTEINASE, EGF-LIKE DOMAINS,











THROMBIN INHIBITOR L-
ANTICOAGULANT COMPLEX, 2











GLU-L-GLY-L-ARM;
ANTIFIBRINOLYTIC COMPLEX











CHAIN: E, F, G, H;


526
1dx5
I
149
259
6.8e−14
−0.00
−0.18

THROMBIN LIGHT CHAIN;
SERINE PROTEINASE











CHAIN: A, B, C, D;
COAGULATION FACTOR II;











THROMBIN HEAVY
COAGULATION FACTOR II;











CHAIN; CHAIN: M, N, O, P;
FETOMODULIN, TM, CD141











THROMBOMODULIN;
ANTIGEN; EGR-CMK SERINE











CHAIN: I, J, K, L;
PROTEINASE, EGF-LIKE DOMAINS,











THROMBIN INHIBITOR L-
ANTICOAGULANT COMPLEX, 2











GLU-L-GLY-L-ARM;
ANTIFIBRINOLYTIC COMPLEX











CHAIN: E, F, G, H;


526
1dx5
I
70
193
  2e−16
0.42
−0.12

THROMBIN LIGHT CHAIN;
SERINE PROTEINASE











CHAIN: A, B, C, D;
COAGULATION FACTOR II;











THROMBIN HEAVY
COAGULATION FACTOR II;











CHAIN; CHAIN: M, N, O, P;
FETOMODULIN, TM, CD141











THROMBOMODULIN;
ANTIGEN; EGR-CMK SERINE











CHAIN: I, J, K, L;
PROTEINASE, EGF-LIKE DOMAINS,











THROMBIN INHIBITOR L-
ANTICOAGULANT COMPLEX, 2











GLU-L-GLY-L-ARM;
ANTIFIBRINOLYTIC COMPLEX











CHAIN: E, F, G, H;


526
1ext
A
33
173
  5e−15
0.13
−0.15

TUMOR NECROSIS
SIGNALLING PROTEIN BINDING











FACTOR RECEPTOR;
PROTEIN, CYTOKINE, SIGNALLING











CHAIN: A, B;
PROTEIN


526
1ext
A
53
203
1.8e−15


65.24
TUMOR NECROSIS
SIGNALLING PROTEIN BINDING











FACTOR RECEPTOR;
PROTEIN, CYTOKINE, SIGNALLING











CHAIN: A, B;
PROTEIN


526
1ext
A
54
197
1.8e−15
0.30
−0.06

TUMOR NECROSIS
SIGNALLING PROTEIN BINDING











FACTOR RECEPTOR;
PROTEIN, CYTOKINE, SIGNALLING











CHAIN: A, B;
PROTEIN


526
1fak
L
151
232
8.5e−12
−0.08
0.78

BLOOD COAGULATION
BLOOD CLOTTING











FACTOR VIIA; CHAIN: L;
COMPLEX (SERINE











BLOOD COAGULATION
PROTEASE/COFACTOR/LIGAND),











FACTOR VIIA; CHAIN: H;
BLOOD COAGULATION, 2 SERINE











SOLUBLE TISSUE
PROTEASE, COMPLEX, CO-FACTOR,











FACTOR; CHAIN: T; 5L15;
RECEPTOR ENZYME, 3 INHIBITOR,











CHAIN: I;
GLA, EGF, COMPLEX (SERINE 4












PROTEASE/COFACTOR/LIGAND),












BLOOD CLOTTING


526
1igr
A
8
214
  1e−21
0.15
−0.12

INSULIN-LIKE GROWTH
HORMONE RECEPTOR HORMONE











FACTOR RECEPTOR 1;
RECEPTOR, INSULIN RECEPTOR











CHAIN: A;
FAMILY


526
1klo

111
237
3.4e−17
0.52
0.88

LAMININ; CHAIN: NULL;
GLYCOPROTEIN GLYCOPROTEIN


526
1klo

154
268
8.5e−16
0.34
0.10

LAMININ; CHAIN: NULL;
GLYCOPROTEIN GLYCOPROTEIN


526
1klo

31
198
2.5e−29
0.39
0.01

LAMININ; CHAIN: NULL;
GLYCOPROTEIN GLYCOPROTEIN


526
1klo

33
199
2.5e−29


91.67
LAMININ; CHAIN: NULL;
GLYCOPROTEIN GLYCOPROTEIN


526
1klo

68
197
1.2e−18
0.51
0.90

LAMININ; CHAIN: NULL;
GLYCOPROTEIN GLYCOPROTEIN


526
1klo

68
218
2.5e−26
0.24
−0.14

LAMININ; CHAIN: NULL;
GLYCOPROTEIN GLYCOPROTEIN


526
1ncf
A
39
180
  5e−17
0.23
−0.07

TUMOR NECROSIS
SIGNALLING PROTEIN TYPE I











FACTOR RECEPTOR; 1NCF
RECEPTOR, STNFR1; 1NCF 8











4 CHAIN: A, B; 1NCF 5
BINDING PROTEIN, CYTOKINE












1NCF 19


526
1ncf
A
51
180
  5e−17


54.09
TUMOR NECROSIS
SIGNALLING PROTEIN TYPE I











FACTOR RECEPTOR; 1NCF
RECEPTOR, STNFR1; 1NCF 8











4 CHAIN: A, B; 1NCF 5
BINDING PROTEIN, CYTOKINE












1NCF 19


526
1ncf
A
96
218
2.5e−16
0.33
−0.14

TUMOR NECROSIS
SIGNALLING PROTEIN TYPE I











FACTOR RECEPTOR; 1NCF
RECEPTOR, STNFR1; 1NCF 8











4 CHAIN: A, B; 1NCF 5
BINDING PROTEIN, CYTOKINE












1NCF 19


526
1pfx
L
64
214
1.8e−26
0.20
−0.18

FACTOR IXA; CHAIN: C,
COMPLEX (BLOOD











L; D-PHE-PRO-ARG;
COAGULATION/INHIBITOR)











CHAIN: I;
CHRISTMAS FACTOR; COMPLEX,












INHIBITOR, HEMOPHILIA/EGF,












BLOOD COAGULATION, 2 PLASMA,












SERINE PROTEASE, CALCIUM-












BINDING, HYDROLASE, 3












GLYCOPROTEIN


526
1pfx
L
72
208
  5e−28


62.61
FACTOR IXA; CHAIN: C,
COMPLEX (BLOOD











L; D-PHE-PRO-ARG;
COAGULATION/INHIBITOR)











CHAIN: I;
CHRISTMAS FACTOR; COMPLEX,












INHIBITOR, HEMOPHILIA/EGF,












BLOOD COAGULATION, 2 PLASMA,












SERINE PROTEASE, CALCIUM-












BINDING, HYDROLASE, 3












GLYCOPROTEIN


526
1pp2
R
64
184
  1e−17
0.13
−0.18

HYDROLASE CALCIUM-











FREE PHOSPHOLIPASE











A2 (E.C.3.1.1.4)











1PP2 4


526
1qfk
L
107
206
7.5e−17
0.34
−0.02

COAGULATION FACTOR
SERINE PROTEASE FVIIA; FVIIA;











VIIA (LIGHT CHAIN);
BLOOD COAGULATION, SERINE











CHAIN: L; COAGULATION
PROTEASE











FACTOR VIIA (HEAVY











CHAIN); CHAIN: H;











TRIPEPTIDYL INHIBITOR;











CHAIN: C;


526
1qfk
L
151
232
8.5e−12
0.04
0.83

COAGULATION FACTOR
SERINE PROTEASE FVIIA; FVIIA;











VIIA (LIGHT CHAIN);
BLOOD COAGULATION, SERINE











CHAIN: L; COAGULATION
PROTEASE











FACTOR VIIA (HEAVY











CHAIN); CHAIN: H;











TRIPEPTIDYL INHIBITOR;











CHAIN: C;


526
1qub
A
11
297
2.5e−33


60.56
HUMAN BETA2-
MEMBRANE ADHESION SHORT











GLYCOPROTEIN I; CHAIN:
CONSENSUS REPEAT, SUSHI,











A;
COMPLEMENT CONTROL PROTEIN,












2 N-GLYCOSYLATION, MULTI-












DOMAIN, MEMBRANE ADHESION


526
1skz

106
216
2.5e−20


66.23
ANTISTASIN; CHAIN:
SERINE PROTEASE INHIBITOR











NULL;
FACTOR XA INHIBITOR;












ANTISTASIN, CRYSTAL












STRUCTURE, FACTOR XA












INHIBITOR, 2 SERINE PROTEASE












INHIBITOR, THROMBOSIS


526
1skz

64
216
2.5e−20
0.18
0.21

ANTISTASIN; CHAIN:
SERINE PROTEASE INHIBITOR











NULL;
FACTOR XA INHIBITOR;












ANTISTASIN, CRYSTAL












STRUCTURE, FACTOR XA












INHIBITOR, 2 SERINE PROTEASE












INHIBITOR, THROMBOSIS


526
1tpg

37
143
2.2e−18
0.38
0.11

T-PLASMINOGEN
PLASMINOGEN ACTIVATION











ACTIVATOR F1-G; 1TPG 7











CHAIN: NULL; 1TPG 8


526
1tpg

81
184
  5e−18
0.49
−0.08

T-PLASMINOGEN
PLASMINOGEN ACTIVATION











ACTIVATOR F1-G; 1TPG 7











CHAIN: NULL; 1TPG 8


526
1vap
A
70
184
1.8e−15
−0.14
0.00

PHOSPHOLIPASE A2;
LIPID DEGRADATION











CHAIN: A, B;
PHOSPHOLIPASE A2, LIPID












DEGRADATION, HYDROLASE


526
1xka
L
70
154
  5e−15
0.14
0.18

BLOOD COAGULATION
BLOOD COAGULATION FACTOR











FACTOR XA; CHAIN: L, C;
STUART FACTOR; BLOOD












COAGULATION FACTOR, SERINE












PROTEINASE, EPIDERMAL 2












GROWTH FACTOR LIKE DOMAIN


526
2not
A
34
137
  5e−15
0.11
−0.13

PHOSPHOLIPASE A2;
HYDROLASE HYDROLASE, LIPID











CHAIN: A, B;
DEGRADATION, CALCIUM,












PRESYNAPTIC 2 NEUROTOXIN,












VENOM


526
9wga
A
20
181
1.7e−16
0.20
0.05

LECTIN (AGGLUTININ)











WHEAT GERM











AGGLUTININ (ISOLECTIN











2) 9WGA 3


526
9wga
A
31
180
2.5e−27
0.32
−0.17

LECTIN (AGGLUTININ)











WHEAT GERM











AGGLUTININ (ISOLECTIN











2) 9WGA 3


526
9wga
A
53
219
  5e−30


79.05
LECTIN (AGGLUTININ)











WHEAT GERM











AGGLUTININ (ISOLECTIN











2) 9WGA 3


526
9wga
A
55
229
3.4e−15
0.39
0.10

LECTIN (AGGLUTININ)











WHEAT GERM











AGGLUTININ (ISOLECTIN











2) 9WGA 3


526
9wga
A
64
218
  5e−30
0.79
−0.05

LECTIN (AGGLUTININ)











WHEAT GERM











AGGLUTININ (ISOLECTIN











2) 9WGA 3


533
1ckl
A
12
132
8.5e−11
−0.11
0.05

CD46; CHAIN: A, B, C, D, E,
GLYCOPROTEIN MEMBRANE











F;
COFACTOR PROTEIN (MCP); VIRUS












RECEPTOR, COMPLEMENT












COFACTOR, SHORT CONSENSUS












REPEAT, 2 SCR, MEASLES VIRUS,












GLYCOPROTEIN


533
1ckl
A
67
131
  5e−11
0.36
0.03

CD46; CHAIN: A, B, C, D, E,
GLYCOPROTEIN MEMBRANE











F;
COFACTOR PROTEIN (MCP); VIRUS












RECEPTOR, COMPLEMENT












COFACTOR, SHORT CONSENSUS












REPEAT, 2 SCR, MEASLES VIRUS,












GLYCOPROTEIN


533
1e5g
A
72
192
3.4e−14
0.10
0.24

COMPLEMENT CONTROL
COMPLEMENT INHIBITOR VCP,











PROTEIN; CHAIN: A;
SP35; COMPLEMENT, NMR,












MODULES, PROTEIN STRUCTURE,












VACCINIA VIRUS


533
1e5g
A
73
155
1.3e−15
0.11
0.24

COMPLEMENT CONTROL
COMPLEMENT INHIBITOR VCP,











PROTEIN; CHAIN: A;
SP35; COMPLEMENT, NMR,












MODULES, PROTEIN STRUCTURE,












VACCINIA VIRUS


533
1hcc

73
132
  5e−10
0.35
0.57

GLYCOPROTEIN 16TH











COMPLEMENT CONTROL











PROTEIN (CCP) OF











FACTOR H 1HCC 3


533
1hfh

70
192
1.3e−10


51.85
GLYCOPROTEIN FACTOR











H, 15TH AND 16TH C-











MODULE PAIR (NMR,











MINIMIZED 1HFHA 1











AVERAGED STRUCTURE)











1HFH 4 1HFHA 5


533
1hfh

71
155
1.3e−10
0.36
0.22

GLYCOPROTEIN FACTOR











H, 15TH AND 16TH C-











MODULE PAIR (NMR,











MINIMIZED 1HFHA 1











AVERAGED STRUCTURE)











1HFH 4 1HFHA 5


533
1qub
A
21
297
3.4e−27


56.46
HUMAN BETA2-
MEMBRANE ADHESION SHORT











GLYCOPROTEIN I; CHAIN:
CONSENSUS REPEAT, SUSHI,











A;
COMPLEMENT CONTROL PROTEIN,












2 N-GLYCOSYLATION, MULTI-












DOMAIN, MEMBRANE ADHESION


533
1sfp

129
245
2.3e−27


51.76
ASFP; CHAIN: NULL;
SPERMADHESIN ACIDIC SEMINAL












PROTEIN; SPERMADHESIN, BOVINE












SEMINAL PLASMA PROTEIN,












ACIDIC 2 SEMINAL FLUID PROTEIN,












ASFP, CUB DOMAIN, X-RAY












CRYSTAL 3 STRUCTURE, GROWTH












FACTOR


533
1sfp

135
242
2.3e−27
0.38
0.81

ASFP; CHAIN: NULL;
SPERMADHESIN ACIDIC SEMINAL












PROTEIN; SPERMADHESIN, BOVINE












SEMINAL PLASMA PROTEIN,












ACIDIC 2 SEMINAL FLUID PROTEIN,












ASFP, CUB DOMAIN, X-RAY












CRYSTAL 3 STRUCTURE, GROWTH












FACTOR


533
1sfp

155
244
1.7e−07
0.01
0.16

ASFP; CHAIN: NULL;
SPERMADHESIN ACIDIC SEMINAL












PROTEIN; SPERMADHESIN, BOVINE












SEMINAL PLASMA PROTEIN,












ACIDIC 2 SEMINAL FLUID PROTEIN,












ASFP, CUB DOMAIN, X-RAY












CRYSTAL 3 STRUCTURE, GROWTH












FACTOR


533
1spp
A
135
242
  5e−28
0.38
0.71

MAJOR SEMINAL PLASMA
COMPLEX (SEMINAL PLASMA











GLYCOPROTEIN PSP-I;
PROTEIN/SPP) SEMINAL PLASMA











CHAIN: A; MAJOR
PROTEINS, SPERMADHESINS, CUB











SEMINAL PLASMA
DOMAIN 2 ARCHITECTURE,











GLYCOPROTEIN PSP-II;
COMPLEX (SEMINAL PLASMA











CHAIN: B
PROTEIN/SPP)


533
1spp
B
129
242
  5e−29
0.36
0.62

MAJOR SEMINAL PLASMA
COMPLEX (SEMINAL PLASMA











GLYCOPROTEIN PSP-I;
PROTEIN/SPP) SEMINAL PLASMA











CHAIN: A; MAJOR
PROTEINS, SPERMADHESINS, CUB











SEMINAL PLASMA
DOMAIN 2 ARCHITECTURE,











GLYCOPROTEIN PSP-II;
COMPLEX (SEMINAL PLASMA











CHAIN: B
PROTEIN/SPP)


533
1vvc

10
127
3.4e−15
0.24
−0.14

VACCINIA VIRUS
COMPLEMENT INHIBITOR SP35,











COMPLEMENT CONTROL
VCP, VACCINIA VIRUS SP35;











PROTEIN; CHAIN: NULL;
COMPLEMENT INHIBITOR,












COMPLEMENT MODULE, SCR,












SUSHI DOMAIN, 2 MODULE PAIR


533
1vvc

72
194
1.7e−12
0.09
0.25

VACCINIA VIRUS
COMPLEMENT INHIBITOR SP35,











COMPLEMENT CONTROL
VCP, VACCINIA VIRUS SP35;











PROTEIN; CHAIN: NULL;
COMPLEMENT INHIBITOR,












COMPLEMENT MODULE, SCR,












SUSHI DOMAIN, 2 MODULE PAIR


533
1vvc

72
196
1.7e−12


50.78
VACCINIA VIRUS
COMPLEMENT INHIBITOR SP35,











COMPLEMENT CONTROL
VCP, VACCINIA VIRUS SP35;











PROTEIN; CHAIN: NULL;
COMPLEMENT INHIBITOR,












COMPLEMENT MODULE, SCR,












SUSHI DOMAIN, 2 MODULE PAIR


542
1lba

230
375
  5e−39
0.08
0.05

HYDROLASE(ACTING ON











LINEAR AMIDES)











LYSOZYME (E.C.3.5.1.28)











MUTANT WITH ALA 6











REPLACED BY LYS 1LBA 3











AND RESIDUES 2-5











DELETED (DEL(2-5), A6K)











1LBA 4


542
1lba

258
359
1.7e−23
0.03
0.33

HYDROLASE(ACTING ON











LINEAR AMIDES)











LYSOZYME (E.C.3.5.1.28)











MUTANT WITH ALA 6











REPLACED BY LYS 1LBA 3











AND RESIDUES 2-5











DELETED (DEL(2-5), A6K)











1LBA 4


542
1lba

72
232
2.5e−23


54.86
HYDROLASE(ACTING ON











LINEAR AMIDES)











LYSOZYME (E.C.3.5.1.28)











MUTANT WITH ALA 6











REPLACED BY LYS 1LBA 3











AND RESIDUES 2-5











DELETED (DEL(2-5), A6K)











1LBA 4


542
1lba

74
214
2.5e−23
0.26
0.55

HYDROLASE(ACTING ON











LINEAR AMIDES)











LYSOZYME (E.C.3.5.1.28)











MUTANT WITH ALA 6











REPLACED BY LYS 1LBA 3











AND RESIDUES 2-5











DELETED (DEL(2-5), A6K)











1LBA 4


542
1lba

81
175
3.4e−23
0.66
0.88

HYDROLASE(ACTING ON











LINEAR AMIDES)











LYSOZYME (E.C.3.5.1.28)











MUTANT WITH ALA 6











REPLACED BY LYS 1LBA 3











AND RESIDUES 2-5











DELETED (DEL(2-5), A6K)











1LBA 4


543
1c2a
A
35
148
0.0027
−0.45
0.03

BOWMAN-BIRK TRYPSIN
HYDROLASE INHIBITOR ALL-BETA











INHIBITOR; CHAIN: A
STRUCTURE, HYDROLASE












INHIBITOR


546
1c17
M
110
248
1.2e−07


79.86
ATP SYNTHASE SUBUNIT
MEMBRANE PROTEIN MEMBRANE











C; CHAIN: A, B, C, D, E, F,
PROTEIN, HELIX, COMPLEX











G, H, I, J, K, L; ATP











SYNTHASE SUBUNIT A;











CHAIN: M;


560
1b8q
A
223
353
1.2e−13


50.34
NEURONAL NITRIC OXIDE
OXIDOREDUCTASE PDZ DOMAIN,











SYNTHASE; CHAIN: A;
NNOS, NITRIC OXIDE SYNTHASE











HEPTAPEPTIDE; CHAIN: B;


560
1b8q
A
224
302
1.2e−13
0.05
0.88

NEURONAL NITRIC OXIDE
OXIDOREDUCTASE PDZ DOMAIN,











SYNTHASE; CHAIN: A;
NNOS, NITRIC OXIDE SYNTHASE











HEPTAPEPTIDE; CHAIN: B;


560
1b8q
A
313
429
1.8e−17
0.38
0.11

NEURONAL NITRIC OXIDE
OXIDOREDUCTASE PDZ DOMAIN,











SYNTHASE; CHAIN: A;
NNOS, NITRIC OXIDE SYNTHASE











HEPTAPEPTIDE; CHAIN: B;


560
1be9
A
116
229
1.7e−14
−0.22
0.70

PSD-95; CHAIN: A; CRIPT;
PEPTIDE RECOGNITION PEPTIDE











CHAIN: B;
RECOGNITION, PROTEIN












LOCALIZATION


560
1be9
A
221
337
1.3e−09


51.39
PSD-95; CHAIN: A; CRIPT;
PEPTIDE RECOGNITION PEPTIDE











CHAIN: B;
RECOGNITION, PROTEIN












LOCALIZATION


560
1be9
A
230
338
1.3e−09
0.71
1.00

PSD-95; CHAIN: A; CRIPT;
PEPTIDE RECOGNITION PEPTIDE











CHAIN: B;
RECOGNITION, PROTEIN












LOCALIZATION


560
1be9
A
315
380
  1e−10
0.07
0.18

PSD-95; CHAIN: A; CRIPT;
PEPTIDE RECOGNITION PEPTIDE











CHAIN: B;
RECOGNITION, PROTEIN












LOCALIZATION


560
1be9
A
349
413
  1e−10
−0.39
0.28

PSD-95; CHAIN: A; CRIPT;
PEPTIDE RECOGNITION PEPTIDE











CHAIN: B;
RECOGNITION, PROTEIN












LOCALIZATION


560
1i16

231
330
  2e−10
−0.19
0.01

INTERLEUKIN 16; CHAIN:
CYTOKINE LCF; CYTOKINE,











NULL;
LYMPHOCYTE












CHEMOATTRACTANT FACTOR, PDZ












DOMAIN


560
1i16

282
413
2.5e−16


52.15
INTERLEUKIN 16; CHAIN:
CYTOKINE LCF; CYTOKINE,











NULL;
LYMPHOCYTE












CHEMOATTRACTANT FACTOR, PDZ












DOMAIN


560
1i16

315
388
2.5e−16
0.93
1.00

INTERLEUKIN 16; CHAIN:
CYTOKINE LCF; CYTOKINE,











NULL;
LYMPHOCYTE












CHEMOATTRACTANT FACTOR, PDZ












DOMAIN


560
1kwa
A
234
321
1.5e−11
0.44
1.00

HCASK/LIN-2 PROTEIN;
KINASE HCASK, GLGF REPEAT,











CHAIN: A, B;
DHR; PDZ DOMAIN,












NEUREXIN,












SYNDECAN, RECEPTOR












CLUSTERING, KINASE


560
1kwa
A
313
388
2.3e−16
0.21
1.00

HCASK/LIN-2 PROTEIN;
KINASE HCASK, GLGF REPEAT,











CHAIN: A, B;
DHR; PDZ DOMAIN, NEUREXIN,












SYNDECAN, RECEPTOR












CLUSTERING, KINASE


560
1pdr

122
218
1.2e−13
−0.25
0.27

HUMAN DISCS LARGE
SIGNAL TRANSDUCTION HDLG,











PROTEIN; CHAIN: NULL;
DHR3 DOMAIN; SIGNAL












TRANSDUCTION, SH3 DOMAIN,












REPEAT


560
1pdr

228
295
2.5e−12
0.23
0.99

HUMAN DISCS LARGE
SIGNAL TRANSDUCTION HDLG,











PROTEIN; CHAIN: NULL;
DHR3 DOMAIN; SIGNAL












TRANSDUCTION, SH3 DOMAIN,












REPEAT


560
1pdr

311
380
  2e−12
0.41
0.82

HUMAN DISCS LARGE
SIGNAL TRANSDUCTION HDLG,











PROTEIN; CHAIN: NULL;
DHR3 DOMAIN; SIGNAL












TRANSDUCTION, SH3 DOMAIN,












REPEAT


560
1qau
A
117
224
7.5e−15
0.12
−0.01

NEURONAL NITRIC OXIDE
OXIDOREDUCTASE BETA-FINGER











SYNTHASE (RESIDUES











1-130);











CHAIN: A;


560
1qau
A
231
346
2.3e−13
0.45
0.83

NEURONAL NITRIC OXIDE
OXIDOREDUCTASE BETA-FINGER











SYNTHASE (RESIDUES











1-130);











CHAIN: A;


560
1qau
A
313
388
  5e−16
0.88
1.00

NEURONAL NITRIC OXIDE
OXIDOREDUCTASE BETA-FINGER











SYNTHASE (RESIDUES











1-130);











CHAIN: A;


560
1qav
A
114
212
1.5e−15
0.01
1.00

ALPHA-1 SYNTROPHIN
MEMBRANE











(RESIDUES 77-171);
PROTEIN/OXIDOREDUCTASE BETA-











CHAIN: A; NEURONAL
FINGER, HETERODIMER











NITRIC OXIDE SYNTHASE











(RESIDUES 1-130); CHAIN:











B;


560
1qav
A
229
309
7.5e−12
0.68
1.00

ALPHA-1 SYNTROPHIN
MEMBRANE











(RESIDUES 77-171);
PROTEIN/OXIDOREDUCTASE BETA-











CHAIN: A; NEURONAL
FINGER, HETERODIMER











NITRIC OXIDE SYNTHASE











(RESIDUES 1-130); CHAIN:











B;


560
1qav
A
311
388
  2e−16
0.66
1.00

ALPHA-1 SYNTROPHIN
MEMBRANE











(RESIDUES 77-171);
PROTEIN/OXIDOREDUCTASE BETA-











CHAIN: A; NEURONAL
FINGER, HETERODIMER











NITRIC OXIDE SYNTHASE











(RESIDUES 1-130); CHAIN:











B;


560
1qlc
A
116
213
  5e−15
0.81
0.89

POSTSYNAPTIC DENSITY
PEPTIDE RECOGNITION PSD-95;











PROTEIN 95; CHAIN: A;
PDZ DOMAIN, NEURONAL NITRIC












OXIDE SYNTHASE, NMDA












RECEPTOR 2 BINDING


560
1qlc
A
120
213
5.1e−15
0.36
0.22

POSTSYNAPTIC DENSITY
PEPTIDE RECOGNITION PSD-95;











PROTEIN 95; CHAIN: A;
PDZ DOMAIN, NEURONAL NITRIC












OXIDE SYNTHASE, NMDA












RECEPTOR 2 BINDING


560
1qlc
A
229
309
1.5e−09
0.68
1.00

POSTSYNAPTIC DENSITY
PEPTIDE RECOGNITION PSD-95;











PROTEIN 95; CHAIN: A;
PDZ DOMAIN, NEURONAL NITRIC












OXIDE SYNTHASE, NMDA












RECEPTOR 2 BINDING


560
1qlc
A
311
388
1.5e−14
0.75
1.00

POSTSYNAPTIC DENSITY
PEPTIDE RECOGNITION PSD-95;











PROTEIN 95; CHAIN: A;
PDZ DOMAIN, NEURONAL NITRIC












OXIDE SYNTHASE, NMDA












RECEPTOR 2 BINDING


560
3pdz
A
113
212
  5e−15
0.36
0.96

TYROSINE PHOSPHATASE
HYDROLASE PDZ DOMAIN, HUMAN











(PTP-BAS, TYPE 1); CHAIN:
PHOSPHATASE, HPTP1E, PTP-BAS,











A;
SPECIFICITY 2 OF BINDING


560
3pdz
A
227
324
7.5e−12
0.47
0.99

TYROSINE PHOSPHATASE
HYDROLASE PDZ DOMAIN, HUMAN











(PTP-BAS, TYPE 1); CHAIN:
PHOSPHATASE, HPTP1E, PTP-BAS,











A;
SPECIFICITY 2 OF BINDING


560
3pdz
A
311
388
2.5e−15
1.23
1.00

TYROSINE PHOSPHATASE
HYDROLASE PDZ DOMAIN, HUMAN











(PTP-BAS, TYPE 1); CHAIN:
PHOSPHATASE, HPTP1E, PTP-BAS,











A;
SPECIFICITY 2 OF BINDING


563
1a0j
A
36
265
0


235.32
TRYPSIN; CHAIN: A, B, C,
SERINE PROTEASE SERINE











D;
PROTEINASE, TRYPSIN,












HYDROLASE


563
1a0j
A
36
265
0
1.07
1.00

TRYPSIN; CHAIN: A, B, C,
SERINE PROTEASE SERINE











D;
PROTEINASE, TRYPSIN,












HYDROLASE


563
1a0l
A
36
264
5.1e−82


167.61
BETA-TRYPTASE; CHAIN:
SERINE PROTEINASE TRYPSIN-











A, B, C, D;
LIKE SERINE PROTEINASE,












TETRAMER, HEPARIN, ALLERGY, 2












ASTHMA


563
1a5i
A
23
263
2.5e−83


173.85
PLASMINOGEN
COMPLEX (SERINE











ACTIVATOR; CHAIN: A;
PROTEASE/INHIBITOR)











GLU-GLY-ARG
(DELTAFEK)DSPAALPHA1;











CHLOROMETHYL
EGRCMK; SERINE PROTEASE,











KETONE; CHAIN: I;
FIBRINOLYTIC ENZYMES,












PLASMINOGEN 2 ACTIVATORS


563
1ao5
A
36
266
2.5e−96


226.34
GLANDULAR
SERINE PROTEASE PRORENIN











KALLIKREIN-13; CHAIN:
CONVERTING ENZYME (PRECE),











A, B;
EPIDERMAL GLANDULAR












KALLIKREIN, SERINE PROTEASE,












PROTEIN MATURATION


563
1ao5
A
38
264
2.5e−96
1.16
1.00

GLANDULAR
SERINE PROTEASE PRORENIN











KALLIKREIN-13; CHAIN:
CONVERTING ENZYME (PRECE),











A, B;
EPIDERMAL GLANDULAR












KALLIKREIN, SERINE PROTEASE,












PROTEIN MATURATION


563
1aut
C
36
263
2.2e−88


172.05
ACTIVATED PROTEIN C;
COMPLEX (BLOOD











CHAIN: C, L; D-PHE-PRO-
COAGULATION/INHIBITOR)











MAI; CHAIN: P;
AUTOPROTHROMBIN IIA;












HYDROLASE, SERINE












PROTEINASE), PLASMA CALCIUM












BINDING, 2 GLYCOPROTEIN,












COMPLEX (BLOOD












COAGULATION/INHIBITOR)


563
1bio

36
263
  5e−89


198.56
COMPLEMENT FACTOR D;
SERINE PROTEASE SERINE











CHAIN: NULL;
PROTEASE, HYDROLASE,












COMPLEMENT, FACTOR D,












CATALYTIC 2 TRIAD, SELF-












REGULATION


563
1bqy
A
36
271
  1e−92


205.81
PLASMINOGEN
BLOOD CLOTTING TSV-PA;











ACTIVATOR; CHAIN: A, B;
FIBRINOLYSIS, PLASMINOGEN











GLU-GLY-ARG-
ACTIVATOR, SERINE PROTEINASE,











CHLOROMETHYLKETONE
2 SNAKE VENOM, COMPLEX











INHIBITOR; CHAIN: E, F;
(HYDROLASE/INHIBITOR), BLOOD












CLOTTING


563
1cgh
A
36
264
3.4e−74


175.67
CATHEPSIN G; CHAIN: A;
COMPLEX (SERINE











PHOSPHONATE
PROTEASE/INHIBITOR)











INHIBITOR SUC-VAL-PRO-
INFLAMMATION, INHIBITOR,











PHEP-(OPH)2; CHAIN: S;
SPECIFICITY, SERINE PROTEASE, 2












COMPLEX (SERINE












PROTEASE/INHIBITOR)


563
1dpo

36
265
  1e−97


226.55
TRYPSIN; CHAIN: NULL;
SERINE PROTEASE HYDROLASE,












SERINE PROTEASE, DIGESTION,












PANCREAS, ZYMOGEN, 2 SIGNAL,












MULTIGENE FAMILY


563
1fxy
A
36
266
1.7e−91


218.47
COAGULATION FACTOR
COMPLEX (PROTEASE/INHIBITOR)











XA-TRYPSIN CHIMERA;
TRYPSIN, COAGULATION FACTOR











CHAIN: A; D-PHE-PRO-
XA, CHIMERA, PROTEASE, PPACK, 2











ARG-
CHLOROMETHYLKETONE,











CHLOROMETHYLKETONE
COMPLEX (PROTEASE/












INHIBITOR)











(PPACK) WITH CHAIN: I;


563
1mct
A
36
265
0


234.27
COMPLEX(PROTEINASE/












INHIBITOR)











TRYPSIN











(E.C.3.4.21.4) COMPLEXED











WITH INHIBITOR FROM











BITTER 1MCT 3 GOURD











1MCT 4


563
1mct
A
36
265
0
1.20
1.00

COMPLEX(PROTEINASE/











INHIBITOR)











TRYPSIN











(E.C.3.4.21.4) COMPLEXED











WITH INHIBITOR FROM











BITTER 1MCT 3 GOURD











1MCT 4


563
1npm
A
36
263
  5e−94


253.14
NEUROPSIN; CHAIN: A, B;
SERINE PROTEINASE SERINE












PROTEINASE, GLYCOPROTEIN


563
1pfx
C
36
263
  5e−91


177.42
FACTOR IXA; CHAIN: C,
COMPLEX (BLOOD











L,; D-PHE-PRO-ARG;
COAGULATION/INHIBITOR)











CHAIN: I;
CHRISTMAS FACTOR; COMPLEX,












INHIBITOR, HEMOPHILIA/EGF,












BLOOD COAGULATION, 2 PLASMA,












SERINE PROTEASE, CALCIUM-












BINDING, HYDROLASE, 3












GLYCOPROTEIN


563
1qrz
A
21
265
1.7e−88


176.03
PLASMINOGEN; CHAIN: A,
HYDROLASE











B, C, D;
MICROPLASMINOGEN, SERINE












PROTEASE, ZYMOGEN,












CHYMOTRYPSIN 2 FAMILY,












HYDROLASE


563
1rfn
A
36
263
1.3e−90


177.83
COAGULATION FACTOR
COAGULATION FACTOR SERINE











IX; CHAIN: A;
PROTEINASE, BLOOD











COAGULATION FACTOR
COAGULATION, COAGULATION











IX; CHAIN: B;
FACTOR


563
1rtf
B
36
264
  1e−84


177.39
TWO CHAIN TISSUE
SERINE PROTEASE (TC)-T-PA;











PLASMINOGEN
SERINE PROTEASE, FIBRINOLYTIC











ACTIVATOR; CHAIN: A, B;
ENZYMES


563
1sgf
A
45
266
7.5e−83


179.65
NERVE GROWTH FACTOR;
GROWTH FACTOR 7S NGF;











CHAIN: A, B, G, X, Y, Z;
GROWTH FACTOR (BETA-NGF),












HYDROLASE - SERINE PROTEINASE












2 (GAMMA-NGF), INACTIVE SERINE












PROTEINASE (ALPHA-NGF)


563
1sgf
G
36
265
8.5e−99
1.03
1.00

NERVE GROWTH FACTOR;
GROWTH FACTOR 7S NGF;











CHAIN: A, B, G, X, Y, Z;
GROWTH FACTOR (BETA-NGF),












HYDROLASE - SERINE PROTEINASE












2 (GAMMA-NGF), INACTIVE SERINE












PROTEINASE (ALPHA-NGF)


563
1sgf
G
36
266
8.5e−99


248.29
NERVE GROWTH FACTOR;
GROWTH FACTOR 7S NGF;











CHAIN: A, B, G, X, Y, Z;
GROWTH FACTOR (BETA-NGF),












HYDROLASE - SERINE PROTEINASE












2 (GAMMA-NGF), INACTIVE SERINE












PROTEINASE (ALPHA-NGF)


563
1slw
B
36
265
  1e−99


223.23
ECOTIN; CHAIN: A;
COMPLEX (SERINE











ANIONIC TRYPSIN;
PROTEASE/INHIBITOR) TRYPSIN











CHAIN: B;
INHIBITOR; SERINE PROTEASE,












INHIBITOR, COMPLEX, METAL












BINDING SITES, 2 PROTEIN












ENGINEERING, PROTEASE-












SUBSTRATE INTERACTIONS, 3












METALLOPROTEINS


563
1slw
B
36
265
  1e−99
1.25
1.00

ECOTIN; CHAIN: A;
COMPLEX (SERINE











ANIONIC TRYPSIN;
PROTEASE/INHIBITOR) TRYPSIN











CHAIN: B;
INHIBITOR; SERINE PROTEASE,












INHIBITOR, COMPLEX, METAL












BINDING SITES, 2 PROTEIN












ENGINEERING, PROTEASE-












SUBSTRATE INTERACTIONS, 3












METALLOPROTEINS


563
1ton

36
266
  5e−97


232.87
HYDROLASE(SERINE











PROTEINASE) TONIN (E.C.











NUMBER NOT ASSIGNED)











1TON 4


563
1ton

38
264
  5e−97
1.16
1.00

HYDROLASE(SERINE











PROTEINASE) TONIN (E.C.











NUMBER NOT ASSIGNED)











1TON 4


563
1trn
A
36
266
0


229.81
HYDROLASE (SERINE











PROTEINASE) TRYPSIN











(E.C.3.4.21.4) COMPLEXED











WITH THE INHIBITOR











1TRN 3 DIISOPROPYL-











FLUOROPHOSPHOFLUORI











DATE (DFP) 1TRN 4











HUMAN TRYPSIN, DFP











INHIBITED 1TRN 6


563
1trn
A
36
266
0
1.00
1.00

HYDROLASE (SERINE











PROTEINASE) TRYPSIN











(E.C.3.4.21.4) COMPLEXED











WITH THE INHIBITOR











1TRN 3 DIISOPROPYL-











FLUOROPHOSPHOFLUORI











DATE (DFP) 1TRN 4











HUMAN TRYPSIN, DFP











INHIBITED 1TRN 6


563
2tbs

36
265
  1e−99


222.15
HYDROLASE (SERINE











PROTEINASE) TRYPSIN











(E.C.3.4.21.4) COMPLEXED











WITH BENZAMIDINE











INHIBITOR 2TBS 3


563
2tbs

36
265
  1e−99
1.04
1.00

HYDROLASE (SERINE











PROTEINASE) TRYPSIN











(E.C.3.4.21.4) COMPLEXED











WITH BENZAMIDINE











INHIBITOR 2TBS 3


563
5ptp

36
265
0


230.52
BETA TRYPSIN; CHAIN:
SERINE PROTEASE HYDROLASE,











NULL;
SERINE PROTEASE, DIGESTION,












PANCREAS, 2 ZYMOGEN, SIGNAL


563
5ptp

36
265
0
1.26
1.00

BETA TRYPSIN; CHAIN:
SERINE PROTEASE HYDROLASE,











NULL;
SERINE PROTEASE, DIGESTION,












PANCREAS, 2 ZYMOGEN, SIGNAL


572
1epf
A
18
124
  5e−07
0.22
0.33

NEURAL CELL ADHESION
CELL ADHESION NCAM; NCAM,











MOLECULE; CHAIN: A, B,
IMMUNOGLOBULIN FOLD,











C, D;
GLYCOPROTEIN


572
1f5w
A
15
107
  1e−06
0.02
0.36

COXSACKIE VIRUS AND
VIRUS/VIRAL PROTEIN RECEPTOR











ADENOVIRUS RECEPTOR;
IMMUNOGLOBULIN V DOMAIN











CHAIN: A, B;
FOLD, SYMMETRIC DIMER


572
1fhg
A
22
109
2.5e−07
0.09
0.06

TELOKIN; CHAIN: A
CONTRACTILE PROTEIN












IMMUNOGLOBULIN FOLD, BETA












BARREL


572
1tnm

22
107
2.5e−06
0.11
0.11

MUSCLE PROTEIN TITIN











MODULE M5











(CONNECTIN) 1TNM 3











(NMR, MINIMIZED











AVERAGE STRUCTURE)











1TNM 4 1TNM 58


572
2ncm

20
109
  2e−06
0.12
0.30

NEURAL CELL ADHESION
CELL ADHESION NCAM DOMAIN 1;











MOLECULE; CHAIN:
CELL ADHESION, GLYCOPROTEIN,











NULL;
HEPARIN-BINDING, GPI-ANCHOR, 2












NEURAL ADHESION MOLECULE,












IMMUNOGLOBULIN FOLD, SIGNAL


572
3ncm
A
22
109
  5e−07
−0.09
0.12

NEURAL CELL ADHESION
CELL ADHESION PROTEIN NCAM











MOLECULE, LARGE
MODULE 2; CELL ADHESION,











ISOFORM; CHAIN: A;
GLYCOPROTEIN, HEPARIN-












BINDING, GPI-ANCHOR, 2 NEURAL












ADHESION MOLECULE,












IMMUNOGLOBULIN FOLD,












HOMOPHILIC 3 BINDING, CELL












ADHESION PROTEIN


580
1cfe

60
219
5.1e−42
0.22
1.00

PATHOGENESIS-RELATED
PATHOGENESIS-RELATED PROTEIN











PROTEIN P14A; CHAIN:
PATHOGENESIS-RELATED LEAF











NULL;
PROTEIN 6, ETHYLENE












PATHOGENESIS-RELATED












PROTEIN, PR-1 PROTEINS, 2 PLANT












DEFENSE


580
1cfe

61
219
5.1e−42


75.76
PATHOGENESIS-RELATED
PATHOGENESIS-RELATED PROTEIN











PROTEIN P14A; CHAIN:
PATHOGENESIS-RELATED LEAF











NULL;
PROTEIN 6, ETHYLENE












PATHOGENESIS-RELATED












PROTEIN, PR-1 PROTEINS, 2 PLANT












DEFENSE


580
1qnx
A
58
220
1.7e−42
0.33
1.00

VES V 5; CHAIN: A;
ALLERGEN ANTIGEN 5; ANTIGEN 5,












ALLERGEN, VESPID VENOM


594
1def

63
229
3.4e−46


62.32
PEPTIDE DEFORMYLASE;
HYDROLASE HYDROLASE, ZINC











CHAIN: NULL;
METALLOPROTEASE


595
1c44
A
92
211
8.5e−37
0.82
0.99

STEROL CARRIER
LIPID BINDING PROTEIN NON











PROTEIN 2; CHAIN: A;
SPECIFIC LIPID BINDING PROTEIN;












STEROL CARRIER PROTEIN, NON












SPECIFIC LIPID TRANSFER












PROTEIN, 2 FATTY ACID BINDING,












FATTY ACYL COA BINDING


596
1edh
A
46
166
3.4e−17
−0.18
0.25

E-CADHERIN; CHAIN: A,
CELL ADHESION PROTEIN











B;
EPITHELIAL CADHERIN DOMAINS 1












AND 2, ECAD12; CADHERIN, CELL












ADHESION PROTEIN, CALCIUM












BINDING PROTEIN


596
1edh
A
51
164
1.7e−23
0.24
0.98

E-CADHERIN; CHAIN: A,
CELL ADHESION PROTEIN











B;
EPITHELIAL CADHERIN DOMAINS 1












AND 2, ECAD12; CADHERIN, CELL












ADHESION PROTEIN, CALCIUM












BINDING PROTEIN


596
1ncg

45
138
7.5e−17
0.44
0.93

N-CADHERIN; 1NCG 3
CELL ADHESION PROTEIN












CADHERIN 1NCG 13


596
1nci
B
45
140
  5e−16
−0.22
0.63

N-CADHERIN; 1NCI 3
CELL ADHESION PROTEIN












CADHERIN 1NCI 13


596
1ncj
A
45
164
  5e−20
0.22
0.86

N-CADHERIN; CHAIN: A;
CELL ADHESION PROTEIN CELL












ADHESION PROTEIN


596
1ncj
A
45
165
3.4e−19
−0.11
0.36

N-CADHERIN; CHAIN: A;
CELL ADHESION PROTEIN CELL












ADHESION PROTEIN


596
1suh

44
144
  5e−24


55.29
EPITHELIAL CADHERIN;
CELL ADHESION UVOMORULIN;











CHAIN: NULL;
CADHERIN, CALCIUM BINDING,












CELL ADHESION


596
1suh

45
144
3.4e−09
0.31
0.59

EPITHELIAL CADHERIN;
CELL ADHESION UVOMORULIN;











CHAIN: NULL;
CADHERIN, CALCIUM BINDING,












CELL ADHESION


596
1suh

45
144
  5e−24
0.66
0.98

EPITHELIAL CADHERIN;
CELL ADHESION UVOMORULIN;











CHAIN: NULL;
CADHERIN, CALCIUM BINDING,












CELL ADHESION


598
1hcn
B
15
126
  1e−45


94.05
HORMONE HUMAN











CHORIONIC











GONADOTROPIN 1HCN 3


598
1hcn
B
16
125
  1e−45
0.10
1.00

HORMONE HUMAN











CHORIONIC











GONADOTROPIN 1HCN 3


598
1hcn
B
17
126
1.7e−43
−0.14
1.00

HORMONE HUMAN











CHORIONIC











GONADOTROPIN 1HCN 3


624
1a14
H
20
139
5.1e−14


61.90
NEURAMINIDASE; CHAIN:
COMPLEX (ANTIBODY/












ANTIGEN)











N; SINGLE CHAIN
COMPLEX (ANTIBODY/ANTIGEN),











ANTIBODY; CHAIN: H, L;
SINGLE-CHAIN ANTIBODY, 2












GLYCOSYLATED PROTEIN


624
1a2y
A
19
134
3.4e−28


57.90
MONOCLONAL
COMPLEX











ANTIBODY D1.3; CHAIN:
(IMMUNOGLOBULIN/HYDROLASE)











A, B; LYSOZYME; CHAIN:
COMPLEX











C;
(IMMUNOGLOBULIN/HYDROLASE),












IMMUNOGLOBULIN V 2 REGION,












SIGNAL, HYDROLASE,












GLYCOSIDASE, BACTERIOLYTIC 3












ENZYME, EGG WHITE


624
1a7q
L
19
132
3.4e−26


56.95
MONOCLONAL
IMMUNOGLOBULIN











ANTIBODY D1.3; CHAIN:
IMMUNOGLOBULIN, VARIANT











L, H;


624
1adq
L
21
141
6.8e−46
0.39
0.95

IGG4 REA; CHAIN: A; RF-
COMPLEX











AN IGM/LAMBDA; CHAIN:
(IMMUNOGLOBULIN/












AUTOANTIGEN)











H, L;
COMPLEX












(IMMUNOGLOBULIN/












AUTOANTIGEN),












RHEUMATOID FACTOR 2 AUTO-












ANTIBODY COMPLEX


624
1ao7
D
20
142
2.3e−19


59.00
HLA-A 0201; CHAIN: A;
COMPLEX (MHC/VIRAL











BETA-2 MICROGLOBULIN;
PEPTIDE/RECEPTOR) HLA-A2











CHAIN: B; TAX PEPTIDE;
HEAVY CHAIN; CLASS I MHC, T-











CHAIN: C; T CELL
CELL RECEPTOR, VIRAL PEPTIDE, 2











RECEPTOR ALPHA;
COMPLEX (MHC/VIRAL











CHAIN: D; T CELL
PEPTIDE/RECEPTOR











RECEPTOR BETA; CHAIN:











E;


624
1ap2
A
19
133
3.4e−31


57.34
MONOCLONAL
IMMUNOGLOBULIN VARIABLE











ANTIBODY C219; CHAIN:
DOMAIN; SINGLE CHAIN FV,











A, B, C, D;
MONOCLONAL ANTIBODY, C219, P-












GLYCOPROTEIN, 2












IMMUNOGLOBULIN


624
1aqk
L
22
141
3.4e−50
0.07
0.95

FAB B7-15A2; CHAIN: L, H;
IMMUNOGLOBULIN HUMAN FAB,












ANTI-TETANUS TOXOID, HIGH












AFFINITY, CRYSTAL 2 PACKING












MOTIF, PROGRAMMING












PROPENSITY TO CRYSTALLIZE, 3












IMMUNOGLOBULIN


624
1ar1
D
19
129
  1e−24


57.17
CYTOCHROME C
COMPLEX











OXIDASE; CHAIN: A, B;
(OXIDOREDUCTASE/ANTIBODY)











ANTIBODY FV
CYTOCHROME AA3, COMPLEX IV,











FRAGMENT; CHAIN: C, D;
FERROCYTOCHROME C, COMPLEX












(OXIDOREDUCTASE/ANTIBODY),












ELECTRON TRANSPORT, 2












TRANSMEMBRANE, CYTOCHROME












OXIDASE, ANTIBODY COMPLEX


624
1b0w
A
19
127
1.7e−29


58.09
BENCE-JONES KAPPA I
IMMUNE SYSTEM BENCE-JONES;











PROTEIN BRE; CHAIN: A,
IMMUNOGLOBULIN, AMYLOID,











B, C;
IMMUNE SYSTEM


624
1bfv
L
19
135
8.5e−28


57.62
FV4155; CHAIN: L, H;
IMMUNOGLOBULIN












IMMUNOGLOBULIN, FV












FRAGMENT, STEROID HORMONE, 2












FINE SPECIFICITY


624
1bjm
A
21
142
5.1e−45
0.33
0.90

LOC - LAMBDA 1 TYPE
IMMUNOGLOBULIN BENCE-JONES











LIGHT-CHAIN DIMER;
PROTEIN; 1BJM 8 BENCE JONES,











1BJM 6 CHAIN: A, B; 1BJM 7
ANTIBODY, MULTIPLE












QUATERNARY STRUCTURES 1BJM












13


624
1bvk
A
19
135
5.1e−32


61.61
HULYS11; CHAIN: A, B, D,
COMPLEX (HUMANIZED











E; LYSOZYME; CHAIN: C,
ANTIBODY/HYDROLASE)











F;
MURAMIDASE; HUMANIZED












ANTIBODY, ANTIBODY COMPLEX,












FV, ANTI-LYSOZYME, 2 COMPLEX












(HUMANIZED












ANTIBODY/HYDROLASE)


624
1bww
A
17
132
1.7e−31


61.59
IG KAPPA CHAIN V-I
IMMUNE SYSTEM REIV,











REGION REI; CHAIN: A, B;
STABILIZED IMMUNOGLOBULIN












FRAGMENT, BENCE-JONES 2












PROTEIN, IMMUNE SYSTEM


624
1cd0
A
22
122
1.2e−46
0.57
1.00

JTO, A VARIABLE
IMMUNE SYSTEM











DOMAIN FROM LAMBDA-
IMMUNOGLOBULIN, BENCE-JONES











6 TYPE CHAIN: A, B;
PROTEIN, LAMBDA-6


624
1dlf
L
19
135
3.4e−27


57.84
ANTI-DANSYL
IMMUNOGLOBULIN ANTI-DANSYL











IMMUNOGLOBULIN
FV FRAGMENT FV FRAGMENT,











IGG2A(S); CHAIN: L, H;
IMMUNOGLOBULIN


624
1fgv
L
19
134
1.7e−33


67.61
IMMUNOGLOBULIN FV











FRAGMENT OF A











HUMANIZED VERSION OF











THE ANTI-CD18 1FGV 3











ANTIBODY ’H52’ (HUH52-











AA FV) 1FGV 4


624
1fvc
A
19
136
3.4e−31


64.02
IMMUNOGLOBULIN FV











FRAGMENT OF











HUMANIZED ANTIBODY











4D5, VERSION 8 1FVC 3


624
1igm
L
19
144
  1e−30


62.66
IMMUNOGLOBULIN











IMMUNOGLOBULIN M











(IG-M) FV FRAGMENT











1IGM 3


624
1maj

19
135
1.2e−25


56.76
IMMUNOGLOBULIN











MURINE ANTIBODY 26-10











VL DOMAIN (NMR, 15











ENERGY MINIMIZED











1MAJ 3 STRUCTURES)











1MAJ 4


624
1mel
A
22
145
3.4e−12


58.28
VH SINGLE-DOMAIN
COMPLEX (ANTIBODY/












ANTIGEN)











ANTIBODY; CHAIN: A, B;
CAB-LYS3 COMPLEX; CAMEL











LYSOZYME; CHAIN: L, M;
SINGLE-DOMAIN












ANTI-LYSOZYME,












COMPLEX 2 (ANTIBODY/












ANTIGEN)


624
1rvf
L
20
138
1.2e−31


62.39
HUMAN RHINOVIRUS 14
COMPLEX (COAT











COAT PROTEIN; CHAIN: 1,
PROTEIN/IMMUNOGLOBULIN)











2,3,4; FAB 17-IA; CHAIN:
POLYPROTEIN, COAT PROTEIN,











L, H
CORE PROTEIN, RNA-DIRECTED












RNA 2 POLYMERASE, HYDROLASE,












THIOL PROTEASE,












MYRISTYLATION, 3 COMPLEX












(COAT












PROTEIN/IMMUNOGLOBULIN)


624
1wtl
A
19
127
1.5e−31


60.42
IMMUNOGLOBULIN WAT,











A VARIABLE DOMAIN











FROM IMMUNOGLOBULIN











LIGHT-CHAIN 1WTL 3











(BENCE-JONES PROTEIN)











1WTL 4


624
2cd0
A
23
122
5.1e−47
0.67
1.00

BENCE-JONES PROTEIN
IMMUNE SYSTEM











WIL, A VARIABLE
IMMUNOGLOBULIN, BENCE-JONES











DOMAIN FROM CHAIN: A,
PROTEIN, LAMBDA-6











B;


624
2fb4
L
20
142
1.7e−46
0.21
0.86

IMMUNOGLOBULIN











IMMUNOGLOBULIN FAB











2FB4 4


624
2imn

19
127
6.8e−33


60.82
IMMUNOGLOBULIN











IMMUNOGLOBULIN VL











DOMAIN (VARIABLE











DOMAIN OF KAPPA 2IMN











3 LIGHT CHAIN) OF











MCPC603 MUTANT IN











WHICH 2IMN 4











COMPLEMENTARITY-











DETERMINING REGION I











HAS BEEN REPLACED BY











2IMN 5 THAT FROM











MOPC167 2IMN 6


624
2mcg
1
21
142
1.7e−52
0.32
0.72

IMMUNOGLOBULIN











IMMUNOGLOBULIN











LAMBDA LIGHT CHAIN











DIMER (MCG) 2MCG 3











(TRIGONAL FORM) 2MCG 4


624
2rhe

20
140
1.2e−44


68.90
IMMUNOGLOBULIN











BENCE-JONES PROTEIN











(LAMBDA, VARIABLE











DOMAIN) 2RHE 4


624
2rhe

21
121
1.2e−44
0.59
1.00

IMMUNOGLOBULIN











BENCE-JONES PROTEIN











(LAMBDA, VARIABLE











DOMAIN) 2RHE 4


624
43c9
A
19
134
1.7e−31


57.86
IMMUNOGLOBULIN
IMMUNOGLOBULIN











(LIGHT CHAIN); CHAIN: A,
IMMUNOGLOBULIN











C, E, G;











IMMUNOGLOBULIN











(HEAVY CHAIN); CHAIN:











B, D, F, H;


624
43c9
B
18
140
5.1e−15


61.03
IMMUNOGLOBULIN
IMMUNOGLOBULIN











(LIGHT CHAIN); CHAIN: A,
IMMUNOGLOBULIN











C, E, G;











IMMUNOGLOBULIN











(HEAVY CHAIN); CHAIN:











B, D, F, H;


624
7fab
L
21
141
1.4e−43
0.29
0.21

IMMUNOGLOBULIN











IMMUNOGLOBULIN FAB’











NEW (LAMBDA LIGHT











CHAIN) 7FAB 3


624
8fab
A
23
141
3.4e−44
0.31
0.84

IMMUNOGLOBULIN FAB











FRAGMENT FROM











HUMAN











IMMUNOGLOBULIN IGG1











(LAMBDA, HIL) 8FAB 3


627
1fxx
A
224
454
6.8e−97
0.07
1.00

EXONUCLEASE I; CHAIN:
HYDROLASE











A;
EXODEOXYRIBONUCLEASE I;












ALPHA-BETA DOMAIN, SH3-LIKE












DOMAIN, DNAQ SUPERFAMILY


636
1ao7
E
74
194
3.4e−53
0.46
1.00

HLA-A 0201; CHAIN: A;
COMPLEX (MHC/VIRAL











BETA-2 MICROGLOBULIN;
PEPTIDE/RECEPTOR) HLA-A2











CHAIN: B; TAX PEPTIDE;
HEAVY CHAIN; CLASS I MHC, T-











CHAIN: C; T CELL
CELL RECEPTOR, VIRAL PEPTIDE, 2











RECEPTOR ALPHA;
COMPLEX (MHC/VIRAL











CHAIN: D; T CELL
PEPTIDE/RECEPTOR











RECEPTOR BETA; CHAIN:











E;


636
1ao7
E
74
217
3.4e−53


80.28
HLA-A 0201; CHAIN: A;
COMPLEX (MHC/VIRAL











BETA-2 MICROGLOBULIN;
PEPTIDE/RECEPTOR) HLA-A2











CHAIN: B; TAX PEPTIDE;
HEAVY CHAIN; CLASS I MHC, T-











CHAIN: C; T CELL
CELL RECEPTOR, VIRAL PEPTIDE, 2











RECEPTOR ALPHA;
COMPLEX (MHC/VIRAL











CHAIN: D; T CELL
PEPTIDE/RECEPTOR











RECEPTOR BETA; CHAIN:











E;


636
1bd2
E
74
194
1e−55
0.58
1.00

HLA-A 0201; CHAIN: A;
COMPLEX (MHC/VIRAL











BETA-2 MICROGLOBULIN;
PEPTIDE/RECEPTOR) HLA A2











CHAIN: B; TAX PEPTIDE;
HEAVY CHAIN; COMPLEX











CHAIN: C; T CELL
(MHC/VIRAL PEPTIDE/RECEPTOR)











RECEPTOR ALPHA;











CHAIN: D; T CELL











RECEPTOR BETA; CHAIN:











E;


636
1bd2
E
74
217
1e−55


62.05
HLA-A 0201; CHAIN: A;
COMPLEX (MHC/VIRAL











BETA-2 MICROGLOBULIN;
PEPTIDE/RECEPTOR) HLA A2











CHAIN: B; TAX PEPTIDE;
HEAVY CHAIN; COMPLEX











CHAIN: C; T CELL
(MHC/VIRAL PEPTIDE/RECEPTOR)











RECEPTOR ALPHA;











CHAIN: D; T CELL











RECEPTOR BETA; CHAIN:











E;


636
1bec

74
217
8.5e−57


72.58
14.3.D T CELL ANTIGEN
RECEPTOR T CELL RECEPTOR 1BEC











RECEPTOR; 1BEC 5
14











CHAIN: NULL; 1BEC 6


636
1bec

75
195
8.5e−57
0.51
1.00

14.3.D T CELL ANTIGEN
RECEPTOR T CELL RECEPTOR 1BEC











RECEPTOR; 1BEC 5
14











CHAIN: NULL; 1BEC 6


636
1bwm
A
74
217
1.7e−48


61.36
ALPHA-BETA T CELL
IMMUNE SYSTEM











RECEPTOR (TCR) (D10);
IMMUNOGLOBULIN,











CHAIN: A;
IMMUNORECEPTOR, IMMUNE












SYSTEM


636
1bwm
A
75
185
1.7e−48
0.55
1.00

ALPHA-BETA T CELL
IMMUNE SYSTEM











RECEPTOR (TCR) (D10);
IMMUNOGLOBULIN,











CHAIN: A;
IMMUNORECEPTOR, IMMUNE












SYSTEM


636
1d9k
B
75
185
1.7e−48
0.63
1.00

T-CELL RECEPTOR D10
IMMUNE SYSTEM MHC I-AK; MHC











(ALPHA CHAIN); CHAIN:
I-AK; T-CELL RECEPTOR, MHC











A, E; T-CELL RECEPTOR
CLASS II, D10, I-AK











D10 (BETA CHAIN);











CHAIN: B, F; MHC I-AK A











CHAIN (ALPHA CHAIN);











CHAIN: C, G; MHC I-AK B











CHAIN (BETA CHAIN);











CHAIN: D, H;











CONALBUMIN PEPTIDE;











CHAIN: P, Q;


636
1fyt
E
74
194
3.4e−50
0.39
1.00

HLA CLASS II
IMMUNE SYSTEM HLA-DR1, DRA;











HISTOCOMPATIBILITY
HLA-DR1, DRB1 0101; TCR HA1.7











ANTIGEN, DR CHAIN: A;
ALPHA CHAIN; TCR HA1.7 BETA











HLA CLASS II
CHAIN; PROTEIN-PROTEIN











HISTOCOMPATIBILITY
COMPLEX, IMMUNOGLOBULIN











ANTIGEN, DR-1 CHAIN: B;
FOLD











HEMAGGLUTININ HA1











PEPTIDE CHAIN; CHAIN:











C; T-CELL RECEPTOR











ALPHA CHAIN; CHAIN: D;











T-CELL RECEPTOR BETA











CHAIN; CHAIN: E;


636
1nct

27
93
0.0015
−0.00
0.11

TITIN; CHAIN: NULL;
MUSCLE PROTEIN CONNECTIN,












NEXTM5; CELL ADHESION,












GLYCOPROTEIN,












TRANSMEMBRANE, REPEAT,












BRAIN, 2 IMMUNOGLOBULIN FOLD,












ALTERNATIVE SPLICING, SIGNAL, 3












MUSCLE PROTEIN


636
1tcr
B
72
195
8.5e−55
0.45
1.00

ALPHA, BETA T-CELL
RECEPTOR TCR; T-CELL,











RECEPTOR CHAIN: A, B;
RECEPTOR, TRANSMEMBRANE,












GLYCOPROTEIN, SIGNAL


636
1tcr
B
72
217
8.5e−55


74.81
ALPHA, BETA T-CELL
RECEPTOR TCR; T-CELL,











RECEPTOR CHAIN: A, B;
RECEPTOR, TRANSMEMBRANE,












GLYCOPROTEIN, SIGNAL


645
1a4y
A
52
166
  5e−05
0.05
0.43

RIBONUCLEASE
COMPLEX (INHIBITOR/












NUCLEASE)











INHIBITOR; CHAIN: A, D;
COMPLEX (INHIBITOR/












NUCLEASE),











ANGIOGENIN; CHAIN: B, E;
COMPLEX (RI-ANG),












HYDROLASE 2












MOLECULAR RECOGNITION,












EPITOPE MAPPING, LEUCINE-RICH












3 REPEATS










[0405]









TABLE 6









SEQ ID NO:
Position of the Signal Peptide
Maximum Score
Mean Score


















338
1-23
0.964
0.838


339
1-29
0.941
0.708


340
1-44
0.855
0.504


341
1-19
0.991
0.956


342
1-20
0.965
0.833


343
1-33
0.981
0.884


344
1-26
0.956
0.717


345
1-21
0.990
0.950


346
1-44
0.990
0.644


347
1-29
0.968
0.793


348
1-39
0.986
0.641


349
1-58
0.925
0.495


350
1-20
0.916
0.453


351
1-22
0.943
0.746


352
1-19
0.993
0.953


353
1-49
0.956
0.507


354
1-33
0.988
0.893


355
1-31
0.894
0.613


356
1-16
0.989
0.500


357
1-42
0.872
0.606


358
1-18
0.902
0.649


359
1-24
0.909
0.643


360
1-38
0.930
0.725


361
1-24
0.967
0.918


362
1-22
0.981
0.881


363
1-41
0.987
0.895


364
1-15
0.935
0.811


365
1-75
0.981
0.516


366
1-45
0.954
0.577


367
1-23
0.989
0.965


368
1-33
0.937
0.526


369
1-18
0.898
0.665


370
1-25
0.974
0.872


371
1-39
0.990
0.515


372
1-72
0.963
0.485


373
1-19
0.976
0.950


374
1-43
0.966
0.542


375
1-49
0.994
0.792


376
1-24
0.993
0.937


377
1-39
0.996
0.930


378
1-70
0.938
0.480


379
1-40
0.967
0.492


380
1-43
0.987
0.765


381
1-41
0.977
0.722


382
1-23
0.952
0.651


383
1-19
0.983
0.492


384
1-19
0.987
0.898


385
1-27
0.930
0.465


386
1-26
0.972
0.732


387
1-13
0.923
0.755


388
1-25
0.951
0.738


389
1-34
0.845
0.537


390
1-30
0.967
0.769


391
1-48
0.979
0.568


392
1-43
0.973
0.486


393
1-18
0.956
0.655


394
1-27
0.975
0.831


395
1-44
0.987
0.725


396
1-35
0.969
0.616


397
1-35
0.954
0.759


398
1-20
0.926
0.787


399
1-20
0.974
0.908


400
1-16
0.888
0.686


401
1-28
0.889
0.529


402
1-27
0.973
0.870


403
1-37
0.956
0.698


404
1-25
0.969
0.873


405
1-48
0.985
0.679


406
1-60
0.988
0.525


407
1-11
0.977
0.958


408
1-22
0.953
0.916


409
1-39
0.972
0.817


410
1-29
0.983
0.897


411
1-24
0.917
0.657


412
1-23
0.967
0.856


413
1-49
0.963
0.532


414
1-25
0.928
0.667


415
1-69
0.982
0.489


416
1-38
0.966
0.856


417
1-41
0.971
0.804


418
1-19
0.937
0.870


419
1-15
0.987
0.802


420
1-20
0.925
0.699


421
1-74
0.996
0.456


422
1-40
0.977
0.661


423
1-14
0.967
0.876


424
1-41
0.990
0.724


425
1-23
0.968
0.924


426
1-27
0.882
0.585


427
1-38
0.868
0.535


428
1-17
0.950
0.658


429
1-25
0.971
0.897


430
1-39
0.996
0.868


431
1-20
0.987
0.946


432
1-45
0.991
0.468


433
1-14
0.946
0.864


434
1-49
0.963
0.513


435
1-66
0.981
0.530


436
1-26
0.982
0.896


437
1-32
0.989
0.841


438
1-37
0.972
0.775


439
1-76
0.979
0.577


440
1-42
0.943
0.626


441
1-34
0.993
0.933


442
1-43
0.943
0.527


443
1-33
0.968
0.827


444
1-28
0.995
0.945


445
1-26
0.994
0.932


446
1-41
0.959
0.629


447
1-28
0.988
0.935


448
1-24
0.981
0.776


449
1-25
0.898
0.612


450
1-14
0.943
0.864


451
1-24
0.976
0.925


452
1-13
0.835
0.583


453
1-21
0.896
0.706


454
1-58
0.924
0.495


455
1-39
0.983
0.710


456
1-26
0.971
0.899


457
1-27
0.970
0.898


458
1-53
0.987
0.512


459
1-29
0.964
0.562


460
1-33
0.937
0.698


461
1-24
0.988
0.952


462
1-18
0.995
0.978


463
1-13
0.972
0.733


464
1-25
0.992
0.929


465
1-20
0.987
0.963


466
1-41
0.972
0.714


467
1-19
0.940
0.480


468
1-40
0.993
0.805


469
1-42
0.890
0.551


470
1-11
0.975
0.532


471
1-21
0.942
0.816


472
1-25
0.954
0.816


473
1-66
0.976
0.499


474
1-41
0.983
0.859


475
1-25
0.980
0.906


476
1-15
0.953
0.860


477
1-31
0.995
0.895


478
1-17
0.959
0.867


479
1-22
0.874
0.557


480
1-18
0.981
0.858


481
1-40
0.935
0.478


482
1-22
0.993
0.966


483
1-49
0.987
0.594


484
1-66
0.893
0.506


485
1-25
0.990
0.857


486
1-26
0.985
0.956


487
1-48
0.985
0.571


488
1-17
0.976
0.772


489
1-15
0.932
0.796


490
1-40
0.996
0.972


491
1-60
0.980
0.490


492
1-25
0.941
0.656


493
1-16
0.984
0.949


494
1-47
0.956
0.497


495
1-34
0.971
0.910


496
1-42
0.983
0.683


497
1-45
0.878
0.636


498
1-17
0.961
0.884


499
1-26
0.996
0.922


500
1-20
0.947
0.881


501
1-48
0.940
0.755


502
1-30
0.968
0.777


503
1-32
0.953
0.778


504
1-20
0.963
0.551


505
1-25
0.958
0.928


506
1-56
0.968
0.630


507
1-24
0.933
0.671


508
1-44
0.956
0.803


509
1-47
0.967
0.826


510
1-48
0.992
0.807


511
1-25
0.976
0.909


512
1-36
0.932
0.534


513
1-29
0.973
0.792


514
1-29
0.922
0.662


515
1-32
0.967
0.646


516
1-21
0.933
0.785


517
1-46
0.981
0.714


518
1-44
0.955
0.611


519
1-17
0.950
0.712


520
1-14
0.989
0.917


521
1-27
0.998
0.952


522
1-35
0.969
0.716


523
1-17
0.943
0.681


524
1-21
0.956
0.879


525
1-25
0.985
0.718


526
1-17
0.943
0.794


527
1-29
0.998
0.924


528
1-21
0.986
0.966


529
1-22
0.942
0.465


530
1-73
0.968
0.573


531
1-25
0.872
0.581


532
1-25
0.988
0.947


533
1-18
0.900
0.591


534
1-23
0.975
0.884


535
1-18
0.898
0.719


536
1-43
0.907
0.701


537
1-20
0.989
0.960


538
1-40
0.998
0.990


539
1-35
0.984
0.757


540
1-42
0.977
0.671


541
1-15
0.978
0.902


542
1-17
0.976
0.927


543
1-34
0.957
0.706


544
1-18
0.978
0.937


545
1-49
0.967
0.531


546
1-9
0.806
0.506


547
1-36
0.978
0.657


548
1-19
0.973
0.788


549
1-20
0.964
0.774


550
1-24
0.978
0.709


551
1-21
0.968
0.782


552
1-45
0.998
0.924


553
1-22
0.989
0.960


554
1-49
0.986
0.825


555
1-38
0.959
0.769


556
1-28
0.988
0.744


557
1-20
0.972
0.830


558
1-48
0.957
0.617


559
1-20
0.980
0.902


560
1-17
0.905
0.697


561
1-47
0.995
0.684


562
1-19
0.848
0.605


563
1-20
0.983
0.888


564
1-31
0.977
0.806


565
1-48
0.986
0.542


566
1-38
0.968
0.457


567
1-20
0.972
0.888


568
1-10
0.993
0.569


569
1-34
0.994
0.867


570
1-23
0.904
0.643


571
1-22
0.974
0.877


572
1-17
0.959
0.814


573
1-48
0.946
0.768


574
1-24
0.807
0.500


575
1-19
0.957
0.838


576
1-38
0.988
0.950


577
1-72
0.974
0.510


578
1-31
0.945
0.695


579
1-46
0.992
0.562


580
1-23
0.958
0.866


581
1-25
0.973
0.888


582
1-41
0.981
0.577


583
1-43
0.970
0.727


584
1-32
0.913
0.607


585
1-27
0.962
0.882


586
1-22
0.989
0.887


587
1-28
0.972
0.825


588
1-31
0.990
0.766


589
1-30
0.995
0.964


590
1-24
0.955
0.640


591
1-37
0.977
0.860


592
1-38
0.983
0.775


593
1-18
0.990
0.922


594
1-24
0.993
0.923


595
1-22
0.948
0.754


596
1-22
0.989
0.927


597
1-31
0.979
0.864


598
1-16
0.988
0.968


599
1-55
0.996
0.513


600
1-27
0.977
0.934


601
1-43
0.994
0.918


602
1-45
0.995
0.686


603
1-44
0.965
0.514


604
1-26
0.975
0.807


605
1-30
0.982
0.647


606
1-42
0.982
0.664


607
1-36
0.999
0.992


608
1-66
0.972
0.489


609
1-41
0.901
0.614


610
1-20
0.994
0.976


611
1-21
0.940
0.738


612
1-38
0.991
0.889


613
1-16
0.915
0.719


614
1-57
0.960
0.515


615
1-28
0.974
0.886


616
1-21
0.981
0.911


617
1-30
0.993
0.832


618
1-21
0.993
0.979


619
1-38
0.884
0.655


620
1-25
0.963
0.849


621
1-27
0.954
0.863


622
1-27
0.961
0.767


623
1-14
0.952
0.606


624
1-19
0.972
0.877


625
1-21
0.901
0.545


626
1-25
0.986
0.802


627
1-26
0.895
0.712


628
1-23
0.956
0.836


629
1-19
0.989
0.950


630
1-40
0.967
0.821


631
1-19
0.968
0.923


632
1-44
0.990
0.566


633
1-41
0.922
0.748


634
1-34
0.991
0.758


635
1-36
0.952
0.513


636
1-32
0.968
0.678


637
1-16
0.969
0.917


638
1-19
0.978
0.930


639
1-39
0.982
0.678


640
1-36
0.987
0.866


641
1-24
0.942
0.780


642
1-46
0.963
0.617


643
1-76
0.976
0.542


644
1-49
0.998
0.716


645
1-45
0.996
0.966


646
1-32
0.971
0.914


647
1-25
0.998
0.958


648
1-69
0.984
0.491


649
1-41
0.962
0.555


650
1-19
0.973
0.893


651
1-37
0.968
0.621


652
1-24
0.983
0.949


653
1-40
0.980
0.824


654
1-21
0.953
0.854


655
1-46
0.990
0.503


656
1-45
0.987
0.852


657
1-46
0.826
0.557


658
1-24
0.959
0.869


659
1-20
0.982
0.852


660
1-44
0.894
0.594


661
1-48
0.981
0.692


662
1-61
0.990
0.551


663
1-17
0.992
0.969


664
1-35
0.915
0.516


665
1-29
0.975
0.835


666
1-17
0.924
0.748


667
1-18
0.943
0.843


668
1-33
0.970
0.887


669
1-25
0.980
0.893


670
1-18
0.973
0.922


671
1-26
0.994
0.969


672
1-34
0.961
0.562


673
1-39
0.978
0.791


674
1-17
0.928
0.753










[0406]









TABLE 7











SEQ ID NO:
Chromosomal Location



















1
6



4
14



5
3p



6
1q24.1-25.3



12
13



17
17



19
19



20
19



22
5



23
22



27
5



28
6



36
14



39
11



40
17



44
18



45
4



48
6



49
22q13.1-13.33



51
20



53
6



54
11



55
4



60
3



62
3



64
1



68
6p21.1-21.2



70
12



71
16



72
5



73
19



74
17



78
18



81
5



85
3



87
10



89
5



90
8



92
16



93
4



95
6



99
2



100
6



105
3q



116
10



118
1



121
11



122
12q



126
19



129
1p31.2-32.3



133
17



137
22



140
17



150
15



151
22



153
1



154
1



158
9



160
16



170
15



172
11



174
7



175
2



176
11



177
4p16.3



180
14



184
19



185
7



188
16



189
5



193
6q22.1-22.33



197
11



199
6



207
3q



208
13



210
1



212
13



214
5



216
4



221
5



223
13



225
6p11.2-12.3



226
19



227
6q16-21



228
1q23-24



229
5



230
17



232
5



235
19



238
18



239
5



243
20



249
8



251
4p16.3



255
3



256
20



257
16



260
22q13.1-13.2



262
16



264
17



267
6



270
19



271
19



273
16



275
6



276
14



278
1



280
8



283
6



285
17



286
19



289
17



296
1



301
21



303
1



307
14



312
7



313
16



314
3



315
12q



316
5



317
3



319
13



321
15



325
5



327
9



328
4



329
3



330
15



333
16



336
11











[0407]










TABLE 8












SEQ ID NO: of peptide sequence
Method
Nucleotide location corresponding to first residue of peptide sequence
Location of first nucleotide of codon corresponding to last residue of peptide sequence
Amino acid sequence (A = Alanine, C = Cysteine, D = Aspartic Acid, E = Glutamic Acid, F = Phenylalanine, G = Glycine, H = Histidine, I = Isoleucine, K = Lysine, L = Leucine, M = Methionine, N = Asparagine, P = Proline, Q = Glutamine, R = Arginine, S = Serine, T = Threonine, V = Valine, W = Tryptophan, Y = Tyrosine, X = Unknown, * = Stop codon, / = possible nucleotide deletion, \ = possible nucleotide insertion)



















837
A
583
989
PVETHCALVIGKSPTSRPFHSLVQGRAILGIMLAITVL









GQFFINLNQFDPEVGDEEEQQQSHGEEDPKEQEPARE









LPHES*QHRAHCQPHRPHGDHGSHTQDQGSFPLSLY









LFSSGAPAFAQHVLSGTFCGLCSI





838
A
1
1143
MVVCNWYTLSFSLLFVSVTPDYVPPLGNFDVETLDIT









PHTVTAISAKIRKKGKIERKQKTDGSKTSSSDTLSEEK









NSECDPTPSHRGQLNKEFTGKEEKTSLLLHNSHAFFR









ELDIEVFSILHCGLVTKFILDTEMHTECLAAENHGVV









DGPGVKVQEYHIMSSCYQRLLQIFHGLFAWSGFSQP









ENQNLLYSALHVLSSRLKQGEHSQPLEELLSIYLEHT









ESILKAIEEIAGVGVPELINSPKDASSSTFPTLTRHTFV









VFFRVMMAELEKTVKKIEPGTAADSQQIHEEKLLYW









NMAVRDFSILINLIKVFDSHPVLHVCLKGEEIKSQNSQ









ESTADESEDDMSSQASKSKATEDGEEDEVSAGEKEQ









DSDESYDDSD





839
A
1
2337
TRFRGLRPAVAPWTALLALGLPGWVLAVSATAAAV









VPEQHASVAGQHPLDWLLTDRGPFHRAQEYADFME









RYRQGFTTRYRIYREFARWKVNNLALERKDFFSLPLP









LAPEFIRNIRLLGRRPNLQQVTENLIKKYGTHFLLSAT









LGGEESLTIFVDKQKLGRKTETTGGASHGGSGNSTA









VSLETLHQLAASYFIDRESTLRRLHHIQIATGAIKVTE









TRTGPLGCSNYDNLDSVSSVLVQSPENKVQLLGLQV









LLPEYLRERFVAAALSYITCSSEGELVCKENDCWCKC









SPTFPECNCPDADIQAMEDSLLQIQDSWATHNRQFEE









SEEFQALLKRLPDDRFLNSTAISQFWAMDTSLQHRY









QQLGAGLKVLFKKTHRILRRLFNLCKRCHRQPRFRLP









KERSLSYWWNRIQSLLYCGESTFPGTFLEQSHSCTCP









YDQSSCQGPIPCALGEGPACAHCAPDNSTRCGSCNPG









YVLAQGLCRPEVAESLENFLGLETDLQDLELKYLLQ









KQDSRIEVHSIFISNDMRLGSWFDPSWRKRMLLTLKS









NKYKPGLVHVMLALSLQICLTKNSTLEPVMAIYVNP









FGGSHSESWFMPVNEGSFPDWERTNVDAAAQCQNW









TITLGNRWKTFFETVHVYLRSRIKSLDDSSNETIYYEP









LEMTDPSKNLGYMKINTL\QVFGYSLPFDPD\AIRDLI









LQLDYPYTQGSQDSALLQLIELRDRVNQLSPPGKVRL









DLFSCLLRHRLKLANNEVGRIQSSLRAFNSKLPNPVE









YETGKLCS





840
A
763
244
FLGPRIIGLRHEISVETQDHKSAVRGNNTHDNYENVE









AGPPKAKGKTDKELYENTGQSNFEEHIYGNETSSDY









YNFQKPRPSEVPQDEDIYILPDSY*L/CQNIDFCYWMI









NIHCNFSTAKTRNQTKC*STVDWIKKMWYTYTIEYY









AAVKK/DTKLTWEQKIKYHIFSLKSGS





841
A
263
467
LQSRPLGAWRGPPHGKQPHGPHCPVGPASPSCWSWP









PWLRSSSSPPLCWLNACSAVLSAQTPATVHPP





842
B
1
5652
MGRGGLTRAGFARAWATRSWRPLAGERAQRRGRTP









SPHPRAAGGRHCYLGPRFQSPWQRSRSVRDAAAESG









DGGGAGMRMYDGAAGQNDGIQTRNRFIQCPRVRST









WGYWEAQGFTFAMEESTGTPTCSPASGWASPPELW









EEPIPSYTSGPRLSELLWRNMLGVVYANARLLGLYRF









LQENVLTHTRSVLKSLIQQTPETSSFILCLSHRSAPPHT









GQPQRQDPVPILLSALLSHLTTTHAVVQLMLTSDGLG









CHLPPLPTTVAVVFTMAVSSVLASGSPGVCQTPSTSP









SSYLQRAAVLQLLDNLPAPYHSAQGKSTVAVEIFSS









WLPWQGCWVASLPQMYIILLKPERNAPDWPHDLGS









RTRWAGVRGGLETTHRAAPLRKRGLRHRLGASPSG









ALPRSVRVLVAISALVFDSLLSRSGGFVPLKFTAPGD









MPNGPTRQPGKRLSRTIWRPPGRRRKSPPGYELQQTP









RRAPVDAGKAAAMSAAEVRPLLRLWVRRSAGLGGS









GEPAGRGGVRLGFAVFLGSAWSISPRDGSHQPVGLW









GRAVRLEVLFYFSCIDLIMIGAFHSALCSWPISLLNSR









PDPSPTQSPHHPPPNQPSQPSSLPLPCIPSALPFTLPQL









VYYGTPPCGILAPNALPTPEPSLTGAVNTSSMSLSIRL









SGKALLPIPAIEFGARLQISVPMPGPPQIAHGGAQILLT









SHVDLIPFTSSQALFIFQKQPPSVQDSHQLSKPRDTEA









QRLRFRQFQYQWRAGRTSRWASSGRCAPVAEARGA









LQGADAGAVVLEQFLGALPSKMRTWCSHRAPEAAG









RPPAWWGPHTDVPAGRGAAGMFGESLPPQKAWRIG









GSGKNRPLYDPFLPHSSGISGLGRTQDWSFGEEEDGK









SPRSRKNHRRACVPLLGRPQGCPIKDCVRLSPALPVF









PHPDYAAASGVDRERPPGSPGTCSSTRTSGVGQRRR









HDRKSIAKPEVISLLEQGEEPCQWSRHVLNALVQPC









WITTSQTPDVSIDGVIDIFHLCATDISPLQTLTVRETFC









QTCCQLIQRFVNAACATSFRQAKPAAIAIGYRKAYPP









DRPDPSGPGIHYITTTTECAYRHSTPMILPRQLKPRSR









FNHAVVPTSRWLGAEIAREQIRRISFDHNFSAGINGKS









SRSSSPVVHRKPNRARYNIITVFANCELVKLTPISNILS









PYDNRPDGRLNLFFSFSLSASASFSSSRFRPLQRKSVR









RRHRFQTDSTWDRQQRLREDIRRRQQHPDNECTNQH









IRAFFAISDTIKVSLIGVKKARPRRWRSSLNLAATADQ









WRGYQMRDKRLSIFYRQFTVYHSTKGKVQRFSVSV









RSRFYHRLWRGFWLRSGELACVCAAAGVTPSAQLH









YLSPPGYGWCALIARLFRIVCNVGTLHRPAAWRSSSI









DAFSSGVIELSDHHRLCRQAFMRLSVSRNRKGTSSKF









ITGDNRQSYSSRTFPYRWCRHYAADHRLRVTESAVR









ERFNFSSPLVLSCNKSLNFSPQYVASGQRILTAWSASS









QANEHEIRIRVAESRCPICRVPIDNRSAWSSYILAGITG









CTVISVHLAHTGASLLMLMAAFILVTRLQVLKDIIVK









SKGLSGYDSLRAKCREYAATQVDGQRKTLSVWCVL









GDWSHPYLPWTSKLKPTSSARWAKYRQRSPAQRRE









ASSLRNAAYRRDHYTILGTVKGAELELLRLPIRLWAS









TFRQSSAIRYPGCRYRCRSHRAWPRPDDYVIGQKYG









LETANRLARTALICRALIRRWWRERLQSERHRRCAA









AEKGALLHVEKMQHSYPCAGVTKRRSSSARRRSGR









QHGSERSACAVTERDQGVQWIPDWGQARIESMLLT









VLTGVSPVSAPGVTDVTVRAQRHGRAASASLELMEE









VANALKSMASGVVGIL





843
A
2313
1475
DDGAAHVMHREVWMAVFSYLSHQDLCVCMRVCRT









WNRWCCDKRLWTRIDLNHCKSITPLMLSGIIRRQPVS









LDLSWTNISKKQLSWLINRLPGLRDLVLSGCSWIAVS









ALCSSSCPLLRTLDVQWVEGLKDAQMRDLLSPPTDN









RPGQMDNRSKLRNIVELRLAGLDITDASLRLIIRHMP









LLSKLHLSYCNHVTDQSINLLTAVGTTTRDSLTEINLS









DCNKVTDQCLSFFKRCGNICHIDLRYCKQVTKEGCE









QFIAEMSVSVQFGQV\EEKLLQKLS





844
C
477
737
MDPKRIRKHRTMRMPKKINNRIPKTNVIFCSRDMLES









ADTFPRGTRDEGMGCPPSMVQGSQSTGYPGFLFFHH









QSAEVMVAALWRG





845
A
113
358
MELVRRLMPLTLLILSCLAELTMAEAEGNASCTVSL









GGANMAETHKAMILQLNPSENCTWTIERPENKSIRIIF









CYVQLGSE





846
A
16
381
SLTYLLTSAKKEIELMSEELRGLKSEKQLLSQEGNDL









KLENGSLLSKLVELEAKIALLQGDQQKLWSVNETLN









LEKEKFLEEKQDAEKYYEQEHLNKEALAVEREKLLK









EINVVQEELLKIM





847
A
3
5530
LPHGRTRGPGPAMAPWRKADKERHGVAIYNFQGSG









APQLSLQIGDVVRIQETCGDWYRGYLIKHKMLQGIFP









KSFIHIKEVTVEKRRNTENIIPAEIPLAQEVTTTLWEW









GSIWKQLYVASKKERFLQVQSMMYDLMEWRSQLLS









GTLPKDELKELKQKVTSKIDYGNKILELDLIVRDEDG









NILDPDNTSVISLFHAHEEATDKITERIKEEMSKDQPD









YAMYSRISSSPTHSLYVFVRNFVCRIGEDAELFMSLY









DPNKQTVISENYLVRWGSRGFPKEIEMLNNLKVVFT









DLGNKDLNRDKIYLICQIVRVGKMDLKDTGAKKCTQ









GLRRPFGVAVMDITDIIKGKAESDEEKQHFIPFHPVTA









ENDFLHSLLGKVIASKGDSGGQGLWVTMKMLVGDII









QIRKDYPHLVDRTTVVARKLGFPEIIMPGDVRNDIYIT









LLQGDFDKYNKTTQRNVEVIMCVCAEDGKTLPNAIC









VGAGDKPMNEYRSVVYYQVKQPRWMETVKVAVPI









EDMQRIHLRFMFRHRSSLESKDKGEKNFAMSYVKL









MKEDGTTLHDGFHDLVVLKGDSKKMEDASAYLTLP









SYRHHVENKGATLSRSSSSVGGLSVSSRDVFSISTLV









CSTKLTQNVGLLGLLKWRMKPQLLQENLEKLKIVDG









EEVVKFLQDTLDALFNIMMEHSQSDEYDILVFDALIY









IIGLIADRKFQHFNTVLEAYIQQHFSATLAYKKLMTV









LKTYLDTSSRGEQCEPILRTLKALEYVFKFIVRSRTLF









SQLYEGKEQMEFEESMRRLFESINNLMKSQYKTTILL









QVAALKYIPSVLHDVEMVFDAKLLSQLLYEFYTCIPP









VKLQKQKVQSMNEIVQSNLFKKQECRDILLPVITKEL









KELLEQKDDMQHQVLERKYCVELLNSILEVLSYQDA









AFTYHHIQEIMVQLLRTVNRTVITMGRDHILISHFVA









CMTAILNQMGDQHYSFYIETFQTSSELVDFLMETFIM









FKDLIGKNVYPGDWMAMSMVQNRVFLRAINKFAET









MNQKFLEHTNFEFQLWNNYFHLAVAFITQDSLQLEQ









FSHAKYNKILNKYGDMRRLIGFSIRDMWYKLGQNKI









CFIPGMVGPILEMTLIPEAELRKATIPIFFDMMLCEYQ









RSGDFKKFENEIILKLDHEVEGGRGDEQYMQLLESIL









MECAAEHPTIAKSVENFVNLVKGLLEKLLDYRGVMT









DESKDNRMSCTVNLLNFYKDNNREEMYIRYLYKLR









DLHLDCDNYTEAAYTLLLHTWLLKWSDEQCASQVM









QTGQQHPQTHRQLKETLYETIIGYFDKGKMWEEAIS









LCKELAEQYEMEIFDYELLSQNLIQQAKFYESIMKILR









PKPDYFAVGYYGQGFPSFLRNKVFIYRGKEYERRED









FQMQLMTQFPNAEKMNTTSAPGDDVKNAPGQYIQC









FTVQPVLDEHPRFNKPVPDQIINFYKSNYVQRFHYS









RPVRRGTVDPENEFASMWIERTSFVTAYKLPGILRWF









EVVHMSQTTISPLENAIETMSTANEKILMMINQYQSD









ETLPINPLSMLLNGIVDPAVMGGFAKYEKAFFTEEYV









RDHPEDQDKLTHLKDLIAWQIPFLGAGIKIHEKRVSD









NLRPFHDRMEECFKNLKMKVEKEYGVREMPDFDDR









RVGRPRSMLRSYRQMSIISLASMNSDCSTPSKPTSESF









DLELASPKTPRVEQEEPISPGSTLPEVKLRRSKKRTKR









SSVVFADEKAAAESDLKRLSRKHEFMSDTNLSEHAA









IPLKASVLSQMSFASQSMPTIPALALSVAGIPGLDEAN









TSPRLSQTFLQLSDGDKKTLTRKKVNQFFKTMLASK









SAEEGKQIPDSLSTDL





848
A
3
453
RGIRSWRLFTGCCVNPRKIFPRGHSCRFFYVLGQITLS









SLVAPVMWLSVALLNGTFYECAMSGTRSSGLLELIC









KGKPKECWEELHKVSCGKTSMLPTVNEELKLSLQAQ









SQILGWCLICSASFFSLLTTCYARCRSKVSYLQLSFW









KTY





849
A
2
807
VEFHPQRARAGARAPSMGVLLTQRTLLSLVLALLFPS









MASMAAIGSCSKEYRVLLGQLQKQTDLMQDTSRLL









DPYIRIQGLDVPKLREHCRERPGAFPSEETLRGLGRR









CFLQTLNATLGCVLHRLADLEQRLPKAQDLERSGLNI









EDLEKLQMARPNILGLRNNIYCMAQLLDNSDTAEPT









KAGRGASQPPTPTPASDAFQRKLEGCRFLHGYHRFM









HSVGRVFSKWGESPNRSRRHSPHQALRKGVRRTRPS









RKGKRLMTRGQLPR





850
A
1
2808
MGDFNTPLSTLDRSTRQKVNKDIQELNSALHQADLI









DIYRTLHPKSTEYTFFSAPHHTYSKIDHIVGSKALLSK









CKRTEIITNCLSDHSAIKLELRIKKLTQNRSTTWKLNK









NDYWVHNEMKAEIKMFFETNKNKDTTYQNLWDTF









KAVCRGKFIALNAHKRKQERSKIDTLTSQLKEPEKQE









QTHSKASRRQEITKIRAELKEIETQKTLQKINESRSWF









FERINKIDRPLARLIKKKREKNQIDPIKNDKGDITTDP









TEIQTTIREYYKHLYTNKLENLEEMDKFLDTYTLPRL









NQEEVESLNRPLTGSEIVAIINSLPTKKSPGPDGFTAEF









YQRYKEELVPFLLKLFQSIEKEGILPNSFDEASIILIPKP









GRDSTKKENFRPISLMNIDAKILNKILANRIQQHIKKLI









HHDQVGFIPGMQVWFNIRKSINVIQHINRTKDKNHMI









ISIDAEKAFDTIQQPFMLKTLNKLGIDGMYLKIIRAIY









DKPTANITLNGQKLEAFPLKTGTRQGCPLSPFLFNIVL









GYPSLLLPHTYTPDHVVGPGADIDPTQITFPGCICVKT









PCLPGTCSCLRHGENYDDNSCLRDIGSGGKYAEPVFE









CNVLCRCSDHCRNRVVQKGLQFHFQVFKTHKKGWG









LRTLEFIPKGRFVCEYAGEVLGFSEVQRRIHLQTKSDS









NYIIAIREHVYNGQVMETFVDPTYIGNIGRFLNHSCEP









NLLMIPVRIDSMVPKLALFAAKDIVPEEELSYDYSGR









YLNLTVSEDKERLDHGKLRKPCYCGAKSCTAFLPFD









SSLYCPVEKSNISCGNEKEPSMCGSAPSVFPSCKRLTL









EVDAFPPNGYGLYNIVGNAWEWTSDWWTVHHSVE









ETLNPMVPIPAGVFTMGTDDPQIKQDGEAPARRVTID









AFYMDAYEVSNTEFEKFVNSTGYLTEVPHTTPGDYG









NYNSR





851
A
1
1528
APSPHNRQHKHRRKRRNQCDFRLNLGQRWNFTLPLL









RSHEQSELNSFLWTIKRDPPSYFFGTIHVPYTRVWDFI









PDNSKEAFLQSSIVYFELDLTDPYTISALTSCQMLPQG









ENLQDVLPRDIYCRLKRHLEYVKLMMPLWMTPDQR









GKGLYADYLFNAIAGNWERKRPVWVMLMVNSLTE









VDIKSRGVPVLDLFLAQEAERLRKQTGAVEKVEEQC









HPLNGLNFSQVIFALNQTLLQQESLRAGSLQIPYTTED









LIKHYNCGDLSSVILSHDSSQVPNFINATLPPQERITA









QEIDSYLRRELIYKRNERIGKRVKALLEEFPDKGFFFA









FGAGHFMGNNTVLDVLRREGYEVEHAPAGRPIHKG









KSKKTSTRPTLSTIFAPKVPTLEVPAPEAVSSGHSTLP









PLVSRPGSADTPSEAEQRFRKKRRRSQRRPRLRQFSD









LWVRLEESDIVPQLQVPVLDRHISTELRLPRRGHSHH









SQMVASSACLSLWTPVFWVLVLAFQTETPLL





852
A
2
409
ALQSTLGAVWLGLLLNSLWKVAESKDQVFQPSTAA









SSEGAVVEIFCNHSVSNAYNFFWYLHFPGCAPRLLV









KGSKPSQQGRYNMTYERFSSSLLILQVREADAAVYY









CAVEVPNTDKLIFGTGTRLQVFPNIQNPD





853
B
418
1620
METIAFTGPLARGCRPRLQPTCHIEQVLRCCWKAWK









SICNCCFFLFFHSSSFSSSSTTGILFVILFVITVECLTLW









RPGNPTYLRSCKGQVNRQEVQKESHLPPPERTEQNP









YTPRRSQQASGKCQSSPQQTRHMGCSEVHARVLQA









GAHTRVRSLFWTGTSALAGYPGVPDGERVESGSPGT









VETREMHRYGVLQHPGEHAGSQRGKSSEASRRNGT









DLALKEEAGLPVADAGPTRGSSKRKAEIKPLESWHS









HGRSEPEENQGSGGAGCALTGATDLGCVPRLPYKGY









SVWVPAPLSGLSDWLQLQLLPVTEQQGPVSFPKDFR









QSPVLLAILLVSTLGKLGAGDSETTVQTSSPSQAQPT









GCDPTETPPQQVIHSVLLASLGFLITCCFRREQSTLE





854
A
385
3
RNRSVVPEFVLLGLSAGPQTQTLLFVLFVVICLLTVM









GNLLLLVVINADSCLHTPMYFFLGQLSFLDLCHSSVT









APKLLENLLSEKKTISVEGCMA*VFFVFATGGTESSL









LAVMAYDRYVAIRTRG





855
A
1674
1839
VVRVTCCPPARSTTERTNAYDEEDCVEMVASGGWN









DVACHTTMYFMCEFDKKNM





856
A
1
318
GFGAARVRSLFKEARARAPCIVYIDEIDAVGKKRSTT









MSGFSNTEEEQTLNQLLVEMDGASLDQLPSQGTMRK









LRGKTPACSCLTEPTGSRRAMEGHSLCWGCLLH





857
A
2
462
EFQEAAKLYHTNYVRNSRAIGVLWAIFTICFAIVNVV









CFIQPYWIGDGVDTPQAGYFGLFHYCIGNGFSRELTC









RGSFTDFSTLPSGAFKAASFFIGLSMMLIIACIICFTLFF









FCNTATVYKICAWMQLTSAACLVLGCMIFPDGWDS









DEVN





858
A
997
758
MIQLFFVLYGILALAFLSGYYVTLAAQILAVLLPPVM









LLIDGNVAYWHNTRRVEFWNQMKLLGESVGIFGTA









VILATDG*





859
A
126
392
ARGSKHTGLIAQWAHEQSGHGGRAGGYAWAQQHG









LPLTKADLP\AMATAECPICQQQRPTLSPRYGTIPW/W









AWDAPGGRGCWRLQKAGE





860
A
405
3
LSLLLVPTASFCKSPTISQTLIKVNHSTGVRAVRNSLP









FIIFCWEKVQGTSHSVGTRAKLPHHGNTLPTHSST*Q









QAILPRPLPPRPISTPACKRWWALALGWFPTSVGVML









DIKPAFPVELPSVCMSFFNPC





861
B
1
1575
MRSPCVNKIQGLEPNAKPVLSPATLQLLALMYPVPF









KEMWSRKGNAYTADSGAENYQSLIAGSCDTRGLGY









VGGVAFGMMTTGNNEWPRAQISVGTQDRDGSGNIG









LQMHFRFMASGQISYAGPQGAGSFQQVPASDRNIKH









SIKDDESSIAYNNIKSMRFRNFIFNDDEQERLGARRH









NQHHLVAMVDTESPLCPLSPLEAGDLESPLSEEFLQE









MGNIQEISQSIGEDSSGSFGFTEYQYLGSCPGSDGSVI









TGSIQQTAAVEKEEYGAAATEGHVALLGPSSTIHNKI









TNRTHDWGALALDGGPVPRCSGCRVQVVGWLCIPP









RKAKCLTQREHYPFYKVINTGCQVPHLKKNLDHLLG









MGSSRAWLPSTQIHRWRGAAGGPRTSRFLRATPERL









QPLAAPGKASRPMWPRPRAPPETSGAHRGLGKILRD









LTAQGRTHSSHKQLEMDGLAEKGNPSTNPAPGPTLQ









QVCPCQSPRGAQPDPDRWPGGTQGFGQPLSWAQVS









WRSQGRQGLPPLATYPGDA





862
A
3
690
SLSLPSSWDYRRAPPRPANFVFLVETGFTMLTTIVLIS









*PHHPPASASQNAGIV/GVSHSPRPGIACLITRLDF/HQ









GTQSRVPALGSS*PIPHARPRGHTLKALPYRLYTQTA









RGFGQPRVRIPLLSLLGSLKPSELRGQVGHAFASAIFA









SAISLCVFTLLVEGKLKPPGASVRRCLQSNRDLGFRA









TFPTSHRGHGALGLNYISQRATGTPNLWLPFRFRVVL









LQAACG





863
A
1073
480
XXPDALSTVAEXPGRPTRPPTRTAAPWPRPGCSSASA









PPTPASAPWPASPSSSSGRWSTDSRGPRPWEGSQGC









WHCGSW*RT*CTCKIIGGPGSRGCAASSSWASSSRPS









PSLPSAPSSCWPSPGIRASQTPPATTSPASGASFPSSGP









SCSASMPTATGLTLLTSASSAISDPGGSVYA*SGMVH









QSGKEPSTVYTS





864
A
31
390
MVLPLPWLSRYHFLRLLLPSWSLAPQGSHGCCSQNP









KASMEEQTNSRGNGKMTSPPRGPGTHRTAELARAEE









LLEQQLELYQALLEGQEGAWEAQALVLKIHKLKEQ









MRRHQESLGGGA*





865
A
841
1209
SPARGKSNRTDVMITAPKNKKMTENLAAPEALDSST









HSSSTATQSRAKMNTPAPTPSTVPAIPRGGSGGPPPC









APHDRVSSVLQCDTQAMDHKTESSHSVVEFLFKRTK









TPSPFHPAVRENRN





866
A
5157
2939
AVRAEPGLEELSSGLRAHSPSATTVCEPEAQGSASGC









RYAAHPHWGLGGAAAAGGSWEPQPPRPVCEPAGRG









KPHPPAAPRSPLLPGSRRRPHAAQPGARARTSPPPAS









ARNMAARPAATLAWSLLLLSSALLREGCRARFVAER









DSEDDGEEPVVFPESPLQSPTVLVAVLARNAAHTLPH









FLGCLERLDYPKSRMAIWAATDHNVDNTTEIFREWL









KNVQRLYHYVEWRPMDEPESYPDEIGPKHWPTSRFA









HVMKLRQAALRTAREKWSDYILFIDVDNFLTNPQTL









NLLIAENKTIVAPMLESRGLYSNFWCGITPKGFYKRT









PDY\VQIREWKRTGCFPVPMVHSTFLIDLRKEASDKL









TFYPPHQDYTWTFDDIIVFAFSSRQAGIQMYLCNREH









YGYLPIPLKPHQTLQEDIENLIHVQIEAMIDRPPMEPS









QYVSVVPKYPDKMGFDEIFMINLKRRKGQGGDRWL









RTLYEQEIEVKIVEAVDGKALNTSQLKALNIEMLPGY









RDPYSSRPLTRGEIGCFLSHYSVWKEVIDRELEKTLVI









EDDVRFEHQFKKKLMKLMDNIDQAQLDWELIYIGR









KRMQVKEPEKAVPNVANLVEADYSYWTLGYVISLE









GAQKLVGANPFGKMLPVDEFLPVMYNKHPVAEYKE









YYESRDLKAFSAEPLLIYPTHYTGQPGYLSDTETSTI









WDNETVATDWDRTHAWKSRKQSRIYSNAKNTEALP









PPTSLDTVPSRDEL





867
A
1
2088
PTQSTRRIATVSIAAAVAPLTLFLYRGDGGLSSRRRA









DAAAGALCGEVAVKPPINPFTEFMEKAVNDGSHSEE









LFCHLKTISEKEDLPRCTSESHLSCLKQDILNEKTELE









ATLKEAELVTHSVELLLPLPKDTIEKINFENANLSALN









LKISEQKEILIKELDTFKSVKLALEHLLRKRDYKQTG









DNLSSMLLENLTDNESENTNLKKKVFEKEAHIQELSC









LFQSEKANTLKANRFSQSVKVVHERLQIQIHKREAEN









DKLKEYVKSLETKIAKWNLQSRMNKNEAIVMKEAS









RQKTVALKKASKVYKQRLDHFTGAIEKLTSQIRDQE









AKLSETISASNAWKSHYEKIVIEKTELEVQIETMKKQI









INLLEDLKKMEDHGKNSCEEILRKVHSIEHENETLNL









ENTKLKLRFPCRITESKNMNILIVLDMLCYISSEKTTL









AALKDEVVSVENELSELQEVEKKQKTLIEMYKTQVQ









KLQEAAEIVKSRCENLLHKNNQITKTKNKNVEKMRG









QMESHLKELERVCDSLTAAERRLHECQESLQCCKGK









CADQEHTIRELQGQVDGNHNLLTKLSLEEENCLIQLK









CENLQQKLEQMDAENKELEKKLANQEECLKHSNLK









FKEKSAEYTALARQLEAALEEGRQKVAEEIEKMSSR









ESALQIKILDLETELRKKNEEQNQLVCKMNSDPETP





868
A
749
1020
VLVRDPSQPAQPFSVSFSPQKHRDEKLYFLPKGVSGG









SELRGRPQPYLPCPVSPTLCPWGHLSLAPPSVPPTACE









SSSELWPSLSWTWAE





869
A
114
549
RPLVLPELGSAAGLLRLETPSQLRPNPKAMNSGVCLC









VLMAVLAAGALTQPVPPADPAGSGLQRAEEAPRRQL









RVSQRTDGESRAHLGALLARYIQQARKAPSGRMSTV









KNLQNLDPSHRISDRDYMGWMDFGRRSAEEYEYPS





870
C
169
423
MDGDLQGPRIPRRSVLVVHETGLRTLIMDHTARGDT









GNPLLLGSGGRGEWQPPQAPFIAQVPRNKLSSKKKG









DTVEGKLPPTQP





871
A
54
410
MPTTPVAYNSLGAVIGIAVLGSLVVALVALFIGYRH









WQKGKEHHHLAVAYSSGRLDGSEYVMPDVPPSYSH









YYSNPSYHTLSQCSPNPPPPNKVPGPLFASLQKPERPG









GAQGHDNHTT





872
A
1
542
LPGAGHRRVLDAGGPRGAGLQPQLPARQVGAVAEL









HVSGPPGAGLA/GSGSGASGVGLGAAGWGSGPRGVR









AEGEGAYSGPGQVFPVQGNVGNADAGTTGVGVPAG









WWPPLPTRLQTLSVASPWLCP*AAASARSPPSGLSGE









*TLFYTFSFLPPVVIAASPPAGLASEARPCFPRFHSYP





873
A
131
677
PSSLS/CDIFLRSPISTPSPSPLPRTPTSTPVHVKQGTAG









SVINNPYVIMDKQPGQVIGATTPSTGSPTNKISTASQI









SQGTGSPVPKIHGSSFVTSTVKVIIKQEPGEAPHVPAT









GAASQSPLPQYVTVKGGHMIAVSPQKQVITPGEGIAQ









SAKVQPSKVL/GQIG*CLPTLARADLLYSVC





874
A
1617
4994
MWFLFLCPNLWAMPVQIIMGVILLYNLLGSSALVGA









AVIVLLAPIQYFIATKLAEAQKSTLDYSTERLKKTNEI









LKGIKLLKLYAWEHIFCKSVEETRMKELSSLKTFALY









TSLSIFMNAAIPIAAVLATFVTHAYASGNNLKPAEAF









ASLSLFHILVTPLFLLSTVVRFAVKAIISVQKLNEFLLS









DEIGDDSWRTGESSLPFESCKKHTGVQPKTINRKQPG









RYHLDSYEQSTRRLRPAETEDIAIKVTNGYFSWGSGL









ATLSNIDIRIPTGQLTMIVGQVGCGKSSLLLAILGEMQ









TLEGKVHWSNVNESEPSFEATRSRNRYSVAYAAQKP









WLLNATVEENITFGSPFNKQRYKAVTDACSLQPDIDL









LPFGDQTEIGERGINLSGGQRQRICVARALYQNTNIV









FLDDPFSALDIHLSDHLMQEGILKFLQDDKRTLVLVT









HKLQYLTHADWIIAMKDGSVLREGTLKDIQTKDVEL









YEHWKTLMNRQDQELEKDMEADQTTLERKTLRRA









MYSREAKAQMEDEDEEEEEEEDEDDNMSTVMRLRT









KMPWKTCWRYLTSGGFFLLILMIFSKLLKHSVIVAID









YWLATWTSEYSINNTGKADQTYYVAGFSILCGAGIF









LCLVTSLTVEWMGLTAAKNLHHNLLNKIILGPIRFFD









TTPLGLILNRFSADTNIIDQHIPPTLESLTRSTLLCLSAI









GMISYATPVFLVALLPLGVAFYFIQKYFRVASKDLQE









LDDSTQLPLLCHFSETAEGLTTIRAFRHETRFKQRML









ELTDTNNIAYLFLSAANRWLEVRTDYLGACIVLTASI









ASISGSSNSGLVGLGLLYALTITNYLNWVVRNLADLE









VQMGAVKKVNSFLTMESENYEGTMDPSQVPEHWPQ









EGEIKIHDLCVRYENNLKPVLKHVKAYIKPGQKVGIC









GRTGSGKSSLSLAFFRMVDIFDGKIVIDGIDISKLPLHT









LRSRLSIILQDPILFSGSIRFNLDPECKCTDDRLWEALE









IAQLKNMVKSLPGGLDAVVTEGGENFSVGQRQLFCL









ARAFVRKSSILIMDEATASIDMATENILQKVVMTAFA









DRTVVTMAHRVSSIMDAGLVLVFSEGILVECDTVPN









LFAHKNGPFSTLVMTNK*





875
A
3
1004
KYSGVHFNSQSIAPTIEQIDQSFGATHPGVYNSAEQLF









HLNFRGLSFSFQLDSWTEAPKYEPNFAHGLASLQIPH









GATVKRMYIYSGNSLQDTKAPMMPLSCFLGNVYAE









SVDVLRDGTGPAGLRLRLLAAGCGPGLLADAKMRV









FERSVYFGDSCQDVLSMLGSPHKVFYKSEDKMKIHS









PSPHKQVPSKCNDYFFNYFTLGVDILFDANTHKVKK









FVLHTNYPGHYNFNIYHRCEFKIPLAIKKENADGQTE









TCTTYSKWDNIQELLGHPVEKPVVLHRSSSPNNTNPF









GSTFCFGLQRMIFEVMQNNHIASVTLYGPPRPGSHLR









TAELP





876
A
485
717
EPLLLLYLVSKIRTSGSNPLRSTFLSRSRSISKNGDPGA









ASAACSRTARGPLSSGSNSRRRTEAKRWLRQQQHCE









AF





877
A
2
1828
NYKTLIIICALFTLVTVLLWNKCSSDKAIQFPRRSSSG









FRVDGFEKRAAASESNNYMNHVAKQQSEEAFPQEQ









QKAPPVVGGFNSNVGSKVLGLKYEEIDCLINDEHTIK









GRREGNEVFLPFTWVEKYFDVYGKVVQYDGYDRFE









FSHSYSKVYAQRAPYHPDGVFMSFEGYNVEVRDRV









KCISGVEGVPLSTQWGPQGYFYPIQIAQYGLSHYSKN









LTEKPPHIEVYETAEDRDKNKPNDWTVPKGCFMAN









VADKSRFTNVKQFIAPETSEGVSLQLGNTKDFIISFDL









KFLTNGSVSVVLETTEKNQLFTIHYVSNAQLIAFKER









DIYYGIGPRTSWSTVTRDLVTDLRKGVGLSNTKAVK









PTKIMPKKVVRLIAKGKGFLDNITISTTAHMAAFFAA









SDWLVRNQDEKGGWPIMVTRKLGEGFKSLEPGWYS









AMAQGQAISTLVRAYLLTKDHIFLNSALRATAPYKF









LSEQHGVKAVFMNKHDWYEEYPTTPSSFVLNGFMY









SLIGLYDLKETAGEKLGKEARSLYERGMESLKAMLP









LYDTGSGTIYDLRHFMLGIAPNLARWDYHTTHINQL









QLLSTIDESPIFKEFVKRWKSYLKGSRAKHN





878
A
353
646
FYWNWVPFTNWQNPRLMGQK*HARWLHLRSLLPA









M*ATLL*RENNR*LLLLTLTSIFKTFRIRRLSVSKP*VK









AKKKTRLIIWSTSKFLSCMMLKFT





879
A
1648
1258
NSTFICYVASSASAFLTAPLLEFLLALYFLFADAMQL









NDKWQGLCWPMMDFLRCVTAALIYFAISITAIAKYS









DGASKAAGVFGFFATIVFATDFYLIFNDVAKFLKQG









DSADETTAHKTEEENSDSDSD





880
A
92
422
ASAEPPAMPGIVVFRRRWSVGSDDLVLPAIFLFLLHT









TWFVILSVVLFGLVYNPHEACSLNLVDHGRGYLGILL









SCMIAEMAIIWLSMRGGILYTEPRDSMQYVLYVRLA





881
A
946
1424
YFRFLCVIFCSFLLRCLVSRVLLYPMLIAEIPRVQGRG









GPSWPGLGGRLLKELSRLLTFLKVRKLSILSGWREVP









LGQPSLTEPP/PPAPPHPGPGSWLASASAPHILSQPPAA









GPAGQPPSPGSPVPGGCSLALPVTSVLCLEPPALKPA









AASAPVVAVH





882
A
1
1917
MVIGRADGENKRKKTFIEHLLCPNPCDFQHKITVQAS









PNLDKRRSLNSSSSSPPSSPTMMPRLRAIQLTSDESNK









TWGRNTVFRQEEFEDVKRNFKKKGCTWGPNSIQMK









DRTDCKERIRPLSDGNSPWSTILIKNQKTMPLASLFV









DQPGSCEEPKLSPDGLEHRKPKQIKLPSQAYIDLPLG









KDAQRENPAEAESWEEAASANAATVSIEMTPTNSLS









RSPQRKKTESALYGCTVLLASVALGLDLRELHKAQA









AEEPLPKEEKKKREGIFQRASKSRRSASPPTSLPSTCG









EASSPPSLPLSSALGILSTPSFSTKCLLQMDSEDPLVDS









APVTCDSEMLTPDFCPTAPGSGREPALMPRLDTDCSV









SRNLPSSFLQQTCGNVPYCASSKHRPSHHRRTMSDG









NPTPTGATIISATGASALPLCPSPAPHSHLPREVSPKK









HSTVHIVPQRRPASLRSRSDLPQAYPQTAVSQLAQTA









CVVGRPGPHPTQFLAAKERTKSHVPSLLDADVEGMK









PQTFAVSVAALKGAASGVVSSSQRVRDLADLRSEAV









DLHTPRDRTLPTHQVQGQGQQQEEQPQVEVAAVGE









LALLRVVVPAPGLLQYIPLSITTPSKHRSPYDRTSRGL









RYGYAKEQTLMSM





883
A
1
684
RTRGPPPQSRSGGRRRRIPLYLPTSCIKELVAGGVAVE









SWPGRDAAQLLLCSCLLSPPPVMTETREPAETGGYA









SLEEDDEDLSPGPEHSSDSEYTLSEPDSEEEEDEEEEE









EETTDDPEYDPGYKVK*RLGGGRGGPSRRAPR\AAQP









PAQPCQLCGRSPLGEAPPGTPRCTGTSCCMPGVRCR









QSPHTGSLAEGVGWEEGAEEIGVVTVVMGDGVLPV









CVVLEVDV





884
A
1
1047
MGITCWIALYAVEALPTCPFSCKCDSRSLEVDCSGLG









LTTVPPDVPAATRTLLLLNNKLSALPSWAFANLSSLQ









RLDLSNNFLDRLPRSIFGDLTNLTELQLRNNSIRTLDR









DLLRHSPLLRHLDLSINGLAQLPPGLFDGLLALRSLSL









RSNRLQNLDRLTFEPLANLQLLQVGDNPWECDCNLR









EFKHWMEWFSYRGGRLDQLACTLPKELRGKDMRM









VPMEMFNYCSQLEDENSSAGLDIPGPPCTKASPEPAK









PKPGAEPEPEPSTACPQKQRHRPASVRRAMGTVIIAG









VVCGVVCIMMVVAAAYGCIYASLMNAKYHRELKKR









QPLMGDPEGEHEDQKQISSVA





885
A
87
554
MEALTLWLLPWICQCVSVRADSIIHIGAIFEENAAKD









DRVFQLAVSDLSLNDDILQSEKITYSIKVIEANNPFQA









VQEACDLMTQGILALVTSTGCASANALQSLTDAIMHI









PHLFVQRNPGGSPRTACHLNPSPDGEAYTLASRPPVR









LNDVMLRL





886
A
269
832
MGSSRLAALLLPLLLIVIDLSDSAGIGFRHLPHWNTR









CPLASHTDDSFTGSSAYIPCRTWWALFSTKPWCVRV









WHCSRCLCQHLLSGGSGLQRGLFHLLVQKSKKSSTF









KFYRRHKMPAPAQRKLLPRRHLSEKSHHISIPSPDISH









KGLRSKRTPPFGSRDMGKAFPKWDSPTPGGDRPSSFE









LLP*





887
A
3
5530
LPHGRTRGPGPAMAPWRKADKERHGVAIYNFQGSG









APQLSLQIGDVVRIQETCGDWYRGYLIKHKMLQGIFP









KSFIHIKEVTVEKRRNTENIIPAEIPLAQEVTTTLWEW









GSIWKQLYVASKKERFLQVQSMMYDLMEWRSQLLS









GTLPKDELKELKQKVTSKIDYGNKILELDLIVRDEDG









NILDPDNTSVISLFHAHEEATDKITERIKEEMSKDQPD









YAMYSRISSSPTHSLYVFVRNFVCRIGEDAELFMSLY









DPNKQTVISENYLVRWGSRGFPKEIEMLNNLKVVFT









DLGNKDLNRDKIYLICQIVRVGKMDLKDTGAKKCTQ









GLRRPFGVAVMDITDIIKGKAESDEEKQHFIPFHPVTA









ENDFLHSLLGKVIASKGDSGGQGLWVTMKMLVGDII









QIRKDYPHLVDRTTVVARKLGFPEIIMPGDVRNDIYIT









LLQGDFDKYNKTTQRNVEVIMCVCAEDGKTLPNAIC









VGAGDKPMNEYRSVVYYQVKQPRWMETVKVAVPI









EDMQRIHLRFMFRHRSSLESKDKGEKNFAMSYVKL









MKEDGTTLHDGFHDLVVLKGDSKKMEDASAYLTLP









SYRHHVENKGATLSRSSSSVGGLSVSSRDVFSISTLV









CSTKLTQNVGLLGLLKWRMKPQLLQENLEKLKIVDG









EEVVKFLQDTLDALFNIMMEHSQSDEYDILVFDALIY









IIGLIADRKFQHFNTVLEAYIQQHFSATLAYKKLMTV









LKTYLDTSSRGEQCEPILRTLKALEYVFKFIVRSRTLF









SQLYEGKEQMEFEESMRRLFESINNLMKSQYKTTILL









QVAALKYIPSVLHDVEMVFDAKLLSQLLYEFYTCIPP









VKLQKQKVQSMNEIVQSNLFKKQECRDILLPVITKEL









KELLEQKDDMQHQVLERKYCVELLNSILEVLSYQDA









AFTYHHIQEIMVQLLRTVNRTVITMGRDHILISHFVA









CMTAILNQMGDQHYSFYIETFQTSSELVDFLMETFIM









FKDLIGKNVYPGDWMAMSMVQNRVFLRAINKFAET









MNQKFLEHTNFEFQLWNNYFHLAVAFITQDSLQLEQ









FSHAKYNKILNKYGDMRRLIGFSIRDMWYKLGQNKI









CFIPGMVGPILEMTLIPEAELRKATIPIFFDMMLCEYQ









RSGDFKKFENEIILKLDHEVEGGRGDEQYMQLLESIL









MECAAEHPTIAKSVENFVNLVKGLLEKLLDYRGVMT









DESKDNRMSCTVNLLNFYKDNNREEMYIRYLYKLR









DLHLDCDNYTEAAYTLLLHTWLLKWSDEQCASQVM









QTGQQHPQTHRQLKETLYETLYETIIGYFDKGKMWEEAIS









LCKELAEQYEMEIFDYELLSQNLIQQAKFYESIMKILR









PKPDYFAVGYYGQGFPSFLRNKVFIYRGKEYERRED









FQMQLMTQFPNAEKMNTTSAPGDDVKNAPGQYIQC









FTVQPVLDEHPRFKNKPVPDQIINFYKSNYVQRFHYS









RPVRRGTVDPENEFASMWIERTSFVTAYKLPGILRWF









EVVHMSQTTISPLENAIETMSTANEKILMMINQYQSD









ETLPINPLSMLLNGIVDPAVMGGFAKYEKAFFTEEYV









RDHPEDQDKLTHLKDLIAWQIPFLGAGIKIHEKRVSD









NLRPFHDRMEECFKNLKMKVEKEYGVREMPDFDDR









RVGRPRSMLRSYRQMSIISLASMNSDCSTPSKPTSESF









DLELASPKTPRVEQEEPISPGSTLPEVKLRRSKKRTKR









SSVVFADEKAAAESDLKRLSRKHEFMSDTNLSEHAA









IPLKASVLSQMSFASQSMPTIPALALSVAGIPGLDEAN









TSPRLSQTFLQLSDGDKKTLTRKKVNQFFKTMLASK









SAEEGKQIPDSLSTDL





888
A
586
959
NWECILHRVTGSSLLRPAPQAAGSWLIGGWERGCGP









RCAPGGAP/APYLARPASSAARGPPVGRRGPPWGWA









ASAAISARSSPPSAAGSGPDWRRPGKGHSPRPTAAAS









ATRAPARAPPSPRLAAA





889
A
399
2002
LSLTHSHLCPHTPTHAHTQTTSAAGGPGQTPMG/PLI









GGPRIKVRG*HPGGLGGGKPQWPHGLSHRTPKGPSM









ALGPRGSGKR/GVRPGPV*GGLSPA*A*SPFSTETKPA









TCRDGCGGPGGRALGPVCSVGGE*RPPGRPPVGGEL









CPK*LGSSWC**WHKGRGEQLGRPGWVPMLVLSVH









RSWCGMAGA*GQGGASVGTADEGGSHSQREGLLRP









PRSRPGLPAEWGQESAGQRALGLSWAGGSQRPPKEP









SGEGRKLAAEGGVGAGAAGPR/GLPG*ERCCSGFPSG









PGEQGPIAGCHVLGATQP\RGPPSTVGAPPAHSHQGD









GARPQEPTTDGEGRELGGLHPNLNPHLRGPRSDTQG/









PAAPRAGPGSGAPWPPPVQVLGAAGPAPVHSVLIAG









SLHVVSSQTATSFSPLTPSQ\GW*AQSG*FPS*SDSEGL









LSPQGCASR*QGRAAQVGENSTLARPLS\ELKAARFG









GDREETGWPLP*GVHRQGMRGQHFRMSPGVQDQPG









QHGETLSLRIGFFAIFWDGSVGSCGPATWSS





890
A
3
655
IQCGGIPLLPLPSPLSMA/HCVQFQAGPPV\HWRVHAG









LVSHSAVRPHQGALVERIIPHPLYSAQNHDYDVALLR









LQTALNFSDTVGAVCLPAKEQHFPKGSRCWVSGWG









HTHPSHTYSSDMLQDTVVPLLSTQLCNSSCVYSGAL









TPRMLCAGYLDGRADACQGDSGGPLVCPDGDTWRL









VGVVSWGRGCAEPNHPGVYAKVAEFLDWIHDTAQD









SLL





891
A
336
1043
MPRRGLILHTRTHWLLLGLALLCSLVLFMYLLECAP









QTDGNASLPGVVGENYGKEYYQALLQEQEEHYQTR









ATSLKRQIAQLKQELQEMSEKMRSLQERRNVGANGI









GYQSNKEQAPSDLLEFLHSQIDKAEVSIGAKLPSEYG









VIPFESFTLMKVFQLEMGLTRHPEEKPVRKDKRDELV









EVIEAGLEVINNPDEDDEQEDEEGPLGEKLIFNENDF









VEGYYRTERDKGTQYELF





892
A
319
492
MQGVRVSFGWAMGLAWGSCALEAFSGTLLLSAAW









TLSLSPPICGHLSPQQVGGRGGD*





893
A
3
1441
KLSVNHRRTHLTKLMHTVEQATLRISQSFQKTTEFDT









NSTDIALKVFFFDSYNMKHIHPHMNMDGDYINIFPKR









KAAYDSNGNVAVAFLYYKSIGPLLSSSDNFLLKPQN









YDNSEEEERVISSVISVSMSSNPPTLYELEKITFTLSHR









KVTDRYRSLCAFWNYSPDTMNGSWSSEGCELTYSN









ETHTSCRCNHLTHFAILMSSGPSIGIKDYNILTRITQLG









IIISLICLAICIFTFWFFSEIQSTRTTIHKNLCCSLFLAEL









VFLVGINTNTNKLFCSIIAGLLHYFFLAAFAWMCIEGI









HLYLIVVGVIYNKGFLHKNFYIFGYLSPAVVVGFSAA









LGYRYYGTTKVCWLSTENNFIWSFIGPACLIILVNLL









AFGVIIYKVFRHTAGLKPEVSCFENIRSWARGALALL









FLLGTTWIFGVLHVVHASVVTAYLFTVSNAFQGMFIF









LFLCVLSRKIQEEYYRLFKNVPCCFGCLR





894
A
303
368
LSSSSSSSSSSSSSSSSSSSSSSSSSHYHHHHHHHHHHH









HHHHVDWIP





895
A
260
1
SSSQMQVKTQDEEMSGQKTRKPRSAPGTTERS*LAA









PTPGLPAPDSAEARKAPAPPPGPAAPPQPAGPAPRSLT









HLGGP*KSTPSR





896
A
1
482
MGCRLLCCVVFCLLQAGPLDTAVSQTPKYLVTQMG









NDKSIKCEQNLGHDTMYWYKQDSKKFLKIMFSYNN









KELIINETVPNRFSPKSPDKAHLNLHINSLELGDSAVY









FCASSQDTALQSHCIPVHKPPGSARKLQGSVCTCTQG









SSLHSLMASDGVPVC





897
A
2
760
YGYTPPPRLLPRNTFSRKAFKLKKPSKYCSWKCAAL









SAIAAALLLAILLAYFIAMHLLGLNWQLQPADGHTF









NNGIRTGLPGNDDVATMPSGGKVPWSLKNSSIDSGE









AEVGRWVTQEVPPGVFWRSQIHISQPQFLKFNISLGK









DALFGVYIRRGLPPSHAQCPLTSHIGSTGPHLSHGAE









MWRQCQGLAVSLHSTSNRALQKPLIRSFISFWRKPDL









YRHVFQNLPFQRSSSCRERLCETKRTLVSSELD





898
A
77
273
PRTGMGCCLPGADPAEIRSSPSPSWSTAGSQGCWMT









SFSPCSCAPCCSSGCACTTGFVSREKESV





899
A
1
4499
SRPWWLRASERPSAPSAMAKRSRGPGRRCLLALVLF









CAWGTLAVVAQKPGAGCPSRCLCFRTTVRCMHLLL









EAVPAVAPQTSILDLRFNRIREIQPGAFRRLRNLNTLL









LNNNQIKRIPSGAFEDLENLKYLYLYKNEIQSIDRQAF









KGLASLEQLYLHFNQIETLDPDSFQHLPKLERLFLHN









NRITHLVPGTFNHLESMKRLRLDSNTLHCDCEILWLA









DLLKTYAESGNAQAAAICEYPRRIQGRSVATITPEEL









NCERPRITSEPQDADVTSGNTVYFTCRAEGNPKPEII









WLRNNNELSMKTDSRLNLLDDGTLMIQNTQETDQGI









YQCMAKNVAGEVKTQEVTLRYFGSPARPTFVIQPQN









TEVLVGESVTLECSATGHPPPRISWTRGDRTPLPVDP









RVNITPSGGLYIQNVVQGDSGEYACSATNNIDSVHAT









AFHVQALPQFTVTPQDRVVIEGQTVDFQCEAKGNPP









PVIAWTKGGSQLSVDRRHLVLSSGTLRISGVALHDQ









GQYECQAVNIIGSQKVVAHLTVQPRVTPVFASIPSDT









TVEVGANVQLPCSSQGEPEPAITWNKDGVQVTESGK









FHISPEGFLTINDVGPADAGRYECVARNTIGSASVSM









VLSVNVPDVSRNGDPFVATSIVEAIATVDRAINSTRT









HLFDSRPRSPNDLLALFRYTPRDPYTVEQARAGEIFER









TLQLIQEHVQHGLMVDLNGTSYHYNDLVSPQYLNLI









ANLSGCTAHRRVNNCSDMCFHQKYRTHDGTCNNLQ









HPMWGASLTAFERLLKSVYENGFNTPRGINPHRLYN









GHALPMPRLVSTTLIGTETVTPDEQFTHMLMQWGQF









LDHDLDSTVVALSQARFSDGQHCSNVCSNDPPCFSV









MIPPNDSRARSGARCMFFVRSSPVCGSGMTSLLMNS









VYPREQINQLTSYIDASNVYGSTEHEARSIRDLASHR









GLLRQGIVQRSGKPLLPFATGPPTECMRDENESPIPCF









LAGDHRANEQLGLTSMHTLWFREHNRIATELLKLNP









HWDGDTIYYETRKIVGAEIQHITYQHWLPKILGEVG









MRTLGEYHGYDPGINAGIFNAFAT\AAFRFGHTLVNP









LLLPGLDENFQPIAQDHLPLHKAFFSPFRIVNEGGIDP









LLRGLFGVAGKMRVPSQLLNTELTERLFSMAHTVAL









DLAAINIQRGRDHGIPPYHDYRVYCNLSAAHTFEDLK









NEIKNPEIREKLKRLYGSTLNIDLFPALVVEDLVPGSR









LGPTLMCLLSTQFKRLRDGDRLWYENPGVFPAQLT









QIKQTSLARILCDNADNITRVQSDVFRVAEFPHGYGS









CDEIPRVDLRVWQDCCEDCRTRGQFNAFSYHFRGRR









SLEFSYQEDKPTKKTRPRKIPSVGRQGEHLSNSTSA\F









STRSDASG\TNDFQRVCSWEMQKTITDLRTQIKKLES









R\LSTTECVDAGGESHANNTKWKKDACTICECKDGQ









VTCFVEACPPATCAVPVNIPGACCPVCLQKRAEEKP





900
A
1674
1839
VVRVTCCPPARSTTERTNAYDEEDCVEMVASGGWN









DVACHTTMYFMCEFDKKNM





901
A
397
2
DPPTSAMSESPSPSLGVGSPLVPS/PPPPLGLPTVRSLL









PPTIR/VAFGTPPPSPARSPSTSPSPHSPRSLVPARDGG









ADSGERLGPGALGLGAGSGGGRARYGPSRSRPSDRA









ADPGGVRPFPVAPGIPPHCTR





902
A
1
411
LLVFQVHQCLHCKLL*/PSYVPLGYTEAFLATQNIGR









VSLWAKHGHPDPFPLARADFRAQESPSPNDPSWLL*









YEER*WSQATTKG*NRCC*RCD*LQAPSRRPEAVHTN









DPR*REVREEHMVLQVLTR





903
A
1
193
LLPRPGSGLDFLLSPVLPS/HSASWACPLPRPSPMPSS









CC*R*RKEMASGFSKGPTLGCCPTCPP





904
A
119
571
MNRRASQMLLMFLLAICLLAIIFVPQEMQMLREVLA









TLGLGASALANTLAFAHGNEVIPTIIRARAMGINATF









ANIAGALAPLMMILSVYSPPLPWIIYGVFPFISGFAFLL









LPETRNKPLFDTIQDEKNERKDPREPKQEDPRVEVTQ









F*





905
A
1
840
MGSVGSQCLEEPSVAGTPDPGIVMSVTFDSHQLEEA









AEAAQGQGLGQGRPSPHGYQVGCVTPGEALWHRGA









MGGHGGLEVVPVTLEEPVPNDRYHAIYFAMLLSGV









GFLLPYNSFITDVDYLHHKYPGTSIVFDMSLTYILVAL









AAVLLNNVLVERLTLHTRITASYLLALGPLLFISICDV









WLQLFSRDQAYAINLAAVGTVAFGCTVQNQNEQVL









VGGPGKEAGDKAKEQQQSRERPIPSPTVQFLYYTRL









DPFLYEQKNQTICSFGGENVLSIMHLA





906
A
257
1559
GTKFCFAIYLSSTGSNTSLTSLIMLGRYFKAAPCKKN









TKGKFIQSIPDNQLVRQKLNCMTKIVESTLFRQSECR









EVLLPLLTDQLSGQLDDNSNKPDHEASSQLLSNILEV









LDRKDVGATAVHIQLIMERLL\RRINRTVIGMNRQSP









HIGSFVACMIALLQQMDDSHYSHYISTFKTRQDIIDFL









METFIMFKDLIGKNVYAKDWMVMNMTQNRVFLRAI









NRFAEVLTRFFMDQASFELQLWNNYFHLAVAFLTHE









SLQLETFSQAKRNKIVKKYGDMRKEIGFRIRDMWYN









LGPHRIKFIPSMVGPILEVTLTPEVELRKATIPIFFDMM









QCEFNFSGNGNFHMFENELITKLDQEVEGGRGDEQY









KVLLEKLLLEHCRKHKYLSSSGEVFALLVSSLLENLL









DYRTIIMQDESKENRMSCTVNVLNFYKKKK





907
A
14
616
TNPTPTAVLTATPATPVSSLTSMQTTRSEETPASAARP









SARSSSRGTSTYIPSTSSTASSPPPCCMSCGRMWVDS









WPPPLATATPQPLSASSGRPFLLARFWACCSSWWGW









LSSSSTRFK*AGTGAAPGRPWSSTTASTLSAWDSPPW









SA*AAPSSTVLTAGPWTTIRTPRALWTWPC*WVPPLV









STPSLTTPSXXVXAGT





908
A
3
211
SSFSIPTLVITEQFATAYQGTRARSDNTHYWLIISCSIA









YVALVTLLIWVPVKVILHKKRYIYRKIKGW





909
A
310
546
MSVVMLSYLLSAFFSQANTAALCTSLVYMISFLPYIV









LLVLHNQLSFVNQTFLCLLSTTAFGQGVFFITFLEGQE









TGIH





910
A
74
541
MHNNYTALLGVWIYGFFVLMLLVLDLLYYSAMNYD









ICKVYLARWGIQGRWMKQDPRRWGNPARAPRPGQR









APQPQPPPGPLPQAPQAVHTLRGDAHSPPLMTFQSSS









AWEGASQQQEIPENEETEKGDDQISSFLGVTSNTKEA









SVIGIQKTVDVL





911
A
1157
918
RSGVPDQPGQHGEAPSLLKIQNLAGRSGGPL*SQLLR









RENRLNLGGGLP*AKIAPRLHPCTPAWVTDRDSVSK









KKILFP





912
A
1199
795
MSWWRNNFWIILAVAIIVVSVGLGLILYCVCKWQLR









RGKKWEIAKPLKHKQVDEEKMYENVLNESPVQLPPL









PPRNWPSLEDSSPQEAPSQPPATYSLVNKVKNKKTVS









IPSYIEPEDDYDDVEIPANTEKASF*





913
A
1
955
PQRAPLQDFGSSKVVNPKGSSPGA*SKPPMGRGPHK









KGWRGPLGGGFPLKSPPLPQRKFSPK*KKQPWAPKR









PPWCQGLFGGGGKRGLFWVFFLSPQKKKKGKGVLP









QASGHPRQEGPPAGASQPLRSHS*PRKEQPQLGPAPR









ATPCSCPHIWQLGPLMQCGSGFLHLKSASLSLL*DQC









LLPASMAPG*PHSPRVSLRPGSSGRGAAGADGRAGA









GQSSADGVLNT/QGDVGGARGLGMPRIWHGGLCVPP









TPGTKAPASGPRSQAPGGGGDQQQFRGRCGQCGPES









PPHSRHCPRGHSGISGALGMPGSLVPREAY





914
A
414
244
MTVMVTVTVTVMVMVMVMVMVTVMVTVTVMVT









VMVTAEMTVMVGVMMMMMMVANIC*





915
A
2
4571
AAASRCPGIMVALRGLGSGLQPWCPLDLRLEWVDT









VWELDFTETEPLDPSTEAEIIETGLAAFTKLYESLLPFA









TGEHGSMESIWTFFIENNVSHSTLVALFYHFVQIVHK









KNVSVQYREYGLHAAGLYFLLLEVPGSVANQVFHP









VMFDKCIQTLKKSWPQESNLNRKKEQPKSSQANP









GRHRKRGKPPRREDIEMDEIIEEQEDENICFSARDLSQ









IRNAIFHLLKNFLRLLPKFSLKEKPQCVQNCIEVFVSL









TNFEPVLHECHVTQARALNQAKYIPELAYYGLYLLC









SPIHGEGDKVISCVFHQMLSVILMLEVGEGSHRAPLA









VTSQVINCRNQAVQFISALVDELKESIFPVVRILLQHI









CAKVVDKSEYRTFAAQSLVQLLSKLPCGEYAMFIAW









LYKYSRSSKIPHRVFTLDVVLALLELPEREVDNTLSL









EHQKFLKHKFLVQEIMFDRCLDKAPTVRSKALSSFA









HCLELTVTSASESILELLINSPTFSVIESHPGTLLRNSSA









FSYQRQTSNRSEPSGEINIDSSGETVGSGERCVMAML









RRRIRDEKTNVRKSALQVLVSILKHCDVSGMKEDLW









ILQDQCRDPAVSVRKQALQSLTELLMAQPRCVQIQK









AWLRGVVPVVMDCESTVQEKALEFLDQLLLQNIRH









HSHFHSGDDSQVLAWALLTLLTTESQELSRYLNKAF









HIWSKKEKFSPTFINNVISHTGTEHSAPAWMLLSKIA









GSSPRLDYSRIIQSWEKISSQQNPNSNTLGHILCVIGHI









AKHLPKSTRDKVTDAVKCKLNGISVGL*EVISSAVDA









LQRLCRASAETPAEEQELLTQVCGDVLSTCEHRLSNI









VLKENGTGNMDEDLLVKYIFTLGDIAQLCPARVEKRI









FLLIQSVLASSADADHSPSSQGSSEAPASQPPPQVRGS









VMPSVIRAHAIITLGKLCLQHEDLAKKSIPALVRELEV









CEDVAVRNNVIIVMCDLCIRYTIMVDKYIPNISMCLK









DSDPFIRKQTLILLTNLLQEEFVKWKGSLFFRFVSTLI









DSHPDIASFGEFCLAHLLLKRNPVMFFQHFIECIFHFN









NYEKHEKYNKFPQSEREKRLFSLKGKSNKERRMKIY









KFLLEHFTDEQRFNITSKICLSILACFADGILPLDLDAS









ELLSDTFEVLSSKEIKLLAMRSKPDKDLLMEEDDMA









LANVVMQEAQKKLISQVQKRNFIENIIPIIISLKTVLEK









NKIPALRELMHYLREVMQDYRDELKDFFAVDKQLA









SELEYDMKKYQEQLVQEQELAKHADVAGTAGGAE









VAPVAQVALCLETVPVPAGQENPAMSPAVSQPCTPR









ASAGHVAVSSPTPETGPLQRLLPKARPMSLSTIAILNS









VKKAVESKSRHRSRSLGVLPFTLNSGSPEKTCSQVSS









YSLEQESNGEIEHVTKRAISTPEKSISDVTFGAGVSYI









GTPRTPSSAKEKIEGRSQGNDILCLSLPDKPPPQPQQW









NVRSPARNKDTPACSRRSLRKTPLKNSQLKQRLPTSV









QAGRSP





916
A
315
569
QSRSCSRHQSKPDRRTDARLHTLHGSFLHTRRGSVN









TAREGHQMADEIDAMALYRAWQQLDNGSCAQIRRV









SEYGEHNNSHADD





917
A
544
983
SVQNPRVNWIHAALQRTGRGRRRHEQHGEDHFVNG









AAGVHQAANGLVNPPRHQVFGAHQAKGDGENHRQ









RGAPDGDLQRDGHFGEVILPLAEIGREEVGGERRHV









AAVFDQ/S*AGPFPRPATRRPTRRVQRPSSEARTSCA









WVGRW





918
A
1
361
MINPNPERSDDLVFWGLFRAGGMWSAIIAPVMILLV









GILLPLGLFPGDALSYERVLAFAQSFIGRVFLFLMIVL









PLWCGLHRMHHAMHDLKIHVPAGKWV\FYGLAAIL









TVVTLIGVVTI





919
A
1
971
MWALFMIRNVKKQRPVNLDLQTIRFPITAIASILHRV









SGVITFVAVGILLWLLEYRLSYLKGSSKLRDYGQLLT









LEIPAALLPIHTGIVNQNINCTETLTASSDNLLRRAFC









GDTHLHEVHLNTLFFNHFLCFAVIFDETRNKDICATS









GQHAHFVDKKRKRELLSHMIGKGNWQQVLVFTRTK









HGANHLAEQLNKDGIRSAAIHGNKSQGARTRALADF









KSGDIRVLVATDIAA/RGLDIEELPHVVNYELPNVPED









YVHRIGRTGRAAATGEALSLVRSFFDWCDDCAAAG









GMGNRNAQLADGIYHLRCIELYLGDGMADFL





920
B
1
5305
MDWLAKYWWILVIVFLVGVLLNVIKDLKRVDHKKF









LANKPELPPHRDFNDKWDDDDDWPKKDQPKKPGNL









SFTSFQSHHHRQAYRHLEYRQYARVHLRLQQCRTGP









ATSDSGIEVNQNIAVARAGDIVSARFGIPWNATIRIGI









CIRCPSGKSSHMRGATINITLIGKQQEKEANGLDPEVL









AEINREREAFLAAQQGSTSTELFTTIEGNYADAVRLL









TTAHSVPFDGKATLFVAERTLQEGMSPERAWSPWIA









ELDIYRQDCAHVDIISPGTFEKIGPIIRATLNRLYPMSS









LNIKQGSDAHFPDYPLASPSNNEIDLLNLISVLWRAK









KTVMAVVFAFACAGLLISFILPQKWTSAAVVTPPEPV









QWQELEKSFTKLRVLDLDIKIDRTEAFNLFIKKFQSVS









LLEEYLRSSPYVMDQLKEAKIDELDLHRAIVALSEK









MKAVDDSASKKKDEPSLYTSWTLSFTAPTSEEAQTV









LSGMFAQTAGKHYPAPITAVKTIEAAARFGREEALN









LENKSFVPLAHTNEARALVGIFLNDQYVKGKAKKLT









KDVETPKQAAACRYVMKDINDKSLTLGMTEAAKLL









NKQLERGKIDGLKLAGVISTIHPTLDYAGFDRVDIVV









EAVVENPKVKKAVLAETEQKVRQDTVLASNTSTIPIS









ELANALERPENFCGMHFFNPVHRMPLVEIIRGEKSSD









ETIAKVVAWASKMGKTPIVVNDCPGFFVNRVLFPYF









AGFSQLLRDGADFRKIDKVMEKQFGWPMGPAYLLD









VVGIDTAHHAQAVMAAGFPQRMQKDYRDAIDALFD









ANRFGQKNGLGFWRYKEDSKGKPKKEEDAAVEDLL









AEVSQPKRDFSEEEIIARMMIPMVNEVVRCLEEGIIAT









PAEADMALVYGLGFPPFHGGAFRWLDTLGSAKYLD









MAQQYQHLGPLYEVPEVGVAVGLHGASVQQQKSFC









PSVSIYSQPIPGLQQLCAAPPPPGLVHRTAISEGVGMP









LHVVNLPPKRFARKRLEPKWVRITWQFADMQDIGKT









PLTACRCQQVFSNVTRRHQRPQHRHNATFAPDLPITI









ELFDYHIPRIVSISSAFSPIIDVASALRRVFSRSGVITAC









NNQSISCASRVSKTLSRFESVVGVNFATRLHISEDIRT









PECIYCLLRVAISNNAVPADGARYREKSVLLGDRVAP









SGGERCKRGNGGARKGGRARGGAPRDPKGAARAK









ANAWPWTEPRECSLIAGEIAIECQRGIGHQDRFQRLIT









TLRQVLECDASALLRYDSRQFIPLAIDGLAKDVLGRR









FALEGHPRLEAIARAGDVVRFPADSELPDPYDGLIPG









QESLKVHACVGLPLFAGQNLIGALTLDGMQPDQFDV









FSDEELRLIAALAAGALSNALLIEQLESQNMLPGDAT









PFEAVKQTQMIGLSPGMTQLKKEIEIVAASDLNVLIS









GETGTGKELVAKAIHEASPRAAKLLRVLQYGDIQRV









GDDRCLRVDVRVLAATNRDLRIEEVLAGDCRDRFVS









SPERVSTFGAAAAAVEALRDHLNTLGGEHHDPVQLL









NIYYETPDNWLRGHDMGLRIRGENGRYEMTMKVAG









RVTGGLHQRPEYNVALSEPTLDLAQLPTEVWPNGEL









PADLASRVQPLFSTDFYREKWLVAVDGSQIEIALDQG









EVKAGEFAEPICELELELLSGDTRAVLKLANQLVSQT









GLRQGSLSKAARGYHLAQGNPAREIKPTTILHVAAK









ADVEQGLEAALELALAQWQYHEELWVRGNDAAKE









Q





921
A
121
1819
KTMHEIADSSKKIADIISVIDGIAFQTNILALNAAVET









ARAGEQGRGFAVVAGEVRNLASRSAQAAKEIKALIE









DSVSRVDTGSVLVESAGETMNNIVNAVTRVTDIMGE









NCFASDEQSRGIDQIALAVSEMDRVTQQNTLYYGCG









LVIPEHLENCWILDLGSGSGRDCYVLSQLVGEKGHV









TGIDMTKGQRSLALRIIGVLALTSCGLAAINADDIW









YFASGGVIGSLLSTTLQPLLHSSGGTIALLCVWAAGL









TLFTGWSWVTIAEKLGGWILNILTFASNRTRRDDTW









VDEDEYEDDEEYEDENHGKQHESRRARILRGALARR









KRLAEKFINPMGRQTDAALFSGKRMDDDEEITYTAR









GVAADPDDVLFSGNRATQPEYDEYDPLLNGSPRDIL









KEQRLQTLKLVREMRADVSELVEMLLATPNMEQRT









QGIGILDRQIARDLRFDHPYADYGNIPKTLFTFTGGD









VFSRVMVRVKETFDSLAMLEFALDNMPDTPLLTEGF









SYKPHAFALGFVEAPRGEYVHWSMLGDNQKLFRWR









FPAITYPQLAGVALHAERQYRL





922
A
338
3
MAIPSVVISGLAVLLVAMALPSLSGSEAIKSMTIPGLV









VPTVVRFMAVPGLIVPAVAKFTVLPDLTVPTEDKSLA









VPSLISRAGNSVPVSSWDVFGVAKLIAKLGLLAAVA





923
A
1
60
FRQSHGP*SLHHHTQKNED*YIHEPYQSHGHP*SLH









HHAQKNED*YIHEQYQSHGHP*SLHHHTQKNED





924
A
2
369
TLVYPAITFILLSICICYWIVTAVFLTTSGVPVYKVIAP









GGHCIHENQTCDPEIFNTPEIAKACPGALCNFAFYGE









KSLYHQYIPTFHVYNLFVFLWLINFVIALGQCALA\G









AFATYYWA





925
A
3
400
VEGQEIDFLLDNGAAFSVLISCPRQLSSRSVTIRGILG









QPVTRYFSHLLSCNWETFLQISSPLEDTTTAGPFFTPI









QEEVARVVIIQFPTAIGVSCLEGGLTGEANRASESAL









KIRAPLLLQRNAAPHQQRN





926
A
269
394
AAALRWLMSPRTLLCCFTSTLTQPWNRCGGKRKRK









KSEVPE





927
A
434
1333
GASLCQWLNAHCLARHPPAPGWRSPSSLWTGSLAST









TYCRLCPSSTGFFSNVAPSAEGHQLFLCNVERSVSHF









DAKLLSKYVPVANRYEGTEDDYGDDPSTNSFEKEKQ









DYVYCLESSLQTYNPDYVLMVEDDAVPEEQIFPVLE









HLLRARFSEPHLRDALYLKLYHPERLQHYINPEPMRI









LEWVGVGMLLGPLLTWIYMRFASRPGVSWPGMLFF









SLYSMGLAELVGRHYFLELGRVSPSLYSVVPAPQCC









TPAMLFPAPAARRTLTYLSQVYCHKGFGKDMALYSL









VEGPRERGPM





928
A
1
306
CGCGSCGGCGGRCGGGCGGGCSGGCGGGCGGGCG









GGCGSCTTCRCYRVGCCSSCCPCCRGCCGGCCSTPVI









CCCRRTCGSCGCGYGKGCCQQKCCCQKQCCC





929
A
334
413
T*NGSAAGL*SARPRWRCCSPRGRCT*EVPATLQGPG









LGRVAAGGKRGWRPQA*YRPSSQPQSEGPPEASSPSP









LPHKPSHGPGLNKAMADTVSFAPSTSPISLFFYECLPS





930
A
1
419
EKEGEDERLPPKSRIDYNHPGRVLLQNLTMSYSGLY









QCTAGNEAGKESCVVRVTVQYVQSIGMVAGAVTGI









VAGALLIFLLVWLLIRRKDKERYEEEERPNEIREDAE









APKARFVKPSSSSSGSRSSRSGSSSTSLHSK





931
A
1
375
IETTQPSEDTNANSQDNSMQPETSSQQQLLSPTLSDR









GGSRQDAADAGKPQRKFGQWRLPSAPKPISHSVSSV









NLRFGGRTTMKSVVCKMNPMTDAASCGSEVKKWW









TRQLTVESDESGDDLLDI





932
A
254
652
GRSLSFKTFLIWVLISIYQGGILMYGALVLFESEFVHV









VAISFTALILTELLMVALTVRTWHWLMVVAEPLSLG









CYVSSLAFLNEYFDVAFITTVTFLWKVSAITVVSCLP









LYVLKYLRRKLSPPSYCKLAS





933
A
9
422
ESRERSGNRRGAEDRGTCGLQSPSAMLGAKPHWLPG









PLHSPGLPLVLVLLALGAGWAQEGSEPVLLEGECLV









VCEPGRAAAGGPGGAALGEAPPGRVAFAAVRSHHH









EPAGETGNGTSGAIYFDQVLVNEGGGFDRAS





934
C
346
471
MRFCMLFTLLPLRVFLGTIQAPHFALLWLKGQTFAS









ACPGV





935
A
27
483
RDAEDAIYGRNGYDYGQCRLRVEFPRTYGGRGGWP









RGGRNGPPTRRSDFRVLVS/GWQ/DLKDHMREAGDV









CYADVQKDGVGMVEYLRKEDMEYALRKLDDTKFR









SHEGETSYIRVYPERSTSYGYSRSRSGSRGRDSPYQSR









GSPHYFSPFRPY





936
B
1
654
MTSESPEVKSCEPTTNHRETIRVDEQRKKSSRPTTTDF









SQHKFLPGDNATWNCESIKSLFVDKLPSFGYVFITSV









NTDKYILDEPGGPNAIKRVIVRGKQEGDKSHRNPSGC









VALRGFGQGYLLNNVELFDITTPGYSHSGSINGWLFV









DSILFPGDSSTTPADNGQCMTQHRLTHTSTSSEKDVG









ASLCGFLSPPLVLGKVTALSIVNERSISRNT





937
A
6323
7130
PGCIRCKCVAVDGDPCMKSNNALIVILGTVTLDAVGI









GLVMPVLPGLLRDIVHSDSIASHYGVLLALYALMQF









LCAPVLGALSDRFGRRPVLLASLLGATIDYAIMATTP









VLWIYPLVNSPFCWPRASRYQQGHQDLFILRSDLPSQ









VFIRDKLMERRNRRTGRTEKARIWEVTDRTVRTWIG









EAVAAAAADGVTFSVPVTPHTFRHSYAMHMLYAGI









PLKVLQSLMGHKSISSTEVYTKVFALDVAARHRVQF









AMPESDAVAMLKQLS





938
B
1
806
MPQRALLCQLTYACISAQLICPFAMEQQLVACCHPM









SGECACKPGWSGLYCNETCSPGFYGEACQQICSCQN









GADCDSVTGKCTCAPGFKGIDCSTPCPLGTYGINCSS









RCGCKNDAVCSPVDGSCTCKAGWHGVDCSIRCPSGT









WGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKC









ELPCQDGTYGLNCAERCDCSHADGCHPTTGHCRCLP









GWSVFTGNGVPINNIHIIVWDCFHVKVQTVVTFCLIT









LEAKTVTQDMTVSASX





939
A
3
627
GRLMLAGHGGVFALTLLLILTTTGLFFVFDCPYLARK









LTLAIPIIAAILFFFVMSCLLQTSFTDPGILPRATVCEA









AALEKQIDNTGSSTYRPPPRTREVLINGQMVKLKYCF









TCKMFRPPRTSHCSVCDNCVERFDHHCP\WVGNCVG









RRNYRFFYTFILSLSFLTAFIFACVVTHLTLRELWVRA









VGSGRGQPASSRVSKLQQSLSL





940
A
2
464
FVGVVVGVAEVRNWRCCCLGSTCWCRSLVLVCVLA









ALCFASLALVRRYLHHLLLWVESLDSLLGVLLFVVG









FIAVSFPCGWGYIVLNVAAGYLYGFVLGMGLMMVG









VLIGTFIAHVVCKRLLTAWVAARIQSSEKLSAVIRVK









EGGSGLKWWRL





941
A
1
421
FRSFVTEQNWDSLEVFDGADNTVTMLGSFSGTTVPA









LLNSTSNQLYLHFYSDISVSAAGFHLEYKTVGLSSCP









EPAVPSNGVKTGERYLVNDVVSFQCEPGYALQGHA









HISCMPGTVRRWNYPPPLCIAQCGGTVEEMEG





942
A
120
530
MVAPGLVLGLVLPLILWADRSAGIGFRFASYINNDM









VLQKEPAGAVIWGFGTPGATVTVTLRQGQETIMKKV









TSVKAHSDTWMVVLDPMKPGGPFEVMAQQTLEKIN









FTLRVHDVLFGDVWLCSGQSNMQMTVLQIF





943
A
205
377
NIVENIVFCWPGVCFLQTCTVCINPETSDE/WPGAVA









HACNPSTLGGQDGQITRSGDRE





944
A
2
408
EDGEYFLMIRGKLLKIFCAGMHSDHPKEYVTLVHGD









SENFSEVYGHRLHNPTECPYNGSRRDDCQCRKDYTA









AGFSSFQKIRIDLTSMQIITTDLQFARTSEGHPVPFATA









GDCYSAAKCPQVCPWGLPPCQGFT





945
A
1
4218
MALKNINYLLIFYLSFSLLIYIKNSFCNKNNTRCLSNS









CQNNSTCKDFSKDNDCSCSDTANNLDKDCDNMKDP









CFSNPCQGSATCVNTPGERSFLCKCPPGYSGTICETTI









GSCGKNSCQHGGICHQDPIYPVCICPAGYAGRFCEID









HDECASSPCQNGAVCQDGIDGYSCFCVPGYQGRHCD









LEVDECASDPCKNEATCLNEIGRYTCICPHNYSGVNC









ELEIDECWSQPCLNGATCQDALGAYFCDCAPGFLGD









HCELNTDECASQPCLHGGLCVDGENRYSCNCTGSGF









TGTHCETLMPLCWSKPCHNNATCEDSVDNYTCHCW









PGYTGAQCEIDLNECNSNPCQSNGECVELSSEKQYG









RITGLPSSFSYHEASGYVCICQPGFTGIHCEEDVNECS









SNPCQNGGTCENLPGNYTCHCPFDNLSRTFYGGRDC









SDILLGCTHQQCLNNGTCIPHFQDGQHGFSCLCPSGY









TGSLCEIATTLSFEGDGFLWVKSGSVTTKGSVCNIAL









RFQTVQPMALLLFRSNRDVFVKLELLSGYIHLSIQVN









NQSKVLLFISHNTSDGEWHFVEVIFAEAVTLTLIDDS









CKEKCIAKAPTPLESDQSICAFQNSFLGGLPVGMTSN









GVALLNFYNMPSTPSFVGCLQDIKIDWNHITLENISSG









SSLNVKAGCVRKDWCESQPCQSRGRCINLWLSYQC









DCHRPYEGPNCLREYVAGRFGQDDSTGYVIFTLDES









YGDTISLSMFVRTLQPSGLLLALENSTYQYIRVWLER









GRLAMLTPNSPKLVVKFVLNDGNVHLISLKIKPYKIE









LYQSSQNLGFISASTWKIEKGDVIYIGGLPDKQETELN









GGFFKGCIQDVRLNNQNLEFFPNPTNNASLNPVLVN









VTQGCAGDNSCKSNPCHNGGVCHSRWDDFSCSCPA









LTSGKACEEGQRCGFSPCPHGAHCQPVLQGFECIAN









AVFNGQSGQILFRSNGNITRELTNITFGFRTRDANVIIL









HAEKEPEFLNISIQDSRLFFQLQSGNSFYMLSLTSLQS









VNDGTWHEVTLSMTDPLSQTSRWQMEVDNETPFVT









STIATGSLNFLKDNTDIYVGDRAIDNIKGLQGCLSTIEI









GGIYLSYFENVHGFINKPQEEQFLKISTNSVVTGCLQL









NVCNSNPCLHGGNCEDIYSSYHCSCPLGWSGKHCEL









NIDECFSNPCIHGNCSDRVAAYHCTCEPGYTGVNCE









VDIDNCQSHQCANGATCISHTNGYSCLCFGNFTGKF









CRQSRLPSTVCGNEKTNLTCYNGGNCTEFQTELKCM









CRPGFTGEWCEKDIDECASDPCVNGGLCQDLLNKFQ









CLCDVAFAGERCEVDLADDLISDIFTTIGSVTVALLLI









LLLAIVASVVTSNKRATQGTYSPSRQEKEGSRVEMW









NLMPPPAMERLI





946
A
2
2131
RVARGWGGCGACGGSGIVGQGKGEPSRRRGRAAGR









PQSMERGKMAEAESLETAAEHERILREIESTDTACIGP









TLRSVYDGEEHGRFMEKLETRIRNHDREIEKMCNFH









YQGFVDSITELLKVRGEAQKLKNQVTDTNRKLQHEG









KELVIAMEELKQCRLQQRNISATVDKLMLCLPVLEM









YSKLRDQMKTKRHYPALKTLEHLEHTYLPQVSHYRF









CKVMVDNIPKLREEIKDVSMSDLKDFLESIRKHSDKI









GETAMKQAQQQRNLDNIVLQQPRIGSKRKSKKDAYI









IFDTEIESTSPKSEQDSGILDVEDEEDDEEVPGAQDLV









DFSPVYRCLHIYSVLGARETFENYYRKQRRKQARLV









LQPPSNMHETLDGYRKYFNQIVGFFVVEDHILHTTQ









GLVNRAYIDELWEMALSKTIAALRTHSSYCSDPNLV









LDLKNLIVLFADTLQVYGFPVNQLFDMLLEIRDQYSE









TLLKKWAGIFRNILDSDNYSPIPVTSEEMYKKVVGQF









PFQDIELEKQPFPKKFPFSEFVPKVYNQIKEFIYACLKF









SEDLHLSSTEVDDMIRKSTNLLLTRTLSNSLQNVIKR









KNIGLTELVQIIINTTHLEKSCKYLEEFITNITNVLPET









VHTTKLYGTTTFKDARHAAEEEIYTNLNQKIDQFLQL









ADYDWMTGDLGNKASDYLVDLIAFLRSTFAVFTHLP









VSGSCSYFVLYI





947
A
236
3
MLSVTAFILAETVLASQEVQGGVQVRVYLMNAVPD









GLQGGSPVGGLGLLLAPDNSGHRRSSCRIPAARVYX









XXXPRPP





948
A
1
2369
AGGARLRPARGRPPRLLPPRPGPCRPPPVPAPTVNER









RAPPRAGWERRSDAGLSRGARPAEMYGVCGCYGAL









RPRYKRLVDNIFPEDPEDGLVKTNMEKLTFYALSAPE









KLDRIGAYLSERLIRDVGRHRYGYVCIAMEALDQLL









MACHCQSINLFVESFLKMVAKLLESEKPNLQILGTNS









FVKFANIEEDTPSYHRSYDFFVSRFSEMCHSSHDDLEI









KTKIRMSGIKGLQGVVRKTVNDELQANIWDPQHMD









KIVPSLLFNLQHVEEAESRSPSPLQAPEKEKESPAELA









ERCLRELLGRAAFGNIKNAIKPVLIHLDNHSLWEPKV









FAIRCFKIIMYSIQPQHSHLVIQQLLGHLDANSRSAAT









VRAGIVEVLSEAAVIAATGSVGPTVLEMFN\TLLRQL









RLSIDYALTGSYDGAVSLGTKIIKEHEERMFQEAVIK









TVGSFASTLPTYQRSEVILFIMSKVPRPSLHQAVDTGR









TGENRNRLTQIMLLKSLLQVSTGFQCNNMMSALPSN









FLDRLLSTALMEDAEIRLFVLEILISFIDRHGNRHKFST









ISTLSDISVLKLKVDKCSRQDTVFMKKHSQQLYRHIY









LSCKEETNVQKHYEALYGLLALISIELANEEVVVDLI









RLVLAVQDVAQVNEENLPVYNRCALYALGAAYLNL









ISQLTTVPAFCQHIHEVIETRKKEAPYMLPEDVFVERP









RLSQNLDGVVIELLFRQSKISEVLGGSGYNSDRLCLP









YIPQLTDEDRLSKRRSIGETISLQVEVESRNSPEKEEVS









VRATVLGQPHLL





949
A
906
1046
PDHHNWSQ*TTTGAQRQT*KRTVKEV*SAHNEAMCF









GTCASDCLYR





950
A
489
855
RPVGRGGSRSDRGARAGRCAPDTLSALRCCWRSPAG









APGTQDPDPAGPGAATEAPALHPAGGTGTSPPPPATA









APTGGRGRPCADCCRRGARPGPAPTTAAAPAAATAA









TNTSAARLSGPAP





951
A
310
393
PHTDISGTPEIMHYVHVHRVTTQPRNKP





952
A
3
428
SSRLVLLAGAAALASGSQGDREPVYRDCVLQCEEQN









CSGGALNHFRSRQPIYMSLAGWTCRDDCKYECMWV









TVGLYLQEGHKVPQFHGKWPFSRFLFFQEPASAVAS









FLNGLASLVMLCRYRTFVPASSPMYHTCVAFAWVS





953
A
105
335
GRLFPKVLSYHSVGYLPLILFCHFLLANCILCCLMHFL









*FFQSYRF*G*KFGFTQHHCHYIFHKQWPLLWKNFPE









H





954
A
51
482
MVLGLLVQIWALQEASSLSVQQGPNLLQVRQGSQAT









LVCQVDQATAWERLRVKWTKDGAILCQPYITNGSLS









LGVCGPQGRLSWQAPSHLTLQLDPVSLNHSGAYVC









WAAVEIPELEEAEGNITRLFVDPDDPTQNRNRIASPP





955
A
425
1333
ELFMIKPPRNIIILNQQKLEAFPLKTGTRQGCPLSPLLF









NIVLEVLARAIRQEKEIKGIQLGKEEVKLSLFADDMIV









YLENPIVSAQNLLKLISNFSKVSGYKINVQKSQAFLY









TNNRQTESQIMSELPFTIASKRIKYLGIQLTRDVKDLF









KENYKPLLKEIKEDTNKWKNIPCSWVGRINIVKMAIL









PKVIYRFNAIPIKLPMTFFTELEKTTLKFIWNQKRARI









AKSILSQKNKSGGITLPDFKLYYKATVTKTACSPHSIV









LPATMMEQNRALRNNTTHHGRGSDPAGQAAAAAG









ATCQ





956
A
226
444
MRPDDINPRTGLVVALVRVFLVFGFMFTVSGMKGET









LGNIPLLAIGPAICLPGIAAIALARKTEGCTKWPEND





957
A
3
1371
SYFSSSTPTYPVGTTVEFSCDPGYTLEQGSIIIECVDPH









DPQWNETEPACRAVCSGEITDSAGVVLSPNWPEPYG









RGQDCIWGVHVEEDKRIMLDIRVLRIGPGDVLTFYD









GDDLTARVLGQYSGPRSHFKLFTSMADVTIQFQSDP









GTSVLGYQQGFVIHFFEVPRNDTCPELPEIPNGWKSP









SQPELVHGTVVTYQCYPGYQVVGSSVLMCQWDLT









WSEDLPSCQRVTSCHDPGDVEHSRRLISSPKFPVGAT









VQYICDQGFVLMGSSILTCHDRQAGSPKWSDRAPKC









LLEQLKPCHGLSAPENGARSPEKQLHPAGATIHFSCA









PGYVLKGQASIKCVPGHPSHWSDPPPICRAASLDGFY









NSRSLDVAKAPAASSTLDAAHIAAAIFLPLVAMVLLV









GGVYFYFSRLQGKSSLQLPRPRPRP\YNRI\TIESAF\DN









PTYETGETREYEVSI





958
A
1
2667
GAYHKHLMELALQQTYQDTCNCIKSRIKLEFEKRQQ









ERLLLSLLPAHIAMEMKAEIIQRLQGPKAGQMENTN









NFHNLYVKRHTNVSILYADIVGFTRLASDCSPGELVH









MLNELFGKFDQIAKENECMRIKILGDCYYCVSGLPIS









LPNHAKNCVKMGLDMCEAIKKVRDATGVDINMRV









GVHSGNVLCGVIGLQKWQYDVWSHDVTLANHMEA









GGVPGRVHISSVTLEHLNGAYKVEEGDGDIRDPYLK









QHLVKTYFVINPKGERRSPQHLFRPRHTLDGAKMRA









SVRMTRYLESWGAAKPFAHLHHRDSMTTENGKISTT









DVPMGQHNFQNRTLRTKSQKKRFEEELNERMIQAID









GINAQKQWLKSEDIQRISLLFYNKVLEKEYRATALPA









FKYYVTCACLIFFCIFIVQILVLPKTSVLGISFGAAFLL









LAFILFVCFAGQLLQCSKKASPLLMWLLKSSGIIANRP









WPRISLTIITTAIILMMAVFNMFFLSDSEETIPPTANTT









NTSFSASNNQVAILRAQNLFFLPYFIYSCILGLISCS\VF









LRVNYELKMLIMMVALVGYNTILLHTHAHVLGDYS









QVLFERPGIWKDLKTMGSVSLSIFFITLLVLGRQNEY









YCRLDFLWKNKFKKEREEIETMENLNRVLLENVLPA









HVAEHFLARSLKNEELYHQSYDCVCVMFASIPDFKE









FYTESDVNKEGLECLRLLNEIIADF\DDLLSKPKFSGV









EKIKTIGSTYMAATGLSAVPSQEHSQEPERQYMHIGT









MV\EFAFAL\VGKLDAINKHSFNDFKLRVGINHGPVIA









GVIGAQKPQYDIWGNTVNVASRMDSTGVLDKIQVTE









ETSLVLQTLGYTCTCRGIINVKGKGDLKTYFVNTEMS









RSLSQSNVAS





959
A
281
1092
AFCTVTLIFPHFQGAVIHKLGITLVSLLLFLTLTKTFPV









TCLVDDWFVHKASFPARLCYLYVVMQASKPKYYFA









WTLADAVNNAAGFGFSGVDKNGNFCWDLLSNLNIW









KIETATSFKMYLENWNIQTATWLKCVCYQRVPWYP









TVLTFILSALWHGVYPGYYFTFLTGILVTLAARAVRN









NYRHYFLSSRALKAVYDAGTWAVTQLAVSYTVAPF









VMLAVEPTISLYKSMYFYLHIISLLIILFLPMKPQAHT









QRRPQTLNSINKRKTD





960
A
1
361
VCFYVSAMVPVKSPREYYVQQEVIVLFCETVERALD









FGYLTQDMIDDYEPALMFSIPRLAIVWGLVVYADGP









LNLDRKVEDMSELFRPFHTLLRKIRDLLQTLTEEELH









TLERNLCISQD





961
A
710
1831
IRMKSKEIIARCIKPYHSMARTQPGTRNKENGPAGPT









ALDNVASSDDTGRHRPQTTQLAPGFAHPLQLASFRR









MVLFLSGEGRSRGGPELQFPASCRRGEGSPGVRESGS









GGIAATSTPNYPPNQDSKEHDIRGAEHQKQEQPAKPP









HTARSAYPPQKSSYPANAKATRHSPETAAAKEARAP









AAAQPQRHQPNPSPAPHTRPATAATRQPERRVPSPTH









RHPAATRLSPRRQSPSPRPHHDRRGFPRLAETLQHPM









CPLPLVASAGHHRKHHRLLLLLAPQAPEREASDEI\VFS









GRSRSRGCPTEFQESAMCFPNPGLPDSCGESQAVTILI









LRKFQKVIWVIEVPLDYKKGSWEYFSRMETIIMPFEN









IQSE





962
A
3
226
LDPNGEQVVWQASGWAARIIQHEMDHLQGCLFIDK









MDSRTFTNVYWMKDGTQKVQNNILSHVAILQLCPD









EENG





963
A
2
505
QGSRAKLSTPLGLSCTRSTAGPSRFARCSLGGCSHPS









RHSPHLPPPPPVQFRAGPRGRQGSPSRGSPS\GAFPAG









PGGAAAAAVGDDQQQQEQHGAHEGEENNEGNSVP









CG/PGKTGGSSVSPGLPEPWPPAPLWTQPSWSAPCH\P









*KPPIPPTRQVLGRTGCFLLPAP





964
A
1
709
DDPDYAQLGTRWHEGDADSISLELRKPDGTLVSFTA









DFKKDVKVFRALILGELEKGQSQFQALCFVTQLQHN









EIIPSEAMAKLRQKNPRAVRQAEEVRGLEHLHMDVA









VNFSQGALLSPHLHNVCAEAVDAIYTRQEDVRFWLE









QGVDSSVFEALPKASEQAELPRCRQVGDRGKPCVCH









YGLSLAWYPCMLKYCHSRDRPTPYKCGIRSCQKSYS









FDFYVPQRQLCLWDEDPYPG





965
A
1
1183
RLITVKLRR/GDTGRIPLSHIRLLPPDYKIQCAEPSPAL









LVPSAKRRSRKTSKDTGEGKDGGTAGSEEPGAKARG









RGRKPSAKAKGDRAATLEEGNPTDEVPSTPLALEPSS









TPGSKKSPPEPVDKRAKAPKARPALPQPSPAPPAFTS









CPAPEPFVELPAPATTLAPAPLITMPATRPKPKKARA









AKESGAKGPRRPGEEAELLVKLDHEGVMSPKSKKA









KEALLLWEDPGRGGLGPDRDLAQEPGPGLTFEDSGN









PKSPDKAQAEQDGAEESESSSSSSSSSSGSETEGEEEG









DKNGDGGCGAGGRGAPHQGHQAGWQGAASAHSPG









KKTPAPQPQAPPPQPTQPLQPKTQAGAKSRPKKREG









VHLPTTKELAKRQRLPSVENRPKIAAFLPA





966
A
1023
766
MLCSRLGTTASWRRLGIRAWAPLLLLFPWDWHFILS









FSSRPWAGTLLAPHDVIMGSSTFPQSCQAEAGPRHA









WPTGRFSRRLRRV*





967
A
651
836
TPGAPSGAQSNGWSSCEQSRPDVGEKGPLGRALCVP









CPSSPTHPKAKDGFPPFTAVILTSF





968
A
1
1206
MALSSWPVVLRLNMADFVFSFLCLGIGTSIVLGILFY









LLQAHRYLQEGMTYQLALSFYLTWASVFLFLMTGM









GEDEESALQTLLDPRSSYLLVSLEILPTNPSPLSPCAVS









EDESEMRGLSLLRRQSQATGRLEPTFKHDSTLLALQG









ALGLYDGHTPPYAACLGFEFRKHLGNPAKDGGNVT









VSLFYRNDSAHLPLPLSLPGCPAPCPLGRFYQLTAPA









RPPAHGVSCHGPYEAVIPPGPGAIIPSTGPAVGMQRE









RSEVGSGVPARTVYASEQHAYMWHSALIPDSGLRGK









PTLSSRKPPQTSCGPEFANVLSLALCGALVVCKARA









MDQARPRQLIGIDALRDPRASSRTRAGGLGMIRRQEE









EPAARTVLARCDSSPSECPSHARAPYDTGPLFNAKG





969
A
250
1013
NQPGWHGGGPSAGRAAKKCPGEVGPGAPAAAAEPA









RGDAGGEAGGCHPE/SPSTTSS*VIPST*ESSS*PSSPGF









GSSHMPGSTLMPPWCTPRSGPRTHSQRREVTCAWCS









CWEPGRTAASPAASQTSAGAS*PSPAAQATACPTNCS









SSSGPE*GGAHRDHSNRARTISTSSAPT*WT*TAKLRP









LDTPTLPGTSSWKTSCSVEWAASPTSTSSGGWRPFSA









GRNSRKDASGSLMLKMKNHLKLFNISSIFRGE





970
A
1
6384
MVSPEPKTGHSIRNWLDELKDLPILHAYSNLPSSPAV









DLAIHSSKEGRMDWTEGQVTGPVVRSAATSGAGSTT









SGVVSGSLGSREINYILRVLGPAACRNPDIFTEVANCC









IRIALPAPRGSGTGNGSSRIPRESAPEMATAESLVEEL









SEDAAGGASPGVELPALGCSELPAAEVSPTASSKNLE









TICEYAYCMAMLPETGLDPYPKRGFLDLTQERIWTDI









PPSPGNIPTTHPLMVRHADHSSLTLGSGSSTTRLTQGI









GRSQRTLRQLTANTGHTIHVHYPGNRQPNPPLILQRL









LGPSAAADILQLSSSLPLQSRGRARLL\VGNDDVHIIA









RSDDELLDDFFHDQ\STATSQAGTLSSIPTALTRWTEE









CKVLDAESMHDCVSVVKVSIVNHLEFLRDEELEERR









EKRRKQLAEEETKITDKGKEDKENRDQSAQCTASKS









NDSTEQNLSDGTPMPDSYPTTPSSTDAATSESKETLG









TLQSSQQQPTLPTPPALGEVPQELQSPAGEGGSSTQL









LMPVEPEELGPTRPSGEAETTQMELSPAPTITSLSPER









AEDSDALTAVSSQLEGSPMDTSSLASCTLEEAVGDTS









AAGSSEQPRAGSSTPGDAPPAVAEVQGRSDGSGESA









QPPEDSSPRASSESSSTRDSAVAISGADSRGILEEPLPS









TSSEEEDPLAGISLPEGVDPSFLAALPDDIRREVLQNQ









LGIRPPTRTAPSTNSSAPAVVGNPGVTEVSPEFLAALP









PAIQEEVLAQQRAEQQRRELAQNASSDTPMDPVTFIQ









TLPSDLRRSVLEDMEDSVLAVMPPDIAAEAQALRRE









QEARQRQLMHERLFGHSSTSALSAILRSPAFTSRLSG









NRGVQYTRLAVQRGGTFQMGGSSSHNRPSGSNVDT









LLRLRGRLLLDHEALSCLLVLLFVDEPKLNTSRLHRV









LRNLCYHAQTRHWVIRSLLSILQRSSESELCIETPKLT









TSEEKGKKSSKSCGSSSHENRPLDLLHKMESKSSNQL









SWLSVSMDAALGCRTNIFQIQRSGGRKHTEKHASGG









STVHIHPQAAPVVCRHVLDTLIQLAKVFPSHFTQQRT









KETNCESDRERGNKACSPCSSQSSSSGICTDFWDLLV









KLDNMNVSRKGK\NSV\KSVPVSAGG\EGETSPYSL\E









ASPLG\QLMNMLSHPVIRRSSLLTEKLLRLLSLISIALP









ENKVSEAQANSGSGASSTTTATSTTSTTTTTAASTTP









TPPYVHPPRVTSAPA\LVAATAISTIVVAASTTVTTPT









TATTTVSISPTTKGSKSPAKVSDGGSSSTDFKM\VSSG









LTENQLQLSVEVLTSHSCSEEGLEDAANVLLQLSRGD









SGTRDTVLKLLLNGARHLGYTLCKQIGTLLAELREY









NLEQQRRAQCETLSPDGLPEEQPQTTKLKGKMQSRF\









DMAENVVIVASQKRPLGGRELQLPSMSMLTSKTSTQ









KFFLRVLQVIIQLRDDTRRANKKAKQTGRLGSSGLGS









ASSIQAAVRQLEG*RLDAIIQMVREGQRARRQQQAA









TSESSQSEASVRREESPMDVDQSPSAQDTQSIASDG









TPQGEKEKEERPPELPLLSEQLS\LDELWDMLGECLK









ELEESHDQHAVLVLQAVEAFFLVHATERESKPPVR









DTRESQLAHIKDEPPPLSPAPLTPATP\SSFDQFFSGEP









S\SMHIS\SSLPPDTQKFLRFAETHRTVLNQILRQSTTH









LADGPFAVLVDYIRVLDFDVKRKYFRQELERLDEGL









RK\EDMAVHVRRDHVFEDSYRELHRKSPEEMKNRLY









IVFEGEEGQDAG\GLLREWVYDSSFREMF\NPMYGLF\









RTSPG*FESPNTINPS\SH\CNPNHLS\YFKFCSGRIV\AK









AVYDN\RLL\ECYFTRSFYKHHLGASSVRYTDM\ESE\









DYHF\YQGLGLSGWENDVSTL\GYDLTFQALRVPGV









LGVCEV\R\DLKPNGGQPSWVTEE\NKKEVCTPWYCQ









MRMTGAIRQQVAAFL\EGF\YEIIPK\RLISIF\TEHELEL









LISGLPTIDIDDLNPNTEYHKYQSNSI\QI\QWFLEETLP









FLSNQN*PVPKVPSQFVHGVPSKGNPWQGLCLPLEG









HGMGISGSFQVPFGGGQVPQIALPSAHTCF\N\QLDLP









AYESFEKLRHMLLLAI\QECSEGFGLA





971
A
3
1186
GGGGFSPRSKSQKPGRGRDGAVTPNRKNKGNYKKN









PAKRCEASESSHGKVRSSSTCSVQLPQVKEALKTIHI









KVIDDEAYEKNKYFIEMMGPRMVDMSFQKDVTDR









KLTMEEEEAKRIAEMGKPVLGEHPKLEVIIEESYEFK









TTVDKLIKKTNLALVVGTHSWRDQFMEAITVSAAGD









EDEDESGEERLPSCFDYVMHFLTVFWKVLFACVPPT









EYCHGWACFAVSILIIGMLTAIIGDLASHFGCTIGLKD









SVTAVVFVAFGTSVPDTFASKAAALQDVYADASIGN









VTGSNAVNVFLGIGLAWSVAAIYWALQGQEFHVSA









GTLAFSVTLFTIFAFVCISVLLYRRRPHLGGELGGPRG









CKLATTWLFVSLWLLYILFATLEAYCYIKGF





972
A
1
284
ERQDWESRLEAMECAFHLEKSVNQSLLELHQLAME









KGDPQLCDFLESHFLNQQVKAIKKLGDYLSNLCKT*









APEAGLAEYLFDKLTLGGSEEDT





973
A
2
2020
SQVRASLPEPRNSAAAMASNMDREMILADFQACTGI









ENIDEAITLLEQNNWDLVAAINGVIPQENGILQSEYG









GETIPGPAFNP\ASHPASAPYS/SPSSLPAFRPVMPTQG









RL*ER\QPRMLDFRVEYRDRNVDVVLEDTCTVGEIKQ









ILENELQIPVSKMLLKGWKTGDVEDSTVLKSLHLPK









NNSLYVLTPDLPPPSSSSHAGALQESLNQNFMLIITHR









EVQREYNLNFSGSSTIQEVKRNVYDLTSIPVRHQLWE









GWPTSATDDSMCLAESGLSYPCHRLTVGRRSSPAQT









REQSEEQITDVHMVSDSDGDDFEDATEFGVDDGEVF









GMASSALRKSPMMPENAENEGDALLQFTAEFSSRYG









DCHPVFFIGSLEAAFQEAFYVKARDRKLLAIYLHHDE









SVLTNVFCSQMLCAESIVSYLSQNFITWAWDLTKDS









NRARFLTMCNRHFGSVVAQTIRTQKTDQFPLFLIIMG









KRSSNEVLNVIQGNTTVDELMMRLMAAMEIFTAQQ









QEDIKDEDEREARENVKREQDEAYRLSLEADRAKRE









AHEREMAEQFRLEQIRKEQEEEREAIRLSLEQALPPEP









*EENAEPVSKLRIRTPSGEFLERGFLASNKLQIVFDFV









ASK\GF\PWDEYKLLSTFP\RRDVTQLDPNKS\LL\EVK









LFP\QETLFPWKPKE





974
A
1
1232
FPGRRFRLVVRLRGAEAASERQVYSVTMKLLLLHPA









FQSCLLLTLLGLWRTTPEAHASSPGAPAISAASFL*DL









IHRYGEGDSLTLQQLKALLNHLDVGVGRGNVSQHV









QGHRNPTTCFSSGDLFTAHNF\SEQLRIGSSELHEFCP









TILQQLDSRACTSENQENEENEQTEEGRPSAVEVWGF









GFLSVSLINLASLLGVLVLPCTEKAFFSRVLTYFIALSI









GTLLSNALFQLIPERSYKNKAQVDSLPTFLAQAGMLL









WRVRIRRRVVDPIRESWMLPFTKIPLWGYGLLCVTVI









SLCSLLGASVVPFMKKTFYKRLLLYFIALAIGTLYSN









ALFQLIPENRRKWWQPVHNTFGGSTAWHTDKSIEQS









IDTLFDEVKKESEKETPSLQIGDLGPQESLKTFNNTNS









PHH





975
A
1
740
AFVPFLLVTWSSAAFIISYVVAVLSGHVNPFLPYISDT









GTTPPESGIFGFMINFSAFLGAATMYTRYKIVQKQNQ









TCYFSTPVFNLVSLVLGLVGCFGMGIVANEQELAVP









VVHDGGALLAFVCGVVYTLLQSIISYKSCPQWNSLST









CHIRMVISAVSCAAVIPMIVCASLISITKLEWNPREKD









YVYHVVSAICEWTVAFGFIFYFLTFIQDFQSVTL\GYP









QKSMVIFEERRIQSHSVNVAGHF





976
A
2
374
IRRESTHLQQALGTTPQDRLTCTGHSAQPPACSASPL









PPGPP*SSAWPLPPSTRLARQKQAAATAQP*PLTTQTL









GPWSSASTWTSAHKQPGAAAQEWTSTAGSRQLLAG









ASGSSPSSCSVWTN





977
A
2
728
PSLIQCGGIPLTFRALRRALCRLPLPVHVRADPLRTW









RWHNLLVSFAHSIVSGIWALLCVWQTPDMLVEIETA









WSLSGYLLVCFSAGYFIHDTVDIVASGQTRASWEYL









VHHVMAMGAFFSGIFWSSFVGGGVLTLLVEVSNIFL









TIRMMMKISNAQDHLLYRVNKYVNLVMYFLFRLAP









QA\YLTHFFLRYVNQRTLGTFLLGILLMLDVMIIIYFS









RLLRSDFCPEHVPKKQHKDKFLTE





978
A
120
327
RGKLLEQGLDAWALLKPPASGQRPLRMQEDAGELQ









NERVGWLVVRFLQRVCCCGPCALVLPRLPISAA





979
A
238
2526
ALTKVNEGSMETKDLIVIGGGINGAGIAADAAGRGL









SVLMLEAQDLACATSSASSKLIHGGLRYLEHYEFRLV









SEALAEREVLLKMAPHIAFPMRFRLPHRPHLRPAWM









IRIGLFMYDHLGKRTSLPGSTGLRFGANSVLKPEIKR









GFEYSDCWVDDARLVLANAQMVALEKCNSIVAFVV









CHTSDEPCNCLPQVYGSDENASLTAYWYSGGVQGV









CALLPGHHRVPGLLYAGSAAAVSGLLRQFGIRLDVQ









VRLIPILAQFAGISVLLGVWAFSRPWHQPGKALALAK









RKADVAFEFFHKLHVPFYCFHDVDVSHESASLKEYI









NNFAQMVNVLAGKQEESGVKVLCGTVNCFTNPRYG









AGYKTLLNTDLRQEREHLGRFMQMVVEHKHKIGFQ









GTLLIEPNPQDPTKHQYDCDAATVYGFLKQFGLEKEI









KLNIQAIHATLAGLSFHQGTKLEPLKKGWLNCGKGR









SLRSFWLLRNVAKGVCVQRRLSWQFKHAWLIKFWA









PIPAVIASGILSTYYFGITGTFWTVTGEFTRWGGQLLQ









LFGVHAEEWGYFKIIHLEGSPLSRIDGMMILGMFGGC









FAAALWANNVKLRMPRTRTRIMQAIIGGILAGFGARL









AMGCSLAAFFTGIPGYRARNSNPGKEFCQVTAHCQT









RVKAGDNAANDGLHNSDTAARHSQFDIVGPQLFGK









PAANHREDHHPVDAVTPLFSMDADQLQELAAPTGK









FTRDSPKGASNAKIVSIENTRRYDRRDGGPEFNQPGV









FEMLP





980
B
1
3129
MASGRLNALAPEATPQGHNLQVDIVYGVDYQASVF









VQGAAFATGIPPDLYAFHRYTWNSTPLYETQACQYQ









MQFPVIPINACTLRITAAGFSKKFRSFPWAYTMVRAP









VFTTVNRRNPLRSRSGGVCGCFRKLTRKLAAKSALT









NCCVPSTKSMHTGSTTLPDFFAGMSDDFTPPIFAGYC









RDDSHELRFRLYALLLISDAIALGIEQKPDLILLGGDY









VLFDMSLNFSAFSDVLSPLAECAPTFACFGNHDRPGR









IAAASIGPLNNTVRKKASTRSSDSIQGVKRPQWRSQV









HDRGNKPDAHQQPEEKHATDNTLTVCVIFGGEPANS









ANNQRAYYALTQYGGYYTGSFQPGFQIRERIFPRTVS









RCDLRTGSECTRCLRNLRNFGDEKDIELQETECAVIR









ALVQVSHWQSTLAAAGQVLTIIVRTITVAFQHAADK









AADNGNLTAISWIHVSSLFLQAMRVAIPAVIVALSVG









TSEVQNMLNAIPEVVTNGLNIAGGMIVVVGYAMVIN









MMRAGYLMPFFYLGFVTAAFTNFNLVALGVIGTVM









AVLYIQLSPKYNRVAVRLLRQLVSEMVDTTQTTTEK









KLTQSDIRGVFLRSNLFQGSWNFERMQALGFCFSMV









PAIRRLYPENNEARKQAIRRHLEFFNTQPFVAAPILGV









TLALEEQRANGAEIDDGAINGIKVGLMGPLAGVGDPI









FWGTVRPVFAALGAGIAMSGSLLSPLLFFILFNLVRL









ATRYYGVRLVVLDLGLPDEDGLHFLARIRQKKYTLP









VLILTARDTLTDKIAGLDVGADDYLVKPFALEELHA









RIRALLRRHNNQGESELIVGNLTLNMGRRQVWMGG









EELILTPKEYALLSRLMLKAGSPVHREILYNDIYNWD









NEPSTNTLEVHIHNLRDKVGKARIRTVRGPGYMLLIS









VFWLWHESTEQIQLFEQALRDNRNNDRHIMREIREA









VASLIVPGVFMVSLTLFICYQAVRRITRPLAELQKELE









ARTADNLTPIAIHSATLEIEAVVSALNDLVSRLTSTLD









NERLFTADVAHELRTPLAG





981
A
1
939
MVPDRPAYPDVYDQLRFWQAGSLDIRNLHTLKVVLI









PGADRRSNCAITESRAEQFEPRQRHRDGAGQSRGAH









TVDWSAGDYRALCATAIVGPIAFIGLMMPHMARWL









VGADHRWSLPVTLLATPALLLFADIIGRVIVPGELRV









SVVSAFIGAPVLIFLVRRKTRGIWGLRSGAVTLETSQ









VFAALMGDAPRSMTMVVTEWRLPRVLMALLIGAAL









GVSGAIFQSLMRNPLGSPDVMGFNTGAWSGVLVAM









VLFGQDLTAIALSAMVGGIVTSLLVWLLAWRNGIDT









FRLIIIGIGVRAMLVAFNTWLLLKAS





982
B
1
1941
MKLTTHHLRTGAALLLAGILLAGCDQSSSDAKHIKV









GVINGAEQDVAEVAKKVAKEKYGLDVELVGFSGSL









LPNDATNHGELDANVFQHRTFLEQDNQAHGYKLVA









VGNTFVFPMADYGTRGGAVPRVLDDPKVDVAIISTT









YIQQTGLSPVHDSVFIEDKNSPYVNILVAREDNKNAE









NVQTKCQGRTNDHQIKKRQNQTAVNDKVCSLSGIK









RQQNQTANQHTPADDWTCRHIKCLQVIIMRVVDIML









ALQVCCWRWCWWQFSARRLINYRCISATKARRFRV









VDRISYSVKQGEVVGIVGESGSVNQTSSPAFPPAPAR









RRGTHPETRRNRQCVPELSSNKPTVSTSRETIATFTPP









KRSGIQPNMTRITTNAPPNAISPTTLHSDGRSPARPVT









LRTVPLQAHASTGNTIVVIASHVQNDRGSPVSYIAW









MPAGPLVILLFFTIYCASGIVAGARLFESTFGMSYETA









LWAGAAATILYTFIGGFLAVSWTDTVQASLMIFALIL









TPVIVIISVGGFGDSLEVIKQKSIENVDMLKGLNFVAII









SLMGWGLGYFGQPHILARFMAADSHHSIVHARRISM









TWMILCLAGAVAVGFFGIAYPNDHPALAGAVNQNA









ERVFIELAQILFNPWIAGILLSQFWRR





983
A
3
964
TISTVRWNSRIGMVLGVAIQKRAV\PGLY\SFEEAYAR









ADKEAPRPCHKGSWCSSNQLCRECQAFMAHTMPKL









KAFSMSSAYNAYRAVYAVAHGLHQLLGCASGACSR









GRVYPWQLLEQIHKVHFLLHKDTVAFNDNRDPLSSY









NILAWDWNGPKWTFTVLGSSTWSPVQLNINETKIQW









HGKDNQVPKSVCSSDCLEGHQRVVTGFHHCCFECVP









CGAGTFLNK/QCYLGKDLPENYNEAKCVTFSLLFNFV









SWIAFFTTASVYDGKYLPAANMMAGLSSLSSGFGGY









FLPKCYVILCRPDLNSTEHFQASIQDYTRRCGST





984
A
163
431
PTRNMATSAVPSDNLPTYKLVVVGDGGVFIIALNILS









FQTILAPGYYPYMCNIYLLHTAMDNHMPFLDV\LDK









PGPVEGTTIRYQYLRPG





985
A
398
553
ETGACIHCHCYWTPCQGHQRHHHHHHHQYHHHHH









HHQCHHHQYYHHHHHHFH





986
C
123
359
MRLKKHRWYKKILKSQDPIIFSVGWRRFQTILLYYIE









DHNGRQXASKXIPHSTCIVEQPFWAWIFRIAATRSLS









LXLG





987
A
2
410
IEIHSQCGGIPHRKLGMAGQKLGSSALLCYIRPDTAD









PASSFSLTVANVGTCQAVLCRGGKPVPLSKVFSLEQD









PEEAQRVKDQKAIITEVPEDLKEFQTTDIPVHHCNFP









AVLGCLCSASVLGSYAGPAPRWRRT





988
A
482
23
VASHGLGLLGLLLCSFGSECFQFTRIRWVFKRRLGLL









GRTLEASASATTLLPVSWVAHATIQDFWDDSIPDIIPR









WEFGGALYLGWAAGIFLALGGLLLIFSACLGKEDVP









FPLMAGPTVPLSCAPVEESDGSFHLMLRPRNLEFLVT









AYGLD





989
A
3
455
SWTLWRCCQRVVGWVPVLFITFVVVWSYYAYVVEL









CVY*CGNQGH*FEPELSYYPWK*R/QMFYLSNSEKER









YEKEFSQERQQEILRRAARALPIYTTSASKTIRYCEKC









QLIKLPDRAHHCSACDSCILKMDHHCPWVNNCVG\FS









NYKFFLL





990
A
93
320
VTPTPPQYYTCSCVLGFIACSIFLQMSLKPKVMLLTV









ALVACLVFNLSQCWQRDCCSQGLGNLTEPSGTNR*









GPA





991
A
2
445
EIDRKWYYDSYTCCPPPWFMITVTLLEVAFFLYNWV









SLGQFVLQVTHPRYLKNSLVYHPQLRAQVWRYLTYI









FMHAGIEHLGLNVVLQLLVGVPLEMVHGATRIGLVY









MAGVVAGSLAVSVADMTAPVVGSSGGVYALVSAH









LANIVM





992
A
3
457
VQRSIEDDGAERPS/PPGRSGASLVSGFPF*PLADSLLF









SSSVERGTDSGDGHPQRPSLGFPGTS/GFSAALGRKS









AHGPGLQAP\TGAPGG*YLPMPPGPCRILAGS*GGRA









ASLSYSPGFPLSLALFCHWAARGGLRSSLQQRERPRA









QTGV





993
A
27
437
RVDDIHCTAA*GRATPGSGTSLPGTLSSSPRRRCSSPS









CSAPASAATRSRRAWSTCWRPSTAWPRSSATSLTAS









SRPTAPSTVSTRTCRPPGATAGSSSPPSRKPTCWTSITP









T*S*SCMLLSPRSRFQGLVLITRL





994
A
2
406
FLVETEFCYVGQAGLELLTSRDPPASASKGAGMTGV









SHQVQPQ**S*LWT*/PSSVEAGTSFGLSFLSSSWALS









AQEGCLAVPS/SGSRGLLVGALLLWTKPSPQLSPVPA









SQRLSSLSLMPPLPQPQHLTHTSIET





995
A
1
439
GTRKPVYKPLVFVLLAVLVLSVTTQINYLNKALDT









FNTSLVTPIYYVFFTSMVVTCSAILFQEWYGMTAGDI









IGTLSGFFTIIIGIFLLHAEKNTDITWSELTSTAKKEAVS









LNVNENNYVLLENLECSAPGYNDDVTLFSRTDD





996
A
756
1016
KLRPFIFSNQSLWLHSYEGAELEKTFIKGSWATFWVK









VASCWACVLLYLGLLLAPLCWPPTQKPQPLILRRRR









HRIISPDNKYPPV





997
A
1497
717
HTPMA/FFL/SFLSTSET/VYTFVILPKMLINLLSVARTI









SFNCCALQMFFFLGFAITNCLLLGVMGYDRYAAICH









PLHYPTLMSWQVCGKLAAACAIGGFLASLTVVNLVF









SLPFCSTNKVNHYFCDISAVILLACTNTDVNGFVIFIC









GVLVLVVPFLFICVSYFCILRTILKIPSAEGRRKAPSTC









ASHLSVVIVHYGCASFIYLRPTANYVSNKDRLVTVTY









TIVTPLLNPMVYSLRNKDVQLAIRKVLGKKGSLKLY









N





998
B
1
975
MSPPGREQGLLLNLLRPSGLDNAGKTTILKKFNGEDI









DTISPTLGFNIKTLEHRGFKLNIWDVGGQKSLRSYWR









NYFESTDGLIWVVDSADRQRMQDCQRELQSLLVEEV









GSSYPLCTWRFFSYLRIEQMYNLVLYRDIQFPDFCFN









SNTDWSKGLKTHARFGNTSLHVAHTDSTNTTNFVD









VWRGRTKSLACLLQLSSLTCIYTAGKMRLQDRIATFF









FPKGMMLTTAALMLFFLHLGIFIRDVHNFCITYHYDH









MSFHYTVVLMFSQVISICWAAMGSLYAEMTENKYV









CFSALTILMLNGAMFFNRLSLEFLAIEYREEHH










[0408]

TABLE 9

SEQ ID NO:
Number of Transmembrane Regions Predicted
Position of Transmembrane Region; TMPred Score




















338
1
184-201; 807



340
2
21-46; 1142





54-70; 3147



341
1
297-319; 2854



343
1
14-34; 1660



346
1
20-47; 3001



347
1
9-31; 2958



348
1
28-44; 2183



349
1
41-59; 2412



350
2
34-53; 1125





67-84; 2061



353
2
34-51; 1665





133-151; 1190



354
1
20-39; 1830



355
1
75-92; 1800



356
1
48-63; 2723



357
3
28-43; 1680





58-73; 1675





90-105; 1928



359
1
53-68; 3633



360
1
142-159; 2140



362
1
69-87; 2593



363
1
17-35; 2291



365
3
16-37; 895





52-69; 1796





100-120; 1617



366
1
22-37; 2183



369
2
238-257; 908





396-412; 1281



371
1
27-42; 2043



372
3
52-75; 2018





325-346; 865





375-392; 839



373
1
353-370; 2096



374
1
25-45; 2047



375
1
24-47; 2800



376
2
71-86; 1595





102-121; 2779



377
10
25-41; 1489





54-72; 2563





87-103; 1436





116-134; 2525





149-165; 1474





178-196; 2516





211-227; 1420





240-258; 2456





273-289; 1392





302-320; 2395



378
2
22-48; 2007





141-164; 1410



379
2
21-41; 1941





102-117; 3056



380
8
29-44; 1389





61-74; 917





88-103; 1267





115-129; 890





179-193; 898





204-221; 1978





220-238; 1076





259-275; 1735



381
1
26-43; 1767



383
2
36-51; 2233





100-113; 2408



384
2
40-56; 1175





69-85; 1803



387
1
35-53; 2023



389
4
17-32; 2238





39-60; 1679





79-95; 2605





114-129; 1098



391
1
23-42; 2878



392
2
36-58; 1952





189-210; 874



395
4
25-48; 2108





276-291; 1253





334-351; 1063





399-416; 1680



396
4
22-37; 2458





45-60; 1250





82-98; 1641





159-176; 933



397
1
12-38; 1749



402
1
43-59; 2213



403
1
13-34; 2984



405
6
25-41; 1898





103-119; 1328





131-148; 2506





180-203; 1533





205-228; 1303





245-260; 1634



406
1
30-49; 2416



407
1
32-50; 1597



408
1
284-299; 1055



409
1
124-141; 2071



411
1
92-108; 1857



413
1
28-44; 2543



415
2
43-58; 1396





60-75; 2059



416
3
5-35; 1780





59-73; 1361





80-103; 1826



417
5
16-32; 1576





72-87; 1083





104-121; 1825





145-160; 1294





227-247; 1337



419
1
39-53; 1731



420
1
245-258; 1771



421
1
58-81; 2868



422
1
16-33; 1894



423
1
290-310; 2684



425
2
264-282; 1757





383-403; 1000



427
2
18-33; 892





108-126; 1867



428
1
37-56; 2054



429
1
369-387; 2530



430
2
14-34; 1939





187-208; 1365



431
2
43-58; 1060





155-170; 2602



432
4
24-45; 2509





98-119; 2954





129-147; 1343





183-201; 2141



433
1
142-157; 1775



434
1
33-49; 2264



435
1
43-57; 1794



437
1
15-38; 1948



438
2
20-34; 1518





82-98; 1908



439
1
64-80; 1560



440
1
24-40; 2347



441
1
14-32; 2720



442
1
23-44; 1807



443
2
15-31; 1300





118-140; 3012



444
4
95-111; 2524





104-139; 1338





125-147; 2138





174-209; 1036



445
2
6-38; 1711





49-67; 1103



446
2
15-31; 3431





69-86; 889



447
3
13-32; 2547





95-110; 1692





112-132; 1903



451
3
41-57; 1768





82-97; 2647





122-136; 968



452
1
250-265; 1867



453
3
46-62; 911





68-84; 1367





154-166; 1297



454
2
32-51; 2342





114-130; 1188



455
1
23-39; 2309



457
2
85-114; 2984





221-238; 959



458
2
35-50; 1595





66-85; 2779



459
2
17-32; 1331





57-71; 1728



460
3
14-31; 1963





40-58; 1009





66-86; 1248



461
1
226-242; 2202



462
2
46-61; 832





73-90; 2191



463
1
34-56; 1058



464
1
154-172; 2074



465
3
34-49; 1210





66-99; 1252





97-113; 2355



466
1
18-33; 1975



467
4
158-174; 1945





199-216; 1112





225-242; 1673





254-271; 946



468
1
15-33; 1775



469
1
181-199; 1868



470
5
38-54; 1712





67-94; 2110





114-128; 918





240-256; 855





277-292; 1359



471
2
50-74; 2625





130-149; 1166



472
4
16-38; 1473





43-59; 1371





77-94; 1851





199-214; 1092



473
1
46-62; 3051



474
1
17-34; 2743



475
1
95-118; 3033



476
1
213-230; 985



477
1
8-31; 3667



478
1
83-101; 2361



479
3
47-62; 1204





51-79; 1625





96-109; 1118



481
4
13-35; 1282





58-73; 2648





91-107; 1319





148-165; 1783



482
4
41-56; 1354





62-78; 1639





88-103; 977





134-150; 1946



483
2
25-46; 2369





66-81; 1705



484
5
20-43; 823





51-73; 1163





87-106; 1827





105-125; 1017





153-186; 1554



486
1
74-89; 3414



487
1
31-57; 2521



488
3
27-46; 2157





130-160; 1822





236-250; 888



490
10
28-44; 2267





50-76; 1625





68-88; 2769





93-113; 1629





118-138; 2697





153-168; 1629





178-194; 2313





203-238; 1733





244-263; 2730





269-284; 1367



491
1
40-67; 1986



494
3
23-40; 2163





266-285; 985





291-304; 1229



495
3
18-34; 2249





256-272; 1362





280-299; 1671



496
1
21-39; 2045



497
4
21-37; 2440





57-74; 1286





84-112; 1585





122-143; 1004



498
2
48-63; 1829





197-216; 1112



501
1
29-48; 1619



503
2
16-32; 1602





191-205; 890



504
3
44-60; 2409





103-123; 941





165-185; 2002



506
3
19-35; 2153





38-53; 1100





78-97; 1064



507
2
57-72; 2060





93-110; 939



508
8
23-47; 1290





60-80; 1779





87-106; 1447





159-187; 2236





202-216; 1085





234-249; 981





270-299; 1491





324-338; 1352



509
1
21-39; 2481



510
2
27-52; 1562





66-84; 864



511
2
15-31; 1529





41-56; 2722



512
1
21-36; 2544



513
2
16-34; 1960





40-55; 951



514
1
174-191; 1728



515
1
16-32; 827



516
1
45-66; 1964



517
3
17-40; 2165





71-83; 1112





116-143; 1198



518
1
23-39; 3165



519
1
42-59; 859



521
5
75-90; 1359





107-122; 1520





135-151; 1967





175-191; 1416





236-251; 2332



522
1
14-32; 2317



526
2
214-236; 1046





282-294; 966



527
6
125-141; 2144





157-173; 1116





185-204; 1756





223-238; 926





243-259; 1271





273-288; 1225



528
2
38-55; 1680





151-168; 2550



529
2
30-51; 2155





161-176; 905



530
6
36-50; 2210





58-74; 1644





126-141; 914





152-173; 1406





187-202; 2224





221-236; 1055



531
5
49-70; 1075





88-104; 1052





123-140; 1710





157-175; 2590





191-204; 1390



532
2
25-45; 1365





64-84; 1812



534
2
46-59; 1059





186-206; 1046



535
1
97-112; 1026



536
1
26-41; 1887



537
3
82-102; 1765





119-134; 1405





167-183; 2521



538
4
15-45; 1726





42-67; 2522





207-229; 861





274-291; 922



539
1
13-31; 2843



540
3
23-38; 1889





50-66; 831





121-137; 1096



541
3
19-35; 1356





72-87; 1830





105-120; 1373



543
2
22-48; 1148





384-399; 2339



545
1
36-51; 2076



546
6
83-100; 2781





111-133; 1847





157-173; 2151





175-191; 1172





236-251; 3053





307-322; 1307



547
1
14-34; 2733



548
1
31-50; 2047



549
1
118-137; 812



551
1
234-248; 948



552
1
7-41; 2396



554
1
18-33; 1771



555
1
15-39; 2946



557
1
36-51; 1750



558
3
30-58; 2255





69-85; 1303





102-116; 965



559
7
5-33; 2407





48-62; 834





82-101; 1768





116-136; 1635





165-185; 2884





226-247; 1338





263-282; 1779



561
1
26-47; 2958



562
1
43-58; 2185



563
1
51-66; 896



564
1
20-39; 1851



565
1
30-48; 2719



566
2
50-67; 1746





105-120; 1144



567
1
108-123; 1623



568
1
34-48; 2268



569
2
14-38; 2868





281-297; 941



571
1
217-239; 1272



572
1
146-168; 2684



573
2
90-107; 1944





363-377; 1338



574
3
64-81; 2157





84-100; 1243





97-133; 1672



575
1
48-72; 2661



576
3
2-38; 971





22-46; 1497





84-99; 1261



577
2
34-61; 2058





93-108; 1716



578
2
40-59; 1918





234-249; 859



579
1
24-45; 2330



581
1
296-313; 812



582
1
21-44; 2763



583
1
21-36; 2617



584
1
26-51; 825



586
4
34-55; 2354





150-169; 1592





311-333; 1867





353-375; 892



587
5
59-80; 1228





88-107; 866





157-176; 3161





198-216; 1250





223-238; 2194



588
1
195-210; 1193



589
1
19-35; 2865



590
1
69-98; 822



591
3
18-33; 2344





94-115; 1093





232-249; 1415



592
1
14-31; 2117



593
1
166-182; 2113



597
1
11-31; 871



599
1
31-53; 2985



601
1
20-44; 2459



602
1
20-37; 2284



603
1
22-42; 3116



604
1
46-62; 2496



606
1
19-33; 1834



607
2
41-71; 1782





65-86; 3101



608
3
19-34; 1101





46-62; 1928





185-201; 1841



609
1
17-39; 1978



610
1
364-379; 1065



612
1
22-40; 1765



614
1
38-53; 1788



615
1
14-32; 2099



616
2
32-52; 1769





77-102; 2317



617
4
153-175; 2138





189-204; 1068





261-283; 2271





290-306; 1112



618
1
1-34; 1975



619
1
10-38; 1023



620
1
15-31; 1522



621
1
74-91; 2543



622
5
49-64; 1187





82-96; 1485





119-140; 1408





129-153; 2110





206-222; 2257



623
1
66-83; 2200



626
2
75-94; 924





180-195; 1494



627
5
43-67; 2282





70-91; 1282





121-137; 2440





169-183; 1439





197-232; 1120



628
3
14-34; 1791





83-97; 1381





115-144; 1592



629
4
43-62; 1533





195-216; 2160





222-237; 1314





257-270; 1867



630
2
13-31; 1516





69-88; 2277



631
5
25-42; 1555





74-89; 1237





114-142; 2195





154-169; 1023





185-200; 2114



632
3
24-47; 1711





61-79; 2020





192-207; 2454



633
2
36-56; 1076





90-110; 1216



634
1
16-33; 2206



635
2
17-36; 2654





64-76; 932



636
1
19-34; 1366



637
1
28-46; 2247



638
2
23-43; 1069





58-75; 1756



639
4
21-39; 1494





81-97; 1518





125-143; 1312





148-169; 2440



640
10
7-32; 2014





82-96; 1124





107-123; 1475





148-167; 1298





170-193; 1565





258-273; 1090





296-316; 1839





324-345; 1356





354-369; 1159





420-437; 1669



641
2
44-60; 963





75-90; 3007



642
4
29-44; 1865





76-93; 1315





119-138; 1894





155-176; 1330



643
1
42-69; 2215



644
2
36-55; 2620





41-76; 845



645
1
3-35; 3176



646
1
56-73; 3062



648
3
45-61; 2010





110-125; 1024





175-193; 839



649
1
18-39; 2254



650
3
55-76; 2276





89-112; 1167





148-168; 2134



651
1
16-36; 2701



652
2
82-107; 1813





168-186; 2844



653
1
17-35; 2449



654
1
36-53; 2305



655
1
29-45; 2349



656
1
26-43; 2340



657
2
50-68; 1787





82-94; 808



658
2
41-55; 1214





76-91; 2379



659
1
120-139; 1924



660
2
25-41; 2077





208-223; 986



661
2
25-45; 1955





167-181; 1187



662
3
47-62; 2783





76-92; 1090





115-130; 2791



664
1
58-85; 1106



665
4
33-48; 1166





71-88; 2044





108-123; 1229





134-154; 2709



667
1
79-94; 1909



668
6
16-33; 2461





94-113; 2485





137-152; 1212





190-212; 3236





237-253; 971





266-285; 1138



670
2
48-66; 1420





56-86; 2350



671
1
14-32; 2650



672
2
23-42; 2154





134-155; 1123



673
3
16-34; 1811





55-70; 1301





82-99; 1627



674
1
43-58; 890
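
Each region entry in Table 9 has the form "start-end; score", for example "184-201; 807". A minimal sketch of reading one such entry is given below. It assumes that the two positions are inclusive amino acid residue numbers and that the score is an integer; the names TMRegion, parse_tm_region and region_length are hypothetical and are used for illustration only.

```python
import re
from typing import NamedTuple


class TMRegion(NamedTuple):
    """One predicted transmembrane region from a Table 9 entry (illustrative type)."""
    start: int  # first residue of the region (assumed inclusive)
    end: int    # last residue of the region (assumed inclusive)
    score: int  # TMPred score as printed in the table


# Matches entries such as "184-201; 807": two positions, a semicolon, a score.
_ENTRY = re.compile(r"^\s*(\d+)\s*-\s*(\d+)\s*;\s*(\d+)\s*$")


def parse_tm_region(entry: str) -> TMRegion:
    """Parse a single 'start-end; score' entry into a TMRegion."""
    match = _ENTRY.match(entry)
    if match is None:
        raise ValueError(f"unrecognized Table 9 entry: {entry!r}")
    start, end, score = (int(group) for group in match.groups())
    if end < start:
        raise ValueError(f"end position precedes start position: {entry!r}")
    return TMRegion(start, end, score)


def region_length(region: TMRegion) -> int:
    """Number of residues spanned, under the inclusive-position assumption."""
    return region.end - region.start + 1
```

Under these assumptions, the entry "184-201; 807" listed for SEQ ID NO: 338 would be read as a region spanning 18 residues with a score of 807.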











[0409]

TABLE 10

SEQ ID NO: of full-length nucleotide sequence
SEQ ID NO: of full-length peptide sequence
SEQ ID NO: of contig nucleotide sequence
SEQ ID NO: of contig peptide sequence
Identification of Priority Application that contig nucleotide sequence was filed (Attorney Docket No._SEQ ID NO.)*



















1
338





2
339


3
340
675
837
787_7102


4
341


5
342
676
838
790_16366


6
343
677
839
784_3345


7
344


8
345


9
346


10
347
678
840
788_13033


11
348


12
349


13
350


14
351


15
352
679
841
790_28890


16
353


17
354


18
355


19
356


20
357
680
842
790_24599


21
358
681
843
784_3534


22
359
682
844
790_9494


23
360


24
361
683
845
785_3560


25
362
684
846
787_2054


26
363


27
364
685
847
784_4307


28
365
686
848
787_2905


29
366


30
367
687
849
784_8386


31
368
688
850
790_17736


32
369
689
851
784_5156


33
370
690
852
787_2104


34
371


35
372
691
853
790_935


36
373


37
374


38
375
692
854
787_3283


39
376
693
855
787_7951


40
377
694
856
784_2168


41
378


42
379
695
857
784_9629


43
380


44
381


45
382


46
383
696
858
785_14


47
384
697
859
790_7599


48
385
698
860
787_4843


49
386
699
861
790_2819


50
387
700
862
790_8044


51
388


52
389
701
863
784_337


53
390
702
864
785_706


54
391
703
865
787_9834


55
392


56
393


57
394
704
866
787_3554


58
395
705
867
790_8276


59
396


60
397
706
868
790_18037


61
398


62
399
707
869
784_7084


63
400


64
401


65
402
708
870
790_3034


66
403
709
871
785_1867


67
404


68
405
710
872
790_3651


69
406
711
873
790_22283


70
407
712
874
785_1538


71
408
713
875
784_5140


72
409
714
876
790_14249


73
410


74
411


75
412


76
413
715
877
787_5698


77
414
716
878
790_29400


78
415


79
416
717
879
784_4813


80
417
718
880
784_9771


81
418
719
881
790_10961


82
419
720
882
790_11763


83
420


84
421
721
883
790_9831


85
422


86
423
722
884
790_16986


87
424
723
885
785_3654


88
425
724
886
785_102


89
426
725
887
784_4307


90
427


91
428


92
429


93
430
726
888
787_6896


94
431
727
889
789_3174


95
432


96
433


97
434


98
435


99
436


100
437


101
438


102
439
728
890
784_3746


103
440


104
441
729
891
785_2855


105
442


106
443


107
444
730
892
785_1465


108
445
731
893
784_1644


109
446
732
894
789_5053


110
447
733
895
787_1411


111
448
734
896
787_5936


112
449


113
450
735
897
784_2486


114
451
736
898
790_28311


115
452


116
453


117
454


118
455


119
456
737
899
784_3665


120
457


121
458
738
900
787_7951


122
459


123
460


124
461
739
901
787_4539


125
462
740
902
790_26713


126
463
741
903
790_10585


127
464


128
465
742
904
785_1092


129
466


130
467
743
905
790_17470


131
468


132
469
744
906
784_844


133
470
745
907
787_9644


134
471
746
908
789_1867


135
472
747
909
785_612


136
473


137
474
748
910
785_852


138
475
749
911
787_7533


139
476


140
477
750
912
785_2515


141
478
751
913
784_715


142
479
752
914
785_631


143
480
753
915
784_3853


144
481
754
916
790_10815


145
482
755
917
790_25607


146
483
756
918
790_10374


147
484
757
919
790_10504


148
485
758
920
790_21640


149
486
759
921
790_18317


150
487


151
488
760
922
785_640


152
489


153
490
761
923
787_5233


154
491
762
924
788_2575


155
492
763
925
790_22555


156
493
764
926
790_18977


157
494


158
495
765
927
792_4675


159
496
766
928
784_2550


160
497
767
929
787_7445


161
498


162
499
768
930
787_5416


163
500
769
931
784_4167


164
501
770
932
784_4677


165
502


166
503
771
933
784_10126


167
504


168
505


169
506


170
507


171
508


172
509


173
510
772
934
790_19568


174
511


175
512
773
935
791_3005


176
513


177
514


178
515


179
516


180
517


181
518
774
936
790_1155


182
519
775
937
790_10740


183
520


184
521


185
522


186
523


187
524


188
525


189
526
776
938
790_8077


190
527


191
528
777
939
784_929


192
529


193
530


194
531
778
940
787_5943


195
532


196
533
779
941
787_2691


197
534
780
942
785_3660


198
535


199
536


200
537


201
538


202
539


203
540


204
541
781
943
788_2020


205
542


206
543
782
944
787_4919


207
544


208
545


209
546


210
547
783
945
784_4970


211
548


212
549


213
550
784
946
784_4845


214
551


215
552


216
553
785
947
785_1670


217
554


218
555


219
556


220
557
786
948
787_4525


221
558
787
949
792_4456


222
559


223
560


224
561


225
562


226
563


227
564
788
950
790_16768


228
565
789
951
788_11952


229
566


230
567
790
952
787_2489


231
568


232
569


233
570
791
953
792_3487


234
571


235
572
792
954
785_395


236
573


237
574


238
575


239
576


240
577
793
955
790_10170


241
578
794
956
785_1618


242
579


243
580


244
581


245
582
795
957
787_4486


246
583
796
958
787_4256


247
584


248
585


249
586
797
959
784_5437


250
587


251
588
798
960
787_2155


252
589
799
961
790_15300


253
590


254
591


255
592


256
593


257
594
800
962
790_11358


258
595


259
596


260
597


261
598


262
599
801
963
790_3760


263
600
802
964
784_4787


264
601


265
602
803
965
787_4483


266
603


267
604
804
966
785_598


268
605


269
606
805
967
791_2994


270
607


271
608
806
968
790_11947


272
609


273
610
807
969
787_6368


274
611
808
970
790_21374


275
612


276
613
809
971
790_26925


277
614
810
972
788_8317


278
615
811
973
784_5609


279
616


280
617
812
974
790_4252


281
618
813
975
784_3437


282
619


283
620


284
621
814
976
790_11072


285
622
815
977
784_1021


286
623
816
978
790_16269


287
624


288
625


289
626


290
627


291
628
817
979
790_16011


292
629


293
630
818
980
790_28920


294
631
819
981
790_17932


295
632
820
982
790_25383


296
633


297
634


298
635


299
636


300
637


301
638


302
639


303
640


304
641


305
642
821
983
784_3789


306
643
822
984
787_4340


307
644


308
645
823
985
790_17189


309
646


310
647


311
648
824
986
790_20324


312
649


313
650
825
987
784_2129


314
651


315
652
826
988
787_5627


316
653


317
654


318
655
827
989
787_614


319
656
828
990
784_1483


320
657
829
991
787_2548


321
658


322
659
830
992
789_3213


323
660
831
993
789_4901


324
661


325
662


326
663


327
664
832
994
788_1187


328
665
833
995
784_4265


329
666


330
667
834
996
784_4819


331
668
835
997
784_3677


332
669


333
670


334
671


335
672


336
673
836
998
790_21539


337
674






784_XXX = SEQ ID NO: XXX of Attorney Docket No. 784, U.S. Ser. No. 09/488,725 filed Jan. 21, 2000, the entire disclosure of which, including sequence listing, is incorporated herein by reference. This application is the parent application of a continuation-in-part application bearing Attorney Docket No. 784CIP, U.S. application Ser. No. 09/552,317, filed Apr. 25, 2000, which in turn is a parent application of continuation-in-part application bearing Attorney Docket No. 784CIP3A/PCT, PCT Ser. No. PCT/US00/35017 filed Dec. 22, 2000, both of which are incorporated herein by reference in their entirety, including Tables, and Sequence Listing.



785_XXX = SEQ ID NO: XXX of Attorney Docket No. 785, U.S. Ser. No. 09/491,404 filed Jan. 25, 2000, the entire disclosure of which, including sequence listing, is incorporated herein by reference. This application is the parent application of a continuation-in-part application bearing Attorney Docket No. 785CIP3/PCT, PCT Ser. No. PCT/US01/02623 filed Jan. 25, 2001, which is incorporated herein by reference in its entirety, including Tables, and Sequence Listing.




787_XXX = SEQ ID NO: XXX of Attorney Docket No. 787, U.S. Ser. No. 09/496,914 filed Feb. 03, 2000, the entire disclosure of which, including sequence listing, is incorporated herein by reference. This application is the parent application of a continuation-in-part application bearing Attorney Docket No. 787CIP, U.S. application Ser. No. 09/560,875, filed Apr. 27, 2000, which in turn is a parent application of continuation-in-part application bearing Attorney Docket No. 787CIP3/PCT, PCT Ser. No. PCT/US01/03800 filed Feb. 5, 2001, both of which are incorporated herein by reference in their entirety, including Tables, and Sequence Listing.



788_XXX = SEQ ID NO: XXX of Attorney Docket No. 788, U.S. Ser. No. 09/515,126 filed Feb. 28, 2000, the entire disclosure of which, including sequence listing, is incorporated herein by reference. This application is the parent application of a continuation-in-part application bearing Attorney Docket No. 788CIP, U.S. application Ser. No. 09/577,409, filed May 18, 2000, which in turn is a parent application of continuation-in-part application bearing Attorney Docket No. 788CIP3/PCT, PCT Ser. No. PCT/US01/04927 filed Feb. 26, 2001, both of which are incorporated herein by reference in their entirety, including Tables, and Sequence Listing.



789_XXX = SEQ ID NO: XXX of Attorney Docket No. 789, U.S. Ser. No. 09/519,705 filed Mar. 07, 2000, the entire disclosure of which, including sequence listing, is incorporated herein by reference. This application is the parent application of a continuation-in-part application bearing Attorney Docket No. 789CIP, U.S. application Ser. No. 09/574,454, filed May 19, 2000, which in turn is a parent application of continuation-in-part application bearing Attorney Docket No. 789CIP3/PCT, PCT Ser. No. PCT/US01/04941 filed Mar. 5, 2001, both of which are incorporated herein by reference in their entirety, including Tables, and Sequence Listing.



790_XXX = SEQ ID NO: XXX of Attorney Docket No. 790, U.S. Ser. No. 09/540,217 filed Mar. 31, 2000, the entire disclosure of which, including sequence listing, is incorporated herein by reference. This application is the parent application of a continuation-in-part application bearing Attorney Docket No. 790CIP, U.S. application Ser. No. 09/649,167, filed Aug. 23, 2000, which in turn is a parent application of continuation-in-part application bearing Attorney Docket No. 790CIP3/PCT, PCT Ser. No. PCT/US01/08631 filed Mar. 30, 2001, both of which are incorporated herein by reference in their entirety, including Tables, and Sequence Listing.



791_XXX = SEQ ID NO: XXX of Attorney Docket No. 791, U.S. Ser. No. 09/552,929 filed Apr. 18, 2000, the entire disclosure of which, including sequence listing, is incorporated herein by reference. This application is the parent application of a continuation-in-part application bearing Attorney Docket No. 791CIP, U.S. application Ser. No. 09/770,160, filed Jan. 26, 2001, which in turn is a parent application of continuation-in-part application bearing Attorney Docket No. 791CIP3/PCT, PCT Ser. No. PCT/US01/8656 filed Apr. 18, 2001, both of which are incorporated herein by reference in their entirety, including Tables, and Sequence Listing.



792_XXX = SEQ ID NO: XXX of Attorney Docket No. 792, U.S. Ser. No. 09/577,408 filed May 18, 2000, the entire disclosure of which, including sequence listing, is incorporated herein by reference. This application is the parent application of a continuation-in-part application bearing Attorney Docket No. 792CIP3/PCT, PCT Ser. No. PCT/US01/14827 filed May 16, 2001, which is incorporated herein by reference in its entirety, including Tables, and Sequence Listing.
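
The identifiers in the last column of Table 10 follow the pattern defined in the notes above: an Attorney Docket No., an underscore, and the SEQ ID NO: within that priority application (for example, 787_7102). A minimal sketch of splitting such an identifier is given below. The names PriorityRef, KNOWN_DOCKETS and parse_priority_ref are hypothetical, and the docket set simply lists the eight docket numbers named in the notes.

```python
from typing import NamedTuple

# Docket numbers named in the notes to Table 10.
KNOWN_DOCKETS = {"784", "785", "787", "788", "789", "790", "791", "792"}


class PriorityRef(NamedTuple):
    """A Table 10 priority identifier split into its two parts (illustrative type)."""
    docket: str  # Attorney Docket No. of the priority application
    seq_id: int  # SEQ ID NO: of the contig within that application


def parse_priority_ref(identifier: str) -> PriorityRef:
    """Split an identifier such as '787_7102' into docket and SEQ ID NO:."""
    docket, separator, seq = identifier.strip().partition("_")
    if not separator or docket not in KNOWN_DOCKETS or not seq.isdigit():
        raise ValueError(f"unrecognized priority identifier: {identifier!r}")
    return PriorityRef(docket, int(seq))
```

Under these assumptions, 787_7102 would be read as SEQ ID NO: 7102 of Attorney Docket No. 787.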








Claims
  • 1. An isolated polynucleotide comprising a nucleotide sequence selected from the group consisting of SEQ ID NO: 1-337.
  • 2. An isolated polynucleotide encoding a polypeptide with biological activity, wherein said polynucleotide hybridizes to the polynucleotide of claim 1 under stringent hybridization conditions.
  • 3. An isolated polynucleotide encoding a polypeptide with biological activity, wherein said polynucleotide has greater than about 99% sequence identity with the polynucleotide of claim 1.
  • 4. The polynucleotide of claim 1 wherein said polynucleotide is DNA.
  • 5. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises the complementary sequences.
  • 6. A vector comprising the polynucleotide of claim 1.
  • 7. An expression vector comprising the polynucleotide of claim 1.
  • 8. A host cell genetically engineered to comprise the polynucleotide of claim 1.
  • 9. A host cell genetically engineered to comprise the polynucleotide of claim 1 operatively associated with a regulatory sequence that modulates expression of the polynucleotide in the host cell.
  • 10. An isolated polypeptide, wherein the polypeptide is selected from the group consisting of: (a) a polypeptide encoded by any one of the polynucleotides of claim 1; and (b) a polypeptide encoded by a polynucleotide hybridizing under stringent conditions with any one of SEQ ID NO: 1-337.
  • 11. A composition comprising the polypeptide of claim 10 and a carrier.
  • 12. An antibody directed against the polypeptide of claim 10.
  • 13. A method for detecting the polynucleotide of claim 1 in a sample, comprising: a) contacting the sample with a compound that binds to and forms a complex with the polynucleotide of claim 1 for a period sufficient to form the complex; and b) detecting the complex, so that if a complex is detected, the polynucleotide of claim 1 is detected.
  • 14. A method for detecting the polynucleotide of claim 1 in a sample, comprising: a) contacting the sample under stringent hybridization conditions with nucleic acid primers that anneal to the polynucleotide of claim 1 under such conditions; b) amplifying a product comprising at least a portion of the polynucleotide of claim 1; and c) detecting said product and thereby the polynucleotide of claim 1 in the sample.
  • 15. The method of claim 14, wherein the polynucleotide is an RNA molecule and the method further comprises reverse transcribing an annealed RNA molecule into a cDNA polynucleotide.
  • 16. A method for detecting the polypeptide of claim 10 in a sample, comprising: a) contacting the sample with a compound that binds to and forms a complex with the polypeptide under conditions and for a period sufficient to form the complex; and b) detecting formation of the complex, so that if a complex formation is detected, the polypeptide of claim 10 is detected.
  • 17. A method for identifying a compound that binds to the polypeptide of claim 10, comprising: a) contacting the compound with the polypeptide of claim 10 under conditions sufficient to form a polypeptide/compound complex; and b) detecting the complex, so that if the polypeptide/compound complex is detected, a compound that binds to the polypeptide of claim 10 is identified.
  • 18. A method for identifying a compound that binds to the polypeptide of claim 10, comprising: a) contacting the compound with the polypeptide of claim 10, in a cell, under conditions sufficient to form a polypeptide/compound complex, wherein the complex drives expression of a reporter gene sequence in the cell; and b) detecting the complex by detecting reporter gene sequence expression, so that if the polypeptide/compound complex is detected, a compound that binds to the polypeptide of claim 10 is identified.
  • 19. A method of producing the polypeptide of claim 10, comprising: a) culturing a host cell comprising a polynucleotide sequence selected from the group consisting of any of the polynucleotides from SEQ ID NO: 1-337, under conditions sufficient to express the polypeptide in said cell; and b) isolating the polypeptide from the cell culture or cells of step (a).
  • 20. An isolated polypeptide comprising an amino acid sequence selected from the group consisting of any one of the polypeptides SEQ ID NO: 338-674.
  • 21. The polypeptide of claim 20 wherein the polypeptide is provided on a polypeptide array.
  • 22. A collection of polynucleotides, wherein the collection comprises at least one of SEQ ID NO: 1-337.
  • 23. The collection of claim 22, wherein the collection is provided on a nucleic acid array.
  • 24. The collection of claim 23, wherein the array detects full-matches to any one of the polynucleotides in the collection.
  • 25. The collection of claim 23, wherein the array detects mismatches to any one of the polynucleotides in the collection.
  • 26. The collection of claim 22, wherein the collection is provided in a computer-readable format.
Priority Claims (7)
Number Date Country Kind
PCT/US01/02623 Jan 2001 WO
PCT/US01/03800 Feb 2001 WO
PCT/US01/04927 Feb 2001 WO
PCT/US01/04941 Mar 2001 WO
PCT/US01/08631 Mar 2001 WO
PCT/US01/08656 Apr 2001 WO
PCT/US01/14827 May 2001 WO
Cross Reference to Related Applications

[0001] This application is a continuation-in-part application of PCT application Ser. No. PCT/US00/35017 filed Dec. 22, 2000 entitled “Novel Contigs Obtained from Various Libraries”, Attorney Docket No. 784CIP3A/PCT, which in turn is a continuation-in-part application of U.S. application Ser. No. 09/552,317 filed Apr. 25, 2000 entitled “Novel Contigs Obtained from Various Libraries”, Attorney Docket No. 784CIP, which in turn is a continuation-in-part application of U.S. application Ser. No. 09/488,725 filed Jan. 21, 2000 entitled “Novel Contigs Obtained from Various Libraries”, Attorney Docket No. 784; PCT application Ser. No. PCT/US01/02623 filed Jan. 25, 2001 entitled “Novel Contigs Obtained from Various Libraries”, Attorney Docket No. 785CIP3/PCT, which in turn is a continuation-in-part application of U.S. application Ser. No. 09/491,404 filed Jan. 25, 2000 entitled “Novel Contigs Obtained from Various Libraries”, Attorney Docket No. 785; PCT application Ser. No. PCT/US01/03800 filed Feb. 5, 2001 entitled “Novel Contigs Obtained from Various Libraries”, Attorney Docket No. 787CIP3/PCT, which in turn is a continuation-in-part application of U.S. application Ser. No. 09/560,875 filed Apr. 27, 2000 entitled “Novel Contigs Obtained from Various Libraries”, Attorney Docket No. 787CIP, which in turn is a continuation-in-part application of U.S. application Ser. No. 09/496,914 filed Feb. 03, 2000 entitled “Novel Contigs Obtained from Various Libraries”, Attorney Docket No. 787; PCT application Ser. No. PCT/US01/04927 filed Feb. 26, 2001 entitled “Novel Contigs Obtained from Various Libraries”, Attorney Docket No. 788CIP3/PCT, which in turn is a continuation-in-part application of U.S. application Ser. No. 09/577,409 filed May 18, 2000 entitled “Novel Contigs Obtained from Various Libraries”, Attorney Docket No. 788CIP, which in turn is a continuation-in-part application of U.S. application Ser. No. 09/515,126 filed Feb. 28, 2000 entitled “Novel Contigs Obtained from Various Libraries”, Attorney Docket No. 788; PCT application Ser. No. PCT/US01/04941 filed Mar. 5, 2001 entitled “Novel Contigs Obtained from Various Libraries”, Attorney Docket No. 789CIP3/PCT, which in turn is a continuation-in-part application of U.S. application Ser. No. 09/574,454 filed May 19, 2000 entitled “Novel Contigs Obtained from Various Libraries”, Attorney Docket No. 789CIP, which in turn is a continuation-in-part application of U.S. application Ser. No. 09/519,705 filed Mar. 07, 2000 entitled “Novel Contigs Obtained from Various Libraries”, Attorney Docket No. 789; PCT application Ser. No. PCT/US01/08631 filed Mar. 30, 2001 entitled “Novel Contigs Obtained from Various Libraries”, Attorney Docket No. 790CIP3/PCT, which in turn is a continuation-in-part application of U.S. application Ser. No. 09/649,167 filed Aug. 23, 2000 entitled “Novel Contigs Obtained from Various Libraries”, Attorney Docket No. 790CIP, which in turn is a continuation-in-part application of U.S. application Ser. No. 09/540,217 filed Mar. 31, 2000 entitled “Novel Contigs Obtained from Various Libraries”, Attorney Docket No. 790; PCT application Ser. No. PCT/US01/08656 filed Apr. 18, 2001 entitled “Novel Contigs Obtained from Various Libraries”, Attorney Docket No. 791CIP3/PCT, which in turn is a continuation-in-part application of U.S. application Ser. No. 09/770,160 filed Jan. 26, 2001 entitled “Novel Contigs Obtained from Various Libraries”, Attorney Docket No. 791CIP, which is in turn a continuation-in-part application of U.S. application Ser. No. 09/552,929 filed Apr. 18, 2000 entitled “Novel Contigs Obtained from Various Libraries”, Attorney Docket No. 791; and PCT application Ser. No. PCT/US01/14827 filed May 16, 2001 entitled “Novel Contigs Obtained from Various Libraries”, Attorney Docket No. 792CIP3/PCT, which in turn is a continuation-in-part application of U.S. application Ser. No. 09/577,408 filed May 18, 2000 entitled “Novel Contigs Obtained from Various Libraries”, Attorney Docket No. 792; all of which are incorporated herein by reference in their entirety.

Provisional Applications (1)
Number Date Country
60322511 Sep 2001 US
Continuation in Parts (3)
Number Date Country
Parent PCT/US00/35017 Dec 2000 US
Child 10243552 Sep 2002 US
Parent 09552317 Apr 2000 US
Child 10243552 Sep 2002 US
Parent 09488725 Jan 2000 US
Child 10243552 Sep 2002 US