Novel methods of diagnosis of angiogenesis, compositions and methods of screening for angiogenesis modulators

Abstract
Described herein are methods and compositions that can be used for diagnosis and treatment of angiogenic phenotypes and angiogenesis-associated diseases. Also described herein are methods that can be used to identify modulators of angiogenesis.
Description


FIELD OF THE INVENTION

[0002] The invention relates to the identification of nucleic acid and protein expression profiles and nucleic acids, products, and antibodies thereto that are involved in angiogenesis; and to the use of such expression profiles and compositions in diagnosis and therapy of angiogenesis. The invention further relates to methods for identifying and using agents and/or targets that modulate angiogenesis.



BACKGROUND OF THE INVENTION

[0003] Both vasculogenesis, the development of an interactive vascular system comprising arteries and veins, and angiogenesis, the generation of new blood vessels, play a role in embryonic development. In contrast, angiogenesis is limited in a normal adult to the placenta, ovary, endometrium and sites of wound healing. However, angiogenesis, or its absence, plays an important role in the maintenance of a variety of pathological states. Some of these states are characterized by neovascularization, e.g., cancer, diabetic retinopathy, glaucoma, and age-related macular degeneration. Others, e.g., stroke, infertility, heart disease, ulcers, and scleroderma, are diseases of angiogenic insufficiency. Angiogenesis has a number of stages (see, e.g., Folkman, J. Natl Cancer Inst. 82:4-6, 1990; Firestein, J Clin Invest. 103:3-4, 1999; Koch, Arthritis Rheum. 41:951-62, 1998; Carter, Oncologist 5(Suppl 1):51-4, 2000; Browder et al., Cancer Res. 60:1878-86, 2000; and Zhu and Witte, Invest New Drugs 17:195-212, 1999). The early stages of angiogenesis include endothelial cell protease production, migration of cells, and proliferation. The early stages also appear to require some growth factors, with VEGF, TGF-A, angiostatin, and selected chemokines all putatively playing a role. Later stages of angiogenesis include population of the vessels with mural cells (pericytes or smooth muscle cells), basement membrane production, and the induction of vessel bed specializations. The final stages of vessel formation include what is known as “remodeling”, wherein a forming vasculature becomes a stable, mature vessel bed. Thus, the process is highly dynamic, often requiring coordinated spatial and temporal waves of gene expression.


[0004] Conversely, the complex process may be subject to disruption by interfering with one or more critical steps. Thus, the lack of understanding of the dynamics of angiogenesis prevents therapeutic intervention in serious diseases such as those indicated. It is an object of the invention to provide methods that can be used to screen compounds for the ability to modulate angiogenesis. Additionally, it is an object to provide molecular targets for therapeutic intervention in disease states which either have an undesirable excess or a deficit in angiogenesis. The present invention provides solutions to both.



SUMMARY OF THE INVENTION

[0005] The present invention provides compositions and methods for detecting or modulating angiogenesis associated sequences.


[0006] In one aspect, the invention provides a method of detecting an angiogenesis-associated transcript in a cell in a patient, the method comprising contacting a biological sample from the patient with a polynucleotide that selectively hybridizes to a sequence at least 80% identical to a sequence as shown in Table 1. In one embodiment, the biological sample is a tissue sample. In another embodiment, the biological sample comprises isolated nucleic acids, which are often mRNA.


[0007] In another embodiment, the method further comprises the step of amplifying nucleic acids before the step of contacting the biological sample with the polynucleotide. Often, the polynucleotide comprises a sequence as shown in Table 1. The polynucleotide can be labeled, for example, with a fluorescent label and can be immobilized on a solid surface.


[0008] In other embodiments the patient is undergoing a therapeutic regimen to treat a disease associated with angiogenesis or the patient is suspected of having an angiogenesis-associated disorder.


[0009] In another aspect, the invention comprises an isolated nucleic acid molecule consisting of a polynucleotide sequence as shown in Table 1. The nucleic acid molecule can be labeled, for example, with a fluorescent label.


[0010] In other aspects, the invention provides an expression vector comprising an isolated nucleic acid molecule consisting of a polynucleotide sequence as shown in Table 1 or a host cell comprising the expression vector.


[0011] In another embodiment, the isolated nucleic acid molecule encodes a polypeptide having an amino acid sequence as shown in Table 2.


[0012] In another aspect, the invention provides an isolated polypeptide which is encoded by a nucleic acid molecule having a polynucleotide sequence as shown in Table 1. In one embodiment, the isolated polypeptide has an amino acid sequence as shown in Table 2.


[0013] In another embodiment, the invention provides an antibody that specifically binds a polypeptide that has an amino acid sequence as shown in Table 2. The antibody can be conjugated to an effector component such as a fluorescent label, a toxin, or a radioisotope. In some embodiments, the antibody is an antibody fragment or a humanized antibody.


[0014] In another aspect, the invention provides a method of detecting a cell undergoing angiogenesis in a biological sample from a patient, the method comprising contacting the biological sample with an antibody that specifically binds to a polypeptide that has an amino acid sequence as shown in Table 2. In some embodiments, the antibody is further conjugated to an effector component, for example, a fluorescent label.


[0015] In another embodiment, the invention provides a method of detecting antibodies specific to angiogenesis in a patient, the method comprising contacting a biological sample from the patient with a polypeptide comprising a sequence as shown in Table 2.


[0016] The invention also provides a method of identifying a compound that modulates the activity of an angiogenesis-associated polypeptide, the method comprising the steps of: (i) contacting the compound with a polypeptide that comprises at least 80% identity to an amino acid sequence as shown in Table 2; and (ii) detecting an increase or a decrease in the activity of the polypeptide. In one embodiment, the polypeptide has an amino acid sequence as shown in Table 2. In another embodiment, the polypeptide is expressed in a cell.


[0017] The invention also provides a method of identifying a compound that modulates angiogenesis, the method comprising steps of: (i) contacting the compound with a cell undergoing angiogenesis; and (ii) detecting an increase or a decrease in the expression of a polypeptide sequence as shown in Table 2. In one embodiment, the detecting step comprises hybridizing a nucleic acid sample from the cell with a polynucleotide that selectively hybridizes to a sequence at least 80% identical to a sequence as shown in Table 1. In another embodiment, the method further comprises detecting an increase or decrease in the expression of a second sequence as shown in Table 2.


[0018] In another embodiment, the invention provides a method of inhibiting angiogenesis in a cell that expresses a polypeptide at least 80% identical to a sequence as shown in Table 2, the method comprising the step of contacting the cell with a therapeutically effective amount of an inhibitor of the polypeptide. In one embodiment, the polypeptide has an amino acid sequence shown in Table 2. In another embodiment, the inhibitor is an antibody.


[0019] In other embodiments, the invention provides a method of activating angiogenesis in a cell that expresses a polypeptide at least 80% identical to a sequence as shown in Table 2, the method comprising the step of contacting the cell with a therapeutically effective amount of an activator of the polypeptide. In one embodiment, the polypeptide has an amino acid sequence shown in Table 2.


[0020] Other aspects of the invention will become apparent to the skilled artisan by the following description of the invention.


[0021] Table 1 provides nucleotide sequences of genes that exhibit changes in expression levels as a function of time in tissue undergoing angiogenesis compared to tissue that is not.


[0022] Table 2 provides polypeptide sequences of proteins that exhibit changes in expression levels as a function of time in tissue undergoing angiogenesis compared to tissue that is not.



DESCRIPTION OF THE SPECIFIC EMBODIMENTS

[0023] In accordance with the objects outlined above, the present invention provides novel methods for diagnosis and treatment of disorders associated with angiogenesis (sometimes referred to herein as angiogenesis disorders or AD), as well as methods for screening for compositions which modulate angiogenesis. By “disorder associated with angiogenesis” or “disease associated with angiogenesis” herein is meant a disease state which is marked by either an excess or a deficit of vessel development. Angiogenesis disorders associated with increased angiogenesis include, but are not limited to, cancer and proliferative diabetic retinopathy. Pathological states for which it may be desirable to increase angiogenesis include stroke, heart disease, infertility, ulcers, and scleroderma. Also provided are methods for treating AD.


[0024] Definitions


[0025] The term “angiogenesis protein” or “angiogenesis polynucleotide” refers to nucleic acid and polypeptide polymorphic variants, alleles, mutants, and interspecies homologs that: (1) have an amino acid sequence that has greater than about 60% amino acid sequence identity, 65%, 70%, 75%, 80%, 85%, 90%, preferably 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% or greater amino acid sequence identity, preferably over a region of at least about 25, 50, 100, 200, 500, 1000, or more amino acids, to an angiogenesis protein sequence of Table 2; (2) bind to antibodies, e.g., polyclonal antibodies, raised against an immunogen comprising an amino acid sequence of Table 2, and conservatively modified variants thereof; (3) specifically hybridize under stringent hybridization conditions to an anti-sense strand corresponding to a nucleic acid sequence of Table 1 and conservatively modified variants thereof; or (4) have a nucleic acid sequence that has greater than about 95%, preferably greater than about 96%, 97%, 98%, 99%, or higher nucleotide sequence identity, preferably over a region of at least about 25, 50, 100, 200, 500, 1000, or more nucleotides, to a sense sequence corresponding to one set out in Table 1. A polynucleotide or polypeptide sequence is typically from a mammal including, but not limited to, primate, e.g., human; rodent, e.g., rat, mouse, hamster; cow, pig, horse, sheep, or any mammal. An “angiogenesis polypeptide” and an “angiogenesis polynucleotide” include both naturally occurring and recombinant forms.


[0026] A “full length” angiogenesis protein or nucleic acid refers to an angiogenesis polypeptide or polynucleotide sequence, or a variant thereof, that contains all of the elements normally contained in one or more naturally occurring, wild type angiogenesis polynucleotide or polypeptide sequences. The “full length” may be prior to, or after, various stages of post-translational processing.


[0027] “Biological sample” as used herein is a sample of biological tissue or fluid that contains nucleic acids or polypeptides, e.g., of an angiogenic protein. Such samples include, but are not limited to, tissue isolated from primates, e.g., humans, or rodents, e.g., mice, and rats. Biological samples may also include sections of tissues such as biopsy and autopsy samples, and frozen sections taken for histologic purposes. A biological sample is typically obtained from a eukaryotic organism, most preferably a mammal such as a primate e.g., chimpanzee or human; cow; dog; cat; a rodent, e.g., guinea pig, rat, mouse; rabbit; or a bird; reptile; or fish.


[0028] “Providing a biological sample” means to obtain a biological sample for use in methods described in this invention. Most often, this will be done by removing a sample of cells from an animal, but can also be accomplished by using previously isolated cells (e.g., isolated by another person, at another time, and/or for another purpose), or by performing the methods of the invention in vivo. Archival tissues, having treatment or outcome history, will be particularly useful.


[0029] The terms “identical” or percent “identity,” in the context of two or more nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same (i.e., about 70% identity, preferably 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or higher identity over a specified region (e.g., SEQ ID NOS:1-4), when compared and aligned for maximum correspondence over a comparison window or designated region) as measured using the BLAST or BLAST 2.0 sequence comparison algorithms with default parameters described below, or by manual alignment and visual inspection (see, e.g., NCBI web site http://www.ncbi.nlm.nih.gov/BLAST/ or the like). Such sequences are then said to be “substantially identical.” This definition also refers to, or may be applied to, the complement of a test sequence. The definition also includes sequences that have deletions and/or additions, as well as those that have substitutions. As described below, the preferred algorithms can account for gaps and the like. Preferably, identity exists over a region that is at least about 25 amino acids or nucleotides in length, or more preferably over a region that is 50-100 amino acids or nucleotides in length.
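By way of illustration only, percent identity over a designated region of two already-aligned sequences might be computed along the following lines. This is a minimal sketch, not the BLAST calculation itself: the function names, the gap character, and the convention that a column with a gap in only one sequence counts as a mismatch are assumptions made for the example.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of aligned columns in which both sequences carry the same residue.

    Assumptions for this sketch: the inputs are already aligned (equal length),
    columns that are gaps in both sequences are ignored, and a column with a
    gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_identity(aligned_a: str, aligned_b: str,
                   threshold: float = 80.0, min_region: int = 25) -> bool:
    """True if the aligned region is long enough and identity reaches the threshold."""
    if len(aligned_a) < min_region:
        return False
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, `percent_identity("ACGT", "ACGA")` gives 75.0 under these conventions. Other conventions for treating gaps are in use and would give different values for gapped alignments.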


[0030] For sequence comparison, typically one sequence acts as a reference sequence, to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are entered into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. Preferably, default program parameters can be used, or alternative parameters can be designated. The sequence comparison algorithm then calculates the percent sequence identities for the test sequences relative to the reference sequence, based on the program parameters.


[0031] A “comparison window”, as used herein, includes reference to a segment of any one of the number of contiguous positions selected from the group consisting of from 20 to 600, usually about 50 to about 200, more usually about 100 to about 150 in which a sequence may be compared to a reference sequence of the same number of contiguous positions after the two sequences are optimally aligned. Methods of alignment of sequences for comparison are well-known in the art. Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson & Lipman, Proc. Nat'l. Acad. Sci. USA 85:2444 (1988), by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by manual alignment and visual inspection (see, e.g., Current Protocols in Molecular Biology (Ausubel et al., eds. 1995 supplement)).
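The local homology approach of Smith & Waterman referred to above can be sketched as a short dynamic program. The example below is illustrative only: it returns the best local alignment score rather than the alignment itself, and the scoring values (match +1, mismatch -1, linear gap penalty -1) are assumptions chosen for the example, not parameters taken from this description.

```python
def smith_waterman_score(a: str, b: str,
                         match: int = 1, mismatch: int = -1, gap: int = -1) -> int:
    """Best local alignment score between a and b (Smith-Waterman, linear gaps).

    Scoring values are illustrative assumptions. Only two rows of the
    dynamic-programming matrix are kept, since only the score is returned.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

With these assumed scores, two sequences sharing only the subsequence ACG, such as `"TTACGTTT"` and `"GGACGGG"`, score 3.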


[0032] Preferred examples of algorithms that are suitable for determining percent sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms, which are described in Altschul et al., J. Mol. Biol. 215:403-410 (1990) and Altschul et al., Nuc. Acids Res. 25:3389-3402 (1997), respectively. BLAST and BLAST 2.0 are used, with the parameters described herein, to determine percent sequence identity for the nucleic acids and proteins of the invention. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=-4 and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength of 3, and expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 (1992)), alignments (B) of 50, expectation (E) of 10, M=5, N=-4, and a comparison of both strands.
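The seed-and-extend procedure described above can be illustrated for nucleotide sequences with a simplified, ungapped sketch. It is not the BLAST implementation: only exact word matches are used as seeds (the neighborhood threshold T is not modeled), the function names are hypothetical, and the drop-off value X=20 is an assumption, since no default for X is stated here. W=11, M=5 and N=-4 follow the BLASTN defaults given above.

```python
def find_word_hits(query: str, subject: str, w: int = 11):
    """Return (query_pos, subject_pos) pairs where a word of length w matches exactly."""
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)
    hits = []
    for i in range(len(query) - w + 1):
        for j in index.get(query[i:i + w], []):
            hits.append((i, j))
    return hits


def _extend(pairs, score: int, m: int, n: int, x: int):
    """Extend over residue pairs; return (best score, length giving that score)."""
    best, best_len = score, 0
    for k, (a, b) in enumerate(pairs, start=1):
        score += m if a == b else n
        if score > best:
            best, best_len = score, k
        # Halt when the score falls off by X from its maximum or reaches zero or below.
        if score <= 0 or best - score >= x:
            break
    return best, best_len


def extend_hit(query: str, subject: str, qi: int, si: int,
               w: int = 11, m: int = 5, n: int = -4, x: int = 20):
    """Ungapped extension of one word hit in both directions.

    Returns (score, (query_start, query_end)) with query_end exclusive.
    x=20 is an assumed value for this sketch.
    """
    seed = w * m
    right_best, right_len = _extend(
        zip(query[qi + w:], subject[si + w:]), seed, m, n, x)
    left_best, left_len = _extend(
        zip(reversed(query[:qi]), reversed(subject[:si])), right_best, m, n, x)
    return left_best, (qi - left_len, qi + w + right_len)
```

In this sketch the end of either sequence halts extension implicitly, because the paired iteration stops when the shorter flank is exhausted.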


[0033] The BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin & Altschul, Proc. Nat'l. Acad. Sci. USA 90:5873-5877 (1993)). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.2, more preferably less than about 0.01, and most preferably less than about 0.001.
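As a purely illustrative sketch, the three smallest sum probability cutoffs stated above can be applied to a reported P(N) value as follows. The function name and the tier labels are hypothetical; the cutoffs of 0.2, 0.01 and 0.001 are those given above, and P(N) itself is assumed to have been obtained from a BLAST comparison.

```python
def similarity_tier(p_n: float) -> str:
    """Map a smallest sum probability P(N) onto the cutoffs stated in the text.

    The tier labels are illustrative names, not terms defined in the description.
    """
    if not 0.0 <= p_n <= 1.0:
        raise ValueError("P(N) is a probability and must lie in [0, 1]")
    if p_n < 0.001:
        return "most preferred"
    if p_n < 0.01:
        return "more preferred"
    if p_n < 0.2:
        return "similar"
    return "not similar"
```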


[0034] An indication that two nucleic acid sequences or polypeptides are substantially identical is that the polypeptide encoded by the first nucleic acid is immunologically cross reactive with the antibodies raised against the polypeptide encoded by the second nucleic acid, as described below. Thus, a polypeptide is typically substantially identical to a second polypeptide, for example, where the two peptides differ only by conservative substitutions. Another indication that two nucleic acid sequences are substantially identical is that the two molecules or their complements hybridize to each other under stringent conditions, as described below. Yet another indication that two nucleic acid sequences are substantially identical is that the same primers can be used to amplify the sequences.


[0035] A “host cell” is a naturally occurring cell or a transformed cell that contains an expression vector and supports the replication or expression of the expression vector. Host cells may be cultured cells, explants, cells in vivo, and the like. Host cells may be prokaryotic cells such as E. coli, or eukaryotic cells such as yeast, insect, amphibian, or mammalian cells such as CHO, HeLa, and the like (see, e.g., the American Type Culture Collection catalog or web site, www.atcc.org).


[0036] The terms “polypeptide,” “peptide” and “protein” are used interchangeably herein to refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which one or more amino acid residue is an artificial chemical mimetic of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers and non-naturally occurring amino acid polymers.


[0037] The term “amino acid” refers to naturally occurring and synthetic amino acids, as well as amino acid analogs and amino acid mimetics that function in a manner similar to the naturally occurring amino acids. Naturally occurring amino acids are those encoded by the genetic code, as well as those amino acids that are later modified, e.g., hydroxyproline, γ-carboxyglutamate, and O-phosphoserine. Amino acid analogs refers to compounds that have the same basic chemical structure as a naturally occurring amino acid, i.e., an α carbon that is bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine, norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs have modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic chemical structure as a naturally occurring amino acid. Amino acid mimetics refers to chemical compounds that have a structure that is different from the general chemical structure of an amino acid, but that function in a manner similar to a naturally occurring amino acid.


[0038] Amino acids may be referred to herein by either their commonly known three letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly accepted single-letter codes.


[0039] “Conservatively modified variants” applies to both amino acid and nucleic acid sequences. With respect to particular nucleic acid sequences, conservatively modified variants refers to those nucleic acids which encode identical or essentially identical amino acid sequences, or where the nucleic acid does not encode an amino acid sequence, to essentially identical sequences. Because of the degeneracy of the genetic code, a large number of functionally identical nucleic acids encode any given protein. For instance, the codons GCA, GCC, GCG and GCU all encode the amino acid alanine. Thus, at every position where an alanine is specified by a codon, the codon can be altered to any of the corresponding codons described without altering the encoded polypeptide. Such nucleic acid variations are “silent variations,” which are one species of conservatively modified variations. Every nucleic acid sequence herein which encodes a polypeptide also describes every possible silent variation of the nucleic acid. One of skill will recognize that each codon in a nucleic acid (except AUG, which is ordinarily the only codon for methionine, and UGG, which is ordinarily the only codon for tryptophan) can be modified to yield a functionally identical molecule. Accordingly, each silent variation of a nucleic acid which encodes a polypeptide is implicit in each described sequence with respect to the expression product, but not with respect to actual probe sequences.
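The degeneracy described above can be illustrated with a short sketch that lists the synonymous codons for a given codon and counts the coding sequences that encode the same polypeptide. It assumes the standard genetic code and an RNA coding sequence read in frame from its first base; the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons ordered U, C, A, G at each position.
_BASES = "UCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def synonymous_codons(codon: str) -> list:
    """All codons (sorted) that encode the same amino acid as the given codon."""
    amino = CODON_TABLE[codon.upper().replace("T", "U")]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino)


def count_silent_variants(rna: str) -> int:
    """Number of coding sequences (including the input) encoding the same polypeptide."""
    if len(rna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    total = 1
    for i in range(0, len(rna), 3):
        total *= len(synonymous_codons(rna[i:i + 3]))
    return total
```

For instance, `synonymous_codons("GCA")` returns the four alanine codons named above, while AUG and UGG each return only themselves.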


[0040] As to amino acid sequences, one of skill will recognize that individual substitutions, deletions or additions to a nucleic acid, peptide, polypeptide, or protein sequence which alters, adds or deletes a single amino acid or a small percentage of amino acids in the encoded sequence is a “conservatively modified variant” where the alteration results in the substitution of an amino acid with a chemically similar amino acid. Conservative substitution tables providing functionally similar amino acids are well known in the art. Such conservatively modified variants are in addition to and do not exclude polymorphic variants, interspecies homologs, and alleles of the invention.


[0041] The following eight groups each contain amino acids that are conservative substitutions for one another: 1) Alanine (A), Glycine (G); 2) Aspartic acid (D), Glutamic acid (E); 3) Asparagine (N), Glutamine (Q); 4) Arginine (R), Lysine (K); 5) Isoleucine (I), Leucine (L), Methionine (M), Valine (V); 6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W); 7) Serine (S), Threonine (T); and 8) Cysteine (C), Methionine (M) (see, e.g., Creighton, Proteins (1984)).
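For illustration, the eight groups listed above can be encoded directly and used to test whether a single-residue change is a conservative substitution. This is a minimal sketch using one-letter codes; the function name is hypothetical, and identical residues are treated as "no substitution" rather than as a conservative one.

```python
# The eight groups as listed in the text, in one-letter code.
CONSERVATIVE_GROUPS = ["AG", "DE", "NQ", "RK", "ILMV", "FYW", "ST", "CM"]


def is_conservative_substitution(original: str, replacement: str) -> bool:
    """True if the two different residues share at least one of the listed groups."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # same residue: not a substitution at all
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)
```

Because methionine appears in both group 5 and group 8, it is conservative with respect to isoleucine, leucine, valine and cysteine, although cysteine and leucine are not conservative with respect to each other.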


[0042] Macromolecular structures such as polypeptide structures can be described in terms of various levels of organization. For a general discussion of this organization, see, e.g., Alberts et al., Molecular Biology of the Cell (3rd ed., 1994) and Cantor and Schimmel, Biophysical Chemistry Part I: The Conformation of Biological Macromolecules (1980). “Primary structure” refers to the amino acid sequence of a particular peptide. “Secondary structure” refers to locally ordered, three dimensional structures within a polypeptide. These structures are commonly known as domains. Domains are portions of a polypeptide that form a compact unit of the polypeptide and are typically 25 to approximately 500 amino acids long. Typical domains are made up of sections of lesser organization such as stretches of β-sheet and α-helices. “Tertiary structure” refers to the complete three dimensional structure of a polypeptide monomer. “Quaternary structure” refers to the three dimensional structure formed, usually by the noncovalent association of independent tertiary units. Anisotropic terms are also known as energy terms.


[0043] A “label” or a “detectable moiety” is a composition detectable by spectroscopic, photochemical, biochemical, immunochemical, chemical, or other physical means. For example, useful labels include 32P, fluorescent dyes, electron-dense reagents, enzymes (e.g., as commonly used in an ELISA), biotin, digoxigenin, or haptens and proteins which can be made detectable, e.g., by incorporating a radiolabel into the peptide or used to detect antibodies specifically reactive with the peptide.


[0044] An “effector” or “effector moiety” or “effector component” is a molecule that is bound (or linked, or conjugated), either covalently, through a linker or a chemical bond, or noncovalently, through ionic, van der Waals, electrostatic, or hydrogen bonds, to an antibody. The “effector” can be a variety of molecules including, for example, detection moieties including radioactive compounds, fluorescent compounds, an enzyme or substrate, tags such as epitope tags, a toxin; a chemotherapeutic agent; a lipase; an antibiotic; or a radioisotope emitting “hard,” e.g., beta, radiation.


[0045] A “labeled nucleic acid probe or oligonucleotide” is one that is bound, either covalently, through a linker or a chemical bond, or noncovalently, through ionic, van der Waals, electrostatic, or hydrogen bonds to a label such that the presence of the probe may be detected by detecting the presence of the label bound to the probe. Alternatively, methods using high affinity interactions may achieve the same results where one of a pair of binding partners binds to the other, e.g., biotin and streptavidin.


[0046] As used herein a “nucleic acid probe or oligonucleotide” is defined as a nucleic acid capable of binding to a target nucleic acid of complementary sequence through one or more types of chemical bonds, usually through complementary base pairing, usually through hydrogen bond formation. As used herein, a probe may include natural (i.e., A, G, C, or T) or modified bases (7-deazaguanosine, inosine, etc.). In addition, the bases in a probe may be joined by a linkage other than a phosphodiester bond, so long as it does not interfere with hybridization. Thus, for example, probes may be peptide nucleic acids in which the constituent bases are joined by peptide bonds rather than phosphodiester linkages. It will be understood by one of skill in the art that probes may bind target sequences lacking complete complementarity with the probe sequence depending upon the stringency of the hybridization conditions. The probes are preferably directly labeled as with isotopes, chromophores, lumiphores, chromogens, or indirectly labeled such as with biotin to which a streptavidin complex may later bind. By assaying for the presence or absence of the probe, one can detect the presence or absence of the select sequence or subsequence.


[0047] The term “recombinant” when used with reference, e.g., to a cell, or nucleic acid, protein, or vector, indicates that the cell, nucleic acid, protein or vector, has been modified by the introduction of a heterologous nucleic acid or protein or the alteration of a native nucleic acid or protein, or that the cell is derived from a cell so modified. Thus, for example, recombinant cells express genes that are not found within the native (non-recombinant) form of the cell or express native genes that are otherwise abnormally expressed, under expressed or not expressed at all.


[0048] The term “heterologous” when used with reference to portions of a nucleic acid indicates that the nucleic acid comprises two or more subsequences that are not found in the same relationship to each other in nature. For instance, the nucleic acid is typically recombinantly produced, having two or more sequences from unrelated genes arranged to make a new functional nucleic acid, e.g., a promoter from one source and a coding region from another source. Similarly, a heterologous protein indicates that the protein comprises two or more subsequences that are not found in the same relationship to each other in nature (e.g., a fusion protein).


[0049] A “promoter” is defined as an array of nucleic acid control sequences that direct transcription of a nucleic acid. As used herein, a promoter includes necessary nucleic acid sequences near the start site of transcription, such as, in the case of a polymerase II type promoter, a TATA element. A promoter also optionally includes distal enhancer or repressor elements, which can be located as much as several thousand base pairs from the start site of transcription. A “constitutive” promoter is a promoter that is active under most environmental and developmental conditions. An “inducible” promoter is a promoter that is active under environmental or developmental regulation. The term “operably linked” refers to a functional linkage between a nucleic acid expression control sequence (such as a promoter, or array of transcription factor binding sites) and a second nucleic acid sequence, wherein the expression control sequence directs transcription of the nucleic acid corresponding to the second sequence.


[0050] An “expression vector” is a nucleic acid construct, generated recombinantly or synthetically, with a series of specified nucleic acid elements that permit transcription of a particular nucleic acid in a host cell. The expression vector can be part of a plasmid, virus, or nucleic acid fragment. Typically, the expression vector includes a nucleic acid to be transcribed operably linked to a promoter.


[0051] The phrase “selectively (or specifically) hybridizes to” refers to the binding, duplexing, or hybridizing of a molecule only to a particular nucleotide sequence under stringent hybridization conditions when that sequence is present in a complex mixture (e.g., total cellular or library DNA or RNA).


[0052] The phrase “stringent hybridization conditions” refers to conditions under which a probe will hybridize to its target subsequence, typically in a complex mixture of nucleic acids, but to no other sequences. Stringent conditions are sequence-dependent and will be different in different circumstances. Longer sequences hybridize specifically at higher temperatures. An extensive guide to the hybridization of nucleic acids is found in Tijssen, Techniques in Biochemistry and Molecular Biology: Hybridization with Nucleic Acid Probes, “Overview of principles of hybridization and the strategy of nucleic acid assays” (1993). Generally, stringent conditions are selected to be about 5-10° C. lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength, pH, and nucleic acid concentration) at which 50% of the probes complementary to the target hybridize to the target sequence at equilibrium (as the target sequences are present in excess, at Tm, 50% of the probes are occupied at equilibrium). Stringent conditions will be those in which the salt concentration is less than about 1.0 M sodium ion, typically about 0.01 to 1.0 M sodium ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30° C. for short probes (e.g., 10 to 50 nucleotides) and at least about 60° C. for long probes (e.g., greater than 50 nucleotides). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide. For selective or specific hybridization, a positive signal is at least two times background, preferably 10 times background hybridization. Exemplary stringent hybridization conditions can be as follows: 50% formamide, 5× SSC, and 1% SDS, incubating at 42° C., or 5× SSC, 1% SDS, incubating at 65° C., with wash in 0.2× SSC and 0.1% SDS at 65° C. For PCR, a temperature of about 36° C. is typical for low stringency amplification, although annealing temperatures may vary between about 32° C. and 48° C. depending on primer length. For high stringency PCR amplification, a temperature of about 62° C. is typical, although high stringency annealing temperatures can range from about 50° C. to about 65° C., depending on the primer length and specificity. Typical cycle conditions for both high and low stringency amplifications include a denaturation phase of 90° C.-95° C. for 30 sec.-2 min., an annealing phase lasting 30 sec.-2 min., and an extension phase of about 72° C. for 1-2 min. Protocols and guidelines for low and high stringency amplification reactions are provided, e.g., in Innis et al. (1990) PCR Protocols, A Guide to Methods and Applications, Academic Press, Inc., N.Y.
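
By way of illustration only, the numerical criteria of the preceding paragraph can be sketched in a few lines of Python. The function names and their arguments are hypothetical and form no part of the disclosure; the only values taken from the text are the window of about 5-10° C. below Tm and the signal thresholds of two times and ten times background.

```python
def stringent_temperature_window(tm_celsius):
    """Return (low, high) hybridization temperatures, i.e. 10 and 5 deg C below Tm."""
    return (tm_celsius - 10.0, tm_celsius - 5.0)


def classify_signal(signal, background):
    """Classify a hybridization signal relative to background.

    At least 2x background is scored as positive; at least 10x background
    is scored as the preferred level of positive signal.
    """
    if background <= 0:
        raise ValueError("background must be positive")
    ratio = signal / background
    if ratio >= 10.0:
        return "positive (preferred)"
    if ratio >= 2.0:
        return "positive"
    return "negative"
```

A probe with a measured or estimated Tm would first be passed to `stringent_temperature_window` to choose an incubation temperature, and the resulting signal would then be scored with `classify_signal`. How Tm itself is obtained is deliberately left outside this sketch.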


[0053] Nucleic acids that do not hybridize to each other under stringent conditions are still substantially identical if the polypeptides which they encode are substantially identical. This occurs, for example, when a copy of a nucleic acid is created using the maximum codon degeneracy permitted by the genetic code. In such cases, the nucleic acids typically hybridize under moderately stringent hybridization conditions. Exemplary “moderately stringent hybridization conditions” include a hybridization in a buffer of 40% formamide, 1 M NaCl, 1% SDS at 37° C., and a wash in 1× SSC at 45° C. A positive hybridization is at least twice background. Those of ordinary skill will readily recognize that alternative hybridization and wash conditions can be utilized to provide conditions of similar stringency. Additional guidelines for determining hybridization parameters are provided in numerous references, e.g., Current Protocols in Molecular Biology, ed. Ausubel, et al.


[0054] The phrase “functional effects” in the context of assays for testing compounds that modulate activity of an angiogenesis protein includes the determination of a parameter that is indirectly or directly under the influence of the angiogenesis protein, e.g., a functional, physical, or chemical effect, such as the ability to increase or decrease angiogenesis. It includes binding activity, the ability of cells to proliferate, expression in cells undergoing angiogenesis, and other characteristics of angiogenic cells. “Functional effects” include in vitro, in vivo, and ex vivo activities.


[0055] By “determining the functional effect” is meant assaying for a compound that increases or decreases a parameter that is indirectly or directly under the influence of an angiogenesis protein sequence, e.g., functional, physical and chemical effects. Such functional effects can be measured by any means known to those skilled in the art, e.g., changes in spectroscopic characteristics (e.g., fluorescence, absorbance, refractive index), hydrodynamic (e.g., shape), chromatographic, or solubility properties for the protein, measuring inducible markers or transcriptional activation of the angiogenesis protein; measuring binding activity or binding assays, e.g., binding to antibodies, and measuring cellular proliferation, particularly endothelial cell proliferation. Determination of the functional effect of a compound on angiogenesis can also be performed using angiogenesis assays known to those of skill in the art such as in vitro assays, e.g., in vitro endothelial cell tube formation assays, and other assays such as the chick CAM assay, the mouse corneal assay, and assays that assess vascularization of an implanted tumor. The functional effects can be evaluated by many means known to those skilled in the art, e.g., microscopy for quantitative or qualitative measures of alterations in morphological features, e.g., tube or blood vessel formation, measurement of changes in RNA or protein levels for angiogenesis-associated sequences, measurement of RNA stability, identification of downstream or reporter gene expression (CAT, luciferase, β-gal, GFP and the like), e.g., via chemiluminescence, fluorescence, colorimetric reactions, antibody binding, inducible markers, and ligand binding assays.


[0056] “Inhibitors”, “activators”, and “modulators” of angiogenic polynucleotide and polypeptide sequences are used to refer to activating, inhibitory, or modulating molecules identified using in vitro and in vivo assays of angiogenic polynucleotide and polypeptide sequences. Inhibitors are compounds that, e.g., bind to, partially or totally block activity, decrease, prevent, delay activation, inactivate, desensitize, or down regulate the activity or expression of angiogenesis proteins, e.g., antagonists. “Activators” are compounds that increase, open, activate, facilitate, enhance activation, sensitize, agonize, or up regulate angiogenesis protein activity. Inhibitors, activators, or modulators also include genetically modified versions of angiogenesis proteins, e.g., versions with altered activity, as well as naturally occurring and synthetic ligands, antagonists, agonists, antibodies, small chemical molecules and the like. Such assays for inhibitors and activators include, e.g., expressing the angiogenic protein in vitro, in cells, or cell membranes, applying putative modulator compounds, and then determining the functional effects on activity, as described above. Activators and inhibitors of angiogenesis can also be identified by incubating angiogenic cells with the test compound and determining increases or decreases in the expression of 1 or more angiogenesis proteins, e.g., 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 40, 50 or more angiogenesis proteins, such as angiogenesis proteins comprising the sequences set out in Table 2.


[0057] Samples or assays comprising angiogenesis proteins that are treated with a potential activator, inhibitor, or modulator are compared to control samples without the inhibitor, activator, or modulator to examine the extent of inhibition. Control samples (untreated with inhibitors) are assigned a relative protein activity value of 100%. Inhibition of a polypeptide is achieved when the activity value relative to the control is about 80%, preferably 50%, more preferably 25-0%. Activation of an angiogenesis polypeptide is achieved when the activity value relative to the control (untreated with activators) is 110%, more preferably 150%, more preferably 200-500% (i.e., two to five fold higher relative to the control), more preferably 1000-3000% higher.
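
By way of illustration only, the comparison to control described above can be sketched as follows. The function name is hypothetical, and treating “about 80%” and “110%” as hard cutoffs is an assumption made for this sketch rather than a requirement of the disclosure.

```python
def classify_modulation(treated_activity, control_activity):
    """Express treated activity as a percentage of control and classify it.

    The control sample is assigned a relative activity of 100%.
    Assumed cutoffs: <= 80% of control is inhibition, >= 110% is activation.
    Returns a (relative_percent, label) tuple.
    """
    if control_activity <= 0:
        raise ValueError("control activity must be positive")
    relative = 100.0 * treated_activity / control_activity
    if relative <= 80.0:
        return relative, "inhibition"
    if relative >= 110.0:
        return relative, "activation"
    return relative, "no call"
```

The more preferred ranges recited above (e.g., 50% or 25-0% for inhibition, 150% or 200-500% for activation) could be applied as further tiers on the returned percentage.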


[0058] “Antibody” refers to a polypeptide comprising a framework region from an immunoglobulin gene or fragments thereof that specifically binds and recognizes an antigen. The recognized immunoglobulin genes include the kappa, lambda, alpha, gamma, delta, epsilon, and mu constant region genes, as well as the myriad immunoglobulin variable region genes. Light chains are classified as either kappa or lambda. Heavy chains are classified as gamma, mu, alpha, delta, or epsilon, which in turn define the immunoglobulin classes, IgG, IgM, IgA, IgD and IgE, respectively. Typically, the antigen-binding region of an antibody will be most critical in specificity and affinity of binding.


[0059] An exemplary immunoglobulin (antibody) structural unit comprises a tetramer. Each tetramer is composed of two identical pairs of polypeptide chains, each pair having one “light” (about 25 kD) and one “heavy” chain (about 50-70 kD). The N-terminus of each chain defines a variable region of about 100 to 110 or more amino acids primarily responsible for antigen recognition. The terms variable light chain (VL) and variable heavy chain (VH) refer to these light and heavy chains respectively.


[0060] Antibodies exist, e.g., as intact immunoglobulins or as a number of well-characterized fragments produced by digestion with various peptidases. Thus, for example, pepsin digests an antibody below the disulfide linkages in the hinge region to produce F(ab)′2, a dimer of Fab which itself is a light chain joined to VH-CH1 by a disulfide bond. The F(ab)′2 may be reduced under mild conditions to break the disulfide linkage in the hinge region, thereby converting the F(ab)′2 dimer into an Fab′ monomer. The Fab′ monomer is essentially Fab with part of the hinge region (see Fundamental Immunology (Paul ed., 3d ed. 1993)). While various antibody fragments are defined in terms of the digestion of an intact antibody, one of skill will appreciate that such fragments may be synthesized de novo either chemically or by using recombinant DNA methodology. Thus, the term antibody, as used herein, also includes antibody fragments either produced by the modification of whole antibodies, or those synthesized de novo using recombinant DNA methodologies (e.g., single chain Fv) or those identified using phage display libraries (see, e.g., McCafferty et al., Nature 348:552-554 (1990)).


[0061] For preparation of antibodies, e.g., recombinant, monoclonal, or polyclonal antibodies, many techniques known in the art can be used (see, e.g., Kohler & Milstein, Nature 256:495-497 (1975); Kozbor et al., Immunology Today 4: 72 (1983); Cole et al., pp. 77-96 in Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc. (1985); Coligan, Current Protocols in Immunology (1991); Harlow & Lane, Antibodies, A Laboratory Manual (1988); and Goding, Monoclonal Antibodies: Principles and Practice (2d ed. 1986)). Techniques for the production of single chain antibodies (U.S. Pat. No. 4,946,778) can be adapted to produce antibodies to polypeptides of this invention. Also, transgenic mice, or other organisms such as other mammals, may be used to express humanized antibodies. Alternatively, phage display technology can be used to identify antibodies and heteromeric Fab fragments that specifically bind to selected antigens (see, e.g., McCafferty et al., Nature 348:552-554 (1990); Marks et al., Biotechnology 10:779-783 (1992)).


[0062] A “chimeric antibody” is an antibody molecule in which (a) the constant region, or a portion thereof, is altered, replaced or exchanged so that the antigen binding site (variable region) is linked to a constant region of a different or altered class, effector function and/or species or an entirely different molecule which confers new properties to the chimeric antibody, e.g., an enzyme, toxin, hormone, growth factor, drug, etc.; or (b) the variable region, or a portion thereof, is altered, replaced or exchanged with a variable region having a different or altered antigen specificity.


[0063] The present application may be related to U.S. Ser. No. 09/437,702, filed Nov. 10, 1999; U.S. Ser. No. 09/437,528, filed Nov. 10, 1999; U.S. Ser. No. 09/434,197, filed Nov. 4, 1999; U.S. Ser. No. 60/183,926, filed Feb. 22, 2000; U.S. Ser. No. 09/440,493, filed Nov. 15, 1999; U.S. Ser. No. 09/520,478, filed Mar. 8, 2000; U.S. Ser. No. 09/440,369, filed Nov. 12, 1999; Attorney Docket number A68928, filed Dec. 15, 2000; Attorney Docket number A69789, filed Jan. 22, 2001; and Attorney Docket number A69806, filed Dec. 15, 2000.


[0064] The detailed description of the invention includes discussion of the following aspects of the invention:


[0065] Expression of angiogenesis-associated sequences


[0066] Informatics


[0067] Angiogenesis-associated sequences


[0068] Detection of angiogenesis sequence for diagnostic and therapeutic applications


[0069] Modulators of angiogenesis


[0070] Methods of identifying variant angiogenesis-associated sequences


[0071] Administration of pharmaceutical and vaccine compositions


[0072] Kits for use in diagnostic and/or prognostic applications.


[0073] Expression of Angiogenesis-associated Sequences


[0074] In one aspect, the expression levels of genes are determined in different patient samples for which diagnosis information is desired, to provide expression profiles. An expression profile of a particular sample is essentially a “fingerprint” of the state of the sample; while two states may have any particular gene similarly expressed, the evaluation of a number of genes simultaneously allows the generation of a gene expression profile that is unique to the state of the cell. That is, normal tissue may be distinguished from AD tissue. By comparing expression profiles of tissue in known different angiogenesis states, information regarding which genes are important (including both up- and down-regulation of genes) in each of these states is obtained. The identification of sequences that are differentially expressed in angiogenic versus non-angiogenic tissue allows the use of this information in a number of ways. For example, a particular treatment regime may be evaluated: does a chemotherapeutic drug act to down-regulate angiogenesis, and thus tumor growth or recurrence, in a particular patient? Similarly, diagnosis and treatment outcomes may be determined or confirmed by comparing patient samples with the known expression profiles. Angiogenic tissue can also be analyzed to determine the stage of angiogenesis in the tissue. Furthermore, these gene expression profiles (or individual genes) allow screening of drug candidates with an eye to mimicking or altering a particular expression profile; for example, screening can be done for drugs that suppress the angiogenic expression profile. This may be done by making biochips comprising sets of the important angiogenesis genes, which can then be used in these screens. These methods can also be done on the protein basis; that is, protein expression levels of the angiogenic proteins can be evaluated for diagnostic purposes or to screen candidate agents. In addition, the angiogenic nucleic acid sequences can be administered for gene therapy purposes, including the administration of antisense nucleic acids, or the angiogenic proteins (including antibodies and other modulators thereof) administered as therapeutic drugs.


[0075] Thus the present invention provides nucleic acid and protein sequences that are differentially expressed in angiogenesis, herein termed “angiogenesis sequences”. As outlined below, angiogenesis sequences include those that are up-regulated (i.e. expressed at a higher level) in disorders associated with angiogenesis, as well as those that are down-regulated (i.e. expressed at a lower level). In a preferred embodiment, the angiogenesis sequences are from humans; however, as will be appreciated by those in the art, angiogenesis sequences from other organisms may be useful in animal models of disease and drug evaluation; thus, other angiogenesis sequences are provided, from vertebrates, including mammals, including rodents (rats, mice, hamsters, guinea pigs, etc.), primates, farm animals (including sheep, goats, pigs, cows, horses, etc). Angiogenesis sequences from other organisms may be obtained using the techniques outlined below.


[0076] Angiogenesis sequences can include both nucleic acid and amino acid sequences. In a preferred embodiment, the angiogenesis sequences are recombinant nucleic acids. By the term “recombinant nucleic acid” herein is meant nucleic acid, originally formed in vitro, in general, by the manipulation of nucleic acid e.g., using polymerases and endonucleases, in a form not normally found in nature. Thus an isolated nucleic acid, in a linear form, or an expression vector formed in vitro by ligating DNA molecules that are not normally joined, are both considered recombinant for the purposes of this invention. It is understood that once a recombinant nucleic acid is made and reintroduced into a host cell or organism, it will replicate non-recombinantly, i.e. using the in vivo cellular machinery of the host cell rather than in vitro manipulations; however, such nucleic acids, once produced recombinantly, although subsequently replicated non-recombinantly, are still considered recombinant for the purposes of the invention.


[0077] Similarly, a “recombinant protein” is a protein made using recombinant techniques, i.e. through the expression of a recombinant nucleic acid as depicted above. A recombinant protein is distinguished from naturally occurring protein by at least one or more characteristics. For example, the protein may be isolated or purified away from some or all of the proteins and compounds with which it is normally associated in its wild type host, and thus may be substantially pure. For example, an isolated protein is unaccompanied by at least some of the material with which it is normally associated in its natural state, preferably constituting at least about 0.5%, more preferably at least about 5% by weight of the total protein in a given sample. A substantially pure protein comprises at least about 75% by weight of the total protein, with at least about 80% being preferred, and at least about 90% being particularly preferred. The definition includes the production of an angiogenesis protein from one organism in a different organism or host cell. Alternatively, the protein may be made at a significantly higher concentration than is normally seen, through the use of an inducible promoter or high expression promoter, such that the protein is made at increased concentration levels. Alternatively, the protein may be in a form not normally found in nature, as in the addition of an epitope tag or amino acid substitutions, insertions and deletions, as discussed below.


[0078] In a preferred embodiment, the angiogenesis sequences are nucleic acids. As will be appreciated by those in the art and is more fully outlined below, angiogenesis sequences are useful in a variety of applications, including diagnostic applications, which will detect naturally occurring nucleic acids, as well as screening applications; for example, biochips comprising nucleic acid probes to the angiogenesis sequences can be generated. In the broadest sense, then, by “nucleic acid” or “oligonucleotide” or grammatical equivalents herein is meant at least two nucleotides covalently linked together. A nucleic acid of the present invention will generally contain phosphodiester bonds, although in some cases, nucleic acid analogs are included that may have alternate backbones, comprising, for example, phosphoramidate, phosphorothioate, phosphorodithioate, or O-methylphosphoroamidite linkages (see Eckstein, Oligonucleotides and Analogues: A Practical Approach, Oxford University Press); and peptide nucleic acid backbones and linkages. Other analog nucleic acids include those with positive backbones; non-ionic backbones, and non-ribose backbones, including those described in U.S. Pat. Nos. 5,235,033 and 5,034,506, and Chapters 6 and 7, ACS Symposium Series 580, “Carbohydrate Modifications in Antisense Research”, Ed. Y. S. Sanghvi and P. Dan Cook. Nucleic acids containing one or more carbocyclic sugars are also included within one definition of nucleic acids. Modifications of the ribose-phosphate backbone may be done for a variety of reasons, for example to increase the stability and half-life of such molecules in physiological environments or as probes on a biochip.


[0079] As will be appreciated by those in the art, nucleic acid analogs may find use in the present invention. In addition, mixtures of naturally occurring nucleic acids and analogs can be made; alternatively, mixtures of different nucleic acid analogs may be made.


[0080] Particularly preferred are peptide nucleic acids (PNA), which include peptide nucleic acid analogs. These backbones are substantially non-ionic under neutral conditions, in contrast to the highly charged phosphodiester backbone of naturally occurring nucleic acids. This results in two advantages. First, the PNA backbone exhibits improved hybridization kinetics. PNAs have larger changes in the melting temperature (Tm) for mismatched versus perfectly matched basepairs. DNA and RNA typically exhibit a 2-4° C. drop in Tm for an internal mismatch. With the non-ionic PNA backbone, the drop is closer to 7-9° C. Second, due to their non-ionic nature, hybridization of the bases attached to these backbones is relatively insensitive to salt concentration. In addition, PNAs are not degraded by cellular enzymes, and thus can be more stable.


[0081] The nucleic acids may be single stranded or double stranded, as specified, or contain portions of both double stranded and single stranded sequence. As will be appreciated by those in the art, the depiction of a single strand also defines the sequence of the complementary strand; thus the sequences described herein also provide the complement of the sequence. The nucleic acid may be DNA, both genomic and cDNA, RNA or a hybrid, where the nucleic acid may contain combinations of deoxyribo- and ribo-nucleotides, and combinations of bases, including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine, hypoxanthine, isocytosine, isoguanine, etc. As used herein, the term “nucleoside” includes nucleotides and nucleoside and nucleotide analogs, and modified nucleosides such as amino modified nucleosides. In addition, “nucleoside” includes non-naturally occurring analog structures. Thus for example the individual units of a peptide nucleic acid, each containing a base, are referred to herein as a nucleoside.


[0082] An angiogenesis sequence can be initially identified by substantial nucleic acid and/or amino acid sequence homology to the angiogenesis sequences outlined herein. Such homology can be based upon the overall nucleic acid or amino acid sequence, and is generally determined as outlined below, using either homology programs or hybridization conditions.


[0083] For identifying angiogenesis-associated sequences, the angiogenesis screen typically includes comparing genes identified in a modification of an in vitro model of angiogenesis as described in Hiraoka, Cell 95:365 (1998) with genes identified in controls. Samples of normal tissue and tissue undergoing angiogenesis are applied to biochips comprising nucleic acid probes. The samples are first microdissected, if applicable, and treated as is known in the art for the preparation of mRNA. Suitable biochips are commercially available, for example from Affymetrix. Gene expression profiles as described herein are generated and the data analyzed.


[0084] In a preferred embodiment, the genes showing changes in expression as between normal and disease states are compared to genes expressed in other normal tissues, including, but not limited to lung, heart, brain, liver, breast, kidney, muscle, prostate, small intestine, large intestine, spleen, bone and placenta. In a preferred embodiment, those genes identified during the angiogenesis screen that are expressed in any significant amount in other tissues are removed from the profile, although in some embodiments, this is not necessary. That is, when screening for drugs, it is usually preferable that the target be disease specific, to minimize possible side effects.
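
By way of illustration only, the removal of candidates that are expressed in other normal tissues can be sketched as a simple filter. The function name, the data layout, and the numerical threshold standing in for “any significant amount” are all hypothetical; the disclosure does not fix a particular cutoff.

```python
def tissue_specific_candidates(candidates, normal_tissue_expression, threshold):
    """Keep candidates whose expression is below `threshold` in every normal tissue.

    candidates: list of gene identifiers from the angiogenesis screen.
    normal_tissue_expression: dict mapping gene -> {tissue name: expression level}.
    threshold: assumed cutoff for "significant" expression (hypothetical).
    Genes with no normal-tissue data are retained, since nothing excludes them.
    """
    kept = []
    for gene in candidates:
        levels = normal_tissue_expression.get(gene, {})
        if all(level < threshold for level in levels.values()):
            kept.append(gene)
    return kept
```

Retaining genes that lack normal-tissue data is a design choice of this sketch; a stricter screen could instead drop them until such data are available.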


[0085] In a preferred embodiment, angiogenesis sequences are those that are up-regulated in angiogenesis disorders; that is, the expression of these genes is higher in the disease tissue as compared to normal tissue. “Up-regulation” as used herein means at least about a two-fold change, preferably at least about a three-fold change, with at least about five-fold or higher being preferred. All accession numbers herein are for the GenBank sequence database and the sequences of the accession numbers are hereby expressly incorporated by reference. GenBank is known in the art, see, e.g., Benson, DA, et al., Nucleic Acids Research 26:1-7 (1998) and http://www.ncbi.nlm.nih.gov/. Sequences are also available in other databases, e.g., European Molecular Biology Laboratory (EMBL) and DNA Database of Japan (DDBJ). In addition, most preferred genes were found to be expressed in a limited amount or not at all in heart, brain, lung, liver, breast, kidney, prostate, small intestine and spleen.


[0086] In another preferred embodiment, angiogenesis sequences are those that are down-regulated in the angiogenesis disorder; that is, the expression of these genes is lower in angiogenic tissue as compared to normal tissue. “Down-regulation” as used herein means at least about a two-fold change, preferably at least about a three fold change, with at least about five-fold or higher being preferred.
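
By way of illustration only, the two-fold criterion used above for up-regulation and down-regulation can be sketched as follows. The function name is hypothetical, and interpreting a two-fold down-regulation as a disease-to-normal ratio of 0.5 or less is an assumption of this sketch.

```python
def classify_regulation(disease_level, normal_level, min_fold=2.0):
    """Classify a gene by the ratio of disease-tissue to normal-tissue expression.

    min_fold defaults to 2.0 (the minimum change recited in the text);
    3.0 or 5.0 may be passed to apply the more preferred thresholds.
    """
    if disease_level <= 0 or normal_level <= 0:
        raise ValueError("expression levels must be positive")
    fold = disease_level / normal_level
    if fold >= min_fold:
        return "up-regulated"
    if fold <= 1.0 / min_fold:
        return "down-regulated"
    return "unchanged"
```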


[0087] Angiogenesis sequences according to the invention may be classified into discrete clusters of sequences based on common expression profiles of the sequences. Expression levels of angiogenesis sequences may increase or decrease as a function of time in a manner that correlates with the induction of angiogenesis. Alternatively, expression levels of angiogenesis sequences may both increase and decrease as a function of time. For example, expression levels of some angiogenesis sequences are temporarily induced or diminished during the switch to the angiogenesis phenotype, followed by a return to baseline expression levels. Table 1 provides genes, the mRNA expression of which varies as a function of time in angiogenesis tissue when compared to normal tissue.


[0088] Table 2 provides protein sequences corresponding to the coding regions of the sequences that undergo changes in expression as a function of time in tissue undergoing angiogenesis.


[0089] In a particularly preferred embodiment, angiogenesis sequences are those that are induced for a period of time, typically by positive angiogenic factors, followed by a return to the baseline levels. Sequences that are temporarily induced provide a means to target angiogenesis tissue, for example neovascularized tumors, at a particular stage of angiogenesis, while avoiding rapidly growing tissue that requires perpetual vascularization. Such positive angiogenic factors include αFGF, βFGF, VEGF, angiogenin and the like.


[0090] Induced angiogenesis sequences also are further categorized with respect to the timing of induction. For example, some angiogenesis genes may be induced at an early time period, such as within 10 minutes of the induction of angiogenesis. Others may be induced later, such as between 5 and 60 minutes, while yet others may be induced for a time period of about two hours or more followed by a return to baseline expression levels.


[0091] In another preferred embodiment are angiogenesis sequences that are inhibited or reduced as a function of time followed by a return to “normal” expression levels. Inhibitors of angiogenesis are examples of molecules that have this expression profile. These sequences also can be further divided into groups depending on the timing of diminished expression. For example, some molecules may display reduced expression within 10 minutes of the induction of angiogenesis. Others may be diminished later, such as between 5 and 60 minutes, while others may be diminished for a time period of about two hours or more followed by a return to baseline. Examples of such negative angiogenic factors include thrombospondin and endostatin to name a few.


[0092] In yet another preferred embodiment are angiogenesis sequences that are induced for prolonged periods. These sequences are typically associated with induction of angiogenesis and may participate in induction and/or maintenance of the angiogenesis phenotype.


[0093] In another preferred embodiment are angiogenesis sequences, the expression of which is reduced or diminished for prolonged periods in angiogenic tissue. These sequences are typically angiogenesis inhibitors and their diminution is correlated with an increase in angiogenesis.
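
By way of illustration only, the temporal categories described in the preceding paragraphs (transient induction, prolonged induction, transient reduction, prolonged reduction) can be sketched as a classifier over a time course. The function name, the two-fold threshold relative to baseline, and the rule that a change is “prolonged” when it persists at the last sampled time point are assumptions made for this sketch.

```python
def classify_time_course(levels, baseline, min_fold=2.0):
    """Classify an expression time course measured after induction of angiogenesis.

    levels: expression values ordered by time (e.g., 10 min, 60 min, 2 h, ...).
    baseline: expression level before induction.
    """
    if baseline <= 0 or any(value <= 0 for value in levels):
        raise ValueError("expression levels must be positive")
    states = []
    for value in levels:
        fold = value / baseline
        if fold >= min_fold:
            states.append("up")
        elif fold <= 1.0 / min_fold:
            states.append("down")
        else:
            states.append("base")
    has_up = "up" in states
    has_down = "down" in states
    if has_up and has_down:
        return "mixed"  # both increases and decreases over time
    if not has_up and not has_down:
        return "unchanged"
    direction = "induced" if has_up else "reduced"
    # Assumed rule: a return to baseline at the final time point means transient.
    if states[-1] == "base":
        return "transiently " + direction
    return direction + " for a prolonged period"
```

The sampling times themselves are not used in this sketch; a finer classification by time of onset (for example, within 10 minutes versus later) would require passing the time points alongside the levels.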


[0094] Informatics


[0095] The ability to identify genes that undergo changes in expression with time during angiogenesis can additionally provide high-resolution, high-sensitivity datasets which can be used in the areas of diagnostics, therapeutics, drug development, biosensor development, and other related areas. For example, the expression profiles can be used in diagnostic or prognostic evaluation of patients with angiogenesis-associated disease. Or as another example, subcellular toxicological information can be generated to better direct drug structure and activity correlation (see, Anderson, L., “Pharmaceutical Proteomics: Targets, Mechanism, and Function,” paper presented at the IBC Proteomics conference, Coronado, Calif. (Jun. 11-12, 1998)). Subcellular toxicological information can also be utilized in a biological sensor device to predict the likely toxicological effect of chemical exposures and likely tolerable exposure thresholds (see, U.S. Pat. No. 5,811,231). Similar advantages accrue from datasets relevant to other biomolecules and bioactive agents (e.g. nucleic acids, saccharides, lipids, drugs, and the like).


[0096] Thus, in another embodiment, the present invention provides a database that includes at least one set of assay data. The data contained in the database is acquired, e.g., using array analysis either singly or in a library format. The database can be in substantially any form in which data can be maintained and transmitted, but is preferably an electronic database. The electronic database of the invention can be maintained on any electronic device allowing for the storage of and access to the database, such as a personal computer, but is preferably distributed on a wide area network, such as the World Wide Web.


[0097] The focus of the present section on databases that include peptide sequence data is for clarity of illustration only. It will be apparent to those of skill in the art that similar databases can be assembled for any assay data acquired using an assay of the invention.


[0098] The compositions and methods for identifying and/or quantitating the relative and/or absolute abundance of a variety of molecular and macromolecular species from a biological sample undergoing angiogenesis, i.e., the identification of angiogenesis-associated sequences described herein, provide an abundance of information, which can be correlated with pathological conditions, predisposition to disease, drug testing, therapeutic monitoring, gene-disease causal linkages, identification of correlates of immunity and physiological status, among others. Although the data generated from the assays of the invention is suited for manual review and analysis, in a preferred embodiment, prior data processing using high-speed computers is utilized.


[0099] An array of methods for indexing and retrieving biomolecular information is known in the art. For example, U.S. Pat. Nos. 6,023,659 and 5,966,712 disclose a relational database system for storing biomolecular sequence information in a manner that allows sequences to be catalogued and searched according to one or more protein function hierarchies. U.S. Pat. No. 5,953,727 discloses a relational database having sequence records containing information in a format that allows a collection of partial-length DNA sequences to be catalogued and searched according to association with one or more sequencing projects for obtaining full-length sequences from the collection of partial-length sequences. U.S. Pat. No. 5,706,498 discloses a gene database retrieval system for making a retrieval of a gene sequence similar to a sequence data item in a gene database based on the degree of similarity between a key sequence and a target sequence. U.S. Pat. No. 5,538,897 discloses a method using mass spectroscopy fragmentation patterns of peptides to identify amino acid sequences in computer databases by comparison of predicted mass spectra with experimentally-derived mass spectra using a closeness-of-fit measure. U.S. Pat. No. 5,926,818 discloses a multi-dimensional database comprising a functionality for multi-dimensional data analysis described as on-line analytical processing (OLAP), which entails the consolidation of projected and actual data according to more than one consolidation path or dimension. U.S. Pat. No. 5,295,261 reports a hybrid database structure in which the fields of each database record are divided into two classes, navigational and informational data, with navigational fields stored in a hierarchical topological map which can be viewed as a tree structure or as the merger of two or more such tree structures.


[0100] The present invention provides a computer database comprising a computer and software for storing in computer-retrievable form assay data records cross-tabulated, e.g., with data specifying the source of the target-containing sample from which each sequence specificity record was obtained.


[0101] In an exemplary embodiment, at least one of the sources of target-containing sample is from a control tissue sample known to be free of pathological disorders. In a variation, at least one of the sources is a known pathological tissue specimen, e.g. a neoplastic lesion or another tissue specimen to be analyzed for angiogenesis. In another variation, the assay records cross-tabulate one or more of the following parameters for each target species in a sample: (1) a unique identification code, which can include, e.g., a target molecular structure and/or characteristic separation coordinate (e.g. electrophoretic coordinates); (2) sample source; and (3) absolute and/or relative quantity of the target species present in the sample.
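
By way of non-limiting illustration, the cross-tabulated assay record described above can be sketched in Python as follows. The class name, field names, and example values are hypothetical and are not part of the disclosure; the sketch only shows one way the three enumerated parameters might be stored and cross-tabulated by target and sample source.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AssayRecord:
    target_id: str      # (1) unique identification code for the target species
    sample_source: str  # (2) sample source, e.g. control or pathological tissue
    quantity: float     # (3) absolute or relative quantity in that sample


def cross_tabulate(records):
    """Return {target_id: {sample_source: quantity}} for a list of records."""
    table = defaultdict(dict)
    for rec in records:
        table[rec.target_id][rec.sample_source] = rec.quantity
    return {target: dict(by_source) for target, by_source in table.items()}


# Hypothetical example values, for illustration only.
records = [
    AssayRecord("target-001", "control", 1.0),
    AssayRecord("target-001", "lesion", 4.5),
    AssayRecord("target-002", "control", 2.0),
]
table = cross_tabulate(records)
```

In this sketch, looking up a target identifier yields its quantity in each sample source, so that a control sample and a pathological sample can be compared directly.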


[0102] The invention also provides for the storage and retrieval of a collection of target data in a computer data storage apparatus, which can include magnetic disks, optical disks, magneto-optical disks, DRAM, SRAM, SGRAM, SDRAM, RDRAM, DDR RAM, magnetic bubble memory devices, and other data storage devices, including CPU registers and on-CPU data storage arrays. Typically, the target data records are stored as a bit pattern in an array of magnetic domains on a magnetizable medium or as an array of charge states or transistor gate states, such as an array of cells in a DRAM device (e.g., each cell comprised of a transistor and a charge storage area, which may be on the transistor). In one embodiment, the invention provides such storage devices, and computer systems built therewith, comprising a bit pattern encoding a protein expression fingerprint record comprising unique identifiers for at least 10 target data records cross-tabulated with target source.


[0103] When the target is a peptide or nucleic acid, the invention preferably provides a method for identifying related peptide or nucleic acid sequences, comprising performing a computerized comparison between a peptide or nucleic acid sequence assay record stored in or retrieved from a computer storage device or database and at least one other sequence. The comparison can include a sequence analysis or comparison algorithm or computer program embodiment thereof (e.g., FASTA, TFASTA, GAP, BESTFIT) and/or the comparison may be of the relative amount of a peptide or nucleic acid sequence in a pool of sequences determined from a polypeptide or nucleic acid sample of a specimen.


[0104] The invention also preferably provides a magnetic disk, such as an IBM-compatible (DOS, Windows, Windows95/98/2000, Windows NT, OS/2) or other format (e.g., Linux, SunOS, Solaris, AIX, SCO Unix, VMS, MV, Macintosh, etc.) floppy diskette or hard (fixed, Winchester) disk drive, comprising a bit pattern encoding data from an assay of the invention in a file format suitable for retrieval and processing in a computerized sequence analysis, comparison, or relative quantitation method.


[0105] The invention also provides a network, comprising a plurality of computing devices linked via a data link, such as an Ethernet cable (coax or 10BaseT), telephone line, ISDN line, wireless network, optical fiber, or other suitable signal transmission medium, whereby at least one network device (e.g., computer, disk array, etc.) comprises a pattern of magnetic domains (e.g., magnetic disk) and/or charge domains (e.g., an array of DRAM cells) composing a bit pattern encoding data acquired from an assay of the invention.


[0106] The invention also provides a method for transmitting assay data that includes generating an electronic signal on an electronic communications device, such as a modem, ISDN terminal adapter, DSL, cable modem, ATM switch, or the like, wherein the signal includes (in native or encrypted format) a bit pattern encoding data from an assay or a database comprising a plurality of assay results obtained by the method of the invention.


[0107] In a preferred embodiment, the invention provides a computer system for comparing a query target to a database containing an array of data structures, such as an assay result obtained by the method of the invention, and ranking database targets based on the degree of identity and gap weight to the target data. A central processor is preferably initialized to load and execute the computer program for alignment and/or comparison of the assay results. Data for a query target is entered into the central processor via an I/O device. Execution of the computer program results in the central processor retrieving the assay data from the data file, which comprises a binary description of an assay result.


[0108] The target data or record and the computer program can be transferred to secondary memory, which is typically random access memory (e.g. DRAM, SRAM, SGRAM, or SDRAM). Targets are ranked according to the degree of correspondence between a selected assay characteristic (e.g., binding to a selected affinity moiety) and the same characteristic of the query target and results are output via an I/O device. For example, a central processor can be a conventional computer (e.g., Intel Pentium, PowerPC, Alpha, PA-8000, SPARC, MIPS 4400, MIPS 10000, VAX, etc.); a program can be a commercial or public domain molecular biology software package (e.g., UWGCG Sequence Analysis Software, Darwin); a data file can be an optical or magnetic disk, a data server, a memory device (e.g., DRAM, SRAM, SGRAM, SDRAM, EPROM, bubble memory, flash memory, etc.); an I/O device can be a terminal comprising a video display and a keyboard, a modem, an ISDN terminal adapter, an Ethernet port, a punched card reader, a magnetic strip reader, or other suitable I/O device.


[0109] The invention also preferably provides the use of a computer system, such as that described above, which comprises: (1) a computer; (2) a stored bit pattern encoding a collection of peptide sequence specificity records obtained by the methods of the invention, which may be stored in the computer; (3) a comparison target, such as a query target; and (4) a program for alignment and comparison, typically with rank-ordering of comparison results on the basis of computed similarity values.
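
By way of non-limiting illustration, the rank-ordering of database targets against a query on the basis of computed similarity values can be sketched as follows. The similarity measure used here, `difflib.SequenceMatcher` from the Python standard library, is a stand-in chosen for brevity and is an assumption of this sketch; it is not FASTA, GAP, BESTFIT, or any other alignment program named herein, and the function names and example sequences are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(query, target):
    """Return a similarity value in [0, 1] (stand-in for an alignment score)."""
    return SequenceMatcher(None, query, target).ratio()


def rank_targets(query, database):
    """Rank database entries by similarity to the query, best match first."""
    scored = [(name, similarity(query, seq)) for name, seq in database.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical stored records, for illustration only.
database = {
    "record_a": "ACGTACGT",
    "record_b": "TTTTTTTT",
    "record_c": "ACGTACGA",
}
ranked = rank_targets("ACGTACGT", database)
```

A practical system would substitute a true alignment score, with gap weighting, for the stand-in similarity function while retaining the same rank-ordering step.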


[0110] Angiogenesis-associated Sequences


[0111] Angiogenesis proteins of the present invention may be classified as secreted proteins, transmembrane proteins or intracellular proteins. In one embodiment, the angiogenesis protein is an intracellular protein. Intracellular proteins may be found in the cytoplasm and/or in the nucleus. Intracellular proteins are involved in all aspects of cellular function and replication (including, e.g., signaling pathways); aberrant expression of such proteins often results in unregulated or disregulated cellular processes (see, e.g., Molecular Biology of the Cell, 3rd Edition, Alberts, Ed., Garland Pub., 1994). For example, many intracellular proteins have enzymatic activity such as protein kinase activity, protein phosphatase activity, protease activity, nucleotide cyclase activity, polymerase activity and the like. Intracellular proteins also serve as docking proteins that are involved in organizing complexes of proteins, or targeting proteins to various subcellular localizations, and are involved in maintaining the structural integrity of organelles.


[0112] An increasingly appreciated concept in characterizing proteins is the presence in the proteins of one or more motifs for which defined functions have been attributed. In addition to the highly conserved sequences found in the enzymatic domain of proteins, highly conserved sequences have been identified in proteins that are involved in protein-protein interaction. For example, Src-homology-2 (SH2) domains bind tyrosine-phosphorylated targets in a sequence dependent manner. PTB domains, which are distinct from SH2 domains, also bind tyrosine phosphorylated targets. SH3 domains bind to proline-rich targets. In addition, PH domains, tetratricopeptide repeats and WD domains to name only a few, have been shown to mediate protein-protein interactions. Some of these may also be involved in binding to phospholipids or other second messengers. As will be appreciated by one of ordinary skill in the art, these motifs can be identified on the basis of primary sequence; thus, an analysis of the sequence of proteins may provide insight into both the enzymatic potential of the molecule and/or molecules with which the protein may associate.
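
By way of non-limiting illustration, identification of a motif on the basis of primary sequence can be sketched as a simple pattern search. The example below scans for the minimal proline-rich core pattern P-x-x-P commonly associated with SH3 domain ligands; the choice of this pattern, the function name, and the example sequences are assumptions of the sketch, and a practical analysis would typically rely on curated motif or domain profiles rather than a single regular expression.

```python
import re

# Lookahead so that overlapping occurrences are all reported.
PXXP = re.compile(r"(?=(P..P))")


def find_pxxp(protein):
    """Return (start_index, matched_text) for every P-x-x-P occurrence."""
    return [(m.start(), m.group(1)) for m in PXXP.finditer(protein.upper())]


# Hypothetical sequences, for illustration only.
hits_single = find_pxxp("AAPPLPAA")
hits_double = find_pxxp("PAAPAAP")
```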


[0113] In another embodiment, the angiogenesis sequences are transmembrane proteins. Transmembrane proteins are molecules that span a phospholipid bilayer of a cell. They may have an intracellular domain, an extracellular domain, or both. The intracellular domains of such proteins may have a number of functions including those already described for intracellular proteins. For example, the intracellular domain may have enzymatic activity and/or may serve as a binding site for additional proteins. Frequently the intracellular domain of transmembrane proteins serves both roles. For example, certain receptor tyrosine kinases have both protein kinase activity and SH2 domains. In addition, autophosphorylation of tyrosines on the receptor molecule itself creates binding sites for additional SH2 domain containing proteins.


[0114] Transmembrane proteins may contain from one to many transmembrane domains. For example, receptor tyrosine kinases, certain cytokine receptors, receptor guanylyl cyclases and receptor serine/threonine protein kinases contain a single transmembrane domain. However, various other proteins including channels and adenylyl cyclases contain numerous transmembrane domains. Many important cell surface receptors such as G protein coupled receptors (GPCRs) are classified as “seven transmembrane domain” proteins, as they contain 7 membrane spanning regions. Characteristics of transmembrane domains include approximately 20 consecutive hydrophobic amino acids that may be followed by charged amino acids. Therefore, upon analysis of the amino acid sequence of a particular protein, the localization and number of transmembrane domains within the protein may be predicted (see, e.g., PSORT web site http://psort.nibb.ac.jp/).
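
By way of non-limiting illustration, the prediction of candidate transmembrane domains from runs of approximately 20 consecutive hydrophobic amino acids can be sketched as follows. The set of residues treated as hydrophobic, the default minimum run length, and the function name are assumptions of this sketch; it is a deliberately crude heuristic and does not reproduce the method used by PSORT or any other prediction program.

```python
import re

# Assumption: residues treated as hydrophobic for this simple heuristic.
HYDROPHOBIC = "AVLIMFWC"


def hydrophobic_runs(protein, min_len=20):
    """Return (start, end) index pairs of hydrophobic runs of at least min_len."""
    pattern = re.compile("[%s]{%d,}" % (HYDROPHOBIC, min_len))
    return [(m.start(), m.end()) for m in pattern.finditer(protein.upper())]


# Hypothetical sequence: a 22-residue hydrophobic stretch followed by
# charged residues, for illustration only.
example = "MKR" + "L" * 22 + "KRDE"
candidates = hydrophobic_runs(example)
```

The number of runs returned gives a rough estimate of the number of membrane-spanning regions, and the index pairs give their approximate localization within the sequence.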


[0115] The extracellular domains of transmembrane proteins are diverse; however, conserved motifs are found repeatedly among various extracellular domains. Conserved structure and/or functions have been ascribed to different extracellular motifs. Many extracellular domains are involved in binding to other molecules. In one aspect, extracellular domains are found on receptors. Factors that bind the receptor domain include circulating ligands, which may be peptides, proteins, or small molecules such as adenosine and the like. For example, growth factors such as EGF, FGF and PDGF are circulating growth factors that bind to their cognate receptors to initiate a variety of cellular responses. Other factors include cytokines, mitogenic factors, neurotrophic factors and the like. Extracellular domains also bind to cell-associated molecules. In this respect, they mediate cell-cell interactions. Cell-associated ligands can be tethered to the cell for example via a glycosylphosphatidylinositol (GPI) anchor, or may themselves be transmembrane proteins. Extracellular domains also associate with the extracellular matrix and contribute to the maintenance of the cell structure.


[0116] Angiogenesis proteins that are transmembrane are particularly preferred in the present invention as they are readily accessible targets for immunotherapeutics, as are described herein. In addition, as outlined below, transmembrane proteins can also be useful in imaging modalities. Antibodies may be used to label such readily accessible proteins in situ. Alternatively, antibodies can also label intracellular proteins, in which case samples are typically permeabilized to provide access to intracellular proteins.


[0117] It will also be appreciated by those in the art that a transmembrane protein can be made soluble by removing transmembrane sequences, for example through recombinant methods. Furthermore, transmembrane proteins that have been made soluble can be made to be secreted through recombinant means by adding an appropriate signal sequence.


[0118] In another embodiment, the angiogenesis proteins are secreted proteins; the secretion of which can be either constitutive or regulated. These proteins have a signal peptide or signal sequence that targets the molecule to the secretory pathway. Secreted proteins are involved in numerous physiological events; by virtue of their circulating nature, they serve to transmit signals to various other cell types. The secreted protein may function in an autocrine manner (acting on the cell that secreted the factor), a paracrine manner (acting on cells in close proximity to the cell that secreted the factor) or an endocrine manner (acting on cells at a distance). Thus secreted molecules find use in modulating or altering numerous aspects of physiology. Angiogenesis proteins that are secreted proteins are particularly preferred in the present invention as they serve as good targets for diagnostic markers, e.g., for blood or serum tests.


[0119] An angiogenesis sequence is initially identified by substantial nucleic acid and/or amino acid sequence homology or linkage to the angiogenesis sequences outlined herein. Such homology can be based upon the overall nucleic acid or amino acid sequence, and is generally determined as outlined below, using either homology programs or hybridization conditions. Typically, linked sequences on a mRNA are found on the same molecule.


[0120] As detailed in the definitions, percent identity can be determined using an algorithm such as BLAST. A preferred method utilizes the BLASTN module of WU-BLAST-2 set to the default parameters, with overlap span and overlap fraction set to 1 and 0.125, respectively. The alignment may include the introduction of gaps in the sequences to be aligned. In addition, for sequences which contain either more or fewer nucleotides than those of the nucleic acids of the figure it is understood that the percentage of homology will be determined based on the number of homologous nucleosides in relation to the total number of nucleosides. Thus, for example, homology of sequences shorter than those of the sequences identified herein and as discussed below, will be determined using the number of nucleosides in the shorter sequence.
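
By way of non-limiting illustration, the calculation of percent identity relative to the shorter of two sequences can be sketched as follows. The sketch assumes the two sequences have already been aligned (for example, by a program such as BLAST) and are supplied as equal-length strings in which gaps are marked with a hyphen; the function name and gap symbol are assumptions of the sketch, and it does not reproduce the scoring of WU-BLAST-2.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences, relative to the shorter one."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Count aligned positions where both sequences carry the same residue.
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    # Denominator: number of residues in the shorter (ungapped) sequence.
    shorter = min(
        len(aligned_a.replace(gap, "")), len(aligned_b.replace(gap, ""))
    )
    if shorter == 0:
        raise ValueError("sequences must contain at least one residue")
    return 100.0 * matches / shorter


# A 4-base fragment that matches the start of an 8-base sequence exactly.
fragment_identity = percent_identity("ACGTACGT", "ACGT----")
# Two 8-base sequences differing at one position.
full_identity = percent_identity("ACGTACGT", "ACGAACGT")
```

Because the denominator is the length of the shorter sequence, a short fragment that matches perfectly scores as fully identical even though it covers only part of the longer sequence.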


[0121] In one embodiment, the nucleic acid homology is determined through hybridization studies. Thus, e.g., a nucleic acid which hybridizes under high stringency to a nucleic acid of Table 1, or its complement, or is also found on naturally occurring mRNAs, is considered an angiogenesis sequence. In another embodiment, less stringent hybridization conditions are used; for example, moderate or low stringency conditions may be used, as are known in the art; see Ausubel, supra, and Tijssen, supra.


[0122] In addition, the angiogenesis nucleic acid sequences of the invention, e.g., the sequence in Table 1, are fragments of larger genes, i.e., they are nucleic acid segments. “Genes” in this context includes coding regions, non-coding regions, and mixtures of coding and non-coding regions. Accordingly, as will be appreciated by those in the art, using the sequences provided herein, extended sequences, in either direction, of the angiogenesis genes can be obtained, using techniques well known in the art for cloning either longer sequences or the full length sequences; see Ausubel, et al., supra. Much can be done by informatics and many sequences can be clustered to include multiple sequences, e.g., systems such as UniGene (see, http://www.ncbi.nlm.nih.gov/UniGene/).


[0123] Once the angiogenesis nucleic acid is identified, it can be cloned and, if necessary, its constituent parts recombined to form the entire angiogenesis nucleic acid coding regions or the entire mRNA sequence. Once isolated from its natural source, e.g., contained within a plasmid or other vector or excised therefrom as a linear nucleic acid segment, the recombinant angiogenesis nucleic acid can be further used as a probe to identify and isolate other angiogenesis nucleic acids, for example extended coding regions. It can also be used as a “precursor” nucleic acid to make modified or variant angiogenesis nucleic acids and proteins.


[0124] The angiogenesis nucleic acids of the present invention are used in several ways. In a first embodiment, nucleic acid probes to the angiogenesis nucleic acids are made and attached to biochips to be used in screening and diagnostic methods, as outlined below, or for administration, for example for gene therapy, vaccine, and/or antisense applications. Alternatively, the angiogenesis nucleic acids that include coding regions of angiogenesis proteins can be put into expression vectors for the expression of angiogenesis proteins, again for screening purposes or for administration to a patient.


[0125] In a preferred embodiment, nucleic acid probes to angiogenesis nucleic acids (both the nucleic acid sequences outlined in the figures and/or the complements thereof) are made. The nucleic acid probes attached to the biochip are designed to be substantially complementary to the angiogenesis nucleic acids, i.e. the target sequence (either the target sequence of the sample or to other probe sequences, for example in sandwich assays), such that hybridization of the target sequence and the probes of the present invention occurs. As outlined below, this complementarity need not be perfect; there may be any number of base pair mismatches which will interfere with hybridization between the target sequence and the single stranded nucleic acids of the present invention. However, if the number of mutations is so great that no hybridization can occur under even the least stringent of hybridization conditions, the sequence is not a complementary target sequence. Thus, by “substantially complementary” herein is meant that the probes are sufficiently complementary to the target sequences to hybridize under normal reaction conditions, particularly high stringency conditions, as outlined herein.
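
By way of non-limiting illustration, the relationship between a probe and its target sequence can be sketched by computing the reverse complement of the target and counting mismatched positions. The function names and example sequences are hypothetical. The sketch considers only unambiguous DNA bases (A, C, G, T) and equal-length sequences, and it does not model hybridization stringency, which depends on reaction conditions rather than on mismatch count alone.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_mismatches(probe, target):
    """Count positions at which the probe fails to pair with the target."""
    if len(probe) != len(target):
        raise ValueError("probe and target must have equal length")
    perfect_probe = reverse_complement(target)
    return sum(1 for p, q in zip(probe.upper(), perfect_probe) if p != q)


# Hypothetical sequences, for illustration only.
perfect = count_mismatches("CGTT", "AACG")   # fully complementary probe
one_off = count_mismatches("CGTA", "AACG")   # probe with a single mismatch
```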


[0126] A nucleic acid probe is generally single stranded but can be partially single and partially double stranded. The strandedness of the probe is dictated by the structure, composition, and properties of the target sequence. In general, the nucleic acid probes range from about 8 to about 100 bases long, with from about 10 to about 80 bases being preferred, and from about 30 to about 50 bases being particularly preferred. That is, generally whole genes are not used. In some embodiments, much longer nucleic acids can be used, up to hundreds of bases.


[0127] In a preferred embodiment, more than one probe per sequence is used, with either overlapping probes or probes to different sections of the target being used. That is, two, three, four or more probes, with three being preferred, are used to build in a redundancy for a particular target. The probes can be overlapping (i.e. have some sequence in common), or separate. In some cases, PCR primers may be used to amplify signal for higher sensitivity.
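
By way of non-limiting illustration, the selection of several overlapping or separate probe regions along a single target can be sketched as a simple tiling. The function name, parameters, and example sequence are hypothetical; the sketch returns the target-strand subsequences to be probed, and the probes themselves would be designed to be substantially complementary to these regions.

```python
def tile_probe_regions(target, probe_len, step):
    """Return (start, subsequence) pairs tiled along the target.

    A step smaller than probe_len yields overlapping regions; a step equal
    to or larger than probe_len yields separate regions.
    """
    if probe_len <= 0 or step <= 0:
        raise ValueError("probe_len and step must be positive")
    return [
        (start, target[start:start + probe_len])
        for start in range(0, len(target) - probe_len + 1, step)
    ]


# Hypothetical 10-base target with short regions, for illustration only;
# actual probes are generally much longer, as described above.
overlapping = tile_probe_regions("ACGTACGTAC", probe_len=4, step=2)
separate = tile_probe_regions("ACGTACGTAC", probe_len=4, step=4)
```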


[0128] As will be appreciated by those in the art, nucleic acids can be attached or immobilized to a solid support in a wide variety of ways. By “immobilized” and grammatical equivalents herein is meant the association or binding between the nucleic acid probe and the solid support is sufficient to be stable under the conditions of binding, washing, analysis, and removal as outlined below. The binding can typically be covalent or non-covalent. By “non-covalent binding” and grammatical equivalents herein is meant one or more of electrostatic, hydrophilic, and hydrophobic interactions. Included in non-covalent binding is the covalent attachment of a molecule, such as streptavidin, to the support and the non-covalent binding of the biotinylated probe to the streptavidin. By “covalent binding” and grammatical equivalents herein is meant that the two moieties, the solid support and the probe, are attached by at least one bond, including sigma bonds, pi bonds and coordination bonds. Covalent bonds can be formed directly between the probe and the solid support or can be formed by a cross linker or by inclusion of a specific reactive group on either the solid support or the probe or both molecules. Immobilization may also involve a combination of covalent and non-covalent interactions.


[0129] In general, the probes are attached to the biochip in a wide variety of ways, as will be appreciated by those in the art. As described herein, the nucleic acids can either be synthesized first, with subsequent attachment to the biochip, or can be directly synthesized on the biochip.


[0130] The biochip comprises a suitable solid substrate. By “substrate” or “solid support” or other grammatical equivalents herein is meant a material that can be modified to contain discrete individual sites appropriate for the attachment or association of the nucleic acid probes and is amenable to at least one detection method. As will be appreciated by those in the art, the number of possible substrates is very large, and includes, but is not limited to, glass and modified or functionalized glass, plastics (including acrylics, polystyrene and copolymers of styrene and other materials, polypropylene, polyethylene, polybutylene, polyurethanes, Teflon™, etc.), polysaccharides, nylon or nitrocellulose, resins, silica or silica-based materials including silicon and modified silicon, carbon, metals, inorganic glasses, plastics, etc. In general, the substrates allow optical detection and do not appreciably fluoresce. A preferred substrate is described in copending application entitled Reusable Low Fluorescent Plastic Biochip, U.S. application Ser. No. 09/270,214, filed Mar. 15, 1999, herein incorporated by reference in its entirety.


[0131] Generally the substrate is planar, although as will be appreciated by those in the art, other configurations of substrates may be used as well. For example, the probes may be placed on the inside surface of a tube, for flow-through sample analysis to minimize sample volume. Similarly, the substrate may be flexible, such as a flexible foam, including closed cell foams made of particular plastics.


[0132] In a preferred embodiment, the surface of the biochip and the probe may be derivatized with chemical functional groups for subsequent attachment of the two. Thus, for example, the biochip is derivatized with a chemical functional group including, but not limited to, amino groups, carboxy groups, oxo groups and thiol groups, with amino groups being particularly preferred. Using these functional groups, the probes can be attached using functional groups on the probes. For example, nucleic acids containing amino groups can be attached to surfaces comprising amino groups, for example using linkers as are known in the art; for example, homo- or hetero-bifunctional linkers as are well known (see 1994 Pierce Chemical Company catalog, technical section on cross-linkers, pages 155-200, incorporated herein by reference). In addition, in some cases, additional linkers, such as alkyl groups (including substituted and heteroalkyl groups) may be used.


[0133] In this embodiment, oligonucleotides are synthesized as is known in the art, and then attached to the surface of the solid support. As will be appreciated by those skilled in the art, either the 5′ or 3′ terminus may be attached to the solid support, or attachment may be via an internal nucleoside.


[0134] In another embodiment, the immobilization to the solid support may be very strong, yet non-covalent. For example, biotinylated oligonucleotides can be made, which bind to surfaces covalently coated with streptavidin, resulting in attachment.


[0135] Alternatively, the oligonucleotides may be synthesized on the surface, as is known in the art. For example, photoactivation techniques utilizing photopolymerization compounds and techniques are used. In a preferred embodiment, the nucleic acids can be synthesized in situ, using well known photolithographic techniques, such as those described in WO 95/25116; WO 95/35505; U.S. Pat. Nos. 5,700,637 and 5,445,934; and references cited within, all of which are expressly incorporated by reference; these methods of attachment form the basis of the Affymetrix GeneChip™ technology.


[0136] Often, amplification-based assays are performed to measure the expression level of angiogenesis-associated sequences. These assays are typically performed in conjunction with reverse transcription. In such assays, an angiogenesis-associated nucleic acid sequence acts as a template in an amplification reaction (e.g., Polymerase Chain Reaction, or PCR). In a quantitative amplification, the amount of amplification product will be proportional to the amount of template in the original sample. Comparison to appropriate controls provides a measure of the amount of angiogenesis-associated RNA. Methods of quantitative amplification are well known to those of skill in the art. Detailed protocols for quantitative PCR are provided, e.g., in Innis et al. (1990) PCR Protocols, A Guide to Methods and Applications, Academic Press, Inc., N.Y.
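
By way of non-limiting illustration, the proportionality between amplification product and starting template can be sketched with an idealized exponential model in which each cycle multiplies the product by (1 + efficiency). This model, the function names, and the example values are assumptions of the sketch and are not taken from the protocols cited above; real reactions depart from it as reagents are depleted in later cycles.

```python
def amplified_product(template, cycles, efficiency=1.0):
    """Idealized product amount after a number of cycles.

    efficiency = 1.0 corresponds to perfect doubling in every cycle.
    """
    return template * (1.0 + efficiency) ** cycles


def estimate_template(product, cycles, efficiency=1.0):
    """Invert the idealized model to recover the starting template amount."""
    return product / (1.0 + efficiency) ** cycles


# Hypothetical values: a sample and a control amplified for the same
# number of cycles at the same efficiency.
sample_product = amplified_product(400, cycles=10)
control_product = amplified_product(100, cycles=10)
fold_difference = sample_product / control_product
```

Under these idealized assumptions, the ratio of product between a sample and a control amplified under identical conditions equals the ratio of their starting template amounts, which is the basis for comparison to appropriate controls.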


[0137] In some embodiments, a TaqMan based assay is used to measure expression. TaqMan based assays use a fluorogenic oligonucleotide probe that contains a 5′ fluorescent dye and a 3′ quenching agent. The probe hybridizes to a PCR product, but cannot itself be extended due to a blocking agent at the 3′ end. When the PCR product is amplified in subsequent cycles, the 5′ nuclease activity of the polymerase, e.g., AmpliTaq, results in the cleavage of the TaqMan probe. This cleavage separates the 5′ fluorescent dye and the 3′ quenching agent, thereby resulting in an increase in fluorescence as a function of amplification (see, for example, literature provided by Perkin-Elmer, e.g., www2.perkin-elmer.com).


[0138] Other suitable amplification methods include, but are not limited to, ligase chain reaction (LCR) (see, Wu and Wallace (1989) Genomics 4: 560, Landegren et al. (1988) Science 241: 1077, and Barringer et al. (1990) Gene 89: 117), transcription amplification (Kwoh et al. (1989) Proc. Natl. Acad. Sci. USA 86: 1173), self-sustained sequence replication (Guatelli et al. (1990) Proc. Natl. Acad. Sci. USA 87: 1874), dot PCR, and linker adapter PCR, etc.


[0139] In a preferred embodiment, angiogenesis nucleic acids, e.g., those encoding angiogenesis proteins, are used to make a variety of expression vectors to express angiogenesis proteins which can then be used in screening assays, as described below. Expression vectors and recombinant DNA technology are well known to those of skill in the art (see, e.g., Ausubel, supra, and Gene Expression Systems, Fernandez & Hoeffler, Eds, Academic Press, 1999) and are used to express proteins. The expression vectors may be either self-replicating extrachromosomal vectors or vectors which integrate into a host genome. Generally, these expression vectors include transcriptional and translational regulatory nucleic acid operably linked to the nucleic acid encoding the angiogenesis protein. The term “control sequences” refers to DNA sequences used for the expression of an operably linked coding sequence in a particular host organism. Control sequences that are suitable for prokaryotes, for example, include a promoter, optionally an operator sequence, and a ribosome binding site. Eukaryotic cells are known to utilize promoters, polyadenylation signals, and enhancers.


[0140] Nucleic acid is “operably linked” when it is placed into a functional relationship with another nucleic acid sequence. For example, DNA for a presequence or secretory leader is operably linked to DNA for a polypeptide if it is expressed as a preprotein that participates in the secretion of the polypeptide; a promoter or enhancer is operably linked to a coding sequence if it affects the transcription of the sequence; or a ribosome binding site is operably linked to a coding sequence if it is positioned so as to facilitate translation. Generally, “operably linked” means that the DNA sequences being linked are contiguous, and, in the case of a secretory leader, contiguous and in reading phase. However, enhancers do not have to be contiguous. Linking is typically accomplished by ligation at convenient restriction sites. If such sites do not exist, synthetic oligonucleotide adaptors or linkers are used in accordance with conventional practice. Transcriptional and translational regulatory nucleic acid will generally be appropriate to the host cell used to express the angiogenesis protein; for example, transcriptional and translational regulatory nucleic acid sequences from Bacillus are preferably used to express the angiogenesis protein in Bacillus. Numerous types of appropriate expression vectors, and suitable regulatory sequences are known in the art for a variety of host cells.


[0141] In general, transcriptional and translational regulatory sequences may include, but are not limited to, promoter sequences, ribosomal binding sites, transcriptional start and stop sequences, translational start and stop sequences, and enhancer or activator sequences. In a preferred embodiment, the regulatory sequences include a promoter and transcriptional start and stop sequences.


[0142] Promoter sequences encode either constitutive or inducible promoters. The promoters may be either naturally occurring promoters or hybrid promoters. Hybrid promoters, which combine elements of more than one promoter, are also known in the art, and are useful in the present invention.


[0143] In addition, an expression vector may comprise additional elements. For example, the expression vector may have two replication systems, thus allowing it to be maintained in two organisms, for example in mammalian or insect cells for expression and in a procaryotic host for cloning and amplification. Furthermore, for integrating expression vectors, the expression vector contains at least one sequence homologous to the host cell genome, and preferably two homologous sequences which flank the expression construct. The integrating vector may be directed to a specific locus in the host cell by selecting the appropriate homologous sequence for inclusion in the vector. Constructs for integrating vectors are well known in the art (e.g., Fernandez & Hoeffler, supra).


[0144] In addition, in a preferred embodiment, the expression vector contains a selectable marker gene to allow the selection of transformed host cells. Selection genes are well known in the art and will vary with the host cell used.


[0145] The angiogenesis proteins of the present invention are produced by culturing a host cell transformed with an expression vector containing nucleic acid encoding an angiogenesis protein, under the appropriate conditions to induce or cause expression of the angiogenesis protein. Conditions appropriate for angiogenesis protein expression will vary with the choice of the expression vector and the host cell, and will be easily ascertained by one skilled in the art through routine experimentation or optimization. For example, the use of constitutive promoters in the expression vector will require optimizing the growth and proliferation of the host cell, while the use of an inducible promoter requires the appropriate growth conditions for induction. In addition, in some embodiments, the timing of the harvest is important. For example, the baculoviral systems used in insect cell expression are lytic viruses, and thus harvest time selection can be crucial for product yield.


[0146] Appropriate host cells include yeast, bacteria, archaebacteria, fungi, and insect and animal cells, including mammalian cells. Of particular interest are Saccharomyces cerevisiae and other yeasts, E. coli, Bacillus subtilis, Sf9 cells, C129 cells, 293 cells, Neurospora, BHK, CHO, COS, HeLa cells, HUVEC (human umbilical vein endothelial cells), THP1 cells (a macrophage cell line) and various other human cells and cell lines.


[0147] In a preferred embodiment, the angiogenesis proteins are expressed in mammalian cells. Mammalian expression systems are also known in the art, and include retroviral and adenoviral systems. Of particular use as mammalian promoters are the promoters from mammalian viral genes, since the viral genes are often highly expressed and have a broad host range. Examples include the SV40 early promoter, mouse mammary tumor virus LTR promoter, adenovirus major late promoter, herpes simplex virus promoter, and the CMV promoter (see, e.g., Fernandez & Hoeffler, supra). Typically, transcription termination and polyadenylation sequences recognized by mammalian cells are regulatory regions located 3′ to the translation stop codon and thus, together with the promoter elements, flank the coding sequence. Examples of transcription terminator and polyadenylation signals include those derived from SV40.


[0148] Methods of introducing exogenous nucleic acid into mammalian hosts, as well as other hosts, are well known in the art, and will vary with the host cell used. Techniques include dextran-mediated transfection, calcium phosphate precipitation, polybrene mediated transfection, protoplast fusion, electroporation, viral infection, encapsulation of the polynucleotide(s) in liposomes, and direct microinjection of the DNA into nuclei.


[0149] In a preferred embodiment, angiogenesis proteins are expressed in bacterial systems. Bacterial expression systems are well known in the art. Promoters from bacteriophage may also be used and are known in the art. In addition, synthetic promoters and hybrid promoters are also useful; for example, the tac promoter is a hybrid of the trp and lac promoter sequences. Furthermore, a bacterial promoter can include naturally occurring promoters of non-bacterial origin that have the ability to bind bacterial RNA polymerase and initiate transcription. In addition to a functioning promoter sequence, an efficient ribosome binding site is desirable. The expression vector may also include a signal peptide sequence that provides for secretion of the angiogenesis protein in bacteria. The protein is either secreted into the growth media (gram-positive bacteria) or into the periplasmic space, located between the inner and outer membrane of the cell (gram-negative bacteria). The bacterial expression vector may also include a selectable marker gene to allow for the selection of bacterial strains that have been transformed. Suitable selection genes include genes which render the bacteria resistant to drugs such as ampicillin, chloramphenicol, erythromycin, kanamycin, neomycin and tetracycline. Selectable markers also include biosynthetic genes, such as those in the histidine, tryptophan and leucine biosynthetic pathways. These components are assembled into expression vectors. Expression vectors for bacteria are well known in the art, and include vectors for Bacillus subtilis, E. coli, Streptococcus cremoris, and Streptococcus lividans, among others (e.g., Fernandez & Hoeffler, supra). The bacterial expression vectors are transformed into bacterial host cells using techniques well known in the art, such as calcium chloride treatment, electroporation, and others.


[0150] In one embodiment, angiogenesis proteins are produced in insect cells. Expression vectors for the transformation of insect cells, and in particular, baculovirus-based expression vectors, are well known in the art.


[0151] In a preferred embodiment, angiogenesis protein is produced in yeast cells. Yeast expression systems are well known in the art, and include expression vectors for Saccharomyces cerevisiae, Candida albicans and C. maltosa, Hansenula polymorpha, Kluyveromyces fragilis and K. lactis, Pichia guilliermondii and P. pastoris, Schizosaccharomyces pombe, and Yarrowia lipolytica.


[0152] The angiogenesis protein may also be made as a fusion protein, using techniques well known in the art. Thus, for example, for the creation of monoclonal antibodies, if the desired epitope is small, the angiogenesis protein may be fused to a carrier protein to form an immunogen. Alternatively, the angiogenesis protein may be made as a fusion protein to increase expression, or for other reasons. For example, when the angiogenesis protein is an angiogenesis peptide, the nucleic acid encoding the peptide may be linked to other nucleic acid for expression purposes.


[0153] In one embodiment, the angiogenesis nucleic acids, proteins and antibodies of the invention are labeled. By “labeled” herein is meant that a compound has at least one element, isotope or chemical compound attached to enable the detection of the compound. In general, labels fall into three classes: a) isotopic labels, which may be radioactive or heavy isotopes; b) immune labels, which may be antibodies or antigens; and c) colored or fluorescent dyes. The labels may be incorporated into the angiogenesis nucleic acids, proteins and antibodies at any position. For example, the label should be capable of producing, either directly or indirectly, a detectable signal. The detectable moiety may be a radioisotope, such as 3H, 14C, 32P, 35S, or 125I, a fluorescent or chemiluminescent compound, such as fluorescein isothiocyanate, rhodamine, or luciferin, or an enzyme, such as alkaline phosphatase, beta-galactosidase or horseradish peroxidase. Any method known in the art for conjugating the antibody to the label may be employed, including those methods described by Hunter et al., Nature, 144:945 (1962); David et al., Biochemistry, 13:1014 (1974); Pain et al., J. Immunol. Meth., 40:219 (1981); and Nygren, J. Histochem. and Cytochem., 30:407 (1982).


[0154] Accordingly, the present invention also provides angiogenesis protein sequences. An angiogenesis protein of the present invention may be identified in several ways. “Protein” in this sense includes proteins, polypeptides, and peptides. As will be appreciated by those in the art, the nucleic acid sequences of the invention can be used to generate protein sequences. There are a variety of ways to do this, including cloning the entire gene and verifying its frame and amino acid sequence, or by comparing it to known sequences to search for homology to provide a frame, assuming the angiogenesis protein has an identifiable motif or homology to some protein in the database being used. Generally, the nucleic acid sequences are input into a program that will search all three frames for homology. This is done in a preferred embodiment using the following NCBI Advanced BLAST parameters. The program is blastx or blastn. The database is nr. The input data is as “Sequence in FASTA format”. The organism list is “none”. The “expect” is 10; the filter is default. The “descriptions” is 500, the “alignments” is 500, and the “alignment view” is pairwise. The “Query Genetic Codes” is standard (1). The matrix is BLOSUM62; gap existence cost is 11, per residue gap cost is 1; and the lambda ratio is 0.85 default. This results in the generation of a putative protein sequence.
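The frame search described above can be illustrated with a minimal sketch. It assumes the standard genetic code (NCBI translation table 1, matching the "Query Genetic Codes" setting) and translates only the three forward reading frames of a DNA string; the function names and the example sequence are hypothetical, and the homology search itself (e.g., by blastx) is not reproduced here.

```python
# Standard genetic code (NCBI translation table 1), built from the
# conventional T, C, A, G ordering of the three codon positions.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate_frame(dna, offset):
    """Translate dna starting at offset 0, 1 or 2.

    '*' marks a stop codon; 'X' marks a codon containing a base
    other than A, C, G or T. Trailing partial codons are ignored.
    """
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def three_frame_translation(dna):
    """Return a dict mapping frame number (1, 2, 3) to its translation."""
    return {offset + 1: translate_frame(dna, offset) for offset in range(3)}


if __name__ == "__main__":
    # Hypothetical 9-base example sequence, not taken from the sequence listings.
    for frame, protein in three_frame_translation("ATGGCCTAA").items():
        print(frame, protein)
```

Each translated frame would then be compared against the protein database; the frame that yields significant homology suggests the putative protein sequence.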


[0155] Also included within one embodiment of angiogenesis proteins are amino acid variants of the naturally occurring sequences, as determined herein. Preferably, the variants are greater than about 75% homologous to the wild-type sequence, more preferably greater than about 80%, even more preferably greater than about 85% and most preferably greater than 90%. In some embodiments the homology will be as high as about 93 to 95 or 98%. As for nucleic acids, homology in this context means sequence similarity or identity, with identity being preferred. This homology will be determined using standard techniques well known in the art as are outlined above for the nucleic acid homologies.
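As a rough illustration of how the percentage thresholds above might be applied, the following sketch computes percent identity over two sequences that are assumed to have been aligned already (equal length, with "-" for gaps). This simple column count is an assumption for illustration only and is not the homology algorithm referred to above; the function names and tier labels are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned columns holding identical residues.

    Columns where both sequences have a gap are skipped; a column
    with a gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def preference_tier(identity):
    """Map a percent identity onto the tiers recited in the text."""
    tiers = (
        (90, "most preferred"),
        (85, "even more preferred"),
        (80, "more preferred"),
        (75, "preferred"),
    )
    for threshold, label in tiers:
        if identity > threshold:
            return label
    return "below the stated ranges"


if __name__ == "__main__":
    # Hypothetical aligned fragments differing at one of ten positions.
    identity = percent_identity("MKTAYIAKQR", "MKTAYIAKQK")
    print(identity, preference_tier(identity))
```

Because the text qualifies most thresholds with "about", the strict cutoffs used here should be read as a simplification.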


[0156] Angiogenesis proteins of the present invention may be shorter or longer than the wild type amino acid sequences. Thus, in a preferred embodiment, included within the definition of angiogenesis proteins are portions or fragments of the wild type sequences herein. In addition, as outlined above, the angiogenesis nucleic acids of the invention may be used to obtain additional coding regions, and thus additional protein sequence, using techniques known in the art.


[0157] In a preferred embodiment, the angiogenesis proteins are derivative or variant angiogenesis proteins as compared to the wild-type sequence. That is, as outlined more fully below, the derivative angiogenesis peptide will often contain at least one amino acid substitution, deletion or insertion, with amino acid substitutions being particularly preferred. The amino acid substitution, insertion or deletion may occur at any residue within the angiogenesis peptide.


[0158] Also included within one embodiment of angiogenesis proteins of the present invention are amino acid sequence variants. These variants typically fall into one or more of three classes: substitutional, insertional or deletional variants. These variants ordinarily are prepared by site specific mutagenesis of nucleotides in the DNA encoding the angiogenesis protein, using cassette or PCR mutagenesis or other techniques well known in the art, to produce DNA encoding the variant, and thereafter expressing the DNA in recombinant cell culture as outlined above. However, variant angiogenesis protein fragments having up to about 100-150 residues may be prepared by in vitro synthesis using established techniques. Amino acid sequence variants are characterized by the predetermined nature of the variation, a feature that sets them apart from naturally occurring allelic or interspecies variation of the angiogenesis protein amino acid sequence. The variants typically exhibit the same qualitative biological activity as the naturally occurring analogue, although variants can also be selected which have modified characteristics as will be more fully outlined below.


[0159] While the site or region for introducing an amino acid sequence variation is predetermined, the mutation per se need not be predetermined. For example, in order to optimize the performance of a mutation at a given site, random mutagenesis may be conducted at the target codon or region and the expressed angiogenesis variants screened for the optimal combination of desired activity. Techniques for making substitution mutations at predetermined sites in DNA having a known sequence are well known, for example, M13 primer mutagenesis and PCR mutagenesis. Screening of the mutants is done using assays of angiogenesis protein activities.


[0160] Amino acid substitutions are typically of single residues; insertions usually will be on the order of from about 1 to 20 amino acids, although considerably larger insertions may be tolerated. Deletions range from about 1 to about 20 residues, although in some cases deletions may be much larger.


[0161] Substitutions, deletions, insertions or any combination thereof may be used to arrive at a final derivative. Generally these changes are done on a few amino acids to minimize the alteration of the molecule. However, larger changes may be tolerated in certain circumstances. When small alterations in the characteristics of the angiogenesis protein are desired, substitutions are generally made in accordance with the amino acid substitution chart provided in the definition section.


[0162] Substantial changes in function or immunological identity are made by selecting substitutions that are less conservative than those provided in the definition of “conservative substitution”. For example, substitutions may be made which more significantly affect: the structure of the polypeptide backbone in the area of the alteration, for example the alpha-helical or beta-sheet structure; the charge or hydrophobicity of the molecule at the target site; or the bulk of the side chain. The substitutions which in general are expected to produce the greatest changes in the polypeptide's properties are those in which (a) a hydrophilic residue, e.g. seryl or threonyl, is substituted for (or by) a hydrophobic residue, e.g. leucyl, isoleucyl, phenylalanyl, valyl or alanyl; (b) a cysteine or proline is substituted for (or by) any other residue; (c) a residue having an electropositive side chain, e.g. lysyl, arginyl, or histidyl, is substituted for (or by) an electronegative residue, e.g. glutamyl or aspartyl; or (d) a residue having a bulky side chain, e.g. phenylalanine, is substituted for (or by) one not having a side chain, e.g. glycine.
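The four classes of substitution listed above can be sketched as a simple lookup. The residue sets below contain only the example residues named in the text (one-letter codes) and are therefore illustrative rather than exhaustive; the function and variable names are hypothetical.

```python
# Example residues named in the text, as one-letter codes.
HYDROPHILIC = {"S", "T"}                 # seryl, threonyl
HYDROPHOBIC = {"L", "I", "F", "V", "A"}  # leucyl, isoleucyl, phenylalanyl, valyl, alanyl
ELECTROPOSITIVE = {"K", "R", "H"}        # lysyl, arginyl, histidyl
ELECTRONEGATIVE = {"E", "D"}             # glutamyl, aspartyl
BULKY = {"F"}                            # phenylalanine
NO_SIDE_CHAIN = {"G"}                    # glycine
CYS_OR_PRO = {"C", "P"}


def _between(x, y, set_a, set_b):
    """True if x and y fall in opposite sets, in either direction ("for (or by)")."""
    return (x in set_a and y in set_b) or (x in set_b and y in set_a)


def disruptive_classes(original, replacement):
    """Return the class labels (a) to (d) that a substitution falls into."""
    if original == replacement:
        return []
    hits = []
    if _between(original, replacement, HYDROPHILIC, HYDROPHOBIC):
        hits.append("a")
    if original in CYS_OR_PRO or replacement in CYS_OR_PRO:
        hits.append("b")
    if _between(original, replacement, ELECTROPOSITIVE, ELECTRONEGATIVE):
        hits.append("c")
    if _between(original, replacement, BULKY, NO_SIDE_CHAIN):
        hits.append("d")
    return hits


if __name__ == "__main__":
    for pair in (("S", "L"), ("K", "E"), ("F", "G"), ("P", "A"), ("L", "I")):
        print(pair, disruptive_classes(*pair))
```

A substitution that returns an empty list is not thereby shown to be conservative; it simply does not match any of the example residues given for classes (a) through (d).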


[0163] The variants typically exhibit the same qualitative biological activity and will elicit the same immune response as the naturally-occurring analog, although variants also are selected to modify the characteristics of the angiogenesis proteins as needed. Alternatively, the variant may be designed such that the biological activity of the angiogenesis protein is altered. For example, glycosylation sites may be altered or removed.


[0164] Covalent modifications of angiogenesis polypeptides are included within the scope of this invention. One type of covalent modification includes reacting targeted amino acid residues of an angiogenesis polypeptide with an organic derivatizing agent that is capable of reacting with selected side chains or the N- or C-terminal residues of an angiogenesis polypeptide. Derivatization with bifunctional agents is useful, for instance, for crosslinking angiogenesis polypeptides to a water-insoluble support matrix or surface for use in the method for purifying anti-angiogenesis polypeptide antibodies or screening assays, as is more fully described below. Commonly used crosslinking agents include, e.g., 1,1-bis(diazoacetyl)-2-phenylethane, glutaraldehyde, N-hydroxysuccinimide esters, for example, esters with 4-azidosalicylic acid, homobifunctional imidoesters, including disuccinimidyl esters such as 3,3′-dithiobis(succinimidylpropionate), bifunctional maleimides such as bis-N-maleimido-1,8-octane and agents such as methyl-3-[(p-azidophenyl)dithio]propioimidate.


[0165] Other modifications include deamidation of glutaminyl and asparaginyl residues to the corresponding glutamyl and aspartyl residues, respectively, hydroxylation of proline and lysine, phosphorylation of hydroxyl groups of seryl, threonyl or tyrosyl residues, methylation of the γ-amino groups of lysine, arginine, and histidine side chains [T. E. Creighton, Proteins: Structure and Molecular Properties, W.H. Freeman & Co., San Francisco, pp. 79-86 (1983)], acetylation of the N-terminal amine, and amidation of any C-terminal carboxyl group.


[0166] Another type of covalent modification of the angiogenesis polypeptide included within the scope of this invention comprises altering the native glycosylation pattern of the polypeptide. “Altering the native glycosylation pattern” is intended for purposes herein to mean deleting one or more carbohydrate moieties found in native sequence angiogenesis polypeptide, and/or adding one or more glycosylation sites that are not present in the native sequence angiogenesis polypeptide. Glycosylation patterns can be altered in many ways. For example the use of different cell types to express angiogenesis-associated sequences can result in different glycosylation patterns.


[0167] Addition of glycosylation sites to angiogenesis polypeptides may also be accomplished by altering the amino acid sequence thereof. The alteration may be made, for example, by the addition of, or substitution by, one or more serine or threonine residues to the native sequence angiogenesis polypeptide (for O-linked glycosylation sites). The angiogenesis amino acid sequence may optionally be altered through changes at the DNA level, particularly by mutating the DNA encoding the angiogenesis polypeptide at preselected bases such that codons are generated that will translate into the desired amino acids.


[0168] Another means of increasing the number of carbohydrate moieties on the angiogenesis polypeptide is by chemical or enzymatic coupling of glycosides to the polypeptide. Such methods are described in the art, e.g., in WO 87/05330 published Sep. 11, 1987, and in Aplin and Wriston, CRC Crit. Rev. Biochem., pp. 259-306 (1981).


[0169] Removal of carbohydrate moieties present on the angiogenesis polypeptide may be accomplished chemically or enzymatically or by mutational substitution of codons encoding for amino acid residues that serve as targets for glycosylation. Chemical deglycosylation techniques are known in the art and described, for instance, by Hakimuddin, et al., Arch. Biochem. Biophys., 259:52 (1987) and by Edge et al., Anal. Biochem., 118:131 (1981). Enzymatic cleavage of carbohydrate moieties on polypeptides can be achieved by the use of a variety of endo- and exo-glycosidases as described by Thotakura et al., Meth. Enzymol., 138:350 (1987).


[0170] Another type of covalent modification of angiogenesis polypeptides comprises linking the angiogenesis polypeptide to one of a variety of nonproteinaceous polymers, e.g., polyethylene glycol, polypropylene glycol, or polyoxyalkylenes, in the manner set forth in U.S. Pat. Nos. 4,640,835; 4,496,689; 4,301,144; 4,670,417; 4,791,192 or 4,179,337.


[0171] Angiogenesis polypeptides of the present invention may also be modified in a way to form chimeric molecules comprising an angiogenesis polypeptide fused to another, heterologous polypeptide or amino acid sequence. In one embodiment, such a chimeric molecule comprises a fusion of an angiogenesis polypeptide with a tag polypeptide which provides an epitope to which an anti-tag antibody can selectively bind. The epitope tag is generally placed at the amino- or carboxyl-terminus of the angiogenesis polypeptide. The presence of such epitope-tagged forms of an angiogenesis polypeptide can be detected using an antibody against the tag polypeptide. Also, provision of the epitope tag enables the angiogenesis polypeptide to be readily purified by affinity purification using an anti-tag antibody or another type of affinity matrix that binds to the epitope tag. In an alternative embodiment, the chimeric molecule may comprise a fusion of an angiogenesis polypeptide with an immunoglobulin or a particular region of an immunoglobulin. For a bivalent form of the chimeric molecule, such a fusion could be to the Fc region of an IgG molecule.


[0172] Various tag polypeptides and their respective antibodies are well known in the art. Examples include poly-histidine (poly-his) or poly-histidine-glycine (poly-his-gly) tags; HIS6 and metal chelation tags; the flu HA tag polypeptide and its antibody 12CA5 [Field et al., Mol. Cell. Biol., 8:2159-2165 (1988)]; the c-myc tag and the 8F9, 3C7, 6E10, G4, B7 and 9E10 antibodies thereto [Evan et al., Molecular and Cellular Biology, 5:3610-3616 (1985)]; and the Herpes Simplex virus glycoprotein D (gD) tag and its antibody [Paborsky et al., Protein Engineering, 3(6):547-553 (1990)]. Other tag polypeptides include the Flag-peptide [Hopp et al., BioTechnology, 6:1204-1210 (1988)]; the KT3 epitope peptide [Martin et al., Science, 255:192-194 (1992)]; tubulin epitope peptide [Skinner et al., J. Biol. Chem., 266:15163-15166 (1991)]; and the T7 gene 10 protein peptide tag [Lutz-Freyermuth et al., Proc. Natl. Acad. Sci. USA, 87:6393-6397 (1990)].


[0173] Also included with an embodiment of angiogenesis protein are other angiogenesis proteins of the angiogenesis family, and angiogenesis proteins from other organisms, which are cloned and expressed as outlined below. Thus, probe or degenerate polymerase chain reaction (PCR) primer sequences may be used to find other related angiogenesis proteins from humans or other organisms. As will be appreciated by those in the art, particularly useful probe and/or PCR primer sequences include the unique areas of the angiogenesis nucleic acid sequence. As is generally known in the art, preferred PCR primers are from about 15 to about 35 nucleotides in length, with from about 20 to about 30 being preferred, and may contain inosine as needed. The conditions for the PCR reaction are well known in the art (e.g., Innis, PCR Protocols, supra).
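The primer length guidance above can be expressed as a small check. This is a minimal sketch under stated assumptions: the text qualifies its ranges with "about", so the hard cutoffs used here are a simplification, "I" is used as a hypothetical one-letter symbol for inosine, and the function name and example primers are invented for illustration.

```python
# Bases accepted in a primer string; "I" stands in for inosine (an assumption).
ALLOWED_BASES = set("ACGTI")


def primer_length_class(primer):
    """Classify a primer by length against the ranges recited in the text."""
    primer = primer.upper()
    unknown = set(primer) - ALLOWED_BASES
    if unknown:
        raise ValueError("unexpected symbols in primer: " + "".join(sorted(unknown)))
    length = len(primer)
    if 20 <= length <= 30:
        return "preferred"
    if 15 <= length <= 35:
        return "acceptable"
    return "outside range"


if __name__ == "__main__":
    # Hypothetical primers, not drawn from the sequence listings.
    print(primer_length_class("ACGT" * 5))           # 20 nucleotides
    print(primer_length_class("ACGTIACGTIACGTIAC"))  # 17 nucleotides, with inosine
```

Length is only one criterion; uniqueness of the primer within the angiogenesis nucleic acid sequence, as noted above, is not assessed by this sketch.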


[0174] In addition, as is outlined herein, angiogenesis proteins can be made that are longer than those encoded by the nucleic acids of the figures, e.g., by the elucidation of extended sequences, the addition of epitope or purification tags, the addition of other fusion sequences, etc.


[0175] Angiogenesis proteins may also be identified as being encoded by angiogenesis nucleic acids. Thus, angiogenesis proteins are encoded by nucleic acids that will hybridize to the sequences of the sequence listings, or their complements, as outlined herein.


[0176] In a preferred embodiment, when the angiogenesis protein is to be used to generate antibodies, e.g., for immunotherapy or immunodiagnosis, the angiogenesis protein should share at least one epitope or determinant with the full length protein. By “epitope” or “determinant” herein is typically meant a portion of a protein which will generate and/or bind an antibody or T-cell receptor in the context of MHC. Thus, in most instances, antibodies made to a smaller angiogenesis protein will be able to bind to the full-length protein, particularly linear epitopes. In a preferred embodiment, the epitope is unique; that is, antibodies generated to a unique epitope show little or no cross-reactivity. In a preferred embodiment, the epitope is selected from a protein sequence set out in Table 2.


[0177] Methods of preparing polyclonal antibodies are known to the skilled artisan (e.g., Coligan, supra; and Harlow & Lane, supra). Polyclonal antibodies can be raised in a mammal, e.g., by one or more injections of an immunizing agent and, if desired, an adjuvant. Typically, the immunizing agent and/or adjuvant will be injected in the mammal by multiple subcutaneous or intraperitoneal injections. The immunizing agent may include a protein encoded by a nucleic acid of the figures or fragment thereof or a fusion protein thereof. It may be useful to conjugate the immunizing agent to a protein known to be immunogenic in the mammal being immunized. Examples of such immunogenic proteins include but are not limited to keyhole limpet hemocyanin, serum albumin, bovine thyroglobulin, and soybean trypsin inhibitor. Examples of adjuvants which may be employed include Freund's complete adjuvant and MPL-TDM adjuvant (monophosphoryl Lipid A, synthetic trehalose dicorynomycolate). The immunization protocol may be selected by one skilled in the art without undue experimentation.


[0178] The antibodies may, alternatively, be monoclonal antibodies. Monoclonal antibodies may be prepared using hybridoma methods, such as those described by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma method, a mouse, hamster, or other appropriate host animal, is typically immunized with an immunizing agent to elicit lymphocytes that produce or are capable of producing antibodies that will specifically bind to the immunizing agent. Alternatively, the lymphocytes may be immunized in vitro. The immunizing agent will typically include a polypeptide encoded by a nucleic acid of Table 1, or fragment thereof, or a fusion protein thereof. Generally, either peripheral blood lymphocytes (“PBLs”) are used if cells of human origin are desired, or spleen cells or lymph node cells are used if non-human mammalian sources are desired. The lymphocytes are then fused with an immortalized cell line using a suitable fusing agent, such as polyethylene glycol, to form a hybridoma cell [Goding, Monoclonal Antibodies: Principles and Practice, Academic Press, (1986) pp. 59-103]. Immortalized cell lines are usually transformed mammalian cells, particularly myeloma cells of rodent, bovine and human origin. Usually, rat or mouse myeloma cell lines are employed. The hybridoma cells may be cultured in a suitable culture medium that preferably contains one or more substances that inhibit the growth or survival of the unfused, immortalized cells. For example, if the parental cells lack the enzyme hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the culture medium for the hybridomas typically will include hypoxanthine, aminopterin, and thymidine (“HAT medium”), which substances prevent the growth of HGPRT-deficient cells.


[0179] In one embodiment, the antibodies are bispecific antibodies. Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that have binding specificities for at least two different antigens or that have binding specificities for two epitopes on the same antigen. In one embodiment, one of the binding specificities is for a protein encoded by a nucleic acid of Table 1 or a fragment thereof, and the other one is for any other antigen, preferably for a cell-surface protein or receptor or receptor subunit, preferably one that is tumor specific. Alternatively, tetramer-type technology may create multivalent reagents.


[0180] In a preferred embodiment, the antibodies to angiogenesis protein are capable of reducing or eliminating a biological function of an angiogenesis protein, as is described below. That is, the addition of anti-angiogenesis protein antibodies (either polyclonal or preferably monoclonal) to angiogenic tissue (or cells containing angiogenesis) may reduce or eliminate the angiogenesis activity. Generally, at least a 25% decrease in activity is preferred, with at least about 50% being particularly preferred and about a 95-100% decrease being especially preferred.
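The percentage decreases recited above can be made concrete with a short calculation. This sketch assumes activity is measured in arbitrary but consistent units with and without antibody; the function names, tier labels, and example readings are hypothetical, and the hard cutoffs simplify the "about" qualifiers in the text.

```python
def percent_decrease(baseline, treated):
    """Percent reduction in activity relative to the untreated baseline."""
    if baseline <= 0:
        raise ValueError("baseline activity must be positive")
    return 100.0 * (baseline - treated) / baseline


def inhibition_tier(decrease):
    """Map a percent decrease onto the tiers recited in the text."""
    if decrease >= 95:
        return "especially preferred"
    if decrease >= 50:
        return "particularly preferred"
    if decrease >= 25:
        return "preferred"
    return "below the stated ranges"


if __name__ == "__main__":
    # Hypothetical assay readings in arbitrary units.
    decrease = percent_decrease(baseline=200, treated=80)
    print(decrease, inhibition_tier(decrease))
```

Any assay of angiogenesis activity could supply the two readings, provided the baseline and treated measurements are taken under the same conditions.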


[0181] In a preferred embodiment the antibodies to the angiogenesis proteins are humanized antibodies (e.g., Xenerex Biosciences, Medarex, Inc., Abgenix, Inc., Protein Design Labs, Inc.). Humanized forms of non-human (e.g., murine) antibodies are chimeric molecules of immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab′, F(ab′)2 or other antigen-binding subsequences of antibodies) which contain minimal sequence derived from non-human immunoglobulin. Humanized antibodies include human immunoglobulins (recipient antibody) in which residues from a complementarity determining region (CDR) of the recipient are replaced by residues from a CDR of a non-human species (donor antibody) such as mouse, rat or rabbit having the desired specificity, affinity and capacity. In some instances, Fv framework residues of the human immunoglobulin are replaced by corresponding non-human residues. Humanized antibodies may also comprise residues which are found neither in the recipient antibody nor in the imported CDR or framework sequences. In general, a humanized antibody will comprise substantially all of at least one, and typically two, variable domains, in which all or substantially all of the CDR regions correspond to those of a non-human immunoglobulin and all or substantially all of the framework (FR) regions are those of a human immunoglobulin consensus sequence. The humanized antibody optimally also will comprise at least a portion of an immunoglobulin constant region (Fc), typically that of a human immunoglobulin [Jones et al., Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988); and Presta, Curr. Op. Struct. Biol., 2:593-596 (1992)].


[0182] Methods for humanizing non-human antibodies are well known in the art. Generally, a humanized antibody has one or more amino acid residues introduced into it from a source which is non-human. These non-human amino acid residues are often referred to as import residues, which are typically taken from an import variable domain. Humanization can be essentially performed following the method of Winter and co-workers [Jones et al., Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988); Verhoeyen et al., Science, 239:1534-1536 (1988)], by substituting rodent CDRs or CDR sequences for the corresponding sequences of a human antibody. Accordingly, such humanized antibodies are chimeric antibodies (U.S. Pat. No. 4,816,567), wherein substantially less than an intact human variable domain has been substituted by the corresponding sequence from a non-human species. In practice, humanized antibodies are typically human antibodies in which some CDR residues and possibly some FR residues are substituted by residues from analogous sites in rodent antibodies.


[0183] Human antibodies can also be produced using various techniques known in the art, including phage display libraries [Hoogenboom and Winter, J. Mol. Biol., 227:381 (1991); Marks et al., J. Mol. Biol., 222:581 (1991)]. The techniques of Cole et al. and Boerner et al. are also available for the preparation of human monoclonal antibodies [Cole et al., Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, p. 77 (1985) and Boerner et al., J. Immunol., 147(1):86-95 (1991)]. Similarly, human antibodies can be made by introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the endogenous immunoglobulin genes have been partially or completely inactivated. Upon challenge, human antibody production is observed, which closely resembles that seen in humans in all respects, including gene rearrangement, assembly, and antibody repertoire. This approach is described, for example, in U.S. Pat. Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126; 5,633,425; 5,661,016, and in the following scientific publications: Marks et al., Bio/Technology 10, 779-783 (1992); Lonberg et al., Nature 368, 856-859 (1994); Morrison, Nature 368, 812-13 (1994); Fishwild et al., Nature Biotechnology 14, 845-51 (1996); Neuberger, Nature Biotechnology 14, 826 (1996); Lonberg and Huszar, Intern. Rev. Immunol. 13, 65-93 (1995).


[0184] By immunotherapy is meant treatment of angiogenesis with an antibody raised against angiogenesis proteins. As used herein, immunotherapy can be passive or active. Passive immunotherapy as defined herein is the passive transfer of antibody to a recipient (patient). Active immunization is the induction of antibody and/or T-cell responses in a recipient (patient). Induction of an immune response is the result of providing the recipient with an antigen to which antibodies are raised. As appreciated by one of ordinary skill in the art, the antigen may be provided by injecting a polypeptide against which antibodies are desired to be raised into a recipient, or contacting the recipient with a nucleic acid capable of expressing the antigen and under conditions for expression of the antigen, leading to an immune response.


[0185] In a preferred embodiment the angiogenesis proteins against which antibodies are raised are secreted proteins as described above. Without being bound by theory, antibodies used for treatment bind the secreted protein and prevent it from binding to its receptor, thereby inactivating the secreted angiogenesis protein.


[0186] In another preferred embodiment, the angiogenesis protein to which antibodies are raised is a transmembrane protein. Without being bound by theory, antibodies used for treatment bind the extracellular domain of the angiogenesis protein and prevent it from binding to other proteins, such as circulating ligands or cell-associated molecules. The antibody may cause down-regulation of the transmembrane angiogenesis protein. As will be appreciated by one of ordinary skill in the art, the antibody may be a competitive, non-competitive or uncompetitive inhibitor of protein binding to the extracellular domain of the angiogenesis protein. The antibody is also an antagonist of the angiogenesis protein. Further, the antibody prevents activation of the transmembrane angiogenesis protein. In one aspect, when the antibody prevents the binding of other molecules to the angiogenesis protein, the antibody prevents growth of the cell. The antibody may also be used to target or sensitize the cell to cytotoxic agents, including, but not limited to, TNF-α, TNF-β, IL-1, IFN-γ and IL-2, or chemotherapeutic agents including 5FU, vinblastine, actinomycin D, cisplatin, methotrexate, and the like. In some instances the antibody belongs to a sub-type that activates serum complement when complexed with the transmembrane protein, thereby mediating cytotoxicity or antibody-dependent cellular cytotoxicity (ADCC). Thus, angiogenesis is treated by administering to a patient antibodies directed against the transmembrane angiogenesis protein. Antibody-labeling may activate a co-toxin, localize a toxin payload, or otherwise provide means to locally ablate cells.


[0187] In another preferred embodiment, the antibody is conjugated to an effector moiety. The effector moiety can be any number of molecules, including labelling moieties such as radioactive labels or fluorescent labels, or can be a therapeutic moiety. In one aspect the therapeutic moiety is a small molecule that modulates the activity of the angiogenesis protein. In another aspect the therapeutic moiety modulates the activity of molecules associated with or in close proximity to the angiogenesis protein. The therapeutic moiety may inhibit enzymatic activity such as protease or collagenase activity associated with angiogenesis.


[0188] In a preferred embodiment, the therapeutic moiety can also be a cytotoxic agent. In this method, targeting the cytotoxic agent to angiogenesis tissue or cells results in a reduction in the number of afflicted cells, thereby reducing symptoms associated with angiogenesis. Cytotoxic agents are numerous and varied and include, but are not limited to, cytotoxic drugs or toxins or active fragments of such toxins. Suitable toxins and their corresponding fragments include diphtheria A chain, exotoxin A chain, ricin A chain, abrin A chain, curcin, crotin, phenomycin, enomycin and the like. Cytotoxic agents also include radiochemicals made by conjugating radioisotopes to antibodies raised against angiogenesis proteins, or binding of a radionuclide to a chelating agent that has been covalently attached to the antibody. Targeting the therapeutic moiety to transmembrane angiogenesis proteins not only serves to increase the local concentration of therapeutic moiety in the angiogenesis-afflicted area, but also serves to reduce deleterious side effects that may be associated with the therapeutic moiety.


[0189] In another preferred embodiment, the angiogenesis protein against which the antibodies are raised is an intracellular protein. In this case, the antibody may be conjugated to a protein which facilitates entry into the cell. In one case, the antibody enters the cell by endocytosis. In another embodiment, a nucleic acid encoding the antibody is administered to the individual or cell. Moreover, where the angiogenesis protein can be targeted to a location within a cell, e.g., the nucleus, an antibody thereto contains a signal for that target localization, e.g., a nuclear localization signal.


[0190] The angiogenesis antibodies of the invention specifically bind to angiogenesis proteins. By “specifically bind” herein is meant that the antibodies bind to the protein with a Kd of at least about 0.1 mM, more usually at least about 1 μM, preferably at least about 0.1 μM or better, and most preferably, 0.01 μM or better. Selectivity of binding is also important.
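
By way of illustration only, the affinity thresholds recited above can be applied as a simple classification. The following Python sketch assumes that "or better" means a numerically lower Kd (tighter binding) and that Kd values are supplied in molar units; the function name classify_kd and the tier labels are hypothetical and are not part of the disclosure.

```python
# Thresholds from the paragraph above, converted to molar units.
# Assumption: a lower Kd is "better" (tighter binding).
KD_TIERS = [
    (1e-8, "most preferred (0.01 uM or better)"),
    (1e-7, "preferred (0.1 uM or better)"),
    (1e-6, "more usual (about 1 uM)"),
    (1e-4, "specific binding (about 0.1 mM)"),
]


def classify_kd(kd_molar):
    """Return the label of the tightest tier whose threshold the Kd meets."""
    if kd_molar <= 0:
        raise ValueError("Kd must be a positive concentration")
    for threshold, label in KD_TIERS:
        if kd_molar <= threshold:
            return label
    return "not specific by these thresholds"


if __name__ == "__main__":
    for kd in (5e-9, 5e-7, 1e-3):
        print(f"Kd = {kd:.0e} M -> {classify_kd(kd)}")
```

Such a classification addresses affinity only; selectivity of binding would need to be assessed separately.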


[0191] In a preferred embodiment, the angiogenesis protein is purified or isolated after expression. Angiogenesis proteins may be isolated or purified in a variety of ways known to those skilled in the art depending on what other components are present in the sample. Standard purification methods include electrophoretic, molecular, immunological and chromatographic techniques, including ion exchange, hydrophobic, affinity, and reverse-phase HPLC chromatography, and chromatofocusing. For example, the angiogenesis protein may be purified using a standard anti-angiogenesis protein antibody column. Ultrafiltration and diafiltration techniques, in conjunction with protein concentration, are also useful. For general guidance in suitable purification techniques, see Scopes, R., Protein Purification, Springer-Verlag, N.Y. (1982). The degree of purification necessary will vary depending on the use of the angiogenesis protein. In some instances no purification will be necessary.


[0192] Once expressed and purified if necessary, the angiogenesis proteins and nucleic acids are useful in a number of applications. They may be used as immunoselection reagents, as vaccine reagents, as screening agents, etc.


[0193] Detection of Angiogenesis Sequence for Diagnostic and Therapeutic Applications


[0194] In one aspect, the RNA expression levels of genes are determined for different cellular states in the angiogenesis phenotype. Expression levels of genes in normal tissue (i.e., not undergoing angiogenesis) and in angiogenesis tissue (and in some cases, for varying severities of angiogenesis that relate to prognosis, as outlined below) are evaluated to provide expression profiles. An expression profile of a particular cell state or point of development is essentially a “fingerprint” of the state. While two states may have any particular gene similarly expressed, the evaluation of a number of genes simultaneously allows the generation of a gene expression profile that is reflective of the state of the cell. By comparing expression profiles of cells in different states, information regarding which genes are important (including both up- and down-regulation of genes) in each of these states is obtained. Then, diagnosis may be performed or confirmed to determine whether a tissue sample has the gene expression profile of normal or angiogenic tissue. This will provide for molecular diagnosis of related conditions.
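
The comparison of a sample "fingerprint" against reference profiles can be sketched as follows. This Python example is illustrative only: it assumes that profiles are represented as mappings from gene identifiers to expression levels and that similarity is scored by Pearson correlation, which is one of many possible measures and is not specified in the disclosure. The gene names and expression values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / (sd_x * sd_y)


def classify_profile(sample, references):
    """Return the reference state most correlated with the sample, plus all scores."""
    genes = sorted(sample)  # fixed gene order shared by all profiles
    scores = {
        state: pearson([sample[g] for g in genes], [ref[g] for g in genes])
        for state, ref in references.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical reference fingerprints and test sample (arbitrary units).
references = {
    "normal": {"geneA": 1.0, "geneB": 5.0, "geneC": 2.0, "geneD": 4.0},
    "angiogenic": {"geneA": 6.0, "geneB": 1.0, "geneC": 5.0, "geneD": 1.5},
}
sample = {"geneA": 5.5, "geneB": 1.2, "geneC": 4.0, "geneD": 2.0}

if __name__ == "__main__":
    state, scores = classify_profile(sample, references)
    print(state, scores)
```

Evaluating several genes at once, rather than any single gene, is what allows the sample to be assigned to a state even when individual genes are similarly expressed in both states.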


[0195] “Differential expression,” or grammatical equivalents as used herein, refers to qualitative or quantitative differences in the temporal and/or cellular gene expression patterns within and among cells and tissue. Thus, a differentially expressed gene can qualitatively have its expression altered, including an activation or inactivation, in, e.g., normal versus angiogenic tissue. Genes may be turned on or turned off in a particular state, relative to another state, thus permitting comparison of two or more states. A qualitatively regulated gene will exhibit an expression pattern within a state or cell type which is detectable by standard techniques. Some genes will be expressed in one state or cell type, but not in both. Alternatively, the difference in expression may be quantitative, e.g., in that expression is increased or decreased; i.e., gene expression is either upregulated, resulting in an increased amount of transcript, or downregulated, resulting in a decreased amount of transcript. The degree to which expression differs need only be large enough to quantify via standard characterization techniques as outlined below, such as by use of Affymetrix GeneChip™ expression arrays, Lockhart, Nature Biotechnology, 14:1675-1680 (1996), hereby expressly incorporated by reference. Other techniques include, but are not limited to, quantitative reverse transcriptase PCR, Northern analysis and RNase protection. As outlined above, preferably the change in expression (i.e., upregulation or downregulation) is at least about 50%, more preferably at least about 100%, more preferably at least about 150%, more preferably at least about 200%, with from 300 to at least 1000% being especially preferred.
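
The percentage thresholds recited above can be illustrated with a short calculation. The following Python sketch is an assumption-laden example: the disclosure does not state how a percent change is computed for downregulation, so the sketch measures the change symmetrically from the ratio of the higher level to the lower level. The function names and tier labels are hypothetical.

```python
# Preference tiers from the paragraph above, as minimum percent change.
TIERS = [
    (300.0, "especially preferred (300% or more)"),
    (200.0, "at least about 200%"),
    (150.0, "at least about 150%"),
    (100.0, "at least about 100%"),
    (50.0, "at least about 50%"),
]


def expression_change(normal, angiogenic):
    """Return (direction, percent change) between two positive expression levels.

    Assumption: the magnitude is (higher / lower - 1) * 100, so that a
    4-fold increase and a 4-fold decrease both report 300%.
    """
    if normal <= 0 or angiogenic <= 0:
        raise ValueError("expression levels must be positive")
    if angiogenic > normal:
        direction = "up"
    elif angiogenic < normal:
        direction = "down"
    else:
        direction = "unchanged"
    higher, lower = max(normal, angiogenic), min(normal, angiogenic)
    return direction, (higher / lower - 1.0) * 100.0


def preference_tier(percent):
    """Map a percent change onto the recited preference tiers."""
    for minimum, label in TIERS:
        if percent >= minimum:
            return label
    return "below 50%"


if __name__ == "__main__":
    direction, pct = expression_change(10.0, 40.0)
    print(direction, pct, preference_tier(pct))
```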


[0196] Evaluation may be at the gene transcript, or the protein level. The amount of gene expression may be monitored using nucleic acid probes to the DNA or RNA equivalent of the gene transcript, and the quantification of gene expression levels, or, alternatively, the final gene product itself (protein) can be monitored, e.g., with antibodies to the angiogenesis protein and standard immunoassays (ELISAs, etc.) or other techniques, including mass spectroscopy assays, 2D gel electrophoresis assays, etc. Proteins corresponding to angiogenesis genes, i.e., those identified as being important in an angiogenesis phenotype, can be evaluated in an angiogenesis diagnostic test.


[0197] In a preferred embodiment, gene expression monitoring is performed simultaneously on a number of genes. Multiple protein expression monitoring can be performed as well. Similarly, these assays may be performed on an individual basis as well.


[0198] In this embodiment, the angiogenesis nucleic acid probes are attached to biochips as outlined herein for the detection and quantification of angiogenesis sequences in a particular cell. The assays are further described below in the example. PCR techniques can be used to provide greater sensitivity.


[0199] In a preferred embodiment nucleic acids encoding the angiogenesis protein are detected. Although DNA or RNA encoding the angiogenesis protein may be detected, of particular interest are methods wherein an mRNA encoding an angiogenesis protein is detected. Probes to detect mRNA can be a nucleotide/deoxynucleotide probe that is complementary to and hybridizes with the mRNA and includes, but is not limited to, oligonucleotides, cDNA or RNA. Probes also should contain a detectable label, as defined herein. In one method the mRNA is detected after immobilizing the nucleic acid to be examined on a solid support such as nylon membranes and hybridizing the probe with the sample. Following washing to remove the non-specifically bound probe, the label is detected. In another method detection of the mRNA is performed in situ. In this method permeabilized cells or tissue samples are contacted with a detectably labeled nucleic acid probe for sufficient time to allow the probe to hybridize with the target mRNA. Following washing to remove the non-specifically bound probe, the label is detected. For example, a digoxigenin-labeled riboprobe (RNA probe) that is complementary to the mRNA encoding an angiogenesis protein is detected by binding the digoxigenin with an anti-digoxigenin secondary antibody and developed with nitro blue tetrazolium and 5-bromo-4-chloro-3-indolyl phosphate.


[0200] In a preferred embodiment, various proteins from the three classes of proteins as described herein (secreted, transmembrane or intracellular proteins) are used in diagnostic assays. The angiogenesis proteins, antibodies, nucleic acids, modified proteins and cells containing angiogenesis sequences are used in diagnostic assays. This can be performed on an individual gene or corresponding polypeptide level. In a preferred embodiment, the expression profiles are used, preferably in conjunction with high throughput screening techniques to allow monitoring for expression profile genes and/or corresponding polypeptides.


[0201] As described and defined herein, angiogenesis proteins, including intracellular, transmembrane or secreted proteins, find use as markers of angiogenesis. Detection of these proteins in putative angiogenesis tissue allows for detection or diagnosis of angiogenesis. In one embodiment, antibodies are used to detect angiogenesis proteins. A preferred method separates proteins from a sample by electrophoresis on a gel (typically a denaturing and reducing protein gel, but may be another type of gel, including isoelectric focusing gels and the like). Following separation of proteins, the angiogenesis protein is detected, e.g., by immunoblotting with antibodies raised against the angiogenesis protein. Methods of immunoblotting are well known to those of ordinary skill in the art.


[0202] In another preferred method, antibodies to the angiogenesis protein find use in in situ imaging techniques, e.g., in histology (e.g., Methods in Cell Biology: Antibodies in Cell Biology, volume 37 (Asai, ed. 1993)). In this method cells are contacted with from one to many antibodies to the angiogenesis protein(s). Following washing to remove non-specific antibody binding, the presence of the antibody or antibodies is detected. In one embodiment the antibody is detected by incubating with a secondary antibody that contains a detectable label. In another method the primary antibody to the angiogenesis protein(s) contains a detectable label, for example an enzyme marker that can act on a substrate. In another preferred embodiment each one of multiple primary antibodies contains a distinct and detectable label. This method finds particular use in simultaneous screening for a plurality of angiogenesis proteins. As will be appreciated by one of ordinary skill in the art, many other histological imaging techniques are also provided by the invention.


[0203] In a preferred embodiment the label is detected in a fluorometer which has the ability to detect and distinguish emissions of different wavelengths. In addition, a fluorescence activated cell sorter (FACS) can be used in the method.


[0204] In another preferred embodiment, antibodies find use in diagnosing angiogenesis from blood samples. As previously described, certain angiogenesis proteins are secreted/circulating molecules. Blood samples, therefore, are useful as samples to be probed or tested for the presence of secreted angiogenesis proteins. Antibodies can be used to detect an angiogenesis protein by previously described immunoassay techniques including ELISA, immunoblotting (Western blotting), immunoprecipitation, BIACORE technology and the like. Conversely, the presence of antibodies may indicate an immune response against an endogenous angiogenesis protein.


[0205] In a preferred embodiment, in situ hybridization of labeled angiogenesis nucleic acid probes to tissue arrays is done. For example, arrays of tissue samples, including angiogenesis tissue and/or normal tissue, are made. In situ hybridization (see, e.g., Ausubel, supra) is then performed. When comparing the fingerprints between an individual and a standard, the skilled artisan can make a diagnosis, a prognosis, or a prediction based on the findings. It is further understood that the genes which indicate the diagnosis may differ from those which indicate the prognosis and molecular profiling of the condition of the cells may lead to distinctions between responsive or refractory conditions or may be predictive of outcomes.


[0206] In a preferred embodiment, the angiogenesis proteins, antibodies, nucleic acids, modified proteins and cells containing angiogenesis sequences are used in prognosis assays. As above, gene expression profiles can be generated that correlate to angiogenesis severity, in terms of long term prognosis. Again, this may be done on either a protein or gene level, with the use of genes being preferred. As above, angiogenesis probes may be attached to biochips for the detection and quantification of angiogenesis sequences in a tissue or patient. The assays proceed as outlined above for diagnosis. PCR methods may provide more sensitive and accurate quantification.


[0207] In a preferred embodiment members of the three classes of proteins as described herein are used in drug screening assays. The angiogenesis proteins, antibodies, nucleic acids, modified proteins and cells containing angiogenesis sequences are used in drug screening assays or by evaluating the effect of drug candidates on a “gene expression profile” or expression profile of polypeptides. In a preferred embodiment, the expression profiles are used, preferably in conjunction with high throughput screening techniques to allow monitoring for expression profile genes after treatment with a candidate agent (e.g., Zlokarnik, et al., Science 279, 84-8 (1998); Heid, Genome Res 6:986-94, 1996).


[0208] In a preferred embodiment, the angiogenesis proteins, antibodies, nucleic acids, modified proteins and cells containing the native or modified angiogenesis proteins are used in screening assays. That is, the present invention provides novel methods for screening for compositions which modulate the angiogenesis phenotype or an identified physiological function of an angiogenesis protein. As above, this can be done on an individual gene level or by evaluating the effect of drug candidates on a “gene expression profile”. In a preferred embodiment, the expression profiles are used, preferably in conjunction with high throughput screening techniques to allow monitoring for expression profile genes after treatment with a candidate agent, see Zlokarnik, supra.


[0209] Having identified the differentially expressed genes herein, a variety of assays may be executed. In a preferred embodiment, assays may be run on an individual gene or protein level. That is, having identified a particular gene as upregulated in angiogenesis, test compounds can be screened for the ability to modulate gene expression or for binding to the angiogenic protein. “Modulation” thus includes both an increase and a decrease in gene expression. The preferred amount of modulation will depend on the original change of the gene expression in normal tissue versus tissue undergoing angiogenesis, with changes of at least 10%, preferably 50%, more preferably 100-300%, and in some embodiments 300-1000% or greater. Thus, if a gene exhibits a 4-fold increase in angiogenic tissue compared to normal tissue, a decrease of about four-fold is often desired; similarly, a 10-fold decrease in angiogenic tissue compared to normal tissue often provides a target value of a 10-fold increase in expression to be induced by the test compound.
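
The target values described above follow directly from the ratio of expression in angiogenic tissue to expression in normal tissue. The following Python sketch illustrates that arithmetic only; the function name target_modulation is hypothetical, and expression levels are assumed to be positive values in arbitrary but consistent units.

```python
def target_modulation(normal_level, angiogenic_level):
    """Return (direction, fold) that a test compound would need to induce.

    The direction is the reverse of the change observed in angiogenic
    tissue, and the fold equals the magnitude of that observed change.
    """
    if normal_level <= 0 or angiogenic_level <= 0:
        raise ValueError("expression levels must be positive")
    if angiogenic_level > normal_level:
        # Gene is upregulated in angiogenesis, so a decrease is desired.
        return "decrease", angiogenic_level / normal_level
    if angiogenic_level < normal_level:
        # Gene is downregulated in angiogenesis, so an increase is desired.
        return "increase", normal_level / angiogenic_level
    return "none", 1.0


if __name__ == "__main__":
    print(target_modulation(1.0, 4.0))   # 4-fold increase observed
    print(target_modulation(10.0, 1.0))  # 10-fold decrease observed
```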


[0210] The amount of gene expression may be monitored using nucleic acid probes and the quantification of gene expression levels, or, alternatively, the gene product itself can be monitored, e.g., through the use of antibodies to the angiogenesis protein and standard immunoassays. Proteomics and separation techniques may also allow quantification of expression.


[0211] In a preferred embodiment, gene expression or protein levels of a number of entities, i.e., an expression profile, are monitored simultaneously. Such profiles will typically involve a plurality of those entities described herein.


[0212] In this embodiment, the angiogenesis nucleic acid probes are attached to biochips as outlined herein for the detection and quantification of angiogenesis sequences in a particular cell. Alternatively, PCR may be used. Thus, a series of, e.g., microtiter plates may be used with primers dispensed in desired wells. A PCR reaction can then be performed and analyzed for each well.
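
The dispensing of primers into desired wells can be tracked with a simple plate map. The following Python sketch is illustrative only and assumes a standard 96-well plate (rows A-H, columns 1-12) filled row by row; the plate format, the fill order, and the primer set names are assumptions and are not specified in the disclosure.

```python
from string import ascii_uppercase


def assign_wells(primer_sets, rows=8, cols=12):
    """Map well names (A1, A2, ...) to primer sets, filling row by row."""
    wells = [
        f"{ascii_uppercase[r]}{c}"
        for r in range(rows)
        for c in range(1, cols + 1)
    ]
    if len(primer_sets) > len(wells):
        raise ValueError("more primer sets than wells on one plate")
    return dict(zip(wells, primer_sets))


if __name__ == "__main__":
    # Hypothetical primer set names, one per gene of interest.
    layout = assign_wells(["geneA", "geneB", "geneC"])
    for well, primers in layout.items():
        print(well, primers)
```

The resulting map allows the PCR result from each well to be attributed to the corresponding gene.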


[0213] Modulators of Angiogenesis


[0214] Expression monitoring can be performed to identify compounds that modify the expression of one or more angiogenesis-associated sequences, e.g., a polynucleotide sequence set out in Table 1. Generally, in a preferred embodiment, a test modulator is added to the cells prior to analysis. Moreover, screens are also provided to identify agents that modulate angiogenesis, modulate angiogenesis proteins, bind to an angiogenesis protein, or interfere with the binding of an angiogenesis protein and an antibody or other binding partner.


[0215] The term “test compound” or “drug candidate” or “modulator” or grammatical equivalents as used herein describes any molecule, e.g., protein, oligopeptide, small organic molecule, polysaccharide, polynucleotide, etc., to be tested for the capacity to directly or indirectly alter the angiogenesis phenotype or the expression of an angiogenesis sequence, e.g., a nucleic acid or protein sequence. In preferred embodiments, modulators alter expression profiles, or expression profile nucleic acids or proteins provided herein. In one embodiment, the modulator suppresses an angiogenesis phenotype, for example to a normal tissue fingerprint. In another embodiment, a modulator induces an angiogenesis phenotype. Generally, a plurality of assay mixtures are run in parallel with different agent concentrations to obtain a differential response to the various concentrations. Typically, one of these concentrations serves as a negative control, i.e., at zero concentration or below the level of detection.
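
A parallel concentration series with a zero-concentration negative control can be sketched as follows. This Python example is illustrative only: the serial-dilution scheme, the function names, and the normalization of each reading to the control are assumptions, and the concentrations and readings shown are hypothetical.

```python
def dilution_series(top, factor, points):
    """Return `points` serially diluted concentrations plus a zero control."""
    if top <= 0 or factor <= 1 or points < 1:
        raise ValueError("need top > 0, factor > 1 and at least one point")
    return [top / factor ** i for i in range(points)] + [0.0]


def relative_response(readings):
    """Express each reading as a fraction of the zero-concentration control.

    `readings` maps agent concentration to a measured signal and must
    contain the negative control under the key 0.0.
    """
    control = readings[0.0]
    if control == 0:
        raise ValueError("control signal must be non-zero")
    return {conc: signal / control for conc, signal in sorted(readings.items())}


if __name__ == "__main__":
    print(dilution_series(100.0, 10.0, 3))
    # Hypothetical signals: expression falls as agent concentration rises.
    print(relative_response({0.0: 200.0, 1.0: 180.0, 10.0: 100.0, 100.0: 40.0}))
```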


[0216] In one aspect, a modulator will neutralize the effect of an angiogenesis protein. By “neutralize” is meant that activity of a protein is inhibited or blocked and thereby has substantially no effect on a cell.


[0217] In certain embodiments, combinatorial libraries of potential modulators will be screened for an ability to bind to an angiogenesis polypeptide or to modulate activity. Conventionally, new chemical entities with useful properties are generated by identifying a chemical compound (called a “lead compound”) with some desirable property or activity, e.g., inhibiting activity, creating variants of the lead compound, and evaluating the property and activity of those variant compounds. Often, high throughput screening (HTS) methods are employed for such an analysis.


[0218] In one preferred embodiment, high throughput screening methods involve providing a library containing a large number of potential therapeutic compounds (candidate compounds). Such “combinatorial chemical libraries” are then screened in one or more assays to identify those library members (particular chemical species or subclasses) that display a desired characteristic activity. The compounds thus identified can serve as conventional “lead compounds” or can themselves be used as potential or actual therapeutics.


[0219] A combinatorial chemical library is a collection of diverse chemical compounds generated by either chemical synthesis or biological synthesis by combining a number of chemical “building blocks” such as reagents. For example, a linear combinatorial chemical library, such as a polypeptide (e.g., mutein) library, is formed by combining a set of chemical building blocks called amino acids in every possible way for a given compound length (i.e., the number of amino acids in a polypeptide compound). Millions of chemical compounds can be synthesized through such combinatorial mixing of chemical building blocks (Gallop et al. (1994) J. Med. Chem. 37(9): 1233-1251).
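
The size of a linear combinatorial library grows exponentially with compound length, which is why millions of compounds are reachable with short peptides. The following Python sketch illustrates this by enumerating every combination of the 20 standard amino acids (one-letter codes) for a given length; the function names are hypothetical, and restricting the building blocks to the 20 standard amino acids is an assumption made for the example.

```python
from itertools import product

# One-letter codes of the 20 standard amino acids (assumed building blocks).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def library_size(length, alphabet=AMINO_ACIDS):
    """Number of distinct linear compounds of the given length."""
    return len(alphabet) ** length


def enumerate_peptides(length, alphabet=AMINO_ACIDS):
    """Lazily yield every possible sequence of the given length."""
    for combination in product(alphabet, repeat=length):
        yield "".join(combination)


if __name__ == "__main__":
    for n in range(1, 7):
        print(n, library_size(n))
    print(list(enumerate_peptides(1))[:5])
```

With 20 building blocks, a length of five already gives 3,200,000 distinct sequences.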


[0220] Preparation and screening of combinatorial chemical libraries is well known to those of skill in the art. Such combinatorial chemical libraries include, but are not limited to, peptide libraries (see, e.g., U.S. Pat. No. 5,010,175, Furka (1991) Int. J. Pept. Prot. Res., 37: 487-493, Houghton et al. (1991) Nature, 354: 84-88), peptoids (PCT Publication No. WO 91/19735, Dec. 26, 1991), encoded peptides (PCT Publication WO 93/20242, Oct. 14, 1993), random bio-oligomers (PCT Publication WO 92/00091, Jan. 9, 1992), benzodiazepines (U.S. Pat. No. 5,288,514), diversomers such as hydantoins, benzodiazepines and dipeptides (Hobbs et al., (1993) Proc. Nat. Acad. Sci. USA 90: 6909-6913), vinylogous polypeptides (Hagihara et al. (1992) J. Amer. Chem. Soc. 114: 6568), nonpeptidal peptidomimetics with a Beta-D-Glucose scaffolding (Hirschmann et al., (1992) J. Amer. Chem. Soc. 114: 9217-9218), analogous organic syntheses of small compound libraries (Chen et al. (1994) J. Amer. Chem. Soc. 116: 2661), oligocarbamates (Cho, et al., (1993) Science 261:1303), peptidyl phosphonates (Campbell et al., (1994) J. Org. Chem. 59: 658; see, generally, Gordon et al., (1994) J. Med. Chem. 37:1385), nucleic acid libraries (see, e.g., Stratagene, Corp.), peptide nucleic acid libraries (see, e.g., U.S. Pat. No. 5,539,083), antibody libraries (see, e.g., Vaughn et al. (1996) Nature Biotechnology, 14(3): 309-314, and PCT/US96/10287), carbohydrate libraries (see, e.g., Liang et al., (1996) Science, 274: 1520-1522, and U.S. Pat. No. 5,593,853), and small organic molecule libraries (see, e.g., benzodiazepines, Baum (1993) C&EN, January 18, page 25; isoprenoids, U.S. Pat. No. 5,569,588; thiazolidinones and metathiazanones, U.S. Pat. No. 5,549,974; pyrrolidines, U.S. Pat. Nos. 5,525,735 and 5,519,134; morpholino compounds, U.S. Pat. No. 5,506,337; benzodiazepines, U.S. Pat. No. 5,288,514; and the like).


[0221] Devices for the preparation of combinatorial libraries are commercially available (see, e.g., 357 MPS, 390 MPS, Advanced Chem Tech, Louisville Ky., Symphony, Rainin, Woburn, Mass., 433A Applied Biosystems, Foster City, Calif., 9050 Plus, Millipore, Bedford, Mass.).


[0222] A number of well known robotic systems have also been developed for solution phase chemistries. These systems include automated workstations like the automated synthesis apparatus developed by Takeda Chemical Industries, LTD. (Osaka, Japan) and many robotic systems utilizing robotic arms (Zymate II, Zymark Corporation, Hopkinton, Mass.; Orca, Hewlett-Packard, Palo Alto, Calif.), which mimic the manual synthetic operations performed by a chemist. Any of the above devices are suitable for use with the present invention. The nature and implementation of modifications to these devices (if any) so that they can operate as discussed herein will be apparent to persons skilled in the relevant art. In addition, numerous combinatorial libraries are themselves commercially available (see, e.g., ComGenex, Princeton, N.J., Asinex, Moscow, Ru, Tripos, Inc., St. Louis, Mo., ChemStar, Ltd, Moscow, RU, 3D Pharmaceuticals, Exton, Pa., Mattek Biosciences, Columbia, Md., etc.).


[0223] The assays to identify modulators are amenable to high throughput screening. Preferred assays thus detect enhancement or inhibition of angiogenesis gene transcription, inhibition or enhancement of polypeptide expression, and inhibition or enhancement of polypeptide activity.


[0224] High throughput assays for the presence, absence, quantification, or other properties of particular nucleic acids or protein products are well known to those of skill in the art. Similarly, binding assays and reporter gene assays are well known. Thus, for example, U.S. Pat. No. 5,559,410 discloses high throughput screening methods for proteins, U.S. Pat. No. 5,585,639 discloses high throughput screening methods for nucleic acid binding (i.e., in arrays), while U.S. Pat. Nos. 5,576,220 and 5,541,061 disclose high throughput methods of screening for ligand/antibody binding.


[0225] In addition, high throughput screening systems are commercially available (see, e.g., Zymark Corp., Hopkinton, Mass.; Air Technical Industries, Mentor, Ohio; Beckman Instruments, Inc. Fullerton, Calif.; Precision Systems, Inc., Natick, Mass., etc.). These systems typically automate entire procedures, including all sample and reagent pipetting, liquid dispensing, timed incubations, and final readings of the microplate in detector(s) appropriate for the assay. These configurable systems provide high throughput and rapid start up as well as a high degree of flexibility and customization. The manufacturers of such systems provide detailed protocols for various high throughput systems. Thus, for example, Zymark Corp. provides technical bulletins describing screening systems for detecting the modulation of gene transcription, ligand binding, and the like.


[0226] In one embodiment, modulators are proteins, often naturally occurring proteins or fragments of naturally occurring proteins. Thus, e.g., cellular extracts containing proteins, or random or directed digests of proteinaceous cellular extracts, may be used. In this way libraries of proteins may be made for screening in the methods of the invention. Particularly preferred in this embodiment are libraries of bacterial, fungal, viral, and mammalian proteins, with the latter being preferred, and human proteins being especially preferred. Particularly useful test compounds will be directed to the class of proteins to which the target belongs, e.g., substrates for enzymes or ligands and receptors.


[0227] In a preferred embodiment, modulators are peptides of from about 5 to about 30 amino acids, with from about 5 to about 20 amino acids being preferred, and from about 7 to about 15 being particularly preferred. The peptides may be digests of naturally occurring proteins as is outlined above, random peptides, or “biased” random peptides. By “randomized” or grammatical equivalents herein is meant that each nucleic acid and peptide consists of essentially random nucleotides and amino acids, respectively. Since generally these random peptides (or nucleic acids, discussed below) are chemically synthesized, they may incorporate any nucleotide or amino acid at any position. The synthetic process can be designed to generate randomized proteins or nucleic acids, to allow the formation of all or most of the possible combinations over the length of the sequence, thus forming a library of randomized candidate bioactive proteinaceous agents.


[0228] In one embodiment, the library is fully randomized, with no sequence preferences or constants at any position. In a preferred embodiment, the library is biased. That is, some positions within the sequence are either held constant, or are selected from a limited number of possibilities. For example, in a preferred embodiment, the nucleotides or amino acid residues are randomized within a defined class, for example, of hydrophobic amino acids, hydrophilic residues, sterically biased (either small or large) residues, towards the creation of nucleic acid binding domains, the creation of cysteines, for cross-linking, prolines for SH-3 domains, serines, threonines, tyrosines or histidines for phosphorylation sites, etc., or to purines, etc.
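
The distinction between a fully randomized library and a biased library can be illustrated by describing each sequence position with the set of residues allowed at that position. The following Python sketch is an example under stated assumptions: the template representation, the function name random_peptide, the particular grouping of hydrophobic residues, and the seven-residue length are all hypothetical choices made for illustration.

```python
import random

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# One common grouping of hydrophobic residues (an assumption; groupings vary).
HYDROPHOBIC = "AVILMFWY"
# Residues named above in connection with phosphorylation sites.
PHOSPHO_SITE = "STYH"


def random_peptide(template, rng):
    """Draw one peptide; `template` lists the residues allowed per position."""
    return "".join(rng.choice(allowed) for allowed in template)


# Fully randomized: every position may be any amino acid.
unbiased_template = [AMINO_ACIDS] * 7

# Biased: position 3 is held constant (cysteine), and positions 2, 5 and 6
# are drawn from limited classes; the remaining positions are random.
biased_template = [
    AMINO_ACIDS, HYDROPHOBIC, "C", AMINO_ACIDS,
    PHOSPHO_SITE, HYDROPHOBIC, AMINO_ACIDS,
]

if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the example repeatable
    print([random_peptide(unbiased_template, rng) for _ in range(3)])
    print([random_peptide(biased_template, rng) for _ in range(3)])
```

Holding a position constant corresponds to an allowed set of one residue, so the same template form covers both constant and class-restricted positions.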


[0229] Modulators of angiogenesis can also be nucleic acids, as defined above.


[0230] As described above generally for proteins, nucleic acid modulating agents may be naturally occurring nucleic acids, random nucleic acids, or “biased” random nucleic acids. For example, digests of procaryotic or eucaryotic genomes may be used as is outlined above for proteins.


[0231] In a preferred embodiment, the candidate compounds are organic chemical moieties, a wide variety of which are available in the literature.


[0232] After the candidate agent has been added and the cells allowed to incubate for some period of time, the sample containing a target sequence to be analyzed is added to the biochip. If required, the target sequence is prepared using known techniques. For example, the sample may be treated to lyse the cells, using known lysis buffers, electroporation, etc., with purification and/or amplification such as PCR performed as appropriate. For example, an in vitro transcription with labels covalently attached to the nucleotides is performed. Generally, the nucleic acids are labeled with biotin-FITC or PE, or with cy3 or cy5.


[0233] In a preferred embodiment, the target sequence is labeled with, for example, a fluorescent, a chemiluminescent, a chemical, or a radioactive signal, to provide a means of detecting the target sequence's specific binding to a probe. The label also can be an enzyme, such as alkaline phosphatase or horseradish peroxidase, which when provided with an appropriate substrate produces a product that can be detected. Alternatively, the label can be a labeled compound or small molecule, such as an enzyme inhibitor, that binds but is not catalyzed or altered by the enzyme. The label also can be a moiety or compound, such as an epitope tag or biotin which specifically binds to streptavidin. For the example of biotin, the streptavidin is labeled as described above, thereby providing a detectable signal for the bound target sequence. Unbound labeled streptavidin is typically removed prior to analysis.


[0234] As will be appreciated by those in the art, these assays can be direct hybridization assays or can comprise “sandwich assays”, which include the use of multiple probes, as is generally outlined in U.S. Pat. Nos. 5,681,702, 5,597,909, 5,545,730, 5,594,117, 5,591,584, 5,571,670, 5,580,731, 5,571,670, 5,591,584, 5,624,802, 5,635,352, 5,594,118, 5,359,100, 5,124,246 and 5,681,697, all of which are hereby incorporated by reference. In this embodiment, in general, the target nucleic acid is prepared as outlined above, and then added to the biochip comprising a plurality of nucleic acid probes, under conditions that allow the formation of a hybridization complex.


[0235] A variety of hybridization conditions may be used in the present invention, including high, moderate and low stringency conditions as outlined above. The assays are generally run under stringency conditions which allow formation of the label probe hybridization complex only in the presence of target. Stringency can be controlled by altering a step parameter that is a thermodynamic variable, including, but not limited to, temperature, formamide concentration, salt concentration, chaotropic salt concentration, pH, organic solvent concentration, etc.


[0236] These parameters may also be used to control non-specific binding, as is generally outlined in U.S. Pat. No. 5,681,697. Thus it may be desirable to perform certain steps at higher stringency conditions to reduce non-specific binding.


[0237] The reactions outlined herein may be accomplished in a variety of ways. Components of the reaction may be added simultaneously, or sequentially, in different orders, with preferred embodiments outlined below. In addition, the reaction may include a variety of other reagents. These include salts, buffers, neutral proteins, e.g. albumin, detergents, etc. which may be used to facilitate optimal hybridization and detection, and/or reduce non-specific or background interactions. Reagents that otherwise improve the efficiency of the assay, such as protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may also be used as appropriate, depending on the sample preparation methods and purity of the target.


[0238] The assay data are analyzed to determine the expression levels, and changes in expression levels as between states, of individual genes, forming a gene expression profile.
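
The analysis step above can be sketched as follows. This is a minimal illustration, assuming that expression levels are available as per-gene signal intensities for two states; the gene names, the pseudocount, and the twofold threshold are hypothetical choices, not values given in this specification.

```python
import math


def expression_profile(state_a, state_b, pseudocount=1.0):
    """Return {gene: log2 ratio of state_b to state_a} for genes measured in both.

    A pseudocount (an assumption of this sketch) avoids division by zero
    for genes with no detectable signal.
    """
    profile = {}
    for gene in state_a.keys() & state_b.keys():
        ratio = (state_b[gene] + pseudocount) / (state_a[gene] + pseudocount)
        profile[gene] = math.log2(ratio)
    return profile


def changed_genes(profile, min_abs_log2=1.0):
    """List genes whose expression differs by at least the given log2 ratio."""
    return sorted(g for g, value in profile.items() if abs(value) >= min_abs_log2)


# Hypothetical signal intensities for normal and angiogenic tissue.
normal = {"geneA": 99.0, "geneB": 49.0, "geneC": 9.0}
angiogenic = {"geneA": 399.0, "geneB": 49.0, "geneC": 4.0}
profile = expression_profile(normal, angiogenic)
differential = changed_genes(profile)
```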


[0239] Screens are performed to identify modulators of the angiogenesis phenotype. In one embodiment, screening is performed to identify modulators that can induce or suppress a particular expression profile, thus preferably generating the associated phenotype. In another embodiment, e.g., for diagnostic applications, having identified differentially expressed genes important in a particular state, screens can be performed to identify modulators that alter expression of individual genes. In another embodiment, screening is performed to identify modulators that alter a biological function of the expression product of a differentially expressed gene. Again, having identified the importance of a gene in a particular state, screens are performed to identify agents that bind and/or modulate the biological activity of the gene product.


[0240] In addition screens can be done for genes that are induced in response to a candidate agent. After identifying a modulator based upon its ability to suppress an angiogenesis expression pattern leading to a normal expression pattern, or to modulate a single angiogenesis gene expression profile so as to mimic the expression of the gene from normal tissue, a screen as described above can be performed to identify genes that are specifically modulated in response to the agent. Comparing expression profiles between normal tissue and agent treated angiogenesis tissue reveals genes that are not expressed in normal tissue or angiogenesis tissue, but are expressed in agent treated tissue. These agent-specific sequences can be identified and used by methods described herein for angiogenesis genes or proteins. In particular these sequences and the proteins they encode find use in marking or identifying agent treated cells. In addition, antibodies can be raised against the agent induced proteins and used to target novel therapeutics to the treated angiogenesis tissue sample.
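
The three-way comparison described above amounts to a set difference. The sketch below assumes each profile is a mapping from gene name to signal intensity and that a gene counts as expressed when its signal meets a detection threshold; the threshold and gene names are hypothetical.

```python
def expressed(profile, threshold):
    """Return the set of genes whose signal meets the detection threshold."""
    return {gene for gene, level in profile.items() if level >= threshold}


def agent_specific_genes(normal, angiogenic, agent_treated, threshold=10.0):
    """Genes expressed in agent-treated tissue but in neither reference tissue."""
    return (
        expressed(agent_treated, threshold)
        - expressed(normal, threshold)
        - expressed(angiogenic, threshold)
    )


# Hypothetical intensities; geneX is detected only after agent treatment.
normal = {"geneA": 50.0, "geneX": 1.0, "geneY": 2.0}
angiogenic = {"geneA": 200.0, "geneX": 2.0, "geneY": 40.0}
treated = {"geneA": 60.0, "geneX": 35.0, "geneY": 30.0}
induced = agent_specific_genes(normal, angiogenic, treated)
```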


[0241] Thus, in one embodiment, a test compound is administered to a population of angiogenic cells, that have an associated angiogenesis expression profile. By “administration” or “contacting” herein is meant that the candidate agent is added to the cells in such a manner as to allow the agent to act upon the cell, whether by uptake and intracellular action, or by action at the cell surface. In some embodiments, nucleic acid encoding a proteinaceous candidate agent (i.e., a peptide) may be put into a viral construct such as an adenoviral or retroviral construct, and added to the cell, such that expression of the peptide agent is accomplished, e.g., PCT US97/01019. Regulatable gene therapy systems can also be used.


[0242] Once the test compound has been administered to the cells, the cells can be washed if desired and are allowed to incubate under preferably physiological conditions for some period of time. The cells are then harvested and a new gene expression profile is generated, as outlined herein.


[0243] Thus, for example, angiogenesis tissue may be screened for agents that modulate, e.g., induce or suppress the angiogenesis phenotype. A change in at least one gene, preferably many, of the expression profile indicates that the agent has an effect on angiogenesis activity. By defining such a signature for the angiogenesis phenotype, screens for new drugs that alter the phenotype can be devised. With this approach, the drug target need not be known and need not be represented in the original expression screening platform, nor does the level of transcript for the target protein need to change.


[0244] Measurement of angiogenesis polypeptide activity, or of angiogenesis or the angiogenic phenotype, can be performed using a variety of assays. For example, the effects of the test compounds upon the function of the angiogenesis polypeptides can be measured by examining parameters described above. A suitable physiological change that affects activity can be used to assess the influence of a test compound on the polypeptides of this invention. When the functional consequences are determined using intact cells or animals, one can also measure a variety of effects such as, in the case of angiogenesis associated with tumors, tumor growth, neovascularization, hormone release, transcriptional changes to both known and uncharacterized genetic markers (e.g., northern blots), changes in cell metabolism such as cell growth or pH changes, and changes in intracellular second messengers such as cGMP. In the assays of the invention, mammalian angiogenesis polypeptide is typically used, e.g., mouse, preferably human.


[0245] A variety of angiogenesis assays are known to those of skill in the art. Various models have been employed to evaluate angiogenesis (e.g., Croix et al., Science 289:1197-1202, 2000 and Kahn et al., Amer. J. Pathol. 156:1887-1900). Assessment of angiogenesis in the presence of a potential modulator of angiogenesis can be performed using cell-culture-based angiogenesis assays, e.g., endothelial cell tube formation assays, as well as other bioassays such as the chick CAM assay, the mouse corneal assay, and assays measuring the effect of administering potential modulators on implanted tumors. The chick CAM assay is described by O'Reilly, et al. Cell 79: 315-328, 1994. Briefly, 3 day old chicken embryos with intact yolks are separated from the egg and placed in a petri dish. After 3 days of incubation, a methylcellulose disc containing the protein to be tested is applied to the CAM of individual embryos. After about 48 hours of incubation, the embryos and CAMs are observed to determine whether endothelial growth has been inhibited. The mouse corneal assay involves implanting a growth factor-containing pellet, along with another pellet containing the suspected endothelial growth inhibitor, in the cornea of a mouse and observing the pattern of capillaries that are elaborated in the cornea. Angiogenesis can also be measured by determining the extent of neovascularization of a tumor. For example, carcinoma cells can be subcutaneously inoculated into athymic nude mice and tumor growth then monitored. The cancer cells are treated with an angiogenesis inhibitor, such as an antibody, or other compound that is exogenously administered, or can be transfected prior to inoculation with a polynucleotide inhibitor of angiogenesis. Immunoassays using endothelial cell-specific antibodies are typically used to stain for vascularization of tumor and the number of vessels in the tumor.


[0246] Assays to identify compounds with modulating activity can be performed in vitro. For example, an angiogenesis polypeptide is first contacted with a potential modulator and incubated for a suitable amount of time, e.g., from 0.5 to 48 hours. In one embodiment, the angiogenesis polypeptide levels are determined in vitro by measuring the level of protein or mRNA. The level of protein is measured using immunoassays such as western blotting, ELISA and the like with an antibody that selectively binds to the angiogenesis polypeptide or a fragment thereof. For measurement of mRNA, amplification, e.g., using PCR, LCR, or hybridization assays, e.g., northern hybridization, RNAse protection, dot blotting, are preferred. The level of protein or mRNA is detected using directly or indirectly labeled detection agents, e.g., fluorescently or radioactively labeled nucleic acids, radioactively or enzymatically labeled antibodies, and the like, as described herein.


[0247] Alternatively, a reporter gene system can be devised using the angiogenesis protein promoter operably linked to a reporter gene such as luciferase, green fluorescent protein, CAT, or β-gal. The reporter construct is typically transfected into a cell. After treatment with a potential modulator, the amount of reporter gene transcription, translation, or activity is measured according to standard techniques known to those of skill in the art.


[0248] In a preferred embodiment, as outlined above, screens may be done on individual genes and gene products (proteins). That is, having identified a particular differentially expressed gene as important in a particular state, screening of modulators of the expression of the gene or the gene product itself can be done. The gene products of differentially expressed genes are sometimes referred to herein as “angiogenesis proteins”. In preferred embodiments the angiogenesis protein comprises a sequence shown in Table 2. The angiogenesis protein may be a fragment, or alternatively, may be the full length protein corresponding to a fragment shown herein.


[0249] Preferably, the angiogenesis protein is a fragment of approximately 14 to 24 amino acids long. More preferably the fragment is a soluble fragment. In one embodiment an angiogenesis protein is conjugated to an immunogenic agent or BSA.


[0250] In one embodiment, screening for modulators of expression of specific genes is performed. Typically, the expression of only one or a few genes is evaluated. In another embodiment, screens are designed to first find compounds that bind to differentially expressed proteins. These compounds are then evaluated for the ability to modulate differentially expressed activity. Moreover, once initial candidate compounds are identified, variants can be further screened to better evaluate structure-activity relationships.


[0251] In a preferred embodiment, binding assays are done. In general, purified or isolated gene product is used; that is, the gene products of one or more differentially expressed nucleic acids are made. For example, antibodies are generated to the protein gene products, and standard immunoassays are run to determine the amount of protein present. Alternatively, cells comprising the angiogenesis proteins can be used in the assays.


[0252] Thus, in a preferred embodiment, the methods comprise combining an angiogenesis protein and a candidate compound, and determining the binding of the compound to the angiogenesis protein. Preferred embodiments utilize the human angiogenesis protein, although other mammalian proteins may also be used, for example for the development of animal models of human disease. In some embodiments, as outlined herein, variant or derivative angiogenesis proteins may be used.


[0253] Generally, in a preferred embodiment of the methods herein, the angiogenesis protein or the candidate agent is non-diffusably bound to an insoluble support having isolated sample receiving areas (e.g. a microtiter plate, an array, etc.). The insoluble supports may be made of any composition to which the compositions can be bound, is readily separated from soluble material, and is otherwise compatible with the overall method of screening. The surface of such supports may be solid or porous and of any convenient shape. Examples of suitable insoluble supports include microtiter plates, arrays, membranes and beads. These are typically made of glass, plastic (e.g., polystyrene), polysaccharides, nylon or nitrocellulose, teflon™, etc. Microtiter plates and arrays are especially convenient because a large number of assays can be carried out simultaneously, using small amounts of reagents and samples. The particular manner of binding of the composition is not crucial so long as it is compatible with the reagents and overall methods of the invention, maintains the activity of the composition and is nondiffusable. Preferred methods of binding include the use of antibodies (which do not sterically block either the ligand binding site or activation sequence when the protein is bound to the support), direct binding to “sticky” or ionic supports, chemical crosslinking, the synthesis of the protein or agent on the surface, etc. Following binding of the protein or agent, excess unbound material is removed by washing. The sample receiving areas may then be blocked through incubation with bovine serum albumin (BSA), casein or other innocuous protein or other moiety.


[0254] In a preferred embodiment, the angiogenesis protein is bound to the support, and a test compound is added to the assay. Alternatively, the candidate agent is bound to the support and the angiogenesis protein is added. Novel binding agents include specific antibodies, non-natural binding agents identified in screens of chemical libraries, peptide analogs, etc. Of particular interest are screening assays for agents that have a low toxicity for human cells. A wide variety of assays may be used for this purpose, including labeled in vitro protein-protein binding assays, electrophoretic mobility shift assays, immunoassays for protein binding, functional assays (phosphorylation assays, etc.) and the like.


[0255] The determination of the binding of the test modulating compound to the angiogenesis protein may be done in a number of ways. In a preferred embodiment, the compound is labelled, and binding determined directly, e.g., by attaching all or a portion of the angiogenesis protein to a solid support, adding a labelled candidate agent (e.g. a fluorescent label), washing off excess reagent, and determining whether the label is present on the solid support. Various blocking and washing steps may be utilized as appropriate.


[0256] By “labeled” herein is meant that the compound is either directly or indirectly labeled with a label which provides a detectable signal, e.g. radioisotope, fluorescers, enzyme, antibodies, particles such as magnetic particles, chemiluminescers, or specific binding molecules, etc. Specific binding molecules include pairs, such as biotin and streptavidin, digoxin and antidigoxin, etc. For the specific binding members, the complementary member would normally be labeled with a molecule which provides for detection, in accordance with known procedures, as outlined above. The label can directly or indirectly provide a detectable signal.


[0257] In some embodiments, only one of the components is labeled, e.g., the proteins (or proteinaceous candidate compounds) can be labeled. Alternatively, more than one component can be labeled with different labels, e.g., 125I for the proteins and a fluorophore for the compound. Proximity reagents, e.g., quenching or energy transfer reagents are also useful.


[0258] In one embodiment, the binding of the test compound is determined by competitive binding assay. The competitor is a binding moiety known to bind to the target molecule (i.e. an angiogenesis protein), such as an antibody, peptide, binding partner, ligand, etc. Under certain circumstances, there may be competitive binding between the compound and the binding moiety, with the binding moiety displacing the compound. In one embodiment, the test compound is labeled. Either the compound, or the competitor, or both, is added first to the protein for a time sufficient to allow binding, if present. Incubations may be performed at a temperature which facilitates optimal activity, typically between 4 and 40° C. Incubation periods are typically optimized, e.g., to facilitate rapid high throughput screening. Typically between 0.1 and 1 hour will be sufficient. Excess reagent is generally removed or washed away. The second component is then added, and the presence or absence of the labeled component is followed, to indicate binding.


[0259] In a preferred embodiment, the competitor is added first, followed by the test compound. Displacement of the competitor is an indication that the test compound is binding to the angiogenesis protein and thus is capable of binding to, and potentially modulating, the activity of the angiogenesis protein. In this embodiment, either component can be labeled. Thus, for example, if the competitor is labeled, the presence of label in the wash solution indicates displacement by the agent. Alternatively, if the test compound is labeled, the presence of the label on the support indicates displacement.


[0260] In an alternative embodiment, the test compound is added first, with incubation and washing, followed by the competitor. The absence of binding by the competitor may indicate that the test compound is bound to the angiogenesis protein with a higher affinity. Thus, if the test compound is labeled, the presence of the label on the support, coupled with a lack of competitor binding, may indicate that the test compound is capable of binding to the angiogenesis protein.


[0261] In a preferred embodiment, the methods comprise differential screening to identify agents that are capable of modulating the activity of the angiogenesis proteins. In this embodiment, the methods comprise combining an angiogenesis protein and a competitor in a first sample. A second sample comprises a test compound, an angiogenesis protein, and a competitor. The binding of the competitor is determined for both samples, and a change, or difference in binding between the two samples indicates the presence of an agent capable of binding to the angiogenesis protein and potentially modulating its activity. That is, if the binding of the competitor is different in the second sample relative to the first sample, the agent is capable of binding to the angiogenesis protein.
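
The two-sample comparison above can be sketched numerically. This is a minimal illustration, assuming the competitor carries the label and that its bound signal is read in arbitrary units; the 20% cutoff and the function names are hypothetical and would in practice be set from the assay's own variability.

```python
def competitor_binding_change(signal_without_test, signal_with_test):
    """Fractional change in bound competitor signal caused by the test compound.

    signal_without_test: first sample (angiogenesis protein + competitor).
    signal_with_test: second sample (test compound + protein + competitor).
    """
    if signal_without_test <= 0:
        raise ValueError("reference signal must be positive")
    return (signal_with_test - signal_without_test) / signal_without_test


def is_candidate_binder(signal_without_test, signal_with_test,
                        min_fractional_change=0.2):
    """Flag a test compound when competitor binding differs between samples.

    The cutoff is an assumed value, not one stated in this document.
    """
    change = competitor_binding_change(signal_without_test, signal_with_test)
    return abs(change) >= min_fractional_change


# Hypothetical readings: competitor signal drops from 1000 to 400 units.
change = competitor_binding_change(1000.0, 400.0)
flagged = is_candidate_binder(1000.0, 400.0)
```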


[0262] Alternatively, differential screening is used to identify drug candidates that bind to the native angiogenesis protein, but cannot bind to modified angiogenesis proteins. The structure of the angiogenesis protein may be modeled, and used in rational drug design to synthesize agents that interact with that site. Drug candidates that affect the activity of an angiogenesis protein are also identified by screening drugs for the ability to either enhance or reduce the activity of the protein.


[0263] Positive controls and negative controls may be used in the assays. Preferably control and test samples are performed in at least triplicate to obtain statistically significant results. Incubation of all samples is for a time sufficient for the binding of the agent to the protein. Following incubation, samples are washed free of non-specifically bound material and the amount of bound, generally labeled agent determined. For example, where a radiolabel is employed, the samples may be counted in a scintillation counter to determine the amount of bound compound.
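
The handling of replicate counts can be sketched as follows, assuming scintillation counts (counts per minute) from at least triplicate test and negative-control samples. Subtracting the negative-control mean to estimate specific binding is an assumption of this sketch, and the function names are hypothetical.

```python
from statistics import mean, stdev


def summarize(replicates):
    """Return (mean, sample standard deviation) for at least three replicates."""
    if len(replicates) < 3:
        raise ValueError("at least triplicate measurements are expected")
    return mean(replicates), stdev(replicates)


def specific_binding(test_counts, negative_control_counts):
    """Mean test signal minus mean negative-control signal (assumed correction)."""
    test_mean, _ = summarize(test_counts)
    control_mean, _ = summarize(negative_control_counts)
    return test_mean - control_mean


# Hypothetical counts per minute from triplicate samples.
test_cpm = [1000.0, 1100.0, 1200.0]
control_cpm = [100.0, 110.0, 120.0]
test_mean, test_sd = summarize(test_cpm)
bound = specific_binding(test_cpm, control_cpm)
```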


[0264] A variety of other reagents may be included in the screening assays. These include reagents like salts, neutral proteins, e.g. albumin, detergents, etc. which may be used to facilitate optimal protein-protein binding and/or reduce non-specific or background interactions. Also reagents that otherwise improve the efficiency of the assay, such as protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may be used. The mixture of components may be added in an order that provides for the requisite binding.


[0265] In a preferred embodiment, the invention provides methods for screening for a compound capable of modulating the activity of an angiogenesis protein. The methods comprise adding a test compound, as defined above, to a cell comprising angiogenesis proteins. Preferred cell types include almost any cell. The cells contain a recombinant nucleic acid that encodes an angiogenesis protein. In a preferred embodiment, a library of candidate agents is tested on a plurality of cells.


[0266] In one aspect, the assays are evaluated in the presence or absence of, or with previous or subsequent exposure to, physiological signals, for example hormones, antibodies, peptides, antigens, cytokines, growth factors, action potentials, pharmacological agents including chemotherapeutics, radiation, carcinogenics, or other cells (i.e. cell-cell contacts). In another example, the determinations are made at different stages of the cell cycle process.


[0267] In this way, compounds that modulate angiogenesis agents are identified. Compounds with pharmacological activity are able to enhance or interfere with the activity of the angiogenesis protein. Once identified, similar structures are evaluated to identify critical structural features of the compound.


[0268] In one embodiment, a method of inhibiting angiogenic cell division is provided. The method comprises administration of an angiogenesis inhibitor. In another embodiment, a method of inhibiting angiogenesis is provided. The method comprises administration of an angiogenesis inhibitor. In a further embodiment, methods of treating cells or individuals with angiogenesis are provided. The method comprises administration of an angiogenesis inhibitor.


[0269] In one embodiment, an angiogenesis inhibitor is an antibody as discussed above. In another embodiment, the angiogenesis inhibitor is an antisense molecule.


[0270] Polynucleotide Modulators of Angiogenesis


[0271] Antisense Polynucleotides


[0272] In certain embodiments, the activity of an angiogenesis-associated protein is downregulated, or entirely inhibited, by the use of antisense polynucleotide, i.e., a nucleic acid complementary to, and which can preferably hybridize specifically to, a coding mRNA nucleic acid sequence, e.g. in angiogenesis protein mRNA, or a subsequence thereof. Binding of the antisense polynucleotide to the mRNA reduces the translation and/or stability of the mRNA.


[0273] In the context of this invention, antisense polynucleotides can comprise naturally-occurring nucleotides, or synthetic species formed from naturally-occurring subunits or their close homologs. Antisense polynucleotides may also have altered sugar moieties or inter-sugar linkages. Exemplary among these are the phosphorothioate and other sulfur containing species which are known for use in the art. Analogs are comprehended by this invention so long as they function effectively to hybridize with the angiogenesis protein mRNA. See, e.g., Isis Pharmaceuticals, Carlsbad, CA; Sequitor, Inc., Natick, MA.


[0274] Such antisense polynucleotides can readily be synthesized using recombinant means, or can be synthesized in vitro. Equipment for such synthesis is sold by several vendors, including Applied Biosystems. The preparation of other oligonucleotides such as phosphorothioates and alkylated derivatives is also well known to those of skill in the art.


[0275] Antisense molecules as used herein include antisense or sense oligonucleotides. Sense oligonucleotides can, e.g., be employed to block transcription by binding to the anti-sense strand. The antisense and sense oligonucleotides comprise a single-stranded nucleic acid sequence (either RNA or DNA) capable of binding to target mRNA (sense) or DNA (antisense) sequences for angiogenesis molecules. A preferred antisense molecule is for an angiogenesis sequence in Table 1, or for a ligand or activator thereof. Antisense or sense oligonucleotides, according to the present invention, comprise a fragment generally at least about 14 nucleotides, preferably from about 14 to 30 nucleotides. The ability to derive an antisense or a sense oligonucleotide, based upon a cDNA sequence encoding a given protein is described in, for example, Stein and Cohen (Cancer Res. 48:2659, 1988) and van der Krol et al. (BioTechniques 6:958, 1988).
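
Deriving an antisense oligonucleotide from a cDNA sequence can be sketched as taking the reverse complement of a window of the sense strand. The example below is a minimal illustration, assuming a DNA oligonucleotide and the 14 to 30 nucleotide length range stated above; the input sequence and the function name are hypothetical, and no attempt is made to choose an effective target site.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(cdna_sense, start, length=20):
    """Return the DNA antisense oligo (5'->3') for a window of the sense cDNA.

    start is a 0-based offset into the sense strand; length must fall in the
    14-30 nucleotide range described in the text.
    """
    if not 14 <= length <= 30:
        raise ValueError("length should be between 14 and 30 nucleotides")
    window = cdna_sense.upper()[start:start + length]
    if len(window) != length:
        raise ValueError("window extends past the end of the sequence")
    if set(window) - set("ACGT"):
        raise ValueError("sequence contains characters other than A, C, G, T")
    # Complement each base, then reverse so the result reads 5'->3'.
    return window.translate(_COMPLEMENT)[::-1]


# Hypothetical 14-nucleotide sense sequence, not taken from Table 1.
oligo = antisense_oligo("ATGGCCAAGTTCGA", start=0, length=14)
```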


[0276] Ribozymes


[0277] In addition to antisense polynucleotides, ribozymes can be used to target and inhibit transcription of angiogenesis-associated nucleotide sequences. A ribozyme is an RNA molecule that catalytically cleaves other RNA molecules. Different kinds of ribozymes have been described, including group I ribozymes, hammerhead ribozymes, hairpin ribozymes, RNase P, and axhead ribozymes (see, e.g., Castanotto et al. (1994) Adv. in Pharmacology 25: 289-317 for a general review of the properties of different ribozymes).


[0278] The general features of hairpin ribozymes are described, e.g., in Hampel et al. (1990) Nucl. Acids Res. 18: 299-304; Hampel et al. (1990) European Patent Publication No. 0 360 257; U.S. Pat. No. 5,254,678. Methods of preparing are well known to those of skill in the art (see, e.g., Wong-Staal et al., WO 94/26877; Ojwang et al. (1993) Proc. Natl. Acad. Sci. USA 90: 6340-6344; Yamada et al. (1994) Human Gene Therapy 1: 39-45; Leavitt et al. (1995) Proc. Natl. Acad. Sci. USA 92: 699-703; Leavitt et al. (1994) Human Gene Therapy 5: 1151-120; and Yamada et al. (1994) Virology 205: 121-126).


[0279] Polynucleotide modulators of angiogenesis may be introduced into a cell containing the target nucleotide sequence by formation of a conjugate with a ligand binding molecule, as described in WO 91/04753. Suitable ligand binding molecules include, but are not limited to, cell surface receptors, growth factors, other cytokines, or other ligands that bind to cell surface receptors. Preferably, conjugation of the ligand binding molecule does not substantially interfere with the ability of the ligand binding molecule to bind to its corresponding molecule or receptor, or block entry of the sense or antisense oligonucleotide or its conjugated version into the cell. Alternatively, a polynucleotide modulator of angiogenesis may be introduced into a cell containing the target nucleic acid sequence, e.g., by formation of a polynucleotide-lipid complex, as described in WO 90/10448. It is understood that antisense molecules or knock out and knock in models may also be used in screening assays as discussed above, in addition to methods of treatment.


[0280] Thus, in one embodiment, methods of modulating angiogenesis in cells or organisms are provided. In one embodiment, the methods comprise administering to a cell an anti-angiogenesis antibody that reduces or eliminates the biological activity of an endogenous angiogenesis protein. Alternatively, the methods comprise administering to a cell or organism a recombinant nucleic acid encoding an angiogenesis protein. This may be accomplished in any number of ways. In a preferred embodiment, for example when the angiogenesis sequence is down-regulated in angiogenesis, such state may be reversed by increasing the amount of angiogenesis gene product in the cell. This can be accomplished, e.g., by overexpressing the endogenous angiogenesis gene or administering a gene encoding the angiogenesis sequence, using known gene-therapy techniques, for example. In a preferred embodiment, the gene therapy techniques include the incorporation of the exogenous gene using enhanced homologous recombination (EHR), for example as described in PCT/US93/03868, hereby incorporated by reference in its entirety. Alternatively, for example when the angiogenesis sequence is up-regulated in angiogenesis, the activity of the endogenous angiogenesis gene is decreased, for example by the administration of an angiogenesis antisense nucleic acid.


[0281] In one embodiment, the angiogenesis proteins of the present invention may be used to generate polyclonal and monoclonal antibodies to angiogenesis proteins. Similarly, the angiogenesis proteins can be coupled, using standard technology, to affinity chromatography columns. These columns may then be used to purify angiogenesis antibodies useful for production, diagnostic, or therapeutic purposes. In a preferred embodiment, the antibodies are generated to epitopes unique to an angiogenesis protein; that is, the antibodies show little or no cross-reactivity to other proteins. The angiogenesis antibodies may be coupled to standard affinity chromatography columns and used to purify angiogenesis proteins. The antibodies may also be used as blocking polypeptides, as outlined above, since they will specifically bind to the angiogenesis protein.


[0282] Methods of Identifying Variant Angiogenesis-associated Sequences


[0283] Without being bound by theory, expression of various angiogenesis sequences is correlated with angiogenesis. Accordingly, disorders based on mutant or variant angiogenesis genes may be determined. In one embodiment, the invention provides methods for identifying cells containing variant angiogenesis genes, e.g., determining all or part of the sequence of at least one endogenous angiogenesis gene in a cell. This may be accomplished using any number of sequencing techniques. In a preferred embodiment, the invention provides methods of identifying the angiogenesis genotype of an individual, e.g., determining all or part of the sequence of at least one angiogenesis gene of the individual. This is generally done in at least one tissue of the individual, and may include the evaluation of a number of tissues or different samples of the same tissue. The method may include comparing the sequence of the sequenced angiogenesis gene to a known angiogenesis gene, i.e., a wild-type gene.


[0284] The sequence of all or part of the angiogenesis gene can then be compared to the sequence of a known angiogenesis gene to determine if any differences exist. This can be done using any number of known homology programs, such as Bestfit, etc. In a preferred embodiment, the presence of a difference in the sequence between the angiogenesis gene of the patient and the known angiogenesis gene correlates with a disease state or a propensity for a disease state, as outlined herein.
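
The comparison step can be illustrated with a deliberately simple sketch. It assumes the patient and reference sequences have already been aligned and are of equal length, so it reports substitutions only; it is not a substitute for a homology program such as Bestfit, and the sequences and function name are hypothetical.

```python
def sequence_differences(sample, reference):
    """List (position, reference_base, sample_base) for each mismatch.

    Positions are 1-based. Both sequences are assumed to be pre-aligned and
    of equal length; insertions and deletions are not handled here.
    """
    if len(sample) != len(reference):
        raise ValueError("sequences must be pre-aligned to equal length")
    differences = []
    pairs = zip(sample.upper(), reference.upper())
    for index, (sample_base, reference_base) in enumerate(pairs):
        if sample_base != reference_base:
            differences.append((index + 1, reference_base, sample_base))
    return differences


# Hypothetical short sequences differing at position 4.
variants = sequence_differences("ATGCCT", "ATGACT")
```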


[0285] In a preferred embodiment, the angiogenesis genes are used as probes to determine the number of copies of the angiogenesis gene in the genome.


[0286] In another preferred embodiment, the angiogenesis genes are used as probes to determine the chromosomal localization of the angiogenesis genes. Information such as chromosomal localization finds use in providing a diagnosis or prognosis in particular when chromosomal abnormalities such as translocations, and the like are identified in the angiogenesis gene locus.


[0287] Administration of Pharmaceutical and Vaccine Compositions


[0288] In one embodiment, a therapeutically effective dose of an angiogenesis protein or modulator thereof, is administered to a patient. By “therapeutically effective dose” herein is meant a dose that produces effects for which it is administered. The exact dose will depend on the purpose of the treatment, and will be ascertainable by one skilled in the art using known techniques (e.g., Ansel et al., Pharmaceutical Dosage Forms and Drug Delivery, Lippincott, Williams & Wilkins Publishers, ISBN:0683305727; Lieberman (1992) Pharmaceutical Dosage Forms (vols. 1-3), Dekker, ISBN 0824770846, 082476918X, 0824712692, 0824716981; Lloyd (1999) The Art, Science and Technology of Pharmaceutical Compounding, Amer. Pharmaceutical Assn, ISBN 0917330889; and Pickar (1999) Dosage Calculations, Delmar Pub, ISBN 0766805042). As is known in the art, adjustments for angiogenesis degradation, systemic versus localized delivery, and rate of new protease synthesis, as well as the age, body weight, general health, sex, diet, time of administration, drug interaction and the severity of the condition may be necessary, and will be ascertainable with routine experimentation by those skilled in the art.


[0289] A “patient” for the purposes of the present invention includes both humans and other animals, particularly mammals. Thus the methods are applicable to both human therapy and veterinary applications. In the preferred embodiment the patient is a mammal, preferably a primate, and in the most preferred embodiment the patient is human.


[0290] The administration of the angiogenesis proteins and modulators thereof of the present invention can be done in a variety of ways as discussed above, including, but not limited to, orally, subcutaneously, intravenously, intranasally, transdermally, intraperitoneally, intramuscularly, intrapulmonary, vaginally, rectally, or intraocularly. In some instances, for example, in the treatment of wounds and inflammation, the angiogenesis proteins and modulators may be directly applied as a solution or spray.


[0291] The pharmaceutical compositions of the present invention comprise an angiogenesis protein in a form suitable for administration to a patient. In the preferred embodiment, the pharmaceutical compositions are in a water soluble form, such as being present as pharmaceutically acceptable salts, which is meant to include both acid and base addition salts. “Pharmaceutically acceptable acid addition salt” refers to those salts that retain the biological effectiveness of the free bases and that are not biologically or otherwise undesirable, formed with inorganic acids such as hydrochloric acid, hydrobromic acid, sulfuric acid, nitric acid, phosphoric acid and the like, and organic acids such as acetic acid, propionic acid, glycolic acid, pyruvic acid, oxalic acid, maleic acid, malonic acid, succinic acid, fumaric acid, tartaric acid, citric acid, benzoic acid, cinnamic acid, mandelic acid, methanesulfonic acid, ethanesulfonic acid, p-toluenesulfonic acid, salicylic acid and the like. “Pharmaceutically acceptable base addition salts” include those derived from inorganic bases such as sodium, potassium, lithium, ammonium, calcium, magnesium, iron, zinc, copper, manganese, aluminum salts and the like. Particularly preferred are the ammonium, potassium, sodium, calcium, and magnesium salts. Salts derived from pharmaceutically acceptable organic non-toxic bases include salts of primary, secondary, and tertiary amines, substituted amines including naturally occurring substituted amines, cyclic amines and basic ion exchange resins, such as isopropylamine, trimethylamine, diethylamine, triethylamine, tripropylamine, and ethanolamine.


[0292] The pharmaceutical compositions may also include one or more of the following: carrier proteins such as serum albumin; buffers; fillers such as microcrystalline cellulose, lactose, corn and other starches; binding agents; sweeteners and other flavoring agents; coloring agents; and polyethylene glycol.


[0293] The pharmaceutical compositions can be administered in a variety of unit dosage forms depending upon the method of administration. For example, unit dosage forms suitable for oral administration include, but are not limited to, powder, tablets, pills, capsules and lozenges. It is recognized that angiogenesis protein modulators (e.g. antibodies, antisense constructs, ribozymes, small organic molecules, etc.) when administered orally, should be protected from digestion. This is typically accomplished either by complexing the molecule(s) with a composition to render it resistant to acidic and enzymatic hydrolysis, or by packaging the molecule(s) in an appropriately resistant carrier, such as a liposome or a protection barrier. Means of protecting agents from digestion are well known in the art.


[0294] The compositions for administration will commonly comprise an angiogenesis protein modulator dissolved in a pharmaceutically acceptable carrier, preferably an aqueous carrier. A variety of aqueous carriers can be used, e.g., buffered saline and the like. These solutions are sterile and generally free of undesirable matter. These compositions may be sterilized by conventional, well known sterilization techniques. The compositions may contain pharmaceutically acceptable auxiliary substances as required to approximate physiological conditions such as pH adjusting and buffering agents, toxicity adjusting agents and the like, for example, sodium acetate, sodium chloride, potassium chloride, calcium chloride, sodium lactate and the like. The concentration of active agent in these formulations can vary widely, and will be selected primarily based on fluid volumes, viscosities, body weight and the like in accordance with the particular mode of administration selected and the patient's needs (e.g., Remington's Pharmaceutical Science, 15th ed., Mack Publishing Company, Easton, Pa. (1980) and Goodman and Gilman, The Pharmacological Basis of Therapeutics (Hardman, J. G., Limbird, L. E., Molinoff, P. B., Ruddon, R. W., and Gilman, A. G., eds.), The McGraw-Hill Companies, Inc., 1996).


[0295] Thus, a typical pharmaceutical composition for intravenous administration would be about 0.1 to 10 mg per patient per day. Dosages from 0.1 up to about 100 mg per patient per day may be used, particularly when the drug is administered to a secluded site and not into the blood stream, such as into a body cavity or into a lumen of an organ. Substantially higher dosages are possible in topical administration. Actual methods for preparing parenterally administrable compositions will be known or apparent to those skilled in the art, e.g., Remington's Pharmaceutical Science and Goodman and Gilman, The Pharmacological Basis of Therapeutics, supra.


[0296] The compositions containing modulators of angiogenesis proteins can be administered for therapeutic or prophylactic treatments. In therapeutic applications, compositions are administered to a patient suffering from a disease (e.g., a cancer) in an amount sufficient to cure or at least partially arrest the disease and its complications. An amount adequate to accomplish this is defined as a “therapeutically effective dose.” Amounts effective for this use will depend upon the severity of the disease and the general state of the patient's health. Single or multiple administrations of the compositions may be administered depending on the dosage and frequency as required and tolerated by the patient. In any event, the composition should provide a sufficient quantity of the agents of this invention to effectively treat the patient. An amount of modulator that is capable of preventing or slowing the development of cancer in a mammal is referred to as a “prophylactically effective dose.” The particular dose required for a prophylactic treatment will depend upon the medical condition and history of the mammal, the particular cancer being prevented, as well as other factors such as age, weight, gender, administration route, efficiency, etc. Such prophylactic treatments may be used, e.g., in a mammal who has previously had cancer to prevent a recurrence of the cancer, or in a mammal who is suspected of having a significant likelihood of developing cancer.


[0297] It will be appreciated that the present angiogenesis protein-modulating compounds can be administered alone or in combination with additional angiogenesis modulating compounds or with other therapeutic agents, e.g., other anti-cancer agents or treatments.


[0298] In numerous embodiments, one or more nucleic acids, e.g., polynucleotides comprising nucleic acid sequences set forth in Table 1, such as antisense polynucleotides or ribozymes, will be introduced into cells, in vitro or in vivo. The present invention provides methods, reagents, vectors, and cells useful for expression of angiogenesis-associated polypeptides and nucleic acids using in vitro (cell-free), ex vivo or in vivo (cell or organism-based) recombinant expression systems.


[0299] The particular procedure used to introduce the nucleic acids into a host cell for expression of a protein or nucleic acid is application specific. Many procedures for introducing foreign nucleotide sequences into host cells may be used. These include the use of calcium phosphate transfection, spheroplasts, electroporation, liposomes, microinjection, plasmid vectors, viral vectors and any of the other well known methods for introducing cloned genomic DNA, cDNA, synthetic DNA or other foreign genetic material into a host cell (see, e.g., Berger and Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology volume 152, Academic Press, Inc., San Diego, Calif. (Berger); F. M. Ausubel et al., eds., Current Protocols, a joint venture between Greene Publishing Associates, Inc. and John Wiley & Sons, Inc. (supplemented through 1999); and Sambrook et al., Molecular Cloning: A Laboratory Manual (2nd Ed.), Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., 1989).


[0300] In a preferred embodiment, angiogenesis proteins and modulators are administered as therapeutic agents, and can be formulated as outlined above. Similarly, angiogenesis genes (including both the full-length sequence, partial sequences, or regulatory sequences of the angiogenesis coding regions) can be administered in a gene therapy application. These angiogenesis genes can include antisense applications, either as gene therapy (i.e. for incorporation into the genome) or as antisense compositions, as will be appreciated by those in the art.


[0301] Angiogenesis polypeptides and polynucleotides can also be administered as vaccine compositions to stimulate HTL, CTL and antibody responses. Such vaccine compositions can include, for example, lipidated peptides (e.g., Vitiello, A. et al., J. Clin. Invest. 95:341, 1995), peptide compositions encapsulated in poly(DL-lactide-co-glycolide) (“PLG”) microspheres (see, e.g., Eldridge, et al., Molec. Immunol. 28:287-294, 1991; Alonso et al., Vaccine 12:299-306, 1994; Jones et al., Vaccine 13:675-681, 1995), peptide compositions contained in immune stimulating complexes (ISCOMS) (see, e.g., Takahashi et al., Nature 344:873-875, 1990; Hu et al., Clin Exp Immunol. 113:235-243, 1998), multiple antigen peptide systems (MAPs) (see, e.g., Tam, J. P., Proc. Natl. Acad. Sci. U.S.A. 85:5409-5413, 1988; Tam, J. P., J. Immunol. Methods 196:17-32, 1996), peptides formulated as multivalent peptides; peptides for use in ballistic delivery systems, typically crystallized peptides, viral delivery vectors (Perkus, M. E. et al., In: Concepts in vaccine development, Kaufmann, S. H. E., ed., p. 379, 1996; Chakrabarti, S. et al., Nature 320:535, 1986; Hu, S. L. et al., Nature 320:537, 1986; Kieny, M.-P. et al., AIDS Bio/Technology 4:790, 1986; Top, F. H. et al., J. Infect. Dis. 124:148, 1971; Chanda, P. K. et al., Virology 175:535, 1990), particles of viral or synthetic origin (e.g., Kofler, N. et al., J. Immunol. Methods. 192:25, 1996; Eldridge, J. H. et al., Sem. Hematol. 30:16, 1993; Falo, L. D., Jr. et al., Nature Med. 7:649, 1995), adjuvants (Warren, H. S., Vogel, F. R., and Chedid, L. A. Annu. Rev. Immunol. 4:369, 1986; Gupta, R. K. et al., Vaccine 11:293, 1993), liposomes (Reddy, R. et al., J. Immunol. 148:1585, 1992; Rock, K. L., Immunol. Today 17:131, 1996), or naked or particle absorbed cDNA (Ulmer, J. B. et al., Science 259:1745, 1993; Robinson, H. L., Hunt, L. A., and Webster, R. G., Vaccine 11:957, 1993; Shiver, J. W. et al., In: Concepts in vaccine development, Kaufmann, S. H. E., ed., p. 423, 1996; Cease, K. B., and Berzofsky, J. A., Annu. Rev. Immunol. 12:923, 1994 and Eldridge, J. H. et al., Sem. Hematol. 30:16, 1993). Toxin-targeted delivery technologies, also known as receptor mediated targeting, such as those of Avant Immunotherapeutics, Inc. (Needham, Mass.) may also be used.


[0302] Vaccine compositions often include adjuvants. Many adjuvants contain a substance designed to protect the antigen from rapid catabolism, such as aluminum hydroxide or mineral oil, and a stimulator of immune responses, such as lipid A, Bordetella pertussis or Mycobacterium tuberculosis derived proteins. Certain adjuvants are commercially available as, for example, Freund's Incomplete Adjuvant and Complete Adjuvant (Difco Laboratories, Detroit, Mich.); Merck Adjuvant 65 (Merck and Company, Inc., Rahway, N.J.); AS-2 (SmithKline Beecham, Philadelphia, Pa.); aluminum salts such as aluminum hydroxide gel (alum) or aluminum phosphate; salts of calcium, iron or zinc; an insoluble suspension of acylated tyrosine; acylated sugars; cationically or anionically derivatized polysaccharides; polyphosphazenes; biodegradable microspheres; monophosphoryl lipid A and quil A. Cytokines, such as GM-CSF, interleukin-2, -7, -12, and other like growth factors, may also be used as adjuvants.


[0303] Vaccines can be administered as nucleic acid compositions wherein DNA or RNA encoding one or more of the polypeptides, or a fragment thereof, is administered to a patient. This approach is described, for instance, in Wolff et al., Science 247:1465 (1990) as well as U.S. Pat. Nos. 5,580,859; 5,589,466; 5,804,566; 5,739,118; 5,736,524; 5,679,647; WO 98/04720; and in more detail below. Examples of DNA-based delivery technologies include “naked DNA”, facilitated (bupivacaine, polymers, peptide-mediated) delivery, cationic lipid complexes, and particle-mediated (“gene gun”) or pressure-mediated delivery (see, e.g., U.S. Pat. No. 5,922,687).


[0304] For therapeutic or prophylactic immunization purposes, the peptides of the invention can be expressed by viral or bacterial vectors. Examples of expression vectors include attenuated viral hosts, such as vaccinia or fowlpox. This approach involves the use of vaccinia virus, for example, as a vector to express nucleotide sequences that encode angiogenic polypeptides or polypeptide fragments. Upon introduction into a host, the recombinant vaccinia virus expresses the immunogenic peptide, and thereby elicits an immune response. Vaccinia vectors and methods useful in immunization protocols are described in, e.g., U.S. Pat. No. 4,722,848. Another vector is BCG (Bacille Calmette Guerin). BCG vectors are described in Stover et al., Nature 351:456-460 (1991). A wide variety of other vectors useful for therapeutic administration or immunization, e.g., adeno and adeno-associated virus vectors, retroviral vectors, Salmonella typhi vectors, detoxified anthrax toxin vectors, and the like, will be apparent to those skilled in the art from the description herein (see, e.g., Shata et al. (2000) Mol Med Today 6:66-71; Shedlock et al., J. Leukoc Biol 68:793-806, 2000; Hipp et al., In Vivo 14:571-85, 2000).


[0305] Methods for the use of genes as DNA vaccines are well known, and include placing an angiogenesis gene or portion of an angiogenesis gene under the control of a regulatable promoter or a tissue-specific promoter for expression in an angiogenesis patient. The angiogenesis gene used for DNA vaccines can encode full-length angiogenesis proteins, but more preferably encodes portions of the angiogenesis proteins including peptides derived from the angiogenesis protein. In one embodiment, a patient is immunized with a DNA vaccine comprising a plurality of nucleotide sequences derived from an angiogenesis gene. For example, angiogenesis-associated genes or sequence encoding subfragments of an angiogenesis protein are introduced into expression vectors and tested for their immunogenicity in the context of Class I MHC and an ability to generate cytotoxic T cell responses. This procedure provides for production of cytotoxic T cell responses against cells which present antigen, including intracellular epitopes.


[0306] In a preferred embodiment, the DNA vaccines include a gene encoding an adjuvant molecule with the DNA vaccine. Such adjuvant molecules include cytokines that increase the immunogenic response to the angiogenesis polypeptide encoded by the DNA vaccine. Additional or alternative adjuvants are available.


[0307] In another preferred embodiment, angiogenesis genes find use in generating animal models of angiogenesis. When the angiogenesis gene identified is repressed or diminished in angiogenic tissue, gene therapy technology, e.g., wherein antisense RNA directed to the angiogenesis gene is introduced, will also diminish or repress expression of the gene. Animal models of angiogenesis find use in screening for modulators of an angiogenesis-associated sequence or modulators of angiogenesis. Similarly, transgenic animal technology including gene knockout technology, for example as a result of homologous recombination with an appropriate gene targeting vector, will result in the absence or increased expression of the angiogenesis protein. When desired, tissue-specific expression or knockout of the angiogenesis protein may be necessary.


[0308] It is also possible that the angiogenesis protein is overexpressed in angiogenesis. As such, transgenic animals can be generated that overexpress the angiogenesis protein. Depending on the desired expression level, promoters of various strengths can be employed to express the transgene. Also, the number of copies of the integrated transgene can be determined and compared for a determination of the expression level of the transgene. Animals generated by such methods find use as animal models of angiogenesis and are additionally useful in screening for modulators to treat angiogenesis.


[0309] Kits for Use in Diagnostic and/or Prognostic Applications


[0310] For use in diagnostic, research, and therapeutic applications suggested above, kits are also provided by the invention. In the diagnostic and research applications such kits may include any or all of the following: assay reagents, buffers, angiogenesis-specific nucleic acids or antibodies, hybridization probes and/or primers, antisense polynucleotides, ribozymes, dominant negative angiogenesis polypeptides or polynucleotides, small molecule inhibitors of angiogenesis-associated sequences, etc. A therapeutic product may include sterile saline or another pharmaceutically acceptable emulsion and suspension base.


[0311] In addition, the kits may include instructional materials containing directions (i.e., protocols) for the practice of the methods of this invention. While the instructional materials typically comprise written or printed materials they are not limited to such. Any medium capable of storing such instructions and communicating them to an end user is contemplated by this invention. Such media include, but are not limited to electronic storage media (e.g., magnetic discs, tapes, cartridges, chips), optical media (e.g., CD ROM), and the like. Such media may include addresses to internet sites that provide such instructional materials.


[0312] The present invention also provides for kits for screening for modulators of angiogenesis-associated sequences. Such kits can be prepared from readily available materials and reagents. For example, such kits can comprise one or more of the following materials: an angiogenesis-associated polypeptide or polynucleotide, reaction tubes, and instructions for testing angiogenic-associated activity. Optionally, the kit contains biologically active angiogenesis protein. A wide variety of kits and components can be prepared according to the present invention, depending upon the intended user of the kit and the particular needs of the user. Diagnosis would typically involve evaluation of a plurality of genes or products. The genes will be selected based on correlations with important parameters in disease which may be identified in historical or outcome data.


[0313] It is understood that the examples described above in no way serve to limit the true scope of this invention, but rather are presented for illustrative purposes. All publications, sequences of accession numbers, and Patent applications cited in this specification are herein incorporated by reference as if each individual publication or Patent application were specifically and individually indicated to be incorporated by reference.







EXAMPLES


Example 1


Tissue Preparation, Labeling Chips, and Fingerprints

[0314] Purify Total RNA from Tissue Using TRIzol Reagent


[0315] Homogenize tissue samples in 1 ml of TRIzol per 50 mg of tissue using a Polytron 3100 homogenizer. The generator/probe used depends upon the tissue size. A generator that is too large for the amount of tissue to be homogenized will cause a loss of sample and lower RNA yield. TRIzol is added directly to frozen tissue, which is then homogenized. Following homogenization, insoluble material is removed by centrifugation at 7500× g for 15 min in a Sorvall superspeed or 12,000× g for 10 min in an Eppendorf centrifuge at 4° C. The clear homogenate is transferred to a new tube for use. The samples may be frozen at this point at −60° to −70° C. (and kept for at least one month). The homogenate is mixed with 0.2 ml of chloroform per 1 ml of TRIzol reagent used in the original homogenization and incubated at room temp. for 2-3 minutes. The aqueous phase is then separated by centrifugation and transferred to a fresh tube and the RNA precipitated using isopropyl alcohol. The pellet is isolated by centrifugation, washed, air-dried, resuspended in an appropriate volume of DEPC H2O, and the absorbance measured.
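The reagent ratios in this step (1 ml of TRIzol per 50 mg of tissue, and 0.2 ml of chloroform per 1 ml of TRIzol) scale linearly with tissue mass. The following is a minimal sketch of that arithmetic only; the function name trizol_volumes is hypothetical and is used purely for illustration.

```python
def trizol_volumes(tissue_mg):
    """Scale TRIzol and chloroform volumes (ml) to the tissue mass (mg).

    Ratios are taken from the protocol text; the function itself is an
    illustrative helper, not part of any kit or library.
    """
    if tissue_mg <= 0:
        raise ValueError("tissue mass must be positive")
    trizol_ml = tissue_mg / 50.0       # 1 ml TRIzol per 50 mg tissue
    chloroform_ml = 0.2 * trizol_ml    # 0.2 ml chloroform per 1 ml TRIzol
    return {"trizol_ml": trizol_ml, "chloroform_ml": chloroform_ml}


if __name__ == "__main__":
    # Example: a 125 mg tissue sample
    print(trizol_volumes(125))
```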


[0316] Purification of poly A+ mRNA from total RNA is performed as follows. Heat an Oligotex suspension to 37° C. and mix immediately before adding to RNA. The Elution Buffer is heated at 70° C. Warm up 2× Binding Buffer at 65° C. if there is precipitate in the buffer. Mix total RNA with DEPC-treated water, 2× Binding Buffer, and Oligotex according to Table 2 on page 16 of the Oligotex Handbook. Incubate for 3 minutes at 65° C. Incubate for 10 minutes at room temperature. Centrifuge for 2 minutes at 14,000 to 18,000× g. Remove supernatant without disturbing Oligotex pellet. A little bit of solution can be left behind to reduce the loss of Oligotex. Gently resuspend in Wash Buffer OW2 and pipet onto spin column. Centrifuge the spin column at full speed for 1 minute. Transfer spin column to a new collection tube and gently resuspend in Wash Buffer OW2 and centrifuge as described herein. Transfer spin column to a new tube and elute with 20 to 100 ul of preheated (70° C.) Elution Buffer. Gently resuspend Oligotex resin by pipetting up and down. Centrifuge as above. Repeat elution with fresh elution buffer or use first eluate to keep the elution volume low. Read absorbance, using diluted Elution Buffer as the blank. Before proceeding with cDNA synthesis, precipitate the mRNA as follows: add 0.4 vol. of 7.5 M NH4OAc + 2.5 vol. of cold 100% ethanol. Precipitate at −20° C. for 1 hour to overnight (or 20-30 min. at −70° C.). Centrifuge at 14,000-16,000× g for 30 minutes at 4° C. Wash pellet with 0.5 ml of 80% ethanol (−20° C.) then centrifuge at 14,000-16,000× g for 5 minutes at room temperature. Repeat 80% ethanol wash. Air dry the ethanol from the pellet in the hood. Suspend pellet in DEPC H2O at 1 ug/ul concentration.
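The precipitation and resuspension volumes in this step can be worked out from the eluate volume and the absorbance reading. The sketch below rests on two assumptions that are not stated in the text: that both the 0.4 vol. and the 2.5 vol. are relative to the starting eluate volume, and that the commonly used conversion of 40 ug/ml of RNA per A260 unit applies. All function names are hypothetical.

```python
RNA_UG_PER_ML_PER_A260 = 40.0  # assumed standard conversion, not from the text


def rna_yield_ug(a260, dilution_factor, sample_ul):
    """Estimate total RNA (ug) from an A260 reading of a diluted aliquot."""
    conc_ug_per_ml = a260 * dilution_factor * RNA_UG_PER_ML_PER_A260
    return conc_ug_per_ml * sample_ul / 1000.0


def precipitation_volumes(sample_ul):
    """Volumes (ul) of 7.5 M NH4OAc and cold 100% ethanol to add.

    Assumes both ratios refer to the starting sample volume.
    """
    return {"nh4oac_ul": 0.4 * sample_ul, "ethanol_ul": 2.5 * sample_ul}


def resuspension_volume_ul(rna_ug, target_ug_per_ul=1.0):
    """DEPC H2O volume (ul) that gives the target concentration (1 ug/ul)."""
    return rna_ug / target_ug_per_ul


if __name__ == "__main__":
    total_ug = rna_yield_ug(a260=0.5, dilution_factor=10, sample_ul=100)
    print(total_ug, precipitation_volumes(100), resuspension_volume_ul(total_ug))
```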


[0317] To further clean up total RNA using Qiagen's RNeasy kit, add no more than 100 ug to an RNeasy column. Adjust sample to a volume of 100 ul with RNase-free water. Add 350 ul Buffer RLT then 250 ul ethanol (100%) to the sample. Mix by pipetting (do not centrifuge) then apply sample to an RNeasy mini spin column. Centrifuge for 15 sec at >10,000 rpm. Transfer column to a new 2-ml collection tube. Add 500 ul Buffer RPE and centrifuge for 15 sec at >10,000 rpm. Discard flowthrough. Add 500 ul Buffer RPE and centrifuge for 15 sec at >10,000 rpm. Discard flowthrough then centrifuge for 2 min at maximum speed to dry column membrane. Transfer column to a new 1.5-ml collection tube and apply 30-50 ul of RNase-free water directly onto column membrane. Centrifuge 1 min at >10,000 rpm. Repeat elution and read absorbance.


[0318] cDNA Synthesis Using Gibco's “SuperScript Choice System for cDNA Synthesis” Kit


[0319] First Strand cDNA synthesis is performed as follows. Use 5 ug of total RNA or 1 ug of polyA+ mRNA as starting material. For total RNA, use 2 ul of SuperScript RT. For polyA+ mRNA, use 1 ul of SuperScript RT. Final volume of first strand synthesis mix is 20 ul. RNA must be in a volume no greater than 10 ul. Incubate RNA with 1 ul of 100 pmol T7-T24 oligo for 10 min at 70° C. On ice, add 7 ul of: 4 ul 5× 1st Strand Buffer, 2 ul of 0.1 M DTT, and 1 ul of 10 mM dNTP mix. Incubate at 37° C. for 2 min then add SuperScript RT. Incubate at 37° C. for 1 hour.


[0320] For the second strand synthesis, place 1st strand reactions on ice and add: 91 ul DEPC H2O; 30 ul 5× 2nd Strand Buffer; 3 ul 10 mM dNTP mix; 1 ul 10 U/ul E. coli DNA Ligase; 4 ul 10 U/ul E. coli DNA Polymerase; and 1 ul 2 U/ul RNase H. Mix and incubate 2 hours at 16° C. Add 2 ul T4 DNA Polymerase. Incubate 5 min at 16° C. Add 10 ul of 0.5 M EDTA. A further clean-up of DNA is performed using phenol:chloroform:isoamyl alcohol (25:24:1) purification.
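As a consistency check on the second strand recipe, the listed additions can be tallied against the 20 ul first strand reaction. The sketch below uses only the volumes given in the protocol; the dictionary and function names are hypothetical, and the derived figures (total volume, buffer strength, EDTA concentration) are computed here rather than quoted from the source.

```python
FIRST_STRAND_UL = 20.0  # final first strand volume stated in the protocol

SECOND_STRAND_ADDITIONS_UL = {
    "DEPC H2O": 91.0,
    "5x 2nd Strand Buffer": 30.0,
    "10 mM dNTP mix": 3.0,
    "E. coli DNA Ligase": 1.0,
    "E. coli DNA Polymerase": 4.0,
    "RNase H": 1.0,
}


def second_strand_summary():
    """Tally the reaction volume and derive buffer and EDTA concentrations."""
    reaction_ul = FIRST_STRAND_UL + sum(SECOND_STRAND_ADDITIONS_UL.values())
    # Working strength of the 5x buffer after dilution into the reaction
    buffer_x = 5.0 * SECOND_STRAND_ADDITIONS_UL["5x 2nd Strand Buffer"] / reaction_ul
    # After adding 2 ul T4 DNA Polymerase and 10 ul of 0.5 M (500 mM) EDTA
    stopped_ul = reaction_ul + 2.0 + 10.0
    edta_mm = 500.0 * 10.0 / stopped_ul
    return {
        "reaction_ul": reaction_ul,
        "buffer_x": buffer_x,
        "stopped_ul": stopped_ul,
        "edta_mm": edta_mm,
    }


if __name__ == "__main__":
    print(second_strand_summary())
```

The tally shows that the 30 ul of 5× buffer is at working strength in the assembled reaction, which suggests the listed volumes are internally consistent.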


[0321] In vitro Transcription (IVT) and labeling with biotin is performed as follows: Pipet 1.5 ul of cDNA into a thin-wall PCR tube. Make NTP labeling mix by combining 2 ul T7 10× ATP (75 mM) (Ambion); 2 ul T7 10× GTP (75 mM) (Ambion); 1.5 ul T7 10× CTP (75 mM) (Ambion); 1.5 ul T7 10× UTP (75 mM) (Ambion); 3.75 ul 10 mM Bio-11-UTP (Boehringer-Mannheim/Roche or Enzo); 3.75 ul 10 mM Bio-16-CTP (Enzo); 2 ul 10× T7 transcription buffer (Ambion); and 2 ul 10× T7 enzyme mix (Ambion). The final volume is 20 ul. Incubate 6 hours at 37° C. in a PCR machine. The RNA can be further cleaned.
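The IVT component volumes can be checked against the stated 20 ul final volume, and the working concentrations of the unlabeled NTPs follow from their 75 mM stocks. This is an illustrative sketch only; the names are hypothetical and the final concentrations are derived here, not stated in the protocol.

```python
IVT_COMPONENTS_UL = {
    "cDNA": 1.5,
    "T7 10x ATP": 2.0,
    "T7 10x GTP": 2.0,
    "T7 10x CTP": 1.5,
    "T7 10x UTP": 1.5,
    "Bio-11-UTP": 3.75,
    "Bio-16-CTP": 3.75,
    "10x T7 transcription buffer": 2.0,
    "10x T7 enzyme mix": 2.0,
}

UNLABELED_NTP_STOCK_MM = 75.0  # stock concentration given for ATP, GTP, CTP, UTP


def ivt_summary():
    """Return the total reaction volume and final unlabeled NTP concentrations."""
    total_ul = sum(IVT_COMPONENTS_UL.values())
    final_mm = {
        name: UNLABELED_NTP_STOCK_MM * vol / total_ul
        for name, vol in IVT_COMPONENTS_UL.items()
        if name.startswith("T7 10x") and name.endswith("TP")
    }
    return total_ul, final_mm


if __name__ == "__main__":
    total, concentrations = ivt_summary()
    print(total, concentrations)
```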


[0322] Fragmentation is performed as follows. 15 ug of labeled RNA is usually fragmented. Try to minimize the fragmentation reaction volume; a 10 ul volume is recommended but 20 ul is all right. Do not go higher than 20 ul because the magnesium in the fragmentation buffer contributes to precipitation in the hybridization buffer. Fragment RNA by incubation at 94° C. for 35 minutes in 1× Fragmentation buffer (5× Fragmentation buffer is 200 mM Tris-acetate, pH 8.1; 500 mM KOAc; 150 mM MgOAc). The labeled RNA transcript can be analyzed before and after fragmentation. Samples can be heated to 65° C. for 15 minutes and electrophoresed on 1% agarose/TBE gels to get an approximate idea of the transcript size range.
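Diluting the 5× Fragmentation buffer to 1× fixes both the buffer volume and the space left for labeled RNA in a given reaction. The following sketch, with hypothetical names, computes those values and enforces the 20 ul ceiling mentioned above; the 1× concentrations are derived from the 5× recipe.

```python
FRAG_BUFFER_5X_MM = {"Tris-acetate": 200.0, "KOAc": 500.0, "MgOAc": 150.0}
MAX_REACTION_UL = 20.0  # upper limit given in the protocol


def fragmentation_setup(reaction_ul=10.0):
    """Return the 5x buffer volume, remaining RNA volume, and 1x concentrations."""
    if not 0 < reaction_ul <= MAX_REACTION_UL:
        raise ValueError("reaction volume must be between 0 and 20 ul")
    buffer_ul = reaction_ul / 5.0            # 5x stock diluted to 1x
    rna_ul = reaction_ul - buffer_ul         # volume available for labeled RNA
    final_mm = {name: mm / 5.0 for name, mm in FRAG_BUFFER_5X_MM.items()}
    return {"buffer_ul": buffer_ul, "rna_ul": rna_ul, "final_mm": final_mm}


if __name__ == "__main__":
    print(fragmentation_setup(10.0))
```

For a 20 ul reaction this leaves 16 ul for the 15 ug of labeled RNA, so the RNA needs to be at roughly 1 ug/ul or more before fragmentation.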


[0323] For hybridization, 200 ul (10 ug cRNA) of a hybridization mix is put on the chip. If multiple hybridizations are to be done (such as cycling through a 5 chip set), then it is recommended that an initial hybridization mix of 300 ul or more be made. The hybridization mix is: fragmented labeled RNA (50 ng/ul final conc.); 50 pM 948-b control oligo; 1.5 pM BioB; 5 pM BioC; 25 pM BioD; 100 pM CRE; 0.1 mg/ml herring sperm DNA; 0.5 mg/ml acetylated BSA; brought to 300 ul with 1× MES hyb buffer.
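The final concentrations in the hybridization mix translate into absolute amounts for any mix volume. A minimal sketch of that conversion follows; the function name is hypothetical. It reproduces the stated 10 ug of cRNA in 200 ul and shows that a 300 ul mix uses 15 ug, the amount of labeled RNA that is usually fragmented.

```python
CONTROLS_PM = {
    "948-b control oligo": 50.0,
    "BioB": 1.5,
    "BioC": 5.0,
    "BioD": 25.0,
    "CRE": 100.0,
}


def hyb_mix_amounts(mix_ul=300.0, chip_ul=200.0):
    """Convert the final concentrations of the hyb mix into absolute amounts."""
    crna_ng_per_ul = 50.0
    return {
        "crna_total_ug": crna_ng_per_ul * mix_ul / 1000.0,
        "crna_on_chip_ug": crna_ng_per_ul * chip_ul / 1000.0,
        "herring_sperm_dna_ug": 0.1 * mix_ul,   # 0.1 mg/ml equals 0.1 ug/ul
        "acetylated_bsa_ug": 0.5 * mix_ul,      # 0.5 mg/ml equals 0.5 ug/ul
        # pM x ul gives attomoles; divide by 1000 for femtomoles
        "controls_fmol": {k: pm * mix_ul / 1000.0 for k, pm in CONTROLS_PM.items()},
    }


if __name__ == "__main__":
    print(hyb_mix_amounts())
```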


[0324] Labeling is performed as follows: The hybridization reaction includes non-biotinylated IVT (purified by RNeasy columns); IVT antisense RNA 4 μg:μl; random hexamers (1 μg/μl) 4 μl and water to 14 μl. The reaction is incubated at 70° C. for 10 min. Reverse transcription is performed in the following reaction: 5× First Strand (BRL) buffer, 6 μl; 0.1 M DTT, 3 μl; 50× dNTP mix, 0.6 μl; H2O, 2.4 μl; Cy3 or Cy5 dUTP (1 mM), 3 μl; SS RT II (BRL), 1 μl in a final volume of 16 μl. Add to hybridization reaction. Incubate 30 min. at 42° C. Add 1 μl SSII and incubate another hour. Put on ice. The 50× dNTP mix contains 25 mM each of cold dATP, dCTP, and dGTP and 10 mM of dTTP: 25 μl each of 100 mM dATP, dCTP, and dGTP and 10 μl of 100 mM dTTP added to 15 μl H2O (dNTPs from Pharmacia). RNA degradation is performed as follows. Add 86 μl H2O, 1.5 μl 1 M NaOH/2 mM EDTA and incubate at 65° C. for 10 min. For U-Con 30, 500 μl TE/sample, spin at 7000× g for 10 min, save flow through for purification. For Qiagen purification, suspend u-con recovered material in 500 μl buffer PB and proceed using Qiagen protocol. For DNAse digestion, add 1 μl of a 1/100 dilution of DNAse per 30 μl reaction and incubate at 37° C. for 15 min. Incubate 5 min at 95° C. to denature the DNAse.
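The composition of the 50× dNTP mix follows from a simple dilution calculation on the 100 mM stocks. The sketch below, with hypothetical names, reproduces the stated stock concentrations. It also derives the working concentrations in the labeling reaction under the assumption that the 16 ul reverse transcription mix is added to the 14 ul annealing reaction for a 30 ul total; that total is inferred from the volumes in the text, not stated there directly.

```python
def mix_concentrations(volumes_ul, stock_mm, diluent_ul):
    """Final concentration (mM) of each component after pooling and dilution."""
    total_ul = sum(volumes_ul.values()) + diluent_ul
    return {name: stock_mm * vol / total_ul for name, vol in volumes_ul.items()}


def working_concentrations(mix_mm, mix_ul=0.6, reaction_ul=30.0):
    """Concentration (mM) of each dNTP once the 50x mix is in the reaction.

    reaction_ul defaults to 14 + 16 ul, an inferred total (assumption).
    """
    return {name: mm * mix_ul / reaction_ul for name, mm in mix_mm.items()}


if __name__ == "__main__":
    dntp_50x = mix_concentrations(
        {"dATP": 25.0, "dCTP": 25.0, "dGTP": 25.0, "dTTP": 10.0},
        stock_mm=100.0,
        diluent_ul=15.0,
    )
    print(dntp_50x)
    print(working_concentrations(dntp_50x))
```

Under that assumption, 0.6 ul of the 50× mix in 30 ul is exactly a 50-fold dilution, which agrees with the "50×" label.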


[0325] For sample preparation, add Cot-1 DNA, 10 μl; 50× dNTPs, 1 μl; 20× SSC, 2.3 μl; Na pyrophosphate, 7.5 μl; 10 mg/ml herring sperm DNA, 1 μl of a 1/10 dilution; to 21.8 μl final vol. Dry in speed vac. Resuspend in 15 μl H2O. Add 0.38 μl 10% SDS. Heat at 95° C. for 2 min and slow cool at room temp. for 20 min. Put on slide and hybridize overnight at 64° C. Washing after the hybridization: 3× SSC/0.03% SDS: 2 min., 37.5 mls 20× SSC + 0.75 mls 10% SDS in 250 mls H2O; 1× SSC: 5 min., 12.5 mls 20× SSC in 250 mls H2O; 0.2× SSC: 5 min., 2.5 mls 20× SSC in 250 mls H2O. Dry slides and scan at appropriate PMTs and channels.
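The wash buffer recipes are straightforward dilutions of 20× SSC and 10% SDS. The sketch below reproduces the stated volumes under the assumption that 250 ml is the final volume of each wash (the text says "in 250 mls H2O", and the listed amounts match a 250 ml final volume exactly). The function name is hypothetical.

```python
SSC_STOCK_X = 20.0     # 20x SSC stock
SDS_STOCK_PCT = 10.0   # 10% SDS stock


def wash_recipe(ssc_x, sds_pct=0.0, final_ml=250.0):
    """Stock volumes (ml) needed for a wash of the given strength.

    Assumes final_ml is the total volume after dilution (assumption).
    """
    ssc_ml = final_ml * ssc_x / SSC_STOCK_X
    sds_ml = final_ml * sds_pct / SDS_STOCK_PCT
    return {"ssc_20x_ml": ssc_ml, "sds_10pct_ml": sds_ml}


if __name__ == "__main__":
    for strength, sds in [(3.0, 0.03), (1.0, 0.0), (0.2, 0.0)]:
        print(strength, wash_recipe(strength, sds))
```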



Example 2


A Model of Angiogenesis is Used to Determine Expression in Angiogenesis

[0326] In the model of angiogenesis used to determine expression of angiogenesis-associated sequences, human umbilical vein endothelial cells (HUVEC) were obtained, e.g., as passage 1 (p1) frozen cells from Cascade Biologics (Oregon) and grown in maintenance medium: Medium 199 (Life Technologies) supplemented with 20% pooled human serum, 100 mg/ml heparin and 75 mg/ml endothelial cell growth supplements (Sigma) and gentamicin (Life Technologies). An in vitro cell system model was used in which 2×10^5 HUVECs were cultured in 0.5 ml 3 mg/ml plasminogen-depleted fibrinogen (Calbiochem, San Diego, Calif.) that was polymerized by the addition of 1 unit of maintenance medium supplemented with 100 ng/ml VEGF and HGF and 10 ng/ml TGF-a (R&D Systems, Minneapolis, Minn.) added (growth medium). The growth medium was replaced every 2 days. Samples for RNA were collected, e.g., at 0, 2, 6, 15, 24, 48, and 96 hours of culture. The fibrin clots were placed in Trizol (Life Technologies) and disrupted using a Tissuemizer. Thereafter standard procedures were used for extracting the RNA (e.g., Example 1).


[0327] Angiogenesis-associated sequences thus identified are shown in Table 1. As indicated, some of the Accession numbers include expressed sequence tags (ESTs). Thus, in one embodiment herein, genes within an expression profile, also termed expression profile genes, include ESTs and are not necessarily full length.
1TABLE 1AAA4 DNA sequenceGene name: CGI-100 proteinUnigene number: Hs.275253Probeset Accession #: AA089688Nucleic Acid Accession #: NM_016040 clusterCoding sequence: 142-831 (predicted start/stop codons underlined)GTTCGCCGCC GCCGCGCCGG CCACCTGGAG TTTTTTCAGA CTCCAGATTT CCCTGTCAAC60CACGAGGAGT CCAGAGAGGA AACGCGGAGC GGAGACAACA GTACCTGACG CCTCTTTCAG120CCCGGGATCG CCCCAGCAGG GATGGGCGAC AAGATCTGGC TGCCCTTCCC CGTGCTCCTT180CTGGCCGCTC TGCCTCCGGT GCTGCTGCCT GGGGCGGCCG GCTTCACACC TTCCCTCGAT240AGCGACTTCA CCTTTACCCT TCCCGCCGGC CAGAAGGAGT GCTTCTACCA GCCCATGCCC300CTGAAGGCCT CGCTGGAGAT CGAGTACCAA GTTTTAGATG GAGCAGGATT AGATATTGAT360TTCCATCTTG CCTCTCCAGA AGGCAAAACC TTAGTTTTTG AACAAAGAAA ATCAGATGGA420GTTCACACTG TAGAGACTGA AGTTGGTGAT TACATGTTCT GCTTTGACAA TACATTCAGC480ACCATTTCTG AGAAGGTGAT TTTCTTTGAA TTAATCCTGG ATAATATGGG AGAACAGGCA540CAAGAACAAG AAGATTGGAA GAAATATATT ACTGGCACAG ATATATTGGA TATGAAACTG600GAAGACATCC TGGAATCCAT CAACAGCATC AAGTCCAGAC TAAGCAAAAG TGGGCACATA660CAAACTCTGC TTAGAGCATT TGAAGCTCGT GATCGAAACA TACAAGAAAG CAACTTTGAT720AGAGTCAATT TCTGGTCTAT GGTTAATTTA GTGGTCATGG TGGTGGTGTC AGCCATTCAA780GTTTATATGC TGAAGAGTCT GTTTGAAGAT AAGAGGAAAA GTAGAACTTAAAACTCCAAA840CTAGAGTACG TAACATTGAA AAATGAGGCA TAAAAATGCA ATAAACTGTT ACAGTCAAGA900CCATTAATGG TCTTCTCCAA AATATTTTGA GATATAAAAG TAGGAAACAG GTATAATTTT960AATGTGAAAA TTAAGTCTTC ACTTTCTGTG CAAGTAATCC TGCTGATCCA GTTGTACTTA1020AGTGTGTAAC AGGAATATTT TGCAGAATAT AGGTTTAACT GAATGAAGCC ATATTAATAA1080CTGCATTTTC CTAACTTTGA AAAATTTTGC AAATGTCTTA GGTGATTTAA ATAAATGAGT1140ATTGGGCCTA AAAAA7 DNA sequenceGene name: Endothelial differentiation, sphingolipid G-protein-coupled receptor, 1(EDG1)Unigene number: Hs.154210Probeset Accession #: M31210Nucleic Acid Accession #: NM_001400 clusterCoding sequence: 251-1396 (predicted start/stop codons underlined)TCTAAAGGTC GGGGGCAGCA GCAAGATGCG AAGCGAGCCG TACAGATCCC GGGCTCTCCG60AACGCAACTT CGCCCTGCTT GAGCGAGGCT GCGGTTTCCG AGGCCCTCTC CAGCGAAGGA120AAAGCTACAC AAAAAGCCTG GATCACTCAT CGAACCACCC CTGAAGCCAG 
TGAAGGCTCT180CTCGCCTCGC CCTCTAGCGT TCGTCTGGAG TAGCGCCACC CCGGCTTCCT GGGGACACAG240GGTTGGCACC ATGGGGCCCA CCAGCGTCCC GCTGGTCAAG GCCCACCGCA GCTCGGTCTC300TGACTACGTC AACTATGATA TCATCGTCCG GCATTACAAC TACACGGGAA AGCTGAATAT360CAGCGCGGAC AAGGAGAACA GCATTAAACT GACCTCGGTG GTGTTCATTC TCATCTGCTG420CTTTATCATC CTGGAGAACA TCTTTGTCTT GCTGACCATT TGGAAAACCA AGAAATTCCA480CCGACCCATG TACTATTTTA TTGGCAATCT GGCCCTCTCA GACCTGTTGG CAGGAGTAGC540CTACACAGCT AACCTGCTCT TGTCTGGGGC CACCACCTAC AAGCTCACTC CCGCCCAGTG600GTTTCTGCGG GAAGGGAGTA TGTTTGTGGC CCTGTCAGCC TCCGTGTTCA GTCTCCTCGC660CATCGCCATT GAGCGCTATA TCACAATGCT GAAAATGAAA CTCCACAACG GGAGCAATAA720CTTCCGCCTC TTCCTGCTAA TCAGCGCCTG CTGGGTCATC TCCCTCATCC TGGGTGGCCT780GCCTATCATG GGCTGGAACT GCATCAGTGC GCTGTCCAGC TGCTCCACCG TGCTGCCGCT840CTACCACAAG CACTATATCC TCTTCTGCAC CACGGTCTTC ACTCTGCTTC TGCTCTCCAT900CGTCATTCTG TACTGCAGAA TCTACTCCTT GGTCAGGACT CGGAGCCGCC GCCTGACGTT960CCGCAAGAAC ATTTCCAAGG CCAGCCGCAG CTCTGAGAAT GTGGCGCTGC TCAAGACCGT1020AATTATCGTC CTGAGCGTCT TCATCGCCTG CTGGGCACCG CTCTTCATCC TGCTCCTGCT1080GGATGTGGGC TGCAAGGTGA AGACCTGTGA CATCCTCTTC AGAGCGGAGT ACTTCCTGGT1140GTTACCTGTG CTCAACTCCG GCACCAACCC CATCATTTAC ACTCTGACCA ACAAGGAGAT1200GCGTCGGGCC TTCATCCGGA TCATGTCCTG CTGCAAGTGC CCGAGCGGAG ACTCTGCTGG1260CAAATTCAAG CGACCCATCA TCGCCGGCAT GGAATTCAGC CGCAGCAAAT CGGACAATTC1320CTGGCACCCC CAGAAAGACG AAGGGGACAA CCCAGAGACC ATTATGTCTT CTGGAAACGT1380CAACTCTTCT TCCTAGAACT GGAAGCTGTC CACCCACCGG AAGCGCTCTT TACTTGGTCG1440CTGGCCACCC CAGTGTTTGG AAAAAAATCT CTGGGCTTCG ACTGCTGCCA GGGAGGAGCT1500GCTGCAAGCC AGAGGGAGGA AGGGGGAGAA TACGAACAGC CTGGTGGTGT CGGGTGTTGG1560TGGGTAGAGT TAGTTCCTGT GAACAATGCA CTGGGAAGGG TGGAGATCAG GTCCCGGCCT1620GGAATATATA TTCTACCCCC CTGGAGCTTT GATTTTGCAC TGAGCCAAAG GTCTAGCATT1680GTCAAGCTCC TAAAGGGTTC ATTTGGCCCC TCCTCAAAGA CTAATGTCCC CATGTGAAAG1740CGTCTCTTTG TCTGGAGCTT TGAGGAGATG TTTTCCTTCA CTTTAGTTTC AAACCCAAGT1800GAGTGTGTGC ACTTCTGCTT CTTTAGGGAT GCCCTGTACA TCCCACACCC CACCCTCCCT1860TCCCTTCATA CCCCTCCTCA ACGTTCTTTT ACTTTATACT TTAACTACCT 
GAGAGTTATC1920AGAGCTGGGG TTGTGGAATG ATCGATCATC TATAGCAAAT AGGCTATGTT GAGTACGTAG1980GCTGTGGGAA GATGAAGATG GTTTGGAGGT GTAAAACAAT GTCCTTCGCT GAGGCCAAAG2040TTTCCATGTA AGCGGGATCC GTTTTTTGGA ATTTGGTTGA AGTCACTTTG ATTTCTTTAA2100AAAACATCTT TTCAATGAAA TGTGTTACCA TTTCATATCC ATTGAAGCCG AAATCTGCAT2160AAGGAAGCCC ACTTTATCTA AATGATATTA GCCAGGATCC TTGGTGTCCT AGGAGAAACA2220GACAAGCAAA ACAAAGTGAA AACCGAATGG ATTAACTTTT GCAAACCAAG GGAGATTTCT2280TAGCAAATGA GTCTAACAAA TATGACATCC GTCTTTCCCA CTTTTGTTGA TGTTTATTTC2340AGAATCTTGT GTGATTCATT TCAAGCAACA ACATGTTGTA TTTTGTTGTG TTAAAAGTAC2400TTTTCTTGAT TTTTGAATGT ATTTGTTTCA GGAAGAAGTC ATTTTATGGA TTTTTCTAAC2460CCGTGTTAAC TTTTCTAGAA TCCACCCTCT TGTGCCCTTA AGCATTACTT TAACTGGTAG2520GGAACGCCAG AACTTTTAAG TCCAGCTATT CATTAGATAG TAATTGAAGA TATGTATAAA2580TATTACAAAG AATAAAAATA TATTACTGTC TCTTTAGTAT GGTTTTCAGT GCAATTAAAC2640CGAGAGATGT CTTGTTTTTT TAAAAAGAAT AGTATTTAAT AGGTTTCTGA CTTTTGTGGA2700TCATTTTGCA CATAGCTTTA TCAACTTTTA AACATTAATA AACTGATTTT TTTAAAGAAB3 DNA sequenceGene name: Solute carrier family 20 (phosphate transporter), member 1, Humanleukaemia virus receptor 1 (GLVR1)Unigene number: Hs.78452Probeset Accession #: L20859Nucleic Acid Accession #: NM_005415 clusterCoding sequence: predicted 371-2410 (predicted start/stop codons underlined)GAGCTGTCCC CGGTGCCGCC GACCCGGGCC GTGCCGTGTG CCCGTGGCTC CAGCCGCTGC60CGCCTCGATC TCCTCGTCTC CCGCTCCGCC CTCCCTTTTC CCTGGATGAA CTTGCGTCCT120TTCTCTTCTC CGCCATGGAA TTCTGCTCCG TGCTTTTAGC CCTCCTGAGC CAAAGAAACC180CCAGACAACA GATGCCCATA CGCAGCGTAT AGCAGTAACT CCCCAGCTCG GTTTCTGTGC240CGTAGTTTAC AGTATTTAAT TTTATATAAT ATATATTATT TATTATAGCA TTTTTGATAC300CTCATATTCT GTTTACACAT CTTGAAAGGC GCTCAGTAGT TCTCTTACTA AACAACCACT360ACTCCAGAGA ATGGCAACGC TGATTACCAG TACTACAGCT GCTACCGCCG CTTCTGGTCC420TTTGGTGGAC TACCTATGGA TGCTCATCCT GGGCTTCATT ATTGCATTTG TCTTGGCATT480CTCCGTGGGA GCCAATGATG TAGCAAATTC TTTTGGTACA GCTGTGGGCT CAGGTGTAGT540GACCCTGAAG CAAGCCTGCA TCCTAGCTAG CATCTTTGAA ACAGTGGGCT CTGTCTTACT600GGGGGCCAAA GTGAGCGAAA CCATCCGGAA GGGCTTGATT 
GACGTGGAGA TGTACAACTC660GACTCAAGGG CTACTGATGG CCGGCTCAGT CAGTGCTATG TTTGGTTCTG CTGTGTGGCA720ACTCGTGGCT TCGTTTTTGA AGCTCCCTAT TTCTGGAACC CATTGTATTG TTGGTGCAAC780TATTGGTTTC TCCCTCGTGG CAAAGGGGCA GGAGGGTGTC AAGTGGTCTG AACTGATAAA840AATTGTGATG TCTTGGTTCG TGTCCCCACT GCTTTCTGGA ATTATGTCTG GAATTTTATT900CTTCCTGGTT CGTGCATTCA TCCTCCATAA GGCAGATCCA GTTCCTAATG GTTTGCGAGC960TTTGCCAGTT TTCTATGCCT GCACAGTTGG AATAAACCTC TTTTCCATCA TGTATACTGG1020AGCACCGTTG CTGGGCTTTG ACAAACTTCC TCTGTGGGGT ACCATCCTCA TCTCGGTGGG1080ATGTGCAGTT TTCTGTGCCC TTATCGTCTG GTTCTTTGTA TGTCCCAGGA TGAAGAGAAA1140AATTGAACGA GAAATAAAGT GTAGTCCTTC TGAAAGCCCC TTAATGGAAA AAAAGAATAG1200CTTGAAAGAA GACCATGAAG AAACAAAGTT GTCTGTTGGT GATATTGAAA ACAAGCATCC1260TGTTTCTGAG GTAGGGCCTG CCACTGTGCC CCTCCAGGCT GTGGTGGAGG AGAGAACAGT1320CTCATTCAAA CTTGGAGATT TGGAGGAAGC TCCAGAGAGA GAGAGGCTTC CCAGCGTGGA1380CTTGAAAGAG GAAACCAGCA TAGATAGCAC CGTGAATGGT GCAGTGCAGT TGCCTAATGG1440GAACCTTGTC CAGTTCAGTC AAGCCGTCAG CAACCAAATA AACTCCAGTG GCCACTCCCA1500GTATCACACC GTGCATAAGG ATTCCGGCCT GTACAAAGAG CTACTCCATA AATTACATCT1560TGCCAAGGTG GGAGATTGCA TGGGAGACTC CGGTGACAAA CCCTTAAGGC GCAATAATAG1620CTATACTTCC TATACCATGG CAATATGTGG CATGCCTCTG GATTCATTCC GTGCCAAAGA1680AGGTGAACAG AAGGGCGAAG AAATGGAGAA GCTGACATGG CCTAATGCAG ACTCCAAGAA1740GCGAATTCGA ATGGACAGTT ACACCAGTTA CTGCAATGCT GTGTCTGACC TTCACTCAGC1800ATCTGAGATA GACATGAGTG TCAAGGCAGC GATGGGTCTA GGTGACAGAA AAGGAAGTAA1860TGGCTCTCTA GAAGAATGGT ATGACCAGGA TAAGCCTGAA GTCTCTCTCC TCTTCCAGTT1920CCTGCAGATC CTTACAGCCT GCTTTTGGTC ATTCGCCCAT GGTGGCAATG ACGTAAGCAA1980TGCCATTGGG CCTCTGGTTG CTTTATATTT GGTTTATGAC ACAGGAGATG TTTCTTCAAA2040AGTGGCAACA CCAATATGGC TTCTACTCTA TGGTGGTGTT GGTATCTGTG TTGGTCTGTG2100GGTTTGGGGA AGAAGAGTTA TCCAGACCAT GGGGAAGGAT CTGACACCGA TCACACCCTC2160TAGTGGCTTC AGTATTGAAC TGGCATCTGC CCTCACTGTG GTGATTGCAT CAAATATTGG2220CCTTCCCATC AGTACAACAC ATTGTAAAGT GGGCTCTGTT GTGTCTGTTG GCTGGCTCCG2280GTCCAAGAAG GCTGTTGACT GGCGTCTCTT TCGTAACATT TTTATGGCCT GGTTTGTCAC2340AGTCCCCATT TCTGGAGTTA TCAGTGCTGC CATCATGGCA 
ATCTTCAGAT ATGTCATCCT2400CAGAATGTGA AGCTGTTTGA GATTAAAATT TGTGTCAATG TTTGGGACCA TCTTAGGTAT2460TCCTGCTCCC CTGAAGAATG ATTACAGTGT TAACAGAAGA CTGACAAGAG TCTTTTTATT2520TGGGAGCAGA GGAGGGAAGT GTTACTTGTG CTATAACTGC TTTTGTGCTA AATATGAATT2580GTCTCAAAAT TAGCTGTGTA AAATAGCCCG GGTTCCACTG GCTCCTGCTG AGGTCCCCTT2640TCCTTCTGGG CTGTGAATTC CTGTACATAT TTCTCTACTT TTTGTATCAG GCTTCAATTC2700CATTATGTTT TAATGTTGTC TCTGAAGATG ACTTGTGATT TTTTTTTCTT TTTTTTAAAC2760CATGAAGAGC CGTTTGACAG AGCATGCTCT GCGTTGTTGG TTTCACCAGC TTCTGCCCTC2820ACATGCACAG GGATTTAACA ACAAAAATAT AACTACAACT TCCCTTGTAG TCTCTTATAT2880AAGTAGAGTC CTTGGTACTC TGCCCTCCTG TCAGTAGTGG CAGGATCTAT TGGCATATTC2940GGGAGCTTCT TAGAGGGATG AGGTTCTTTG AACACAGTGA AAATTTAAAT TAGTAACTTT3000TTTGCAAGCA GTTTATTGAC TGTTATTGCT AAGAAGAAGT AAGAAAGAAA AAGCCTGTTG3060GCAATCTTGG TTATTTCTTT AAGATTTCTG GCAGTGTGGG ATGGATGAAT GAAGTGGAAT3120GTGAACTTTG GGCAAGTTAA ATGGGACAGC CTTCCATGTT CATTTGTCTA CCTCTTAACT3180GAATAAAAAA GCCTACAGTT TTTAGAAAAA ACCCGAATTCAAB4 DNA sequenceGene name: Matrix metalloproteinase 10 (stromelysin 2)Unigene number: Hs.2258Probeset Accession #: X07820Nucleic Acid Accession #: NM_002425Coding sequence: predicted 23-1453 (predicted start/stop codons underlined)AAAGAAGGTA AGGGCAGTGA GAATGATGCA TCTTGCATTC CTTGTGCTGT TGTGTCTGCC60AGTCTGCTCT GCCTATCCTC TGAGTGGGGC AGCAAAAGAG GAGGACTCCA ACAAGGATCT120TGCCCAGCAA TACCTAGAAA AGTACTACAA CCTCGAAAAG GATGTGAAAC AGTTTAGAAG180AAAGGACAGT AATCTCATTG TTAAAAAAAT CCAAGGAATG CAGAAGTTCC TTGGGTTGGA240GGTGACAGGG AAGCTAGACA CTGACACTCT GGAGGTGATG CGCAAGCCCA GGTGTGGAGT300TCCTGACGTT GGTCACTTCA GCTCCTTTCC TGGCATGCCG AAGTGGAGGA AAACCCACCT360TACATACAGG ATTGTGAATT ATACACCAGA TTTGCCAAGA GATGCTGTTG ATTCTGCCAT420TGAGAAAGCT CTGAAAGTCT GGGAAGAGGT GACTCCACTC ACATTCTCCA GGCTGTATGA480AGGAGAGGCT GATATAATGA TCTCTTTCGC AGTTAAAGAA CATGGAGACT TTTACTCTTT540TGATGGCCCA GGACACAGTT TGGCTCATGC CTACCCACCT GGACCTGGGC TTTATGGAGA600TATTCACTTT GATGATGATG AAAAATGGAC AGAAGATGCA TCAGGCACCA ATTTATTCCT660CGTTGCTGCT CATGAACTTG GCCACTCCCT GGGGCTCTTT CACTCAGCCA 
ACACTGAAGC720TTTGATGTAC CCACTCTACA ACTCATTCAC AGAGCTCGCC CAGTTCCGCC TTTCGCAAGA780TGATGTGAAT GGCATTCAGT CTCTCTACGG ACCTCCCCCT GCCTCTACTG AGGAACCCCT840GGTGCCCACA AAATCTGTTC CTTCGGGATC TGAGATGCCA GCCAAGTGTG ATCCTGCTTT900GTCCTTCGAT GCCATCAGCA CTCTGAGGGG AGAATATCTG TTCTTTAAAG ACAGATATTT960TTGGCGAAGA TCCCACTGGA ACCCTGAACC TGAATTTCAT TTGATTTCTG CATTTTGGCC1020CTCTCTTCCA TCATATTTGG ATGCTGCATA TGAAGTTAAC AGCAGGGACA CCGTTTTTAT1080TTTTAAAGGA AATGAGTTCT GGGCCATCAG AGGAAATGAG GTACAAGCAG GTTATCCAAG1140AGGCATCCAT ACCCTGGGTT TTCCTCCAAC CATAAGGAAA ATTGATGCAG CTGTTTCTGA1200CAAGGAAAAG AAGAAAACAT ACTTCTTTGC AGCGGACAAA TACTGGAGAT TTGATGAAAA1260TAGCCAGTCC ATGGAGCAAG GCTTCCCTAG ACTAATAGCT GATGACTTTC CAGGAGTTGA1320GCCTAAGGTT GATGCTGTAT TACAGGCATT TGGATTTTTC TACTTCTTCA GTGGATCATC1380ACAGTTTGAG TTTGACCCCA ATGCCAGGAT GGTGACACAC ATATTAAAGA GTAACAGCTG1440GTTACATTGC TAGGCGAGAT AGGGGGAAGA CAGATATGGG TGTTTTTAAT AAATCTAATA1500ATTATTCATC TAATGTATTA TGAGCCAAAA TGGTTAATTT TTCCTGCATG TTCTGTGACT1560GAAGAAGATG AGCCTTGCAG ATATCTGCAT GTGTCATGAA GAATGTTTCT GGAATTCTTC1620ACTTGCTTTT GAATTGCACT GAACAGAATT AAGAAATACT CATGTGCAAT AGGTGAGAGA1680ATGTATTTTC ATAGATGTGT TATTACTTCC TCAATAAAAA GTTTTATTTT GGGCCTGTTC1740CTTAAB6 DNA sequenceGene name: Podocalyxin-likeUnigene number: Hs.16426Probeset Accession #: U97519Nucleic Acid Accession #: NM_005397 clusterCoding sequence: 251-1837 (predicted start/stop codons underlined)AAACGCCGCC CAGGACGCAG CCGCCGCCGC CGCCGCTCCT CTGCCACTGG CTCTGCGCCC60CAGCCCGGCT CTGCTGCAGC GGCAGGGAGG AAGAGCCGCC GCAGCGCGAC TCGGGAGCCC120CGGGCCACAG CCTGGCCTCC GGAGCCACCC ACAGGCCTCC CCGGGCGGCG CCCACGCTCC180TACCGCCCGG ACGCGCGGAT CCTCCGCCGG CACCGCAGCC ACCTGCTCCC GGCCCAGAGG240CGACGACACG ATGCGCTGCG CGCTGGCGCT CTCGGCGCTG CTGCTACTGT TGTCAACGCC300GCCGCTGCTG CCGTCGTCGC CGTCGCCGTC GCCGTCGCCG TCGCCCTCCC AGAATGCAAC360CCAGACTACT ACGGACTCAT CTAACAAAAC AGCACCGACT CCAGCATCCA GTGTCACCAT420CATGGCTACA GATACAGCCC AGCAGAGCAC AGTCCCCACT TCCAAGGCCA ACGAAATCTT480GGCCTCGGTC AAGGCGACCA CCCTTGGTGT ATCCAGTGAC TCACCGGGGA 
CTACAACCCT540GGCTCAGCAA GTCTCAGGCC CAGTCAACAC TACCGTGGCT AGAGGAGGCG GCTCAGGCAA600CCCTACTACC ACCATCGAGA GCCCCAAGAG CACAAAAAGT GCAGACACCA CTACAGTTGC660AACCTCCACA GCCACAGCTA AACCTAACAC CACAAGCAGC CAGAATGGAG CAGAAGATAC720AACAAACTCT GGGGGGAAAA GCAGCCACAG TGTGACCACA GACCTCACAT CCACTAAGGC780AGAACATCTG ACGACCCCTC ACCCTACAAG TCCACTTAGC CCCCGACAAC CCACTTTGAC840GCATCCTGTG GCCACCCCAA CAAGCTCGGG ACATGACCAT CTTATGAAAA TTTCAAGCAG900TTCAAGCACT GTGGCTATCC CTGGCTACAC CTTCACAAGC CCGGGGATGA CCACCACCCT960ACCGTCATCG GTTATCTCGC AAAGAACTCA ACAGACCTCC AGTCAGATGC CAGCCAGCTC1020TACGGCCCCT TCCTCCCAGG AGACAGTGCA GCCCACGAGC CCGGCAACGG CATTGAGAAC1080ACCTACCCTG CCAGAGACCA TGAGCTCCAG CCCCACAGCA GCATCAACTA CCCACCGATA1140CCCCAAAACA CCTTCTCCCA CTGTGGCTCA TGAGAGTAAC TGGGCAAAGT GTGAGGATCT1200TGAGACACAG ACACAGAGTG AGAAGCAGCT CGTCCTGAAC CTCACAGGAA ACACCCTCTG1260TGCAGGGGGC GCTTCGGATG AGAAATTGAT CTCACTGATA TGCCGAGCAG TCAAAGCCAC1320CTTCAACCCG GCCCAAGATA AGTGCGGCAT ACGGCTGGCA TCTGTTCCAG GAAGTCAGAC1380CGTGGTCGTC AAAGAAATCA CTATTCACAC TAAGCTCCCT GCCAAGGATG TGTACGAGCG1440GCTGAAGGAC AAATGGGATG AACTAAAGGA GGCAGGGGTC AGTGACATGA AGCTAGGGGA1500CCAGGGGCCA CCGGAGGAGG CCGAGGACCG CTTCAGCATG CCCCTCATCA TCACCATCGT1560CTGCATGGCG TCATTCCTGC TCCTCGTGGC GGCCCTCTAT GGCTGCTGCC ACCAGCGCCT1620CTCCCAGAGG AAGGACCAGC AGCGGCTAAC AGAGGAGCTG CAGACAGTGG AGAATGGTTA1680CCATGACAAC CCAACACTGG AAGTGATGGA GACCTCTTCT GAGATGCAGG AGAAGAAGGT1740GGTCAGCCTC AACGGGGAGC TGGGGGACAG CTGGATCGTC CCTCTGGACA ACCTGACCAA1800GGACGACCTG GATGAGGAGG AAGACACACA CCTCTAGTCC GGTCTGCCGG TGGCCTCCAG1860CAGCACCACA GAGCTCCAGA CCAACCACCC CAAGTGCCGT TTGGATGGGG AAGGGAAAGA1920CTGGGGAGGG AGAGTGAACT CCGAGGGGTG TCCCCTCCCA ATCCCCCCAG GGCCTTAATT1980TTTCCCTTTT CAACCTGAAC AAATCACATT CTGTCCAGAT TCCTCTTGTA AAATAACCCA2040CTAGTGCCTG AGCTCAGTGC TGCTGGATGA TGAGGGAGAT CAAGAAAAAG CCACGTAAGG2100GACTTTATAG ATGAACTAGT GGAATCCCTT CATTCTGCAG TGAGATTGCC GAGACCTGAA2160GAGGGTAAGT GACTTGCCCA AGGTCAGAGC CACTTGGTGA CAGAGCCAGG ATGAGAACAA2220AGATTCCATT TGCACCATGC CACACTGCTG TGTTCACATG TGCCTTCCGT 
CCAGAGCAGT2280CCCGGGCAGG GGTGAAACTC CAGCAGGTGG CTGGGCTGGA AAGGAGGGCA GGGCTACATC2340CTGGCTCGGT GGGATCTGAC GACCTGAAAG TCCAGCTCCC AAGTTTTCCT TCTCCTACCC2400CAGCCTCGTG TACCCATCTT CCCACCCTCT ATGTTCTTAC CCCTCCCTAC ACTCAGTGTT2460TGTTCCCACT TACTCTGTCC TGGGGCCTCT GGGATTAGCA CAGGTTATTC ATAACCTTGA2520ACCCCTTGTT CTGGATTCGG ATTTTCTCAC ATTTGCTTCG TGAGATGGGG GCTTAACCCA2580CACAGGTCTC CGTGCGTGAA CCAGGTCTGC TTAGGGGACC TGCGTGCAGG TGAGGAGAGA2640AGGGGACACT CGAGTCCAGG CTGGTATCTC AGGGCAGCTG ATGAGGGGTC AGCAGGAACA2700CTGGCCCATT GCCCCTGGCA CTCCTTGCAG AGGCCACCCA CGATCTTCTT TGGGCTTCCA2760TTTCCACCAG GGACTAAAAT CTGCTGTAGC TAGTGAGAGC AGCGTGTTCC TTTTGTTGTT2820CACTGCTCAG CTGATGGGAG TGATTCCCTG AGACCCAGTA TGAAAGAGCA GTGGCTGCAG2880GAGAGGCCTT CCCGGGGCCC CCCATCAGCG ATGTGTCTTC AGAGACAATC CATTAAAGCA2940GCCAGGAAGG ACAGGCTTTC CCCTGTATAT CATAGGAAAC TCAGGGACAT TTCAAGTTGC3000TGAGAGTTTT GTTATAGTTG TTTTCTAACC CAGCCCTCCA CTGCCAAAGG CCAAAAGCTC3060AGACAGTTGG CAGACGTCCA GTTAGCTCAT CTCACTCACT CTGATTCTCC TGTGCCACAG3120GAAAAGAGGG CCTGGAAAGC GCAGTGCATG CTGGGTGCAT GAAGGGCAGC CTGGGGGACA3180GACTGTTGTG GGAACGTCCC ACTGTCCTGG CCTGGAGCTA GGCCTTGCTG TTCCTCTTCT3240CTGTGAGCCT AGTGGGGCTG CTGCGGTTCT CTTGCAGTTT CTGGTGGCAT CTCAGGGGAA3300CACAAAAGCT ATGTCTATTC CCCAATATAG GACTTTTATG GGCTCGGCAG TTAGCTGCCA3360TGTAGAAGGC TCCTAAGCAG TGGGCATGGT GAGGTTTCAT CTGATTGAGA AGGGGGAATC3420CTGTGTGGAA TGTTGAACTT TCGCCATGGT CTCCATCGTT CTGGGCGTAA ATTCCCTGGG3480ATCAAGTAGG AAAATGGGCA GAACTGCTTA GGGGAATGAA ATTGCCATTT TTCGGGTGAA3540ACGCCACACC TCCAGGGTCT TAAGAGTCAG GCTCCGGCTG TAGTAGCTCT GATGAAATAG3600GCTATCCACT CGGGATGGCT TACTTTTTAA AAGGGTAGGG GGAGGGGCTG GGGAAGATCT3660GTCCTGCACC ATCTGCCTAA TTCCTTCCTC ACAGTCTGTA GCCATCTGAT ATCCTAGGGG3720GAAAAGGAAG GCCAGGGGTT CACATAGGGC CCCAGCGAGT TTCCCAGGAG TTAGAGGGAT3780GCGAGGCTAA CAAGTTCCAA AAACATCTGC CCCGATGCTC TAGTGTTTGG AGGTGGGCAG3840GATGGAGAAC AGTGCCTGTT TGGGGGAAAA CAGGAAATCT TGTTAGGCTT GAGTGAGGTG3900TTTGCTTCCT TCTTGCCCAG CGCTGGGTTC TCTCCACCCA GTAGGTTTTC TGTTGTGGTC3960CCGTGGGAGA GGCCAGACTG GATTATTCCT CCTTTGCTGA 
TCCTGGGTCA CACTTCACCA4020GCCAGGGCTT TTGACGGAGA CAGCAAATAG GCCTCTGCAA ATCAATCAAA GGCTGCAACC4080CTATGGCCTC TTGGAGACAG ATGATGACTG GCAAGGACTA GAGAGCAGGA GTGCCTGGCC4140AGGTCGGTCC TGACTCTCCT GACTCTCCAT CGCTCTGTCC AAGGAGAACC CGGAGAGGCT4200CTGGGCTGAT TCAGAGGTTA CTGCTTTATA TTCGTCCAAA CTGTGTTAGT CTAGGCTTAG4260GACAGCTTCA GAATCTGACA CCTTGCCTTG CTCTTGCCAC CAGGACACCT ATGTCAACAG4320GCCAAACAGC CATGCATCTA TAAAGGTCAT CATCTTCTGC CACCTTTACT GGGTTCTAAA4380TGCTCTCTGA TAATTCAGAG AGCATTGGGT CTGGGAAGAG GTAAGAGGAA CACTAGAAGC4440TCAGCATGAC TTAAACAGGT TGTAGCAAAG ACAGTTTATC ATCAACTCTT TCAGTGGTAA4500ACTGTGGTTT CCCCAAGCTG CACAGGAGGC CAGAAACCAC AAGTATGATG ACTAGGAAGC4560CTACTGTCAT GAGAGTGGGG AGACAGGCAG CAAAGCTTAT GAAGGAGGTA CAGAATATTC4620TTTGCGTTGT AAGACAGAAT ACGGGTTTAA TCTAGTCTAG GCRCCAGATT TTTTTCCCGC4680TTGATAAGGA AAGCTAGCAG AAAGTTTATT TAAACCACTT CTTGAGCTTT ATCTTTTTTG4740ACAATATACT GGAGAAACTT TGAAGAACAA GTTCAAACTG ATACATATAC ACATATTTTT4800TTGATAATGT AAATACAGTG ACCATGTTAA CCTACCCTGC ACTGCTTTAA GTGAACATAC4860TTTGAAAAAG CATTATGTTA GCTGAGTGAT GGCCAAGTTT TTTCTCTGGA CAGGAATGTA4920AATGTCTTAC TGGAAATGAC AAGTTTTTGC TTGATTTTTT TTTTTAAACA AAAAATGAAA4980TATAACAAGA CAAACTTATG ATAAAGTATT TGTCTTGTAG ATCAGGTGTT TTGTTTTGTT5040TTTTTAATTT TAAAATGCAA CCCTGCCCCC TCCCCAGCAA AGTCACAGCT CCATTTCAGT5100AAAGGTTGGA GTCAATATGC TCTGGTTGGC AGGCAACCCT GTAGTCATGG AGAAAGGTAT5160TTCAAGATCT AGTCCAATCT TTTTCTAGAG AAAAAGATAA TCTGAAGCTC ACAAAGATGA5220AGTGACTTCC TCAAAATCAC ATGGTTCAGG ACAGAAACAA GATTAAAACC TGGATCCACA5280GACTGTGCGC CTCAGAAGGA ATAATCGGTA AATTAAGAAT TGCTACTCGA AGGTGCCAGA5340ATGACACAAA GGACAGAATT CCTTTCCCAG TTGTTACCCT AGCAAGGCTA GGGAGGGCAT5400GAACACAAAC ATAAGAACTG GTCTTCTCAC ACTTTCTCTG AATCATTTAG GTTTAAGATG5460TAAGTGAACA ATTCTTTCTT TCTGCCAAGA AACAAAGTTT TGGATGAGCT TTTATATATG5520GAACTTACTC CAACAGGACT GAGGGACCAA GGAAACATGA TGGGGGAGGC AAGAGAGGGC5580AAAGAGTAAA ACTGTAGCAT AGCTTTTGTC ACGGTCACTA GCTGATCCCT CAGGTCTGCT5640GCAAACACAG CATGGAGGAC ACAGATGACT CTTTGGTGTT GGTCTTTTTG TCTGCAGTGA5700ATGTTCAACA GTTTGCCCAG GAACTGGGGG 
ATCATATATG TCTTAGTGGA CAGGGGTCTG5760AAGTACACTG GAATTTACTG AGAAACTTGT TTGTAAAAAC TATAGTTAAT AATTATTGCA5820TTTTCTTACA AAAATATATT TTGGAAAATT GTATACTGTC AATTAAAGTAAB8 DNA sequenceGene name: EGF-containing fibulin-like extracellular matrix protein 1Unigene number: Hs.76224Probeset Accession #: U03877Nucleic Acid Accession #: NM_004105 Transcript variant 1Coding sequence: 150-1631 (predicted start/stop codons underlined)CTAGTATTCT ACTAGAACTG GAAGATTGCT CTCCGAGTTT TTTTTTTGTT ATTTTGTTAA60AAAATAAAAA GCTTGAGCAG CAATTCATAT TACTGTCACA GGTATTTTTG CTGTGCTGTG120CAAGGTAACT CTGCTAGCTA AGATTCACAATGTTGAAAGC CCTTTTCCTA ACTATGCTGA180CTCTGGCGCT GGTCAAGTCA CAGGACACCG AAGAAACCAT CACGTACACG CAATGCACTG240ACGGATATGA GTGGGATCCT GTGAGACAGC AATGCAAAGA TATTGATGAA TGTGACATTG300TCCCAGACGC TTGTAAAGGT GGAATGAAGT GTGTCAACCA CTATGGAGGA TACCTCTGCC360TTCCGAAAAC AGCCCAGATT ATTGTCAATA ATGAACAGCC TCAGCAGGAA ACACAACCAG420CAGAAGGAAC CTCAGGGGGA ACCACCGGGG TTGTAGCTGC CAGCAGCATG GCAACCAGTG480GAGTGTTGCC CGGGGGTGGT TTTGTGGCCA GTGCTGCTGC AGTCGCAGGC CCTGAAATGC540AGACTGGCCG AAATAACTTT GTCATCCGGC GGAACCCAGC TGACCCTCAG CGCATTCCCT600CCAACCCTTC CCACCGTATC CAGTGTGCAG CAGGCTACGA GCAAAGTGAA CACAACGTGT660GCCAAGACAT AGACGAGTGC ACTGCAGGGA CGCACAACTG TAGAGCAGAC CAAGTGTGCA720TCAATTTACG GGGATCCTTT GCATGTCAGT GCCCTCCTGG ATATCAGAAG CGAGGGGAGC780AGTGCGTAGA CATAGATGAA TGTACCATCC CTCCATATTG CCACCAAAGA TGCGTGAATA840CACCAGGCTC ATTTTATTGC CAGTGCAGTC CTGGGTTTCA ATTGGCAGCA AACAACTATA900CCTGCGTAGA TATAAATGAA TGTGATGCCA GCAATCAATG TGCTCAGCAG TGCTACAACA960TTCTTGGTTC ATTCATCTGT CAGTGCAATC AAGGATATGA GCTAAGCAGT GACAGGCTCA1020ACTGTGAAGA CATTGATGAA TGCAGAACCT CAAGCTACCT GTGTCAATAT CAATGTGTCA1080ATGAACCTGG GAAATTCTCA TGTATGTGCC CCCAGGGATA CCAAGTGGTG AGAAGTAGAA1140CATGTCAAGA TATAAATGAG TGTGAGACCA CAAATGAATG CCGGGAGGAT GAAATGTGTT1200GGAATTATCA TGGCGGCTTC CGTTGTTATC CACGAAATCC TTGTCAAGAT CCCTACATTC1260TAACACCAGA GAACCGATGT GTTTGCCCAG TCTCAAATGC CATGTGCCGA GAACTGCCCC1320AGTCAATAGT CTACAAATAC ATGAGCATCC GATCTGATAG GTCTGTGCCA TCAGACATCT1380TCCAGATACA 
GGCCACAACT ATTTATGCCA ACACCATCAA TACTTTTCGG ATTAAATCTG1440GAAATGAAAA TGGAGAGTTC TACCTACGAC AAACAAGTCC TGTAAGTGCA ATGCTTGTGC1500TCGTGAAGTC ATTATCAGGA CCAAGAGAAC ATATCGTGGA CCTGGAGATG CTGACAGTCA1560GCAGTATAGG GACCTTCCGC ACAAGCTCTG TGTTAAGATT GACAATAATA GTGGGGCCAT1620TTTCATTTTAGTCTTTTCTA AGAGTCAACC ACAGGCATTT AAGTCAGCCA AAGAATATTG1680TTACCTTAAA GCACTATTTT ATTTATAGAT ATATCTAGTG CATCTACATC TCTATACTGT1740ACACTCACCC ATAACAAACA ATTACACCAT GGTATAAAGT GGGCATTTAA TATGTAAAGA1800TTCAAAGTTT GTCTTTATTA CTATATGTAA ATTAGACATT AATCCACTAA ACTGGTCTTC1860TTCAAGAGAG CTAAGTATAC ACTATCTGGT GAAACTTGGA TTCTTTCCTA TAAAAGTGGG1920ACCAAGCAAT GATGATCTTC TGTGGTGCTT AAGGAAACTT ACTAGAGCTC CACTAACAGT1980CTCATAAGGA GGCAGCCATC ATAACCATTG AATAGCATGC AAGGGTAAGA ATGAGTTTTT2040AACTGCTTTG TAAGAAAATG GAAAAGGTCA ATAAAGATAT ATTTCTTTAG AAAATGGGGA2100TCTGCCATAT TTGTGTTGGT TTTTATTTTC ATATCCAGCC TAAAGGTGGT TGTTTATTAT2160ATAGTAATAA ATCATTGCTG TACAACATGC TGGTTTCTGT AGGGTATTTT TAATTTTGTC2220AGAAATTTTA GATTGTGAAT ATTTTGTAAA AAACAGTAAG CAAAATTTTC CAGAATTCCC2280AAAATGAACC AGATACCCCC TAGAAAATTA TACTATTGAG AAATCTATGG GGAGGATATG2340AGAAAATAAA TTCCTTCTAA ACCACATTGG AACTGACCTG AAGAAGCAAA CTCGGAAAAT2400ATAATAACAT CCCTGAATTC AGGCATTCAC AAGATGCAGA ACAAAATGGA TAAAAGGTAT2460TTCACTGGAG AAGTTTTAAT TTCTAAGTAA AATTTAAATC CTAACACTTC ACTAATTTAT2520AACTAAAATT TCTCATCTTC GTACTTGATG CTCACAGAGG AAGAAAATGA TGATGGTTTT2580TATTCCTGGC ATCCAGAGTG ACAGTGAACT TAAGCAAATT ACCCTCCTAC CCAATTCTAT2640GGAATATTTT ATACGTCTCC TTGTTTAAAA TCTGACTGCT TTACTTTGAT GTATCATATT2700TTTAAATAAA AATAAATATT CCTTTAGAAG ATCACTCTAA AAAAB9 DNA sequenceGene name: Melanoma adhesion molecule, MUC 18 glycoproteinUnigene number: Hs.211579Probeset Accession #: M28882Nucleic Acid Accession #: NM_006500 clusterCoding sequence: 27-1967 (predicted start/stop codons underlined)ACTTGCGTCT CGCCCTCCGG CCAAGCATGG GGCTTCCCAG GCTGGTCTGC GCCTTCTTGC60TCGCCGCCTG CTGCTGCTGT CCTCGCGTCG CGGGTGTGCC CGGAGAGGCT GAGCAGCCTG120CGCCTGAGCT GGTGGAGGTG GAAGTGGGCA GCACAGCCCT TCTGAAGTGC 
GGCCTCTCCC180AGTCCCAAGG CAACCTCAGC CATGTCGACT GGTTTTCTGT CCACAAGGAG AAGCGGACGC240TCATCTTCCG TGTGCGCCAG GGCCAGGGCC AGAGCGAACC TGGGGAGTAC GAGCAGCGGC300TCAGCCTCCA GGACAGAGGG GCTACTCTGG CCCTGACTCA AGTCACCCCC CAAGACGAGC360GCATCTTCTT GTGCCAGGGC AAGCGCCCTC GGTCCCAGGA GTACCGCATC CAGCTCCGCG420TCTACAAAGC TCCGGAGGAG CCAAACATCC AGGTCAACCC CCTGGGCATC CCTGTGAACA480GTAAGGAGCC TGAGGAGGTC GCTACCTGTG TAGGGAGGAA CGGGTACCCC ATTCCTCAAG540TCATCTGGTA CAAGAATGGC CGGCCTCTGA AGGAGGAGAA GAACCGGGTC CACATTCAGT600CGTCCCAGAC TGTGGAGTCG AGTGGTTTGT ACACCTTGCA GAGTATTCTG AAGGCACAGC660TGGTTAAAGA AGACAAAGAT GCCCAGTTTT ACTGTGAGCT CAACTACCGG CTGCCCAGTG720GGAACCACAT GAAGGAGTCC AGGGAAGTCA CCGTCCCTGT TTTCTACCCG ACAGAAAAAG780TGTGGCTGGA AGTGGAGCCC GTGGGAATGC TGAAGGAAGG GGACCGCGTG GAAATCAGGT840GTTTGGCTGA TGGCAACCCT CCACCACACT TCAGCATCAG CAAGCAGAAC CCCAGCACCA900GGGAGGCAGA GGAAGAGACA ACCAACGACA ACGGGGTCCT GGTGCTGGAG CCTGCCCGGA960AGGAACACAG TGGGCGCTAT GAATGTCAGG CCTGGAACTT GGACACCATG ATATCGCTGC1020TGAGTGAACC ACAGGAACTA CTGGTGAACT ATGTGTCTGA CGTCCGAGTG AGTCCCGCAG1080CCCCTGAGAG ACAGGAAGGC AGCAGCCTCA CCCTGACCTG TGAGGCAGAG AGTAGCCAGG1140ACCTCGAGTT CCAGTGGCTG AGAGAAGAGA CAGACCAGGT GCTGGAAAGG GGGCCTGTGC1200TTCAGTTGCA TGACCTGAAA CGGGAGGCAG GAGGCGGCTA TCGCTGCGTG GCGTCTGTGC1260CCAGCATACC CGGCCTGAAC CGCACACAGC TGGTCAAGCT GGCCATTTTT GGCCCCCCTT1320GGATGGCATT CAAGGAGAGG AAGGTGTGGG TGAAAGAGAA TATGGTGTTG AATCTGTCTT1380GTGAAGCGTC AGGGCACCCC CGGCCCACCA TCTCCTGGAA CGTCAACGGC ACGGCAAGTG1440AACAAGACCA AGATCCACAG CGAGTCCTGA GCACCCTGAA TGTCCTCGTG ACCCCGGAGC1500TGTTGGAGAC AGGTGTTGAA TGCACGGCCT CCAACGACCT GGGCAAAAAC ACCAGCATCC1560TCTTCCTGGA GCTGGTCAAT TTAACCACCC TCACACCAGA CTCCAACACA ACCACTGGCC1620TCAGCACTTC CACTGCCAGT CCTCATACCA GAGCCAACAG CACCTCCACA GAGAGAAAGC1680TGCCGGAGCC GGAGAGCCGG GGCGTGGTCA TCGTGGCTGT GATTGTGTGC ATCCTGGTCC1740TGGCGGTGCT GGGCGCTGTC CTCTATTTCC TCTATAAGAA GGGCAAGCTG CCGTGCAGGC1800GCTCAGGGAA GCAGGAGATC ACGCTGCCCC CGTCTCGTAA GACCGAACTT GTAGTTGAAG1860TTAAGTCAGA TAAGCTCCCA GAAGAGATGG GCCTCCTGCA GGGCAGCAGC 
GGTGACAAGA1920GGGCTCCGGG AGACCAGGGA GAGAAATACA TCGATCTGAG GCATTAGCCC CGAATCACTT1980CAGCTCCCTT CCCTGCCTGG ACCATTCCCA GCTCCCTGCT CACTCTTCTC TCAGCCAAAG2040CCTCCAAAGG GACTAGAGAG AAGCCTCCTG CTCCCCTCAC CTGCACACCC CCTTTCAGAG2100GGCCACTGGG TTAGGACCTG AGGACCTCAC TTGGCCCTGC AAGCCGCTTT TCAGGGACCA2160GTCCACCACC ATCTCCTCCA CGTTGAGTGA AGCTCATCCC AAGCAAGGAG CCCCAGTCTC2220CCGAGCGGGT AGGAGAGTTT CTTGCAGAAC GTGTTTTTTC TTTACACACA TTATGGCTGT2280AAATACCTGG CTCCTGCCAG CAGCTGAGCT GGGTAGCCTC TCTGAGCTGG TTTCCTGCCC2340CAAAGGCTGG CTTCCACCAT CCAGGTGCAC CACTGAAGTG AGGACACACC GGAGCCAGGC2400GCCTGCTCAT GTTGAAGTGC GCTGTTCACA CCCGCTCCGG AGAGCACCCC AGCGGCATCC2460AGAAGCAGCT GCAGTGTTGC TGCCACCACC CTUCTGCTCG CCTCTTCAAA GTCTCCTGTG2520ACATTTTTTC TTTGGTCAGA AGCCAGGAAC TGGTGTCATT CCTTAAAAGA TACGTGCCGG2580GGCCAGGTGT GGTGGCTCAC GCCTGTAATC CCAGCACTTT GGGAGGCCGA GGCGGGCGGA2640TCACAAAGTC AGGACGAGAC CATCCTGGCT AACACGGTGA AACCCTGTCT CTACTAAAAA2700TACAAAAAAA AATTAGCTAG GCGTAGTGGT TGGCACCTAT AGTCCCAGCT ACTCGGAAGG2760CTGAAGCAGG AGAATGGTAT GAATCCAGGA GGTGGAGCTT GCAGTGAGCC GAGACCGTGC2820CACTGCACTC CAGCCTGGGC AACACAGCGA GACTCCGTCT CGAGGAAAAA AAAAGAAAAG2880ACGCGTACCT GCGGTGAGGA AGCTGGGCGC TGTTTTCGAG TTCAGGTGAA TTAGCCTCAA2940TCCCCGTGTT CACTTGCTCC CATAGCCCTC TTGATGGATC ACGTAAAACT GAAAGGCAGC3000GGGGAGCAGA CAAAGATGAG GTCTACACTG TCCTTCATGG GGATTAAAGC TATGGTTATA3060TTAGCACCAA ACTTCTACAA ACCAAGCTCA GGGCCCCAAC CCTAGAAGGG CCCAAATGAG3120AGAATGGTAC TTAGGGATGG AAAACGGGGC CTGGCTAGAG CTTCGGGTGT GTGTGTCTGT3280CTGTGTGTAT GCATACATAT GTGTGTATAT ATGGTTTTGT CAGGTGTGTA AATTTGCAAA3240TTGTTTCCTT TATATATGTA TGTATATATA TATATGAAAA TATATATATA TATGAAAAAT3300AAAGCTTAAT TGTCCCAGAA AATCATACAT TGCTTTTTTA TTCTACATGG GTACCACAGG3360AACCTGGGGG CCTGTGAAAC TACAACCAAA AGGCACACAA AACCGTTTCC AGTTGGCAGC3420AGAGATCAGG GGTTACCTCT GCTTCTGAGC AAATGGCTCA AGCTCTACCA GAGCAGACAG3480CTACCCTACT TTTCAGCAGC AAAACGTCCC GTATGACGCA GCACGAAGGG CCTGGCAGGC3540TGTTAGCAGG AGCTATGTCC CTTCCTATCG TTTCCGTCCA CTTAAC1 DNA sequenceGene name: Matrix metalloproteinase 1 (interstitial 
collagenase)Unigene number: Hs.83169Probeset Accession #: X54925Nucleic Acid Accession #: NM_002421 clusterCoding sequence: 69-1478 (predicted start/stop codons underlined)ATATTGGAGT AGCAAGAGGC TGGGAAGCCA TCACTTACCT TGCACTGAGA AAGAAGACAA60AGGCCAGTATGCACAGCTTT CCTCCACTGC TGCTGCTGCT GTTCTGGGGT GTGGTGTCTC120ACAGCTTCCC AGCGACTCTA GAAACACAAG AGCAAGATGT GGACTTAGTC CAGAAATACC180TGGAAAAATA CTACAACCTG AAGAATGATG GGAGGCAAGT TGAAAAGCGG AGAAATAGTG240GCCCAGTGGT TGAAAAATTG AAGCAAATGC AGGAATTCTT TGGGCTGAAA GTGACTGGGA300AACCAGATGC TGAAACCCTG AAGGTGATGA AGCAGCCCAG ATGTGGAGTG CCTGATGTGG360CTCAGTTTGT CCTCACTGAG GGGAACCCTC GCTGGGAGCA AACACATCTG ACCTACAGGA420TTGAAAATTA CACGCCAGAT TTGCCAAGAG CAGATGTGGA CCATGCCATT GAGAAAGCCT480TCCAACTCTG GAGTAATGTC ACACCTCTGA CATTCACCAA GGTCTCTGAG GGTCAAGCAG540ACATCATGAT ATCTTTTGTC AGGGGAGATC ATCGGGACAA CTCTCCTTTT GATGGACCTG600GAGGAAATCT TGCTCATGCT TTTCAACCAG GCCCAGGTAT TGGAGGGGAT GCTCATTTTG660ATGAAGATGA AAGGTGGACC AACAATTTCA GAGAGTACAA CTTACATCGT GTTGCGGCTC720ATGAACTCGG CCATTCTCTT GGACTCTCCC ATTCTACTGA TATCGGGGCT TTGATGTACC780CTAGCTACAC CTTCAGTGGT GATGTTCAGC TAGCTCAGGA TGACATTGAT GGCATCCAAG840CCATATATGG ACGTTCCCAA AATCCTGTCC AGCCCATCGG CCCACAAACC CCAAAAGCAT900GTGACAGTAA GCTAACCTTT GATGCTATAA CTACGATTCG GGGAGAAGTG ATGTTCTTTA960AAGACAGATT CTACATGCGC ACAAATCCCT TCTACCCGGA AGTTGAGCTC AATTTCATTT1020CTGTTTTCTG GCCACAACTG CCAAATGGGC TTGAAGCTGC TTACGAATTT GCCGACAGAG1080ATGAAGTCCG GTTTTTCAAA GGGAATAAGT ACTGGGCTGT TCAGGGACAG AATGTGCTAC1140ACGGATACCC CAAGGACATC TACAGCTCCT TTGGCTTCCC TAGAACTGTG AAGCATATCG1200ATGCTGCTCT TTCTGAGGAA AACACTGGAA AAACCTACTT CTTTGTTGCT AACAAATACT1260GGAGGTATGA TGAATATAAA CGATCTATGG ATCCAGGTTA TCCCAAAATG ATAGCACATG1320ACTTTCCTGG AATTGGCCAC AAAGTTGATG CAGTTTTCAT GAAAGATGGA TTTTTCTATT1380TCTTTCATGG AACAAGACAA TACAAATTTG ATCCTAAAAC GAAGAGAATT TTGACTCTCC1440AGAAAGCTAA TAGCTGGTTC AACTGCAGGA AAAATTGAAC ATTACTAATT TGAATGGAAA1500ACACATGGTG TGAGTCCAAA GAAGGTGTTT TCCTGAAGAA CTGTCTATTT TCTCAGTCAT1560TTTTAACCTC TAGAGTCACT GATACACAGA ATATAATCTT 
ATTTATACCT CAGTTTGCAT1620ATTTTTTTAC TATTTAGAAT GTAGCCCTTT TTGTACTGAT ATAATTTAGT TCCACAAATG1680GTGGGTACAA AAAGTCAAGT TTGTGGCTTA TGGATTCATA TAGGCCAGAG TTGCAAAGAT1740CTTTTCCAGA GTATGCAACT CTGACGTTGA TCCCAGAGAG CAGCTTCAGT GACAAACATA1800TCCTTTCAAG ACAGAAAGAG ACAGGAGACA TGAGTCTTTG CCGGAGGAAA AGCAGCTCAA1860GAACACATGT GCAGTCACTG GTGTCACCCT GGATAGGCAA GGGATAACTC TTCTAACACA1920AAATAAGTGT TTTATGTTTG GAATAAAGTC AACCTTGTTT CTACTGTTTTAAC3 DNA sequenceGene name: Branched chain aminotransferase 1, cytosolicUnigene number: Hs.157205Probeset Accession #: AA423987Nucleic Acid Accession #: NM_005504 clusterCoding sequence: 1-1155 (predicted start/stop codons underlined)ATGGATTGCA GTAACGGATC GGCAGAGTGT ACCGGAGAAG GAGGATCAAA AGAGGTGGTG60GGGACTTTTA AGGCTAAAGA CCTAATAGTC ACACCAGCTA CCATTTTAAA GGAAAAACCA120GACCCCAATA ATCTGGTTTT TGGAACTGTG TTCACGGATC ATATGCTGAC GGTGGAGTGG180TCCTCAGAGT TTGGATGGGA GAAACCTCAT ATCAAGCCTC TTCAGAACCT GTCATTGCAC240CCTGGCTCAT CAGCTTTGCA CTATGCAGTG GAATTATTTG AAGGATTGAA GGCATTTCGA300GGAGTAGATA ATAAAATTCG ACTGTTTCAG CCAAACCTCA ACATGGATAG AATGTATCGC360TCTGCTGTGA GGGCAACTCT GCCGGTATTT GACAAAGAAG AGCTCTTAGA GTGTATTCAA420CAGCTTGTGA AATTGGATCA AGAATGGGTC CCATATTCAA CATCTGCTAG TCTGTATATT480CGTCCTGCAT TCATTGGAAC TGAGCCTTCT CTTGGAGTCA AGAAGCCTAC CAAAGCCCTG540CTCTTTGTAC TCTTGAGCCC AGTGGGACCT TATTTTTCAA GTGGAACCTT TAATCCAGTG600TCCCTGTGGG CCAATCCCAA GTATGTAAGA GCCTGGAAAG GTGGAACTGG GGACTGCAAG660ATGGGAGGGA ATTACGGCTC ATCTCTTTTT GCCCAATGTG AAGACGTAGA TAATGGGTGT720CAGCAGGTCC TGTGGCTCTA TGGCAGAGAC CATCAGATCA CTGAAGTGGG AACTATGAAT780CTTTTTCTTT ACTGGATAAA TGAAGATGGA GAAGAAGAAC TGGCAACTCC TCCACTAGAT840GGCATCATTC TTCCAGGAGT GACAAGGCGG TGCATTCTGG ACCTGGCACA TCAGTGGGGT900GAATTTAAGG TGTCAGAGAG ATACCTCACC ATGGATGACT TGACAACAGC CCTGGAGGGG960AACAGAGTGA GAGAGATGTT TAGCTCTGGT ACAGCCTGTG TTGTTTGCCC AGTTTCTGAT1020ATACTGTACA AAGGCGAGAC AATACACATT CCAACTATGG AGAATGGTCC TAAGCTGGCA1080AGCCGCATCT TGAGCAAATT AACTGATATC CAGTATGGAA GAGAAGAGAG CGACTGGACA1140ATTGTGCTAT CCTGAACG4 DNA sequence:Gene name: 
Pentaxin-related gene, rapidly induced by IL-1 betaUnigene number: Hs.2050Probeset Accession #: M31166Nucleic Acid Accession #: NM_002852 clusterCoding sequence: 68-1213 (predicted start/stop codons underlined)CTCAAACTCA GCTCACTTGA GAGTCTCCTC CCGCCAGCTG TGGAAAGAAC TTTGCGTCTC60TCCAGCAATG CATCTCCTTG CGATTCTGTT TTGTGCTCTC TGGTCTGCAG TGTTGGCCGA120GAACTCGGAT GATTATGATC TCATGTATGT GAATTTGGAC AACGAAATAG ACAATGGACT180CCATCCCACT GAGGACCCCA CGCCGTGCGA CTGCGGTCAG GAGCACTCGG AATGGGACAA240GCTCTTCATC ATGCTGGAGA ACTCGCAGAT GAGAGAGCGC ATGCTGCTGC AAGCCACGGA300CGACGTCCTG CGGGGCGAGC TGCAGAGGCT GCGGGAGGAG CTGGGCCGGC TCGCGGAAAG360CCTGGCGAGG CCGTGCGCGC CGGGGGCTCC CGCAGAGGCC AGGCTGACCA GTGCTCTGGA420CGAGCTGCTG CAGGCGACCC GCGACGCGGG CCGCAGGCTG GCGCGTATGG AGGGCGCGGA480GGCGCAGCGC CCAGAGGAGG CGGGGCGCGC CCTGGCCGCG GTGCTAGAGG AGCTGCGGCA540GACGCGAGCC GACCTGCACG CGGTGCAGGG CTGGGCTGCC CGGAGCTGGC TGCCGGCAGG600TTGTGAAACA GCTATTTTAT TCCCAATGCG TTCCAAGAAG ATTTTTGGAA GCGTGCATCC660AGTGAGACCA ATGAGGCTTG AGTCTTTTAG TGCCTGCATT TGGGTCAAAG CCACAGATGT720ATTAAACAAA ACCATCCTGT TTTCCTATGG CACAAAGAGG AATCCATATG AAATCCAGCT780GTATCTCAGC TACCAATCCA TAGTGTTTGT GGTGGGTGGA GAGGAGAACA AACTGGTTGC840TGAAGCCATG GTTTCCCTGG GAAGGTGGAC CCACCTGTGC GGCACCTGGA ATTCAGAGGA900AGGGCTCACA TCCTTGTGGG TAAATGGTGA ACTGGCGGCT ACCACTGTTG AGATGGCCAC960AGGTCACATT GTTCCTGAGG GAGGAATCCT GCAGATTGGC CAAGAAAAGA ATGGCTGCTG1020TGTGGGTGGT GGCTTTGATG AAACATTAGC CTTCTCTGGG AGACTCACAG GCTTCAATAT1080CTGGGATAGT GTTCTTAGCA ATGAAGAGAT AAGAGAGACC GGAGGAGCAG AGTCTTGTCA1140CATCCGGGGG AATATTGTTG GGTGGGGAGT CACAGAGATC CAGCCACATG GAGGAGCTCA1200GTATGTTTCA TAAATGTTGT GAAACTCCAC TTGAAGCCAA AGAAAGAAAC TCACACTTAA1260AACACATGCC AGTTGGGAAG GTCTGAAAAC TCAGTGCATA ATAGGAACAC TTGAGACTAA1320TGAAAGAGAG AGTTGAGACC AATCTTTATT TGTACTGGCC AAATACTGAA TAAACAGTTG1380AAGGAAAGAC ATTGGAAAAA GCTTTTGAGG ATAATGTTAC TAGACTTTAT GCCATGGTGC1440TTTCAGTTTA ATGCTGTGTC TCTGTCAGAT AAACTCTCAA ATAATTAAAA AGGACTGTAT1500TGTTGAACAG AGGGACAATT GTTTTACTTT TCTTTGGTTA ATTTTGTTTT GGCCAGAGAT1560GAATTTTACA 
TTGGAAGAAT AACAAAATAA GATTTGTTGT CCATTGTTCA TTGTTATTGG1620TATGTACCTT ATTACAAAAA AAATGATGAA AACATATTTA TACTACAAGG TGACTTAACA1680ACTATAAATG TAGTTTATGT GTTATAATCG AATGTCACGT TTTTGAGAAG ATAGTCATAT1740AAGTTATATT GCAAAAGGGA TTTGTATTAA TTTAAGACTA TTTTTGTAAA GCTCTACTGT1800AAATAAAATA TTTTATAAAA CTAAAAAAAA AAAAAAAACK5 DNA sequenceGene name: Von Willebrand factor; Coagulation factor VIIIUnigene number: Hs.110802Probeset Accession #: M10321Nucleic Acid Accession #: NM_000552Coding sequence: 311-8752 (predicted start/stop codons underlined)AGCTCACAGC TATTGTGGTG GGAAAGGGAG GGTGGTTGGT GGATGTCACA GCTTGGGCTT60TATCTCCCCC AGCAGTGGGG ACTCCACAGC CCCTGGGCTA CATAACAGCA AGACAGTCCG120GAGCTGTAGC AGACCTGATT GAGCCTTTGC AGCAGCTGAG AGCATGGCCT AGGGTGGGCG180GCACCATTGT CCAGCAGCTG AGTTTCCCAG GGACCTTGGA GATAGCCGCA GCCCTCATTT240GCAGGGGAAG GCACCATTGT CCAGCAGCTG AGTTTCCCAG GGACCTTGGA GATAGCCGCA300GCCCTCATTT ATGATTCCTG CCAGATTTGC CGGGGTGCTG CTTGCTCTGG CCCTCATTTT360GCCAGGGACC CTTTGTGCAG AAGGAACTCG CGGCAGGTCA TCCACGGCCC GATGCAGCCT420TTTCGGAAGT GACTTCGTCA ACACCTTTGA TGGGAGCATG TACAGCTTTG CGGGATACTG480CAGTTACCTC CTGGCAGGGG GCTGCCAGAA ACGCTCCTTC TCGATTATTG GGGACTTCCA540GAATGGCAAG AGAGTGAGCC TCTCCGTGTA TCTTGGGGAA TTTTTTGACA TCCATTTGTT600TGTCAATGGT ACCGTGACAC AGGGGGACCA AAGAGTCTCC ATGCCCTATG CCTCCAAAGG660GCTGTATCTA GAAACTGAGG CTGGGTACTA CAAGCTGTCC GGTGAGGCCT ATGGCTTTGT720GGCCAGGATC GATGGCAGCG GCAACTTTCA AGTCCTGCTG TCAGACAGAT ACTTCAACAA780GACCTGCGGG CTGTGTGGCA ACTTTAACAT CTTTGCTGAA GATGACTTTA TGACCCAAGA840AGGGACCTTG ACCTCGGACC CTTATGACTT TGCCAACTCA TGGGCTCTGA GCAGTGGAGA900ACAGTGGTGT GAACGGGCAT CTCCTCCCAG CAGCTCATGC AACATCTCCT CTGGGGAAAT960GCAGAAGGGC CTGTGGGAGC AGTGCCAGCT TCTGAAGAGC ACCTCGGTGT TTGCCCGCTG1020CCACCCTCTG GTGGACCCCG AGCCTTTTGT GGCCCTGTGT GAGAAGACTT TGTGTGAGTG1080TGCTGGGGGG CTGGAGTGCG CCTGCCCTGC CCTCCTGGAG TACGCCCGGA CCTGTGCCCA1140GGAGGGAATG GTGCTGTACG GCTGGACCGA CCACAGCGCG TGCAGCCCAG TGTGCCCTGC1200TGGTATGGAG TATAGGCAGT GTGTGTCCCC TTGCGCCAGG ACCTGCCAGA GCCTGCACAT1260CAATGAAATG TGTCAGGAGC GATGCGTGGA 
TGGCTGCAGC TGCCCTGAGG GACAGCTCCT1320GGATGAAGGC CTCTGCGTGG AGAGCACCGA GTGTCCCTGC GTGCATTCCG GAAAGCGCTA1380CCCTCCCGGC ACCTCCCTCT CTCGAGACTG CAACACCTGC ATTTGCCGAA ACAGCCAGTG1440GATCTGCAGC AATGAAGAAT GTCCAGGGGA GTGCCTTGTC ACTGGTCAAT CCCACTTCAA1500GAGCTTTGAC AACAGATACT TCACCTTCAG TGGGATCTGC CAGTACCTGC TGGCCCGGGA1560TTGCCAGGAC CACTCCTTCT CCATTGTCAT TGAGACTGTC CAGTGTGCTG ATGACCGCGA1620CGCTGTGTGC ACCCGCTCCG TCACCGTCCG GCTGCCTGGC CTGCACAACA GCCTTGTGAA1680ACTGAAGCAT GGGGCAGGAG TTGCCATGGA TGGCCAGGAC ATCCAGCTCC CCCTCCTGAA1740AGGTGACCTC CGCATCCAGC ATACAGTGAC GGCCTCCGTG CGCCTCAGCT ACGGGGAGGA1800CCTGCAGATG GACTGGGATG GCCGCGGGAG GCTGCTGGTG AAGCTGTCCC CCGTCTACGC1860CGGGAAGACC TGCGGCCTGT GTGGGAATTA CAATGGCAAC CAGGGCGACG ACTTCCTTAC1920CCCCTCTGGG CTGGCAGAGC CCCGGGTGGA GGACTTCGGG AACGCCTGGA AGCTGCACGG1980GGACTGCCAG GACCTGCAGA AGCAGCACAG CGATCCCTGC GCCCTCAACC CGCGCATGAC2040CAGGTTCTCC GAGGAGGCGT GCGCGGTCCT GACGTCCCCC ACATTCGAGG CCTGCCATCG2100TGCCGTCAGC CCGCTGCCCT ACCTGCGGAA CTGCCGCTAC GACGTGTGCT CCTGCTCGGA2160CGGCCGCGAG TGCCTGTGCG GCGCCCTGGC CAGCTATGCC GCGGCCTGCG CGGGGAGAGG2220CGTGCGCGTC GCGTGGCGCG AGCCAGGCCG CTGTGAGCTG AACTGCCCGA AAGGCCAGGT2280GTACCTGCAG TGCGGGACCC CCTGCAACCT GACCTGCCGC TCTCTCTCTT ACCCGGATGA2340GGAATGCAAT GAGGCCTGCC TGGAGGGCTG CTTCTGCCCC CCAGGGCTCT ACATGGATGA2400GAGGGGGGAC TGCGTGCCCA AGGCCCAGTG CCCCTGTTAC TATGACGGTG AGATCTTCCA2460GCCAGAAGAC ATCTTCTCAG ACCATCACAC CATGTGCTAC TGTGAGGATG GCTTCATGCA2520CTGTACCATG AGTGGAGTCC CCGGAAGCTT GCTGCCTGAC GCTGTCCTCA GCAGTCCCCT2580GTCTCATCGC AGCAAAAGGA GCCTATCCTG TCGGCCCCCC ATGGTCAAGC TGGTGTGTCC2640CGCTGACAAC CTGCGGGCTG AAGGGCTCGA GTGTACCAAA ACGTGCCAGA ACTATGACCT2700GGAGTGCATG AGCATGGGCT GTGTCTCTGG CTGCCTCTGC CCCCCGGGCA TGGTCCGGCA2760TGAGAACAGA TGTGTGGCCC TGGAAAGGTG TCCCTGCTTC CATCAGGGCA AGGAGTATGC2820CCCTGGAGAA ACAGTGAAGA TTGGCTGCAA CACTTGTGTC TGTCGGGACC GGAAGTGGAA2880CTGCACAGAC CATGTGTGTG ATGCCACGTG CTCCACGATC GGCATGGCCC ACTACCTCAC2940CTTCGACGGG CTCAAATACC TGTTCCCCGG GGAGTGCCAG TACGTTCTGG TGCAGGATTA3000CTGCGGCAGT AACCCTGGGA 
CCTTTCGGAT CCTAGTGGGG AATAAGGGAT GCAGCCACCC3060CTCAGTGAAA TGCAAGAAAC GGGTCACCAT CCTGGTGGAG GGAGGAGAGA TTGAGCTGTT3120TGACGGGGAG GTGAATGTGA AGAGGCCCAT GAAGGATGAG ACTCACTTTG AGGTGGTGGA3180GTCTGGCCGG TACATCATTC TGCTGCTGGG CAAAGCCCTC TCCGTGGTCT GGGACCGCCA3240CCTGAGCATC TCCGTGGTCC TGAAGCAGAC ATACCAGGAG AAAGTGTGTG GCCTGTGTGG3300GAATTTTGAT GGCATCCAGA ACAATGACCT CACCAGCAGC AACCTCCAAG TGGAGGAAGA3360CCCTGTGGAC TTTGGGAACT CCTGGAAAGT GAGCTCGCAG TGTGCTGACA CCAGAAAAGT3420GCCTCTGGAC TCATCCCCTG CCACCTGCCA TAACAACATC ATGAAGCAGA CGATGGTGGA3480TTCCTCCTGT AGAATCCTTA CCAGTGACGT CTTCCAGGAC TGCAACAAGC TGGTGGACCC3540CGAGCCATAT CTGGATGTCT GCATTTACGA CACCTGCTCC TGTGAGTCCA TTGGGGACTG3600CGCCTGCTTC TGCGACACCA TTGCTGCCTA TGCCCACGTG TGTGCCCAGC ATGGCAAGGT3660GGTGACCTGG AGGACGGCCA CATTGTGCCC CCAGAGCTGC GAGGAGAGGA ATCTCCGGGA3720GAACGGGTAT GAGTGTGAGT GGCGCTATAA CAGCTGTGCA CCTGCCTGTC AAGTCACGTG3780TCAGCACCCT GAGCCACTGG CCTGCCCTGT GCAGTGTGTG GAGGGCTGCC ATGCCCACTG3840CCCTCCAGGG AAAATCCTGG ATGAGCTTTT GCAGACCTGC GTTGACCCTG AAGACTGTCC3900AGTGTGTGAG GTGGCTGGCC GGCGTTTTGC CTCAGGAAAG AAAGTCACCT TGAATCCCAG3960TGACCCTGAG CACTGCCAGA TTTGCCACTG TGATGTTGTC AACCTCACCT GTGAAGCCTG4020CCAGGAGCCG GGAGGCCTGG TGGTGCCTCC CACAGATGCC CCGGTGAGCC CCACCACTCT4080GTATGTGGAG GACATCTCGG AACCGCCGTT GCACGATTTC TACTGCAGCA GGCTACTGGA4140CCTGGTCTTC CTGCTGGATG GCTCCTCCAG GCTGTCCGAG GCTGAGTTTG AAGTGCTGAA4200GGCCTTTGTG GTGGACATGA TGGAGCGGCT GCGCATCTCC CAGAAGTGGG TCCGCGTGGC4260CGTGGTGGAG TACCACGACG GCTCCCACGC CTACATCGGG CTCAAGGACC GGAAGCGACC4320GTCAGAGCTG CGGCGCATTG CCAGCCAGGT GAAGTATGCG GGCAGCCAGG TGGCCTCCAC4380CAGCGAGGTC TTGAAATACA CACTGTTCCA AATCTTCAGC AAGATCGACC GCCCTGAAGC4440CTCCCGCATC GCCCTGCTCC TGATGGCCAG CCAGGAGCCC CAACGGATGT CCCGGAACTT4500TGTCCGCTAC GTCCAGGGCC TGAAGAAGAA GAAGGTCATT GTGATCCCGG TGGGCATTGG4560GCCCCATGCC AACCTCAAGC AGATCCGCCT CATCGAGAAG CAGGCCCCTG AGAACAAGGC4620CTTCGTGCTG AGCAGTGTGG ATGAGCTGGA GCAGCAAAGG GACGAGATCG TTAGCTACCT4680CTGTGACCTT GCCCCTGAAG CCCCTCCTCC TACTCTGCCC CCCCACATGG CACAAGTCAC4740TGTGGGCCCG 
GGGCTCTTGG GGGTTTCGAC CCTGGGGCCC AAGAGGAACT CCATGGTTCT4800GGATGTGGCG TTCGTCCTGG AAGGATCGGA CAAAATTGGT GAAGCCGACT TCAACAGGAG4860CAAGGAGTTC ATGGAGGAGG TGATTCAGCG GATGGATGTG GGCCAGGACA GCATCCACGT4920CACGGTGCTG CAGTACTCCT ACATGGTGAC CGTGGAGTAC CCCTTCAGCG AGGCACAGTC4980CAAAGGGGAC ATCCTGCAGC GGGTGCGAGA GATCCGCTAC CAGGGCGGCA ACAGGACCAA5040CACTGGGCTG GCCCTGCGGT ACCTCTCTGA CCACAGCTTC TTGGTCAGCC AGGGTGACCG5100GGAGCAGGCG CCCAACCTGG TCTACATGGT CACCGGAAAT CCTGCCTCTG ATGAGATCAA5160GAGGCTGCCT GGAGACATCC AGGTGGTGCC CATTGGAGTG GGCCCTAATG CCAACGTGCA5220GGAGCTGGAG AGGATTGGCT GGCCCAATGC CCCTATCCTC ATCCAGGACT TTGAGACGCT5280CCCCCGAGAG GCTCCTGACC TGGTGCTGCA GAGGTGCTGC TCCGGAGAGG GGCTGCAGAT5340CCCCACCCTC TCCCCTGCAC CTGACTGCAG CCAGCCCCTG GACGTGATCC TTCTCCTGGA5400TGGCTCCTCC AGTTTCCCAG CTTCTTATTT TGATGAAATG AAGAGTTTCG CCAAGGCTTT5460CATTTCAAAA GCCAATATAG GGCCTCGTCT CACTCAGGTG TCAGTGCTGC AGTATGGAAG5520CATCACCACC ATTGACGTGC CATGGAACGT GGTCCCGGAG AAAGCCCATT TGCTGAGCCT5580TGTGGACGTC ATGCAGCGGG AGGGAGGCCC CAGCCAAATC GGGGATGCCT TGGGCTTTGC5640TGTGCGATAC TTGACTTCAG AAATGCATGG TGCCAGGCCG GGAGCCTCAA AGGCGGTGGT5700CATCCTGGTC ACGGACGTCT CTGTGGATTC AGTGGATGCA GCAGCTGATG CCGCCAGGTC5760CAACAGAGTG ACAGTGTTCC CTATTGGAAT TGGAGATCGC TACGATGCAG CCCAGCTACG5820GATCTTGGCA GGCCCAGCAG GCGACTCCAA CGTGGTGAAG CTCCAGCGAA TCGAAGACCT5880CCCTACCATG GTCACCTTGG GCAATTCCTT CCTCCACAAA CTGTGCTCTG GATTTGTTAG5940GATTTGCATG GATGAGGATG GGAATGAGAA GAGGCCCGGG GACGTCTGGA CCTTGCCAGA6000CCAGTGCCAC ACCGTGACTT GCCAGCCAGA TGGCCAGACC TTGCTGAAGA GTCATCGGGT6060CAACTGTGAC CGGGGGCTGA GGCCTTCGTG CCCTAACAGC CAGTCCCCTG TTAAAGTGGA6120AGAGACCTGT GGCTGCCGCT GGACCTGCCC CTGCGTGTGC ACAGGCAGCT CCACTCGGCA6180CATCGTGACC TTTGATGGGC AGAATTTCAA GCTGACTGGC AGCTGTTCTT ATGTCCTATT6240TCAAAACAAG GAGCAGGACC TGGAGGTGAT TCTCCATAAT GGTGCCTGCA GCCCTGGAGC6300AAGGCAGGGC TGCATGAAAT CCATCGAGGT GAAGCACAGT GCCCTCTCCG TCGAGCTGCA6360CAGTGACATG GAGGTGACGG TGAATGGGAG ACTGGTCTCT GTTCCTTACG TGGGTGGGAA6420CATGGAAGTC AACGTTTATG GTGCCATCAT GCATGAGGTC AGATTCAATC 
ACCTTGGTCA6480CATCTTCACA TTCACTCCAC AAAACAATGA GTTCCAACTG CAGCTCAGCC CCAAGACTTT6540TGCTTCAAAG ACGTATGGTC TGTGTGGGAT CTGTGATGAG AACGGAGCCA ATGACTTCAT6600GCTGAGGGAT GGCACAGTCA CCACAGACTG GAAAACACTT GTTCAGGAAT GGACTGTGCA6660GCGGCCAGGG CAGACGTGCC AGCCCATCCT GGAGGAGCAG TGTCTTGTCC CCGACAGCTC6720CCACTGCCAG GTCCTCCTCT TACCACTGTT TGCTGAATGC CACAAGGTCC TGGCTCCAGC6780CACATTCTAT GCCATCTGCC AGCAGGACAG TTGCCACCAG GAGCAAGTGT GTGAGGTGAT6840CGCCTCTTAT GCCCACCTCT GTCGGACCAA CGGGGTCTGC GTTGACTGGA GGACACCTGA6900TTTCTGTGCT ATGTCATGCC CACCATCTCT GGTCTACAAC CACTGTGAGC ATGGCTGTCC6960CCGGCACTGT GATGGCAACG TGAGCTCCTG TGGGGACCAT CCCTCCGAAG GCTGTTTCTG7020CCCTCCAGAT AAAGTCATGT TGGAAGGCAG CTGTGTCCCT GAAGAGGCCT GCACTCAGTG7080CATTGGTGAG GATGGAGTCC AGCACCAGTT CCTGGAAGCC TGGGTCCCGG ACCACCAGCC7140CTGTCAGATC TGCACATGCC TCAGCGGGCG GAAGGTCAAC TGCACAACGC AGCCCTGCCC7200CACGGCCAAA GCTCCCACGT GTGGCCTGTG TGAAGTAGCC CGCCTCCGCC AGAATGCAGA7260CCAGTGCTGC CCCGAGTATG AGTGTGTGTG TGACCCAGTG AGCTGTGACC TGCCCCCAGT7320GCCTCACTGT GAACGTGGCC TCCAGCCCAC ACTGACCAAC CCTGGCGAGT GCAGACCCAA7380CTTCACCTGC GCCTGCAGGA AGGAGGAGTG CAAAAGAGTG TCCCCACCCT CCTGCCCCCC7440GCACCGTTTG CCCACCCTTC GGAAGACCCA GTGCTGTGAT GAGTATGAGT GTGCCTGCAA7500CTGTGTCAAC TCCACAGTGA GCTGTCCCCT TGGGTACTTG GCCTCAACCG CCACCAATGA7560CTGTGGCTGT ACCACAACCA CCTGCCTTCC CGACAAGGTG TGTGTCCACC GAAGCACCAT7620CTACCCTGTG GGCCAGTTCT GGGAGGAGGG CTGCGATGTG TGCACCTGCA CCGACATGGA7680GGATGCCGTG ATGGGCCTCC GCGTGGCCCA GTGCTCCCAG AAGCCCTGTG AGGACAGCTG7740TCGGTCGGGC TTCACTTACG TTCTGCATGA AGGCGAGTGC TGTGGAAGGT GCCTGCCATC7800TGCCTGTGAG GTGGTGACTG GCTCACCGCG GGGGGACTCC CAGTCTTCCT GGAAGAGTGT7860CGGCTCCCAG TGGGCCTCCC CGGAGAACCC CTGCCTCATC AATGAGTGTG TCCGAGTGAA7920GGAGGAGGTC TTTATACAAC AAAGGAACGT CTCCTGCCCC GAGCTGGAGG TCCCTGTCTG7980CCCCTCGGGC TTTCAGCTGA GCTGTAAGAC CTCAGCGTGC TGCCCAAGCT GTCGCTGTGA8040GCGCATGGAG GCCTGCATGC TCAATGGCAC TGTCATTGGG CCCGGGAAGA CTGTGATGAT8100CGATGTGTGC ACGACCTGCC GCTGCATGGT GCAGGTGGGG GTCATCTCTG GATTCAAGCT8160GGAGTGCAGG AAGACCACCT GCAACCCCTG CCCCCTGGGT 
TACAAGGAAG AAAATAACAC8220AGGTGAATGT TGTGGGAGAT GTTTGCCTAC GGCTTGCACC ATTCAGCTAA GAGGAGGACA8280GATCATGACA CTGAAGCGTG ATGAGACGCT CCAGGATGGC TGTGATACTC ACTTCTGCAA8340GGTCAATGAG AGAGGAGAGT ACTTCTGGGA GAAGAGGGTC ACAGGCTGCC CACCCTTTGA8400TGAACACAAG TGTCTGGCTG AGGGAGGTAA AATTATGAAA ATTCCAGGCA CCTGCTGTGA8460CACATGTGAG GAGCCTGAGT GCAACGACAT CACTGCCAGG CTGCAGTATG TCAAGGTGGG8520AAGCTGTAAG TCTGAAGTAG AGGTGGATAT CCACTACTGC CAGGGCAAAT GTGCCAGCAA8580AGCCATGTAC TCCATTGACA TCAACGATGT GCAGGACCAG TGCTCCTGCT GCTCTCCGAC8640ACGGACGGAG CCCATGCAGG TGGCCCTGCA CTGCACCAAT GGCTCTGTTG TGTACCATGA8700GGTTCTCAAT GCCATGGAGT GCAAATGCTC CCCCAGGAAG TGCAGCAAGTGAGGCTGCTG8760CAGCTGCATG GGTGCCTGCT GCTGCCTGCC TTGGCCTGAT GGCCAGGCCA GAGTGCTGCC8820AGTCCTCTGC ATGTTCTGCT CTTGTGCCCT TCTGAGCCCA CAATAAAGGC TGAGCTCTTA8880TCTTGCTGCA TGTTCTGCTC TTGTGCCCTT CTGAGCCCAC AATAAC7 DNA sequenceGene name: KIAA1294 proteinProbeset Accession #: AA432248 Nucleic Acid Accession #: AB037715Coding sequence: 370-3489 (predicted start/stop codons underlined)GAACGCTCAC AGAACAGGCA GTGCAATTCC ATGTTCCTCT TAAGTATGTT AGCCCTACCG60GGAGCTGAGC TGGCCAGTCT ACTTGGAGAG GAAAAGTAGA TCTGGGGAAG GTGGAAGGGT120CAGTTCCTAA GTGACTTCCT CCTCGGGGAT GGTAAGGGCA TTTGCTGATC TCCAGTGACT180GCCTGGTGCC TCATGGTCAG ACTCGGCTGT CTCACTCCCA GATATCTGAT TTTGCAAAAA240GGGACACACC TATCTGCAGC AAAGAAGACA CTGACCAGAT TGCGAGCGGT GCTTTTGGAT300GCTCTGTAGC CACCCGGGGC CCAGGAGGAC TGACTCGGCA GCAGGATTCG TGCATGGGAA360TCGGAGACCATGGCAGTGCA GCTGGTGCCC GACTCAGCTC TCGGCCTGCT GATGATGACG420GAGGGCCGCC GATGTCAAGT ACATCTTCTT GATGACAGGA AGCTGGAACT CCTAGTACAG480CCCAAGCTGT TGGCCAAGGA GCTTCTTGAC CTTGTGGCTT CTCACTTCAA TCTGAAGGAA540AAGGAGTACT TTGGAATAGC ATTCACAGAT GAAACGGGAC ACTTAAACTG GCTTCAGCTA600GATCGAAGAG TATTGGAACA TGACTTCCCT AAAAAGTCAG GACCCGTGGT TTTATACTTT660TGTGTCAGGT TCTATATAGA AAGCATTTCA TACCTGAAGG ATAATGCTAC CATTGAGCTT720TTCTTTCTGA ACGCGAAGTC CTGCATCTAC AAGGAGCTTA TTGACGTTGA CAGCGAAGTG780GTGTTTGAAT TAGCTTCCTA TATTTTACAG GAGGCAAAGG GAGATTTTTC TAGCAATGAA840GTTGTGAGGA GTGACTTGAA GAAGCTGCCA GCCCTTCCCA 
CCCAAGCCCT GAAGGAGCAC900CCTTCCCTGG CCTACTGTGA AGACAGAGTC ATTGAGCACT ACAAGAAACT GAACGGTCAG960ACAAGAGGTC AAGCAATCGT AAACTACATG AGCATCGTGG AGTCTCTCCC AACCTACGGG1020GTTCACTATT ATGCAGTGAA GGACAAGCAG GGCATACCAT GGTGGCTGGG CCTGAGCTAC1080AAAGGGATCT TCCAGTATGA CTACCATGAT AAAGTGAAGC CAAGAAAGAT ATTCCAATGG1140AGACAGTTGG AAAACCTGTA CTTCAGAGAA AAGAAGTTTT CCGTGGAAGT TCATGACCCA1200CGCAGGGCTT CAGTGACAAG GAGGACGTTT GGGCACAGCG GCATTGCAGT GCACACGTGG1260TATGCATGTC CGGCATTGAT CAAGTCCATC TGGGCTATGG CCATAAGCCA ACACCAGTTC1320TATCTGGACA GAAAGCAGAG TAAGTCCAAA ATCCATGCAG CACGCAGCCT GAGTGAGATC1380GCCATCGACC TGACCGAGAC GGGGACGCTG AAGACCTCGA AGCTGGCCAA CATGGGTAGC1440AAGGGGAAGA TCATCAGCGG CAGCAGCGGC AGCCTGCTGT CTTCAGGTTC TCAGGAATCA1500GATAGCTCGC AGTCGGCCAA GAAGGACATG CTGGCTGCCT TGAAGTCCAG GCAGGAAGCT1560CTGGAGGAAA CCCTGCGTCA GAGGCTGGAG GAACTGAAGA AGCTGTGTCT CCGAGAAGCT1620GAGCTCACGG GCAAGCTGCC AGTAGAATAT CCCCTGGATC CAGGGGAGGA ACCACCCATT1680GTTCGGAGAA GAATAGGAAC AGCCTTCAAA CTGGATGAAC AGAAAATCCT GCCCAAAGGA1740GAGGAAGCTG AGCTGGAACG CCTGGAACGA GAGTTTGCCA TTCAGTCCCA GATTACGGAG1800GCCGCCCGCC GCCTAGCCAG TGACCCCAAC GTCAGCAAAA AACTGAAGAA ACAAAGGAAA1860ACCTCGTATC TGAATGCACT GAAGAAACTG CAGGAGATTG AAAATGCAAT CAATGAGAAC1920CGCATCAAGT CTGGGAAGAA ACCCACCCAG AGGGCTTCGC TGATCATAGA CGATGGAAAC1980ATTGCCAGTG AAGACAGCTC CCTCTCAGAT GCCCTTGTTC TTGAGGATGA AGACTCTCAG2040GTTACCAGCA CAATATCCCC CCTACATTCT CCTCACAAGG GACTCCCTCC TCGGCCACCG2100TCGCACAACA GGCCTCCTCC TCCCCAGTCC CTGGAGGGAC TCCGACAGAT GCACTATCAC2160CGCAACGACT ATGACAAGTC ACCCATCAAG CCCAAAATGT GGAGTGAGTC CTCTTTAGAT2220GAACCCTATG AGAAGGTCAA GAAGCGCTCC TCTCACAGCC ATTCCAGCAG CCACAAGCGC2280TTCCCCAGCA CAGGAAGCTG TGCGGAAGCC GGCGGAGGAA GCAACTCCTT GCAGAACAGC2340CCCATCCGCG GCCTCCCGCA CTGGAACTCC CAGTCCAGCA TGCCGTCCAC GCCAGACCTG2400CGGGTCCGGA GTCCCCACTA CGTCCATTCC ACGAGGTCGG TGGACATCAG CCCCACCCGA2460CTGCACAGCC TCGCACTGCA CTTTAGGCAC CGGAGCTCCA GCCTGGAGTC CCAGGGCAAG2520CTCCTGGGCT CGGAAAACGA CACCGGGAGC CCCGACTTCT ACACCCCGCG GACTCGTAGC2580AGCAACGGCT CAGACCCCAT GGACGACTGC TCGTCGTGCA 
CCAGCCACTC GAGCTCGGAG2640CACTACTACC CGGCGCAGAT GAACGCCAAC TACTCCACGC TGGCCGAGGA CTCGCCGTCC2700AAGGCGCGCC AGAGGCAGAG GCAGCGGCAG CGGGCGGCGG GCGCACTGGG CTCAGCCAGC2760TCGGGCAGCA TGCCCAACCT GGCGGCGCGC GGGGGTGCGG GGGGCGCGGG GGGCGCGGGG2820GGCGGTGTGT ACCTGCACAG CCAGAGCCAG CCCAGCTCGC AGTACCGCAT CAAGGAGTAC2880CCGCTGTACA TCGAGGGCGG CGCCACGCCC GTGGTGGTGC GCAGCCTGGA GAGCGACCAG2940GAGTGCCACT ACAGCGTCAA GGCTCAGTTC AAGACGTCCA ACTCCTACAC GGCGGGCGGC3000CTGTTCAAGG AGAGCTGGCG CGGCGGCGGC GGCGACGAGG GCGACACGGG CCGCCTGACG3060CCGTCGCGAT CGCAGATCCT GCGGACTCCG TCGCTGGGCC GCGAGGGCGC CCACGACAAG3120GGCGCGGGCC GTGCCGCCGT CTCAGACGAG CTGCGCCAGT GGTACCAGCG TTCCACCGCC3180TCGCACAAGG AGCACAGCCG CCTGTCGCAC ACCAGCTCCA CCTCCTCGGA CAGCGGCTCG3240CAGTACAGCA CCTCCTCCCA GAGCACCTTC GTGGCGCACA GCAGGGTCAC CAGGATGCCC3300CAGATGTGCA AGGCCACGTC AGCTGCCTTA CCTCAAAGCC AGAGAAGCTC GACACCGTCA3360AGTGAAATTG GAGCCACCCC CCCAAGCAGC CCCCACCACA TCCTAACCTG GCAGACTGGA3420GAAGCAACAG AAAACTCACC CATTCTGGAT GGGTCTGAGT CTCCACCTCA CCAAAGTACT3480GATGAATAGA GGAGCTACAA TGATAGCTGT TTCCTGGATT CCTCCCTCTA TCCAGAACTA3540GCTGATGTCC AGTGGTACGG GCAGGAAAAA GCCAAGCCCG GGACCCTCGT GTGAGCCAGC3600CCGGCCTAAT CTGACCGCCT CAACGCCATT CTGAGATCAC CTCACTGCCT CTCATTTGCC3660TTACCCAGAC GCACCGTCAC CCTGCACCAG CTTTGGCCCT CAGCACTTTT TTTCTCCTGT3720CTCCGCATTC CCTCCCCCTT GAAAACCTGA CTGAGGAGAC ATTCTGGAAG GTTCCGGTCC3780CACTGTGTGT CCCCTGGCGC TCTTGCCCAT AGAGAGCCAG ACACCAATCC TCAATGGCAC3840CTTGGTGGCT TCCCTCTGCC ATGACAGCCC CTAGGCCAGG AACCATCAGG GGGGCCAGCC3900GGCATCCAAT TCCTGCGGAT AAGTAGCGTT GGGAGAGAAC GGGAAAGGGG ACTTGGGTTA3960CAGGGTGACC CAGAAAGACG ATTCAGCTGT GTCCAGCCTG CCACCCATAC GTAGGCCAAC4020CAAGCACTTC ATGAAGAGGA GGCCTCGTGG CATATTCAGT TTACACCTGA AATATTCCTT4080GATGGGACAG CTTGTGGGGA TGGCTATGGG GGAAGGGGAG GTTGAGAAAG GAAGTTCTCG4140ACACCAGAAA TGCATCGGAG GACCACAATC AGTTCTATGC TGCCAAAGAT TAAAAATAAA4200TAAAAACATA AAAAATTAAG AGGGGCCAAG AGGAAGACAT TCTTTCTGCA AGGAAATTTC4260TTTTAAATTC TGAACTGCTA CTACACACAA GTGAAAGTCA ACCCTATGTA AACTGGTGTC4320CTCTCTCTAG CCCTCTCCCT TACTGGCCCA 
CTTCTCTCTC CGTAGAGAGC CTGAAAAACT4380GCCCCAATGC CACGGTAAAG GCGAGGAAGT CTTGGCTGGC GTTGCTGACT CACAGTCGCC4440ATCCATCTGG ACACAAAGAG AGACCTGTGG GAGTCATAGA GGGTACTGTT AGCCCCGGTC4500CATGCAGGGG GTTCAGCCGA GCCCAAGACT CAAAGCTGCT TTCCTTTCAG GATTTGTAGT4560AACGTAAGGT GATAATGGCC AAAAGTGGTT CTCTCTCATT AAACCAACCA GTAAAAGCGT4620ATCCTATTTT TTTGCATAAG GTGTTTCATT TTCGTTTTTA TGGGAAACCA AGGGAAAAGC4680ACATTGCGAT CCATTCAGTG TTTAACTGTC GTGGCTCATT TTCTGTTCGT TAGCACTTGT4740GTGACAAAAG AGCTCAGATC CGACTTCTCC TATGTGTCAC TTATTCCAAG AACCCAACTA4800TGCCCTTAGG TAGAAAGATT TGACTCGTGT GTCTACTAGC CAACAGGCAG AGCAGGGTTG4860AAAAAAATAT CAGCTCCCAA AGGGCCCATG TGTCTACATC ATCAGTTACT GTCATGCACC4920ACATTTGTGT GCAGATACCA AAAGAGGAGG AAAGAAGAAA AAAATTAATG TGTGGGAGCT4980GCACGTTTAC ATGTTTTGAG CTATGCTTCA AACACAACTG GAAAGCCATC AATCTTCAAA5040GGCCTCAAAA ATACTTTTAT AGTAACAAGT GCACGACTTT AGTTGGGTTA TTCAAGATGG5100CACAAAAAGG TTTCCGCAGA GGTGGTATGC TGTGCTTTTG GCGCAAGTGG TGGGGGGATG5160GGGGTGGGGG TGGAATTTTT TTCTCACTCT AATGACTTCC TATTGGAAAG GCATTGACAG5220CCAGGGACAG GAGCCAGGGT GGGGGTAGTT TTGTGGGAAA GCAGAACTGA AGTTAGCTTA5280AGCATAAAAA CAAAGAAAAA TCTTCGCTTT TCATGTATGT GGAATCCAAG AATAACCATA5340GGCTCTACCA GACCAGGAGG GTAAGGATGG ACACTAAAAT GAAACAAATA CCAAGGTATT5400CCTTCTGCTG CAGCCTGGAG ACCACCGAGA GTCGAGCTGG GGCACACACA CACCTGGCCG5460GGACCCGGCA GGGACAAGGC GGGCCGTGGC CTCCTCCACC AAGTCTCTCT AGACAATTCA5520GGGCCTGCTT TCCCCAGCTC CATGCATGGC TGGACTGGTG ATTCCAGGGT GCAGAAGGGA5580TTCATATTCC CAGAACGCTT TAAGTGTACA CCTGCAGGAT AAAGAGATAC CGGTTACATT5640ATTAAATGAT TCTAGGGATT CACTGGGGGA TATTTTTGTT GCTTTTACTT TCATGGTTAG5700AGCTACAAAG AACAGTGATT TTTTTTTTTT CTCCCTTCCC CATTCAGAAA CATTATACAT5760TGGGCCATTT TTCTTTCTCC CAAAGAAGAT TCATGGATAG TCAGACTGAA CTGTGTGCAA5820CAGGAAAAGT CAAAAGGGAA AAGGCAGCTG ATGAGGTTAC ATGGTTACAT GTTCTACATC5880ATGCAGAGTA GCTTGAAATC TAGTCTGGAG AAAACTGGAT CAAGATTCTA GCCCACTGGA5940GTTGCAAGGA ATGAGAGGCA AAAATTCTAA AGATTTGGGT TATATTTTCA ACTTGGGGGA6000CAGAGAGAAA TGGAGAGCAG GAATTACAGT TCCAACAAAC ATCATGATAG TCTGGTAGTC6060AAGACAGAGA TTAAGTAAAA 
CAGGTTTTAC TGTTTAGCTG AGTTCAGTTA ATACAAAATG6120TACATAAAAC GTTAGTCCTT TGAGACTGAC ATGATTAATG ATCAGTGTGG TGGGAAATGA6180TGTAGTTATT GTACACAAGC ACTTGCAAAC TCTTTATCCC TATTTCTTTA AAACAAAATA6240AGGTGAAATA CGAAGTCCTT GGTCTGATAT AAAGCCCCTA TTGGATTCTT CGGATGCGTA6300AAAGAAATTG CCTGTTTCAG CCAGAAGACT GGTGAAAACA CATACATCAG ACTATGTTGT6360GAGCCAGGTT GATTTTTTAT TTTATTATAT GCAGGTGAGT GTTGAAACTG TTAAAATTCC6420AATTTGTTTT CATTCAGTAT TAGTTTAGTT CTAAATATAG CAAACCCCAT CCAGGTGCTA6480TCAGATGACC AGTTACTGCT TAGTTAACTA GGTGTAAAGT TTTACATATA CATTAATTTC6540AATAGTTTAT TACAAGTTGT GTAAAATGGA CTCTAGTTTA ATAATGGGGG AAAAAAGATT6600AGGTTGGTCC TGAAACTGAC TGTAGAGCAT GTAAAATGAT TTTACTGGAT TCTGTTCAAC6660TGTAATGAAT GAAAAAGATG TACGTTGTAG ACAAAGTTGC AGAATTAAAA AAAGAAATCT6720GCTTTTAATT TATTCTTTTT GTATTAAGAA TTTGTATAGT ATCTTTACAT TTTGCAAAAC6780AGTGTTGTCA ACACTTATTA AAGCATTTTC AAAATGACG8 DNA sequenceGene name: ubiquitin E3 ligase SMURF2Unigene number: Hs.21806 (3′UTR only)Probeset Accession #: AA398243Nucleic Acid Accession #: AF301463 clusterCoding sequence: 9-2255 (predicted start/stop codons underlined)CCGGGGACATGTCTAACCCC GGAGGCCGGA GGAACGGGCC CGTCAAGCTG CGCCTGACAG60TACTCTGTGC AAAAAACCTG GTGAAAAAGG ATTTTTTCCG ACTTCCTGAT CCATTTGCTA120AGGTGGTGGT TGATGGATCT GGGCAATGCC ATTCTACAGA TACTGTGAAG AATACGCTTG180ATCCAAAGTG GAATCAGCAT TATGACCTGT ATATTGGAAA GTCTGATTCA GTTACGATCA240GTGTATGGAA TCACAAGAAG ATCCATAAGA AACAAGGTGC TGGATTTCTC GGTTGTGTTC300GTCTTCTTTC CAATGCCATC AACCGCCTCA AAGACACTGG TTATCAGAGG TTGGATTTAT360GCAAACTCGG GCCAAATGAC AATGATACAG TTAGAGGACA GATAGTAGTA AGTCTTCAGT420CCAGAGACCG AATAGGCACA GGAGGACAAG TTGTGGACTG CAGTCGTTTA TTTGATAACG480ATTTACCAGA CGGCTGGGAA GAAAGGAGAA CCGCCTCTGG AAGAATCCAG TATCTAAACC540ATATAACAAG AACTACGCAA TGGGAGCGCC CAACACGACC GGCATCCGAA TATTCTAGCC600CTGGCAGACC TCTTAGCTGC TTTGTTGATG AGAACACTCC AATTAGTGGA ACAAATGGTG660CAACATGTGG ACAGTCTTCA GATCCCAGGC TGGCAGAGAG GAGAGTCAGG TCACAACGAC720ATAGAAATTA CATGAGCAGA ACACATTTAC ATACTCCTCC AGACCTACCA GAAGGCTATG780AACAGAGGAC AACGCAACAA GGCCAGGTGT ATTTCTTACA 
TACACAGACT GGTGTGAGCA840CATGGCATGA TCCAAGAGTG CCCAGGGATC TTAGCAACAT CAATTGTGAA GAGCTTGGTC900CGTTGCCTCC TGGATGGGAG ATCCGTAATA CGGCAACAGG CAGAGTTTAT TTCGTTGACC960ATAACAACAG AACAACACAA TTTACAGATC CTCGGCTGTC TGCTAACTTG CATTTAGTTT1020TAAATCGGCA GAACCAATTG AAAGACCAAC AGCAACAGCA AGTGGTATCG TTATGTCCTG1080ATGACACAGA ATGCCTGACA GTCCCAAGGT ACAAGCGAGA CCTGGTTCAG AAACTAAAAA1140TTTTGCGGCA AGAACTTTCC CAACAACAGC CTCAGGCAGG TCATTGCCGC ATTGAGGTTT1200CCAGGGAAGA GATTTTTGAG GAATCATATC GACAGGTCAT GAAAATGAGA CCAAAAGATC1260TCTGGAAGCG ATTAATGATA AAATTTCGTG GAGAAGAAGG CCTTGACTAT GGAGGCGTTG1320CCAGGGAATG GTTGTATCTC TTGTCACATG AAATGTTGAA TCCATACTAT GGCCTCTTCC1380AGTATTCAAG AGATGATATT TATACATTGC AGATCAATCC TGATTCTGCA GTTAATCCGG1440AACATTTATC CTATTTCCAC TTTGTTGGAC GAATAATGGG AATGGCTGTG TTTCATGGAC1500ATTATATTGA TGGTGGTTTC ACATTGCCTT TTTATAAGCA ATTGCTTGGG AAGTCAATTA1560CCTTGGATGA CATGGAGTTA GTAGATCCGG ATCTTCACAA CAGTTTAGTG TGGATACTTG1620AGAATGATAT TACAGGTGTT TTGGACCATA CCTTCTGTGT TGAACATAAT GCATATGGTG1680AAATTATTCA GCATGAACTT AAACCAAATG GCAAAAGTAT CCCTGTTAAT GAAGAAAATA1740AAAAAGAATA TGTCAGGCTC TATGTGAACT GGAGATTTTT ACGAGGCATT GAGGCTCAAT1800TCTTGGCTCT GCAGAAAGGA TTTAATGAAG TAATTCCACA ACATCTGCTG AAGACATTTG1860ATGAGAAGGA GTTAGAGCTC ATTATTTGTG GACTTGGAAA GATAGATGTT AATGACTGGA1920AGGTAAACAC CCGGTTAAAA CACTGTACAC CAGACAGCAA CATTGTCAAA TGGTTCTGGA1980AAGCTGTGGA GTTTTTTGAT GAAGAGCGAC GAGCAAGATT GCTTCAGTTT GTGACAGGAT2040CCTCTCGAGT GCCTCTGCAG GGCTTCAAAG CATTGCAAGG TGCTGCAGGC CCGAGACTCT2100TTACCATACA CCAGATTGAT GCCTGCACTA ACAACCTGCC GAAAGCCCAC ACTTGCTTCA2160ATCGAATAGA CATTCCACCC TATGAAAGCT ATGAAAAGCT ATATGAAAAG CTGCTAACAG2220CCATTGAAGA AACATGTGGA TTTGCTGTGG AATGACAAGC TTCAAGGATT TACCCAGGACACH1 DNA sequenceGene name: ESTUnigene number: Hs.30089Probeset Accession #: AA410480CAT cluster#: 96816_1Coding sequence: Partial sequence, possible frameshift. 
Predicted stop codon underlined.
CTCCACTATG GACAGAGCCT CCACTGAGCT GCTGCCTGCC CGCCACATAC CCAGCTGACA 60
GGGGCCCCGC AGAGCCATGC AGCTGTGCTG GGGTGATCCT GGGCTTCCTC CTGTTCCGAG 120
GCCACAACTC CCAGCCCACA ATGACCCAGA CCTCTAGCTC TCAGGGAGGC CTTGGCGGTC 180
TAAGTCTGAC CACAGAGCCA GTTTCTTCCA ACCCAGGATA CATCCCTTCC TCAGAGGCTA 240
ACAGGCCAAG CCATCTGTCC AGCACTGGTA CCCCAGGCGC AGGTGTCCCC AGCAGTGGAA 300
GAGACGGAGG CACAAGCAGA GACACATTTC AAACTGTTCC CCCCAATTCA ACCACCATGA 360
GCCTGAGCAT GAGGGAAGAT GCGACCATCC TGCCCAGCCC CACGTCAGAG ACTGTGCTCA 420
CTGTGGCTGC ATTTGGTGTT ATCAGCTTCA TTGTCATCCT GGTGGTTGTG GTGATCATCC 480
TAGTTGGTGT GGTCAGCCTG AGGTTCAAGT GTCGGAAGAG CAAGGAGTCT GGAGATCCCC 540
AGAAACCTGG AGAGCGGGAG GAGAAGCTGG GACATAGGAG GGAACCCTAC CCCTGGAATT 600
GACTTGGACT CTGGGTCTGG AAACGCAAGT TCAAATCTCA CCCATTTGTT CCAGGAGGTT 660
CTGGCTGATG AGGAAGACCC TTGTGGGAGG GGGGCCCCTG CCCTCCAGTT AGCTCTTCTT 720
GGCTGTGCTG GGTTCCATGT TCTCATGCAG GGATGGAGTC GGGTGGAGAG CCCACTCTGG 780
CTAGGGGGCG GCAGGCTGAG AGCTCACCTG TTCAGCAGAG AAGTGGAACT CACTTTGCTC 840
CTGGAGCCTC CCTACACAGT ACTTATCTGG GAAGGGAATG CCGGACTCTT GTTGGCCCCT 900
TTGTCCCCCC GACTGGCCCC CTTCGCCG
ACJ2 DNA sequence
Gene name: Complement component C1q receptor
Unigene number: Hs.97199
Probeset Accession #: AA487558
Nucleic Acid Accession #: NM_012072
Coding sequence: 149-2107. 
predicted start/stop codons underlined
AAAGCCCTCA GCCTTTGTGT CCTTCTCTGC GCCGGAGTGG CTGCAGCTCA CCCCTCAGCT 60
CCCCTTGGGG CCCAGCTGGG AGCCGAGATA GAAGCTCCTG TCGCCGCTGG GCTTCTCGCC 120
TCCCGCAGAG GGCCACACAG AGACCGGGAT GGCCACCTCC ATGGGCCTGC TGCTGCTGCT 180
GCTGCTGCTC CTGACCCAGC CCGGGGCGGG GACGGGAGCT GACACGGAGG CGGTGGTCTG 240
CGTGGGGACC GCCTGCTACA CGGCCCACTC GGGCAAGCTG AGCGCTGCCG AGGCCCAGAA 300
CCACTGCAAC CAGAACGGGG GCAACCTGGC CACTGTGAAG AGCAAGGAGG AGGCCCAGCA 360
CGTCCAGCGA GTACTGGCCC AGCTCCTGAG GCGGGAGGCA GCCCTGACGG CGAGGATGAG 420
CAAGTTCTGG ATTGGGCTCC AGCGAGAGAA GGGCAAGTGC CTGGACCCTA GTCTGCCGCT 480
GAAGGGCTTC AGCTGGGTGG GCGGGGGGGA GGACACGCCT TACTCTAACT GGCACAAGGA 540
GCTCCGGAAC TCGTGCATCT CCAAGCGCTG TGTGTCTCTG CTGCTGGACC TGTCCCAGCC 600
GCTCCTTCCC AACCGCCTGC CCAAGTGGTC TGAGGGCCCC TGTGGGAGCC CAGGCTCCCC 660
CGGAAGTAAC ATTGAGGGCT TCGTGTGCAA GTTCAGCTTC AAAGGCATGT GCCGGCCTCT 720
GGCCCTGGGG GGCCCAGGTC AGGTGACCTA CACCACCCCC TTCCAGACCA CCAGTTCCTC 780
CTTGGAGGCT GTGCCCTTTG CCTCTGCGGC CAATGTAGCC TGTGGGGAAG GTGACAAGGA 840
CGAGACTCAG AGTCATTATT TCCTGTGCAA GGAGAAGGCC CCCGATGTGT TCGACTGGGG 900
CAGCTCGGGC CCCCTCTGTG TCAGCCCCAA GTATGGCTGC AACTTCAACA ATGGGGGCTG 960
CCACCAGGAC TGCTTTGAAG GGGGGGATGG CTCCTTCCTC TGCGGCTGCC GACCAGGATT 1020
CCGGCTGCTG GATGACCTGG TGACCTGTGC CTCTCGAAAC CCTTGCAGCT CCAGCCCATG 1080
TCGTGGGGGG GCCACGTGCG TCCTGGGACC CCATGGGAAA AACTACACGT GCCGCTGCCC 1140
CCAAGGGTAC CAGCTGGACT CGAGTCAGCT GGACTGTGTG GACGTGGATG AATGCCAGGA 1200
CTCCCCCTGT GCCCAGGAGT GTGTCAACAC CCCTGGGGGC TTCCGCTGCG AATGCTGGGT 1260
TGGCTATGAG CCGGGCGGTC CTGGAGAGGG GGCCTGTCAG GATGTGGATG AGTGTGCTCT 1320
GGGTCGCTCG CCTTGCGCCC AGGGCTGCAC CAACACAGAT GGCTCATTTC ACTGCTCCTG 1380
TGAGGAGGGC TACGTCCTGG CCGGGGAGGA CGGGACTCAG TGCCAGGACG TGGATGAGTG 1440
TGTGGGCCCG GGGGGCCCCC TCTGCGACAG CTTGTGCTTC AACACACAAG GGTCCTTCCA 1500
CTGTGGCTGC CTGCCAGGCT GGGTGCTGGC CCCAAATGGG GTCTCTTGCA CCATGGGGCC 1560
TGTGTCTCTG GGACCACCAT CTGGGCCCCC CGATGAGGAG GACAAAGGAG AGAAAGAAGG 1620
GAGCACCGTG CCCCGCGCTG CAACAGCCAG TCCCACAAGG GGCCCCGAGG GCACCCCCAA 1680
GGCTACACCC ACCACAAGTA GACCTTCGCT GTCATCTGAC 
GCCCCCATCA CATCTGCCCC1740ACTCAAGATG CTGGCCCCCA GTGGGTCCTC AGGCGTCTGG AGGGAGCCCA GCATCCATCA1800CGCCACAGCT GCCTCTGGCC CCCAGGAGCC TGCAGGTGGG GACTCCTCCG TGGCCACACA1860AAACAACGAT GGCACTGACG GGCAAAAGCT GCTTTTATTC TACATCCTAG GCACCGTGGT1920GGCCATCCTA CTCCTGCTGG CCCTGGCTCT GGGGCTACTG GTCTATCGCA AGCGGAGAGC1980GAAGAGGGAG GAGAAGAAGG AGAAGAAGCC CCAGAATGCG GCAGACAGTT ACTCCTGGGT2040TCCAGAGCGA GCTGAGAGCA GGGCCATGGA GAACCAGTAC AGTCCGACAC CTGGGACAGA2100CTGCTGAAAG TGAGGTGGCC CTAGAGACAC TAGAGTCACC AGCCACCATC CTCAGAGCTT2160TGAACTCCCC ATTCCAAAGG GGCACCGACA TTTTTTTGAA AGACTGGACT GGAATCTTAG2220CAAACAATTG TAAGTCTCCT CCTTAAAGGC CCCTTGGAAC ATGCAGGTAT TTTCTACGGG2280TGTTTGATGT TCCTGAAGTG GAAGCTGTGT GTTGGCGTGC CACGGTGGGG ATTTCGTGAC2340TCTATAATGA TTGTTACTCC CCCTCCCTTT TCAAATTCCA ATGTGACCAA TTCCGGATCA2400GGGTGTGAGG AGGCTGGGGC TAAGGGGCTC CCCTGAATAT CTTCTCTGCT CACTTCCACC2460ATCTAAGAGG AAAAGGTGAG TTGCTCATGC TGATTAGGAT TGAAATGATT TGTTTCTCTT2520CCTAGGATGA AAACTAAATC AATTAATTAT TCAATTAGGT AAGAAGATCT GGTTTTTTGG2580TCAAAGGGAA CATGTTCGGA CTGGAAACAT TTCTTTACAT TTGCATTCCT CCATTTCGCC2640AGCACAAGTC TTGCTAAATG TGATACTGTT GACATCCTCC AGAATGGCCA GAAGTGCAAT2700TAACCTCTTA GGTGGCAAGG AGGCAGGAAG TGCCTCTTTA GTTCTTACAT TTCTAATAGC2760CTTGGGTTTA TTTGCAAAGG AAGCTTGAAA AATATGAGAA AAGTTGCTTG AAGTGCATTA2820CAGGTGTTTG TGAAGTCACA TAATCTACGG GGCTAGGGCG AGAGAGGCCA GGGATTTGTT2880CACAGATACT TGAATTAATT CATCCAAATG TACTGAGGTT ACCACACACT TGACTACGGA2940TGTGATCAAC ACTAACAAGG AAACAAATTC AAGGACAACC TGTCTTTGAG CCAGGGCAGG3000CCTCAGACAC CCTGCCTGTG GCCCCGCCTC CACTTCATCC TGCCCGGAAT GCCAGTGCTC3060CGAGCTCAGA CAGAGGAAGC CCTGCAGAAA GTTCCATCAG GCTGTTTGCT AAAGGATGTG3120TGAACGGGAG ATGATGCACT GTGTTTTGAA AGTTGTCATT TTAAAGCATT TTAGCACAGT3180TCATAGTCCA CAGTTGATGC AGCATCCTGA GATTTTAAAT CCTGAAGTGT GGGTGGCGCA3240CACACCAAGT AGGGAGCTAG TCAGGCAGTT TGCTTAAGGA ACTTTTGTTC TCTGTCTCTT3300TTCCTTAAAA TTGGGGGTAA GGAGGGAAGG AAGAGGGAAA GAGATGACTA ACTAAAATCA3360TTTTTACAGC AAAAACTGCT CAAAGCCATT TAAATTATAT CCTCATTTTA AAAGTTACAT3420TTGCAAATAT TTCTCCCTAT GATAATGCAG 
TCGATAGTGT GCACTCTTTC TCTCTCTCTC3480TCTCTCTCAC ACACACACAC ACACACACAC ACACACACAC AGAGACACGG CACCATTCTG3540CCTGGGGCAC TGGAACACAT TCCTGGGGGT CACCGATGGT CAGAGTCACT AGAAGTTACC3600TGAGTATCTC TGGGAGGCCT CATGTCTCCT GTGGGCTTTT TACCACCACT GTGCAGGAGA3660ACAGACAGAG GAAATGTGTC TCCCTCCAAG GCCCCAAAGC CTCAGAGAAA GGGTGTTTCT3720GGTTTTGCCT TAGCAATGCA TCGGTCTCTG AGGTGACACT CTGGAGTGGT TGAAGGGCCA3780CAAGGTGCAG GGTTAATACT CTTGCCAGTT TTGAAATATA GATGCTATGG TTCAGATTGT3840TTTTAATAGA AAACTAAAGG GGCAGGGGAA GTGAAAGGAA AGATGGAGGT TTTGTGCGGC3900TCGATGGGGC ATTTGGAACT TCTTTTTAAA GTCATCTCAT GGTCTCCAGT TTTCAGTTGG3960AACTCTGGTG TTTAACACTT AAGGGAGACA AAGGCTGTGT CCATTTGGCA AAACTTCCTT4020GGCCACGAGA CTCTAGGTGA TGTGTGAAGC TGGGCAGTCT GTGGTGTGGA GAGCAGCCAT4080CTGTCTGGCC ATTCAGAGGA TTCTAAAGAC ATGGCTGGAT GCGCTGCTGA CCAACATCAG4140CACTTAAATA AATGCAAATG CAACATTTCT CCCTCTGGGC CTTGAAAATC CTTGCCCTTA4200TCATTTGGGG TGAAGGAGAC ATTTCTGTCC TTGGCTTCCC ACAGCCCCAA CGCAGTCTGT4260GTATGATTCC TGGGATCCAA CGAGCCCTCC TATTTTCACA GTGTTCTGAT TGCTCTCACA4320GCCCAGGCCC ATCGTCTGTT CTCTGAATGC AGCCCTGTTC TCAACAACAG GGAGGTCATG4380GAACCCCTCT GTGGAACCCA CAAGGGGAGA AATGGGTGAT AAAGAATCCA GTTCCTCAAA4440ACCTTCCCTG GCAGGCTGGG TCCCTCTCCT GCTGGGTGGT GCTTTCTCTT GGACACCACT4500CCCACCACGG GGGGAGAGCC AGCAACCCAA CCAGACAGCT CAGGTTGTGC ATCTGATGGA4560AACCACTGGG CTCAAACACG TGCTTTATTC TCCTGTTTAT TTTTGCTGTT ACTTTGAAGC4620ATGGAAATTC TTGTTTGGGG GATCTTGGGG CTACAGTAGT GGGTAAACAA ATGCCCACCG4680GCCAAGAGGC CATTAACAAA TCGTCCTTGT CCTGAGGGGC CCCAGCTTGC TCGGGCGTGG4740CACAGTGGGG AATCCAAGGG TCACAGTATG GGGAGAGGTG CACCCTGCCA CCTGCTAACT4800TCTCGCTAGA CACAGTGTTT CTGCCCAGGT GACCTGTTCA GCAGCAGAAC AAGCCAGGGC4860CATGGGGACG GGGGAAGTTT TCACTTGGAG ATGGACACCA AGACAATGAA GATTTGTTGT4920CCAAATAGGT CAATAATTCT GGGAGACTCT TGGAAAAAAC TGAATATATT CAGGACCAAC4980TCTCTCCCTC CCCTCATCCC ACATCTCAAA GCAGACAATG TAAAGAGAGA ACATCTCACA5040CACCCAGCTC GCCATGCCTA CTCATTCCTG AATTTCAGGT GCCATCACTG CTCTTTCTTT5100CTTCTTTGTC ATTTGAGAAA GGATGCAGGA GGACAATTCC CACAGATAAT CTGAGGAATG5160CAGAAAAACC AGGGCAGGAC 
AGTTATCGAC AATGCATTAG AACTTGGTGA GCATCCTCTG5220TAGAGGGACT CCACCCCTGC TCAACAGCTT GGCTTCCAGG CAAGACCAAC CACATCTGGT5280CTCTGCCTTC GGTGGCCCAC ACACCTAAGC GTCATCGTCA TTGCCATAGC ATCATGATGC5340AACACATCTA CGTGTAGCAC TACGACGTTA TGTTTGGGTA ATGTGGGGAT GAACTGCATG5400AGGCTCTGAT TAAGGATGTG GGGAAGTGGG CTGCGGTCAC TGTCGGCCTT GCAAGGCCAC5460CTGGAGGCCT GTCTGTTAGC CAGTGGTGGA GGAGCAAGGC TTCAGGAAGG GCCAGCCACA5520TGCCATCTTC CCTGCGATCA GGCAAAAAAG TGGAATTAAA AAGTCAAACC TTTATATGCA5580TGTGTTATGT CCATTTTGCA GGATGAACTG AGTTTAAAAG AATTTTTTTT TCTCTTCAAG5640TTGCTTTGTC TTTTCCATCC TCATCACAAG CCCTTGTTTG AGTGTCTTAT CCCTGAGCAA5700TCTTTCGATG GATGGAGATG ATCATTAGGT ACTTTTGTTT CAACCTTTAT TCCTGTAAAT5760ATTTCTGTGA AAACTAGGAG AACAGAGATG AGATTTGACA AAAAAAAATT GAATTAAAAA5820TAACACAGTC TTTTTAAAAC TAACATAGGA AAGCCTTTCC TATTATTTCT CTTCTTAGCT5880TCTCCATTGT CTAAATCAGG AAAACAGGAA AACACAGCTT TCTAGCAGCT GCAAAATGGT5940TTAATGCCCC CTACATATTT CCATCACCTT GAACAATAGC TTTAGCTTGG GAATCTGAGA6000TATGATCCCA GAAAACATCT GTCTCTACTT CGGCTGCAAA ACCCATGGTT TAAATCTATA6060TGGTTTGTGC ATTTTCTCAA CTAAAAATAG AGATGATAAT CCGAATTCTC CATATATTCA6120CTAATCAAAG ACACTATTTT CATACTAGAT TCCTGAGACA AATACTCACT GAAGGGCTTG6180TTTAAAAATA AATTGTGTTT TGGTCTGTTC TTGTAGATAA TGCCCTTCTA TTTTAGGTAG6240AAGCTCTGGA ATCCCTTTAT TGTGCTGTTG CTCTTATCTG CAAGGTGGCA AGCAGTTCTT6300TTCAGCAGAT TTTGCCCACT ATTCCTCTGA GCTGAAGTTC TTTGCATAGA TTTGGCTTAA6360GCTTGAATTA GATCCCTGCA AAGGCTTGCT CTGTGATGTC AGATGTAATT GTAAATGTCA6420GTAATCACTT CATGAATGCT AAATGAGAAT GTAAGTATTT TTAAATGTGT GTATTTCAAA6480TTTGTTTGAC TAATTCTGGA ATTACAAGAT TTCTATGCAG GATTTACCTT CATCCTGTGC6540ATGTTTCCCA AACTGTGAGG AGGGAAGGCT CAGAGATCGA GCTTCTCCTC TGAGTTCTAA6600CAAAATGGTG CTTTGAGGGT CAGCCTTTAG GAAGGTGCAG CTTTGTTGTC CTTTGAGCTT6660TCTGTTATGT GCCTATCCTA ATAAACTCTT AAACACATTACJ3 DNA sequenceGene name: FLT1/vascular endothelial growth factor receptorUnigene number: Hs.138671Probeset Accession #: AA047437Nucleic Acid Accession #: NM_002019Coding sequence: 250-4266 (predicted start/stop codons underlined)GCGGACACTC CTCTCGGCTC 
CTCCCCGGCA GCGGCGGCGG CTCGGAGCGG GCTCCGGGGC60TCGGGTGCAG CGGCCAGCGG GCCTGGCGGC GAGGATTACC CGGGGAAGTG GTTGTCTCCT120GGCTGGAGCC GCGAGACGGG CGCTCAGGGC GCGGGGCCGG CGGCGGCGAA CGAGAGGACG180GACTCTGGCG GCCGGGTCGT TGGCCGGGGG AGCGCGGGCA CCGGGCGAGC AGGCCGCGTC240GCGCTCACCATGGTCAGCTA CTGGGACACC GGGGTCCTGC TGTGCGCGCT GCTCAGCTGT300CTGCTTCTCA CAGGATCTAG TTCAGGTTCA AAATTAAAAG ATCCTGAACT GAGTTTAAAA360GGCACCCAGC ACATCATGCA AGCAGGCCAG ACACTGCATC TCCAATGCAG GGGGGAAGCA420GCCCATAAAT GGTCTTTGCC TGAAATGGTG AGTAAGGAAA GCGAAAGGCT GAGCATAACT480AAATCTGCCT GTGGAAGAAA TGGCAAACAA TTCTGCAGTA CTTTAACCTT GAACACAGCT540CAAGCAAACC ACACTGGCTT CTACAGCTGC AAATATCTAG CTGTACCTAC TTCAAAGAAG600AAGGAAACAG AATCTGCAAT CTATATATTT ATTAGTGATA CAGGTAGACC TTTCGTAGAG660ATGTACAGTG AAATCCCCGA AATTATACAC ATGACTGAAG GAAGGGAGCT CGTCATTCCC720TGCCGGGTTA CGTCACCTAA CATCACTGTT ACTTTAAAAA AGTTTCCACT TGACACTTTG780ATCCCTGATG GAAAACGCAT AATCTGGGAC AGTAGAAAGG GCTTCATCAT ATCAAATGCA840ACGTACAAAG AAATAGGGCT TCTGACCTGT GAAGCAACAG TCAATGGGCA TTTGTATAAG900ACAAACTATC TCACACATCG ACAAACCAAT ACAATCATAG ATGTCCAAAT AAGCACACCA960CGCCCAGTCA AATTACTTAG AGGCCATACT CTTGTCCTCA ATTGTACTGC TACCACTCCC1020TTGAACACGA GAGTTCAAAT GACCTGGAGT TACCCTGATG AAAAAAATAA GAGAGCTTCC1080GTAAGGCGAC GAATTGACCA AAGCAATTCC CATGCCAACA TATTCTACAG TGTTCTTACT1140ATTGACAAAA TGCAGAACAA AGACAAAGGA CTTTATACTT GTCGTGTAAG GAGTGGACCA1200TCATTCAAAT CTGTTAACAC CTCAGTGCAT ATATATGATA AAGCATTCAT CACTGTGAAA1260CATCGAAAAC AGCAGGTGCT TGAAACCGTA GCTGGCAAGC GGTCTTACCG GCTCTCTATG1320AAAGTGAAGG CATTTCCCTC GCCGGAAGTT GTATGGTTAA AAGATGGGTT ACCTGCGACT1380GAGAAATCTG CTCGCTATTT GACTCGTGGC TACTCGTTAA TTATCAAGGA CGTAACTGAA1440GAGGATGCAG GGAATTATAC AATCTTGCTG AGCATAAAAC AGTCAAATGT GTTTAAAAAC1500CTCACTGCCA CTCTAATTGT CAATGTGAAA CCCCAGATTT ACGAAAAGGC CGTGTCATCG1560TTTCCAGACC CGGCTCTCTA CCCACTGGGC AGCAGACAAA TCCTGACTTG TACCGCATAT1620GGTATCCCTC AACCTACAAT CAAGTGGTTC TGGCACCCCT GTAACCATAA TCATTCCGAA1680GCAAGGTGTG ACTTTTGTTC CAATAATGAA GAGTCCTTTA TCCTGGATGC TGACAGCAAC1740ATGGGAAACA GAATTGAGAG CATCACTCAG 
CGCATGGCAA TAATAGAAGG AAAGAATAAG1800ATGGCTAGCA CCTTGGTTGT GGCTGACTCT AGAATTTCTG GAATCTACAT TTGCATAGCT1860TCCAATAAAG TTGGGACTGT GGGAAGAAAC ATAAGCTTTT ATATCACAGA TGTGCCAAAT1920GGGTTTCATG TTAACTTGGA AAAAATGCCG ACGGAAGGAG AGGACCTGAA ACTGTCTTGC1980ACAGTTAACA AGTTCTTATA CAGAGACGTT ACTTGGATTT TACTGCGGAC AGTTAATAAC2040AGAACAATGC ACTACAGTAT TAGCAAGCAA AAAATGGCCA TCACTAAGGA GCACTCCATC2100ACTCTTAATC TTACCATCAT GAATGTTTCC CTGCAAGATT CAGGCACCTA TGCCTGCAGA2160GCCAGGAATG TATACACAGG GGAAGAAATC CTCCAGAAGA AAGAAATTAC AATCAGAGAT2220CAGGAAGCAC CATACCTCCT GCGAAACCTC AGTGATCACA CAGTGGCCAT CAGCAGTTCC2280ACCACTTTAG ACTGTCATGC TAATGGTGTC CCCGAGCCTC AGATCACTTG GTTTAAAAAC2340AACCACAAAA TACAACAAGA GCCTGGAATT ATTTTAGGAC CAGGAAGCAG CACGCTGTTT2400ATTGAAAGAG TCACAGAAGA GGATGAAGGT GTCTATCACT GCAAAGCCAC CAACCAGAAG2460GGCTCTGTGG AAAGTTCAGC ATACCTCACT GTTCAAGGAA CCTCGGACAA GTCTAATCTG2520GAGCTGATCA CTCTAACATG CACCTGTGTG GCTGCGACTC TCTTCTGGCT CCTATTAACC2580CTCCTTATCC GAAAAATGAA AAGGTCTTCT TCTGAAATAA AGACTGACTA CCTATCAATT2640ATAATGGACC CAGATGAAGT TCCTTTGGAT GAGCAGTGTG AGCGGCTCCC TTATGATGCC2700AGCAAGTGGG AGTTTGCCCG GGAGAGACTT AAACTGGGCA AATCACTTGG AAGAGGGGCT2760TTTGGAAAAG TGGTTCAAGC ATCAGCATTT GGCATTAAGA AATCACCTAC GTGCCGGACT2820GTGGCTGTGA AAATGCTGAA AGAGGGGGCC ACGGCCAGCG AGTACAAAGC TCTGATGACT2880GAGCTAAAAA TCTTGACCCA CATTGGCCAC CATCTGAACG TGGTTAACCT GCTGGGAGCC2940TGCACCAAGC AAGGAGGGCC TCTGATGGTG ATTGTTGAAT ACTGCAAATA TGGAAATCTC3000TCCAACTACC TCAAGAGCAA ACGTGACTTA TTTTTTCTCA ACAAGGATGC AGCACTACAC3060ATGGAGCCTA AGAAAGAAAA AATGGAGCCA GGCCTGGAAC AAGGCAAGAA ACCAAGACTA3120GATAGCGTCA CCAGCAGCGA AAGCTTTGCG AGCTCCGGCT TTCAGGAAGA TAAAAGTCTG3180AGTGATGTTG AGGAAGAGGA GGATTCTGAC GGTTTCTACA AGGAGCCCAT CACTATGGAA3240GATCTGATTT CTTACAGTTT TCAAGTGGCC AGAGGCATGG AGTTCCTGTC TTCCAGAAAG3300TGCATTCATC GGGACCTGGC AGCGAGAAAC ATTCTTTTAT CTGAGAACAA CGTGGTGAAG3360ATTTGTGATT TTGGCCTTGC CCGGGATATT TATAAGAACC CCGATTATGT GAGAAAAGGA3420GATACTCGAC TTCCTCTGAA ATGGATGGCT CCCGAATCTA TCTTTGACAA AATCTACAGC3480ACCAAGAGCG ACGTGTGGTC 
TTACGGAGTA TTGCTGTGGG AAATCTTCTC CTTAGGTGGG3540TCTCCATACC CAGGAGTACA AATGGATGAG GACTTTTGCA GTCGCCTGAG GGAAGGCATG3600AGGATGAGAG CTCCTGAGTA CTCTACTCCT GAAATCTATC AGATCATGCT GGACTGCTGG3660CACAGAGACC CAAAAGAAAG GCCAAGATTT GCAGAACTTG TGGAAAAACT AGGTGATTTG3720CTTCAAGCAA ATGTACAACA GGATGGTAAA GACTACATCC CAATCAATGC CATACTGACA3780GGAAATAGTG GGTTTACATA CTCAACTCCT GCCTTCTCTG AGGACTTCTT CAAGGAAAGT3840ATTTCAGCTC CGAAGTTTAA TTCAGGAAGC TCTGATGATG TCAGATATGT AAATGCTTTC3900AAGTTCATGA GCCTGGAAAG AATCAAAACC TTTGAAGAAC TTTTACCGAA TGCCACCTCC3960ATGTTTGATG ACTACCAGGG CGACAGCAGC ACTCTGTTGG CCTCTCCCAT GCTGAAGCGC4020TTCACCTGGA CTGACAGCAA ACCCAAGGCC TCGCTCAAGA TTGACTTGAG AGTAACCAGT4080AAAAGTAAGG AGTCGGGGCT GTCTGATGTC AGCAGGCCCA GTTTCTGCCA TTCCAGCTGT4140GGGCACGTCA GCGAAGGCAA GCGCAGGTTC ACCTACGACC ACGCTGAGCT GGAAAGGAAA4200ATCGCGTGCT GCTCCCCGCC CCCAGACTAC AACTCGGTGG TCCTGTACTC CACCCCACCC4260ATCTAGAGTT TGACACGAAG CCTTATTTCT AGAAGCACAT GTGTATTTAT ACCCCCAGGA4320AACTAGCTTT TGCCAGTATT ATGCATATAT AAGTTTACAC CTTTATCTTT CCATGGGAGC4380CAGCTGCTTT TTGTGATTTT TTTAATAGTG CTTTTTTTTT TTGACTAACA AGAATGTAAC4440TCCAGATAGA GAAATAGTGA CAAGTGAAGA ACACTACTGC TAAATCCTCA TGTTACTCAG4500TGTTAGAGAA ATCCTTCCTA AACCCAATGA CTTCCCTGCT CCAACCCCCG CCACCTCAGG4560GCACGCAGGA CCAGTTTGAT TGAGGAGCTG CACTGATCAC CCAATGCATC ACGTACCCCA4620CTGGGCCAGC CCTGCAGCCC AAAACCCAGG GCAACAAGCC CGTTAGCCCC AGGGGATCAC4680TGGCTGGCCT GAGCAACATC TCGGGAGTCC TCTAGCAGGC CTAAGACATG TGAGGAGGAA4740AAGGAAAAAA AGCAAAAAGC AAGGGAGAAA AGAGAAACCG GGAGAAGGCA TGAGAAAGAA4800TTTGAGACGC ACCATGTGGG CACGGAGGGG GACGGGGCTC AGCAATGCCA TTTCAGTGGC4860TTCCCAGCTC TGACCCTTCT ACATTTGAGG GCCCAGCCAG GAGCAGATGG ACAGCGATGA4920GGGGACATTT TCTGGATTCT GGGAGGCAAG AAAAGGACAA ATATCTTTTT TGGAACTAAA4980GCAAATTTTA GACCTTTACC TATGGAAGTG GTTCTATGTC CATTCTCATT CGTGGCATGT5040TTTGATTTGT AGCACTGAGG GTGGCACTCA ACTCTGAGCC CATACTTTTG GCTCCTCTAG5100TAAGATGCAC TGAAAACTTA GCCAGAGTTA GGTTGTCTCC AGGCCATGAT GGCCTTACAC5160TGAAAATGTC ACATTCTATT TTGGGTATTA ATATATAGTC CAGACACTTA ACTCAATTTC5220TTGGTATTAT 
TCTGTTTTGC ACAGTTAGTT GTGAAAGAAA GCTGAGAAGA ATGAAAATGC5280AGTCCTGAGG AGAGTTTTCT CCATATCAAA ACGAGGGCTG ATGGAGGAAA AAGGTCAATA5340AGGTCAAGGG AAGACCCCGT CTCTATACCA ACCAAACCAA TTCACCAACA CAGTTGGGAC5400CCAAAACACA GGAAGTCAGT CACGTTTCCT TTTCATTTAA TGGGGATTCC ACTATCTCAC5460ACTAATCTGA AAGGATGTGG AAGAGCATTA GCTGGCGCAT ATTAAGCACT TTAAGCTCCT5520TGAGTAAAAA GGTGGTATGT AATTTATGCA AGGTATTTCT CCAGTTGGGA CTCAGGATAT5580TAGTTAATGA GCCATCACTA GAAGAAAAGC CCATTTTCAA CTGCTTTGAA ACTTGCCTGG5640GGTCTGAGCA TGATGGGAAT AGGGAGACAG GGTAGGAAAG GGCGCCTACT CTTCAGGGTC5700TAAAGATCAA GTGGGCCTTG GATCGCTAAG CTGGCTCTGT TTGATGCTAT TTATGCAAGT5760TAGGGTCTAT GTATTTAGGA TGCGCCTACT CTTCAGGGTC TAAAGATCAA GTGGGCCTTG5820GATCGCTAAG CTGGCTCTGT TTGATGCTAT TTATGCAAGT TAGGGTCTAT GTATTTAGGA5880TGTCTGCACC TTCTGCAGCC AGTCAGAAGC TGGAGAGGCA ACAGTGGATT GCTGCTTCTT5940GGGGAGAAGA GTATGCTTCC TTTTATCCAT GTAATTTAAC TGTAGAACCT GAGCTCTAAG6000TAACCGAAGA ATGTATGCCT CTGTTCTTAT GTGCCACATC CTTGTTTAAA GGCTCTCTGT6060ATGAAGAGAT GGGACCGTCA TCAGCACATT CCCTAGTGAG CCTACTGGCT CCTGGCAGCG6120GCTTTTGTGG AAGACTCACT AGCCAGAAGA GAGGAGTGGG ACAGTCCTCT CCACCAAGAT6180CTAAATCCAA ACAAAAGCAG GCTAGAGCCA GAAGAGAGGA CAAATCTTTG TTGTTCCTCT6240TCTTTACACA TACGCAAACC ACCTGTGACA GCTGGCAATT TTATAAATCA GGTAACTGGA6300AGGAGGTTAA ACTCAGAAAA AAGAAGACCT CAGTCAATTC TCTACTTTTT TTTTTTTTTT6360TCCAAATCAG ATAATAGCCC AGCAAATAGT GATAACAAAT AAAACCTTAG CTGTTCATGT6420CTTGATTTCA ATAATTAATT CTTAATCATT AAGAGACCAT AATAAATACT CCTTTTCAAG6480AGAAAAGCAA AACCATTAGA ATTGTTACTC AGCTCCTTCA AACTCAGGTT TGTAGCATAC6540ATGAGTCCAT CCATCAGTCA AAGAATGGTT CCATCTGGAG TCTTAATGTA GAAAGAAAAA6600TGGAGACTTG TAATAATGAG CTAGTTACAA AGTGCTTGTT CATTAAAATA GCACTGAAAA6660TTGAAACATG AATTAACTGA TAATATTCCA ATCATTTGCC ATTTATGACA AAAATGGTTG6720GCACTAACAA AGAACGAGCA CTTCCTTTCA GAGTTTCTGA GATAATGTAC GTGGAACAGT6780CTGGGTGGAA TGGGGCTGAA ACCATGTGCA AGTCTGTGTC TTGTCAGTCC AAGAAGTGAC6840ACCGAGATGT TAATTTTAGG GACCCGTGCC TTGTTTCCTA GCCCACAAGA ATGCAAACAT6900CAAACAGATA CTCGCTAGCC TCATTTAAAT TGATTAAAGG AGGAGTGCAT 
CTTTGGCCGA6960CAGTGGTGTA ACTGTGTGTG TGTGTGTGTG TGTGTGTGTG TGTGTGTGTG TGTGGGTGTG7020GGTGTATGTG TGTTTTGTGC ATAACTATTT AAGGAAACTG GAATTTTAAA GTTACTTTTA7080TACAAACCAA GAATATATGC TACAGATATA AGACAGACAT GGTTTGGTCC TATATTTCTA7140GTCATGATGA ATGTATTTTG TATACCATCT TCATATAATA TACTTAAAAA TATTTCTTAA7200TTGGGATTTG TAATCGTACC AACTTAATTG ATAAACTTGG CAACTGCTTT TATGTTCTGT7260CTCCTTCCAT AAATTTTTCA AAATACTAAT TCAACAAAGA AAAAGCTCTT TTTTTTCCTA7320AAATAAACTC AAATTTATCC TTGTTTAGAG CAGAGAAAAA TTAAGAAAAA CTTTGAAATG7380GTCTCAAAAA ATTGCTAAAT ATTTTCAATG GAAAACTAAA TGTTAGTTTA GCTGATTGTA7440TGGGGTTTTC GAACCTTTCA CTTTTTGTTT GTTTTACCTA TTTCACAACT GTGTAAATTG7500CCAATAATTC CTGTCCATGA AAATGCAAAT TATCCAGTGT AGATATATTT GACCATCACC7560CTATGGATAT TGGCTAGTTT TGCCTTTATT AAGCAAATTC ATTTCAGCCT GAATGTCTGC7620CTATATATTC TCTGCTCTTT GTATTCTCCT TTGAACCCGT TAAAACATCC TGTGGCACTCACJ9 DNA sequenceGene name: Purine nucleoside phosphorylaseUnigene number: Hs.75514Probeset Accession #: K02574Nucleic acid Accession #: X00737 clusterCoding sequence: 110-979 (predicted start/stop codons underlined)AACTGTGCGA ACCAGACCCG GCAGCCTTGC TCAGTTCAGC ATAGCGGAGC GGATCCGATC60GGATCGGAGC ACACCGGAGC AGGCTCATCG AGAAGGCGTC TGCGAGACCATGGAGAACGG120ATACACCTAT GAAGATTATA AGAACACTGC AGAATGGCTT CTGTCTCATA CTAAGCACCG180ACCTCAAGTT GCAATAATCT GTGGTTCTGG ATTAGGAGGT CTGACTGATA AATTAACTCA240GGCCCAGATC TTTGACTACA GTGAAATCCC CAACTTTCCT CGAAGTACAG TGCCAGGTCA300TGCTGGCCGA CTGGTGTTTG GGTTCCTGAA TGGCAGGGCC TGTGTGATGA TGCAGGGCAG360GTTCCACATG TATGAAGGGT ACCCACTCTG GAAGGTGACA TTCCCAGTGA GGGTTTTCCA420CCTTCTGGGT GTGGACACCC TGGTAGTCAC CAATGCAGCA GGAGGGCTGA ACCCCAAGTT480TGAGGTTGGA GATATCATGC TGATCCGTGA CCATATCAAC CTACCTGGTT TCAGTGGTCA540GAACCCTCTC AGAGGGCCCA ATGATGAAAG GTTTGGAGAT CGTTTCCCTG CCATGTCTGA600TGCCTACGAC CGGACTATGA GGCAGAGGGC TCTCAGTACC TGGAAACAAA TGGGGGAGCA660ACGTGAGCTA CAGGAAGGCA CCTATGTGAT GGTGGCAGGC CCCAGCTTTG AGACTGTGGC720AGAATGTCGT GTGCTGCAGA AGCTGGGAGC AGACGCTGTT GGCATGAGTA CAGTACCAGA780AGTTATCGTT GCACGGCACT GTGGACTTCG AGTCTTTGGC TTCTCACTCA 
TCACTAACAA840GGTCATCATG GATTATGAAA GCCTGGAGAA GGCCAACCAT GAAGAAGTCT TAGCAGCTGG900CAAACAAGCT GCACAGAAAT TGGAACAGTT TGTCTCCATT CTTATGGCCA GCATTCCACT960CCCTGACAAA GCCAGTTGAC CTGCCTTGGA GTCGTCTGGC ATCTCCCACA CAAGACCCAA1020GTAGCTGCTA CCTTCTTTGG CCCCTTGCTG GAGTCATGTG CCTCTGTCCT TAGGTTGTAG1080CAGAAAGGAA AAGATTCCTG TCCTTCACCT TTCCCACTTT CTTCTACCAG ACCCTTCTGG1140TGCCAGATCC TCTTCTCAAA GCTGGGATTA CAGGTGTGAG CATAGTGAGA CCTTGGCGCT1200ACAAAATAAA GCTGTTCTCA TTCCTGTTCT TTCTTACACA AGAGCTGGAG CCCGTGCCCT1260ACCACACATC TGTGGAGATG CCCAGGATTT GACTCGGGCC TTAGAACTTT GCATAGCAGC1320TGCTACTAGC TCTTTGAGAT AATACATTCC GAGGGGCTCA GTTCTGCCTT ATCTAAATCA1380CCAGAGACCA AACAAGGACT AATCCAATAC CTCTTGGAACK4 DNA sequenceGene name: ESTUnigene number: Hs.265499Probeset Accession #: R68763CAT cluster#: Cluster 46668_2Sequence: Both the EST corresponding to the probeset accession and exonprediction; number and the CAT cluster align with the Homo sapiens BAC cloneAC009414 RP11-490M8. Using FGENESH, 2 exons predicted on this BAC clone upstreamof the probeset.predicted exon 1: bases 5808-5837 of BAC clone AC009414AAAGTCTCGC CCAAACTTTG TTCGGCACAA CCAGCGCCGA GGGGGCGGCG CAGGCCAGGT60GGGAGGGGGC CCGCAGCGGG CGGCCGTACC TTCGCAAACG CCCGCTTCGT ACTCGGTGAG120GGAGTCGCCA TTGAGCGGGG GGCGGATGAC ACAACGCAGC CCCCGGTCGC AGGTTCCGTA180AATCCCGAAG GTGCCGCCGC AGCTCTCGTT CCTCTGGCTG GCGCACGTGT AGCAGCAGCC240GCAGACGCCC TGCACGATGC TCCCCGGGCA GTTCCTGGGC TCCTCGCACT TGGACTCGTC300ACAGGGCAGG CAGACCAGCG CCCGGGTGCC GGAGCGCGCC AGCAGCAGCA GCAGCCCCAG360CAGCGAGACC AGGAGGTGCC CGCAGCCGGC CAACCCCCTG TCCCCCGCCA CCAAGTACAT420CCTCCTGCGC CGCCGCCGCC TCCTCCTCGC AGCCGGGCCG GGAGCGGGGC GGGCGCCCTC480CCCTGCGCGG GGCACACGCG CCGCCGCCGC CGCACCAGCA GCCCGCGGTC CTCACCGCCC540CTCTCGGGGC CCCCGGGGCG CGCCTCCCCT CGCGGGGCGA GGCCCCCGCC CCTTCTGCGG600GCCGCGCCGA CCCCGAGCCC ACGAGCCTTG GCGCCGGCGG CAGCTTCCCC TCCTCCTCCT660CCTCCTCCTC CCGGGAGGGA GGGGGAAAAA AGAAAAAAGT TTCCTCCCGG CAGCTCCGGT720TCAACCCAAA CTTCTGGCGC GGCGGCGGCG GTGGCTGCTG CGCTCGGCTC CAGCCCGGGC780CGGCGGCGCC TCCTCCCTCT CCTCCTCCGA 
GTCGGCCGGC CCCGCAGCGG CGCAGCCTCC840GGGCCGGTCC CCGCCTCCCG AGCTGCCGAG TGGGCGCGGT GGCGCAGCAC AAGATCCGCG900GCGTCCGCTC CGCGCGCCCC GCTCGCCTCA CTCCTGCGCC GCTCCTCCGG GCGCTTGTTT960ATGGCTGGAG CCTCAGCCGC TCGGGCTGCG CCCTCCCCCA TCCTACCTCC TCCCCCAGAC1020CTTCCCCCCA CCCCCACGCG CCGCGCGCCG CTCATTGGCT GCCCCCCCTC CCCGGCCCGG1080CCGGCCCCCT CCGCCTCCCC CTCCCCCTCT CGGGCGGCCG GGCCCTTCCT CCCTCCCTCA1140CACGCCTCCA CCTCTTCCCG ATCTCCTCCT CCCCGAGCCC GGCGCACCGA GCCGGCCGTG1200CCACCGAGCT GCGGCTCTGG CCCCGGCGCC GCGGGTGCGC TGCGGATGGG CTTGGGGCGC1260ACCCAGCGAG CAGCGAGAGT CGCGGTGTCC CGGGCGCTCG CTGGCACCGT GGCCGCAGCG1320GCCGGCCTGG GAGCCAGGAG GGCGAGGCGG CTGCACCTTC GGGGCCAGAT TGGAGTTCGA1380AGAGTGGCGG GTACCCCAGA AGCTCGGGGC CGGGGCGATG GCTGCAGCCT CGGGAGGGTA1440TCGCCGGATC GAACTCCGGG AAAGGGAAGC AAAGGCATGG AACCTCCGCA CACTGGATGApredicted ACK4 gene seq (predicted start/stop codons underlined)ATGCCCCCGG AACAGCATCA TCAGCCCAAC AAAGTCTCGC CCAAACTTTG TTGGGCACAA60CCAGCGCCGA GGGGGCGGCG CAGGCCAGGT GGGAGGGGGC CCGCAGCGGG CGGCCGTACC120TTCGCAAACG CCCGCTTCGT ACTCGGTGAG GGAGTCGCCA TTGAGCGGGG GGCGGATGAC180ACAACGCAGC CCCCGGTCGC AGGTTCCGTA AATCCCGAAG GTGCCGCCGC AGCTCTCGTT240CCTCTGGCTG GCGCACGTGT AGCAGCAGCC GCAGACGCCC TGCACGATGC TCCCCGGGCA300GTTCCTGGGC TCCTCGCACT TGGACTCGTC ACAGGGCAGG CAGACCAGCG CCCGGGTGCC360GGAGCGCGCC AGCAGCAGCA GCAGCCCCAG CAGCGAGACC AGGAGGTGCC CGCAGCCGGC420CAACCCCCTG TCCCCCGCCA CCAAGTACAT CCTCCTGCGC CGCCGCCGCC TCCTCCTCGC480AGCCGGGCCG GGAGCGGGGC GGGCGCCCTC CCCTGCGCGG GGCACACGCG CCGCCGCCGC540CGCACCAGCA GCCCGCGGTC CTCACCGCCC CTCTCGGGGC CCCCGGGGCG CGCCTCCCCT600CGCGGGGCGA GGCCCCCGCC CCTTCTGCGG GCCGCGCCGA CCCCGAGCCC ACGAGCCTTG660GCGCCGGCGG CAGCTTCCCC TCCTCCTCCT CCTCCTCCTC CCGGGAGGGA GGGGGAAAAA720AGAAAAAAGT TTCCTCCCGG CAGCTCCGGT TCAACCCAAA CTTCTGGCGC GGCGGCGGCG780GTGGCTGCTG CGCTCGGCTC CAGCCCGGGC CGGCGGCGCC TCCTCCCTCT CCTCCTCCGA840GTCGGCCGGC CCCGCAGCGG CGCAGCCTCC GGGCCGGTCC CCGCCTCCCG AGCTGCCGAG900TGGGCGCGGT GGCGCAGCAC AAGATCCGCG GCGTCCGCTC CGCGCGCCCC GCTCGCCTCA960CTCCTGCGCC GCTCCTCCGG GCGCTTGTTT ATGGCTGGAG CCTCAGCCGC 
TCGGGCTGCG1020CCCTCCCCCA TCCTACCTCC TCCCCCAGAC CTTCCCCCCA CCCCCACGCG CCGCGCGCCG1080CTCATTGGCT GCCCCCCCTC CCCGGCCCGG CCGGCCCCCT CCGCCTCCCC CTCCCCCTCT1140CGGGCGGCCG GGCCCTTCCT CCCTCCCTCA CACGCCTCCA CCTCTTCCCG ATCTCCTCCT1200CCCCGAGCCC GGCGCACCGA GCCGGCCGTG CCACCGAGCT GCGGCTCTGG CCCCGGCGCC1260GCGGGTGCGC TGCGGATGGG CTTGGGGCGC ACCCAGCGAG CAGCGAGAGT CGCGGTGTCC1320CGGGCGCTCG CTGGCACCGT GGCCGCAGCG GCCGGCCTGG GAGCCAGGAG GGCGAGGCGG1380CTGCACCTTC GGGGCCAGAT TGGAGTTCGA AGAGTGGCGG GTACCCCAGA AGCTCGGGGC1440CGGGGCGATG GCTGCAGCCT CGGGAGGGTA TCGCCGGATC GAACTCCGGG AAAGGGAAGC1500AAAGGCATGG AACCTCCGCA CACTGGATGAAAA8 DNA sequenceGene name: ETL protein, with extended open reading frameUnigene number: Hs.57958Probeset Accession #: D58024Nucleotide Accession #: AF192403Coding sequence: 151-2136. Underlined sequences correspond to extended sequencenot included in AF192403.ATGAAAACAG CCGCACTCAC TCCGCCGCGC TCTCCGCCAC CGCCACCACT GCGGCCACCG60CCAATGAAAC GCCTCCCGCT CCTAGTGGTT TTTTCCACTT TGTTGAATTG TTCCTATACT120CAAAATTGCA CCAAGACACC TTGTCTCCCA AATGCAAAAT GTGAAATACG CAATGGAATT180GAAGCCTGCT ATTGCAACAT GGGATTTTCA GGAAATGGTG TCACAATTTG TGAAGATGAT240AATGAATGTG GAAATTTAAC TCAGTCCTGT GGCGAAAATG CTAATTGCAC TAACACAGAA300GGAAGTTATT ATTGTATGTG TGTACCTGGC TTCAGATCCA GCAGTAACCA AGACAGGTTT360ATCACTAATG ATGGAACCGT CTGTATAGAA AATGTGAATG CAAACTGCCA TTTAGATAAT420GTCTGTATAG CTGCAAATAT TAATAAAACT TTAACAAAAA TCAGATCCAT AAAAGAACCT480GTGGCTTTGC TACAAGAAGT CTATAGAAAT TCTGTGACAG ATCTTTCACC AACAGATATA540ATTACATATA TAGAAATATT AGCTGAATCA TCTTCATTAC TAGGTTACAA GAACAACACT600ATCTCAGCCA AGGACACCCT TTCTAACTCA ACTCTTACTG AATTTGTAAA AACCGTGAAT660AATTTTGTTC AAAGGGATAC ATTTGTAGTT TGGGACAAGT TATCTGTGAA TCATAGGAGA720ACACATCTTA CAAAACTCAT GCACACTGTT GAACAAGCTA CTTTAAGGAT ATCCCAGAGC780TTCCAAAAGA CCACAGAGTT TGATACAAAT TCAACGGATA TAGCTCTCAA AGTTTTCTTT840TTTGATTCAT ATAACATGAA ACATATTCAT CCTCATATGA ATATGGATGG AGACTACATA900AATATATTTC CAAAGAGAAA AGCTGCATAT GATTCAAATG GCAATGTTGC AGTTGCATTT960TTATATTATA AGAGTATTGG TCCTTTGCTT TCATCATCTG 
ACAACTTCTT ATTGAAACCT1020CAAAATTATG ATAATTCTGA AGAGGAGGAA AGAGTCATAT CTTCAGTAAT TTCAGTCTCA1080ATGAGCTCAA ACCCACCCAC ATTATATGAA CTTGAAAAAA TAACATTTAC ATTAAGTCAT1140CGAAAGGTCA CAGATAGGTA TAGGAGTCTA TGTGCATTTT GGAATTACTC ACCTGATACC1200ATGAATGGCA GCTGGTCTTC AGAGGGCTGT GAGCTGACAT ACTCAAATGA GACCCACACC1260TCATGCCGCT GTAATCACCT GACACATTTT GCAATTTTGA TGTCCTCTGG TCCTTCCATT1320GGTATTAAAG ATTATAATAT TCTTACAAGG ATCACTCAAC TAGGAATAAT TATTTCACTG1380ATTTGTCTTG CCATATGCAT TTTTACCTTC TGGTTCTTCA GTGAAATTCA AAGCACCAGG1440ACAACAATTC ACAAAAATCT TTGCTGTAGC CTATTTCTTG CTGAACTTGT TTTTCTTGTT1500GGGATCAATA CAAATACTAA TAAGCTCNTT TCTGTTTCAA TCATTGCCGG ACTGCTACAC1560TACTTCTTTT TAGCTGCTTT TGCATGGATG TGCATTGAAG GCATACATCT CTATCTCATT1620GTTGTGGGTG TCATCTACAA CAAGGGATTT TTGCACAAGA ATTTTTATAT CTTTGGCTAT1680CTAAGCCCAG CCGTGGTAGT TGGATTTTCG GCAGCACTAG GATACAGATA TTATGGCACA1740ACAAAAGTAT GTTGGCTTAG CACCGAAACA CACTTTATTT GGAGTTTTAT AGGACCAGCA1800TGCCTAATCA TTCTTGTTAA TCTCTTGGCT TTTGGAGTCA TCATATACAA AGTTTTTCGT1860CACACTGCAG GGTTGAAACC AGAAGTTAGT TGCTTTGAGA ACATAAGGTC TTGTGCAAGA1920GGAGCCCTCG CTCTTCTGTT CCTTCTCGGC ACCACCTGGA TCTTTGGGGT TCTCCATGTT1980GTGCACGCAT CAGTGGTTAC AGCTTACCTC TTCACAGTCA GCAATGCTTT CCAGGGGATG2040TTCATTTTTT TATTCCTGTG TGTTTTATCT AGAAAGATTC AAGAAGAATA TTACAGATTG2100TTCAAAAATG TCCCCTGTTG TTTTGGATGT TTAAGGTAAA CATAGAGAAT GGTGGATAAT2160TACAACTGCA CTAAAAATAA AAATTCCAAG CTGTGGATGA CCAATGTATA AAAATGACTC2220ATCAAATTAT CCAATTATTA ACTACTAGAC AAAAAGTATT TTAAATCAGT TTTTCTGTTT2280ATGCTATAGG AACTGTAGAT AATAAGGTAA AATTATGTAT CATATAGATA TACTATGTTT2340TTCTATGTGA AATAGTTCTG TCAAAAATAG TATTGCAGAT ATTTGGAAAG TAATTGGTTT2400CTCAGGAGTG ATATCACTGC ACCCAAGGAA AGATTTTCTT TCTAACACGA GAAGTATATG2460AATGTCCTGA AGGAAACCAC TGGCTTGATA TTTCTGTGAC TCGTGTTGCC TTTGAAACTA2520GTCCCCTACC ACCTCGGTAA TGAGCTCCAT TACAGAAAGT GGAACATAAG AGAATGAAGG2580GGCAGAATAT CAAACAGTGA AAAGGGAATG ATAAGATGTA TTTTGAATGA ACTGTTTTTT2640CTGTAGACTA GCTGAGAAAT TGTTGACATA AAATAAAGAA TTGAAGAAAC ACATTTTACC2700ATTTTGTGAA TTGTTCTGAA CTTAAATGTC 
CACTAAAACA ACTTAGACTT CTGTTTGCTA2760AATCTGTTTC TTTTTCTAAT ATTCTAAAAA AAAAAAAAAG GTTTMCCYCC CAAATTGAAA2820AAAAAAGGGA AAAAAAAATC TGTTTCTAAG GTTAGACTGA GATATATACT ATTTCCTTAC2880TTATTTCACA GATTGTGACT TTGGATAGTT AATCAGTAAA ATATAAATGT GTCGAAAC6 DNA sequenceGene name: Homo sapiens cDNA FLJ13465 fis, clone PLACE1003493, weakly similar toendothelial cell multimerin precursorUnigene number: Hs.134797Probeset Accession #: AA025351Nucleotide Accession #: AK023527Coding sequence: predicted 75-2921Extended sequence: 729-3465 (underlined sequence)AAGACAACGT CACTAGCAGT TTCTGGAGCT ACTTGCCAAG GCTGAGTGTG AGCTGAGCCT60GCCCCACCAC CAAGATGATC CTGAGCTTGC TGTTCAGCCT TGGGGGCCCC CTGGGCTGGG120GGCTGCTGGG GGCATGGGCC CAGGCTTCCA GTACTAGCCT CTCTGATCTG CAGAGCTCCA180GGACACCTGG GGTCTGGAAG GCAGAGGCTG AGGACACCAG CAAGGACCCC GTTGGACGTA240ACTGGTGCCC CTACCCAATG TCCAAGCTGG TCACCTTACT AGCTCTTTGC AAAACAGAGA300AATTCCTCAT CCACTCGCAG CAGCCGTGTC CGCAGGGAGC TCCAGACTGC CAGAAAGTCA360AAGTCATGTA CCGCATGGCC CACAAGCCAG TGTACCAGGT CAAGCAGAAG GTGCTGACCT420CTTTGGCCTG GAGGTGCTGC CCTGGCTACA CGGGCCCCAA CTGCGAGCAC CACGATTCCA480TGGCAATCCC TGAGCCTGCA GATCCTGGTG ACAGCCACCA GGAACCTCAG GATGGACCAG540TCAGCTTCAA ACCTGGCCAC CTTGCTGCAG TGATCAATGA GGTTGAGGTG CAACAGGAAC600AGCAGGAACA TCTGCTGGGA GATCTCCAGA ATGATGTGCA CCGGGTGGCA GACAGCCTGC660CAGGCCTGTG GAAAGCCCTG CCTGGTAACC TCACAGCTGC AGTGATGGAA GCAAATCAAA720CAGGGCACGA GTTCCCTGAT AGATCCTTGG AGCAGGTGCT GCTACCCCAC GTGGACACCT780TCCTACAAGT GCATTTCAGC CCCATCTGGA GGAGCTTTAA CCAAAGCCTG CACAGCCTTA840CCCAGGCCAT AAGAAACCTG TCTCTTGACG TGGAGGCCAA CCGCCAGGCC ATCTCCAGAG900TCCAGGACAG TGCCGTGGCC AGGGCTGACT TCCAGGAGCT TGGTGCCAAA TTTGAGGCCA960AGGTCCAGGA GAACACTCAG AGAGTGGGTC AGCTGCGACA GGACGTGGAG GACCGCCTGC1020ACGCCCAGCA CTTTACCCTG CACCGCTCGA TCTCAGAGCT CCAAGCCGAT GTGGACACCA1080AATTGAAGAG GCTGCACAAG GCTCAGGAGG CCCCAGGGAC CAATGGCAGT CTGGTGTTGG1140CAACGCCTGG GGCTGGGGCA AGGCCTGAGC CGGACAGCCT GCAGGCCAGG CTGGGCCAGC1200TGCAGAGGAA CCTCTCAGAG CTGCACATGA CCACGGCCCG CAGGGAGGAG GAGTTGCAGT1260ACACCCTGGA GGACATGAGG 
GCCACCCTGA CCCGGCACGT GGATGAGATC AAGGAACTGT1320ACTCCGAATC GGACGAGACT TTCGATCAGA TTAGCAAGGT GGAGCGGCAG GTGGAGGAGC1380TGCAGGTGAA CCACACGGCG CTCCGTGAGC TGCGCGTGAT CCTGATGGAG AAGTCTCTGA1440TCATGGAGGA GAACAAGGAG GAGGTGGAGC GGCAGCTCCT GGAGCTCAAC CTCACGCTGC1500AGCACCTGCA GGGTGGCCAT GCCGACCTCA TCAAGTACGT GAAGGACTGC AATTGCCAGA1560AGCTCTATTT AGACCTGGAC GTCATCCGGG AGGGCCAGAG GGACGCCACG CGTGCCCTGG1620AGGAGACCCA GGTGAGCCTG GACGAGCGGC GGCAGCTGGA CGGCTCCTCC CTGCAGGCCC1680TGCAGAACGC CGTGGACGCC GTGTCGCTGG CCGTGGACGC GCACAAAGCG GAGGGCGAGC1740GGGCGCGGGC GGCCACGTCG CGGCTCCGGA GCCAAGTGCA GGCGCTGGAT GACGAGGTGG1800GCGCGCTGAA GGCGGCCGCG GCCGAGGCCC GCCACGAGGT GCGCCAGCTG CACAGCGCCT1860TCGCCGCCCT GCTGGAGGAC GCGCTGCGGC ACGAGGCGGT GCTGGCCGCG CTCTTCGGGG1920AGGAGGTGCT GGAGGAGATG TCTGAGCAGA CGCCGGGACC GCTGCCCCTG AGCTACGAGC1980AGATCCGCGT GGCCCTGCAG GACGCCGCTA GCGGGCTGCA GGAGCAGGCG CTCGGCTGGG2040ACGAGCTGGC CGCCCGAGTG ACGGCCCTGG AGCAGGCCTC GGAGCCCCCG CGGCCGGCAG2100AGCACCTGGA GCCCAGCCAC GACGCGGGCC GCGAGGAGGC CGCCACCACC GCCCTGGCCG2160GGCTGGCGCG GGAGCTCCAG AGCCTGAGCA ACGACGTCAA GAATGTCGGG CGGTGCTGCG2220AGGCYGAGGC CGGGGCCGGG GCCGCCTCCC TCAACGCCTC CCTTGACGGC CTCCACAACG2280CACTCTTCGC CACTCAGCGC AGCTTGGAGC AGCACCAGCG GCTCTTCCAC AGCCTCTTTG2340GGAACTTCCA AGGGCTCATG GAAGCCAACG TCAGCCTGGA CCTGGGGAAG CTGCAGACCA2400TGCTGAGCAG GAAAGGGAAA AAGCAGCAGA AAGACCTGGA AGCTCCCCGG AAGAGGGACA2460AGAAGGAAGC GGAGCCTTTC GTGGACATAC GGGTCACAGG GCCTGTGCCA GGTGCCTTGG2520GCGCGGCGCT CTGGGAGGCA GRWTCCCCTG TGGCCTTCTA TGCCAGCTTT TCAGAAGGGA2580CGGCTGCCCT GCAGACAGTG AAGTTCAACA CCACATACAT CAACATTGGC AGCAGCTACT2640TCCCTGAACA TGGCTACTTC CGAGCCCCTG AGCGTGGTGT CTACCTGTTT GCAGTGAGCG2700TTGAATTTGG CCCAGGGCCA GGCACCGGGC AGCTGGTGTT TGGAGGTCAC CATCGGACTC2760CAGTCTGTAC CACTGGGCAG GGGAGTGGAA GCACAGCAAC GGTCTTTGCC ATGGCTGAGC2820TGCAGAAGGG TGAGCGAGTA TGGTTTGAGT TAACCCAGGG ATCAATAACA AAGAGAAGCC2880TGTCGGGCAC TGCATTTGGG GGCTTCCTGA TGTTTAAGAC CTGAACCCCA GCCCCAATCT2940GATCAGACAT CATGGACTCG CCCAGCTCTC CTCGGCCTGG GGCTCTGGCC AAGGATGGGC3000TGGAGGTCAT 
TCAGTTGGTC TGTCTCTTCC CTGGAAACCT TCTGCAAAGA TGGTGTGGTG3060TACGTGGCTT CCCTGTAACC ACATGGGGCT TGGCCATTTC TCCATGATGA GAAGGACTGG3120AATGCTTCTC CGGGCAGGAC ATGGTCCTAG GAAGCCTGAA CCTTGGCTTG GCATGCCTTC3180TCAGACAGCA CGGCCTGGGC TCCAACTCTT CACCACACCC TGTATTCTAC AACTTCTTTG3240GTGTTTTGCT CCTCCTGTGG TTGGAAACTT CTGTACAACA CTTTAAACTT TTCTCTTGCT3300TCCTCTTCTC TTCTCCCTTA TCGTATGATA GAAAGACATT CTTCCCCAGG AGGAATGTTT3360AAAATGGAGG CAACATTTTG GCCAACATTG GAAAGCACTA GAGGGCAATG GGATTAAACC3420AACCTGCTTG GTCTCTATTA GTCAGTAATG AAGACGACAG CCTGGCCAAC CAAGGGAAAC3480TCTGATGATT TTATAAGTTT GATAGTTCCT CCTGTGTTCA TTCTCCTTCC TGCCACCTTG3720TGAAGATGCC TTGGTTCCTC TTCACTGTCT GCCATGATTG TAAGTTTCCT GAGGCCTCCC3780CAGCCATGTG GAACAGTGAG TCAATTAAAC CTCTTTCCTT TATAAATT
ACH7 DNA sequence
Gene name: ESTs
Unigene number: Hs.3807
Probeset Accession #: AA292694
BAC Accession #: AL161751
FGENESH predicted exons: FGENESH predicts 2 exons on the minus strand of AL161751 upstream of the ACH7 probeset.
FGENESH predicted exon 1:
ATGGGCAAAG ACTTCATGAC TAAAACACCA AAAGCATTTG CAACAAAAGC CAAAATTGAC60AAATGGGATC TAATTAAACT AAAGAGCTTC TGCACAGCAA AAGAAACTAT CATCAGAGTG120AACAGTCAAC CTACAGACTG GCAGAAAACT TTTGCAATCT ATCCATCTGA CAAAGGGGTA180ATAGCCAGAA TCTACAAGGA GCTTGAACAA ATTTATAAGA AAAAAAAACC AACAAAAA
FGENESH predicted exon 2:
CGCTCCGCAC ACATTTCCTG TCGCGGCCTA AGGGAAACTG TTGGCCGCTG GGCCCGCGGG60GGGATTCTTG GCAGTTGGGG GGTCCGTCGG GAGCGAGGGC GGAGGGGAAG GGAGGGGGAA120CCGGGTTGGG GAAGCCAGCT GTAGAGGGCG GTGACCGCGC TCCAGACACA GCTCTGCGTC180CTCGAGCGGG ACAGATCCAA GTTGGGAGCA GCTCTGCGTG CGGGGCCTCA GAGAATGAGG240CCGGCGTTCG CCCTGTGCCT CCTCTGGCAG GCGCTCTGGC CCGGGCCGGG CGGCGGCGAA300CACCCCACTG CCGACCGTGC TGGCTGCTCG GCCTCGGGGG CCTGCTACAG CCTGCACCAC360GCTACCATGA AGCGGCAGGC GGCCGAGGAG GCCTGCATCC TGCGAGGTGG GGCGCTCAGC420ACCGTGCGTG CGGGCGCCGA GCTGCGCGCT GTGCTCGCGC TCCTGCGGGC AGGCCCAGGG480CCCGGAGGGG GCTCCAAAGA CCTGCTGTTC TGGGTCGCAC TGGAGCGCAG GCGTTCCCAC540TGCACCCTGG AGAACGAGCC TTTGCGGGGT TTCTCCTGGC TGTCCTCCGA CCCCGGCGGT600CTCGAAAGCG ACACGCTGCA GTGGGTGGAG GAGCCCCAAC 
GCTCCTGCAC CGCGCGGAGA660TGCGCGGTAC TCCAGGCCAC CGGTGGGGTC GAGCCCGCAG CTGGAAGGAG ATGCGATGCC720ACCTGCGCGC CAACGGCTAC CTGTGCAAGT ACCAGTTTGA GGTCTTGTGT CCTGCGCCGC780GCCCCGGGGC CGCCTCTAAC TTGAGCTATC GCGCGCCCTT CCAGCTGCAC AGCGCCGCTC840TGGACTTGAG TCCACCTGGG ACCGAGGTGA GTGCGCTCTG CCGGGGACAG CTCCCGATCT900CAGTTACTTG CATCGCGGAC GAAATCGGCG CTCGCTGGGA CAAACTCTCG GGCGATGTGT960TGTGTCCCTG CCCCGGGAGG TACCTCCGTG CTGGCAAATG CGCAGAGCTC CCTAACTGCC1020TAGACGACTT GGGAGGCTTT GCCTGCGAAT GTGCTACGGG CTTCGAGCTG GGGAAGGACG1080GCCGCTCTTG TGTGACCAGT GGGGAAGGAC AGCCGACCCT TGGGGGGACC GGGGTGCCCA1140CCAGGCGCCC GCCGGCCACT GCAACCAGCC CCGTGCCGCA GAGAACATGG CCAATCAGGG1200TCGACGAGAA GCTGGGAGAG ACACCACTTG TCCCTGAACA AGACAATTCA GTAACATCTA1260TTCCTGAGAT TCCTCGATGG GGATCACAGA GCACGATGTC TACCCTTCAA ATGTCCCTTC1320AAGCCGAGTC AAAGGCCACT ATCACCCCAT CAGGGAGCGT GATTTCCAAG TTTAATTCTA1380CGACTTCCTC TGCCACTCCT CAGGCTTTCG ACTCCTCCTC TGCCGTGGTC TTCATATTTG1440TGAGCACAGC AGTAGTAGTG TTGGTGATCT TGACCATGAC AGTACTGGGG CTTGTCAAGC1500TCTGCTTTCA CGAAAGCCCC TCTTCCCAGC CAAGGAAGGA GTCTATGGGC CCGCCGGGCC1560TGGAGAGTGA TCCTGAGCCC GCTGCTTTGG GCTCCAGTTC TGCACATTGC ACAAACAATG1620GGGTGAAAGT CGGGGACTGT GATCTGCGGG ACAGAGCAGA AGGTGCCTTG CTGGCGGAGT1680CCCCTCTTGG CTCTAGTGAT GCATAGACH7 predicted coding seq (predicted start/stop codons underlined)ATGGGCAAAG ACTTCATGAC TAAAACACCA AAAGCATTTG CAACAAAAGC CAAAATTGAC60AAATGGGATC TAATTAAACT AAAGAGCTTC TGCACAGCAA AAGAAACTAT CATCAGAGTG120AACAGTCAAC CTACAGACTG GCAGAAAACT TTTGCAATCT ATCCATCTGA CAAAGGGGTA180ATAGCCAGAA TCTACAAGGA GCTTGAACAA ATTTATAAGA AAAAAAAACC AACAAAAACG240CTCCGCACAC ATTTCCTGTC GCGGCCTAAG GGAAACTGTT GGCCGCTGGG CCCGCGGGGG300GATTCTTGGC AGTTGGGGGG TCCGTCGGGA GCGAGGGCGG AGGGGAAGGG AGGGGGAACC360GGGTTGGGGA AGCCAGCTGT AGAGGGCGGT GACCGCGCTC CAGACACAGC TCTGCGTCCT420CGAGCGGGAC AGATCCAAGT TGGGAGCAGC TCTGCGTGCG GGGCCTCAGA GAATGAGGCC480GGCGTTCGCC CTGTGCCTCC TCTGGCAGGC GCTCTGGCCC GGGCCGGGCG GCGGCGAACA540CCCCACTGCC GACCGTGCTG GCTGCTCGGC CTCGGGGGCC TGCTACAGCC TGCACCACGC600TACCATGAAG CGGCAGGCGG CCGAGGAGGC 
CTGCATCCTG CGAGGTGGGG CGCTCAGCAC660CGTGCGTGCG GGCGCCGAGC TGCGCGCTGT GCTCGCGCTC CTGCGGGCAG GCCCAGGGCC720CGGAGGGGGC TCCAAAGACC TGCTGTTCTG GGTCGCACTG GAGCGCAGGC GTTCCCACTG780CACCCTGGAG AACGAGCCTT TGCGGGGTTT CTCCTGGCTG TCCTCCGACC CCGGCGGTCT840CGAAAGCGAC ACGCTGCAGT GGGTGGAGGA GCCCCAACGC TCCTGCACCG CGCGGAGATG900CGCGGTACTC CAGGCCACCG GTGGGGTCGA GCCCGCAGCT GGAAGGAGAT GCGATGCCAC960CTGCGCGCCA ACGGCTACCT GTGCAAGTAC CAGTTTGAGG TCTTGTGTCC TGCGCCGCGC1020CCCGGGGCCG CCTCTAACTT GAGCTATCGC GCGCCCTTCC AGCTGCACAG CGCCGCTCTG1080GACTTCAGTC CACCTGGGAC CGAGGTGAGT GCGCTCTGCC GGGGACAGCT CCCGATCTCA1140GTTACTTGCA TCGCGGACGA AATCGGCGCT CGCTGGGACA AACTCTCGGG CGATGTGTTG1200TGTCCCTGCC CCGGGAGGTA CCTCCGTGCT GGCAAATGCG CAGAGCTCCC TAACTGCCTA1260GACGACTTGG GAGGCTTTGC CTGCGAATGT GCTACGGGCT TCGAGCTGGG GAAGGACGGC1320CGCTCTTGTG TGACCAGTGG GGAAGGACAG CCGACCCTTG GGGGGACCGG GGTGCCCACC1380AGGCGCCCGC CGGCCACTGC AACCAGCCCC GTGCCGCAGA GAACATGGCC AATCAGGGTC1440GACGAGAAGC TGGGAGAGAC ACCACTTGTC CCTGAACAAG ACAATTCAGT AACATCTATT1500CCTGAGATTC CTCGATGGGG ATCACAGAGC ACGATGTCTA CCCTTCAAAT GTCCCTTCAA1560GCCGAGTCAA AGGCCACTAT CACCCCATCA GGGAGCGTGA TTTCCAAGTT TAATTCTACG1620ACTTCCTCTG CCACTCCTCA GGCTTTCGAC TCCTCCTCTG CCGTGGTCTT CATATTTGTG1680AGCACAGCAG TAGTAGTGTT GGTGATCTTG ACCATGACAG TACTGGGGCT TGTCAAGCTC1740TGCTTTCACG AAAGCCCCTC TTCCCAGCCA AGGAAGGAGT CTATGGGCCC GCCGGGCCTG1800GAGAGTGATC CTGAGCCCGC TGCTTTGGGC TCCAGTTCTG CACATTGCAC AAACAATGGG1860GTGAAAGTCG GGGACTGTGA TCTGCGGGAC AGAGCAGAGG GTGCCTTGCT GGCGGAGTCC1920CCTCTTGGCT CTAGTGATGC ATAGAAD3 DNA sequenceGene name: ESTsUnigene number: Hs.17404Probeset Accession #: N39584Nucleic Acid Accession #: N39584Coding sequence: no identified ORF; possible frameshiftsAAATGGGATT GAGTTAAAAC TATTTTATTT TAAATATACA TTTTAAAGCA GTTCTTTTTT60TTTTTTTTTT TTTTATTATA CACACACTTC AAGAGAATAT GCACAGTCTA GGCCGGGCAC120GGTGGCTCAC GCCTGTAATC CCAGCACTTT GGGAGGCCGA GGCATGTGGA TCACCTGAGG180TCAGGAGTTT GAGACCAGCC TAGACAACAT GGTGAAACCT TGTCTCTATG AAAAATACAA240AATTTGCTGG GAGTGGTGGT GCATGCCTGT AATCCCAGCT 
ACTTGGAAGG CTGAGGCAGG300AGAATGTCTT GAACCTAGGA GGTGGAGGTT GCAGTGAGCT GAGATTGCAC CATTGCACTC360CAGCCTGTGC AACAAAAGTG AAACTCCATT TCAAGAAAAA AAAAAAAAAA AGAATATGCA420CAGTCTGAAT GTATACCAGG AGTGTGAGAG ACACATGCCC ACTTCATGCA ACTCCTAAAC480TCAAAGTCTA AATCAGATAT TTTTATTAAC AATGACAACT TGTTGCCAAC TCCCTGTTTC540TAATCACCAA AGACCCAGGG TACCTAAAAG GACTTTGCAA CCAAGCAAAG TCACTGTCTT600CAAATCTGGA TACACACTTT CCCCTCTGTA GATTCAAAAG GTGCTTCCTT CCCGGCTGTC660TCCAGCTTCC TTACTCTCTT TTCTGGGATT TCTTTTTCTT CTTTCTTTCT GGCTCTTCCT720CCACTGGCTG AACTGGGTCC CCTAACTGAA ACAGCCCCTG ACTTAGCCCA AGCATGCTTC780CTTTAGCTGC TGTGAGAATT TTGTCTTCCT CACCAGCCAG GTCCTCAAGG CAAAGTCCTC840AGCCAGTGCT TTAAGAGCAA CTTCCCGCAA ATCAGAAACT CACTGTGATT CCAAAAATGT900TTCTGAGCCC TGGACCCCTG CCCCCAAAAT ATTTTCATCT TTCCCCCAAA CCTCCTTTAA960AGGAGCATGC ATAACAGTGT GCTGAAAGAC AGTTGTTGGT TTTTTGATTT TAGCATATTA1020TTTCCTGTAT GAAATATGTT TTATATAATC TCCTATTATT TTTATCTTAT GTTTTGTATT1080GTTGATAAAT CCCTTTTTGT CCTTCTAAGA TGTTCTATTG TAAAATCACT TATAAGGTAT1140GATTACTCTT TATGCTATTA CTTTATATGC CATTTGGGTA ATAAATAGTA AATGGTTGAT1200GATATGATTG ACTGATGCGC AGTCCAGAGC ATGTATGAAT AATCTCATAA AACAGTATCA1260CAGACATTAA GCTAAACTGT TTCGTTTTTT TGAAAGAACA ACTCATACTT TGGAACAGTT1320GTCAATATTA ATTTGTTGCA AATATTTAAT TTAAATAAAC ATTTTTGTAC CATGAAAAAA1380AAD4 DNA sequenceGene name: ERGUnigene number: Hs.279477 / Hs.45514Probeset Accession #: R32894Nucleic Acid Accession #: M17254Coding sequence: 257-1645 (predicted start/stop codons underlined)GTCCGCGCGT GTCCGCGCCC GCGTGTGCCA GCGCGCGTGC CTTGGCCGTG CGCGCCGAGC60CGGGTCGCAC TAACTCCCTC GGCGCCGACG GCGGCGCTAA CCTCTCGGTT ATTCCAGGAT120CTTTGGAGAC CCGAGGAAAG CCGTGTTGAC CAAAAGCAAG ACAAATGACT CACAGAGAAA180AAAGATGGCA GAACCAAGGG CAACTAAAGC CGTCAGGTTC TGAACAGCTG GTAGATGGGC240TGGCTTACTG AAGGACATGA TTCAGACTGT CCCGGACCCA GCAGCTCATA TCAAGGAAGC300CTTATCAGTT GTGAGTGAGG ACCAGTCGTT GTTTGAGTGT GCCTACGGAA CGCCACACCT360GGCTAAGACA GAGATGACCG CGTCCTCCTC CAGCGACTAT GGACAGACTT CCAAGATGAG420CCCACGCGTC CCTCAGCAGG ATTGGCTGTC TCAACCCCCA GCCAGGGTCA 
CCATCAAAAT480GGAATGTAAC CCTAGCCAGG TGAATGGCTC AAGGAACTCT CCTGATGAAT GCAGTGTGGC540CAAAGGCGGG AAGATGGTGG GCAGCCCAGA CACCGTTGGG ATGAACTACG GCAGCTACAT600GGAGGAGAAG CACATGCCAC CCCCAAACAT GACCACGAAC GAGCGCAGAG TTATCGTGCC660AGCAGATCCT ACGCTATGGA GTACAGACCA TGTGCGGCAG TGGCTGGAGT GGGCGGTGAA720AGAATATGGC CTTCCAGACG TCAACATCTT GTTATTCCAG AACATCGATG GGAAGGAACT780GTGCAAGATG ACCAAGGACG ACTTCCAGAG GCTCACCCCC AGCTACAACG CCGACATCCT840TCTCTCACAT CTCCACTACC TCAGAGAGAC TCCTCTTCCA CATTTGACTT CAGATGATGT900TGATAAAGCC TTACAAAACT CTCCACGGTT AATGCATGCT AGAAACACAG ATTTACCATA960TGAGCCCCCC AGGAGATCAG CCTGGACCGG TCACGGCCAC CCCACGCCCC AGTCGAAAGC1020TGCTCAACCA TCTCCTTCCA CAGTGCCCAA AACTGAAGAC CAGCGTCCTC AGTTAGATCC1080TTATCAGATT CTTGGACCAA CAAGTAGCCG CCTTGCAAAT CCAGGCAGTG GCCAGATCCA1140GCTTTGGCAG TTCCTCCTGG AGCTCCTGTC GGACAGCTCC AACTCCAGCT GCATCACCTG1200GGAAGGCACC AACGGGGAGT TCAAGATGAC GGATCCCGAC GAGGTGGCCC GGCGCTGGGG1260AGAGCGGAAG AGCAAACCCA ACATGAACTA CGATAAGCTC AGCCGCGCCC TCCGTTACTA1320CTATGACAAG AACATCATGA CCAAGGTCCA TGGGAAGCGC TACGCCTACA AGTTCGACTT1380CCACGGGATC GCCCAGGCCC TCCAGCCCCA CCCCCCGGAG TCATCTCTGT ACAAGTACCC1440CTCAGACCTC CCGTACATGG GCTCCTATCA CGCCCACCCA CAGAAGATGA ACTTTGTGGC1500GCCCCACCCT CCAGCCCTCC CCGTGACATC TTCCAGTTTT TTTGCTGCCC CAAACCCATA1560CTGGAATTCA CCAACTGGGG GTATATACCC CAACACTAGG CTCCCCACCA GCCATATGCC1620TTCTCATCTG GGCACTTACT ACTAAAGACC TGGCGGAGGC TTTTCCCATC AGCGTGCATT1680CACCAGCCCA TCGCCACAAA CTCTATCGGA GAACATGAAT CAAAAGTGCC TCAAGAGGAA1740TGAAAAAAGC TTTACTGGGG CTGGGGAAGG AAGCCGGGGA AGAGATCCAA AGACTCTTGG1800GAGGGAGTTA CTGAAGTCTT ACTACAGAAA TGAGGAGGAT GCTAAAAATG TCACGAATAT1860GGACATATCA TCTGTGGACT GACCTTGTAA AAGACAGTGT ATGTAGAAGC ATGAAGTCTT1920AAGGACAAAG TGCCAAAGAA AGTGGTCTTA AGAAATGTAT AAACTTTAGA GTAGAGTTTG1980AATCCCACTA ATGCAAACTG GGATGAAACT AAAGCAATAG AAACAACACA GTTTTGACCT2040AACATACCGT TTATAATGCC ATTTTAAGGA AAACTACCTG TATTTAAAAA TAGTTTCATA2100TCAAAAACAA GAGAAAAGAC ACGAGAGAGA CTGTGGCCCA TCAACAGACG TTGATATGCA2160ACTGCATGGC ATGTGCTGTT TTGGTTGAAA TCAAATACAT TCCGTTTGAT 
GGACAGCTGT2220CAGCTTTCTC AAACTGTGAA GATGACCCAA AGTTTCCAAC TCCTTTACAG TATTACCGGG2280ACTATGAACT AAAAGGTGGG ACTGAGGATG TGTATAGAGT GAGCGTGTGA TTGTAGACAG2340AGGGGTGAAG AAGGAGGAGG AAGAGGCAGA GAAGGAGGAG ACCAGGCTGG GAAAGAAACT2400TCTCAAGCAA TGAAGACTGG ACTGAGGACA TTTGGGGACT GTGTACAATG AGTTATGGAG2460ACTCGAGGGT TCATGCAGTC AGTGTTATAC CAAACCCAGT GTTAGGAGAA AGGACACAGC2520GTAATGGAGA AAGGGAAGTA GTAGAATTCA GAAACAAAAA TGCGCATCTC TTTCTTTGTT2580TGTCAAATGA AAATTTTAAC TGGAATTGTC TGATATTTAA GAGAAACATT CAGGACCTCA2640TCATTATGTG GGGGCTTTGT TCTCCACAGG GTCAGGTAAG AGATGGCCTT CTTGGCTGCC2700ACAATCAGAA ATCACGCAGG CATTTTGGGT AGGCGGCCTC CAGTTTTCCT TTGAGTCGCG2760AACGCTGTGC GTTTGTCAGA ATGAAGTATA CAAGTCAATG TTTTTCCCCC TTTTTATATA2820ATAATTATAT AACTTATGCA TTTATACACT ACGAGTTGAT CTCGGCCAGC CAAAGACACA2880CGACAAAAGA GACAATCGAT ATAATGTGGC CTTGAATTTT AACTCTGTAT GCTTAATGTT2940TACAATATGA AGTTATTAGT TCTTAGAATG GAGAATGTAT GTAATAAAAT AAGCTTGGCC3000TAGCATGGCA AATCAGATTT ATACAGGAGT CTGCATTTGC ACTTTTTTTA GTGACTAAAG3060TTGCTTAATG AAAACATGTG CTGAATGTTG TGGATTTTGT GTTATAATTT ACTTTGTCCA3120GGAACTTGTG CAAGGGAGAG CCAAGGAAAT AGGATGTTTG GCACCC
AAD5 DNA sequence
Gene name: activin A receptor type II-like 1 (ALK-1)
Unigene number: Hs.8881 / Hs.172670
Probeset Accession #: T57112
Nucleic Acid Accession #: NM_000020
Coding sequence: 283-1794 (predicted start/stop codons underlined)
AGGAAACGGT TTATTAGGAG GGAGTGGTGG AGCTGGGCCA GGCAGGAAGA CGCTGGAATA60AGAAACATTT TTGCTCCAGC CCCCATCCCA GTCCCGGGAG GCTGCCGCGC CAGCTGCGCC120GAGCGAGCCC CTCCCCGGCT CCAGCCCGGT CCGGGGCCGC GCCGGACCCC AGCCCGCCGT180CCAGCGCTGG CGGTGCAACT GCGGCCGCGC GGTGGAGGGG AGGTGGCCCC GGTCCGCCGA240AGGCTAGCGC CCCGCCACCC GCAGAGCGGG CCCAGAGGGA CCATGACCTT GGGCTCCCCC300AGGAAAGGCC TTCTGATGCT GCTGATGGCC TTGGTGACCC AGGGAGACCC TGTGAAGCCG360TCTCGGGGCC CGCTGGTGAC CTGCACGTGT GAGAGCCCAC ATTGCAAGGG GCCTACCTGC420CGGGGGGCCT GGTGCACAGT AGTGCTGGTG CGGGAGGAGG GGAGGCACCC CCAGGAACAT480CGGGGCTGCG GGAACTTGCA CAGGGAGCTC TGCAGGGGGC GCCCCACCGA GTTCGTCAAC540CACTACTGCT GCGACAGCCA CCTCTGCAAC CACAACGTGT CCCTGGTGCT 
GGAGGCCACC600CAACCTCCTT CGGAGCAGCC GGGAACAGAT GGCCAGCTGG CCCTGATCCT GGGCCCCGTG660CTGGCCTTGC TGGCCCTGGT GGCCCTGGGT GTCCTGGGCC TGTGGCATGT CCGACGGAGG720CAGGAGAAGC AGCGTGGCCT GCACAGCGAG CTGGGAGAGT CCAGTCTCAT CCTGAAAGCA780TCTGAGCAGG GCGACACGAT GTTGGGGGAC CTCCTGGACA GTGACTGCAC CACAGGGAGT840GGCTCAGGGC TCCCCTTCCT GGTGCAGAGG ACAGTGGCAC GGCAGGTTGC CTTGGTGGAG900TGTGTGGGAA AAGGCCGCTA TGGCGAAGTG TGGCGGGGCT TGTGGCACGG TGAGAGTGTG960GCCGTCAAGA TCTTCTCCTC GAGGGATGAA CAGTCCTGGT TCCGGGAGAC TGAGATCTAT1020AACACAGTAT TGCTCAGACA CGACAACATC CTAGGCTTCA TCGCCTCAGA CATGACCTCC1080CGCAACTCGA GCACGCAGCT GTGGCTCATC ACGCACTACC ACGAGCACGG CTCCCTCTAC1140GACTTTCTGC AGAGACAGAC GCTGGAGCCC CATCTGGCTC TGAGGCTAGC TGTGTCCGCG1200GCATGCGGCC TGGCGCACCT GCACGTGGAG ATCTTCGGTA CACAGGGCAA ACCAGCCATT1260GCCCACCGCG ACTTCAAGAG CCGCAATGTG CTGGTCAAGA GCAACCTGCA GTGTTGCATC1320GCCGACCTGG GCCTGGCTGT GATGCACTCA CAGGGCAGCG ATTACCTGGA CATCGGCAAC1380AACCCGAGAG TGGGCACCAA GCGGTACATG GCACCCGAGG TGCTGGACGA GCAGATCCGC1440ACGGACTGCT TTGAGTCCTA CAAGTGGACT GACATCTGGG CCTTTGGCCT GGTGCTGTGG1500GAGATTGCCC GCCGGACCAT CGTGAATGGC ATCGTGGAGG ACTATAGACC ACCCTTCTAT1560GATGTGGTGC CCAATGACCC CAGCTTTGAG GACATGAAGA AGGTGGTGTG TGTGGATCAG1620CAGACCCCCA CCATCCCTAA CCGGCTGGCT GCAGACCCGG TCCTCTCAGG CCTAGCTCAG1680ATGATGCGGG AGTGCTGGTA CCCAAACCCC TCTGCCCGAC TCACCGCGCT GCGGATCAAG1740AAGACACTAC AAAAAATTAG CAACAGTCCA GAGAAGCCTA AAGTGATTCA ATAGCCCAGG1800AGCACCTGAT TCCTTTCTGC CTGCAGGGGG CTGGGGGGGT GGGGGGCAGT GGATGGTGCC1860CTATCTGGGT AGAGGTAGTG TGAGTGTGGT GTGTGCTGGG GATGGGCAGC TGCGCCTGCC1920TGCTCGGCCC CCAGCCCACC CAGCCAAAAA TACAGCTGGG CTGAAACCTG ATCCCCTGCT1980GTCTGGCCTG CTCAAAGCGG CAGGCTCCCT GACGCCTGGC TCTCTCCCCA CCCCTATGGC2040CAGCATGGTG CACCCCCTAC CACTCCCGGG ACAGGATGCA AAAGAGGCTC CAGAGTCAGA2100GTGCCAAGCC AGGGAATCCC AGTCCCAGAC TCAGAGCCCG GGCCTGCACT TTGCCCCCTG2160CCCTTGATCA ACCCCACTGC CCCACCAGAG CTGCCAGGGT GGCACAGGGC CCTGTCCAGC2220CCCTGGCACA CACTTCCCTG CCAGGCCTCA GCCTCTAGCA TAAGCTCCAG AGAGCCAGGG2280CCCATCAGTT TCTCTCTGTG GATTTGTATC TCAGCTCCAT GATGCCTTGG 
GCTTTCTGTC2340TCCTCAACAA GAGTGCAGCT TGCTGAATGT CAGCTGCCTG AGAGAGCTGG GGCCTGACTT2400ACTAGGGCAT TAAATCCTAA GAGGTCCTAC TGAGGTGTGG CAGGATCACA GGCCAGTGGA2460AAAAGGGCAG GTCAGATGGG CAAGGCCCAG GACTTTCAGA TTAACTGAGA GGATATCGAG2520GCCAAGCATG GCAGGGGGAA GGTCAGTGGG TGTCAAGAGA CCCAGGTCTG ACCCCGGATG2580TTTGCTCCAT GTGACAAAAG CAGGCCTGTC TCAGGACCTT TTCTTTTCTT TTTTCCTTCT2640TTTTTTTTTT GACACGGAGT TTCGCTCTTG TTGTCCAGGC TAGAGTGCAA TGGCATGATC2700CCAGCTCACC GCAACGTCTA CCTCCCAGGT TCAAATCATT CTCTTGCCTC AGACTCCCGA2760GTAGCTGGGA TTACAGGCAC ATGCCACCAT GCCTGGCTAA TTTTGTATAT TTAGTAGAAA2820CAGGGTTTCA CCATGCTGGC CATGCTGGTT CTCGAACTCC TGACCTCAGG TGTTCCACCT2880ACCTCAGCCT CCCAAAGTGC TGGGGTTACA GGTGTGAGCC ATCGCGCCTG GCCAGGACCT2940TTGTTTCTTA TCTACATATT GGAAGATTTG GTCCTGATGT CCTTTGAGGC TTCTTTAGCT3000CTAGTTCTCT GACACTTCAG CCTATATCAC AGCTAACTTC YTCAGTCTCA TCTATTCCTT3060ATGCTCCAGC CCCTGGCAAT TTGCCTCAAG ATGGGGGTTT GAAAATAACT TTACCTGACT3120CAAGGAGTGT CTGGAGCACC TCCTAGTCTA AGTCTGCAAG CTCCAGTTCT TGCCTAAAAC3180CATGCCAGTG GCCACCCTTG GGCTCAGACA GCTCTGGGCC TTTTGACCAC AAGCCAGCCC3240CTCGCCCTCT CTGTGGCATA GTCTTCTCTG CCCCAGGACT GCAGGGCGGC TTCCTCCAAG3300GCTTCCAAGG CTCAAAAGAA ATTTGGCTCC ATCCAAGAAG GCTCCAGCTC CCCTACTGGC3360CCCTGGCTTC AGGCCCACAC CCCTGGGCCA GGSCCAGAGA GTGTGTCTCA GGAGAATTCA3420ATGGGCTCTA GAGAGACACA CAGAAAGTTT GGGCATTTGG GAAATTTTCA AGGRTGTATG3480TATGGYTCAC GTATGGWGCA GGTTGTCCTG GTCCYKGGGT GCAGGGAAGT GGGCTGCAGG3540GAAGTGGATT GGAGGGGAGC TTGAGGAATA TAAGGAGCGG GGGTGGAGAC TCAGGCTATG3600GACAAGGACA GCCCCAAGGT TGGGAAGACC TGGCCTTAGT CGTCCTCAGC CTAGGGCAGG3660GCAGTGAAGA AAGCTCTCCC CGCTCCTGCT GTAATGACCC AGAGTAGCCT CCCCAGGCCG3720GCATCTTATG TGTGTCTTCC ACCATCCTCA TGGTGGCACT TTTCTAGGCC TGTCTCCCAG3780CATTGTGCAA GGCTCGGAAG AGAACCACCA AGTGAAACTG GGTGAAAACA GAAAGCTCAA3840TGGATGGGCT AGGTTCCCAG ATCATTAGGG CAGAGTTTGC ACGTCCTCTG GTTCACTGGG3900AATCCACCCA GCCCACGAAT CATCTCCCTC TTTGAAGGAT TTTWATTTCT ACTGGGTTTT3960GGAACAAACT CCTGCTGAGA CCCCACAGCC AGAAACTGAA AGCAGCAGCT CCCCAAAGCC4020TGGAAAATCC CTAAGAGAAG GCCTGGGGGA MAGGAAKTGG 
AGTGACAGGG GACAGGTAGA4080GAGAAGGGGG CCCAATGGCC AGGGAGTGAA GGAGGTGGCG TTGCTGAGAG CAGTCTGCAC4140ATGCTTCTGT CTGAGTGCAG GAAGGTGTTC CAGGGTCGAA ATTACACTTC TCGTACCTGG4200AGACGCTGTT TGTGGGAGCA CTGGGCTCAT GCCTGGCACA CAATAGGTCT GCAATAAACC4260ATGGTTAAAT CCTGAAAAAA AAAAAAAAAA
AAD8 DNA sequence
Gene name: ESTs
Unigene number: Hs.144953
Probeset Accession #: AA404418
Nucleic Acid Accession #: n/a
Coding sequence: no ORF identified; possible frameshifts
TATGTCCACC AAAGACACCT CGTTGGTCAT GTTCTATCAC CTCTTCGTCA AATTGACATC60AGGTCCTAAC AGGTCACTTT CAAGATACAG AAGAGGCAAA TTTTGTTTTG AGACTTGGCC120ATTCCTAGGG TCAGCAAAGT GTATTCCTGG CAGCCAGACC TTCAGTCACT TATCAGGAAA180TGCTTGACCT AAAGACAGAC AATTCTTTCC CCAAACTTTG CTGTTTCTTT TTTGAGTCTT240TGTTGAAAGA TTTCTTTTAA AAGGCGTTCG TGTGAGAAGA TCACAGCAAC AAATCTGGCT300TGTTCTGTTT TAGACTTACT TTCTTAACTC TTGGGCAGAA GAAAATGAAT GAGATTTGAA360GACCTTTGAT ACCTTGGGTA GACAAAGCTT GCCTTGAAAC TAGAAATAAG ACGAAACTAG420ATTTTAAGGG GAAAAAATTT GCTAGTGGTA ATATAATTGG TTTTGTTTCA TTTTTTTATG480AGTCTGAGGA GTTGACATTA AACGTTGGGA TGTTGCTTTG TTAATGAAGT CATTTCAATT540TTTGCAACTC TTAACATCTG CATGCTTCCA TAAACAGTGG GTTGGAACAA AAGAAAATGT600GACTAAGGGA TATTCCTTAA ATTCTTTTTT ATGTTATGAG AGAGAATATT GGAATATAAA660GAATGTTACT TTATCTGGTA AACCATCTCA TAGGCCAGAA GCACTAACAG TTTGAATGGT720TGGCTTAAAA AAAAACGGGA GTCTTTGAAT TTAAGCTTAT GTAAAATTAC TATGCAAATA780TAGGTTATTA TTTATTTTTA CAGTGAAAAT AAAACACTAT TGAAGTATAA ATGGAAAGAA840AATAAAAGCA AAGCCTGTTT AATATAGAGA CATTAATGTT GATATCACTG TACGAACAGT900CATAGCTTGC TGCTCACTGC CGTTAAAGGG TTGACATACA AACATTGTGG AAGAGATTTC960AGTTTGAGGG CTAGTGTCTG AATTATGGAC TCCTTACCCT ACTCCACCAC TTAAAACATT1020TTAGAGACTT TTGTGAAATT AACAGGTCAT ATAATTAATA ATTGTTGTTT TATGTACATT1080TATTGAAAGG CCATATTGAG GCTCCATTGA TTTTTTTTCC TGCATATTTA TCAGTATCGA1140ATTAGAAAAT TGAACCTTCA GTGTTACTAG ATGGAAATCT ACCAAAAAGT AGCAAGGTTT1200ACGAATGGTG GGATTTATTG GTGATTAAAC ATTTTTTTCC TGTATTTTAT AAGTTTCACA1260TTACATTTAC AATGAGAAAA AAATGTAAAT GTAGAATTAA AGTCTTGTTA ATATCGTAAT1320TTGCCTATTG CTGTACTAAA AGAAGCTTCT ATAAAATGTA TCATTCTCAT 
CCTTAGATTC1380AGGCCAGAAA GTAACTTTCA GTGTTAGGTA TTTGAAATAA TGCAGCCTGT CATATGTACT1440CTGGTTACCA GAATGAAAAA ACAAAAAGAG ATACATACAT AGTAAGGAAA CATGAAATTG1500GAGGAATTGA TCCCCATGTG TATTGCAGCT TCATATACCA GTAGTCTCTA ATAAGTCATT1560GCTTTAATAA AAAAAAAAAT AGAAAATTTA AAACA2 DNA sequenceGene name: ESTUnigene number: Hs.16450Probeset Accession #: AA478778Nucleic Acid Accession #: AA478778Coding sequence: no ORF identified; possible frameshiftsTATTTTTGTA CGTAAAATGA TTCTATTATG ACTGCCTTTG CATGTAGTAA TATGACAAAG60TGATCCTTCA TTATCACGGT ACACTATTGT TTACTTTTCA TCTGTAAATG TTTTATTGTT120ACTTTTTTAA AATGAATTTT TTTAAAACAA TCTAGCCATC ATCAAGGTGC TATAAGAGTT180GTATAAAAGA TATTTTTGGC ATTTCTAGGC AAGTATCAGC CAATAAGTAT GTTAGTGATA240TCACAGATTG TACCAACTAT TAACTATGTT AAATAAGTAT TCAGTTTCAT GTGATCTCTG300GGAAAAAAAT ATGCTGCCTT GGTGCTAATA TTGTATGTAT TTAAATGATC ATCTGACTCA360GAAATATAAA CACTTTTAAT GAAAGGGAGG AACGGAAGGA CAATTTCCAG TGCACAGAAT420CACTTGGATG AAATAAGACC AGCTCTTTAC CCTTATTTTT GGATATGCCT TTTTTGGAAG480AGACTTAGAC TTTATCCTTA TTGTTGTTAG TGTTGTTAAT ATTCGTTGCT TCAGCCCACG540GTGCCTTGGT CTCTCCACAA TCAAATGGAG GATCCCCCAA GCAGCTTCAT TACAGAGTGA600TATTGGGAAA GTGAGATCCT CTCACCATTT TGCCAAGATA CTCTAAAATG ACATCCAAGT660TTACCAGTAG AAAGACACAG GATGCACAGA ATGGGCATGA CCTTCAGCTC ACGAGCACAC720CTGGAGAAAT TCAGAACCAG GTTCTGAATC ATCACGATTG CCTTTTGCAT GAAAACATCG780GCTGGTGATG TGACTTCTCT TCAGGCCATG AGCCTAACAY CCTGCCGGTT TTCATGCCCG840CTGCAGTAAT GGACGTTTGT GTGAAGAAAT GAACTGTGGA GTACAAAA CTTTGAGTCT900TTCCGATTGC TCATTAATTC ACTTTTTTGT TACTTCTTTC CAAAATGGAA GTGCTGAAGC960CATGGTCTTT CTGCCCCTCC AAGCTGATGA AGGGAAGCCT TTGCCAATGG CCCATGGAAG1020ACACTTGGTT TGAGAAACCC TGCCCACTTC CAAAGACCAA AGAGATTAGG AAAAGCCTGG1080CAGTATTCTC CAACTCCAAA CAAGCTCTAG AGTGCTCCAG GAAAAGTTAT ATTCAGTATA1140TGAATAAGTG TTATTCTCCA TTATTAATGT GTTCTGAAAA TATATTATGA ATAAATACAT1200CACCACACCC AAAAAAAAAA AAAAAAAAAA AAAAACA4 DNA sequenceGene name: alpha satellite junction DNA sequenceUnigene number: Hs.247946Probeset Accession #: M21305Nucleic Acid Accession #: M21305Coding sequence: 
1-165 (predicted start/stop codons underlined)
ATGGAATGGA ATGGAATGGC ATGGAATCGT ATAAAGTGGA ATGGAATCAA CTCGAGTGGA 60
ATGGAATGGA ATGGAATGGA ATGGAATGCA GTACAATGCA ATAGAATGGA ATGGAATGAA 120
CTCGAGTTGA CTGGAATGGA ATGGAATGGA ATGCATTTGA ATTGA

ACG6 DNA sequence
Gene name: intercellular adhesion molecule 2 (ICAM2)
Unigene number: Hs.83733
Probeset Accession #: M32334
Nucleic Acid Accession #: NM_000873
Coding sequence: 63-890 (predicted start/stop codons underlined)
CTAAAGATCT CCCTCCAGGC AGCCCTTGGC TGGTCCCTGC GAGCCCGTGG AGACTGCCAG 60
AGATGTCCTC TTTCGGTTAC AGGACCCTGA CTGTGGCCCT CTTCACCCTG ATCTGCTGTC 120
CAGGATCGGA TGAGAAGGTA TTCGAGGTAC ACGTGAGGCC AAAGAAGCTG GCGGTTGAGC 180
CCAAAGGGTC CCTCGAGGTC AACTGCAGCA CCACCTGTAA CCAGCCTGAA GTGGGTGGTC 240
TGGAGACCTC TCTAAATAAG ATTCTGCTGG ACGAACAGGC TCAGTGGAAA CATTACTTGG 300
TCTCAAACAT CTCCCATGAC ACGCTCCTCC AATGCCACTT CACCTGCTCC GGGAAGCAGG 360
AGTCAATGAA TTCCAACGTC AGCGTGTACC AGCCTCCAAG GCAGGTCATC CTGACACTGC 420
AACCCACTTT GGTGGCTGTG GGCAAGTCCT TCACCATTGA GTGCAGGGTG CCCACCGTGG 480
AGCCCCTGGA CAGCCTCACC CTCTTCCTGT TCCGTGGCAA TGAGACTCTG CACTATGAGA 540
CCTTCGGGAA GGCAGCCCCT GCTCCGCAGG AGGCCACAGC CACATTCAAC AGCACGGCTG 600
ACAGAGAGGA TGGCCACCGC AACTTCTCCT GCCTGGCTGT GCTGGACTTG ATGTCTCGCG 660
GTGGCAACAT CTTTCACAAA CACTCAGCCC CGAAGATGTT GGAGATCTAT GAGCCTGTGT 720
CGGACAGCCA GATGGTCATC ATAGTCACGG TGGTGTCGGT GTTGCTGTCC CTGTTCGTGA 780
CATCTGTCCT GCTCTGCTTC ATCTTCGGCC AGCACTTGCG CCAGCAGCGG ATGGGCACCT 840
ACGGGGTGCG AGCGGCTTGG AGGAGGCTGC CCCAGGCCTT CCGGCCATAG CAACCATGAG 900
TGGCATGGCC ACCACCACGG TGGTCACTGG AACTCAGTGT GACTCCTCAG GGTTGAGGTC 960
CAGCCCTGGC TGAAGGACTG TGACAGGCAG CAGAGACTTG GGACATTGCC TTTTCTAGCC 1020
CGAATACAAA CACCTGGACT T

ACG7 DNA sequence
Gene name: Cadherin 5, VE-cadherin (CDH5)
Unigene number: Hs.76206
Probeset Accession #: X79981
Nucleic Acid Accession #: NM_001795
Coding sequence: 25-2379 (predicted start/stop codons underlined)
GCACGATCTG TTCCTCCTGG GAAGATGCAG AGGCTCATGA TGCTCCTCGC CACATCGGGC 60
GCCTGCCTGG GCCTGCTGGC AGXGGCAGCA GTGGCAGCAG CAGGTGCTAA CCCTGCCCAA 120
CGGGACACCC ACAGCCTGCT 
GCCCACCCAC CGGCGCCAAA AGAGAGATTG GATTTGGAAC180CAGATGCACA TTGATGAAGA GAAAAACACC TCACTTCCCC ATCATGTAGG CAAGATCAAG240TCAAGCGTGA GTCGCAAGAA TGCCAAGTAC CTGCTCAAAG GAGAATATGT GGGCAAGGTC300TTCCGGGTCG ATGCAGAGAC AGGAGACGTG TTCGCCATTG AGAGGCTGGA CCGGGAGAAT360ATCTCAGAGT ACCACCTCAC TGCTGTCATT GTGGACAAGG ACACTGGTGA AAACCTGGAG420ACTCCTTCCA GCTTCACCAT CAAAGTTCAT GACGTGAACG ACAACTGGCC TGTGTTCACG480CATCGGTTGT TCAATGCGTC CGTGCCTGAG TCGTCGGCTG TGGGGACCTC AGTCATCTCT540GTGACAGCAG TGGATGCAGA CGACCCCACT GTGGGAGACC ACGCCTCTGT CATGTACCAA600ATCCTGAAGG GGAAAGAGTA TTTTGCCATC GATAATTCTG GACGTATTAT CACAATAACG660AAAAGCTTGG ACCGAGAGAA GCAGGCCAGG TATGAGATCG TGGTGGAAGC GCGAGATGCC720CAGGGCCTCC GGGGGGACTC GGGCACGGCC ACCGTGCTGG TCACTCTGCA AGACATCAAT780GACAACTTCC CCTTCTTCAC CCAGACCAAG TACACATTTG TCGTGCCTGA AGACACCCGT840GTGGGCACCT CTGTGGGCTC TCTGTTTGTT GAGGACCCAG ATGAGCCCCA GAACCGGATG900ACCAAGTACA GCATCTTGCG GGGCGACTAC CAGGACGCTT TCACCATTGA GACAAACCCC960GCCCACAACG AGGGCATCAT CAAGCCCATG AAGCCTCTGG ATTATGAATA CATCCAGCAA1020TACAGCTTCA TCGTCGAGGC CACAGACCCC ACCATCGACC TCCGATACAT GAGCCCTCCC1080GCGGGAAACA GAGCCCAGGT CATTATCAAC ATCACAGATG TGGACGAGCC CCCCATTTTC1140CAGCAGCCTT TCTACCACTT CCAGCTGAAG GAAAACCAGA AGAAGCCTCT GATTGGCACA1200GTGCTGGCCA TGGACCCTGA TGCGGCTAGG CATAGCATTG GATACTCCAT CCGCAGGACC1260AGTGACAAGG GCCAGTTCTT CCGAGTCACA AAAAAGGGGG ACATTTACAA TGAGAAAGAA1320CTGGACAGAG AAGTCTACCC CTGGTATAAC CTGACTGTGG AGGCCAAAGA ACTGGATTCC1380ACTGGAACCC CCACAGGAAA AGAATCCATT GTGCAAGTCC ACATTGAAGT TTTGGATGAG1440AATGACAATG CCCCGGAGTT TGCCAAGCCC TACCAGCCCA AAGTGTGTGA GAACGCTGTC1500CATGGCCAGC TGGTCCTGCA GATCTCCGCA ATAGACAAGG ACATAACACC ACGAAACGTG1560AAGTTCAAAT TCACCTTGAA TACTGAGAAC AACTTTACCC TCACGGATAA TCACGATAAC1620ACGGCCAACA TCACAGTCAA GTATGGGCAG TTTGACCGGG AGCATACCAA GGTCCACTTC1680CTACCCGTGG TCATCTCAGA CAATGGGATG CCAAGTCGCA CGGGCACCAG CACGCTGACC1740GTGGCCGTGT GCAAGTGCAA CGAGCAGGGC GAGTTCACCT TCTGCGAGGA TATGGCCGCC1800CAGGTGGGCG TGAGCATCCA GGCAGTGGTA GCCATCTTAC TCTGCATCCT CACCATCACA1860GTGATCACCC TGCTCATCTT CCTGCGGCGG 
CGGCTCCGGA AGCAGGCCCG CGCGCACGGC1920AAGAGCGTGC CGGAGATCCA CGAGCAGCTG GTCACCTACG ACGAGGAGGG CGGCGGCGAG1980ATGGACACCA CCAGCTACGA TGTGTCGGTG CTCAACTCGG TGCGCCGCGG CGGGGCCAAG2040CCCCCGCGGC CCGCGCTGGA CGCCCGGCCT TCCCTCTATG CGCAGGTGCA GAAGCCACCG2100AGGCACGCGC CTGGGGCACA CGGAGGGCCC GGGGAGATGG CAGCCATGAT CGAGGTGAAG2160AAGGACGAGG CGGACCACGA CGGCGACGGC CCCCCCTACG ACACGCTGCA CATCTACGGC2220TACGAGGGCT CCGAGTCCAT AGCCGAGTCC CTCAGCTCCC TGGGCACCGA CTCATCCGAC2280TCTGACGTGG ATTACGACTT CCTTAACGAC TGGGGACCCA GGTTTAAGAT GCTGGCTGAG2340CTGTACGGCT CGGACCCCCG GGAGGAGCTG CTGTATTAGG CGGCCGAGGT CACTCTGGGC2400CTGGGGACCC AAACCCCCTG CAGCCCAGGC CAGTCAGACT CCAGGCACCA CAGCCTCCAA2460AAATGGCAGT GACTCCCCAG CCCAGCACCC CTTCCTCGTG GGTCCCAGAG ACCTCATCAG2520CCTTGGGATA GCAAACTCCA GGTTCCTGAA ATATCCAGGA ATATATGTCA GTGATGACTA2580TTCTCAAATG CTGGCAAATC CAGGCTGGTG TTCTGTCTGG GCTCAGACAT CCACATAACC2640CTGTCACCCA CAGACCGCCG TCTAACTCAA AGACTTCCTC TGGCTCCCCA AGGCTGCAAA2700GCAAAACAGA CTGTGTTTAA CTGCTGCAGG GTCTTTTTCT AGGGTCCCTG AACGCCCTGG2760TAAGGCTGGT GAGGTCCTGG TGCCTATCTG CCTGGAGGCA AAGGCCTGGA CAGCTTGACT2820TGTGGGGCAG GATTCTCTGC AGCCCATTCC CAAGGGAGAC TGACCATCAT GCCCTCTCTC2880GGGAGCCCTA GCCCTGCTCC AACTCCATAC TCCACTCCAA GTGCCCCACC ACTCCCCAAC2940CCCTCTCCAG GCCTGTCAAG AGGGAGGAAG GGGCCCCATG GCAGCTCCTG ACCTTGGGTC3000CTGAAGTGAC CTCACTGGCC TGCCATGCCA GTAACTGTGC TGTACTGAGC ACTGAACCAC3060ATTCAGGGAA ATGCTTATTA AACCTTGAAG CAACTGTGAA TTCATTCTGG AGGGGCAGTG3120GAGATCAGGA GTGACAGATC ACAGGGTGAG GGCCACCTCC ACACCCACCC CCTCTGGAGA3180AGGCCTGGAA GAGCTGAGAC CTTGCTTTGA GACTCCTCAG CACCCCTCCA GTTTTGCCTG3240AGAAGGGGCA GATGTTCCCG GAGATCAGAA GACGTCTCCC CTTCTCTGCC TCACCTGGTC3300GCCAATCCAT GCTCTCTTTC TTTTCTCTGT CTACTCCTTA TCCCTTGGTT TAGAGGAACC3360CAAGATGTGG CCTTTAGCAA AACTGACAAT GTCCAAACCC ACTCATGACT GCATGACGGA3420GCCGAGCATG TGTCTTTACA CCTCGCTGTT GTCACATCTC AGGGAACTGA CCCTCAGGCA3480CACCTTGCAG AAGGAAGGCC CTGCCCTGCC CAACCTCTGT GGTCACCCAT GCATCATTCC3540ACTGGAACGT TTCACTGCAA ACACACCTTG GAGAAGTGGC ATCAGTCAAC AGAGAGGGGC3600AGGGAAGGAG ACACCAAGCT 
CACCCTTCGT CATGGACCGA GGTTCCCACT CTGGCAAAGC3660CCCTCACACT GCAAGGGATT GTAGATAACA CTGACTTGTT TGTTTTAACC AATAACTAGC3720TTCTTATAAT GATTTTTTTA CTAATGATAC TTACAAGTTT CTAGCTCTCA CAGACATATA3780GAATAAGGGT TTTTGCATAA TAAGCAGGTT GTTATTTAGG TTAACAATAT TAATTCAGGT3840TTTTTAGTTG GAAAAACAAT TCCTGTAACC TTCTATTTTC TATAATTGTA GTAATTGCTC3900TACAGATAAT GTCTATATAT TGGCCAAACT GGTGCATGAC AAGTACTGTA TTTTTTTATA3960CCTAAATAAA GAAAAATCTT TAGCCTGGGC AACAAAAAAAACG9 DNA sequenceGene name: lysyl oxidase-like 2 (LOXL2)Unigene number: Hs.83354Probeset Accession #: U89942Nucleic Acid Accession #: NM_002318 clusterCoding sequence: 248-2572 (predicted start/stop codons underlined)ACTCCAGCGC GCGGCTACCT ACGCTTGGTG CTTGCTTTCT CCAGCCATCG GAGACCAGAG60CCGCCCCCTC TGCTCGAGAA AGGGGCTCAG CGGCGGCGGA AGCGGAGGGG GACCACCGTG120GAGAGCGCGG TCCCAGCCCG GCCACTGCGG ATCCCTGAAA CCAAAAAGCT CCTGCTGCTT180CTGTACCCCG CCTGTCCCTC CCAGCTGCGC AGGGCCCCTT CGTGGGATCA TCAGCCCGAA240GACAGGGATG GAGAGGCCTC TGTGCTCCCA CCTCTGCAGC TGCCTGGCTA TGCTGGCCCT300CCTGTCCCCC CTGAGGCTGG CACAGTATGA CAGCTGGCCC CATTACCCCG AGTACTTCCA360GCAACCGGCT CCTGAGGATC ACCAGCCCCA GGCCCCCGCC AACGTGGCCA AGATTCAGCT420GCGCCTGGCT GGGCAGAAGA GGAAGCACAG CGAGGGCCGG GTGGAGGTGT ACTATGATGG480CCAGTGGGGC ACCGTGTGCG ATGACGACTT CTCCATCCAC GCTGCCCACG TCGTCTGCCG540GGAGCTGGGC TATGTGGAGG CCAAGTCCTG GACTGCCAGC TCCTCCTACG GCAAGGGAGA600AGGGCCCATC TGGTTAGACA ATCTCCACTG TACTGGCAAC GAGGCGACCC TTGCAGCATG660CACCTCCAAT GGCTGGGGCG TCACTGACTG CAAGCACACG GAGGATGTCG GTGTGGTGTG720CAGCGACAAA AGGATTCCTG GGTTCAAATT TGACAATTCG TTGATCAACC AGATAGAGAA780CCTGAATATC CAGGTGGAGG ACATTCGGAT TCGAGCCATC CTCTCAACCT ACCGCAAGCG840CACCCCAGTG ATGGAGGGCT ACGTGGAGGT GAAGGAGGGC AAGACCTGGA AGCAGATCTG900TGACAAGCAC TGGACGGCCA AGAATTCCCG CGTGGTCTGC GGCATGTTTG GCTTCCCTGG960GGAGAGGACA TACAATACCA AAGTGTACAA AATGTTTGCC TCACGGAGGA AGCAGCGCTA1020CTGGCCATTC TCCATGGACT GCACCGGCAC AGAGGCCCAC ATCTCCAGCT GCAAGCTGGG1080CCCCCAGGTG TCACTGGACC CCATGAAGAA TGTCACCTGC GAGAATGGGC TGCCGGCCGT1140GGTGAGTTGT GTGCCTGGGC AGGTCTTCAG CCCTGACGGA 
CCCTCGAGAT TCCGGAAAGC1200ATACAAGCCA GAGCAACCCC TGGTGCGACT GAGAGGCGGT GCCTACATCG GGGAGGGCCG1260CGTGGAGGTG CTCAAAAATG GAGAATGGGG GACCGTCTGC GACGACAAGT GGGACCTGGT1320GTCGGCCAGT GTGGTCTGCA GAGAGCTGGG CTTTGGGAGT GCCAAAGAGG CAGTCACTGG1380CTCCCGACTG GGGCAAGGGA TCGGACCCAT CCACCTCAAC GAGATCCAGT GCACAGGCAA1440TGAGAAGTCC ATTATAGACT GCAAGTTCAA TGCCGAGTCT CAGGGCTGCA ACCACGAGGA1500GGATGCTGGT GTGAGATGCA ACACCCCTGC CATGGGCTTG CAGAAGAAGC TGCGCCTGAA1560CGGCGGCCGC AATCCCTACG AGGGCCGAGT GGAGGTGCTG GTGGAGAGAA ACGGGTCCCT1620TGTGTGGGGG ATGGTGTGTG GCCAAAACTG GGGCATCGTG GAGGCCATGG TGGTCTGCCG1680CCAGCTGGGC CTGGGATTCG CCAGCAACGC CTTCCAGGAG ACCTGGTATT GGCACGGAGA1740TGTCAACAGC AACAAAGTGG TCATGAGTGG AGTGAAGTGC TCGGGAACGG AGCTGTCCCT1800GGCGCACTGC CGCCACGACG GGGAGGACGT GGCCTGCCCC CAGGGCGGAG TGCAGTACGG1860GGCCGGAGTT GCCTGCTCAG AAACCGCCCC TGACCTGGTC CTCAATGCGG AGATGGTGCA1920GCAGACCACC TACCTGGAGG ACCGGCCCAT GTTCATGCTG CAGTGTGCCA TGGAGGAGAA1980CTGCCTCTCG GCCTCAGCCG CGCAGACCGA CCCCACCACG GGCTACCGCC GGCTCCTGCG2040CTTCTCCTCC CAGATCCAGA ACAATGGCCA GTCCGACTTC CGGCCCAAGA AGGGCGGCCA2100CGCGTGGATC TGGCACGACT GTCACAGGCA CTACCACAGC ATGGAGGTGT TCACCCACTA2160TGACCTGCTG AACCTCAATG GCACCAAGGT GGCAGAGGGC CACAAGGCCA GCTTCTGCTT2220GGAGGACACA GAATGTGAAG GAGACATCCA GAAGAATTAC GAGTGTGCCA ACTTCGGCGA2280TCAGGGCATC ACCATGGGCT GCTGGGACAT GTACCGCCAT GACATCGACT GCCAGTGGGT2340TGACATCACT GACGTGCCCC CTGGAGACTA CCTGTTCCAG GTTGTTATTA ACCCCAACTT2400CGAGGTTGCA GAATCCGATT ACTCCAACAA CATCATGAAA TGCAGGAGCC GCTATGACGG2460CCACCGCATC TGGATGTACA ACTGCCACAT AGGTGGTTCC TTCAGCGAAG AGACGGAAAA2520AAAGTTTGAG CACTTCAGCG GGCTCTTAAA CAACCAGCTG TCCCCGCAGTAAAGAAGCCT2580GCGTGGTCAA CTCCTGTCTT CAGGCCACAC CACATCTTCC ATGGGACTTC CCCCCAACAA2640CTGAGTCTGA ACGAATGCCA CGTGCCCTCA CCCAGCCCGG CCCCCACCCT GTCCAGACCC2700CTACAGCTGT GTCTAAGCTC AGGAGGAAAG GGACCCTCCC ATCATTCATG GGGGGCTGCT2760ACCTGACCCT TGGGGCCTGA GAAGGCCTTG GGGGGGTGGG GTTTGTCCAC AGAGCTGCTG2820GAGCAGCACC AAGAGCCAGT CTTGACCGGG ATGAGGCCCA CAGACAGGTT GTCATCAGCT2880TGTCCCATTC AAGCCACCGA GCTCACCACA GACACAGTGG 
AGCCGCGCTC TTCTCCAGTG2940ACACGTGGAC AAATGCGGGC TCATCAGCCC CCCCAGAGAG GGTCAGGCCG AACCCCATTT3000CTCCTCCTCT TAGGTCATTT TCAGCAAACT TGAATATCTA GACCTCTCTT CCAATGAAAC3060CCTCCAGTCT ATTATAGTCA CATAGATAAT GGTGCCACGT GTTTTCTGAT TTGGTGAGCT3120CAGACTTGGT GCTTCCCTCT CCACAACCCC CACCCCTTGT TTTTCAAGAT ACTATTATTA3180TATTTTCACA GACTTTTGAA GCACAAATTT ATTGGCATTT AATATTGGAC ATCTGGGCCC3240TTGGAAGTAC AAATCTAAGG AAAAACCAAC CCACTGTGTA AGTGACTCAT CTTCCTGTTG3300TTCCAATTCT GTGGGTTTTT GATTCAACGG TGCTATAACC AGGGTCCTGG GTGACAGGGC3360GCTCACTGAG CACCATGTGT CATCACAGAC ACTTACACAT ACTTGAAACT TGGAATAAAA3420GAAAGATTTA TGACH2 DNA sequenceGene name:TIE tyrosine-protein kinaseUnigene number: Hs.78824Probeset Accession #: X60957Nucleic Acid Accession #: NM_005424 clusterCoding sequence: 37-3452 (predicted start/stop codons underlined)CGCTCGTCCT GGCTGGCCTG GGTCGGCCTC TGGAGTATGG TCTGGCGGGT GCCCCCTTTC60TTGCTCCCCA TCCTCTTCTT GGCTTCTCAT GTGGGCGCGG CGGTGGACCT GACGCTGCTG120GCCAACCTGC GGCTCACGGA CCCCCAGCGC TTCTTCCTGA CTTGCGTGTC TGGGGAGGCC180GGGGCGGGGA GGGGCTCGGA CGCCTGGGGC CCGCCCCTGC TGCTGGAGAA GGACGACCGT240ATCGTGCGCA CCCCGCCCGG GCCACCCCTG CGCCTGGCGC GCAACGGTTC GCACCAGGTC300ACGCTTCGCG GCTTCTCCAA GCCCTCGGAC CTCGTGGGCG TCTTCTCCTG CGTGGGCGGT360GCTGGGGCGC GGCGCACGCG CGTCATCTAC GTGCACACA GCCCTGGAGC CCACCTGCTT420CCAGACAAGG TCACACACAC TGTGAACAAA GGTGAGACCG CTGTACTTTC TGCACGTGTG480CACAAGGAGA AGCAGACAGA CGTGATCTGG AAGAGCAACG GATCCTACTT CTACACCCTG540GACTGGCATG AAGCCCAGGA TGGGCGGTTC CTGCTGCAGC TCCCAAATGT GCAGCCACCA600TCGAGCGGCA TCTACAGTGC CACTTACCTG GAAGCCAGCC CCCTGGGCAG CGCCTTCTTT660CGGCTCATCG TGCGGGGTTG TGGGGCTGGG CGCTGGGGGC CAGGCTGTAC CAAGGAGTGC720CCAGGTTGCC TACATGGAGG TGTCTGCCAC GACCATGACG GCGAATGTGT ATGCCCCCCT780GGCTTCACTG GCACCCGCTG TGAACAGGCC TGCAGAGAGG GCCGTTTTGG GCAGAGCTGC840CAGGAGCAGT GCCCAGGCAT ATCAGGCTGC CGGGGCCTCA CCTTCTGCCT CCCAGACCCC900TATGGCTGCT CTTGTGGATC TGGCTGGAGA GGAAGCCAGT GCCAAGAAGC TTGTGCCCCT960GGTCATTTTG GGGCTGATTG CCGACTCCAG TGCCAGTGTC AGAATGGTGG CACTTGTGAC1020CGGTTCAGTG GTTGTGTCTG CCCCTCTGGG 
TGGCATGGAG TGCACTGTGA GAAGTCAGAC1080CGGATCCCCC AGATCCTCAA CATGGCCTCA GAACTGGAGT TCAACTTAGA GACGATGCCC1140CGGATCAACT GTGCAGCTGC AGGGAACCCC TTCCCCGTGC GGGGCAGCAT AGAGCTACGC1200AAGCCAGACG GCACTGTGCT CCTGTCCACC AAGGCCATTG TGGAGCCAGA GAAGACCACA1260GCTGAGTTCG AGGTGCCCCG CTTGGTTCTT GCGGACAGTG GGTTCTGGGA GTGCCGTGTG1320TCCACATCTG GCGGCCAAGA CAGCCGGCGC TTCAAGGTCA ATGTGAAAGT GCCCCCCGTG1380CCCCTGGCTG CACCTCGGCT CCTGACCAAG CAGAGCCGCC AGCTTGTGGT CTCCCCGCTG1440GTCTCGTTCT CTGGGGATGG ACCCATCTCC ACTGTCCGCC TGCACTACCG GCCCCAGGAC1500AGTACCATGG ACTGGTCGAC CATTGTGGTG GACCCCAGTG AGAACGTGAC GTTAATGAAC1560CTGAGGCCAA AGACAGGATA CAGTGTTCGT GTGCAGCTGA GCCGGCCAGG GGAAGGAGGA1620GAGGGGGCCT GGGGGCCTCC CACCCTCATG ACCACAGACT GTCCTGAGCC TTTGTTGCAG1680CCGTGGTTGG AGGGCTGGCA TGTGGAAGGC ACTGACCGGC TGCGAGTGAG CTGGTCCTTG1740CCCTTGGTGC CCGGGCCACT GGTGGGCGAC GGTTTCCTGC TGCGCCTGTG GGACGGGACA1800CGGGGGCAGG AGCGGCGGGA GAACGTCTCA TCCCCCCAGG CCCGCACTGC CCTCCTGACG1860GGACTCACGC CTGGCACCCA CTACCAGCTG GATGTGCAGC TCTACCACTG CACCCTCCTG1920GGCCCGGCCT CGCCCCCTGC ACACGTGCTT CTGCCCCCCA GTGGGCCTCC AGCCCCCCGA1980CACCTCCACG CCCAGGCCCT CTCAGACTCC GAGATCCAGC TGACATGGAA GCACCCGGAG2040GCTCTGCCTG GGCCAATATC CAAGTACGTT GTGGAGGTGC AGGTGGCTGG GGGTGCAGGA2100GACCCACTGT GGATAGACGT GGACAGGCCT GAGGAGACAA GCACCATCAT CCGTGGCCTC2160AACGCCAGCA CGCGCTACCT CTTCCGCATG CGGGCCAGCA TTCAGGGGCT CGGGGACTGG2220AGCAACACAG TAGAAGAGTC CACCCTGGGC AACGGGCTGC AGGCTGAGGG CCCAGTCCAA2280GAGAGCCGGG CAGCTGAAGA GGGCCTGGAT CAGCAGCTGA TCCTGGCGGT GGTGGGCTCC2340GTGTCTGCCA CCTGCCTCAC CATCCTGGCC GCCCTTTTAA CCCTGGTGTG CATCCGCAGA2400AGCTGCCTGC ATCGGAGACG CACCTTCACC TACCAGTCAG GCTCGGGCGA GGAGACCATC2460CTGCAGTTCA GCTCAGGGAC CTTGACACTT ACCCGGCGGC CAAAACTGCA GCCCGAGCCC2520CTGAGCTACC CAGTGCTAGA GTGGGAGGAC ATCACCTTTG AGGACCTCAT CGGGGAGGGG2580AACTTCGGCC AGGTCATCCG GGCCATGATC AAGAAGGACG GGCTGAAGAT GAACGCAGCC2640ATCAAAATGC TGAAAGAGTA TGCCTCTGAA AATGACCATC GTGACTTTGC GGGAGAACTG2700GAAGTTCTGT GCAAATTGGG GCATCACCCC AACATCATCA ACCTCCTGGG GGCCTGTAAG2760AACCGAGGTT ACTTGTATAT 
CGCTATTGAA TATGCCCCCT ACGGGAACCT GCTAGATTTT2820CTGCGGAAAA GCCGGGTCCT AGAGACTGAC CCAGCTTTTG CTCGAGAGCA TGGGACAGCC2880TCTACCCTTA GCTCCCGGCA GCTGCTGCGT TTCGCCAGTG ATGCGGCCAA TGGCATGCAG2940TACCTGAGTG AGAAGCAGTT CATCCACAGG GACCTGGCTG CCCGGAATGT GCTGGTCGGA3000GAGAACCTAG CCTCCAAGAT TGCAGACTTC GGCCTTTCTC GGGGAGAGGA GGTTTATGTG3060AAGAAGACGA TGGGGCGTCT CCCTGTGCGC TGGATGGCCA TTGAGTCCCT GAACTACAGT3120GTCTATACCA CCAAGAGTGA TGTCTGGTCC TTTGGAGTCC TTCTTTGGGA GATAGTGAGC3180CTTGGAGGTA CACCCTACTG TGGCATGACC TGTGCCGAGC TCTATGAAAA GCTGCCCCAG3240GGCTACCGCA TGGAGCAGCC TCGAAACTGT GACGATGAAG TGTACGAGCT GATGCGTCAG3300TGCTGGCGGG ACCGTCCCTA TGAGCGACCC CCCTTTGCCC AGATTGCGCT ACAGCTAGGC3360CGCATGCTGG AAGCCAGGAA GGCCTATGTG AACATGTCGC TGTTTGAGAA CTTCACTTAC3420GCGGGCATTG ATGCCACAGC TGAGGAGGCC TGAGCTGCCA TCCAGCCAGA ACGTGGCTCT3480GCTGGCCGGA GCAAACTCTG CTGTCTAACC TGTGACCAGT CTGACCCTTA CAGCCTCTGA3540CTTAAGCTGC CTCAAGGAAT TTTTTTAACT TAAGGGAGAA AAAAAGGGAT CTGGGGATGG3600GGTGGGCTTA GGGGAACTGG GTTCCCATGC TTTGTAGGTG TCTCATAGCT ATCCTGGGCA3660TCCTTCTTTC TAGTTCAGCT GCCCCACAGG TGTGTTTCCC ATCCCACTGC TCCCCCAACA3720CAAACCCCCA CTCCAGCTCC TTCGCTTAAG CCAGCACTCA CACCACTAAC ATGCCCTGTT3780CAGCTACTCC CACTCCCGGC CTGTCATTCA GAAAAAAATA AATGTTCTAA TAAGCTCCAA3840AAAAAACH3 DNA sequenceGene name: placental growth factor (PGF; PlGF1; VEGF-related protein)Unigene number: Hs.2894Probeset Accession #: X54936Nucleic Acid Accession #: NM_002632 clusterCoding sequence: 322-768 (predicted start/stop codons underlined)GGGATTCGGG CCGCCCAGCT ACGGGAGGAC CTGGAGTGGC ACTGGGCGCC CGACGGCA60TCCCCGGGAC CCGCCTGCCC CTCGGCGCCC CGCCCCGCCG GGCCGCTCCC CGTCGGCTTC120CCCAGCCACA GCCTTACCTA CGGGCTCCTG ACTCCGCAAG GCTTCCAGAA GATGCTCGAA180CCACCGGCCG GGGCCTCGGG GCAGCAGTGA GGGAGGCGTC CAGCCCCCCA CTCAGCTCTT240CTCCTCCTGT GCCAGGGGCT CCCCGGGGGA TGAGCATGGT GGTTTTCCCT CGGAGCCCCC300TGGCTCGGGA CGTCTGAGAA GATGCCGGTC ATGAGGCTGT TCCCTTGCTT CCTGCAGCTC360CTGGCCGGGC TGGCGCTGCC TGCTGTGCCC CCCCAGCAGT GGGCCTTGTC TGCTGGGAAC420GGCTCGTCAG AGGTGGAAGT GGTACCCTTC CAGGAAGTGT GGGGCCGCAG 
CTACTGCCGG480GCGCTGGAGA GGCTGGTGGA CGTCGTGTCC GAGTACCCCA GCGAGGTGGA GCACATGTTC540AGCCCATCCT GTGTCTCCCT GCTGCGCTGC ACCGGCTGCT GCGGCGATGA GAATCTGCAC600TGTGTGCCGG TGGAGACGGC CAATGTCACC ATGCAGCTCC TAAAGATCCG TTCTGGGGAC660CGGCCCTCCT ACGTGGAGCT GACGTTCTCT CAGCACGTTC GCTGCGAATG CCGGCCTCTG720CGGGAGAAGA TGAAGCCGGA AAGGTGCGGC GATGCTGTTC CCCGGAGGTAACCCACCCCT780TGGAGGAGAG AGACCCCGCA CCCGGCTCGT GTATTTATTA CCGTCACACT CTTCAGTGAC840TCCTGCTGGT ACCTGCCCTC TATTTATTAG CCAACTGTTT CCCTGCTGAA TGCCTCGCTC900CCTTCAAGAC GAGGGGCAGG GAAGGACAGG ACCCTCAGGA ATTCAGTGCC TTCAACAACG960TGAGAGAAAG AGAGAAGCCA GCCACAGACC CCTGGGAGCT TCCGCTTTGA AAGAAGCAAG1020ACACGTGGCC TCGTGAGGGG CAAGCTAGGC CCCAGAGGCC CTGGAGGTCT CCAGGGGCCT1080GCAGAAGGAA AGAAGGGGGC CCTGCTACCT GTTCTTGGGC CTCAGGCTCT GCACAGACAA1140GCAGCCCTTG CTTTCGGAGC TCCTGTCCAA AGTAGGGATG CGGATTCTGC TGGGGCCGCC1200ACGGCCTGGT GGTGGGAAGG CCGGCAGCGG GCGGAGGGGA TTCAGCCACT TCCCCCTCTT1260CTTCTGAAGA TCAGAACATT CAGCTCTGGA GAACAGTGGT TGCCTGGGGG CTTTTGCCAC1320TCCTTGTCCC CCGTGATCTC CCCTCACACT TTGCCATTTG CTTGTACTGG GACATTGTTC1380TTTCCGGCCG AGGTGCCACC ACCCTGCCCC CACTAAGAGA CACATACAGA GTGGGCCCCG1440GGCTGGAGAA AGAGCTGCCT GGATGAGAAA CAGCTCAGCC AGTGGGGATG AGGTCACCAG1500GGGAGGAGCC TGTGCGTCCC AGCTGAAGGC AGTGGCAGGG GAGCAGGTTC CCCAAGGGCC1560CTGGCACCCC CACAAGCTGT CCCTGCAGGG CCATCTGACT GCCAAGCCAG ATTCTCTTGA1620ATAAAGTATT CTAGTGTGGA AACGCACH4 DNA sequenceGene name: nidogen 2 (NID2)Unigene number: Hs.82733Probeset Accession #: D86425Nucleic Acid Accession #: NM_007361 clusterCoding sequence: 1-4131 (predicted start/stop codons underlined)ATGGAGGGGG ACCGGGTGGC CGGGCGGCCG GTGCTGTCGT CGTTACCAGT GCTACTGCTG60CTGCAGTTGC TAATGTTGCG GGCCGCGGCG CTGCACCCAG ACGAGCTCTT CCCACACGGG120GAGTCGTGGT GGGACCAGCT CCTGCAGGAA GGCGACGACG TAAAGCTCAG CCGTGGTGAA180GCTGGCGAAT CCCCTGCACT TCTTACGAAG CCCGATTCAG CAACCTCTAC GTGGGCACCA240ACGGCATCAT CTCCACTCAG GACTTCCCCA GGGAAACGCA GTATGTGGAC TATGATTTCC300CCACCGACTT CCCGGCCATC GCCCCTTTTC TGGCGGACAT CGACACGAGC CACGGCAGAG360GCCGAGTCCT GTACCGAGAG GACACCTCCC CCGCAGTGCT 
GGGCCTGGCC GCCCGCTATG420TGCGCGCTGG CTTCCCGCGC TCTGCGCGCT TTTTACCCCC ACCCACGCCT TCCTGGCCAC480CTGGGAGCAG GTAGGCGCTT ACGAGGAGGT CAAACGCGGG CGCTGCCCTC GGGAGAGCTG540AACACTTTCC AGGCAGTTTT GGCATCTGAT GGGTCTGATA GCTACGCCCT CTTTCTTTAT600CCTGCCAACG GCCTGCAGTT CCTTGGAACC CGCCCCAAAG AGTCTTACAA TGTCCAGCTT660CAGCTTCCAG CTCGGGTGGG CTTCTGCCGA GGGGAGGCTG ATGATCTGAA GTCAGAAGGA720CCATATTTCA GCTTGACTAG CACTGAACAG TCTGTGAAAA ATCTCTATCA ACTAAGCAAC780CTGGGGATCC CTGGAGTGTG GGCTTTCCAT ATCGGCAGCA CTTCCCCGTT GGACAATGTC840AGGCCAGCTG CAGTTGGAGA CCTTTCCGCT GCCCACTCTT CTGTTCCCCT GGGACGTTCC900TTCAGCCATG CTACAGCCCT GGAAAGTGAC TATAATGAGG ACAATTTGGA TTACTACGAT960GTGAATGAGG AGGAAGCTGA ATACCTTCCG GGTGAACCAG AGGAGGCATT GAATGGCCAC1020AGCAGCATTG ATGTTTCCTT CCAATCCAAA GTGGATACAA AGCCTTTAGA GGAATCTTCC1080ACCTTGGATC CTCACACCAA AGAAGGAACA TCTCTGGGAG AGGTAGGGGG CCCAGATTTA1140AAAGGCCAAG TTGAGCCCTG GGATGAGAGA GAGACCAGAA GCCCAGCTCC ACCAGAGGTA1200GACAGAGATT CACTGGCTCC TTCCTGGGAA ACCCCACCAC CGTACCCCGA AAACGGAAGC1260ATCCAGCCCT ACCCAGATGG AGGGCCAGTG CCTTCGGAAA TGGATGTTCC CCCAGCTCAT1320CCTGAAGAAG AAATTGTTCT TCGAAGTTAC CCTGCTTCAG GTCACACTAC ACCCTTAAGT1380CGAGGGACGT ATGAGGTGGG ACTGGAAGAC AACATAGGTT CCAACACCGA GGTCTTCACG1440TATAATGCTG CCAACAAGGA AACCTGTGAA CACAACCACA GACAATGCTC CCGGCATGCC1500TTCTGCACGG ACTATGCCAC TGGCTTCTGC TGCCACTGCC AATCCAAGTT TTATGGAAAT1560GGGAAGCACT GTCTGCCTGA GGGGGCACCT CACCGAGTGA ATGGGAAAGT GAGTGGCCAC1620CTCCACGTGG GCCATACACC CGTGCACTTC ACTGATGTGG ACCTGCATGC GTATATCGTG1680GGCAATGATG GCAGAGCCTA CACGGCCATC AGCCACATCC CACAGCCAGC AGCCCAGGCC1740CTCCTCCCCC TCACACCAAT TGGAGGCCTG TTTGGCTGGC TCTTTGCTTT AGAAAAACCT1800GGCTCTGAGA ACGGCTTCAG CCTCGCAGGT GCTGCCTTTA CCCATGACAT GGAAGTTACA1860TTACCCGG GAGAGGAGAC GGTTCGTATC ACTCAAACTG CTGAGGGACT TGACCCAGAG1920AACTACCTGA GCATTAAGAC CAACATTCAA GGCCAGGTGC CTTACGTCCC AGCAAATTTC1980ACAGCCCACA TCTCTCCCTA CAAGGAGCTG TACCACTACT CCGACTCCAC TGTGACCTCT2040ACAAGTTCCA GAGACTACTC TCTGACTTTT GGTGCAATCA ACCAAACATG GTCCTACCGC2100ATCCACCAGA ACATCACTTA CCAGGTGTGC AGGCACGCCC CCAGACACCC 
GTCCTTCCCC2160ACCACCCAGC AGCTGAACGT GGACCGGGTC TTTGCCTTGT ATAATGATGA AGAAAGAGTG2220CTTAGATTTG CTGTGACCAA TCAAATTGGC CCGGTCAAAG AAGATTCAGA CCCCACTCCG2280GTGAATCCTT GCTATGATGG GAGCCACATG TGTGACACAA CAGCACGGTG CCATCCAGGG2340ACAGGTGTAG ATTACACCTG TGAGTGCGCA TCTGGGTACC AGGGAGATGG ACGGAACTGT2400GTGGATGAAA ATGAATGTGC AACTGGCTTT CATCGCTGTG GCCCCAACTC TGTATGTATC2460AACTTGCCTG GAAGCTACAG GTGTGAGTGC CGGAGTGGTT ATGAGTTTGC AGATGACCGG2520CATACTTGCA TCTTGATCAC CCCACCTGCC AACCCCTGTG AGGATGGCAG TCATACCTGT2580GCTCCTGCTG GGCAGGCCCG GTGTGTTCAC CATGGAGGCA GCACGTTCAG CTGTGCCTGC2640CTGCCTGGTT ATGCCGGCGA TGGGCACCAG TGCACTGATG TAGATGAATG CTCAGAAAAC2700AGATGTCACC CTGCAGCTAC CTGCTACAAT ACTCCTGGTT CCTTCTCCTG CCGTTGTCAA2760CCCGGATATT ATGGGGATGG ATTTCAGTGC ATACCTGACT CCACCTCAAG CCTGACACCC2820TGTGAACAAC AGCAGCGCCA TGCCCAGGCC CAGTATGCCT ACCCTGGGGC CCGGTTCCAC2880ATCCCCCAAT GCGACGAGCA GGGCAACTTC CTGCCCCTAC AGTGTCATGG CAGCACTGGT2940TTCTGCTGGT GCGTGGACCC TGATGGTCAT GAAGTTCCTG GTACCCAGAC TCCACCTGGC3000TCCACCCCGC CTCACTGTGG ACCATCACCA GAGCCCACCC AGAGGCCCCC GACCATCTGT3060GAGCGCTGGA GGGAAAACCT GCTGGAGCAC TACGGTGGCA CCCCCCGAGA TGACCAGTAC3120GTGCCCCAGT GCGATGACCT GGGCCACTTC ATCCCCCTGC AGTGCCACGG AAAGAGCGAC3180TTCTGCTGGT GTGTGGACAA AGATGGCAGA GAGGTGCAGG GCACCCGCTC CCAGCCAGGC3240ACCACCCCTG CGTGTATACC CACCGTCGCT CCACCCATGG TCCGGCCCAC GCCCCGGCCA3300GATGTGACCC CTCCATCTGT GGGCACCTTC CTGCTCTATA CTCAGGGCCA GCAGATTGGC3360TACTTACCCC TCAATGGCAC CAGGCTTCAG AAGGATGCAG CTAAGACCCT GCTGTCTCTG3420CATGGCTCCA TAATCGTGGG AATTGATTAC GACTGCCGGG AGAGGATGGT GTACTGGACA3480GATGTTGCTG GACGGACAAT CAGCCGTGCC GGTCTGGAAC TGGGAGCAGA GCCTGAGACG3540ATCGTGAATT CAGGTCTGAT AAGCCCTGAA GGACTTGCCA TAGACCACAT CCGCAGAACA3600ATGTACTGGA CGGACAGTGT CCTGGATAAG ATAGAGAGCG CCCTGCTGGA TGGCTCTGAG3660CGCAAGGTCC TCTTCTACAC AGATCTGGTG AATCCCCGTG CCATCGCTGT GGATCCAATC3720CGAGGCAACT TGTACTGGAC AGACTGGAAT AGAGAAGCTC CTAAAATTGA AACGTCATCT3780TTAGATGGAG AAAACAGAAG AATTCTGATC AATACAGACA TTGGATTGCC CAATGGCTTA3840ACCTTTGACC CTTTCTCTAA ACTGCTCTGC TGGGCAGATG 
CAGGAACCAA AAAACTGGAG3900TGTACACTAC CTGATGGAAC TGGACGGCGT GTCATTCAAA ACAACCTCAA GTACCCCTTC3960AGCATCGTAA GCTATGCAGA TCACTTCTAC CACACAGACT GGAGGAGGGA TGGTGTTGTA4020TCAGTAAATA AACATAGTGG CCAGTTTACT GATGAGTATC TCCCAGAACA ACGATCTCAC4080CTCTACGGGA TAACTGCAGT CTACCCCTAC TGCCCAACAG GAAGAAAGTAAGTACAGTAA4140TGTAAAGGAA GACTTGGAGT TTACAATCAG AACCTGGACC CTAAAGAACA GTGACTGCAA4200AGGCAAAGAA AGTAAAAAAG GAATTGGCCA TTAGACGTTC CTGAGCATCC AAGATGAACA4260TTTTGTAGTG CAAAAAGACT TTTGTGAAAA GCTGATACCT CAATCTTTAC TACTGTATTT4320TTAAAAATGA AGGTTGTTAT TGCAAGTTTA AAAAGGTAAC AGAATTTTAA CTGTTGCTTA4380TTAAAGCAAC TTCTTGTAAA CATTTATCAT TAATATTTAA AAGATCAAAT TCATTCAACT4440AAGAATTAGA GTTTAAGACT CTAAACCTGA TTTTTGCCAT GGATTCCTTC TGGCCAAGAA4500ATTAAAGCAC ATGTGATCAA TATAACAATA TAATCCTAAA CCTTGACAGT TGGAGAAGCC4560AATGCAGAAC TGATGGGAAA GGACCAATTA TTTATAGTTT CCAAACAAAA GTTCTAAGAT4620TTTTTACCTC TGCATCAGTG CATTTCTATT TATATCAAAA GGTGCTAAAA TGATTCAATT4680TGCATTTTCT GATCCTGTAG TGCCTCTATA GAAGTACCCA CAGAAAGTAA AGTATCACAT4740TTATAAATAC CAAAGATGTA ACAATTTTAA AATTTTCTAG ATTACTCCAA TAAAGTGTTT4800TAAGTTTAAA AAAAAAAAAA AAAAAAAAAACH5 DNA sequenceGene name: SNL (singed-like; sea urchin fascin homolog-like)Unigene number: Hs.118400Probeset Accession #: U03057 Nucleic Acid Accession #: NM_003088Coding sequence: 112-1593 (predicted start/stop codons underlined)GCGGAGGGTG CGTGCGGGCC GCGGCAGCCG AACAAAGGAG CAGGGGCGCC GCCGCAGGGA60CCCGCCACCC ACCTCCCGGG GCCGCGCAGC GGCCTCTCGT CTACTGCCAC CATGACCGCC120AACGGCACAG CCGAGGCGGT GCAGATCCAG TTCGGCCTCA TCAACTGCGG CAACAAGTAC180CTGACGGCCG AGGCGTTCGG GTTCAAGGTG AACGCGTCCG CCAGCAGCCT GAAGAAGAAG240CAGATCTGGA CGCTGGAGCA GCCCCCTGAC GAGGCGGGCA GCGCGGCCGT GTGCCTGCGC300AGCCACCTGG GCCGCTACCT GGCGGCGGAC AAGGACGGCA ACGTGACCTG CGAGCGCGAG360GTGCCCGGTC CCGACTGCCG TTTCCTCATC GTGGCGCACG ACGACGGTCG CTGGTCGCTG420CAGTCCGAGG CGCACCGGCG CTACTTCGGC GGCACCGAGG ACCGCCTGTC CTGCTTCGCG480CAGACGGTGT CCCCCGCCGA GAAGTGGAGC GTGCACATCG CCATGCACCC TCAGGTCAAC540ATCTACAGTG TCACCCGTAA GCACTACGCG CACCTGAGCG CGCGGCCGGC 
CGACGAGATC600GCCGTGGACC GCGACGTGCC CTGGGGCGTC GACTCGCTCA TCACCCTCGC CTTCCAGGAC660CAGCGCTACA GCGTGCAGAC CGCCGACCAC CGCTTCCTGC GCCACGACGG GCGCCTGGTG720GCGCGCCCCG AGCCGGCCAC TGGCTACACG CTGGAGTTCC GCTCCGGCAA GGTGGCCTTC780CGCGACTGCG AGGGCCGTTA CCTGGCGCCG TCGGGGCCCA GCGGCACGCT CAAGGCGGGC840AAGGCCACCA AGGTGGGCAA GGACGAGCTC TTTGCTCTGG AGCAGAGCTG CGCCCAGGTC900GTGCTGCAGG CGGCCAACGA GAGGAACGTG TCCACGCGCC AGGGTATGGA CCTGTCTGCC960AATCAGGACG AGGAGACCGA CCAGGAGACC TTCCAGCTGG AGATCGACCG CGACACCAAA1020AAGTGTGCCT TCCGTACCCA CACGGGCAAG TACTGGACGC TGACGGCCAC CGGGGGCGTG1080CAGTCCACCG CCTCCAGCAA GAATGCCAGC TGCTACTTTG ACATCGAGTG GCGTGACCGG1140CGCATCACAC TGAGGGCGTC CAATGGCAAG TTTGTGACCT CCAAGAAGAA TGGGCAGCTG1200GCCGCCTCGG TGGAGACAGC AGGGGACTCA GAGCTCTTCC TCATGAAGCT CATCAACCGC1260CCCATCATCG TGTTCCGCGG GGAGCATGGC TTCATCGGCT GCCGCAAGGT CACGGGCACC1320CTGGACGCCA ACCGCTCCAG CTATGACGTC TTCCAGCTGG AGTTCAACGA TGGCGCCTAC1380AACATCAAAG ACTCCACAGG CAAATACTGG ACGGTGGGCA GTGACTCCGC GGTCACCAGC1440AGCGGCGACA CTCCTGTGGA CTTCTTCTTC GAGTTCTGCG ACTATAACAA GGTGGCCATC1500AAGGTGGGCG GGCGCTACCT GAAGGGCGAC CACGCAGGCG TCCTGAAGGC CTCGGCGGAA1560ACCGTGGACC CCGCCTCGCT CTGGGAGTAC TAGGGCCGGC CCGTCCTTCC CCGCCCCTGC1620CCACATGGCG GCTCCTGCCA ACCCTCCCTG CTAACCCCTT CTCCGCCAGG TGGGCTCCAG1680GGCGGGAGGC AAGCCCCCTT GCCTTTCAAA CTGGAAACCC CAGAGAAAAC GGTGCCCCCA1740CCTGTCGCCC CTATGGACTC CCCACTCTCC CCTCCGCCCG GGTTCCCTAC TCCCCTCGGG1800TCAGCGGCTG CGGCCTGGCC CTGGGAGGGA TTTCAGATGC CCCTGCCCTC TTGTCTGCCA1860CGGGGCGAGT CTGGCACCTC TTTCTTCTGA CCTCAGACGG CTCTGAGCCT TATTTCTCTG1920GAAGCGGCTA AGGGACGGTT GGGGGCTGGG AGCCCTGGGC GTGTAGTGTA ACTGGAATCT1980TTTGCCTCTC CCAGCCACCT CCTCCCAGCC CCCCAGGAGA GCTGGGCACA TGTCCCAAGC2040CTGTCAGTGG CCCTCCCTGG TGCACTGTCC CCGAAACCCC TGCTTGGGAA GGGAAGCTGT2100CGGGAGGGCT AGGACTGACC CTTGTGGTGT TTTTTTGGGT GGTGGCTGGA AACAGCCCCT2160CTCCCACGTG GGAGAGGCTC AGCCTGGCTC CCTTCCCTGG AGCGGCAGGG CGTGACGGCC2220ACAGGGTCTG CCCGCTGCAC GTTCTGCCAA GGTGGTGGTG GCGGGCGGGT AGGGGTGTGG2280GGGCCGTCTT CCTCCTGTCT CTTTCCTTTC ACCCTAGCCT GACTGGAAGC 
AGAAAATGAC2340CAAATCAGTA TTTTTTTTAA TGAAATATTA TTGCTGGAGG CGTCCCAGGC AAGCCTGGCT2400GTAGTAGCGA GTGATCTGGC GGGGGGCGTC TCAGCACCCT CCCCAGGGGG TGCATCTCAG2460CCCCCTCTTT CCGTCCTTCC CGTCCAGCCC CAGCCCTGGG CCTGGGCTGC CGACACCTGG2520GCCAGAGCCC CTGCTGTGAT TGGTGCTCCC TGGGCCTCCC GGGTGGATGA AGCCAGGCGT2580CGCCCCCTCC GGGAGCCCTG GGGTGAGCCG CCGGGGCCCC CCTGCTGCCA GCCTCCCCCG2640TCCCCAACAT GCATCTCACT CTGGGTGTCT TGGTCTTTTA TTTTTTGTAA GTGTCATTTG2700TATAACTCTA AACGCCCATG ATAGTAGCTT CAAACTGGAA ATAGCGAAAT AAAATAACTC2760AGTCTGCACH6 DNA sequenceGene name: endothelial protein C receptor (EPCR; PROCR)Unigene number: Hs.82353Probeset Accession #: L35545Nucleic Acid Accession #: NM_006404Coding sequence: 25-741 (predicted start/stop codons underlined)CAGGTCCGGA GCCTCAACTT CAGGATGTTG ACAACATTGC TGCCGATACT GCTGCTGTCT60GGCTGGGCCT TTTGTAGCCA AGACGCCTCA GATGGCCTCC AAAGACTTCA TATGCTCCAG120ATCTCCTACT TCCGCGACCC CTATCACGTG TGGTACCAGG GCAACGCGTC GCTGGGGGGA180CACCTAACGC ACGTGCTGGA AGGCCCAGAC ACCAACACCA CGATCATTCA GCTGCAGCCC240TTGCAGGAGC CCGAGAGCTG GGCGCGCACG CAGAGTGGCC TGCAGTCCTA CCTGCTCCAG300TTCCACGGCC TCGTGCGCCT GGTGCACCAG GAGCGGACCT TGGCCTTTCC TCTGACCATC360CGCTGCTTCC TGGGCTGTGA GCTGCCTCCC GAGGGCTCTA GAGCCCATGT CTTCTTCGAA420GTGGCTGTGA ATGGGAGCTC CTTTGTGAGT TTCCGGCCGG AGAGAGCCTT GTGGCAGGCA480GACACCCAGG TCACCTCCGG AGTGGTCACC TTCACCCTGC AGCAGCTCAA TGCCTACAAC540CGCACTCGGT ATGAACTGCG GGAATTCCTG GAGGACACCT GTGTGCAGTA TGTGCAGAAA600CATATTTCCG CGGAAAACAC GAAAGGGAGC CAAACAAGCC GCTCCTACAC TTCGCTGGTC660CTGGGCGTCC TGGTGGGCGG TTTCATCATT GCTGGTGTGG CTGTAGGCAT CTTCCTGTGC720ACAGGTGGAC GGCGATGTTAATTACTCTCC AGCCCCGTCA GAAGGGGCTG GATTGATGGA780GGCTGGCAAG GGAAAGTTTC AGCTCACTGT GAAGCCAGAC TCCCCAACTG AAACACCAGA840AGGTTTGGAG TGACAGCTCC TTTCTTCTCC CACATCTGCC CACTGAAGAT TTGAGGGAGG900GGAGATGGAG AGGAGAGGTG GACAAAGTAC TTGGTTTGCT AAGAACCTAA GAACGTGTAT960GCTTTGCTGA ATTAGTCTGA TAAGTGAATG TTTATCTATC TTTGTGGAAA ACAGATAATG1020GAGTTGGGGC AGGAAGCCTA TGCGCCATCC TCCAAAGACA GACAGAATCA CCTGAGGCGT1080TCAAAAGATA TAACCAAATA AACAAGTCAT CCACAATCAA 
AATACAACAT TCAATACTTC1140CAGGTGTGTC AGACTTGGGA TGGGACGCTG ATATAATAGG GTAGAAAGAA GTAACACGAA1200GAAGTGGTGG AAATGTAAAA TCCAAGTCAT ATGGCAGTGA TCAATTATTA ATCAATTAAT1260AATATTAATA AATTTCTTAT ATTTACH8 DNA sequenceGene name: melanoma adhesion molecule (MCAM; MUC18)Unigene number: Hs.211579Probeset Accession #: D51069Nucleic Acid Accession #: NM_006500Coding sequence: 27-1967 (predicted start and stop codons underlined)ACTTGCGTCT CGCCCTCCGG CCAAGCATGG GGCTTCCCAG GCTGGTCTGC GCCTTCTTGC60TCGCCGCCTG CTGCTGCTGT CCTCGCGTCG CGGGTGTGCC CGGAGAGGCT GAGCAGCCTG120CGCCTGAGCT GGTGGAGGTG GAAGTGGGCA GCACAGCCCT TCTGAAGTGC GGCCTCTCCC180AGTCCCAAGG CAACCTCAGC CATGTCGACT GGTTTTCTGT CCACAAGGAG AAGCGGACGC240TCATCTTCCG TGTGCGCCAG GGCCAGGGCC AGAGCGAACC TGGGGAGTAC GAGCAGCGGC300TCAGCCTCCA GGACAGAGGG GCTACTCTGG CCCTGACTCA AGTCACCCCC CAAGACGAGC360GCATCTTCTT GTGCCAGGGC AAGCGCCCTC GGTCCCAGGA GTACCGCATC CAGCTCCGCG420TCTACAAAGC TCCGGAGGAG CCAAACATCC AGGTCAACCC CCTGGGCATC CCTGTGAACA480GTAAGGAGCC TGAGGAGGTC GCTACCTGTG TAGGGAGGAA CGGGTACCCC ATTCCTCAAG540TCATCTGGTA CAAGAATGGC CGGCCTCTGA AGGAGGAGAA GAACCGGGTC CACATTCAGT600CGTCCCAGAC TGTGGAGTCG AGTGGTTTGT ACACCTTGCA GAGTATTCTG AAGGCACAGC660TGGTTAAAGA AGACAAAGAT GCCCAGTTTT ACTGTGAGCT CAACTACCGG CTGCCCAGTG720GGAACCACAT GAAGGAGTCC AGGGAAGTCA CCGTCCCTGT TTTCTACCCG ACAGAAAAAG780TGTGGCTGGA AGTGGAGCCC GTGGGAATGC TGAAGGAAGG GGACCGCGTG GAAATCAGGT840GTTTGGCTGA TGGCAACCCT CCACCACACT TCAGCATCAG CAAGCAGAAC CCCAGCACCA900GGGAGGCAGA GGAAGAGACA ACCAACGACA ACGGGGTCCT GGTGCTGGAG CCTGCCCGGA960AGGAACACAG TGGGCGCTAT GAATGTCAGG CCTGGAACTT GGACACCATG ATATCGCTGC1020TGAGTGAACC ACAGGAACTA CTGGTGAACT ATGTGTCTGA CGTCCGAGTG AGTCCCGCAG1080CCCCTGAGAG ACAGGAAGGC AGCAGCCTCA CCCTGACCTG TGAGGCAGAG AGTAGCCAGG1140ACCTCGAGTT CCAGTGGCTG AGAGAAGAGA CAGACCAGGT GCTGGAAAGG GGGCCTGTGC1200TTCAGTTGCA TGACCTGAAA CGGGAGGCAG GAGGCGGCTA TCGCTGCGTG GCGTCTGTGC1260CCAGCATACC CGGCCTGAAC CGCACACAGC TGGTCAAGCT GGCCATTTTT GGCCCCCCTT1320GGATGGCATT CAAGGAGAGG AAGGTGTGGG TGAAAGAGAA TATGGTGTTG AATCTGTCTT1380GTGAAGCGTC 
AGGGCACCCC CGGCCCACCA TCTCCTGGAA CGTCAACGGC ACGGCAAGTG1440AACAAGACCA AGATCCACAG CGAGTCCTGA GCACCCTGAA TGTCCTCGTG ACCCCGGAGC1500TGTTGGAGAC AGGTGTTGAA TGCACGGCCT CCAACGACCT GGGCAAAAAC ACCAGCATCC1560TCTTCCTGGA GCTGGTCAAT TTAACCACCC TCACACCAGA CTCCAACACA ACCACTGGCC1620TCAGCACTTC CACTGCCAGT CCTCATACCA GAGCCAACAG CACCTCCACA GAGAGAAAGC1680TGCCGGAGCC GGAGAGCCGG GGCGTGGTCA TCGTGGCTGT GATTGTGTGC ATCCTGGTCC1740TGGCGGTGCT GGGCGCTGTC CTCTATTTCC TCTATAAGAA GGGCAAGCTG CCGTGCAGGC1800GCTCAGGGAA GCAGGAGATC ACGCTGCCCC CGTCTCGTAA GACCGAACTT GTAGTTGAAG1860TTAAGTCAGA TAAGCTCCCA GAAGAGATGG GCCTCCTGCA GGGCAGCAGC GGTGACAAGA1920GGGCTCCGGG AGACCAGGGA GAGAAATACA TCGATCTGAG GCATTAGCCC CGAATCACTT1980CAGCTCCCTT CCCTGCCTGG ACCATTCCCA GCTCCCTGCT CACTCTTCTC TCAGCCAAAG2040CCTCCAAAGG GACTAGAGAG AAGCCTCCTG CTCCCCTCAC CTGCACACCC CCTTTCAGAG2100GGCCACTGGG TTAGGACCTG AGGACCTCAC TTGGCCCTGC AAGCCGCTTT TCAGGGACCA2160GTCCACCACC ATCTCCTCCA CGTTGAGTGA AGCTCATCCC AAGCAAGGAG CCCCAGTCTC2220CCGAGCGGGT AGGAGAGTTT CTTGCAGAAC GTGTTTTTTC TTTACACACA TTATGGCTGT2280AAATACCTGG CTCCTGCCAG CAGCTGAGCT GGGTAGCCTC TCTGAGCTGG TTTCCTGCCC2340CAAAGGCTGG CTTCCACCAT CCAGGTGCAC CACTGAAGTG AGGACACACC GGAGCCAGGC2400GCCTGCTCAT GTTGAAGTGC GCTGTTCACA CCCGCTCCGG AGAGCACCCC AGCGGCATCC2460AGAAGCAGCT GCAGTGTTGC TGCCACCACC CTCCTGCTCG CCTCTTCAAA GTCTCCTGTG2520ACATTTTTTC TTTGGTCAGA AGCCAGGAAC TGGTGTCATT CCTTAAAAGA TACGTGCCGG2580GGCCAGGTGT GGTGGCTCAC GCCTGTAATC CCAGCACTTT GGGAGGCCGA GGCGGGCGGA2640TCACAAAGTC AGGACGAGAC CATCCTGGCT AACACGGTGA AACCCTGTCT CTACTAAAAA2700TACAAAAAAA AATTAGCTAG GCGTAGTGGT TGGCACCTAT AGTCCCAGCT ACTCGGAAGG2760CTGAAGCAGG AGAATGGTAT GAATCCAGGA GGTGGAGCTT GCAGTGAGCC GAGACCGTGC2820CACTGCACTC CAGCCTGGGC AACACAGCGA GACTCCGTCT CGAGGAAAAA AAAAGAAAAG2880ACGCGTACCT GCGGTGAGGA AGCTGGGCGC TGTTTTCGAG TTCAGGTGAA TTAGCCTCAA2940TCCCCGTGTT CACTTGCTCC CATAGCCCTC TTGATGGATC ACGTAAAACT GAAAGGCAGC3000GGGGAGCAGA CAAAGATGAG GTCTACACTG TCCTTCATGG GGATTAAAGC TATGGTTATA3060TTAGCACCAA ACTTCTACAA ACCAAGCTCA GGGCCCCAAC CCTAGAAGGG 
CCCAAATGAG3120AGAATGGTAC TTAGGGATGG AAAACGGGGC CTGGCTAGAG CTTCGGGTGT GTGTGTCTGT3180CTGTGTGTAT GCATACATAT GTGTGTATAT ATGGTTTTGT CAGGTGTGTA AATTTGCAAA3240TTGTTTCCTT TATATATGTA TGTATATATA TATATGAAAA TATATATATA TATGAAAAAT3300AAAGCTTAAT TGTCCCAGAA AATCATACAT TGCTTTTTTA TTCTACATGG GTACCACAGG3360AACCTGGGGG CCTGTGAAAC TACAACCAAA AGGCACACAA AACCGTTTCC AGTTGGCAGC3420AGAGATCAGG GGTTACCTCT GCTTCTGAGC AAATGGCTCA AGCTCTACCA GAGCAGACAG3480CTACCCTACT TTTCAGCAGC AAAACGTCCC GTATGACGCA GCACGAAGGG CCTGGCAGGC3540TGTTAGCAGG AGCTATGTCC CTTCCTATCG TTTCCGTCCA CTTACH9 DNA sequenceGene name: endothelin-1 (EDN1)Unigene number: Hs.2271Probeset Accession #: J05008Nucleic Acid Accession #: NM_001955Coding sequence: 337-975 (predicted start/stop codons underlined)GGAGCTGTTT ACCCCCACTC TAATAGGGGT TCAATATAAA AAGCCGGCAG AGAGCTGTCC60AAGTCAGACG CGCCTCTGCA TCTGCGCCAG GCGAACGGGT CCTGCGCCTC CTGCAGTCCC120AGCTCTCCAC CACCGCCGCG TGCGCCTGCA GACGCTCCGC TCGCTGCCTT CTCTCCTGGC180AGGCGCTGCC TTTTCTCCCC GTTAAAGGGC ACTTGGGCTG AAGGATCGCT TTGAGATCTG240AGGAACCCGC AGCGCTTTGA GGGACCTGAA GCTGTTTTTC TTCGTTTTCC TTTGGGTTCA300GTTTGAACGG GAGGTTTTTG ATCCCTTTTT TTCAGAATGG ATTATTTGCT CATGATTTTC360TCTCTGCTGT TTGTGGCTTG CCAAGGAGCT CCAGAAACAG CAGTCTTAGG CGCTGAGCTC420AGCGCGGTGG GTGAGAACGG CGGGGAGAAA CCCACTCCCA GTCCACCCTG GCGGCTCCGC480CGGTCCAAGC GCTGCTCCTG CTCGTCCCTG ATGGATAAAG AGTGTGTCTA CTTCTGCCAC540CTGGACATCA TTTGGGTCAA CACTCCCGAG CACGTTGTTC CGTATGGACT TGGAAGCCCT600AGGTCCAAGA GAGCCTTGGA GAATTTACTT CCCACAAAGG CAACAGACCG TGAGAATAGA660TGCCAATGTG CTAGCCAAAA AGACAAGAAG TGCTGGAATT TTTGCCAAGC AGGAAAAGAA720CTCAGGGCTG AAGACATTAT GGAGAAAGAC TGGAATAATC ATAAGAAAGG AAAAGACTGT780TCCAAGCTTG GGAAAAAGTG TATTTATCAG CAGTTAGTGA GAGGAAGAAA AATCAGAAGA840AGTTCAGAGG AACACCTAAG ACAAACCAGG TCGGAGACCA TGAGAAACAG CGTCAAATCA900TCTTTTCATG ATCCCAAGCT GAAAGGCAAG CCCTCCAGAG AGCGTTATGT GACCCACAAC960CGAGCACATT GGTGACAGAC TTCGGGGCCT GTCTGAAGCC ATAGCCTCCA CGGAGAGCCC1020TGTGGCCGAC TCTGCACTCT CCACCCTGGC TGGGATCAGA GCAGGAGCAT CCTCTGCTGG1080TTCCTGACTG GCAAAGGACC AGCGTCCTCG 
TTCAAAACAT TCCAAGAAAG GTTAAGGAGT1140TCCCCCAACC ATCTTCACTG GCTTCCATCA GTGGTAACTG CTTTGGTCTC TTCTTTCATC1200TGGGGATGAC AATGGACCTC TCAGCAGAAA CACACAGTCA CATTCGAATT CACJ1 DNA sequenceGene name: BMX non-receptor tyrosine kinaseUnigene number: Hs.27372Probeset Accession #: X83107Nucleic Acid Accession #: NM_001721Coding sequence: 34-2061 (predicted start/stop codons underlined)GCAAGCACGG AACAAGCTGA GACGGATGAT AATATGGATA CAAAATCTAT TCTAGAAGAA60CTTCTTCTCA AAAGATCACA GCAAAAGAAG AAAATGTCAC CAAATAATTA CAAAGAACGG120CTTTTTGTTT TGACCAAAAC AAACCTTTCC TACTATGAAT ATGACAAAAT GAAAAGGGGC180AGCAGAAAAG GATCCATTGA AATTAAGAAA ATCAGATGTG TGGAGAAAGT AAATCTCGAG240GAGCAGACGC CTGTAGAGAG ACAGTACCCA TTTCAGATTG TCTATAAAGA TGGGCTTCTC300TATGTCTATG CATCAAATGA AGAGAGCCGA AGTCAGTGGT TGAAAGCATT ACAAAAAGAG360ATAAGGGGTA ACCCCCACCT GCTGGTCAAG TACCATAGTG GGTTCTTCGT GGACGGGAAG420TTCCTGTGTT GCCAGCAGAG CTGTAAAGCA GCCCCAGGAT GTACCCTCTG GGAAGCATAT480GCTAATCTGC ATACTGCAGT CAATGAAGAG AAACACAGAG TTCCCACCTT CCCAGACAGA540GTGCTGAAGA TACCTCGGGC AGTTCCTGTT CTCAAAATGG ATGCACCATC TTCAAGTACC600ACTCTAGCCC AATATGACAA CGAATCAAAG AAAAACTATG GCTCCCAGCC ACCATCTTCA660AGTACCAGTC TAGCGCAATA TGACAGCAAC TCAAAGAAAA TCTATGGCTC CCAGCCAAAC720TTCAACATGC AGTATATTCC AAGGGAAGAC TTCCCTGACT GGTGGCAAGT AAGAAAACTG780AAAAGTAGCA GCAGCAGTGA AGATGTTGCA AGCAGTAACC AAAAAGAAAG AAATGTGAAT840CACACCACCT CAAAGATTTC ATGGGAATTC CCTGAGTCAA GTTCATCTGA AGAAGAGGAA900AACCTGGATG ATTATGACTG GTTTGCTGGT AACATCTCCA GATCACAATC TGAACAGTTA960CTCAGACAAA AGGGAAAAGA AGGAGCATTT ATGGTTAGAA ATTCGAGCCA AGTGGGAATG1020TACACAGTGT CCTTATTTAG TAAGGCTGTG AATGATAAAA AAGGAACTGT CAAACATTAC1080CACGTGCATA CAAATGCTGA GAACAAATTA TACCTGGCAG AAAACTACTG TTTTGATTCC1140ATTCCAAAGC TTATTCATTA TCATCAACAC AATTCAGCAG GCATGATCAC ACGGCTCCGC1200CACCCTGTGT CAACAAAGGC CAACAAGGTC CCCGACTCTG TGTCCCTGGG AAATGGAATC1260TGGGAACTGA AAAGAGAAGA GATTACCTTG TTGAAGGAGC TGGGAAGTGG CCAGTTTGGA1320GTGGTCCAGC TGGGCAAGTG GAAGGGGCAG TATGATGTTG CTGTTAAGAT GATCAAGGAG1380GGCTCCATGT CAGAAGATGA ATTCTTTCAG GAGGCCCAGA CTATGATGAA 
ACTCAGCCAT1440CCCAAGCTGG TTAAATTCTA TGGAGTGTGT TCAAAGGAAT ACCCCATATA CATAGTGACT1500GAATATATAA GCAATGGCTG CTTGCTGAAT TACCTGAGGA GTCACGGAAA AGGACTTGAA1560CCTTCCCAGC TCTTAGAAAT GTGCTACGAT GTCTGTGAAG GCATGGCCTT CTTGGAGAGT1620CACCAATTCA TACACCGGGA CTTGGCTGCT CGTAACTGCT TGGTGGACAG AGATCTCTGT1680GTGAAAGTAT CTGACTTTGG AATGACAAGG TATGTTCTTG ATGACCAGTA TGTCAGTTCA1740GTCGGAACAA AGTTTCCAGT CAAGTGGTCA GCTCCAGAGG TGTTTCATTA CTTCAAATAC1800AGCAGCAAGT CAGACGTATG GGCATTTGGG ATCCTGATGT GGGAGGTGTT CAGCCTGGGG1860AAGCAGCCCT ATGACTTGTA TGACAACTCC CAGGTGGTTC TGAAGGTCTC CCAGGGCCAC1920AGGCTTTACC GGCCCCACCT GGCATCGGAC ACCATCTACC AGATCATGTA CAGCTGCTGG1980CACGAGCTTC CAGAAAAGCG TCCCACATTT CAGCAACTCC TGTCTTCCAT TGAACCACTT2040CGGGAAAAAG ACAAGCATTGAAGAAGAAAT TAGGAGTGCT GATAAGAATG AATATAGATG2100CTGGCCAGCA TTTTCATTCA TTTTAAGGAA AGTAGGAAGG CATAAGTAAT TTTAGCTAGT2160TTTTAATAGT GTTCTCTGTA TTGTCTATTA TTTAGAAATG AACAAGGCAG GAAACAAAAG2220ATTCCCTTGA AATTTAGATC AAATTAGTAA TTTTGTTTTA TGCTGCTCCT GATATAACAC2280TTTCCAGCCT ATAGCAGAAG CACATTTTCA GACTGCAATA TAGAGACTGT GTTCATGTGT2340AAAGACTGAG CAGAACTGAA AAATTACTTA TTGGATATTC ATTCTTTTCT TTATATTGTC2400ATTGTCACAA CAATTAAATA TACTACCAAG TACAGAAATG TGGAAAAAAA AAACCGACJ4 DNA sequenceGene name: prostaglandin G/H synthase 2 (COX-2; PGHS-2)Unigene number: Hs.196384Probeset Accession #: D28235 Nucleic Acid Accession #: NM_000963Coding sequence: 135-1949 (predicted start/stop codons underlined)CAATTGTCAT ACGACTTGCA GTGAGCGTCA GGAGCACGTC CAGGAACTCC TCAGCAGCGC60CTCCTTCAGC TCCACAGCCA GACGCCCTCA GACAGCAAAG CCTACCCCCG CGCCGCGCCC120TGCCCGCCGC TCGGATGCTC GCCCGCGCCC TGCTGCTGTG CGCGGTCCTG GCGCTCAGCC180ATACAGCAAA TCCTTGCTGT TCCCACCCAT GTCAAAACCG AGGTGTATGT ATGAGTGTGG240GATTTGACCA GTATAAGTGC GATTGTACCC GGACAGGATT CTATGGAGAA AACTGCTCAA300CACCGGAATT TTTGACAAGA ATAAAATTAT TTCTGAAACC CACTCCAAAC ACAGTGCACT360ACATACTTAC CCACTTCAAG GGATTTTGGA ACGTTGTGAA TAACATTCCC TTCCTTCGAA420ATGCAATTAT GAGTTATGTC TTGACATCCA GATCACATTT GATTGACAGT CCACCAACTT480ACAATGCTGA CTATGGCTAC AAAAGCTGGG AAGCCTTCTC 
TAACCTCTCC TATTATACTA540GAGCCCTTCC TCCTGTGCCT GATGATTGCC CGACTCCCTT GGGTGTCAAA GGTAAAAAGC600AGCTTCCTGA TTCAAATGAG ATTGTGGAAA AATTGCTTCT AAGAAGAAAG TTCATCCCTG660ATCCCCAGGG CTCAAACATG ATGTTTGCAT TCTTTGCCCA GCACTTCACG CATCAGTTTT720TCAAGACAGA TCATAAGCGA GGGCCAGCTT TCACCAACGG GCTGGGCCAT GGGGTGGACT780TAAATCATAT TTACGGTGAA ACTCTGGCTA GACAGCGTAA ACTGCGCCTT TTCAAGGATG840GAAAAATGAA ATATCAGATA ATTGATGGAG AGATGTATCC TCCCACAGTC AAAGATACTC900AGGCAGAGAT GATCTACCCT CCTCAAGTCC CTGAGCATCT ACGGTTTGCT GTGGGGCAGG960AGGTCTTTGG TCTGGTGCCT GGTCTGATGA TGTATGCCAC AATCTGGCTG CGGGAACACA1020ACAGAGTATG CGATGTGCTT AAACAGGAGC ATCCTGAATG GGGTGATGAG CAGTTGTTCC1080AGACAAGCAG GCTAATACTG ATAGGAGAGA CTATTAAGAT TGTGATTGAA GATTATGTGC1140AACACTTGAG TGGCTATCAC TTCAAACTGA AATTTGACCC AGAACTACTT TTCAACAAAC1200AATTCCAGTA CCAAAATCGT ATTGCTGCTG AATTTAACAC CCTCTATCAC TGGCATCCCC1260TTCTGCCTGA CACCTTTCAA ATTCATGACC AGAAATACAA CTATCAACAG TTTATCTACA1320ACAACTCTAT ATTGCTGGAA CATGGAATTA CCCAGTTTGT TGAATCATTC ACCAGGCAAA1380TTGCTGGCAG GGTTGCTGGT GGTAGGAATG TTCCACCCGC AGTACAGAAA GTATCACAGG1440CTTCCATTGA CCAGAGCAGG CAGATGAAAT ACCAGTCTTT TAATGAGTAC CGCAAACGCT1500TTATGCTGAA GCCCTATGAA TCATTTGAAG AACTTACAGG AGAAAAGGAA ATGTCTGCAG1560AGTTGGAAGC ACTCTATGGT GACATCGATG CTGTGGAGCT GTATCCTGCC CTTCTGGTAG1620AAAAGCCTCG GCCAGATGCC ATCTTTGGTG AAACCATGGT AGAAGTTGGA GCACCATTCT1680CCTTGAAAGG ACTTATGGGT AATGTTATAT GTTCTCCTGC CTACTGGAAG CCAAGCACTT1740TTGGTGGAGA AGTGGGTTTT CAAATCATCA ACACTGCCTC AATTCAGTCT CTCATCTGCA1800ATAACGTGAA GGGCTGTCCC TTTACTTCAT TCAGTGTTCC AGATCCAGAG CTCATTAAAA1860CAGTCACCAT CAATGCAAGT TCTTCCCGCT CCGGACTAGA TGATATCAAT CCCACAGTAC1920TACTAAAAGA ACGTTCGACT GAACTGTAGA AGTCTAATGA TCATATTTAT TTATTTATAT1980GAACCATGTC TATTAATTTA ATTATTTAAT AATATTTATA TTAAACTCCT TATGTTACTT2040AACATCTTCT GTAACAGAAG TCAGTACTCC TGTTGCGGAG AAAGGAGTCA TACTTGTGAA2100GACTTTTATG TCACTACTCT AAAGATTTTG CTGTTGCTGT TAAGTTTGGA AAACAGTTTT2160TATTCTGTTT TATAAACCAG AGAGAAATGA GTTTTGACGT CTTTTTACTT GAATTTCAAC2220TTATATTATA AGAACGAAAG TAAAGATGTT TGAATACTTA 
AACACTATCA CAAGATGGCA2280AAATGCTGAA AGTTTTTACA CTGTCGATGT TTCCAATGCA TCTTCCATGA TGCATTAGAA2340GTAACTAATG TTTGAAATTT TAAAGTACTT TTGGTTATTT TTCTGTCATC AAACAAAAAC2400AGGTATCAGT GCATTATTAA ATGAATATTT AAATTAGACA TTACCAGTAA TTTCATGTCT2460ACTTTTTAAA ATCAGCAATG AAACAATAAT TTGAAATTTC TAAATTCATA GGGTAGAATC2520ACCTGTAAAA GCTTGTTTGA TTTCTTAAAG TTATTAAACT TGTACATATA CCAAAAAGAA2580GCTGTCTTGG ATTTAAATCT GTAAAATCAG ATGAAATTTT ACTACAATTG CTTGTTAAAA2640TATTTTAAAA GTGATGTTCC TTTTTCACCA AGAGTATAAA CCTTTTTAGT GTGACTGTTA2700AAACTTCCTT TTAAATCAAA ATGCCAAATT TATTAAGGTG GTGGAGCCAC TGCAGTGTTA2760TCTCAAAATA AGAATATTTT GTTGAGATAT TCCAGAATTT GTTTATATGG CTGGTAACAT2820GTAAAATCTA TATCAGCAAA AGGGTCTACC TTTAAAATAA GCAATAACAA AGAAGAAAAC2880CAAATTATTG TTCAAATTTA GGTTTAAACT TTTGAAGCAA ACTTTTTTTT ATCCTTGTGC2940ACTGCAGGCC TGGTACTCAG ATTTTGCTAT GAGGTTAATG AAGTACCAAG CTGTGCTTGA3000ATAACGATAT GTTTTCTCAG ATTTTCTGTT GTACAGTTTA ATTTAGCAGT CCATATCACA3060TTGCAAAAGT AGCAATGACC TCATAAAATA CCTCTTCAAA ATGCTTAAAT TCATTTCACA3120CATTAATTTT ATCTCAGTCT TGAAGCCAAT TCAGTAGGTG CATTGGAATC AAGCCTGGCT3180ACCTGCATGC TGTTCCTTTT CTTTTCTTCT TTTAGCCATT TTGCTAAGAG ACACAGTCTT3240CTCATCACTT CGTTTCTCCT ATTTTGTTTT ACTAGTTTTA AGATCAGAGT TCACTTTCTT3300TGGACTCTGC CTATATTTTC TTACCTGAAC TTTTGCAAGT TTTCAGGTAA ACCTCAGCTC3360AGGACTGCTA TTTAGCTCCT CTTAAGAAGA TTAAAAGAGA AAAAAAAAGG CCCTTTTAAA3420AATAGTATAC ACTTATTTTA AGTGAAAAGC AGAGAATTTT ATTTATAGCT AATTTTAGCT3480ATCTGTAACC AAGATGGATG CAAAGAGGCT AGTGCCTCAG AGAGAACTGT ACGGGGTTTG3540TGACTGGAAA AAGTTACGTT CCCATTCTAA TTAATGCCCT TTCTTATTTA AAAACAAAAC3600CAAATGATAT CTAAGTAGTT CTCAGCAATA ATAATAATGA CGATAATACT TCTTTTCCAC3660ATCTCATTGT CACTGACATT TAATGGTACT GTATATTACT TAATTTATTG AAGATTATTA3720TTTATGTCTT ATTAGGACAC TATGGTTATA AACTGTGTTT AAGCCTACAA TCATTGATTT3780TTTTTTGTTA TGTCACAATC AGTATATTTT CTTTGGGGTT ACCTCTCTGA ATATTATGTA3840AACAATCCAA AGAAATGATT GTATTAAGAT TTGTGAATAA ATTTTTAGAA ATCTGATTGG3900CATATTGAGA TATTTAAGGT TGAATGTTTG TCCTTAGGAT AGGCCTATGT GCTAGCCCAC3960AAAGAATATT GTCTCATTAG CCTGAATGTG 
CCATAAGACT GACCTTTTAA AATGTTTTGA4020GGGATCTGTG GATGCTTCGT TAATTTGTTC AGCCACAATT TATTGAGAAA ATATTCTGTG4080TCAAGCACTG TGGGTTTTAA TATTTTTAAA TCAAACGCTG ATTACAGATA ATAGTATTTA4140TATAAATAAT TGAAAAAAAT TTTCTTTTGG GAAGAGGGAG AAAATGAAAT AAATATCATT4200AAAGATAACT CAGGAGAATC TTCTTTACAA TTTTACGTTT AGAATGTTTA AGGTTAAGAA4260AGAAATAGTC AATATGCTTG TATAAAACAC TGTTCACTGT TTTTTTTAAA AAAAAAACTT4320GATTTGTTAT TAACATTGAT CTGCTGACAA AACCTGGGAA TTTGGGTTGT GTATGCGAAT4380GTTTCAGTGC CTCAGACAAA TGTGTATTTA ACTTATGTAA AAGATAAGTC TGGAAATAAA4440TGTCTGTTTA TTTTTGTACT ATTTAACJ6 DNA sequenceGene name: SEC14-like-1Unigene number: Hs.75232Probeset Accession #: D67029Nucleic Acid Accession #: NM_003003Coding sequence: 304-2451 (predicted start/stop codons underlined)CAAGTGCCGT CGCCGCGCCC CTTCCCCCTC CCGCCTCCCC GGCCCCCTCC CCGGAACCGG60CGGTCGAGCT ACGGTCGCGG ACGAGTGGAA CCGAGACTGC CCCGCGGAGC CGCCGGTATG120AGCGCCCCTC GCCACCCCGT GTCCCAGGCC CGGCCTTTCT GACAAGAGCT AGACTTCGGG180CTCCTTGAGG ATATTCAGTT TTGTATGTTT GAATATCCTC TCACCATGTT CAGCATAAAG240TACCATTCTT AATGATTATC CTCAACAAGA CAGGTGTGAG AGGGTTGCTG TTGCATTGCA300ATCATGGTGC AAAAATACCA GTCCCCAGTG AGAGTGTACA AATACCCCTT TGAATTAATT360ATGGCTGCCT ATGAAAGGAG GTTCCCTACA TGTCCTTTGA TTCCGATGTT CGTGGGCAGT420GACACTGTGA GTGAATTCAA GAGCGAAGAT GGGGCTATTC ATGTCATTGA AAGGCGCTGC480AAGCTGGATG TAGATGCACC GAGACTGCTG AAGAAGATTG CAGGAGTTGA TTATGTTTAT540TTTGTCCAGA AAAACTCACT GAATTCTCGG GAACGTACTT TGCACATTGA GGCTTATAAT600GAAACGTTTT CCAATCGGGT CATCATTAAT GAGCATTGCT GCTACACCGT TCACCCTGAA660AATGAAGATT GGACCTGTTT TGAACAGTCT GCAAGTTTAG ATATTAAATC TTTCTTTGGT720TTTGAAAGTA CAGTGGAAAA AATTGCAATG AAACAATATA CCAGCAACAT TAAAAAAGGA780AAGGAAATCA TCGAATACTA CCTTCGCCAA TTAGAAGAAG AAGGCATAAC CTTTGTGCCC840CGTTGGAGTC CGCCTTCCAT CACGCCCTCT TCAGAGACAT CTTCATCATC CTCCAAGAAA900CAAGCAGCGT CCATGGCCGT CGTCATCCCA GAAGCTGCCC TCAAGGAGGG GCTGAGTGGT960GATGCCCTCA GCAGCCCCAG TGCACCTGAG CCCGTGGTGG GCACCCCTGA CGACAAACTA1020GATGCCGACC ACATCAAGAG ATACCTGGGC GATTTGACTC CGCTGCAGGA GAGCTGCCTC1080ATTAGACTTC GCCAGTGGCT CCAGGAGACC 
CACAAGGGCA AAATTCCAAA AGATGAGCAT1140ATTCTTCGGT TCCTCCGTGC ACGGGATTTT AATATTGACA AAGCCAGAGA GATCATGTGT1200CAGTCTTTGA CGTGGAGAAA GCAGCATCAG GTAGACTACA TTCTTGAAAC CTGGACCCCT1260CCTCAGGTCC TTCAGGATTA CTACGCGGGA GGCTGGCATC ATCACGACAA AGATGGGCGG1320CCCCTCTACG TGCTCAGGCT GGGGCAGATG GACACCAAAG GCTTGGTGAG AGCGCTCGGG1380GAGGAAGCCC TGCTGAGATA CGTTCTCTCC GTAAATGAAG AACGGCTAAG GCGATGCGAA1440GAGAATACAA AAGTCTTTGG TCGGCCTATC AGCTCATGGA CCTGCCTGGT GGACTTGGAA1500GGGCTGAACA TGCGCCACTT GTGGAGACCT GGTGTGAAAG CGCTGCTGCG GATCATCGAG1560GTGGTGGAGG CCAACTACCC TGAGACACTG GGCCGCCTTC TCATCCTGCG GGCGCCCAGG1620GTATTTCCTG TGCTCTGGAC GCTGGTTAGT CCGTTCATTG ATGACAACAC CAGAAGGAAG1680TTCCTCATTT ATGCAGGAAA TGACTACAG GGTCCTGGAG GCCTGCTGGA TTACATCGAC1740AAAGAGATTA TTCCAGATTT CCTGAGTGG GAGTGCATGT GCGAAGTGCC AGAGGGTGGA1800CTGGTCCCCA AATCTCTGTA CCGGACTGCA GAGGAGCTGG AGAACGAAGA CCTGAAGCTC1860TGGACTGAGA CCATCTACCA GTCTGCAAGC GTCTTCAAAG GAGCCCCACA TGAGATTCTC1920ATTCAGATTG TGGATGCCTC GTCAGTCATC ACTTGGGATT TCGACGTGTG CAAAGGGGAC1980ATTGTGTTTA ACATCTATCA CTCCAAGAGG TCGCCACAAC CACCCAAAAA GGACTCCCTG2040GGAGCCCACA GCATCACCTC TCCGGGTGGG AACAATGTGC AGCTCATAGA CAAAGTCTGG2100CAGCTGGGCC GCGACTACAG CATGGTGGAG TCGCCTCTGA TCTGCAAAGA AGGAGAAAGC2160GTGCAGGGTT CCCATGTGAC CAGGTGGCCG GGCTTCTACA TCCTGCAGTG GAAATTCCAC2220AGCATGCCTG CGTGCGCCGC CAGCAGCCTT CCCCGGGTGG ACGACGTGCT TGCGTCCCTG2280CAGGTCTCTT CGCACAAGTG TAAAGTGATG TACTACACCG AGGTGATCGG CTCGGAGGAT2340TTCAGAGGTT CCATGACGAG CCTGGAGTCC AGCCACAGCG GCTTCTCCCA GCTGAGTGCC2400GCCACCACCT CCTCCAGCCA GTCCCACTCC AGCTCCATGA TCTCCAGGTAGTGCCGCGCT2460GCCTGCACCT AGTGTGCAGA GGGGACGGCC GCCCCTCCTC GGACAGCAGC TGCACCCGCC2520CACCCAGCGG CGACATTGTA CAGACTCCTC TCACCTCTAG ATAGCAAATA GCTCTCAGAT2580GGTAAACGTA GTCGTTTGAT CCCAAAACTA CCTTGGCAGG TAGTTTTAAC TCTGATCCTA2640ACTTAACTCA ATAGCCATAG ATTTTGTATA CGTTGTGCAC AAAATCCAAC CAGAGCGCAA2700GGGCTCTCTT GAAAGAAAAG TAGTTTCTGT ACCAATTAAA GGATTGACGT GGTCTCAGAT2760ATTGATGCAA AAAATTTTTC CAACGAACTC CGCATTGTCC ATTAGTGAAT GAATTCCTGT2820GACATCCTCC AGAGATGGCC CCTCCTCACC 
TGGGACGGAA GCTGCCAGCT CGCTTCCCCC2880AAGCTGCCTC ATGGCCCGCA CGCCGCCTCA CGGCCCCCAT GCTTCCCGCC AGTCAAGATG2940GTCTGTGGAC TTAGGGCCAG CCCTTGAGGT CCTTATCCTC TGAGGATTCA GAGGTTGCCT3000GCGGAGTACC TTGTCCCAGG GCCAGACACA CCCACACCAC CCACTGTCTG CAGTGGGGCC3060GGGGGCTCAG GAGGGGCTCT CAGGGACTCC TGGTGACTCC AGGAAAATGC TGCCATCGTT3120AAACATTACT TTCTCTTTCC TCCTTTTCAA ATCTTTTTGA TACTTTTTAG AGCAGGATTT3180TTCTGTATGT GAACTTGGGT GGGGGGGTTC TTCCCGTTTC CTTCCGTGCG TCGCCCCTCT3240CACCTGCAGT CAGCTCCCAG CCCAGTGTAG GCCATCTCCT CTGTGCCCTC TGGAGGCTCA3300TTGTCTCAGA GCCCAGACAG TTCCAGCCAC TAGGAGGCCG TCTTGGAACC AGCAAGTCGC3360ATTTGCCACT TGACACTGTC CATGGGGTTT TATTAGTAGC TAAGCAGCAG CTCTCGCATC3420CACTTCAGGG TGGCGTGTGG CATGTAGGAG TCCTGCTTCT TTGTACATGG GAATTGTGGA3480CTCATGCGTG TGTGTGTGTG CATGTGCTGT GTGTGTGCAT GTGTGCATGA CGGTGGGGGT3540GCTGGGGGGA CGGGGTGAGT GGAAACTTAG TTTGAGTAAT GAAGGAATCT TCACAGAAGC3600AAATCAGAAT ATGGGATTTG TTTGCCTTTT ACATTTTGTT TAATTCCTGA TTTTAAAGCC3660TGCTCTATCT GGTACAGGCC CTTATTTTTT CAGCTTTTTA TGGGAAAAGC AGGTTATTTG3720AGAATCTGTC CAGAAGTTGC ATAGGGGATG GCCTCCACGA TAAGGACATG CAACACGTGT3780TTCTGTGTGC AGCAGAGGCC GTGTTTTTCA TGCCAAACCC CACGCGGCTG TCAACTGTGT3840GCGTGGTAGG CATGGAGATC CTGGTTGTGC CGTCTCAGCT CCGCTCTGAA GGCACTGTGT3900GGGTGCTGCG TGACTGGAGA GCTGTGTGGA GGCCATGTGT GCCCCGTGCA GGGATCAGGA3960GGGCGGGGGA GGGACCGAGC AGCCCTCTTG CCCGGTCGGG TCAGCCCTAG TGGCTGCCTG4020CACACTGTAG ACGTCCCAGG GCCTGTGCTG TGATCACCTG CCTTTGGACC ACATTTGTGT4080TTGCTCTTAG AGATCGAGCT CCTCAGTGGT ACCTGAAGCC TTTGCTTCCG GAAAGCGCGG4140TAGGGTTCGT AGGTAGGGCT AGTAGGTAGG GTTAGTAGGT AGGGCTAGTA GGTAGGGCTA4200GTAGGTAGGG TTAGTAGGTA GGGTTCGTAG GTAGGGCTGG TAGGTAGGGT TAGTAGGTAG4260GGCTAGTAGG TAGGGTTCGT AGGTAGGGCT AGTAGGTAGG GTTAGTAGGT AGGGCTAGTA4320GGTAGGGCTA GTAGGTAGGG TTAGTAGGTA GGGTTCGTAG GTAGGGCTGG TAGGTAGGGT4380TAGTAGGTAG GGCTAGTAGG TAGGGTTCGT AGGTAGGGCT AGTAGGTAGG GTTAGTAGGT4440AGGGCTAGTA GGTAGGGCTA GTAGGTAGGG TTAGTAGGTA GGGTTCGTAG GTAGGGCTGG4500TAGGTAGGGT TAGTAGGTAG GGCTAGTAGG TAGGGCTAGT AGGTAGGGCT AGTAGGTAGG4560GTTAGTAGGT AGGGCTAGTA 
GGTAGGGCTA GTAGGTAGGG TTAGTAGGTA GGGTTCGTAG4620GTAGGGCTGG TAGGTAGGGT TAGTAGGTAG GGCTAGTAGG TAGGGCTAGT AGGTAGGGCT4680AGTAGGTAGG GCTAGTAGGT AGGGCTAGTA GGTAGGGCTA GTAGGTAGGG CTAGTAGGTA4740GGGTTCGTAG GTAGGGTTCG TAGGTAGGGT TCGTAGGTAG GGTTAGTAGC GCGTCTGTGC4800TGCTTCCACC TGGTGCTTCC TGTTCCCAAA TCACAAGGGC CTGAAGGTGG TCCCTGCTTT4860CTCTTTCTCT TTCTCTGTGT CTCAGATGGC GATTTTGCTG ACAGCTGCCA AGAAAATGCT4920TCACTCAACA GTCCTCATGT GCCCAGAGAT GTTTATAGAA CTGTTTGAAT TGCAGCCATC4980CCCTGCCCCC TCCCAGGCTG AAGATCTGTT CTTTTTAAGT TGATTCGGGA GTGGCATTCT5040TTTATACCCA AAGACTGTAG TGCATCTTGA AGAGCTCAAA GCACATGACC GCACAAATGC5100TTACAGGGTT TCCTCCCGAG TAATCCAATC TCACTCCCCT TGTAAGGGAA TTCTGGGGCA5160GCTATGGTTT GAGTATGCAG TTTGCATCGT GTTTCTACCT TTAGTACCTT GCCACTCTTT5220TAAAACGCTG CTGTCATTTC CCATTTCTTA GTACTAATGA TTCTTTGATT CTCCCTCTAT5280TATGTCTTAA TTCACTTTCC TTCCTAAATT TGTTATTTGC ATATCAAATT CTGTAAATGT5340TTTGTAAAGA TATTACCTCA CTTGGTAATA CAATACTGAT AGTCTTTAAA AGATTTTTTT5400ATTGTTATCA ATAATAAATG TGAACTATTT AAAGACJ8 DNA sequenceGene name: intercellular adhesion molecule 1 (ICAM1; CD54)Unigene number: Hs.168383Probeset Accession #: M24283Nucleic Acid Accession #: NM_000201Coding sequence: 58-1656 (predicted start/stop codons underlined)GCGCCCCAGT CGACGCTGAG CTCCTCTGCT ACTCAGAGTT GCAACCTCAG CCTCGCTATG60GCTCCCAGCA GCCCCCGGCC CGCGCTGCCC GCACTCCTGG TCCTGCTCGG GGCTCTGTTC120CCAGGACCTG GCAATGCCCA GACATCTGTG TCCCCCTCAA AAGTCATCCT GCCCCGGGGA180GGCTCCGTGC TGGTGACATG CAGCACCTCC TGTGACCAGC CCAAGTTGTT GGGCATAGAG240ACCCCGTTGC CTAAAAAGGA GTTGCTCCTG CCTGGGAACA ACCGGAAGGT GTATGAACTG300AGCAATGTGC AAGAAGATAG CCAACCAATG TGCTATTCAA ACTGCCCTGA TGGGCAGTCA360ACAGCTAAAA CCTTCCTCAC CGTGTACTGG ACTCCAGAAC GGGTGGAACT GGCACCCCTC420CCCTCTTGGC AGCCAGTGGG CAAGAACCTT ACCCTACGCT GCCAGGTGGA GGGTGGGGCA480CCCCGGGCCA ACCTCACCGT GGTGCTGCTC CGTGGGGAGA AGGAGCTGAA ACGGGAGCCA540GCTGTGGGGG AGCCCGCTGA GGTCACGACC ACGGTGCTGG TGAGGAGAGA TCACCATGGA600GCCAATTTCT CGTGCCGCAC TGAACTGGAC CTGCGGCCCC AAGGGCTGGA GCTGTTTGAG660AACACCTCGG CCCCCTACCA GCTCCAGACC TTTGTCCTGC 
CAGCGACTCC CCCACAACTT720GTCAGCCCCC GGGTCCTAGA GGTGGACACG CAGGGGACCG TGGTCTGTTC CCTGGACGGG780CTGTTCCCAG TCTCGGAGGC CCAGGTCCAC CTGGCACTGG GGGACCAGAG GTTGAACCCC840ACAGTGACCT ATGGCAACGA CTCCTTCTCG GCCAAGGCCT CAGTCAGTGT GACCGCAGAG900GACGAGGGCA CCCAGCGGCT GACGTGTGCA GTAATACTGG GGAACCAGAG CCAGGAGACA960CTGCAGACAG TGACCATCTA CAGCTTTCCG GCGCCCAACG TGATTCTGAC GAAGCCAGAG1020GTCTCAGAAG GGACCGAGGT GACAGTGAAG TGTGAGGCCC ACCCTAGAGC CAAGGTGACG1080CTGAATGGGG TTCCAGCCCA GCCACTGGGC CCGAGGGCCC AGCTCCTGCT GAAGGCCACC1140CCAGAGGACA ACGGGCGCAG CTTCTCCTGC TCTGCAACCC TGGAGGTGGC CGGCCAGCTT1200ATACACAAGA ACCAGACCCG GGAGCTTCGT GTCCTGTATG GCCCCCGACT GGACGAGAGG1260GATTGTCCGG GAAACTGGAC GTGGCCAGAA AATTCCCAGC AGACTCCAAT GTGCCAGGCT1320TGGGGGAACC CATTGCCCGA GCTCAAGTGT CTAAAGGATG GCACTTTCCC ACTGCCCATC1380GGGGAATCAG TGACTGTCAC TCGAGATCTT GAGGGCACCT ACCTCTGTCG GGCCAGGAGC1440ACTCAAGGGG AGGTCACCCG CGAGGTGACC GTGAATGTGC TCTCCCCCCG GTATGAGATT1500GTCATCATCA CTGTGGTAGC AGCCGCAGTC ATAATGGGCA CTGCAGGCCT CAGCACGTAC1560CTCTATAACC GCCAGCGGAA GATCAAGAAA TACAGACTAC AACAGGCCCA AAAAGGGACC1620CCCATGAAAC CGAACACACA AGCCACGCCT CCCTGAACCT ATCCCGGGAC AGGGCCTCTT1680CCTCGGCCTT CCCATATTGG TGGCAGTGGT GCCACACTGA ACAGAGTGGA AGACATATGC1740CATGCAGCTA CACCTACCGG CCCTGGGACG CCGGAGGACA GGGCATTGTC CTCAGTCAGA1800TACAACAGCA TTTGGGGCCA TGGTACCTGC ACACCTAAAA CACTAGGCCA CGCATCTGAT1860CTGTAGTCAC ATGACTAAGC CAAGAGGAAG GAGCAAGACT CAAGACATGA TTGATGGATG1920TTAAAGTCTA GCCTGATGAG AGGGGAAGTG GTGGGGGAGA CATAGCCCCA CCATGAGGAC1980ATACAACTGG GAAATACTGA AACTTGCTGC CTATTGGGTA TGCTGAGGCC CACAGACTTA2040CAGAAGAAGT GGCCCTCCAT AGACATGTGT AGCATCAAAA CACAAAGGCC CACACTTCCT2100GACGGATGCC AGCTTGGGCA CTGCTGTCTA CTGACCCCAA CCCTTGATGA TATGTATTTA2160TTCATTTGTT ATTTTACCAG CTATTTATTG AGTGTCTTTT ATGTAGGCTA AATGAACATA2220GGTCTCTGGC CTCACGGAGC TCCCAGTCCA TGTCACATTC AAGGTCACCA GGTACAGTTG2280TACAGGTTGT ACACTGCAGG AGAGTGCCTG GCAAAAAGAT CAAATGGGGC TGGGACTTCT2340CATTGGCCAA CCTGCCTTTC CCCAGAAGGA GTGATTTTTC TATCGGCACA AAAGCACTAT2400ATGGACTGGT AATGGTTCAC AGGTTCAGAG ATTACCCAGT 
GAGGCCTTAT TCCTCCCTTC2460CCCCCAAAAC TGACACCTTT GTTAGCCACC TCCCCACCCA CATACATTTC TGCCAGTGTT2520CACAATGACA CTCAGCGGTC ATGTCTGGAC ATGAGTGCCC AGGGAATATG CCCAAGCTAT2580GCCTTGTCCT CTTGTCCTGT TTGCATTTCA CTGGGAGCTT GCACTATTGC AGCTCCAGTT2640TCCTGCAGTG ATCAGGGTCC TGCAAGCAGT GGGGAAGGGG GCCAAGGTAT TGGAGGACTC2700CCTCCCAGCT TTGGAAGGGT CATCCGCGTG TGTGTGTGTG TGTATGTGTA GACAAGCTCT2760CGCTCTGTCA CCCAGGCTGG AGTGCAGTGG TGCAATCATG GTTCACTGCA GTCTTGACCT2820TTTGGGCTCA AGTGATCCTC CCACCTCAGC CTCCTGAGTA GCTGGGACCA TAGGCTCACA2880ACACCACACC TGGCAAATTT GATTTTTTTT TTTTTTTTCA GAGACGGGGT CTCGCAACAT2940TGCCCAGACT TCCTTTGTGT TAGTTAATAA AGCTTTCTCA ACTGCCACK3 DNA sequenceGene name: angiopoietin 1 receptor (TIE-2; TEK)Unigene number: Hs.89640Probeset Accession #: L06139 Nucleic Acid Accession #: NM_000459Coding sequence: 149-3523 (predicted start/stop codons underlined)CTTCTGTGCT GTTCCTTCTT GCCTCTAACT TGTAAACAAG ACGTACTAGG ACGATGCTAA60TGGAAAGTCA CAAACCGCTG GGTTTTTGAA AGGATCCTTG GGACCTCATG CACATTTGTG120GAAACTGGAT GGAGAGATTT GGGGAAGCATGGACTCTTTA GCCAGCTTAG TTCTCTGTGG180AGTCAGCTTG CTCCTTTCTG GAACTGTGGA AGGTGCCATG GACTTGATCT TGATCAATTC240CCTACCTCTT GTATCTGATG CTGAAACATC TCTCACCTGC ATTGCCTCTG GGTGGCGCCC300CCATGAGCCC ATCACCATAG GAAGGGACTT TGAAGCCTTA ATGAACCAGC ACCAGGATCC360GCTGGAAGTT ACTCAAGATG TGACCAGAGA ATGGGCTAAA AAAGTTGTTT GGAAGAGAGA420AAAGGCTAGT AAGATCAATG GTGCTTATTT CTGTGAAGGG CGAGTTCGAG GAGAGGCAAT480CAGGATACGA ACCATGAAGA TGCGTCAACA AGCTTCCTTC CTACCAGCTA CTTTAACTAT540GACTGTGGAC AAGGGAGATA ACGTGAACAT ATCTTTCAAA AAGGTATTGA TTAAAGAAGA600AGATGCAGTG ATTTACAAAA ATGGTTCCTT CATCCATTCA GTGCCCCGGC ATGAAGTACC660TGATATTCTA GAAGTACACC TGCCTCATGC TCAGCCCCAG GATGCTGGAG TGTACTCGGC720CAGGTATATA GGAGGAAACC TCTTCACCTC GGCCTTCACC AGGCTGATAG TCCGGAGATG780TGAAGCCCAG AAGTGGGGAC CTGAATGCAA CCATCTCTGT ACTGCTTGTA TGAACAATGG840TGTCTGCCAT GAAGATACTG GAGAATGCAT TTGCCCTCCT GGGTTTATGG GAAGGACGTG900TGAGAAGGCT TGTGAACTGC ACACGTTTGG CAGAACTTGT AAAGAAAGGT GCAGTGGACA960AGAGGGATGC AAGTCTTATG TGTTCTGTCT CCCTGACCCC TATGGGTGTT 
CCTGTGCCAC1020AGGCTGGAAG GGTCTGCAGT GCAATGAAGC ATGCCACCCT GGTTTTTACG GGCCAGATTG1080TAAGCTTAGG TGCAGCTGCA ACAATGGGGA GATGTGTGAT CGCTTCCAAG GATGTCTCTG1140CTCTCCAGGA TGGCAGGGGC TCCAGTGTGA GAGAGAAGGC ATACCGAGGA TGACCCCAAA1200GATAGTGGAT TTGCCAGATC ATATAGAAGT AAACAGTGGT AAATTTAATC CCATTTGCAA1260AGCTTCTGGC TGGCCGCTAC CTACTAATGA AGAAATGACC CTGGTGAAGC CGGATGGGAC1320AGTGCTCCAT CCAAAAGACT TTAACCATAC GGATCATTTC TCAGTAGCCA TATTCACCAT1380CCACCGGATC CTCCCCCCTG ACTCAGGAGT TTGGGTCTGC AGTGTGAACA CAGTGGCTGG1440GATGGTGGAA AAGCCCTTCA ACATTTCTGT TAAAGTTCTT CCAAAGCCCC TGAATGCCCC1500AAACGTGATT GACACTGGAC ATAACTTTGC TGTCATCAAC ATCAGCTCTG AGCCTTACTT1560TGGGGATGGA CCAATCAAAT CCAAGAAGCT TCTATACAAA CCCGTTAATC ACTATGAGGC1620TTGGCAACAT ATTCAAGTGA CAAATGAGAT TGTTACACTC AACTATTTGG AACCTCGGAC1680AGAATATGAA CTCTGTGTGC AACTGGTCCG TCGTGGAGAG GGTGGGGAAG GGCATCCTGG1740ACCTGTGAGA CGCTTCACAA CAGCTTCTAT CGGACTCCCT CCTCCAAGAG GTCTAAATCT1800CCTGCCTAAA AGTCAGACCA CTCTAAATTT GACCTGGCAA CCAATATTTC CAAGCTCGGA1860AGATGACTTT TATGTTGAAG TGGAGAGAAG GTCTGTGCAA AAAAGTGATC AGCAGAATAT1920TAAAGTTCCA GGCAACTTGA CTTCGGTGCT ACTTAACAAC TTACATCCCA GGGAGCAGTA1980CGTGGTCCGA GCTAGAGTCA ACACCAAGGC CCAGGGGGAA TGGAGTGAAG ATCTCACTGC2040TTGGACCCTT AGTGACATTC TTCCTCCTCA ACCAGAAAAC ATCAAGATTT CCAACATTAC2100ACACTCCTCG GCTGTGATTT CTTGGACAAT ATTGGATGGC TATTCTATTT CTTCTATTAC2160TATCCGTTAC AAGGTTCAAG GCAAGAATGA AGACCAGCAC GTTGATGTGA AGATAAAGAA2220TGCCACCATC ATTCAGTATC AGCTCAAGGG CCTAGAGCCT GAAACAGCAT ACCAGGTGGA2280CATTTTTGCA GAGAACAACA TAGGGTCAAG CAACCCAGCC TTTTCTCATG AACTGGTGAC2340CCTCCCAGAA TCTCAAGCAC CAGCGGACCT CGGAGGGGGG AAGATGCTGC TTATAGCCAT2400CCTTGGCTCT GCTGGAATGA CCTGCCTGAC TGTGCTGTTG GCCTTTCTGA TCATATTGCA2460ATTGAAGAGG GCAAATGTGC AAAGGAGAAT GGCCCAAGCC TTCCAAAACG TGAGGGAAGA2520ACCAGCTGTG CAGTTCAACT CAGGGACTCT GGCCCTAAAC AGGAAGGTCA AAAACAACCC2580AGATCCTACA ATTTATCCAG TGCTTGACTG GAATGACATC AAATTTCAAG ATGTGATTGG2640GGAGGGCAAT TTTGGCCAAG TTCTTAAGGC GCGCATCAAG AAGGATGGGT TACGGATGGA2700TGCTGCCATC AAAAGAATGA AAGAATATGC CTCCAAAGAT 
GATCACAGGG ACTTTGCAGG2760AGAACTGGAA GTTCTTTGTA AACTTGGACA CCATCCAAAC ATCATCAATC TCTTAGGAGC2820ATGTGAACAT CGAGGCTACT TGTACCTGGC CATTGAGTAC GCGCCCCATG GAAACCTTCT2880GGACTTCCTT CGCAAGAGCC GTGTGCTGGA GACGGACCCA GCATTTGCCA TTGCCAATAG2940CACCGCGTCC ACACTGTCCT CCCAGCAGCT CCTTCACTTC GCTGCCGACG TGGCCCGGGG3000CATGGACTAC TTGAGCCAAA AACAGTTTAT CCACAGGGAT CTGGCTGCCA GAAACATTTT3060AGTTGGTGAA AACTATGTGG CAAAAATAGC AGATTTTGGA TTGTCCCGAG GTCAAGAGGT3120GTACGTGAAA AAGACAATGG GAAGGCTCCC AGTGCGCTGG ATGGCCATCG AGTCACTGAA3180TTACAGTGTG TACACAACCA ACAGTGATGT ATGGTCCTAT GGTGTGTTAC TATGGGAGAT3240TGTTAGCTTA GGAGGCACAC CCTACTGCGG GATGACTTGT GCAGAACTCT ACGAGAAGCT3300GCCCCAGGGC TACAGACTGG AGAAGCCCCT GAACTGTGAT GATGAGGTGT ATGATCTAAT3360GAGACAATGC TGGCGGGAGA AGCCTTATGA GAGGCCATCA TTTGCCCAGA TATTGGTGTC3420CTTAAACAGA ATGTTAGAGG AGCGAAAGAC CTACGTGAAT ACCACGCTTT ATGAGAAGTT3480TACTTATGCA GGAATTGACT GTTCTGCTGA AGAAGCGGCC TAGGACAGAA CATCTGTATA3540CCCTCTGTTT CCCTTTCACT GGCATGGGAG ACCCTTGACA ACTGCTGAGA AAACATGCCT3600CTGCCAAAGG ATGTGATATA TAAGTGTACA TATGTGCTGG AATTCTAACA AGTCATAGGT3660TAATATTTAA GACACTGAAA AATCTAAGTG ATATAAATCA GATTCTTCTC TCTCATTTTA3720TCCCTCACCT GTAGCATGCC AGTCCCGTTT CATTTAGTCA TGTGACCACT CTGTCTTGTG3780TTTCCACAGC CTGCAAGTTC AGTCCAGGAT GCTAACATCT AAAAATAGAC TTAAATCTCA3840TTGCTTACAA GCCTAAGAAT CTTTAGAGAA GTATACATAA GTTTAGGATA AAATAATGGG3900ATTTTCTTTT CTTTTCTCTG GTAATATTGA CTTGTATATT TTAAGAAATA ACAGAAAGCC3960TGGGTGACAT TTGGGAGACA TGTGACATTT ATATATTGAA TTAATATCCC TACATGTATT4020GCACATTGTA AAAAGTTTTA GTTTTGATGA GTTGTGAGTT TACCTTGTAT ACTGTAGGCA4080CACTTTGCAC TGATATATCA TGAGTGAATA AATGTCTTGC CTACTCAAAA AAAAAAAAPZA6 DNA sequenceGene name: prostate differentiation factor (PLAB; MIC-1)Unigene number: Hs.116577Probeset Accession #: AB000584Nucleic Acid Accession #: NM_004864Coding sequence: 26-952 (predicted start/stop codons underlined)CGGAACGAGG GCAACCTGCA CAGCCATGCC CGGGCAAGAA CTCAGGACGG TGAATGGCTC60TCAGATGCTC CTGGTGTTGC TGGTGCTCTC GTGGCTGCCG CATGGGGGCG CCCTGTCTCT120GGCCGAGGCG AGCCGCGCAA 
GTTTCCCGGG ACCCTCAGAG TTGCACTCCG AAGACTCCAG180ATTCCGAGAG TTGCGGAAAC GCTACGAGGA CCTGCTAACC AGGCTGCGGG CCAACCAGAG240CTGGGAAGAT TCGAACACCG ACCTCGTCCC GGCCCCTGCA GTCCGGATAC TCACGCCAGA300AGTGCGGCTG GGATCCGGCG GCCACCTGCA CCTGCGTATC TCTCGGGCCG CCCTTCCCGA360GGGGCTCCCC GAGGCCTCCC GCCTTCACCG GGCTCTGTTC CGGCTGTCCC CGACGGCGTC420AAGGTCGTGG GACGTGACAC GACCGCTGCG GCGTCAGCTC AGCCTTGCAA GACCCCAAGC480GCCCGCGCTG CACCTGCGAC TGTCGCCGCC GCCGTCGCAG TCGGACCAAC TGCTGGCAGA540ATCTTCGTCC GCACGGCCCC AGCTGGAGTT GCACTTGCGG CCGCAAGCCG CCAGGGGGCG600CCGCAGAGCG CGTGCGCGCA ACGGGGACGA CTGTCCGCTC GGGCCCGGGC GTTGCTGCCG660TCTGCACACG GTCCGCGCGT CGCTGGAAGA CCTGGGCTGG GCCGATTGGG TGCTGTCGCC720ACGGGAGGTG CAAGTGACCA TGTGCATCGG CGCGTGCCCG AGCCAGTTCC GGGCGGCAAA780CATGCACGCG CAGATCAAGA CGAGCCTGCA CCGCCTGAAG CCCGACACGG AGCCAGCGCC840CTGCTGCGTG CCCGCCAGCT ACAATCCCAT GGTGCTCATT CAAAAGACCG ACACCGGGGT900GTCGCTCCAG ACCTATGATG ACTTGTTAGC CAAAGACTGC CACTGCATATGAGCAGTCCT960GGTCCTTCCA CTGTGCACCT GCGCGGGGGA GGCGACCTCA GTTGTCCTGC CCTGTGGAAT1020GGGCTGAAGG TTCCTGAGAC ACCCGATTCC TGCCCAAACA GCTGTATTTA TATAAGTCTG1080TTATTTATTA TTAATTTATT GGGGTGACCT TCTTGGGGAC TCGGGGGCTG GTCTGATGGA1140ACTGTGTATT TATTTAAAAC TCTGGTGATA AAAATAAAGC TGTCTGAACT GTTAAAAAAA1200AAC8 DNA sequenceGene name: noneUnigene number: Hs.6682Probeset Accession #: AA227926Nucleic Acid Accession #: noneCoding sequence: no ORF identified, possible frameshiftsAAGCTGCAGT TAGCCAAGAT CGCATCATTG CACTCCAGCC TAGGGGACAA GAGCGCGAGA60CTTCATCTCA AAGATTTTTA AATAATAGCT AAAGGTATGC TCTCTAGGTC ATCCTTAGTT120TATTAGTACT GTACTTAAAA ATTATTTTTT TAATAGTCAA TTTTGGGAGA TAATTATTTC180TTTCCTTATA TTTTCCAATT AGTTGGTGTC TAAAAATAAA TGTTTTGTCT AATTTTAGAT240CAGGTATACA TTCACAAAAG CATAAATCAT AGTCTCACAG GAAATTCACC AATTTTCCAT300ATGTCGTGAG ATAACTGTCC TTTCTACAAC CTCATAACAA TGAATTTATA TAATTACCTA360GATTTTCTTA GTGTGAATCT ACCCATTAGT TTTATTTTCT TGGTAGTTAT TTTTTTCCCT420CCTCTCTGTT ACTATTGGGC TTAAAATACA CAGGAGGACG GTTACAGTGT CCTAATAGCT480GTTACATGTG TGTGTTTCAG CGTACTTGAA TCAAGTGTAC ATTTATAGTA CCAATAACCG540CCTTTACAGC 
TTTACAGTTA ACAATTCTCT CACAAAACTG TAGAGCATTA GGCATCTGAG600AGCCATAGAG GGCCAACTTT GTTCCAGAGT GAACATGCTT TTTTTCCTCA ACATATACAC660TACTGATTTT TTTTAAAAGT ATGACTTTCA AGTGAATTAA TGTATTGGTT AGGAGAACTG720CTTGCTAAGT CCTTATTACC TCTTGTTAAA GCCTCAGAAG GCCGTGCTGA AAGCCAGAGG780GGAAAAAAAG AGTAATGCAC AGGTATCTCT TTTGCAGTGG TGACTGTATT TTGAGTACCT840TGTGTGACAG GGTATTATTA CAGCATCTTG TGGGAAAACC TATTAGGCCT TTGCATGTTA900AAGCTGTATA ATTTGTTGGG TTGTGAGTGG TCTGACTTAA ATGTGTATTA TAAAATTTAG960ACATCAAATT TTCCTACTAA CTAACTTTAT TAGATGCATA CTTGGAAGCA CAGTCATATC1020ACACTGGGAG GCAATGCAAT GTGGTTACCT GGTCCTAGGT TTGAACTGTC TTATTTCAAA1080AGATTTCTGA ATTAATTTTT CCCTAGAATT TCTCCTTCAT TCCAAAGTAC AAACATACTT1140TGAAGAATGA AACAGATTGT TCCCATGAAT GTATGCTCAT ACTCGACTAG AAACGATCTA1200TGTTAAATGA CTGTGTATAT GAATTATTTC AAGTACTACC CCAAATAACT TTCTTATTGC1260TCTGAAAGAA GAAAAGCAAT GTAAATCACT ATGATTATTG CACAAACAAC CAGAATTCTC1320CAACAATTTT AAGTAATCTG ATCCTCTTCT TGGAGAAAAT TGTTACCTAA TAGTTTTTCC1380TTATGAATGT TATTACTACT GGTATAAATC AAATTTCTAT AAATTTCCTA CTTAAAGTCT1440TAARAACTGG GTTCTTCCTT TGATGTTATT CATGTTCAGA AAGGGAAACA ACACTTTACT1500TTTTTAGGGA CAATTTCTAG AATCTATAGT AGTATCAGGA TATATTTTGC TTTAAAATAT1560ATTTTGGTTA TTTTGAATAC AGACATTGGC TCCAAATTTT CATCTTTGCA CAATAGTATG1620ACTTTTCACT AGAACTTCTC AACATTTGGG AACTTTGCAA ATATGAGCAT CATATGTGTT1680AAGGCTGTAT CATTTAATGC TATGAGATAC ATTGTTTTCT CCCTATGCCA AACAGGTGAA1740CAAACGTAGT TGTTTTTTAC TGATACTAAA TGTTGGCTAC CTGTGATTTT ATAGTATGCA1800CATGTCAGAA AAAGGCAAGA CAAATGGCCT CTTGTACTGA ATACTTCGGC AAACTTATTG1860GGGTCTTCAT TTTCTGACAG ACAGGATTTG ACTCAATATT TGTAGAGCTT GCGTAGGAAT1920GGGATTACAT GGGTAGTGAT GCACTGGTAG GAAATGGTTT TTAGTTATTG ACTCAGGAAT1980TCATCTGAGG ATGAATCTTT TATGTCTTTT TATTGTAAGG CATATCTGGA ATTTACTTTA2040TAAAGGAGGG GTTTAGGAAA GCTTTGTCCT AAAAATTGGG CCCCGGGGAT GGGAACTTCA2100TTTTCAGTTG CCAAGGGGTA GAAAAATAAT ATGTGTGTTG TTATGTTTAT GTTAACATAT2160TATTAGGTAC TATCTATGAA TGTATTTAAA TATTTTTCAT ATTCTGTGAC AAGCATTTAT2220AATTTGCAAC AAGTGGAGTC CATTTAGCCC AGTGGGAAAG TCTTGGAACT CAGGTTACCC2280TTGAAGGATA 
TGCTGGCAGC CATCTCTTTG ATCTGTGCTT AAACTGTAAT TTATAGACCA2340GCTAAATCCC TAACTTGGAT CTGGAATGCA TTAGTTATGA CCTTGTACCA TTCCCAGAAT2400TTCAGGGGCA TCGTGGGTTT GGTCTAGTGA TTGAAAACAC AAGAACAGAG AGATCCAGCT2460GAAAAAGAGT GATCCTCAAT ATCCTAACTA ACTGGTCCTC AACTCAAGCA GAGTTTCTTC2520ACTCTGGCAC TGTGATCATG AAACTTAGTA GAGGGGATTG TGTGTATTTT ATACAAATTT2580AATACAATGT CTTACATTGA TAAAATTCTT AAAGAGCAAA ACTGCATTTT ATTTCTGCAT2640CCACATTCCA ATCATATTAG AACTAAGATA TTTATCTATG AAGATATAAA TGGTGCAGAG2700AGACTTTCAT CTGTGGATTG CGTTGTTTCT CTAGGGTTCC TCAGCCACTG ATGCCTCGCC2760ACAAGCCATG TGATATGTGA AATAAAAAGG GATTCTTCCT ATAGCCTAAA TGAAGTTCCC2820TCTGGGGAGA GTTCTGGTAC TGCAATCACA ATGCCAGATG GTGTTTATGG GCTATTTGTG2880TAAGTAAGTG GTAAGATGCT ATGAAGTAAG TGTGTTTGTT TTCATCTTAT GGAAACTCTT2940GATGCATGTG CTTTTGTATG GAATAAATTT TGGTGCAATA TGATGTCATT CAACTTTGCA3000TTGAATTGAA TTTTGGTTGT ATTTATATGT ATTATACCTG TCACGCTTCT AGTTGCTTCA3060ACCATTTTAT AACCATTTTT GTACATATTT TACTTGAAAA TATTTTAAAT GGAAATTTAA3120ATAAACATTT GATAGTTTAC ATAAAAAAAA AAAAAAAAAA AAAD2 DNA sequenceGene name: Thrombospondin-1Unigene number: Hs.87409Probeset Accession #: AA232645Nucleic Acid Accession #: NM_003246Coding sequence: 112-3624 (predicted start/stop codons underlined)GGACGCACAG GCATTCCCCG CGCCCCTCCA GCCCTCGCCG CCCTCGCCAC CGCTCCCGGC60CGCCGCGCTC CGGTACACAC AGGATCCCTG CTGGGCACCA ACAGCTCCAC CATGGGGCTG120GCCTGGGGAC TAGGCGTCCT GTTCCTGATG CATGTGTGTG GCACCAACCG CATTCCAGAG180TCTGGCGGAG ACAACAGCGT GTTTGACATC TTTGAACTCA CCGGGGCCGC CCGCAAGGGG240TCTGGGCGCC GACTGGTGAA GGGCCCCGAC CCTTCCAGCC CAGCTTTCCG CATCGAGGAT300GCCAACCTGA TCCCCCCTGT GCCTGATGAC AAGTTCCAAG ACCTGGTGGA TGCTGTGCGG360GCAGAAAAGG GTTTCCTCCT TCTGGCATCC CTGAGGCAGA TGAAGAAGAC CCGGGGCACG420CTGCTGGCCC TGGAGCGGAA AGACCACTCT GGCCAGGTCT TCAGCGTGGT GTCCAATGGC480AAGGCGGGCA CCCTGGACCT CAGCCTGACC GTCCAAGGAA AGCAGCACGT GGTGTCTGTG540GAAGAAGCTC TCCTGGCAAC CGGCCAGTGG AAGAGCATCA CCCTGTTTGT GCAGGAAGAC600AGGGCCCAGC TGTACATCGA CTGTGAAAAG ATGGAGAATG CTGAGTTGGA CGTCCCCATC660CAAAGCGTCT TCACCAGAGA CCTGGCCAGC ATCGCCAGAC 
TCCGCATCGC AAAGGGGGGC720GTCAATGACA ATTTCCAGGG GGTGCTGCAG AATGTGAGGT TTGTCTTTGG AACCACACCA780GAAGACATCC TCAGGAACAA AGGCTGCTCC AGCTCTACCA GTGTCCTCCT CACCCTTGAC840AACAACGTGG TGAATGGTTC CAGCCCTGCC ATCCGCACTA ACTACATTGG CCACAAGACA900AAGGACTTGC AAGCCATCTG CGGCATCTCC TGTGATGAGC TGTCCAGCAT GGTCCTGGAA960CTCAGGGGCC TGCGCACCAT TGTGACCACG CTGCAGGACA GCATCCGCAA AGTGACTGAA1020GAGAACAAAG AGTTGGCCAA TGAGCTGAGG CGGCCTCCCC TATGCTATCA CAACGGAGTT1080CAGTACAGAA ATAACGAGGA ATGGACTGTT GATAGCTGCA CTGAGTGTCA CTGTCAGAAC1140TCAGTTACCA TCTGCAAAAA GGTGTCCTGC CCCATCATGC CCTGCTCCAA TGCCACAGTT1200CCTGATGGAG AATGCTGTCC TCGCTGTTGG CCCAGCGACT CTGCGGACGA TGGCTGGTCT1260CCATGGTCCG AGTGGACCTC CTGTTCTACG AGCTGTGGCA ATGGAATTCA GCAGCGCGGC1320CGCTCCTGCG ATAGCCTCAA CAACCGATGT GAGGGCTCCT CGGTCCAGAC ACGGACCTGC1380CACATTCAGG AGTGTGACAA AAGATTTAAA CAGGATGGTG GCTGGAGCCA CTGGTCCCCG1440TGGTCATCTT GTTCTGTGAC ATGTGGTGAT GGTGTGATCA CAAGGATCCG GCTCTGCAAC1500TCTCCCAGCC CCCAGATGAA TGGGAAACCC TGTGAAGGCG AAGCGCGGGA GACCAAAGCC1560TGCAAGAAAG ACGCCTGCCC CATCAATGGA GGCTGGGGTC CTTGGTCACC ATGGGACATC1620TGTTCTGTCA CCTGTGGAGG AGGGGTACAG AAACGTAGTC GTCTCTGCAA CAACCCCGCA1680CCCCAGTTTG GAGGCAAGGA CTGCGTTGGT GATGTAACAG AAAACCAGAT CTGCAACAAG1740CAGGACTGTC CAATTGATGG ATGCCTGTCC AATCCCTGCT TTGCCGGCGT GAAGTGTACT1800AGCTACCCTG ATGGCAGCTG GAAATGTGGT GCTTGTCCCC CTGGTTACAG TGGAAATGGC1860ATCCAGTGCA CAGATGTTGA TGAGTGCAAA GAAGTGCCTG ATGCCTGCTT CAACCACAAT1920GGAGAGCACC GGTGTGAGAA CACGGACCCC GGCTACAACT GCCTGCCCTG CCCCCCACGC1980TTCACCGGCT CACAGCCCTT CGGCCAGGGT GTCGAACATG CCACGGCCAA CAAACAGGTG2040TGCAAGCCCC GTAACCCCTG CACGGATGGG ACCCACGACT GCAACAAGAA CGCCAAGTGC2100AACTACCTGG GCCACTATAG CGACCCCATG TACCGCTGCG AGTGCAAGCC TGGCTACGCT2160GGCAATGGCA TCATCTGCGG GGAGGACACA GACCTGGATG GCTGGCCCAA TGAGAACCTG2220GTGTGCGTGG CCAATGCGAC TTACCACTGC AAAAAGGATA ATTGCCCCAA CCTTCCCAAC2280TCAGGGCAGG AAGACTATGA CAAGGATGGA ATTGGTGATG CCTGTGATGA TGACGATGAC2340AATGATAAAA TTCCAGATGA CAGGGACAAC TGTCCATTCC ATTACAACCC AGCTCAGTAT2400GACTATGACA GAGATGATGT GGGAGACCGC TGTGACAACT 
GTCCCTACAA CCACAACCCA2460GATCAGGCAG ACACAGACAA CAATGGGCAA GGAGACGCCT GTGCTGCAGA CATTGATGGA2520GACGGTATCC TCAATGAACG GGACAACTGC CAGTACGTCT ACAATGTGGA CCAGAGAGAC2580ACTGATATGG ATGGGGTTGG AGATCAGTGT GACAATTGCC CCTTGGAACA CAATCCGGAT2640CAGCTGGACT CTGACTCAGA CCGCATTGGA GATACCTGTG ACAACAATCA GGATATTGAT2700GAAGATGGCC ACCAGAACAA TCTGGACAAC TGTCCCTATG TGCCCAATGC CAACCAGGCT2760GACCATGACA AAGATGGCAA GGGAGATGCC TGTGACCACG ATGATGACAA CGATGGCATT2820CCTGATGACA AGGACAACTG CAGACTCGTG CCCAATCCCG ACCAGAAGGA CTCTGACGGC2880GATGGTCGAG GTGATGCCTG CAAAGATGAT TTTGACCATG ACAGTGTGCC AGACATCGAT2940GACATCTGTC CTGAGAATGT TGACATCAGT GAGACCGATT TCCGCCGATT CCAGATGATT3000CCTCTGGACC CCAAAGGGAC ATCCCAAAAT GACCCTAACT GGGTTGTACG CCATCAGGGT3060AAAGAACTCG TCCAGACTGT CAACTGTGAT CCTGGACTCG CTGTAGGTTA TGATGAGTTT3120AATGCTGTGG ACTTCAGTGG CACCTTCTTC ATCAACACCG AAAGGGACGA TGACTATGCT3180GGATTTGTCT TTGGCTACCA GTCCAGCAGC CGCTTTTATG TTGTGATGTG GAAGCAAGTC3240ACCCAGTCCT ACTGGGACAC CAACCCCACG AGGGCTCAGG GATACTCGGG CCTTTCTGTG3300AAAGTTGTAA ACTCCACCAC AGGGCCTGGC GAGCACCTGC GGAACGCCCT GTGGCACACA3360GGAAACACCC CTGGCCAGGT GCGCACCCTG TGGCATGACC CTCGTCACAT AGGCTGGAAA3420GATTTCACCG CCTACAGATG GCGTCTCAGC CACAGGCCAA AGACGGGTTT CATTAGAGTG3480GTGATGTATG AAGGGAAGAA AATCATGGCT GACTCAGGAC CCATCTATGA TAAAACCTAT3540GCTGGTGGTA GACTAGGGTT GTTTGTCTTC TCTCAAGAAA TGGTGTTCTT CTCTGACCTG3600AAATACGAAT GTAGAGATCC CTAATCATCA AATTGTTGAT TGAAAGACTG ATCATAAACC3660AATGCTGGTA TTGCACCTTC TGGAACTATG GGCTTGAGAA AACCCCCAGG ATCACTTCTC3720CTTGGCTTCC TTCTTTTCTG TGCTTGCATC AGTGTGGACT CCTAGAACGT GCGACCTGCC3780TCAAGAAAAT GCAGTTTTCA AAAACAGACT CATCAGCATT CAGCCTCCAA TGAATAAGAC3840ATCTTCCAAG CATATAAACA ATTGCTTTGG TTTCCTTTTG AAAAAGCATC TACTTGCTTC3900AGTTGGGAAG GTGCCCATTC CACTCTGCCT TTGTCACAGA GCAGGGTGCT ATTGTGAGGC3960CATCTCTGAG CAGTGGACTC AAAAGCATTT TCAGGCATGT CAGAGAAGGG AGGACTCACT4020AGAATTAGCA AACAAAACCA CCCTGACATC CTCCTTCAGG AACACGGGGA GCAGAGGCCA4080AAGCACTAAG GGGAGGGCGC ATACCCGAGA CGATTGTATG AAGAAAATAT GGAGGAACTG4140TTACATGTTC GGTACTAAGT CATTTTCAGG 
GGATTGAAAG ACTATTGCTG GATTTCATGA4200TGCTGACTGG CGTTAGCTGA TTAACCCATG TAAATAGGCA CTTAAATAGA AGCAGGAAAG4260GGAGACAAAG ACTGGCTTCT GGACTTCCTC CCTGATCCCC ACCCTTACTC ATCACCTTGC4320AGTGGCCAGA ATTAGGGAAT CAGAATCAAA CCAGTGTAAG GCAGTGCTGG CTGCCATTGC4380CTGGTCACAT TGAAATTGGT GGCTTCATTC TAGATGTAGC TTGTGCAGAT GTAGCAGGAA4440AATAGGAAAA CCTACCATCT CAGTGAGCAC CAGCTGCCTC CCAAAGGAGG GGCAGCCGTG4500CTTATATTTT TATGGTTACA ATGGCACAAA ATTATTATCA ACCTAACTAA AACATTCCTT4560TTCTCTTTTT TCCGTAATTA CTAGGTAGTT TTCTAATTCT CTCTTTTGGA AGTATGATTT4620TTTTAAAGTC TTTACGATGT AAAATATTTA TTTTTTACTT ATTCTGGAAG ATCTGGCTGA4680AGGATTATTC ATGGAACAGG AAGAAGCGTA AAGACTATCC ATGTCATCTT TGTTGAGAGT4740CTTCGTGACT GTAAGATTGT AAATACAGAT TATTTATTAA CTCTGTTCTG CCTGGAAATT4800TAGGCTTCAT ACGGAAAGTG TTTGAGAGCA AGTAGTTGAC ATTTATCAGC AAATCTCTTG4860CAAGAACAGC ACAAGGAAAA TCAGTCTAAT AAGCTGCTCT GCCCCTTGTG CTCAGAGTGG4920ATGTTATGGG ATTCCTTTTT TCTCTGTTTT ATCTTTTCAA GTGGAATTAG TTGGTTATCC4980ATTTGCAAAT GTTTTAAATT GCAAAGAAAG CCATGAGGTC TTCAATACTG TTTTACCCCA5040TCCCTTGTGC ATATTTCCAG GGAGAAGGAA AGGATATACA CTTTTTTCTT TCATTTTTCC5100AAAAGAGAAA AAAATGACAA AAGGTGAAAC TTACATACAA ATATTACCTC ATTTGTTGTG5160TGACTGAGTA AAGAATTTTT GGATCAAGCG GAAAGAGTTT AAGTGTCTAA CAAACTTAAA5220GCTACTGTAG TACCTAAAAA GTCAGTGTTG TACATAGCAT AAAAACTCTG CAGAGAAGTA5280TTCCCAATAA GGAAATAGCA TTGAAATGTT AAATACAATT TCTGAAAGTT ATGTTTTTTT5340TCTATCATCT GGTATACCAT TGCTTTATTT TTATAAATTA TTTTCTCATT GCCATTGGAA5400TAGAATATTC AGATTGTGTA GATATGCTAT TTAAATAATT TATCAGGAAA TACTGCCTGT5460AGAGTTAGTA TTTCTATTTT TATATAATGT TTGCACACTG AATTGAAGAA TTGTTGGTTT5520TTTCTTTTTT TTGTTTTTTT TTTTTTTTTT TTTTTTTTTG CTTTTGACCT CCCATTTTTA5580CTATTTGCCA ATACCTTTTT CTAGGAATGT GCTTTTTTTT GTACACATTT TTATCCATTT5640TACATTCTAA AGCAGTGTAA GTTGTATATT ACTGTTTCTT ATGTACAAGG AACAACAATA5700AATCATATGG AAATTTATAT TTAAD9 DNA sequenceGene name: LIM homeobox protein cofactor (CLIM-1)Unigene number: Hs.4980Probeset Accession #: F13782Nucleic Acid Accession #: AF047337Coding sequence: 110-1231 (predicted start/stop codons 
underlined)GTGAGCGTGT GTGCGTGCGT CTACTTTGTA CTGGGAAGAA CACAGCCCATGTGCTCTGCA60TGGACGTTAC TGATACTCTG TTTAGCTTGA TTTTCGAAAA GCAGGCAAGA TGTCCAGCAC120ACCACATGAC CCCTTCTATT CTTCTCCTTT CGGCCCATTT TATAGGAGGC ATACACCATA180CATGGTACAG CCAGAGTACC GAATCTATGA GATGAACAAG AGACTGCAGT CTCGCACAGA240GGATAGTGAC AACCTCTGGT GGGACGCCTT TGCCACTGAA TTTTTTGAAG ATGACGCCAC300ATTAACCCTT TCATTTTGTT TGGAAGATGG ACCAAAGCGA TACACTATCG GCAGGACCCT360CATCCCCCGT TACTTTAGCA CTGTGTTTGA AGGAGGGGTG ACCGACCTGT ATTACATTCT420CAAACACTCG AAAGAGTCAT ACCACAACTC ATCCATCACG GTGGACTGCG ACCAGTGTAC480CATGGTCACC CAGCACGGGA AGCCCATGTT TACCAAGGTA TGTACAGAAG GCAGACTGAT540CTTGGAGTTC ACCTTTGATG ATCTCATGAG AATCAAAACA TGGCACTTTA CCATTAGACA600ATACCGAGAG TTAGTCCCGA GAAGCATCCT AGCCATGCAT GCACAAGATC CTCAGGTCCT660GGATCAGCTG TCCAAAAACA TCACCAGGAT GGGGCTAACA AACTTCACCC TCAACTACCT720CAGGTTGTGT GTAATATTGG AGCCAATGCA GGAACTGATG TCGAGACATA AAACTTACAA780CCTCAGTCCC CGAGACTGCC TGAAGACCTG CTTGTTTCAG AAGTGGCAGA GGATGGTGGC840TCCGCCAGCA GAACCCACAA GGCAACCAAC AACCAAACGG AGAAAAAGGA AAAATTCCAC900CAGCAGCACT TCCAACAGCA GCGCTGGGAA CAATGCAAAC AGCACTGGCA GCAAGAAGAA960GACCACAGCT GCAAACCTGA GTCTGTCCAG TCAGGTACCT GATGTGATGG TGGTAGGAGA1020GCCAACTCTG ATGGGAGGTG AGTTTGGGGA CGAGGACGAA AGGCTAATCA CTAGATTAGA1080AAACACGCAA TATGATGCGG CCAACGGCAT GGACGACGAG GAGGACTTCA ACAATTCACC1140CGCGCTGGGG AACAACAGCC CGTGGAACAG TAAACCTCCC GCCACTCAAG AGACCAAATC1200AGAAAACCCC CCACCCCAGG CTTCCCAATAAGATGATCGG CACCAGAATC CACTGTCAAT1260AGGCCCGTGG GTGATCATTA CAATTGCAAA TCTTTACTTA CAGGAGAGGA AACAGAAGAG1320ATAAAAACTT TTCCATGCAA ATATCTATTT CTAAACCACA ATGATCTGAT TTTCTTTCTT1380CTTTCTTTTT TTCTAATTGA GAGGATTATT CCCAGTAAGC TTCCATGACC CTTTCTTGGA1440GGCCTTCACA GGTAATACAG ATACTGGCAC TGATTGTAAT TAAAATGAGA GAAAACTCTA1500GCGCATCTTC TGGCACGGTT TTAACAACGT GTTTGTGTTG AATTTCCTTT TTATGCATCA1560AACGAAGGCC ATATTGTCCA TAAATGCTCA GTGCTCAGGA TCTCATTAAT ATGCCGAACC1620TAACTACAGA TGACTTTTTA ATATTGTAAA ATATTTTCTG CTTTTTGACT TGCATCTGAG1680AGTTTCTTGT TTCAGTAAAA AAAGAAAAGA CAAAAAAATC AGCTTTGGAA 
AGTAATTTAA1740ATGTACCTTA TTTTTTTTTT CTTTATGTTT TCTTTCATTG GGCAACAGCT AAGAGGGCCC1800AGCAAGGTAA TTTATGGTTG AGCTGATGTC AATTGGTTCT TGTCTTGAGT CGACTCAATT1860TAGCCCAAGT GCTGAAACAA GAAATGTCAT TTTTTTCATC AAAGACACCA GGGCAGATTT1920TTAAGTAAAG AAAGACAATT GGACCCTTAA GAATTTATGC ATTTGTAAAG TTGCTGTTGA1980TCCAAATATT TTCAAGCCAT GTAATCCATT GGTTTTGTGG GCAGTTTAAT AAACCTGAAC2040CTTTGTGTGT TTTCTAATTG TACCTGAGTT GACCATCCTT TCTTTTTATA GTATATTTCT2100TGTATGATAT TTTGTAAAGC TCTCACCTGG TTCTTTTATG GGGACTTTTC GTTTTTGGGC2160AACTCCAGTG TATTTATGTG AAACTTTATA AGAGAATTAA TTTTTCCATT TGCATATTAA2220TATGTTCCTC CACACATGTA AAGGCACAGT GGCTCCGTGT GTTAAAAAAC AGCTGTATTT2280TATGTATGCT TTACTGATAA GTGTGCCAAT AATAAACTGT GTTAATGACCAAE1 DNA sequenceGene name: guanine nucleotide binding protein 11Unigene number: Hs.83381Probeset Accession #: U31384Nucleic Acid Accession #: NM_004126.1Coding sequence: 108-329 (predicted start/stop codons underlined)GGCACGAGCT CGTGCCGGCC TTCAGTTGTT TCGGGACGCG CCGAGCTTCG CCGCTCTTCC60AGCGGCTCCG CTGCCAGAGC TAGCCCGAGC CCGGTTCTGG GGCGAAAATG CCTGCCCTTC120ACATCGAAGA TTTGCCAGAG AAGGAAAAAC TGAAAATGGA AGTTGAGCAG CTTCGCAAAG180AAGTGAAGTT GCAGAGACAA CAAGTGTCTA AATGTTCTGA AGAAATAAAG AACTATATTG240AAGAACGTTC TGGAGAGGAT CCTCTAGTAA AGGGAATTCC AGAAGACAAG AACCCCTTTA300AAGAAAAAGG CAGCTGTGTT ATTTCATAAA TAACTTGGGA GAAACTGCAT CCTAAGTGGA360AGAACTAGTT TGTTTTAGTT TTCCCAGATA AAACCAACAT GCTTTTTAAG GAAGGAAGAA420TGAAATTAAA AGGAGACTTT CTTAAGCACC ATATAGATAG GGTTATGTAT AAAAGCATAT480GTGCTACTCA TCTTTGCTCA CTATGCAGTC TTTTTTAAGA GAGCAGAGAG TATCAGATGT540ACAATTATGG AAATAAGAAC ATTACTTGAG CATGACACTT CTTTCAGTAT ATTGCTTGAT600GCTTCAAATA AAGTTTTGTC TTAAE2 DNA sequenceGene name: Transcription factor 4 (immunoglobulin transcription factor 2) (ITF-2)(SL3-3 Enhancer factor 2) (SEF-2)Unigene number: Hs.289068Probeset Accession #: M74719Nucleic Acid Accession #: NM_003199.1coding sequence: 200-2203 (predicted start/stop codons underlined)CGGGGGGATC TTGGCTGTGT GTCTGCGGAT CTGTAGTGGC GGCGGCGGCG GCGGCGGCGG60GGAGGCAGCA GGCGCGGGAG CGGGCGCAGG 
AGCAGGCGGC GGCGGTGGCG GCGGCGGTTA120GACATGAACG CCGCCTCGGC GCCGGCGGTG CACGGAGAGC CCCTTCTCGC GCGCGGGCGG180TTTGTGTGAT TTTGCTAAAATGCATCACCA ACAGCGAATG GCTGCCTTAG GGACGGACAA240AGAGCTGAGT GATTTACTGG ATTTCAGTGC GATGTTTTCA CCTCCTGTGA GCAGTGGGAA300AAATGGACCA ACTTCTTTGG CAAGTGGACA TTTTACTGGC TCAAATGTAG AAGACAGAAG360TAGCTCAGGG TCCTGGGGGA ATGGAGGACA TCCAAGCCCG TCCAGGAACT ATGGAGATGG420GACTCCCTAT GACCACATGA CCAGCAGGGA CCTTGGGTCA CATGACAATC TCTCTCCACC480TTTTGTCAAT TCCAGAATAC AAAGTAAAAC AGAAAGGGGC TCATACTCAT CTTATGGGAG540AGAATCAAAC TTACAGGGTT GCCACCAGCA GAGTCTCCTT GGAGGTGACA TGGATATGGG600CAACCCAGGA ACCCTTTCGC CCACCAAACC TGGTTCCCAG TACTATCAGT ATTCTAGCAA660TAATCCCCGA AGGAGGCCTC TTCACAGTAG TGCCATGGAG GTACAGACAA AGAAAGTTCG720AAAAGTTCCT CCAGGTTTGC CATCTTCAGT CTATGCTCCA TCAGCAAGCA CTGCCGACTA780CAATAGGGAC TCGCCAGGCT ATCCTTCCTC CAAACCAGCA ACCAGCACTT TCCCTAGCTC840CTTCTTCATG CAAGATGGCC ATCACAGCAG TGACCCTTGG AGCTCCTCCA GTGGGATGAA900TCAGCCTGGC TATGCAGGAA TGTTGGGCAA CTCTTCTCAT ATTCCACAGT CCAGCAGCTA960CTGTAGCCTG CATCCACATG AACGTTTGAG CTATCCATCA CACTCCTCAG CAGACATCAA1020TTCCAGTCTT CCTCCGATGT CCACTTTCCA TCGTAGTGGT ACAAACCATT ACAGCACCTC1080TTCCTGTACG CCTCCTGCCA ACGGGACAGA CAGTATAATG GCAAATAGAG GAAGCGGGGC1140AGCCGGCAGC TCCCAGACTG GAGATGCTCT GGGGAAAGCA CTTGCTTCGA TCTATTCTCC1200AGATCACACT AACAACAGCT TTTCATCAAA CCCTTCAACT CCTGTTGGCT CTCCTCCATC1260TCTCTCAGCA GGCACAGCTG TTTGGTCTAG AAATGGAGGA CAGGCCTCAT CGTCTCCTAA1320TTATGAAGGA CCCTTACACT CTTTGCAAAG CCGAATTGAA GATCGTTTAG AAAGACTGGA1380TGATGCTATT CATGTTCTCC GGAACCATGC AGTGGGCCCA TCCACAGCTA TGCCTGGTGG1440TCATGGGGAC ATGCATGGAA TCATTGGACC TTCTCATAAT GGAGCCATGG GTGGTCTGGG1500CTCAGGGTAT GGAACCGGCC TTCTTTCAGC CAACAGACAT TCACTCATGG TGGGGACCCA1560TCGTGAAGAT GGCGTGGCCC TGAGAGGCAG CCATTCTCTT CTGCCAAACC AGGTTCCGGT1620TCCACAGCTT CCTGTCCAGT CTGCGACTTC CCCTGACCTG AACCCACCCC AGGACCCTTA1680CAGAGGCATG CCACCAGGAC TACAGGGGCA GAGTGTCTCC TCTGGCAGCT CTGAGATCAA1740ATCCGATGAC GAGGGTGATG AGAACCTGCA AGACACGAAA TCTTCGGAGG ACAAGAAATT1800AGATGACGAC AAGAAGGATA TCAAATCAAT TACTAGCAAT 
AATGACGATG AGGACCTGAC1860ACCAGAGCAG AAGGCAGAGC GTGAGAAGGA GCGGAGGATG GCCAACAATG CCCGAGAGCG1920TCTGCGGGTC CGTGACATCA ACGAGGCTTT CAAAGAGCTC GGCCGCATGG TGCAGCTCCA1980CCTCAAGAGT GACAAGCCCC AGACCAAGCT CCTGATCCTC CACCAGGCGG TGGCCGTCAT2040CCTCAGTCTG GAGCAGCAAG TCCGAGAAAG GAATCTGAAT CCGAAAGCTG CGTGTCTGAA2100AAGAAGGGAG GAAGAGAAGG TGTCCTCGGA GCCTCCCCCT CTCTCCTTGG CCGGCCCACA2160CCCTGGAATG GGAGACGCAT CGAATCACAT GGGACAGATG TAAAAGGGTC CAAGTTGCCA2220CATTGCTTCA TTAAAACAAG AGACCACTTC CTTAACAGCT GTATTATCTT AAACCCACAT2280AAACACTTCT CCTTAACCCC CATTTTTGTA ATATAAGACA AGTCTGAGTA GTTATGAATC2340GCAGACGCAA GAGGTTTCAG CATTCCCAAT TATCAAAAAA CAGAAAAACA AAAAAAAGAA2400AGAAAAAAGT GCAACTTGAG GGACGACTTT CTTTAACATA TCATTCAGAA TGTGCAAAGC2460AGTATGTACA GGCTGAGACA CAGCCCAGAG ACTGAACGGCAAE4 DNA sequenceGene name: phosphatidylcholine 2-acylhydrolaseUnigene number: Hs.211587Probeset Accession #: M68874Nucleic Acid Accession #: M68874Coding sequence: 139-2388 (predicted start/stop codons underlined)GAATTCTCCG GAGCTGAAAA AGGATCCTGA CTGAAAGCTA GAGGCATTGA GGAGCCTGAA60GATTCTCAGG TTTTAAAGAC GCTAGAGTGC CAAAGAAGAC TTTGAAGTGT GAAAACATTT120CCTGTAATTG AAACCAAAATGTCATTTATA GATCCTTACC AGCACATTAT AGTGGAGCAC180CAGTATTCCC ACAAGTTTAC GGTAGTGGTG TTACGTGCCA CCAAAGTGAC AAAGGGGGCC240TTTGGTGACA TGCTTGATAC TCCAGATCCC TATGTGGAAC TTTTTATCTC TACAACCCCT300GACAGCAGGA AGAGAACAAG ACATTTCAAT AATGACATAA ACCCTGTGTG GAATGAGACC360TTTGAATTTA TTTTGGATCC TAATCAGGAA AATGTTTTGG AGATTACGTT AATGGATGCC420AATTATGTCA TGGATGAAAC TCTAGGGACA GCAACATTTA CTGTATCTTC TATGAAGGTG480GGAGAAAAGA AAGAAGTTCC TTTTATTTTC AACCAAGTCA CTGAAATGGT TCTAGAAATG540TCTCTTGAAG TTTGCTCATG CCCAGACCTA CGATTTAGTA TGGCTCTGTG TGATCAGGAG600AAGACTTTCA GACAACAGAG AAAAGAACAC ATAAGGGAGA GCATGAAGAA ACTCTTGGGT660CCAAAGAATA GTGAAGGATT GCATTCTGCA CGTGATGTGC CTGTGGTAGC CATATTGGGT720TCAGGTGGGG GTTTCCGAGC CATGGTGGGA TTCTCTGGTG TGATGAAGGC ATTATACGAA780TCAGGAATTC TGGATTGTGC TACCTACGTT GCTGGTCTTT CTGGCTCCAC CTGGTATATG840TCAACCTTGT ATTCTCACCC TGATTTTCCA GAGAAAGGGC CAGAGGAGAT 
TAATGAAGAA900CTAATGAAAA ATGTTAGCCA CAATCCCCTT TTACTTCTCA CACCACAGAA AGTTAAAAGA960TATGTTGAGT CTTTATGGAA GAAGAAAAGC TCTGGACAAC CTGTCACCTT TACTGACATC1020TTTGGGATGT TAATAGGAGA AACACTAATT CATAATAGAA TGAATACTAC TCTGAGCAGT1080TTGAAGGAAA AAGTTAATAC TGCACAATGC CCTTTACCTC TTTTCACCTG TCTTCATGTC1140AAACCTGACG TTTCAGAGCT GATGTTTGCA GATTGGGTTG AATTTAGTCC ATACGAAATT1200GGCATGGCTA AATACGGTAC TTTTATGGCT CCCGACTTAT TTGGAAGCAA ATTTTTTATG1260GGAACAGTCG TTAAGAAGTA TGAAGAAAAC CCCTTGCATT TCTTAATGGG TGTCTGGGGC1320AGTGCCTTTT CCATATTGTT CAACAGAGTT TTGGGCGTTT CTGGTTCACA AAGCAGAGGC1380TCCACAATGG AGGAAGAATT AGAAAATATT ACCACAAAGC ATATTGTGAG TAATGATAGC1440TCGGACAGTG ATGATGAATC ACACGAACCC AAAGGCACTG AAAATGAAGA TGCTGGAAGT1500GACTATCAAA GTGATAATCA AGCAAGTTGG ATTCATCGTA TGATAATGGC CTTGGTGAGT1560GATTCAGCTT TATTCAATAC CAGAGAAGGA CGTGCTGGGA AGGTACACAA CTTCATGCTG1620GGCTTGAATC TCAATACATC TTATCCACTG TCTCCTTTGA GTGACTTTGC CACACAGGAC1680TCCTTTGATG ATGATGAACT GGATGCAGCT GTAGCAGATC CTGATGAATT TGAGCGAATA1740TATGAGCCTC TGGATGTCAA AAGTAAAAAG ATTCATGTAG TGGACAGTGG GCTCACATTT1800AACCTGCCGT ATCCCTTGAT ACTGAGACCT CAGAGAGGGG TTGATCTCAT AATCTCCTTT1860GACTTTTCTG CAAGGCCAAG TGACTCTAGT CCTCCGTTCA AGGAACTTCT ACTTGCAGAA1920AAGTGGGCTA AAATGAACAA GCTCCCCTTT CCAAAGATTG ATCCTTATGT GTTTGATCGG1980GAAGGGCTGA AGGAGTGCTA TGTCTTTAAA CCCAAGAATC CTGATATGGA GAAAGATTGC2040CCAACCATCA TCCACTTTGT TCTGGCCAAC ATCAACTTCA GAAAGTACAA GGCTCCAGGT2100GTTCCAAGGG AAACTGAGGA AGAGAAAGAA ATCGCTGACT TTGATATTTT TGATGACCCA2160GAATCACCAT TTTCAACCTT CAATTTTCAA TATCCAAATC AAGCATTCAA AAGACTACAT2220GATCTTATGC ACTTCAATAC TCTGAACAAC ATTGATGTGA TAAAAGAAGC CATGGTTGAA2280AGCATTGAAT ATAGAAGACA GAATCCATCT CGTTGCTCTG TTTCCCTTAG TAATGTTGAG2340GCAAGAAGAT TTTTCAACAA GGAGTTTCTA AGTAAACCCA AAGCATAGTT CATGTACTGG2400AAATGGCAGC AGTTTCTGAT GCTGAGGCAG TTTGCAATCC CATGACAACT GGATTTAAAA2460GTACAGTACA GATAGTCGTA CTGATCATGA GAGACTGGCT GATACTCAAA GTTGCAGTTA2520CTTAGCTGCA TGAGAATAAT ACTATTATAA GTTAGGTGAC AAATGATGTT GATTATGTAA2580GGATATACTT AGCTACATTT TCAGTCAGTA TGAACTTCCT GATACAAATG 
TAGGGATATA2640TACTGTATTT TTAAACATTT CTCACCAACT TTCTTATGTG TGTTCTTTTT AAAAATTTTT2700TTTCTTTTAA AATATTTAAC AGTTCAATCT CAATAAGACC TCGCATTATG TATGAATGTT2760ATTCACTGAC TAGATTTATT CATACCATGA GACAACACTA TTTTTATTTA TATATGCATA2820TATATACATA CATGAAATAA ATACATCAAT ATAAAAATAA AAAAAAACGG AATTCACA1 DNA sequenceGene name: tissue factor pathway inhibitor 2 TFPI2, placental protein 5 (PP5)Unigene number: Hs.78045Probeset Accession #: D29992Nucleic Acid Accession #: D29992.1Coding sequence: 57-764 (predicted start/stop codons underlined)GCCGCCAGCG GCTTTCTCGG ACGCCTTGCC CAGCGGGCCG CCCGACCCCC TGCACCATGG60ACCCCGCTCG CCCCCTGGGG CTGTCGATTC TGCTGCTTTT CCTGACGGAG GCTGCACTGG120GCGATGCTGC TCAGGAGCCA ACAGGAAATA ACGCGGAGAT CTGTCTCCTG CCCCTAGACT180ACGGACCCTG CCGGGCCCTA CTTCTCCGTT ACTACTACGA CAGGTACACG CAGAGCTGCC240GCCAGTTCCT GTACGGGGGC TGCGAGGGCA ACGCCAACAA TTTCTACACC TGGGAGGCTT300GCGACGATGC TTGCTGGAGG ATAGAAAAAG TTCCCAAAGT TTGCCGGCTG CAAGTGAGTG360TGGACGACCA GTGTGAGGGG TCCACAGAAA AGTATTTCTT TAATCTAAGT TCCATGACAT420GTGAAAAATT CTTTTCCGGT GGGTGTCACC GGAACCGGAT TGAGAACAGG TTTCCAGATG480AAGCTACTTG TATGGGCTTC TGCGCACCAA AGAAAATTCC ATCATTTTGC TACAGTCCAA540AAGATGAGGG ACTGTGCTCT GCCAATGTGA CTCGCTATTA TTTTAATCCA AGATACAGAA600CCTGTGATGC TTTCACCTAT ACTGGCTGTG GAGGGAATGA CAATAACTTT GTTAGCAGGG660AGGATTGCAA ACGTGCATGT GCAAAAGCTT TGAAAAAGAA AAAGAAGATG CCAAAGCTTC720GCTTTGCCAG TAGAATCCGG AAAATTCGGA AGAAGCAATT TTAAACATTC TTAATATGTC780ATCTTGTTTG TCTTTATGGC TTATTTGCCT TTATGGTTGT ATCTGAAGAA TAATATGACA840GCATGAGGAA ACAAATCATT GGTGATTTAT TCACCAGTTT TTATTAATAC AAGTCACTTT900TTCAAAAATT TGGATTTTTT TATATATAAC TAGCTGCTAT TCAAATGTGA GTCTACCATT960TTTAATTTAT GGTTCAACTG TTTGTGAGAC GAATTCTTGC AATGCATAAG ATATAAAAGC1020AAATATGACT CACTCATTTC TTGGGGTCGT ATTCCTGATT TCAGAAGAGG ATCATAACTG1080AAACAACATA AGACAATATA ATCATGTGCT TTTAACATAT TTGAGAATAA AAAGGACTAG1140CCACB8 DNA sequenceGene name: myosin XUnigene number: Hs.61638Probeset Accession #: N77151Nucleic Acid Accession #: NM_012334Coding sequence: 223-6399 (predicted start/stop codons 
underlined)GAGACAAAGG CTGCCGTCGG GACGGGCGAG TTAGGGACTT GGGTTTGGGC GAACAAAAGG60TGAGAAGGAC AAGAAGGGAC CGGGCGATGG CAGCAGGGGA GCCCCGCGGG CGCGCGTCCT120CGGGAGTGGC GCCGTGACAC GCATGGTTTC CCCCGACCCG CGGCGGCGCT GACTTCCGCG180AGTCGGAGCG GCACTCGGCG AGTCCGGGAC TGCGCTGGAA CAATGGATAA CTTCTTCACC240GAGGGAACAC GGGTCTGGCT GAGAGAAAAT GGCCAGCATT TTCCAAGTAC TGTAAATTCC300TGTGCAGAAG GCATCGTCGT CTTCCGGACA GACTATGGTC AGGTATTCAC TTACAAGCAG360AGCACAATTA CCCACCAGAA GGTGACTGCT ATGCACCCCA CGAACGAGGA GGGCGTGGAT420GACATGGCGT CCTTGACAGA GCTCCATGGC GGCTCCATCA TGTATAACTT ATTCCAGCGG480TATAAGAGAA ATCAAATATA TACCTACATC GGCTCCATCC TGGCCTCCGT GAACCCCTAC540CAGCCCATCG CCGGGCTGTA CGAGCCTGCC ACCATGGAGC AGTACAGCCG GCGCCACCTG600GGCGAGCTGC CCCCGCACAT CTTCGCCATC GCCAACGAGT GCTACCGCTG CCTGTGGAAG660CGCTACGACA ACCAGTGCAT CCTCATCAGT GGTGAAAGTG GGGCAGGTAA AACCGAAAGC720ACTAAATTGA TCCTCAAGTT TCTGTCAGTC ATCAGTCAAC AGTCTTTGGA ATTGTCCTTA780AAGGAGAAGA CATCCTGTGT TGAACGAGCT ATTCTTGAAA GCAGCCCCAT CATGGAAGCT840TTCGGCAATG CGAAGACCGT GTACAACAAC AACTCTAGTC GCTTTGGGAA GTTTGTTCAG900CTGAACATCT GTCAGAAAGG AAATATTCAG GGCGGGAGAA TTGTAGATTA TTTATTAGAA960AAAAACCGAG TAGTAAGGCA AAATCCCGGG GAAAGGAATT ATCACATATT TTATGCACTG1020CTGGCAGGGC TGGAACATGA AGAAAGAGAA GAATTTTATT TATCTACGCC AGAAAACTAC1080CACTACTTGA ATCAGTCTGG ATGTGTAGAA GACAAGACAA TCAGTGACCA GGAATCCTTT1140AGGGAAGTTA TTACGGCAAT GGACGTGATG CAGTTCAGCA AGGAGGAAGT TCGGGAAGTG1200TCGAGGCTGC TTGCTGGTAT ACTGCATCTT GGGAACATAG AATTTATCAC TGCTGGTGGG1260GCACAGGTTT CCTTCAAAAC AGCTTTGGGC AGATCTGCGG AGTTACTTGG GCTGGACCCA1320ACACAGCTCA CAGATGCTTT GACCCAGAGA TCAATGTTCC TCAGGGGAGA AGAGATCCTC1380ACGCCTCTCA ATGTTCAACA GGCAGTAGAC AGCAGGGACT CCCTGGCCAT GGCTCTGTAT1440GCGTGCTGCT TTGAGTGGGT AATCAAGAAG ATCAACAGCA GGATCAAAGG CAATGAGGAC1500TTCAAGTCTA TTGGCATCCT CGACATCTTT GGATTTGAAA ACTTTGAGGT TAATCACTTT1560GAACAGTTCA ATATAAACTA TGCAAACGAG AAACTTCAGG AGTACTTCAA CAAGCATATT1620TTTTCTTTAG AACAACTAGA ATATAGCCGG GAAGGATTAG TGTGGGAAGA TATTGACTGG1680ATAGACAATG GAGAATGCCT GGACTTGATT GAGAAGAAAC TTGGCCTCCT 
AGCCCTTATC1740AATGAAGAAA GCCATTTTCC TCAAGCCACA GACAGCACCT TATTGGAGAA GCTACACAGT1800CAGCATGCGA ATAACCACTT TTATGTGAAG CCCAGAGTTG CAGTTAACAA TTTTGGAGTG1860AAGCACTATG CTGGAGAGGT GCAATATGAT GTCCGAGGTA TCTTGGAGAA GAACAGAGAT1920ACATTTCGAG ATGACCTTCT CAATTTGCTA AGAGAAAGCC GATTTGACTT TATCTACGAT1980CTTTTTGAAC ATGTTTCAAG CCGCAACAAC CAGGATACCT TGAAATGTGG AAGCAAACAT2040CGGCGGCCTA CAGTCAGCTC ACAGTTCAAG GACTCACTGC ATTCCTTAAT GGCAACGCTA2100AGCTCCTCTA ATCCTTTCTT TGTTCGCTGT ATCAAGCCAA ACATGCAGAA GATGCCAGAC2160CAGTTTGACC AGGCGGTTGT GCTGAACCAG CTGCGGTACT CAGGGATGCT GGAGACTGTG2220AGAATCCGCA AAGCTGGGTA TGCGGTCCGA AGACCCTTTC AGGACTTTTA CAAAAGGTAT2280AAAGTGCTGA TGAGGAATCT GGCTCTGCCT GAGGACGTCC GAGGGAAGTG CACGAGCCTG2340CTGCAGCTCT ATGATGCCTC CAACAGCGAG TGGCAGCTGG GGAAGACCAA GGTCTTTCTT2400CGAGAATCCT TGGAACAGAA ACTGGAGAAG CGGAGGGAAG AGGAAGTGAG CCACGCGGCC2460ATGGTGATTC GGGCCCATGT CTTGGGCTTC TTAGCACGAA AACAATACAG AAAGGTCCTT2520TATTGTGTGG TGATAATACA GAAGAATTAC AGAGCATTCC TTCTGAGGAG GAGATTTTTG2580CACCTGAAAA AGGCAGCCAT AGTTTTCCAG AAGCAACTCA GAGGTCAGAT TGCTCGGAGA2640GTTTACAGAC AATTGCTGGC AGAGAAAAGG GAGCAAGAAG AAAAGAAGAA ACAGGAAGAG2700GAAGAAAAGA AGAAACGGGA GGAAGAAGAA AGAGAAAGAG AGAGAGAGCG AAGAGAAGCC2760GAGCTCCGCG CCCAGCAGGA AGAAGAAACG AGGAAGCAGC AAGAACTCGA AGCCTTGCAG2820AAGAGCCAGA AGGAAGCTGA ACTGACCCGT GAACTGGAGA AACAGAAGGA AAATAAGCAG2880GTGGAAGAGA TCCTCCGTCT GGAGAAAGAA ATCGAGGACC TGCAGCGCAT GAAGGAGCAG2940CAGGAGCTGT CGCTGACCGA GGCTTCCCTG CAGAAGCTGC AGGAGCGGCG GGACCAGGAG3000CTCCGCAGGC TGGAGGAGGA AGCGTGCAGG GCGGCCCAGG AGTTCCTCGA GTCCCTCAAT3060TTCGACGAGA TCGACGAGTG TGTCCGGAAT ATCGAGCGGT CCCTGTCGGT GGGAAGCGAA3120TTTTCCAGCG AGCTGGCTGA GAGCGCATGC GAGGAGAAGC CCAACTTCAA CTTCAGCCAG3180CCCTACCCAG AGGAGGAGGT CGATGAGGGC TTCGAAGCCG ACGACGACGC CTTCAAGGAC3240TCCCCCAACC CCAGCGAGCA CGGCCACTCA GACCAGCGAA CAAGTGGCAT CCGGACCAGC3300GATGACTCTT CAGAGGAGGA CCCATACATG AACGACACGG TGGTGCCCAC CAGCCCCAGT3360GCGGACAGCA CGGTGCTGCT CGCCCCATCA GTGCAGGACT CCGGGAGCCT ACACAACTCC3420TCCAGCGGCG AGTCCACCTA CTGCATGCCC CAGAACGCTG 
GGGACTTGCC CTCCCCAGAC3480GGCGACTACG ACTACGACCA GGATGACTAT GAGGACGGTG CCATCACTTC CGGCAGCAGC3540GTGACCTTCT CCAACTCCTA CGGCAGCCAG TGGTCCCCCG ACTACCGCTG CTCTGTGGGG3600ACCTACAACA GCTCGGGTGC CTACCGGTTC AGCTCTGAGG GGGCGCAGTC CTCGTTTGAA3660GATAGTGAAG AGGACTTTGA TTCCAGGTTT GATACAGATG ATGAGCTTTC ATACCGGCGT3720GACTCTGTGT ACAGCTGTGT CACTCTGCCG TATTTCCACA GCTTTCTGTA CATGAAAGGT3780GGCCTGATGA ACTCTTGGAA ACGCCGCTGG TGCGTCCTCA AGGATGAAAC CTTCTTGTGG3840TTCCGCTCCA AGCAGGAGGC CCTCAAGCAA GGCTGGCTCC ACAAAAAAGG GGGGGGCTCC3900TCCACGCTGT CCAGGAGAAA TTGGAAGAAG CGCTGGTTTG TCCTCCGCCA GTCCAAGCTG3960ATGTACTTTG AAAACGACAG CGAGGAGAAG CTCAAGGGCA CCGTAGAAGT GCGAACGGCA4020AAAGAGATCA TAGATAACAC CACCAAGGAG AATGGGATCG ACATCATTAT GGCCGATAGG4080ACTTTCCACC TGATTGCAGA GTCCCCAGAA GATGCCAGCC AGTGGTTCAG CGTGCTGAGT4140CAGGTCCACG CGTCCACGGA CCAGGAGATC CAGGAGATGC ATGATGAGCA GGGAAACCCA4200CAGAATGCTG TGGGCACCTT GGATGTGGGG CTGATTGATT CTGTGTGTGC CTCGACAGC4260CCTGATAGAC CCAACTCGTT TGTGATCATC ACGGCCAACC GGGTGCTGCA CTGCAACGCC4320GACACGCCGG AGGAGATGCA CCACTGGATA ACCCTGCTGC AGAGGTCCAA AGGGGACACC4380AGAGTGGAGG GCCAGGAATT CATCGTGAGA GGATGGTTGC ACAAAGAGGT GAAGAACAGT4440CCGAAGATGT CTTCACTGAA ACTGAAGAAA CGGTGGTTTG TACTCACCCA CAATTCCCTG4500GATTACTACA AGAGTTCAGA GAAGAACGCG CTCAAACTGG GGACCCTGGT CCTCAACAGC4560CTCTGCTCTG TCGTCCCCCC AGATGAGAAG ATATTCAAAG AGACAGGCTA CTGGAACGTC4620ACCGTGTACG GGCGCAAGCA CTGTTACCGG CTCTACACCA AGCTGCTCAA CGAGGCCACC4680CGGTGGTCCA GTGCCATTCA AAACGTGACT GACACCAAGG CCCCGATCGA CACCCCCACC4740CAGCAGCTGA TTCAAGATAT CAAGGAGAAC TGCCTGAACT CGGATGTGGT GGAACAGATT4800TACAAGCGGA ACCCGATCCT TCGATACACC CATGACCCCT TGCACTCCCC GCTCCTGCCC4860CTTCCGTATG GGGACATAAA TCTCAACTTG CTCAAAGACA AAGGCTATAC CACCCTTCAG4920GATGAGGCCA TCAAGATATT CAATTCCCTG CAGCAACTGG AGTCCATGTC TGACCCAATT4980CCAATAATCC AGGGCATCCT ACAGACAGGG CATGACCTGC GACCTCTGCG GGACGAGCTG5040TACTGCCAGC TTATCAAACA GACCAACAAA GTGCCCCACC CCGGCAGTGT GGGCAACCTG5100TACAGCTGGC AGATCCTGAC ATGCCTGAGC TGCACCTTCC TGCCGAGTCG AGGGATTCTC5160AAGTATCTCA AGTTCCATCT GAAAAGGATA CGGGAACAGT 
TTCCAGGAAC CGAGATGGAA5220AAATACGCTC TCTTCACTTA CGAATCTCTT AAGAAAACCA AATGCCGAGA GTTTGTGCCT5280TCCCGAGATG AAATAGAAGC TCTGATCCAC AGGCAGGAAA TGACATCCAC GGTCTATTGC5340CATGGCGGCG GCTCCTGCAA GATCACCATC AACTCCCACA CCACTGCTGG GGAGGTGGTG5400GAGAAGCTGA TCCGAGGCCT GGCCATGGAG GACAGCAGGA ACATGTTTGC TTTGTTTGAA5460TACAACGGCC ACGTCGACAA AGCCATTGAA AGTCGAACCG TCGTAGCTGA TGTCTTAGCC5520AAGTTTGAAA AGCTGGCTGC CACATCCGAG GTTGGGGACC TGCCATGGAA ATTCTACTTC5580AAACTTTACT GCTTCCTGGA CACAGACAAC GTGCCAAAAG ACAGTGTGGA GTTTGCATTT5640ATGTTTGAAC AGGCCCACGA AGCGGTTATC CATGGCCACC ATCCAGCCCC GGAAGAAAAC5700CTCCAGGTTC TTGCTGCCCT GCGACTCCAG TATCTGCAGG GGGATTATAC TCTGCACGCT5760GCCATCCCAC CTCTCGAAGA GGTTTATTCC CTGCAGAGAC TCAAGGCCCG CATCAGCCAG5820TCAACCAAAA CCTTCACCCC TTGTGAACGG CTGGAGAAGA GGCGGACGAG CTTCCTAGAG5880GGGACCCTGA GGCGGAGCTT CCGGACAGGA TCCGTGGTCC GGCAGAAGGT CGAGGAGGAG5940CAGATGCTGG ACATGTGGAT TAAGGAAGAA GTCTCCTCTG CTCGAGCCAG TATCATTGAC6000AAGTGGAGGA AATTTCAGGG AATGAACCAG GAACAGGCCA TGGCCAAGTA CATGGCCTTG6060ATCAAGGAGT GGCCTGGCTA TGGCTCGACG CTGTTTGATG TGGAGTGCAA GGAAGGTGGC6120TTCCCTCAGG AACTCTGGTT GGGTGTCAGC GCGGACGCCG TCTCCGTCTA CAAGCGTGGA6180GAGGGAAGAC AACTGGAAGT CTTCCAGTAT GAACACATCC TCTCTTTTGG GGCACCCCTG6240GCGAATACGT ATAAGATCGT GGTCGATGAG AGGGAGCTGC TCTTTGAAAC CAGTGAGGTG6300GTGGATGTGG CCAAGCTCAT GAAAGCCTAC ATCAGCATGA TCGTGAAGAA GCGCTACAGC6360ACGACACGCT CCGCCAGCAG CCAGGGCAGC TCCAGGTGAA GGCGGGACAG AGCCCACCTG6420TCTTTGCTAC CTGAACGCAC CACCCTCTGG CCTAGGCTGG CTCCAGTGTG CCATGCCCAG6480CCAAAACAAA CACAGAGCTG CCCAGGCTTT CTGGAAGCTT CTGGTCTGAG GGAGGTGTCT6540CCGAGGATCC TTTTGCCTGC CGCCTTCATT GATCCTGTAT TAAGCTGTCA ACTTTAACAG6600TCTGCACAGT TTCCAAAGCT TTACTACTCT TAGAGGACAC ATGCCTTAAA AAAGGAGGGG6660AGGAACCACG CTGCCACCAA AGCAGCCGGA AGTGCCTTAA CTTGTGGAAC CAACACTAAT6720CGACCGTAAC TGTGCTACTG AAGGGAACTG CCTTTCCCCC TTCTGGGGGA GACTTAACAG6780AGCGTGGAAG GGGGGCATTC TCTGTCAATG ATGCACTAAC CTCCCAACCT GATTTCCCCG6840AATCTGAGGG AAGGTGAGGG AGTGGGAAGG GGGATGGAGA GCTCGAGGGG ACAGTGTGTT6900TGAGCTGGAG TGCTGCGGGC AGCCTTTCTC 
ATGGAATGAC ATGAATCAAC TTTTTTCTTT6960GTTTCATCTT TTAAGTGTAC GTGCTTGCCT GTTCGTGCAT GTGTTCATAA ACTCAACACT7020TTAATCATGG TTTCATGAGC ATTAAAAAGC AAAGGGAAAA AGGATGTGTA ATGGTGTACA7080CAGTCTGTAT ATTTTAATAA TGCAGAGCTA TAGTCTCAAT TGTTACTTTA TAAGGTGGTT7140TTATTAACAA ACCCAAATCC TGGATTTTCC TGTCTTTGCT GTATTTTGAA AAACACGTGT7200TGACTCCATT GTTTTACATG TAGCAAAGTC TGCCATCTGT GTCTGCTGTA TTATAAACAG7260ATAAGCAGCC TACAAGATAA CTGTATTTAT AAACCACTCT TCAACAGCTG GCTCCAGTGC7320TGGTTTTAGA ACAAGAATGA AGTCATTTTG GAGTCTTTCA TGTCTAAAAG ATTTAAGTTA7380AAAACAAAGT GTTACTTGGA AGGTTAGCTT CTATCATTCT GGATAGATTA CAGATATAAT7440AACCATGTTG ACTATGGGGG AGAGACGCTG CATTCCAGAA ACGTCTTAAC ACTTGAGTGA7500ATCTTCAAAG GACCCTGACA TTAAATGCTG AGGCTTTAAT ACACACATAT TTTATCCCAA7560GTTTATAATG GTGGTCTGAA CAAGGCACCT GTAAATAAAT CAGCATTTAT GACCAGAAGA7620AAAATAATCT GGTCTTGGAC TTTTTATTTT TATATGGAAA AGTTTTAAGG ACTTGGGCCA7680ACTAAGTCTA CCCACACGAA AAAAGAAATT TGCCTTGTCC CTTTGTGTAC AACCATGCAA7740AACTGTTTGT TGGCTCACAG AAGTTCTGAC AATAAAAGAT ACTAGCTACC3 DNA sequenceGene name: calcitonin receptor-like (CALCRL)Unigene number: Hs.152175Probeset Accession #: L76380Nucleic Acid Accession #: NM_005795Coding sequence: 555-1940 (predicted start/stop codons underlined)GCACGAGGGA ACAACCTCTC TCTCTSCAGC AGAGAGTGTC ACCTCCTGCT TTAGGACCAT60CAAGCTCTGC TAACTGAATC TCATCCTAAT TGCAGGATCA CATTGCAAAG CTTTCACTCT120TTCCCACCTT GCTTGTGGGT AAATCTCTTC TGCGGAATCT CAGAAAGTAA AGTTCCATCC180TGAGAATATT TCACAAAGAA TTTCCTTAAG AGCTGGACTG GGTCTTGACC CCTGGAATTT240AAGAAATTCT TAAAGACAAT GTCAAATATG ATCCAAGAGA AAATGTGATT TGAGTCTGGA300GACAATTGTG CATATCGTCT AATAATAAAA ACCCATACTA GCCTATAGAA AACAATATTT360GAATAATAAA AACCCATACT AGCCTATAGA AAACAATATT TGAAAGATTG CTACCACTAA420AAAGAAAACT ACTACAACTT GACAAGACTG CTGCAAACTT CAATTGGTCA CCACAACTTG480ACAAGGTTGC TATAAAACAA GATTGCTACA ACTTCTAGTT TATGTTATAC AGCATATTTC540ATTTGGGCTT AATGATGGAG AAAAAGTGTA CCCTGTATTT TCTGGTTCTC TTGCCTTTTT600TTATGATTCT TGTTACAGCA GAATTAGAAG AGAGTCCTGA GGACTCAATT CAGTTGGGAG660TTACTAGAAA TAAAATCATG ACAGCTCAAT ATGAATGTTA 
CCAAAAGATT ATGCAAGACC720CCATTCAACA AGCAGAAGGC GTTTACTGCA ACAGAACCTG GGATGGATGG CTCTGCTGGA780ACGATGTTGC AGCAGGAACT GAATCAATGC AGCTCTGCCC TGATTACTTT CAGGACTTTG840ATCCATCAGA AAAAGTTACA AAGATCTGTG ACCAAGATGG AAACTGGTTT AGACATCCAG900CAAGCAACAG AACATGGACA AATTATACCC AGTGTAATGT TAACACCCAC GAGAAAGTGA960AGACTGCACT AAATTTGTTT TACCTGACCA TAATTGGACA CGGATTGTCT ATTGCATCAC1020TGCTTATCTC GCTTGGCATA TTCTTTTATT TCAAGAGCCT AAGTTGCCAA AGGATTACCT1080TACACAAAAA TCTGTTCTTC TCATTTGTTT GTAACTCTGT TGTAACAATC ATTCACCTCA1140CTGCAGTGGC CAACAACCAG GCCTTAGTAG CCACAAATCC TGTTAGTTGC AAAGTGTCCC1200AGTTCATTCA TCTTTACCTG ATGGGCTGTA ATTACTTTTG GATGCTCTGT GAAGGCATTT1260ACCTACACAC ACTCATTGTG GTGGCCGTGT TTGCAGAGAA GCAACATTTA ATGTGGTATT1320ATTTTCTTGG CTGGGGATTT CCACTGATTC CTGCTTGTAT ACATGCCATT GCTAGAAGCT1380TATATTACAA TGACAATTGC TGGATCAGTT CTGATACCCA TCTCCTCTAC ATTATCCATG1440GCCCAATTTG TGCTGCTTTA CTGGTGAATC TTTTTTTCTT GTTAAATATT GTACGCGTTC1500TCATCACCAA GTTAAAAGTT ACACACCAAG CGGAATCCAA TCTGTACATG AAAGCTGTGA1560GAGCTACTCT TATCTTGGTG CCATTGCTTG GCATTGAATT TGTGCTGATT CCATGGCGAC1620CTGAAGGAAA GATTGCAGAG GAGGTATATG ACTACATCAT GCACATCCTT ATGCACTTCC1680AGGGTCTTTT GGTCTCTACC ATTTTCTGCT TCTTTAATGG AGAGGTTCAA GCAATTCTGA1740GAAGAAACTG GAATCAATAC AAAATCCAAT TTGGAAACAG CTTTTCCAAC TCAGAAGCTC1800TTCGTAGTGC GTCTTACACA GTGTCAACAA TCAGTGATGG TCCAGGTTAT AGTCATGACT1860GTCCTAGTGA ACACTTAAAT GGAAAAAGCA TCCATGATAT TGAAAATGTT CTCTTAAAAC1920CAGAAAATTT ATATAATTGA AAATAGAAGG ATGGTTGTCT CACTGTTTGG TGCTTCTCCT1980AACTCAAGGA CTTGGACCCA TGACTCTGTA GCCAGAAGAC TTCAATATTA AATGACTTTG2040GGGAATGTCA TAAAGAAGAG CCTTCACATG AAATTAGTAG TGTGTTGATA AGAGTGTAAC2100ATCCAGCTCT ATGTGGGAAA AAAGAAATCC TGGTTTGTAA TGTTTGTCAG TAAATACTCC2160CACTATGCCT GATGTGACGC TACTAACCTG ACATCACCAA GTGTGGAATT GGAGAAAAGC2220ACAATCAACT TTTCTGAGCT GGTGTAAGCC AGTTCCAGCA CACCATTGAT GAATTCAAAC2280AAATGGCTGT AAAACTAAAC ATACATGTTG GGCATGATTC TACCCTTATT CSCCCCAAGA2340GACCTAGCTA AGGTCTATAA ACATGAAGGG AAAATTAGCT TTTAGTTTTA AAACTCTTTA2400TCCCATCTTG ATTGGGGCAG TTGACTTTTT TTTTTTCCCA 
GAGTGCCGTA GTCCTTTTTG2460TAACTACCCT CTCAAATGGA CAATACCAGA AGTGAATTAT CCCTGCTGGC TTTCTTTTCT2520CTATGAAAAG CAACTGAGTA CAATTGTTAT GATCTACTCA TTTGCTGACA CATCAGTTAT2580ATCTTGTGGC ATATCCATTG TGGAAACTGG ATGAACAGGA TGTATAATAT GCAATCTTAC2640TTCTATATCA TTAGGAAAAC ATCTTAGTTG ATGCTACAAA ACACCTTGTC AACCTCTTCC2700TGTCTTACCA AACAGTGGGA GGGAATTCCT AGCTGTAAAT ATAAATTTTG CCCTTCCATT2760TCTACTGTAT AAACAAATTA GCAATCATTT TATATAAAGA AAATCAATGA AGGATTTCTT2820ATTTTCTTGG AATTTTGTAA AAAGAAATTG TGAAAAATGA GCTTGTAAAT ACTCCATTAT2880TTTATTTTAT AGTCTCAAAT CAAATACATA CAACCTATGT AATTTTTAAA GCAAATATAT2940AATGCAACAA TGTGTGTATG TTAATATCTG ATACTGTATC TGGGCTGATT TTTTAAATAA3000AATAGAGTCT GGAATGCTACC4 DNA sequenceGene name: Homo sapiens mRNA; cDNA DKFZp586E1624Unigene number: Hs.94030Probeset Accession #: AA452000Nucleic Acid Accession #: AL110152.1Coding sequence: no ORF identified, possible frameshiftsACGCGTCCGA AGACATTAAG TAAAAAATTG GAACTATGAT TTTTCTTTGT CATTTTTTAA60AAAAGAATTA TTTTATTAAC CTGCTGGCAT ATAATCTGGA GTTCTTTTCA CAACCTTACT120TTTTCTGATT TGCTTTATTG AATGATTGAA TACTCATTTC TTTCTAAAAA TATGTTGTAA180ATTCTCCCTT GGCAAGATTT CTCCCTATGA GGGTAGTTAT TATTTGAGTC TGCCAAGTGG240TTACCATGGG GCAAGGTGCC ATGATGTATT CTTGGGTGCA TTGGTTTTTT GCGCATTGTA300AATTTAAGAC ACTTATAGTA AGTGGACTCA TTCATAGATG AGTTTCAGAA CCTTTTACGT360TCTCGGTAGA GGCTTCTGTC GGACAGGCAG AAGAGTGTAT TCCTCACTTT TTTTTTTGTC420TTCAAATTCC AGTAAGGCAT GCCACTTTTA AGAAATTAGA ATTTTTCTAT CATCTATGCA480AATGATATTT ATGTTAATAT TAAATATCTT ATGTTACACT GGGAGTAATT TGAGGTGCAA540TTATTTTTAT TACTACTTTG AATAGAGGAC CATTATCCTT CTTTCTTCAG AAAACTAAGA600AGTAAGTGTA ACTTTTAAAG TAAGTATATA TCAGTGAGAG TAGGCTTGTT TTACAACTAT660TTCTAGCCAG TGAGTTGTGT TTTCATGTCT CATCAAAAGA CAATACCACA TTGCATCATT720TTACAAAATA TGTTGTCATT TTCATTTCAG TTGTAACATA GGAAAATAGA TATTTCCTAG780ATGATTTCTG AGTTTCTTAC TGCAAAGAAC AGTTATAAAT TGGTATACAT GTGTCTCTGT840AATAGGGATA ATATTGATAT ATCTGTTGCT ACATATTTAA GAATCATTCT ATCTTATGTT900GTCTTGAGGC CAAGATTTAC CACGTTTGCC CAGTGTATTG AATTGGTGGT AGAAGGTAGT960TCCATGTTCC ATTTGTAGAT CTTTAAGATT 
TTATCTTTGA TAACTTTAAT AGAATGTGGC1020TCAGTTCTGG TCCTTCAAGC CTGTATGGTT TGGATTTTCA GTAGGGGACA GTTGATGTGG1080AGTCAATCTC TTTGGTACAC AGGAAGCTTT ATAAAATTTC ATTCACGAAT CTCTTATTTT1140GGGAAGCTGT TTTGCATATG AGAAGAACAC TGTTGAAATA AGGAACTAAA GCTTTATATA1200TTGATCAAGG TGATTCTGAA AGTTTTAATT TTTAATGTTG TAATGTTATG TTATTGTTAA1260TTGTACTTTA TTATGTATTC AATAGAAAAT CATGATTTAT TAATAAAAGC TTAAATTCTC1320ATCTAAAAAA AAAAAAAAAA AACC5 DNA sequenceGene name: Selectin E (endothelial adhesion molecule 1)Unigene number: Hs.89546Probeset Accession #: M24736Nucleic Acid Accession #: NM_000450Coding sequence: 117-1949 (predicted start/stop codons underlined)CCTGAGACAG AGGCAGCAGT GATACCCACC TGAGAGATCC TGTGTTTGAA CAACTGCTTC60CCAAAACGGA AAGTATTTCA AGCCTAAACC TTTGGGTGAA AAGAACTCTT GAAGTCATGA120TTGCTTCACA GTTTCTCTCA GCTCTCACTT TGGTGCTTCT CATTAAAGAG AGTGGAGCCT180GGTCTTACAA CACCTCCACG GAAGCTATGA CTTATGATGA GGCCAGTGCT TATTGTCAGC240AAAGGTACAC ACACCTGGTT GCAATTCAAA ACAAAGAAGA GATTGAGTAC CTAAACTCCA300TATTGAGCTA TTCACCAAGT TATTACTGGA TTGGAATCAG AAAAGTCAAC AATGTGTGGG360TCTGGGTAGG AACCCAGAAA CCTCTGACAG AAGAAGCCAA GAACTGGGCT CCAGGTGAAC420CCAACAATAG GCAAAAAGAT GAGGACTGCG TGGAGATCTA CATCAAGAGA GAAAAAGATG480TGGGCATGTG GAATGATGAG AGGTGCAGCA AGAAGAAGCT TGCCCTATGC TACACAGCTG540CCTGTACCAA TACATCCTGC AGTGGCCACG GTGAATGTGT AGAGACCATC AATAATTACA600CTTGCAAGTG TGACCCTGGC TTCAGTGGAC TCAAGTGTGA GCAAATTGTG AACTGTACAG660CCCTGGAATC CCCTGAGCAT GGAAGCCTGG TTTGCAGTCA CCCACTGGGA AACTTCAGCT720ACAATTCTTC CTGCTCTATC AGCTGTGATA GGGGTTACCT GCCAAGCAGC ATGGAGACCA780TGCAGTGTAT GTCCTCTGGA GAATGGAGTG CTCCTATTCC AGCCTGCAAT GTGGTTGAGT840GTGATGCTGT GACAAATCCA GCCAATGGGT TCGTGGAATG TTTCCAAAAC CCTGGAAGCT900TCCCATGGAA CACAACCTGT ACATTTGACT GTGAAGAAGG ATTTGAACTA ATGGGAGCCC960AGAGCCTTCA GTGTACCTCA TCTGGGAATT GGGACAACGA GAAGCCAACG TGTAAAGCTG1020TGACATGCAG GGCCGTCCGC CAGCCTCAGA ATGGCTCTGT GAGGTGCAGC CATTCCCCTG1080CTGGAGAGTT CACCTTCAAA TCATCCTGCA ACTTCACCTG TGAGGAAGGC TTCATGTTGC1140AGGGACCAGC CCAGGTTGAA TGCACCACTC AAGGGCAGTG GACACAGCAA 
ATCCCAGTTT1200GTGAAGCTTT CCAGTGCACA GCCTTGTCCA ACCCCGAGCG AGGCTACATG AATTGTCTTC1260CTAGTGCTTC TGGCAGTTTC CGTTATGGGT CCAGCTGTGA GTTCTCCTGT GAGCAGGGTT1320TTGTGTTGAA GGGATCCAAA AGGCTCCAAT GTGGCCCCAC AGGGGAGTGG GACAACGAGA1380AGCCCACATG TGAAGCTGTG AGATGCGATG CTGTCCACCA GCCCCCGAAG GGTTTGGTGA1440GGTGTGCTCA TTCCCCTATT GGAGAATTCA CCTACAAGTC CTCTTGTGCC TTCAGCTGTG1500AGGAGGGATT TGAATTATAT GGATCAACTC AACTTGAGTG CACATCTCAG GGACAATGGA1560CAGAAGAGGT TCCTTCCTGC CAAGTGGTAA AATGTTCAAG CCTGGCAGTT CCGGGAAAGA1620TCAACATGAG CTGCAGTGGG GAGCCCGTGT TTGGCACTGT GTGCAAGTTC GCCTGTCCTG1680AAGGATGGAC GCTCAATGGC TCTGCAGCTC GGACATGTGG AGCCACAGGA CACTGGTCTG1740GCCTGCTACC TACCTGTGAA GCTCCCACTG AGTCCAACAT TCCCTTGGTA GCTGGACTTT1800CTGCTGCTGG ACTCTCCCTC CTGACATTAG CACCATTTCT CCTCTGGCTT CGGAAATGCT1860TACGGAAAGC AAAGAAATTT GTTCCTGCCA GCAGCTGCCA AAGCCTTGAA TCAGACGGAA1920GCTACCAAAA GCCTTCTTAC ATCCTTTAAG TTCAAAAGAA TCAGAAACAG GTGCATCTGG1980GGAACTAGAG GGATACACTG AAGTTAACAG AGACAGATAA CTCTCCTCGG GTCTCTGGCC2040CTTCTTGCCT ACTATGCCAG ATGCCTTTAT GGCTGAAACC GCAACACCCA TCACCACTTC2100AATAGATCAA AGTCCAGCAG GCAAGGACGG CCTTCAACTG AAAAGACTCA GTGTTCCCTT2160TCCTACTCTC AGGATCAAGA AAGTGTTGGC TAATGAAGGG AAAGGATATT TTCTTCCAAG2220CAAAGGTGAA GAGACCAAGA CTCTGAAATC TCAGAATTCC TTTTCTAACT CTCCCTTGCT2280CGCTGTAAAA TCTTGGCACA GAAACACAAT ATTTTGTGGC TTTCTTTCTT TTGCCCTTCA2340CAGTGTTTCG ACAGCTGATT ACACAGTTGC TGTCATAAGA ATGAATAATA ATTATCCAGA2400GTTTAGAGGA AAAAAATGAC TAAAAATATT ATAACTTAAA AAAATGACAG ATGTTGAATG2460CCCACAGGCA AATGCATGGA GGGTTGTTAA TGGTGCAAAT CCTACTGAAT GCTCTGTGCG2520AGGGTTACTA TGCACAATTT AATCACTTTC ATCCCTATGG GATTCAGTGC TTCTTAAAGA2580GTTCTTAAGG ATTGTGATAT TTTTACTTGC ATTGAATATA TTATAATCTT CCATACTTCT2640TCATTCAATA CAAGTGTGGT AGGGACTTAA AAAACTTGTA AATGCTGTCA ACTATGATAT2700GGTAAAAGTT ACTTATTCTA GATTACCCCC TCATTGTTTA TTAACAAATT ATGTTACATC2760TGTTTTAAAT TTATTTCAAA AAGGGAAACT ATTGTCCCCT AGCAAGGCAT GATGTTAACC2820AGAATAAAGT TCTGAGTGTT TTTACTACAG TTGTTTTTTG AAAACATGGT AGAATTGGAG2880AGTAAAAACT GAATGGAAGG TTTGTATATT GTCAGATATT 
TTTTCAGAAA TATGTGGTTT2940CCACGATGAA AAACTTCCAT GAGGCCAAAC GTTTTGAACT AATAAAAGCA TAAATGCAAA3000CACACAAAGG TATAATTTTA TGAATGTCTT TGTTGGAAAA GAATACAGAA AGATGGATGT3060GCTTTGCATT CCTACAAAGA TGTTTGTCAG ATGTGATATG TAAACATAAT TCTTGTATAT3120TATGGAAGAT TTTAAATTCA CAATAGAAAC TCACCATGTA AAAGAGTCAT CTGGTAGATT3180TTTAACGAAT GAAGATGTCT AATAGTTATT CCCTATTTGT TTTCTTCTGT ATGTTAGGGT3240GCTCTGGAAG AGAGGAATGC CTGTGTGAGC AAGCATTTAT GTTTATTTAT AAGCAGATTT3300AACAATTCCA AAGGAATCTC CAGTTTTCAG TTGATCACTG GCAATGAAAA ATTCTCAGTC3360AGTAATTGCC AAAGCTGCTC TAGCCTTGAG GAGTGTGAGA ATCAAAACTC TCCTACACTT3420CCATTAACTT AGCATGTGTT GAAAAAAAAA GTTTCAGAGA AGTTCTGGCT GAACACTGGC3480AACGACAAAG CCAACAGTCA AAACAGAGAT GTGATAAGGA TCAGAACAGC AGAGGTTCTT3540TTAAAGGGGC AGAAAAACTC TGGGAAATAA GAGAGAACAA CTACTGTGAT CAGGCTATGT3600ATGGAATACA GTGTTATTTT CTTTGAAATT GTTTAAGTGT TGTAAATATT TATGTAAACT3660GCATTAGAAA TTAGCTGTGT GAAATACCAG TGTGGTTTGT GTTTGAGTTT TATTGAGAAT3720TTTAAATTAT AACTTAAAAT ATTTTATAAT TTTTAAAGTA TATATTTATT TAAGCTTATG3780TCAGACCTAT TTGACATAAC ACTATAAAGG TTGACAATAA ATGTGCTTAT GTTTACC8 DNA sequenceGene name: Chemokine (C-X-C motif), receptor 4 (fusin)Unigene number: Hs.89414Probeset Accession #: L06797Nucleic Acid Accession #: NM_003467Coding sequence: 89-1147 (predicted start/stop codons underlined)GTTTGTTGGC TGCGGCAGCA GGTAGCAAAG TGACGCCGAG GGCCTGAGTG CTCCAGTAGC60CACCGCATCT GGAGAACCAG CGGTTACCATGGAGGGGATC AGTATATACA CTTCAGATAA120CTACACCGAG GAAATGGGCT CAGGGGACTA TGACTCCATG AAGGAACCCT GTTTCCGTGA180AGAAAATGCT AATTTCAATA AAATCTTCCT GCCCACCATC TACTCCATCA TCTTCTTAAC240TGGCATTGTG GGCAATGGAT TGGTCATCCT GGTCATGGGT TACCAGAAGA AACTGAGAAG300CATGACGGAC AAGTACAGGC TGCACCTGTC AGTGGCCGAC CTCCTCTTTG TCATCACGCT360TCCCTTCTGG GCAGTTGATG CCGTGGCAAA CTGGTACTTT GGGAACTTCC TATGCAAGGC420AGTCCATGTC ATCTACACAG TCAACCTCTA CAGCAGTGTC CTCATCCTGG CCTTCATCAG480TCTGGACCGC TACCTGGCCA TCGTCCACGC CACCAACAGT CAGAGGCCAA GGAAGCTGTT540GGCTGAAAAG GTGGTCTATG TTGGCGTCTG GATCCCTGCC CTCCTGCTGA CTATTCCCGA600CTTCATCTTT GCCAACGTCA GTGAGGCAGA TGACAGATAT 
ATCTGTGACC GCTTCTACCC660CAATGACTTG TGGGTGGTTG TGTTCCAGTT TCAGCACATC ATGGTTGGCC TTATCCTGCC720TGGTATTGTC ATCCTGTCCT GCTATTGCAT TATCATCTCC AAGCTGTCAC ACTCCAAGGG780CCACCAGAAG CGCAAGGCCC TCAAGACCAC AGTCATCCTC ATCCTGGCTT TCTTCGCCTG840TTGGCTGCCT TACTACATTG GGATCAGCAT CGACTCCTTC ATCCTCCTGG AAATCATCAA900GCAAGGGTGT GAGTTTGAGA ACACTGTGCA CAAGTGGATT TCCATCACCG AGGCCCTAGC960TTTCTTCCAC TGTTGTCTGA ACCCCATCCT CTATGCTTTC CTTGGAGCCA AATTTAAAAC1020CTCTGCCCAG CACGCACTCA CCTCTGTGAG CAGAGGGTCC AGCCTCAAGA TCCTCTCCAA1080AGGAAAGCGA GGTGGACATT CATCTGTTTC CACTGAGTCT GAGTCTTCAA GTTTTCACTC1140CAGCTAACAC AGATGTAAAA GACTTTTTTT TATACGATAA ATAACTTTTT TTTAAGTTAC1200ACATTTTTCA GATATAAAAG ACTGACCAAT ATTGTACAGT TTTTATTGCT TGTTGGATTT1260TTGTCTTGTG TTTCTTTAGT TTTTGTGAAG TTTAATTGAC TTATTTATAT AAATTTTTTT1320TGTTTCATAT TGATGTGTGT CTAGGCAGGA CCTGTGGCCA AGTTCTTAGT TGCTGTATGT1380CTCGTGGTAG GACTGTAGAA AAGGGAACTG AACATTCCAG AGCGTGTAGT GAATCACGTA1440AAGCTAGAAA TGATCCCCAG CTGTTTATGC ATAGATAATC TCTCCATTCC CGTGGAACGT1500TTTTCCTGTT CTTAAGACGT GATTTTGCTG TAGAAGATGG CACTTATAAC CAAAGCCCAA1560AGTGGTATAG AAATGCTGGT TTTTCAGTTT TCAGGAGTGG GTTGATTTCA GCACCTACAG1620TGTACAGTCT TGTATTAAGT TGTTAATAAA AGTACATGTT AAACTTACTT AGTGTTATGACF2 DNA sequenceGene name: Endothelial cell-specific molecule 1Unigene number: Hs.41716Probeset Accession #: X89426Nucleic Acid Accession #: NM_007036Coding sequence: 56-610 (predicted start/stop codons underlined)CTTCCCACCA GCAAAGACCA CGACTGGAGA GCCGAGCCGG AGGCAGCTGG GAAACATGAA60GAGCGTCTTG CTGCTGACCA CGCTCCTCGT GCCTGCACAC CTGGTGGCCG CCTGGAGCAA120TAATTATGCG GTGGACTGCC CTCAACACTG TGACAGCAGT GAGTGCAAAA GCAGCCCGCG180CTGCAAGAGG ACAGTGCTCG ACGACTGTGG CTGCTGCCGA GTGTGCGCTG CAGGGCGGGG240AGAAACTTGC TACCGCACAG TCTCAGGCAT GGATGGCATG AAGTGTGGCC CGGGGCTGAG300GTGTCAGCCT TCTAATGGGG AGGATCCTTT TGGTGAAGAG TTTGGTATCT GCAAAGACTG360TCCCTACGGC ACCTTCGGGA TGGATTGCAG AGAGACCTGC AACTGCCAGT CAGGCATCTG420TGACAGGGGG ACGGGAAAAT GCCTGAAATT CCCCTTCTTC CAATATTCAG TAACCAAGTC480TTCCAACAGA TTTGTTTCTC TCACGGAGCA TGACATGGCA TCTGGAGATG 
GCAATATTGT540GAGAGAAGAA GTTGTGAAAG AGAATGCTGC CGGGTCTCCC GTAATGAGGA AATGGTTAAA600TCCACGCTGA TCCCGGCTGT GATTTCTGAG AGAAGGCTCT ATTTTCGTGA TTGTTCAACA660CACAGCCAAC ATTTTAGGAA CTTTCTAGAT ATAGCATAAG TACATGTAAT TTTTGAAGAT720CCAAATTGTG ATGCATGGTG GATCCAGAAA ACAAAAAGTA GGATACTTAC AATCCATAAC780ATCCATATGA CTGAACACTT GTATGTGTTT GTTAAATATT CGAATGCATG TAGATTTGTT840AAATGTGTGT GTATAGTAAC ACTGAAGAAC TAAAAATGCA ATTTAGGTAA TCTTACATGG900AGACAGGTCA ACCAAAGAGG GAGCTAGGCA AAGCTGAAGA CCGCAGTGAG TCAAATTAGT960TCTTTGACTT TGATGTACAT TAATGTTGGG ATATGGAATG AAGACTTAAG AGCAGGAGAA1020GATGGGGAGG GGGTGGGAGT GGGAAATAAA ATATTTAGCC CTTCCTTGGT AGGTAGCTTC1080TCTAGAATTT AATTGTGCTT TTTTTTTTTT TTTGGCTTTG GGAAAAGTCA AAATAAAACA1140ACCAGAAAAC CCCTGAAGGA AGTAAGATGT TTGAAGCTTA TGGAAATTTG AGTAACAAAC1200AGCTTTGAAC TGAGAGCAAT TTCAAAAGGC TGCTGATGTA GTTCCCGGGT TACCTGTATC1260TGAAGGACGG TTCTGGGGCA TAGGAAACAC ATACACTTCC ATAAATAGCT TTAACGTATG1320CCACCTCAGA GATAAATCTA AGAAGTATTT TACCCACTGG TGGTTTGTGT GTGTATGAAG1380GTAAATATTT ATATATTTTT ATAAATAAAT GTGTTAGTGC AAGTCATCTT CCCTACCCAT1440ATTTATCATC CTCTTGAGGA AAGAAATCTA GTATTATTTG TTGAAAATGG TTAGAATAAA1500AACCTATGAC TCTATAAGGT TTTCAAACAT CTGAGGCATG ATAAATTTAT TATCCATAAT1560TATAGGAGTC ACTCTGGATT TCAAAAAATG TCAAAAAATG AGCAACAGAG GGACCTTATT1620TAAACATAAG TGCTGTGACT TCGGTGAATT TTCAATTTAA GGTATGAAAA TAAGTTTTTA1680GGAGGTTTGT AAAAGAAGAA TCAATTTTCA GCAGAAAACA TGTCAACTTT AAAATATAGG1740TGGAATTAGG AGTATATTTG AAAGAATCTT AGCACAAACA GGACTGTTGT ACTAGATGTT1800CTTAGGAAAT ATCTCAGAAG TATTTTATTT GAAGTGAAGA ACTTATTTAA GAATTATTTC1860AGTATTTACC TGTATTTTAT TCTTGAAGTT GGCCAACAGA GTTGTGAATG TGTGTGGAAG1920GCCTTTGAAT GTAAAGCTGC ATAAGCTGTT AGGTTTTGTT TTAAAAGGAC ATGTTTATTA1980TTGTTCAATA AAAAAGAACA AGATACACF4 DNA sequenceGene name: P53-responsive gene 2 similar to D.melanogaster peroxidasin (U11052)Unigene number: Hs.118893Probeset Accession #: D86983Nucleic Acid Accession #: D86983Coding sequence: 1-4491 (predicted stop codon underlined, sequence is open at 5′end)AGCCGGCCGT GGTGGCTCCG TGCGTCCGAG CGTCCGTCCG 
CGCCGTCGGC CATGGCCAAG60CGCTCCAGGG GCCCCGGGCG CCGCTGCCTG TTGGCGCTCG TGCTGTTCTG CGCCTGGGGG120ACGCTGGCCG TGGTGGCCCA GAAGCCGGGC GCAGGGTGTC CGAGCCGCTG CCTGTGCTTC180CGCACCACCG TGCGCTGCAT GCATCTGCTG CTGGAGGCCG TGCCCGCCGT GGCGCCGCAG240ACCTCCATCC TAGATCTTCG CTTTAACAGA ATCAGAGAGA TCCAACCTGG GGCATTCAGG300CGGCTGAGGA ACTTGAACAC ATTGCTTCTC AATAATAATC AGATCAAGAG GATACCTAGT360GGAGCATTTG AAGACTTGGA AAATTTAAAA TATCTCTATC TGTACAAGAA TGAGATCCAG420TCAATTGACA GGCAAGCATT TAAGGGACTT GCCTCTCTAG AGCAACTATA CCTGCACTTT480AATCAGATAG AAACTTTGGA CCCAGATTCG TTCCAGCATC TCCCGAAGCT CGAGAGGCTA540TTTTTGCATA ACAACCGGAT TACACATTTA GTTCCAGGGA CATTTAATCA CTTGGAATCT600ATGAAGAGAT TGCGACTGGA CTCAAACACA CTTCACTGCG ACTGTGAAAT CCTGTGGTTG660GCGGATTTGC TGAAAACCTA CGCGGAGTCG GGGAACGCGC AGGCAGCGGC CATCTGTGAA720TATCCCAGAC GCATCCAGGG ACGCTCAGTG GCAACCATCA CCCCGGAAGA GCTGAACTGT780GAAAGGCCCC GGATCACCTC CGAGCCCCAG GACGCAGATG TGACCTCGGG GAACACCGTG840TACTTCACCT GCAGAGCCGA AGGCAACCCC AAGCCTGAGA TCATCTGGCT GCGAAACAAT900AATGAGCTGA GCATGAAGAC AGATTCCCGC CTAAACTTGC TGGACGATGG GACCCTGATG960ATCCAGAACA CACAGGAGAC AGACCAGGGT ATCTACCAGT GCATGGCAAA GAACGTGGCC1020GGAGAGGTGA AGACGCAAGA GGTGACCCTC AGGTACTTCG GGTCTCCAGC TCGACCCACT1080TTTGTAATCC AGCCACAGAA TACAGAGGTG CTGGTTGGGG AGAGCGTCAC GCTGGAGTGC1140AGCGCCACAG GCCACCCCCC GCCGCGGATC TCCTGGACGA GAGGTGACCG CACACCCTTG1200CCAGTTGACC CGCGGGTGAA CATCACGCCT TCTGGCGGGC TTTACATACA GAACGTCGTA1260CAGGGGGACA GCGGAGAGTA TGCGTGCTCT GCGACCAACA ACATTGACAG CGTCCATGCC1320ACCGCTTTCA TCATCGTCCA GGCTCTTCCT CAGTTCACTG TGACGCCTCA GGACAGAGTC1380GTTATTGAGG GCCAGACCGT GGATTTCCAG TGTGAAGCCA AGGGCAACCC GCCGCCCGTC1440ATCGCCTCCA CCAAGGGAGG GAGCCAGCTC TCCGTGGACC GGCGGCACCT GGTCCTGTCA1500TCGGGAACC TTAGAATCTC TGGTGTTGCC CTCCACGACC AGGGCCAGTA CGAATGCCAG1560GCTGTCAACA TCATCGGCTC CCAGAAGGTC GTGGCCCACC TGACTGTGCA GCCCAGAGTC1620ACCCCAGTGT TTGCCAGCAT TCCCAGCGAC ACAACAGTGG AGGTGGGCGC CAATGTGCAG1680CTCCCGTGCA GCTCCCAGGG CGAGCCCGAG CCAGCCATCA CCTGGAACAA GGATGGGGTT1740CAGGTGACAG AAAGTGGAAA ATTTCACATC AGCCCTGAAG GATTCTTGAC 
CATCAATGAC1800GTTGGCCCTG CAGACGCAGG TCGCTATGAG TGTGTGGCCC GGAACACCAT TGGGTCGGCC1860TCGGTGAGCA TGGTGCTCAG TGTGAACGTT CCTGACGTCA GTCGAAATGG AGATCCGTTT1920GTAGCTACCT CCATCGTGGA AGCGATTGCG ACTGTTGACA GAGCTATAAA CTCAACCCGA1980ACACATTTGT TTGACAGCCG TCCTCGTTCT CCAAATGATT TGCTGGCCTT GTTCCGGTAT2040CCGAGGGATC CTTACACAGT TGAACAGGCA CGGGCGGGAG AAATCTTTGA ACGGACATTG2100CAGCTCATTC AGGAGCATGT ACAGCATGGC TTGATGGTCG ACCTCAACGG AACAAGTTAC2160CACTACAACG ACCTGGTGTC TCCACAGTAC CTGAACCTCA TCGCAAACCT GTCGGGCTGT2220ACCGCCCACC GGCGCGTGAA CAACTGCTCG GACATGTGCT TCCACCAGAA GTACCGGACG2280CACGACGGCA CCTGTAACAA CCTGCAGCAC CCCATGTGGG GCGCCTCGCT GACCGCCTTC2340GAGCGCCTGC TGAAATCCGT GTACGAGAAT GGCTTCAACA CCCCTCGGGG CATCAACCCC2400CACCGACTGT ACAACGGGCA CGCCCTTCCC ATGCCGCGCC TGGTGTCCAC CACCCTGATC2460GGGACGGAGA CCGTCACACC CGACGAGCAG TTCACCCACA TGCTGATGCA GTGGGGCCAG2520TTCCTGGACC ACGACCTCGA CTCCACGGTG GTGGCCCTGA GCCAGGCACG CTTCTCCGAC2580GGACAGCACT GCAGCAACGT GTGCAGCAAC GACCCCCCCT GCTTCTCTGT CATGATCCCC2640CCCAATGACT CCCGGGCCAG GAGCGGGGCC CGCTGCATGT TCTTCGTGCG CTCCAGCCCT2700GTGTGCGGCA GCGGCATGAC TTCGCTGCTC ATGAACTCCG TGTACCCGCG GGAGCAGATC2760AACCAGCTCA CCTCCTACAT CGACGCATCC AACGTGTACG GGAGCACGGA GCATGAGGCC2820CGCAGCATCC GCGACCTGGC CAGCCACCGC GGCCTGCTGC GGCAGGGCAT CGTGCAGCGG2880TCCGGGAAGC CGCTGCTCCC CTTCGCCACC GGGCCGCCCA CGGAGTGCAT GCGGGACGAG2940AACGAGAGCC CCATCCCCTG CTTCCTGGCC GGGGACCACC GCGCCAACGA GCAGCTGGGC3000CTGACCAGCA TGCACACGCT GTGGTTCCGC GAGCACAACC GCATTGCCAC GGAGCTGCTC3060AAGCTGAACC CGCACTGGGA CGGCGACACC ATCTACTATG AGACCAGGAA GATCGTGGGT3120GCGGAGATCC AGCACATCAC CTACCAGCAC TGGCTCCCGA AGATCCTGGG GGAGGTGGGC3180ATGAGGACGC TGGGAGAGTA CCACGGCTAC GACCCCGGCA TCAATGCTGG CATCTTCAAC3240GCCTTCGCCA CCGCGGCCTT CAGGTTTGGC CACACGCTTG TCAACCCACT GCTTTACCGG3300CTGGACGAGA ACTTCCAGCC CATTGCACAA GATCACCTCC CCCTTCACAA AGCTTTCTTC3360TCTCCCTTCC GGATTGTGAA TGAGGGCGGC ATCGATCCGC TTCTCAGGGG GCTGTTCGGG3420GTGGCGGGGA AAATGCGTGT GCCCTCGCAG CTGCTGAACA CGGAGCTCAC GGAGCGGCTG3480TTCTCCATGG CACACACGGT GGCTCTGGAC CTGGCGGCCA 
TCAACATCCA GCGGGGCCGG3540GACCAGGGGA TCCCACCCTA CCACGACTAC AGGGTCTACT GCAATCTATC GGCGGCACAC3600ACGTTCGAGG ACCTGAAAAA TGAGATTAAA AACCCTGAGA TCCGGGAGAA ACTGAAAAGG3660TTGTATGGCT CGACACTCAA CATCGACCTG TTTCCGGCGC TCGTGGTGGA GGACCTGGTG3720CCTGGCAGCC GGCTGGGCCC CACCCTGATG TGTCTTCTCA GCACACAGTT CAAGCGCCTG3780CGAGATGGGG ACAGGTTGTG GTATGAGAAC CCTGGGGTGT TCTCCCCGGC CCAGCTGACT3840CAGATCAAGC AGACGTCGCT GGCCAGGATC CTATGCGACA ACGCGGACAA CATCACCCGG3900GTGCAGAGCG ACGTGTTCAG GGTGGCGGAG TTCCCTCACG GCTACGGCAG CTGTGACGAG3960ATCCCCAGGG TGGACCTCCG GGTGTGGCAG GACTGCTGTG AAGACTGTAG GACCAGGGGG4020CAGTTCAATG CCTTTTCCTA TCATTTCCGA GGCAGACGGT CTCTTGAGTT CAGCTACCAG4080GAGGACAAGC CGACCAAGAA AACAAGACCA CGGAAAATAC CCAGTGTTGG GAGACAGGGG4140GAACATCTCA GCAACAGCAC CTCAGCCTTC AGCACACGCT CAGATGCATC TGGGACAAAT4200GACTTCAGAG AGTTTGTTCT GGAAATGCAG AAGACCATCA CAGACCTCAG AACACAGATA4260AAGAAACTTG AATCACGGCT CAGTACCACA GAGTGCGTGG ATGCCGGGGG CGAATCTCAC4320GCCAACAACA CCAAGTGGAA AAAAGATGCA TGCACCATTT GTGAATGCAA AGACGGGCAG4380GTCACCTGCT TCGTGGAAGC TTGCCCCCCT GCCACCTGTG CTGTCCCCGT GAACATCCCA4440GGGGCCTGCT GTCCAGTCTG CTTACAGAAG AGGGCGGAGG AAAAGCCCTAGGCTCCTGGG4500AGGCTCCTCA GAGTTTGTCT GCTGTGCCAT CGTGAGATCG GGTGGCCGAT GGCAGGGAGC4560TGCGGACTGC AGACCAGGAA ACACCCAGAA CTCGTGACAT TTCATGACAA CGTCCAGCTG4620GTGCTGTTAC AGAAGGCAGT GCAGGAGGCT TCCAACCAGA GCATCTGCGG AGAAGGAGGC4680ACAGCAGGTG CCTGAAGGGA AGCAGGCAGG AGTCCTAGCT TCACGTTAGA CTTCTCAGGT4740TTTTATTTAA TTCTTTTAAA ATGAAAAATT GGTGCTACTA TTAAATTGCA CAGTTGAATC4800ATTTAGGCGC CTAAATTGGT TTTGCCTCCC AACACCATTT CTTTTTAAAT AAAGCAGGAT4860ACCTCTATAT GTCAGCCTTG CCTTGTTCAG ATGCCAGGAG CCGGCAGACC TGTCACCCGC4920AGGTGGGGTG AGTCTCGGAG CTGCCAGAGG GGCTCACCGA AATCGGGGTT CCATCACAAG4980CTATGTTTAA AAAGAAAATT GGTGTTTGGC AAACGGAACA GAACCTTTGA TGAGAGCGTT5040CACAGGGACA CTGTCTGGGG GTGCAGTGCA AGCCCCCGGC CTCTTCCCTG GGAACCTCTG5100AACTCCTCCT TCCTCTGGGC TCTCTGTAAC ATTTCACCAC ACGTCAGCAT CTAATCCCAA5160GACAAACATT CCCGCTGCTC GAAGCAGCTG TATAGCCTGT GACTCTCCGT GTGTCAGCTC5220CTTCCACACC TGATTAGAAC ATTCATAAGC CACATTTAGA 
AACAGATTTG CTTTCAGCTG5280TCACTTGCAC ACATACTGCC TAGTTGTGAA CCAAATGTGA AAAAACCTCC TTCATCCCAT5340TGTGTATCTG ATACCTGCCG AGGGCCAAGG GTGTGTGTTG ACAACGCCGC TCCCAGCCGG5400CCCTGGTTGC GTCCACGTCC TGAACAAGAG CCGCTTCCGG ATGGCTCTTC CCAAGGGAGG5460AGGAGCTCAA GTGTCGGGAA CTGTCTAACT TCAGGTTGTG TGAGTGCGTTACF5 DNA sequenceGene name: Mitogen-activated protein kinase kinase kinase kinase 4Unigene number: Hs.3628Probeset Accession #: N54067Nucleic Acid Accession #: NM_004834Coding sequence: 80-3577 (predicted start/stop codons underlined)AATTCGAGGA TCCGGGTACC ATGGCACAGA GCGACAGAGA CATTTATTGT TATTTGTTTT60TTGGTGGCAA AAAGGGAAAATGGCGAACGA CTCCCCTGCA AAAAGTCTGG TGGACATCGA120CCTCTCCTCC CTGCGGGATC CTGCTGGGAT TTTTGAGCTG GTGGAAGTGG TTGGAAATGG180CACCTATGGA CAAGTCTATA AGGGTCGACA TGTTAAAACG GGTCAGTTGG CAGCCATCAA240AGTTATGGAT GTCACTGAGG ATGAAGAGGA AGAAATCAAA CTGGAGATAA ATATGCTAAA300GAAATACTCT CATCACAGAA ACATTGCAAC ATATTATGGT GCTTTCATCA AAAAGAGCCC360TCCAGGACAT GATGACCAAC TCTGGCTTGT TATGGAGTTC TGTGGGGCTG GGTCCATTAC420AGACCTTGTG AAGAACACCA AAGGGAACAC ACTCAAAGAA GACTGGATCG CTTACATCTC480CAGAGAAATC CTGAGGGGAC TGGCACATCT TCACATTCAT CATGTGATTC ACCGGGATAT540CAAGGGCCAG AATGTGTTGC TGACTGAGAA TGCAGAGGTG AAACTTGTTG ACTTTGGTGT600GAGTGCTCAG CTGGACAGGA CTGTGGGGCG GAGAAATACG TTCATAGGCA CTCCCTACTG660GATGGCTCCT GAGGTCATCG CCTGTGATGA GAACCCAGAT GCCACCTATG ATTACAGAAG720TGATCTTTGG TCTTGTGGCA TTACAGCCAT TGAGATGGCA GAAGGTGCTC CCCCTCTCTG780TGACATGCAT CCAATGAGAG CACTGTTTCT CATTCCCAGA AACCCTCCTC CCCGGCTGAA840GTCAAAAAAA TGGTCGAAGA AGTTTTTTAG TTTTATAGAA GGGTGCCTGG TGAAGAATTA900CATGCAGCGG CCCTCTACAG AGCAGCTTTT GAAACATCCT TTTATAAGGG ATCAGCCAAA960TGAAAGGCAA GTTAGAATCC AGCTTAAGGA TCATATAGAT CGTACCAGGA AGAAGAGAGG1020CGAGAAAGAT GAAACTGAGT ATGAGTACAG TGGGAGTGAG GAAGAAGAGG AGGAAGTGCC1080TGAACAGGAA GGAGAGCCAA GTTCCATTGT GAACGTGCCT GGTGAGTCTA CTCTTCGCCG1140AGATTTCCTG AGACTGCAGC AGGAGAACAA GGAACGTTCC GAGGCTCTTC GGAGACAACA1200GTTACTACAG GAGCAACAGC TCCGGGAGCA GGAAGAATAT AAAAGGCAAC TGCTGGCAGA1260GAGACAGAAG CGGATTGAGC AGCAGAAAGA ACAGAGGCGA 
CGGCTAGAAG AGCAACAAAG1320GAGAGAGCGG GAGGCTAGAA GGCAGCAGGA ACGTGAACAG CGAAGGAGAG AACAAGAAGA1380AAAGAGGCGT CTAGAGGAGT TGGAGAGAAG GCGCAAAGAA GAAGAGGAGA GGAGACGGGC1440AGAAGAAGAA AAGAGGAGAG TTGAAAGAGA ACAGGAGTAT ATCAGGCGAC AGCTAGAAGA1500GGAGCAGCGG CACTTGGAAG TCCTTCAGCA GCAGCTGCTC CAGGAGCAGG CCATGTTACT1560GCATGACCAT AGGAGGCCGC ACCCGCAGCA CTCGCAGCAG CCGCCACCAC CGCAGCAGGA1620AAGGAGCAAG CCAAGCTTCC ATGCTCCCGA GCCCAAAGCC CACTACGAGC CTGCTGACCG1680AGCGCGAGAG GTTCCTGTGA GAACAACATC TCGCTCCCCT GTTCTGTCCC GTCGAGATTC1740CCCACTGCAG GGCAGTGGGC AGCAGAATAG CCAGGCAGGA CAGAGAAACT CCACCAGTAT1800TGAGCCCAGG CTTCTGTGGG AGAGAGTGGA GAAGCTGGTG CCCAGACCTG GCAGTGGCAG1860CTCCTCAGGG TCCAGCAACT CAGGATCCCA GCCCGGGTCT CACCCTGGGT CTCAGAGTGG1920CTCCGGGGAA CGCTTCAGAG TGAGATCATC ATCCAAGTCT GAAGGCTCTC CATCTCAGCG1980CCTGGAAAAT GCAGTGAAAA AACCTGAAGA TAAAAAGGAA GTTTTCAGAC CCCTCAAGCC2040TGCTGGCGAA GTGGATCTGA CCGCACTGGC CAAAGAGCTT CGAGCAGTGG AAGATGTACG2100GCCACCTCAC AAAGTAACGG ACTACTCCTC ATCCAGTGAG GAGTCGGGGA CGACGGATGA2160GGAGGACGAC GATGTGGAGC AGGAAGGGGC TGACGAGTCC ACCTCAGGAC CAGAGGACAC2220CAGAGCAGCG TCATCTCTGA ATTTGAGCAA TGGTGAAACG GAATCTGTGA AAACCATGAT2280TGTCCATGAT GATGTAGAAA GTGAGCCGGC CATGACCCCA TCCAAGGAGG GCACTCTAAT2340CGTCCGCCAG ACTCAGTCCG CTAGTAGCAC ACTCCAGAAA CACAAATCTT CCTCCTCCTT2400TACACCTTTT ATAGACCCCA GATTACTACA GATTTCTCCA TCTAGCGGAA CAACAGTGAC2460ATCTGTGGTG GGATTTTCCT GTGATGGGAT GAGACCAGAA GCCATAAGGC AAGATCCTAC2520CCGGAAAGGC TCAGTGGTCA ATGTGAATCC TACCAACACT AGGCCACAGA GTGACACCCC2580GGAGATTCGT AAATACAAGA AGAGGTTTAA CTCTGAGATT CTGTGTGCTG CCTTATGGGG2640AGTGAATTTG CTAGTGGGTA CAGAGAGTGG CCTGATGCTG CTGGACAGAA GTGGCCAAGG2700GAAGGTCTAT CCTCTTATCA ACCGAAGACG ATTTCAACAA ATGGACGTAC TTGAGGGCTT2760GAATGTCTTG GTGACAATAT CTGGCAAAAA GGATAAGTTA CGTGTCTACT ATTTGTCCTG2820GTTAAGAAAT AAAATACTTC ACAATGATCC AGAAGTTGAG AAGAAGCAGG GATGGACAAC2880CGTAGGGGAT TTGGAAGGAT GTGTACATTA TAAAGTTGTA AAATATGAAA GAATCAAATT2940TCTGGTGATT GCTTTGAAGA GTTCTGTGGA AGTCTATGCG TGGGCACCAA AGCCATATCA3000CAAATTTATG GCCTTTAAGT CATTTGGAGA 
ATTGGTACAT AAGCCATTAC TGGTGGATCT3060CACTGTTGAG GAAGGCCAGA GGTTGAAAGT GATCTATGGA TCCTGTGCTG GATTCCATGC3120TGTTGATGTG GATTCAGGAT CAGTCTATGA CATTTATCTA CCAACACATG TAAGAAAGAA3180CCCACACTCT ATGATCCAGT GTAGCATCAA ACCCCATGCA ATCATCATCC TCCCCAATAC3240AGATGGAATG GAGCTTCTGG TGTGCTATGA AGATGAGGGG GTTTATGTAA ACACATATGG3300AAGGATCACC AAGGATGTAG TTCTACAGTG GGGAGAGATG CCTACATCAG TAGCATATAT3360TCGATCCAAT CAGACAATGG GCTGGGGAGA GAAGGCCATA GAGATCCGAT CTGTGGAAAC3420TGGTCACTTG GATGGTGTGT TCATGCACAA AAGGGCTCAA AGACTAAAAT TCTTGTGTGA3480ACGCAATGAC AAGGTGTTCT TTGCCTCTGT TCGGTCTGGT GGCAGCAGTC AGGTTTATTT3540CATGACCTTA GGCAGGACTT CTCTTCTGAG CTGGTAGAAG CAGTGTGATC CAGGGATTAC 3600TGGCCTCCAG AGTCTTCAAG ATCCTGAGAA CTTGGAATTC CTTGTAAC GAGCTCGGAG3660CTGCACCGAG GGCAACCAGG ACAGCTGTGT GTGCAGACCT CATGTGTTCG GTTCTCTCCC3720CTCCTTCCTG TTCCTCTTAT ATACCAGTTT ATCCCCATTC TTTTTTTTTT TCTTACTCCA3780AAATAAATCA AGGCTGCAAT GCAGCTGGTG CTGTTCAGAT TCCAAAAAAA AAAAAAAACC3840ATGGTACCCG GATCCTCGAA TTCCACF8 DNA sequenceGene name: Phospholipase A2, group IVC (cytosolic, calcium-independent)Unigene number: Hs.18858Probeset Accession #: AA054087 Nucleic Acid Accession #: NM_003706Coding sequence: 310-1935 (predicted start/stop codons underlined)CACGAGGCAG GGGCCATTTT ACCTCCAGGT TGGCCCTGCT CAGGACCAGG AGGAAACACC60TCCAGCCCGC GACCTCCTCC CACAGGGGGA AAAGGAAAGC AGGAGGACCA CAGAAGCTTT120GGCACCGAGG ATCCCCGCAG TCTTCACCCG CGGAGATTCC GGCTGAAGGA GCTGTCCAGC180GACTACACCG CTAAGCGCAG GGAGCCCAAG CCTCCGCACC GGATTCCGGA GCACAAGCTC240CACCGCGCAT GCGCACACGC CCCAGACCCA GGCTCAGGAG GACTGAGAAT TTTCTGACCG300CAGTGCACCATGGGAAGCTC TGAAGTTTCC ATAATTCCTG GGCTCCAGAA AGAAGAAAAG360GCGGCCGTGG AGAGACGAAG ACTTCATGTG CTGAAAGCTC TGAAGAAGCT AAGGATTGAG420GCTGATGAGG CCCCAGTTGT TGCTGTGCTG GGCTCAGGCG GAGGACTGCG GGCTCACATT480GCCTGCCTTG GGGTCCTGAG TGAGATGAAA GAACAGGGCC TGTTGGATGC CGTCACGTAC540CTCGCAGGGG TCTCTGGATC CACTTGGGCA ATATCTTCTC TCTACACCAA TGATGGTGAC600ATGGAAGCTC TCGAGGCTGA CCTGAAACAT CGATTTACCC GACAGGAGTG GGACTTGGCT660AAGAGCCTAC AGAAAACCAT CCAAGCAGCG AGGTCTGAGA 
ATTACTCTCT GACCGACTTC720TGGGGCTACA TGGTTATCTC TAAGCAAACC AGAGAACTGC CGGAGTCTCA TTTGTCCAAT780ATGAAGAAGC CCGTGGAAGA AGGGACACTA CCCTACCCAA TATTTGCAGC CATTGACAAT840GACCTGCAAC CTTCCTGGCA GGAGGCAAGA GCACCAGAGA CCTGGTTCGA GTTCACCCCT900CACCACGCTG GCTTCTCTGC ACTGGGGGCC TTTGTTTCCA TAACCCACTT CGGAAGCAAA960TTCAAGAAGG GAAGACTGGT CAGAACTCAC CCTGAGAGAG ACCTGACTTT CCTGAGAGGT1020TTATGGGGAA GTGCTCTTGG TAACACTGAA GTCATTAGGG AATACATTTT TGACCAGTTA1080AGGAATCTGA CCCTGAAAGG TTTATGGAGA AGGGCTGTTG CTAATGCTAA AAGCATTGGA1140CACCTTATTT TTGCCCGATT ACTGAGGCTG CAAGAAAGTT CACAAGGGGA ACATCCTCCC1200CCAGAAGATG AAGGCGGTGA GCCTGAACAC ACCTGGCTGA CTGAGATGCT CGAGAATTGG1260ACCAGGACCT CCCTGGAAAA GCAGGAGCAG CCCCATGAGG ACCCCGAAAG GAAAGGCTCA1320CTCAGTAACT TGATGGATTT TGTGAAGAAA ACAGGCATTT GCGCTTCAAA GTGGGAATGG1380GGGACCACTC ACAACTTCCT GTACAAACAC GGTGGCATCC GGGACAAGAT AATGAGCAGC1440CGGAAGCACC TCCACCTGGT GGATGCTGGT TTAGCCATCA ACACTCCCTT CCCACTCGTG1500CTGCCCCCGA CGCGGGAGGT TCACCTCATC CTCTCCTTCG ACTTCAGTGC CGGAGATCCT1560TTCGAGACCA TCCGGGCTAC CACTGACTAC TGCCGCCGCC ACAAGATCCC CTTTCCCCAA1620GTAGAAGAGG CTGAGCTGGA TTTGTGGTCC AAGGCCCCCG CCAGCTGCTA CATCCTGAAA1680GGAGAAACTG GACCAGTGGT GATACATTTT CCCCTGTTCA ACATAGATGC CTGTGGAGGT1740GATATTGAGG CATGGAGTGA CACATACGAC ACATTCAAGC TTGCTGACAC CTACACTCTA1800GATGTGGTGG TGCTACTCTT GGCATTAGCC AAGAAGAATG TCAGGGAAAA CAAGAAGAAG1860ATCCTTAGAG AGTTGATGAA CGTGGCCGGG CTCTACTACC CGAAGGATAG TGCCCGAAGT1920TGCTGCTTGG CATAGATGAG CCTCAGCTTC CAGGGCACTG TGGGCCTGTT GGTCTACTAG1980GGCCCTGAAG TCCACCTGGC CTTCCTGTTC TTCACTCCCT TCAGCCACAC GCTTCATGGC2040CTTGAGTTCA CCTTGGCTGT CCTAACAGGG CCAATCACCA GTGACCAGCT AGACTGTGAT2100TTTGATAGCG TCATTCAGAA GAAGGTGTCC AAGGAGCTGA AGGTGGTGAA ATTTGTCCTG2160CAGGTCCCTC GGGAGATCCT GGAGCTGGAG CATGAGTGTC TGACAATCAG AAGCATCATG2220TCCAATGTCC AGATGGCCAG AATGAATGTG ATAGTTCAGA CCAATGCCTT CCACTGCTCC2280TTTATGACTG CACTTCTAGC CAGTAGCTCT GCACAAGTTA GCTCTGTAGA AGTAAGAACT2340TGGGCTTAAA TCATGGGCTA TCTCTCCACA GCCAAGTGGA GCTCTGAGAA TACAACAAGT2400GCTCAATAAA TGCTTGCTGA TTGACTGATG AAAAAAAAAA 
AAAAAAAAAA AAAAAAAAAA2460AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAACG1 DNA sequenceGene name: Carbohydrate (chondroitin 6/keratan) sulfotransferase 1Unigene number: Hs.104576Probeset Accession #: AA868063 Nucleic Acid Accession #: NM_002654Coding sequence: 367-1602 (predicted start/stop codons underlined)GGGGAGGGCG CGGGAGGCGG AGGATGCCGC CGCGGCTGCT GCCGCCGCCG CCACCCGCGG60GTCCCCGGCG ACCCTACTCC AGACCCGAGG ATGGAGCCGG CGCTGGGCGC TGCAGCTGCT120CCCGGCGCGT CCCCGACCAG GTAGCTGGTG TCACTTCGGT GTGGTTGGAA GAAGACTTTC180TCCCCAGCTG CATTCCCGGA GGCGCCCTTT CGACCTGGAG GCCGGGTCTG CTGGCCACAG240GGCTGCCGCA CTGGCTGGGA CTGCCAGCTG GGCCTGGAGA CGCTGGTGGC TGTGGACTCC300CCAGCTTGGA GCAGTCCCTC TTTGACCTCA CCCCTTGGAG AAGCAGCCCC ATGAAGGTGC360CCAGCCATGC AATGTTCCTG GAAGGCCGTC CTCCTCCTTG CCCTGGCCTC CATTGCCATC420CAGTACACGG CCATCCGCAC CTTCACCGCC AAGTCCTTTC ACACCTGCCC CGGGCTGGCA480GAGGCCGGGC TGGCCGAGCG ACTGTGCGAG GAGAGCCCCA CCTTCGCCTA CAACCTCTCC540CGCAAGACCC ACATCCTCAT CCTGGCCACC ACGCGCAGCG GCTCCTCCTT CGTGGGCCAG600CTCTTCAACC AGCACCTGGA CGTCTTCTAC CTGTTTGAGC CCCTCTACCA CGTCCAGAAC660ACGCTCATCC CCCGCTTCAC CCAGGGCAAG AGCCCGGCCG ACCGGCGGGT CATGCTAGGC720GCCAGCCGCG ACCTCCTGCG GAGCCTCTAC GACTGCGACC TCTACTTCCT GGAGAACTAC780ATCAAGCCGC CGCCGGTCAA CCACACCACC GACAGGATCT TCCGCCGCGG GGCCAGCCGG840GTCCTCTGCT CCCGGCCTGT GTGCGACCCT CCGGGGCCAG CCGACCTGGT CCTGGAGGAG900GGGGACTGTG TGCGCAAGTG CGGGCTACTC AACCTGACCG TGGCGGCCGA GGCGTGCCGC960GAGCGCAGCC ACGTGGCCAT CAAGACGGTG CGCGTGCCCG AGGTGAACGA CCTGCGCGCC1020CTGGTGGAAG ACCCGCGATT AAACCTCAAG GTCATCCAGC TGGTCCGAGA CCCCCGCGGC1080ATTCTGGCTT CGCGCAGCGA GACCTTCCGC GACACGTACC GGCTCTGGCG GCTCTGGTAC1140GGCACCGGGA GGAAACCCTA CAACCTGGAC GTGACGCAGC TGACCACGGT GTGCGAGGAC1200TTCTCCAACT CCGTGTCCAC CGGCCTCATG CGGCCCCCGT GGCTCAAGGG CAAGTACATG1260TTGGTGCGCT ACGAGGACCT GGCTCGGAAC CCTATGAAGA AGACCGAGGA GATCTACGGG1320TTCCTGGGCA TCCCGCTGGA CAGCCACGTG GCCCGCTGGA TCCAGAACAA CACGCGGGGC1380GACCCCACCC TGGGCAAGCA CAAATACGGC ACCGTGCGAA ACTCGGCGGC CACGGCCGAG1440AAGTGGCGCT TCCGCCTCTC CTACGACATC GTGGCCTTTG CCCAGAACGC 
CTGCCAGCAG1500GTGCTGGCCC AGCTGGGCTA CAAGATCGCC GCCTCGGAGG AGGAGCTGAA GAACCCCTCG1560GTCAGCCTGG TGGAGGAGCG GGACTTCCGC CCCTTCTCGTGACCCGGGCG GTGCGGGTGG1620GGGCGGGAGG CGCAAGGTGT CGGTTTTGAT AAAATGGACC GTTTTTAACT GTTGCCTTAT1680TAACCCCTCC CTCTCCCACC TCATCTTCGT GTCCTTCCTG CCCCCAGCTC ACCCCACTCC1740CTTCTGCCCC TTTTTTGTCT CTGAAATTTG CACTACGTCT TGGACGGGAA TCACTGGGGC1800AGAGGGCGCC TGAAGTAGGG TCCCGCCCCC CCCACCCCAT TCAGACACAT GGATGTTGGG1860TCTCTGTGCG GACGGTGACA ATGTTTACAA GCACCACATT TACACATCCA CACACGCACA1920CGGGCACTCG CGAGGCGACT TCTCAAGCTT TTGAATGGGT GAGTGGTCGG GTATCTAGTT1980TTTGCACTGT CTTACTATTC AAGGTAAGAG GATACAAACA AGAGGACCAC TTGTCTCTAA2040TTTATGAATG GTGTCCATCC TTTCCCCATC CCTGCCTCCT GCCCCTGACG CCCATTTCCC2100CCCTTAGAGC AGCGAAACTG CCCCCTCCTG CCCGCCCTTG CCTGTCGGTG AGGCAGGTTT2160TTACTGTGAG GTGAACGTGG ACCTGTTTCT GTTTCCAGTC TGTGGTGATG CTGTCTGTCT2220GTCTGAGTCT CGTGGCCGCC CCTGGACCAG TGATGACTGA TGAATCTTAT GAGCTTCTGA2280TTGATCTCGG GGTCCATCTG TGATATTTCT TTGTGCCAAA AAGAAAAAAA AAGAGTGGAT2340CAGTTTGCTA AATGAACATT GAAATTGAAA TGCTTTATCT GTGTTTTCTG TAAATAAAAG2400AGTGCAATAA TCACCACG5 DNA sequenceGene name: MultimerinUnigene number: Hs.268107Probeset Accession #: U27109 Nucleic Acid Accession #: U27109.1Coding sequence: 72-3758 (predicted start/stop codons underlined)CTGCTATCAA AAAGGCCATA AGGATTTTGT CCCCAAATTT CACATGAGCT ACCTTGCTTC60AAACTACTGA GATGAAGGGG GCAAGATTAT TTGTCCTTCT TTCTAGTTTA TGGAGTGGGG120GCATTGGGCT TAACAACAGT AAGCATTCTT GGACTATACC TGAGGATGGG AACTCTCAGA180AGACTATGCC TTCTGCTTCA GTTCCTCCAA ATAAAATACA AAGTTTGCAA ATACTGCCAA240CCACTCGGGT CATGTCGGCG GAGATAGCTA CAACTCCAGA GGCAAGAACT TCTGAAGACA300GTCTTCTTAA ATCAACACTG CCTCCCTCAG AAACAAGTGC ACCTGCTGAG GGTGTGAGAA360ATCAAACTCT CACATCCACA GAGAAAGCAG AAGGAGTGGT CAAGTTACAG AATCTTACCC420TCCCAACCAA CGCTAGCATC AAGTTCAATC CTGGAGCAGA ATCAGTGGTC CTTTCCAATT480CTACACTGAA ATTTCTTCAG AGCTTTGCCA GAAAGTCAAA TGAACAAGCA ACTTCTCTAA540ACACAGTTGG AGGCACTGGA GGCATTGGAG GCGTTGGAGG CACTGGAGGC GTGGGAAATC600GAGCCCCACG GGAAACATAC CTCAGCCGGG GTGACAGCAG TTCCAGCCAA 
AGAACTGACT660ACCAAAAATC AAATTTCGAA ACAACTAGAG GAAAGAATTG GTGTGCTTAT GTACATACCA720GGTTATCTCC CACAGTGACA TTGGACAACC AGGTCACTTA TGTCCCAGGT GGGAAAGGAC780CTTGTGGCTG GACCGGTGGA TCCTGTCCTC AGAGATCTCA GAAGATATCC AATCCTGTCT840ATAGGATGCA ACATAAAATT GTCACCTCAT TGGATTGGAG GTGCTGTCCT GGATACAGTG900GGCCGAAATG TCAACTAAGA GCCCAGGAAC AGCAAAGTTT GATACACACC AACCAGGCTG960AAAGTCATAC AGCTGTTGGC AGAGGAGTAG CTGAGCAGCA GCAGCAGCAA GGCTGTGGTG1020ACCCAGAAGT GATGCAAAAA ATGACTGATC AGGTGAACTA CCAGGCAATG AAACTGACTC1080TTCTGCAGAA GAAGATTGAC AATATTTCTT TGACTGTGAA TGATGTAAGG AACACTTACT1140CCTCCCTAGA AGGAAAAGTC AGCGAAGATA AAAGCAGAGA ATTTCAATCT CTTCTAAAAG1200GTCTAAAATC CAAAAGCATT AATGTACTGA TAAGAGACAT AGTAAGAGAA CAATTTAAAA1260TTTTTCAAAA TGATGCAA GAGACTGTAG CACAGCTCTT CAAGACTGTA TCAAGTCTAT1320CAGAGGACCT CGAAAGCACC AGGCAAATAA TTCAAAAAGT TAATGAATCT GTGGTTTCAA1380TAGCAGCCCA GCAAAAGTTT GTTTTGGTGC AAGAGAATCG GCCCACTTTG ACTGATATAG1440TGGAACTAAG GAATCACATT GTGAATGTAA GGCAAGAAAT GACTCTTACA TGTGAGAAGC1500CTATTAAAGA ACTAGAAGTA AAGCAGACTC ATTTAGAAGG TGCTCTAGAA CAGGAACACT1560CAAGAAGCAT TCTGTATTAT GAATCCCTCA ATAAAACTCT TTCTAAATTG AAGGAAGTAC1620ATGAGCAGCT TTTATCAACT GAACAGGTAT CAGACCAGAA GAATGCTCCA GCTGCTGAGT1680CAGTTAGCAA TAATGTCACT GAGTACATGT CTACTTTACA TGAAAATATA AAGAAGCAGA1740GTTTGATGAT GCTGCAAATG TTTGAAGATT TGCACATTCA AGAAAGCAAG ATTAACAATC1800TCACCGTCTC TTTGGAGATG GAGAAAGAGT CTCTCAGAGG TGAATGTGAA GACATGTTAT1860CCAAATGCAG AAATGATTTT AAATTTCAAC TTAAGGACAC AGAAGAGAAT TTACATGTGT1920TAAATCAAAC ATTGGCTGAA GTTCTCTTTC CAATGGACAA TAAGATGGAC AAAATGAGTG1980AGCAACTAAA TGATTTGACT TATGATATGG AGATCCTTCA ACCCTTGCTT GAGCAGGGAG2040CATCACTCAG ACAGACAATG ACATATGAAC AACCAAAGGA AGCAATAGTG ATAAGGAAAA2100AGATAGAAAA TCTGACTAGT GCTGTCAATA GTCTAAATTT TATTATCAAA GAACTTACAA2160AAAGACACAA CTTACTTAGA AATGAAGTAC AGGGTCGTGA TGATGCCTTA GAAAGACGTA2220TCAATGAATA TGCCTTAGAA ATGGAAGATG GCCTCAATAA GACAATGACT ATTATAAATA2280ATGCTATTGA TTTCATTCAA GATAACTATG CCCTAAAAGA GACTTTAAGT ACTATTAAGG2340ATAATAGTGA GATCCATCAT AAATGTACCT CCGATATGGA AACTATTTTG 
AGATTTATTC2400CTCAGTTCCA CCGTCTGAAT GATTCTATTC AGACTTTGGT CAATGACAAT CAGAGATATA2460ACTTTGTTTT GCAAGTCGCC AAGACCCTTG CAGGTATTCC CAGAGATGAG AAACTAAATC2520AGTCCAACTT CCAAAAGATG TATCAAATGT TCAATGAAAC CACTTCCCAA GTGAGAAAAT2580ACCAGCAAAA TATGAGTCAT TTGGAAGAAA AACTACTCTT AACTACCAAG ATTTCCAAAA2640ATTTTGAGAC TCGGTTGCAA GACATTGAGT CTAAAGTTAC CCAGACGCTC ATACCTTATT2700ATATTTCAGT TAAAAAAGGC AGTGTAGTTA CAAATGAGAG AGATCAGGCT CTTCAACTGC2760AAGTATTAAA TTCCAGATTT AAGGCGTTGG AAGCAAAATC TATCCATCTT TCAATTAACT2820TCTTTTCGCT TAACAAAACT CTCCACGAAG TTTTAACAAT GTGTCACAAT GCTTCTACAA2880GTGTGTCAGA ACTGAATGCT ACCATCCCTA AGTGGATAAA ACATTCCCTG CCAGATATTC2940AACTTCTTCA GAAAGGTCTA ACAGAATTTG TGGAACCAAT AATTCAAATA AAAACTCAAG3000CTGCCCTATC TAATTCAACT TGTTGTATAG ATCGATCGTT GCCTGGTAGT CTGGCAAATG3060TTGTCAAGTC TCAGAAGCAA GTAAAATCAT TGCCAAAGAA AATTAACGCA CTTAAGAAAC3120CAACGGTAAA TCTTACCACA GTCCTGATAG GCCGGACTCA AAGAAACACG GACAACATAA3180TATATCCTGA GGAGTATTCA AGCTGTAGTC GGCATCCGTG CCAAAATGGG GGCACGTGCA3240TAAATGGAAG AACTAGCTTT ACCTGTGCCT GCAGACATCC TTTTACTGGT GACAACTGCA3300CTATCAAGCT TGTGGAAGAA AATGCTTTAG CTCCAGATTT TTCCAAAGGA TCTTACAGAT3360ATGCACCCAT GGTGGCATTT TTTGCATCTC ATACGTATGG AATGACTATA CCTGGTCCTA3420TCCTGTTTAA TAACTTGGAT GTCAATTATG GAGCTTCATA TACCCCAAGA ACTGGAAAAT3480TTAGAATTCC GTATCTTGGA GTATATGTTT TCAAGTACAC CATCGAGTCA TTTAGTGCTC3540ATATTTCTGG ATTTTTAGTG GTTGATGGAA TAGACAAGCT TGCATTTGAG TCTGAAAATA3600TTAACAGTGA AATACACTGT GATAGGGTTT TAACTGGGGA TGCCTTATTA GAATTAAATT3660ATGGGCAGGA AGTCTGGTTA CGACTTGCAA AAGGAACAAT TCCAGCCAAG TTTCCCCCTG3720TTACTACATT TAGTGGCTAT TTATTATATC GTACATAAGT TAGTATGAAA AACAGACTAT3780CACCTTTATT GAGAAACAGC CAGTGTTTTC ATTTATCTTT GCTTGCACAT CTGCTCTGTT3840TTGGTTTTTC TACAGGAAAT GAAAATCAAC TTGTTTTTTT AATATGAGTA AACTTGTATG3900TCTATTTTAT AAAATTATTT GAATATTGTT TAATGTCTGA ATATGAAAGA GTTCTTGATC3960CTAAAGAAAT TTAGTGGCAC AGAAAACAAA GTGAATTTGT TAGCATAATT ATTCCTATTC4020TTATTTCTTC ATTTTAAGTC ATTGCAATGG AAAGTAATAT TATAAAACGG TAATTACAAC4080ATATTATCAG TCACAGTTTT CTTTCCAATT AAACACTTAA 
CTTTTGTTAT TCCCTGTATA4140TAAATATATA ACACACATTT TCTAGATTCA CAAATTTAAA TAAATTACTC AAAAAATGACC6 DNA sequenceGene name: Homo sapiens cDNA FLJ11502 fis, clone HEMBA1002102, weakly similar toANKRYINUnigene number: Hs.213194Probeset Accession #: AA187101Nucleic Acid Accession #: AK021564Coding sequence: 1-450 (predicted stop codon underlined, 5′end sequence is open)GTCGCCGCGC GGCCGCCGGT GAGCCGCATG GAGCCCCGGG CGGCGGACGG CTGCTTCCTG60GGCGACGTGG GTTTCTGGGT GGAGCGGACC CCTGTGCACG AGGCAGCCCA GCGGGGTGAG120AGCCTGCAGC TGCAACAGCT GATCGAGAGC GGCGCCTGCG TGAACCAGGT CACCGTGGAC180TCCATCACGC CCCTGCACGC AGCCAGTCTG CAGGGCCAGG CGCGGTGTGT GCAGCTGCTG240CTGGCGGCTG GGGCCCAGGT GGATGCTCGC AACATCGACG GCAGCACCCC GCTCTGCGAT300GCCTGCGCCT CGGGCAGCAT CGAGTGTGTG AAGCTCTTGC TGTCCTACGG GGCCAAGGTC360AACCCTCCCC TGTACACAGC GTCCCCCCTG CACGAGGCCA GCTTTCCCCG CCTCCTGAGC420ACCCTGGCTT CGACGCCCTG GATCAACTGA GCCAGGTGGA ACTCCTGGGG GACATGGATC480GCAATGAATT CGACCAGTAT TTGAACACTC CTGGCCACCC AGACTCCGCC ACAGGGGCCA540TGGCCCTCAG TGGGCATGTT CCGGTCTCCC AGGTGACACC AACGGGTCCC ACAGAGACCA600GCCTCATCTC CGTCCTGGCT GATGCCACGG CCACGTACTA CAACAGCTAC AGTGTGTCAT660AGAGCTGGAG GCGCCCCGTC CGGTCAGCCC TCGCGCCCTC TCCTTCTTGT GCCTTGAGTG720GCAGAGGAGC CGTCCAGCCA CACCAGCTTT CCTCCCACCG CTCAGGGCAG GGAGGTCTGA780ACTGCGGCCC CAGAGCCTTT GGCCTAAGCT GGACTCTCCT TATCCGAGTG CCGCCTCTAT840CCCCTTCCCC ACGTTCCAGC CCCTGCAGCC CACATTTTAA GTATATTCCT TCAAGTGAGT900TTTCCTCCAG CCCCTGAGAG TTGCTGTCTC CCAGTGGAAT GTTCACTGAC GTCTTTTCTT960GGTAGCCATC ATCGAAACTA ATGGGGGGAC AGACTTGATA GCCAAGGTCC CTTCTGGTCC1020AGTTTTCTGA TTTAGGGTTC TCTCAAGATT AATAAAGGAA GATGGGGAAA TTTGACTCAT1080TAATGAGCTC GCTAACCTAC GATCTGGTGA TAATTTTGTG TGCACAGCCC AAGGACCACG1140AGGCTTTCTG CACTTTCTGC ACCCCCTTCC AAAGTGACCA CAAAATTTCA AAGGGACTCA1200TACAATTTGA GAAAAAACAG TCAACCTGAT TTGAGAAATT AACCAGTATG GCTAACTATA1260TCACAGAAAA TGGGATTGAG TTAAAACTAT TTTATTTTAA ATATACATTT TAAAGCAGTT1320CTTTTTTTTT TGTTAATTTG TTTATTATAC ACACACTTCA AGAGAATATG CACAGTCTAG1380GCCGGGCACG GTGGCTCACG CCTGTAATCC CAGCACTTTG GGAGGCCGAG 
GCATGTGGAT1440CACCTGAGGT CAGGAGTTTG AGACCAGCCT AGACAACATG GTGAAACCTT GTCTCTATGA1500AAAATACAAA ATTTGCTGGG AGTGGTGGTG CATGCCTGTA ATCCCAGCTA CTTGGAAGGC1560TGAGGCAGGA GAATGTCTTG AACCTAGGAG GTGGAGGTTG CAGTGAGCTG AGATTGCACC1620ATTGCACTCC AGCCTGTGCA ACAAGAGTGA AACTCCATTT CAAGACC7 DNA sequenceGene name: Human RAL A geneUnigene number: Hs.6906Probeset Accession #: AA083572Nucleic Acid Accession #: contig of X15014.1 and AK026850Coding sequence: 1-621 (predicted start/stop codons underlined)ATGGCTGCAA ATAAGCCCAA GGGTCAGAAT TCTTTGGCTT TACACAAAGT CATCATGGTG60GGCAGTGGTG GCGTGGGCAA GTCAGCTCTG ACTCTACAGT TCATGTACGA TGAGTTTGTG120GAGGACTATG AGCCTACCAA AGCAGACAGC TATCGGAAGA AGGTAGTGCT AGATGGGGAG180GAAGTCGAGA TCGATATCTT AGATACAGCT GGGCAGGAGG ACTACGCTGC AATTAGAGAC240AACTACTTCC GAAGTGGGGA GGGGTTCCTC TGTGTTTTCT CTATTACAGA AATGGAATCC300TTTGCAGCTA CAGCTGACTT CAGGGAGCAG ATTTTAAGAG TAAAAGAAGA TGAGAATGTT360CCATTTCTAC TGGTTGGTAA CAAATCAGAT TTAGAAGATA AAAGACAGGT TTCTGTAGAA420GAGGCAAAAA ACAGAGCTGA GCAGTGGAAT GTTAACTACG TGGAAACATC TGCTAAAACA480CGAGCTAATG TTGACAAGGT ATTTTTTGAT TTAATGAGAG AAATTCGAGC GAGAAAGATG540GAAGACAGCA AAGAAAAGAA TGGAAAAAAG AAGAGGAAAA GTTTAGCCAA GAGAATCAGA600GAAAGATGCT GCATTTTATAATCAAAGCCC AAACTCCTTT CTTATCTTGA CCATACTAAT660AAATATAATT TATAAGCATT GCCATTGAAG GCTTAATTGA CTGAAATTAC TTTAACATTT720TGGAAATTGT TGTATATCAC TAAAAGCATG AATTGGAACT GCAATGAAAG TCAAATTTAC780TTTAAAAAGA AATTAATATG GCTTCACCAA GAAGCAAAGT TCAACTTATT TCATAATTGC840CTACATTTAT CATGGTCCTG AATGTAGCGT GTAAGCTTGT GTTTCTTGGG CAGTCTTTCT900TGAAATTGAA GAGGTGAAAT GGGGGTGGGG AGTGGGAGGA AAGGTGACTT CCTCTGGTGT960TTATTATAAA GCTTAAATTT TATATCATTT TAAAATGTCT TGGTCTTCTA CTGCCTTGAA1020AAATGACAAT TGTGAACATG ATAGTTAAAC TACCACTTTT TTTAACCATT ATTATGCAAA1080ATTTAGAAGA AAAGTTATTG GCATGGTTGT TGCATATAGT TAAACTGAGA GTAATTCATC1140TGTGAATCTG CTTTAATTAC CTGGTGAGTA ACTTAGAAAA GTGGTGTAAA CTTGTACATG1200GAATTTTTTG AATATGCCTT AATTTAGAAA CTGAAAAATA TCCGGTTATA TGATTCTGGG1260TGTGTTCTTA CTGACACCAG GGGTCCGCTG CCCCATGTGT CCTGGTGAGA AAATATATGC1320CTGGCACAGC 
TTTTGTATAG AAAATTCTTG AGAAGTAACT GTCCGCTAGA AGTCTGTCCA1380AATTTAAAAT GTGTGCCATA TTCTGGTTCT TGAAAATAAG ATTCCAGAGC TCTTTGATCG1440CTTTTAATAA ACTGCAAGTT CATTTTAATT GAAGGGCCAG CATATATACT TGCAAGATAA1500TTTTCAGCTG CAAGGATTCA GCACCAGTTA TGTTTGAATG AACCCTCCTT TTCTCTGAGA1560TTCTGGTCCC TGGAAATCCC TTTCTGCTAG TGGTGAGCAT GTAAGTGTTA AGTTTTTAAT1620CTGGGAGCAG GGCATAGGAA GAAAATGTCA GTAGTGCTAA TGCATTTTGC ACTAGAACGC1680TTCGGGAAAA TATTCATGCT TGCCATCTGT TCATTTCTAA ATTTATATTC ATAAAGTTAC1740AGTTTGATAC AGGAATTATT AGGAGTAATT CTTTTCTGTT TCTGTTTATA ATGAAGAACA1800CTGTAGCTAC ATTTTCAGAA GTTAACATCA AGCCATCAAA CCTGGGTATA GTGCAGAAGA1860CGTGGCACAC ACTGACCACA CATTAGGCTG TGTCACCATT GTGTGGTGTA CCTGCTGGAA1920GAATTCTAGC ATGCTACTTG GGGACATAAT TTCAGTGGGA AATATGCCAC TGACCGATTT1980TTTTTTTTTT CCTCTTTGCA GTGGGGCTAG GACAGTTGAT TCAACAAAGT ATTTTTTTCT2040TTTTTCTCAG TCCTAATTTG GACAGGTCAA AGATGTGTTC AGGCATTCCA GGTAACAGGT2100GTGTATGTAA AGTTAAAAAT AGGCTTTTTA GGAACTCACT CTTTAGATAT TTACATCCAG2160CTTCTCATGT TAAATATTTG TCCTTAAAGG GTTTGAGATG TACATCTTTC ATTTCGTATT2220TCTCATAGGC TATGCCATGT GCGGAATTCA AGTTACCAAT GTAACACTGG CCAGCGGGCC2280CAGCAATCTC CATGTGTACT TATTACAGTC TTATTTAACC AGGGGTCCTA ACCACTAACA2340TTGTGACTTT GCTTTGAGAC CTTTCCTCTC CTGGGTACTG AGGTGCTATG AAGCCAACTG2400ACAAAGATGC ATCACGTGTC TTAGGCTGAT GCCACTACCC GATTTGTTTA TTTGCATTT2460GAGCCATTTA AAGACCAATA AACTTCCTTT TTTAAAAAAA AAAAAAAAAA AAAAAAAAAA2520AACC9 DNA sequenceGene name: KIAA0955 proteinUnigene number: Hs.10031Probeset Accession #: AA027168Nucleic Acid Accession #: AB023172Coding sequence: 314-1609 (predicted start/stop codons underlined)CTGGTTCTCA ACTTCTTTTG AAATAATGTT CATAGAGAAG GAGGGCTGTC TGAGATTCGA60GGGAAACAAG CTCTCAGGAC TTCCGGTCGC CATGATGGCT GTGGGCGGTA AACGCGGTTA120GTGCAAGCAT CTGGGCCATC TTCAATGGTA AAAAAGATAC AGTAAAGACA TAAATACCAC180ATTTGACAAA TGGAAAAAAA GGAGTGTCCA GAAAAGAGTA GCAGCAGTGA GGAAGAGCTG240CCGAGACGGG TATACAGGGA GCTACCCTGT GTTTCTGAGA CCCTTTGTGA CATCTCACAT300TTTTTCCAAG AAGATGATGA GACAGAGGCA GAGCCATTAT TGTTCCGTGC TGTTCCTGAG360TGTCAACTAT CTGGGGGGGA 
CATTCCCAGG AGACATTTGC TCAGAAGAGA ATCAAATAGT420TTCCTCTTAT GCTTCTAAAG TCTGTTTTGA GATCGAAGAA GATTATAAAA ATCGTCAGTT480TCTGGGGCCT GAAGGAAATG TGGATGTTGA GTTGATTGAT AAGAGCACAA ACAGATACAG540CGTTTGGTTC CCCACTGCTG GCTGGTATCT GTGGTCAGCC ACAGGCCTCG GCTTCCTGGT600AAGGGATGAG GTCACAGTGA CGATTGCGTT TGGTTCCTGG AGTCAGCACC TGGCCCTGGA660CCTGCAGCAC CATGAACAGT GGCTGGTGGG CGGCCCCTTG TTTGATGTCA CTGCAGAGCC720AGAGGAGGCT GTCGCCGAAA TCCACCTCCC CCACTTCATC TCCCTCCAAG GTGAGGTGGA780CGTCTCCTGG TTTCTCGTTG CCCATTTTAA GAATGAAGGG ATGGTCCTGG AGCATCCAGC840CCGGGTGGAG CCTTTCTATG CTGTCCTGGA AAGCCCCAGC TTCTCTCTGA TGGGCATCCT900GCTGCGGATC GCCAGTGGGA CTCGCCTCTC CATCCCCATC ACTTCCAACA CATTGATCTA960TTATCACCCC CACCCCGAAG ATATTAAGTT CCACTTGTAC CTTGTCCCCA GCGACGCCTT1020GCTAACAAAG GCGATAGATG ATGAGGAAGA TCGCTTCCAT GGTGTGCGCC TGCAGACTTC1080GCCCCCAATG GAACCCCTGA ACTTTGGTTC CAGTTATATT GTGTCTAATT CTGCTAACCT1140GAAAGTAATG CCCAAGGAGT TGAAATTGTC CTACAGGAGC CCTGGAGAAA TTCAGCACTT1200CTCAAAATTC TATGCTGGGC AGATGAAGGA ACCCATTCAA CTTGAGATTA CTGAAAAAAG1260ACATGGGACT TTGGTGTGGG ATACTGAGGT GAAGCCAGTG GATCTCCAGC TTGTAGCTGC1320ATCAGCCCCT CCTCCTTTCT CAGGTGCAGC CTTTGTGAAG GAGAACCACC GGCAACTCCA1380AGCCAGGATG GGGGACCTGA AAGGGGTGCT CGATGATCTC CAGGACAATG AGGTTCTTAC1440TGAGAATGAG AAGGAGCTGG TGGAGCAGGA AAAGACACGG CAGAGCAAGA ATGAGGCCTT1500GCTGAGCATG GTGGAGAAGA AAGGGGACCT GGCCCTGGAC GTGCTCTTCA GAAGCATTAG1560TGAAAGGGAC CCTTACCTCG TGTCCTATCT TAGACAGCAG AATTTGTAAA ATGAGTCAGT1620TAGGTAGTCT GGAAGAGAGA ATCCAGCGTT CTCATTGGAA ATGGATAAAC AGAAATGTGA1680TCATTGATTT CAGTGTTCAA GACAGAAGAA GACTGGGTAA CATCTATCAC ACAGGCTTTC1740AGGACAGACT TGTAACCTGG CATGTACCTA TTGACTGTAT CCTCATGCAT TTTCCTCAAG1800AATGTCTGAA GAAGGTAGTA ATATTCCTTT TAAATTTTTT CCAACCATTG CTTGATATAT1860CACTATTTTA TCCATTGACA TGATTCTTGA AGACCCAGGA TAAAGGACAT CCGGATAGGT1920GTGTTTATGA AGGATGGGGC CTGGAAAGGC AACTTTTCCT GATTAATGTG AAAAATAATT1980CCTATGGACA CTCCGTTTGA AGTATCACCT TCTCATAACT AAAAGCAGAA AAGCTAACAA2040AAGCTTCTCA GCTGAGGACA CTCAAGGCAT ACATGATGAC AGTCTTTTTT TTTTTTGTAT2100GTTAGGACTT TAACACTTTA 
TCTATGGCTA CTGTTATTAG AACAATGTAA ATGTATTTGC2160TGAAAGAGAG CACAAAAATG GGAGAAAATG CAAACATGAG CAGAAAATAT TTTCCCACTG2220GTGTGTAGCC TGCTACAAGG AGTTGTTGGG TTAAATGTTC ATGGTCAACT CCAAGGAATA2280CTGAGATGAA ATGTGGTAAA TCAACTCCAC AGAACCACCA AAAAGAAAAT GAGGGTAATT2340CAGCTTATTC TGAGACAGAC ATTCCTGGCA ATGTACCATA CAAAAAATAA GCCAACTCTG2400ACATTTGGAT TCTACCATAG ACTCTGTCAT TTTGTAGCCA TTTCAGCTGT CTTTTGATTA2460ATGTTTTCGT GGCACACATA TTTCCATCCT TTTATGTTTA ATCTGTTTAA AACAAGTTCC2520TAGTAGACAC CATCTGGTTG AGTCAGTTTT TTTTATGGTG TATTTTGAAC CCATTCTGAT2580AGTCTCTTTT AACTGGAAGA TTTCAATTAC TTACGTTAAT GTAATTATTA ATATGTTAGG2640ATTTATCCTC AGTCAGCCAG TTTGTTATGT CTTTTCTATT CTACTGTTAT CACATTTGTA2700CCACTTAAAG TGGAATCTAG GCACTTTATC ACCATTTAGA TCCTATTACC TTTTCTCATC2760TAGGATATAG TTATCTTCTA CATAATCTTT CTGTATCTTA AAACCCATCA ATAAATTATT2820ATATATTTTC TACTTTTAAT CACTCAGAAG ATTTAAAAAA CTCATGAGAA GAGTAATCTG2880TTATGTTTTT CCAGATATTT ACCATTTCTG TTGCTCTTCC TTCATTATTT TCCAAATTTC2940GTTCTGCAAA TTTCCACTTC TTCTGATAGA CGTTTTTTAG TTCTTTTAGA GTGGTTCTGA3000TAGGTACAGA TTCTCTTATT TTTTGCTTCC TCTGAGGACA TCTTTTTCTC ACCTTCATTC3060TCAGTGATGT TTTTTGCTTG TAGTATTTTT AGTTGACATT GTTTTCTGTT CAGCAGTTTC3120CTTTTAGCTT CCGTATTTCC TGATGAGAAA TCTGCAGTCA TTCAAATTGT TGTTTCCCTG3180TATGTAGTGT GTCATTTTTC TGTCAGATTT CAAGGTATTT ATCTTTAGTT TTTAGCCATT3240TCATTATGTT GGGGATGAGT TTCCTTGTTT TATTCCCTTT GGAATTTGCT CCAATTCATA3300AATTTGCAGT TTTATGTCTT TTACCAAACT TAGAGGTTTT CAGCCTAATT TCTAAAAATA3360CTTTTATTA GCCTGATTTT CATCTTTATA GGAAATAGTT TAAGTGATGA CAAGTTCCAA3420TAGCTTATAT GCCCAGAAGG CCTTCAAAAT AAGAATTTTG AAAGAATACA GAAAACAAAC3480TTTTATATCC TTCTCATGTC TTCTACTGTA AAATTCATAT GCTTTGCTAC TCTAAACCTA3540GTTTGAAATC AACAGTCTTG AGAATAGATG AAAATTTTGA TGAATAGTGG AATTCTTTTA3600AATGGAAACC TCTTACATGT GATTTTCCTT GCCATCTAGA AATAAACCAT AGTATTTATG3660TTGAATCAAT CAATATTATA TTTTGTTTTT TTCCTCCTCT TCTGAGACTC TTATTGTGGA3720AATGTTAGAC TTTTATGTTT TCCTAAATGT CCCTGATATT CTACTTATTT AGAACATCTT3780TTCATTTTTT CCATTATTCT GATTGGGTAA TTTTAATTTG TCTATTTTCA AATTTGCTGG3840AGTGTTCACC TGTTGTTGTC 
TGTGTCGTCC CACTGAGTGC ATTCACCACC TTTTAAATTT3900TGGTCACTGT ATGTATCAGT TCTAAAATTT CCATTTTGTT CTCTATATTT TAAATTTCTT3960GGCTTATATT CTATTTTCCT GCAAATGTGT CAGCATTTGC TTGTTTGAGC TTTTTTTTTT4020TCAAGACAGG GTCTCAACTC TGTTACCCAG GCTGGAGTGC AGTGGTGCGA TCTCAGCTCA4080CTGCAACCTC TGCCTCCTGG TTCAAGCGAT TATTGTGCCT CAGCCTCCTG AGTAGCTGGG4140ATTACAGGCA TGCACCACCA CAGCCCAGCT AATTTTTTGT ATTTTTAGTA GAGACAGAGT4200TTTGCTATGT TGGCCAGGCT GGTTTTGAAC TCCTGGCCTC AAGTGATCCA CCGACCTCAG4260CCTCCCAAAG TGCTGGGATT ACAGGCCACT ACACCTGGCA CATTTGAGTA TTTTTTTTTT4320TTTTTTTTTT TTGAGATGGA GTCTCGCTCT GTCATCTAGG CTGGAGTGCA GTGGTGTGAT4380CTCAGCTCAC TGCAGCCTCT GTCTCCCGGG CTCAAGCGAT TCTCTTGCCT CAGCCTCCTG4440AGTAGCTAGG ACTACAGGTG CATGCCAACA CGCCCGGCTA ATTTTTTTAA AAAATATTTT4500TAGTAGAGAC AGGGTTTCAC CATTTTGGCC AGGATGGTCT CGATCTCCTG ACCTCATGAT4560CCACCCGCCT CGGCCTTCCA AAGTGCTGGG ATTACAGGCA TGAGCCACCG TGCCTGGCCT4620CATTTGAGTA TTTTTATAAT GTCTCTTTTA AAGTCTTTGT CAGATAATTC CACTGTACAT4680GTTATTCAGT GTTTGGTGTC CACTGAGTTG TCATTTGCCA GACAAGTGGA GATTTTTGCA4740GCTCATCCTT GTATTCTCAG TAGTTCCGAT ATGTACCCTC GACATGTGAA TGTTATCTTA4800TGAGACTCTG TTTTATTTGT ATCCAACAGA AGATGTTTAT TATTTATTTG GCTTTCTGTG4860AACTGAGGTC TTAATATCAG CTCATTTTAA AAGTCTTTGC AGTGGTATTC GGATCTATCC4920TGTGTGTGCC TATGAGATTG GGTGCAGTGT ATCCTGTTAG CTCCATTCTC AGGGCGTTTG4980AATGTGAATT AGGACCAGCG CAATGAATGC TCAAGTTGGG GTTGGGCGTT AGAATTCATA5040AAAGTCTTTA TATGCTCAGACF6 DNA sequenceGene name: Homo sapiens cDNA FLJ10669 fis, clone NT2RP2006275, weakly similar toMicrotubule-associated protein 1B [CONTAINS: LIGHT CHAIN LC1]Unigene number: Hs.66048Probeset Accession #: AA609717 Nucleic Acid Accession #: AK001531Coding sequence: 176-2194 (predicted start/stop codons underlined)CATCTCCCCC AACCTGGGGG TCGTGTTCTT CAACGCCTGC GAGGCCGCGT CGCGGCTGGC60GCGCGGCGAG GATGAGGCGG AGCTGGCGCT GAGCCTCCTG GCGCAGCTGG GCATCACGCC120TCTGCCACTC AGCCGCGGCC CCGTGCCAGC CAAACCCACC GTGCTCTTCG AGAAGATGGG180CGTGGGCCGG CTGGACATGT ATGTGCTGCA CCCGCCCTCC GCCGGCGCCG AGCGCACGCT240GGCCTCTGTG TGCGCCCTGC TGGTGTGGCA 
CCCCGCCGGC CCCGGCGAGA AGGTGGTGCG300CGTGCTGTTC CCCGGTTGCA CCCCGCCCGC CTGCCTCCTG GACGGCCTGG TCCGCCTGCA360GCACTTGAGG TTCCTGCGAG AGCCCGTGGT GACGCCCCAG GACCTGGAGG GGCCGGGGCG420AGCCGAGAGC AAAGAGAGCG TGGGCTCCCG GGACAGCTCG AAGAGAGAGG GCCTCCTGGC480CACCCACCCT AGACCTGGCC AGGAGCGCCC TGGGGTGGCC CGCAAGGAGC CAGCACGGGC540TGAGGCCCCA CGCAAGACTG AGAAAGAAGC CAAGACCCCC CGGGAGTTGA AGAAAGACCC600CAAACCGAGT GTCTCCCGGA CCCAGCCGCG GGAGGTGCGC CGGGCAGCCT CTTCTGTGCC660CAACCTCAAG AAGACGAATG CCCAGGCGGC ACCCAAGCCC CGCAAAGCGC CCAGCACGTC720CCACTCTGGC TTCCCGCCGG TGGCAAATGG ACCCCGCAGC CCGCCCAGCC TCCGATGTGG780AGAAGCCAGC CCCCCCAGTG CAGCCTGCGG CTCTCCGGCC TCCCAGCTGG TGGCCACGCC840CAGCCTGGAG CTGGGGCCGA TCCCAGCCGG GGAGGAGAAG GCACTGGAGC TGCCTTTGGC900CGCCAGCTCA ATCCCAAGGC CACGCACACC CTCCCCTGAG TCCCACCGGA GCCCCGCAGA960GGGCAGCGAG CGGCTGTCGC TGAGCCCACT GCGGGGCGGG GAGGCCGGGC CAGACGCCTC1020ACCCACAGTG ACCACACCCA CGGTGACCAC GCCCTCACTA CCCGCAGAGG TGGGCTCCCC1080GCACTCGACC GAGGTGGACG AGTCCCTGTC GGTGTCCTTT GAGCAGGTGC TGCCGCCATC1140CGCCCCCACC AGTGAGGCTG GGCTGAGCCT CCCGCTGCGT GGCCCCCGGG CGCGGCGCTC1200GGCTTCCCCA CACGATGTGG ACCTGTGCCT GGTGTCACCC TGTGAATTTG AGCATCGCAA1260GGCGGTGCCA ATGGCACCGG CACCTGCGTC CCCCGGCAGC TCGAATGACA GCAGTGCCCG1320GTCACAGGAA CGGGCAGGTG GGCTGGGGGC CGAGGAGACG CCACCCACAT CGGTCAGCGA1380GTCCCTGCCC ACCCTGTCTG ACTCGGATCC CGTGCCCCTG GCCCCCGGTG CGGCAGACTC1440AGACGAAGAC ACAGAGGGCT TTGGAGTCCC TCGCCACGAC CCTTTGCCTG ACCCCCTCAA1500GGTCCCCCCA CCACTGCCTG ACCCATCCAG CATCTGCATG GTGGACCCCG AGATGCTGCC1560CCCCAAGACA GCACGGCAAA CGGAGAACGT CAGCCGCACC CGGAAGCCCC TGGCCCGCCC1620CAACTCACGC GCTGCCGCCC CCAAAGCCAC TCCAGTGGCT GCTGCCAAAA CCAAGGGGCT1680TGCTGGTGGG GACCGTGCCA GCCCACCACT CAGTGCCCGG AGTGAGCCCA GTGAGAAGGG1740AGGCCGGGCA CCCCTGTCCA GAAAGTCCTC AACCCCCAAG ACTGCCACTC GAGGCCCGTC1800GGGGTCAGCC AGCAGCCGGC CCGGGGTGTC AGCCACCCCA CCCAAGTCCC CGGTCTACCT1860GGACCTGGCC TACCTGCCCA GCGGGAGCAG CGCCCACCTG GTGGATGAGG AGTTCTTCCA1920GCGCGTGCGC GCGCTCTGCT ACGTCATCAG TGGCCAGGAC CAGCGCAAGG AGGAAGGCAT1980GCGGGCCGTC CTGGACGCGC TACTGGCCAG CAAGCAGCAT 
TGGGACCGTG ACCTGCAGGT2040GACCCTGATC CCCACTTTCG ACTCGGTGGC CATGCATACG TGGTACGCAG AGACGCACGC2100CCGGCACCAG GCGCTGGGCA TCACGGTGTT GGGCAGCAAC GGCATGGTGT CCATGCAGGA2160TGACGCCTTC CCGGCCTGCA AGGTGGAGTT CTAGCCCCAT CGCCGACACG CCCCCCACTC2220AGCCCAGCCC GCCTGTCCCT AGATTCAGCC ACATCAGAAA TAAACTGTGA CTACACTTG
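
The nucleotide listings above run each row of the sequence together with its trailing position count (for example, `...GCATCACGCC120TCTGCCACTC...`). The following is a minimal sketch, assuming only that digits and spaces are layout and that the letters A, C, G and T are sequence. It strips the counters and checks that an annotated coding-sequence range covers a whole number of codons. The helper names and the toy fragment are hypothetical illustrations and are not part of the disclosure.

```python
import re


def strip_position_markers(listing: str) -> str:
    """Keep only nucleotide letters; drop counters such as '60' and spacing.

    Assumption: the input is the sequence body only (no header text) and
    contains no ambiguity codes such as N, which this filter would discard.
    """
    return re.sub(r"[^ACGT]", "", listing.upper())


def coding_region(seq: str, start: int, stop: int) -> str:
    """Return positions start..stop of seq, 1-based and inclusive."""
    if not 1 <= start <= stop <= len(seq):
        raise ValueError("coordinates fall outside the sequence")
    return seq[start - 1:stop]


def is_whole_codons(start: int, stop: int) -> bool:
    """True when the inclusive range start..stop spans complete codons."""
    return (stop - start + 1) % 3 == 0


# Hypothetical toy fragment laid out like the listing: blocks of ten
# letters, with the running position count fused onto the end of the row.
toy_listing = "CCATGGCTTA AGG13"
toy_seq = strip_position_markers(toy_listing)
toy_cds = coding_region(toy_seq, 3, 11)

print(toy_seq)
print(toy_cds)
# Coding-sequence ranges as annotated in the listing above.
print(is_whole_codons(314, 1609), is_whole_codons(176, 2194))
```

The coordinates are treated as 1-based and inclusive, which matches the annotated ranges 314-1609 and 176-2194: both span a multiple of three nucleotides.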


[0328]







TABLE 2










AAA4 Protein sequence:



Gene name: CGI-100 protein


Unigene number: Hs.275253


Probeset Accession #: AA089688


Protein Accession #: NP_057124


Signal sequence: predicted 1-23 (first underlined sequence)


Transmembrane Domain: predicted 201-217 (second underlined sequence)


emp24/gp25L/p24 domain: predicted 13-227


Summary: gp25L/emp24/p24 protein family members of the cis-Golgi network


bind both COP I and II coatomer. Members of this family are implicated


in bringing cargo forward from the ER and binding to coat proteins by


their cytoplasmic domains.














MGDKIWLPFP VLLLAALPPV LLPGAAGFTP SLDSDFTFTL PAGQKECFYQ PMPLKASLEI

60






EYQVLDGAGL DIDFHLASPE GKTLVFEQRK SDGVHTVETE VGDYMFCFDN TFSTISEKVI
120





FFELILDNMG EQAQEQEDWK KYITGTDILD MKLEDILESI NSIKSRLSKS GHIQTLLRAF
180





EARDRNIQES NFDRVNFWSM VNLVVMVVVS AIQVYMLKSL FEDKRKSRT
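
Feature coordinates in these entries (for example, the signal sequence at 1-23 and the transmembrane domain at 201-217 of AAA4) read as 1-based, inclusive residue positions. As a minimal sketch under that assumption, the following Python recovers the annotated segments from the AAA4 listing; the function names are hypothetical and serve only to illustrate the coordinate convention.

```python
# AAA4 residues as listed above, with the block spacing kept for readability.
AAA4_LISTING = """
MGDKIWLPFP VLLLAALPPV LLPGAAGFTP SLDSDFTFTL PAGQKECFYQ PMPLKASLEI
EYQVLDGAGL DIDFHLASPE GKTLVFEQRK SDGVHTVETE VGDYMFCFDN TFSTISEKVI
FFELILDNMG EQAQEQEDWK KYITGTDILD MKLEDILESI NSIKSRLSKS GHIQTLLRAF
EARDRNIQES NFDRVNFWSM VNLVVMVVVS AIQVYMLKSL FEDKRKSRT
"""


def clean_residues(listing: str) -> str:
    """Drop spacing, line breaks and position counters; keep letters only."""
    return "".join(ch for ch in listing if ch.isalpha())


def feature(seq: str, start: int, stop: int) -> str:
    """Return residues start..stop, 1-based and inclusive."""
    if not 1 <= start <= stop <= len(seq):
        raise ValueError("coordinates fall outside the sequence")
    return seq[start - 1:stop]


aaa4 = clean_residues(AAA4_LISTING)
signal_sequence = feature(aaa4, 1, 23)        # predicted signal sequence
transmembrane = feature(aaa4, 201, 217)       # predicted transmembrane domain

print(len(aaa4))
print(signal_sequence)
print(transmembrane)
```

The same two helpers apply to any entry in the table once its residues are copied in, since every feature line uses the same start-stop notation.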











AAA7 Protein sequence:



Gene name: Endothelial differentiation, sphingolipid G-protein-coupled


receptor, 1


(EDG1)


Unigene number: Hs.154218


Probeset Accession #: M31210


Protein Accession #: NP_001391


7 Transmembrane Domains: predicted 50-71, 92-110, 122-140, 160-177,


201-222, 251-269, 281-301 (underlined sequences)


Summary: Endothelial differentiation, sphingolipid G-protein-coupled


receptor, 1 may regulate the differentiation of endothelial cells. It


binds the sphingolipid metabolite, sphingosine-1-phosphate, which may


function as a second messenger in cell proliferation and survival.












MGPTSVPLVK AHRSSVSDYV NYDIIVRHYN YTGKLNISAD KENSIKLTSV VFILICCFII
60








LENIFVLLTI WKTKKFHRPM YYFIGNLALS DLLAGVAYTA NLLLSGATTY KLTPAQWFLR

120





EGSMFVALSA SVFSLLAIAI ERYITMLKMK LHNGSNNFRL FLLISACWVI SLILGGLPIM
180





GWNCISALSS CSTVLPLYHK HYILFCTTVF TLLLLSIVIL YCRIYSLVRT RSRRLTFRKN
240





ISKASRSSEN VALLKTVIIV LSVFIACWAP LFILLLLDVG CKVKTCDILF RAEYFLVLAV
300







LNSGTNPIIY TLTNKEMRRA FIRIMSCCKC PSGDSAGKFK RPIIAGMEFS RSKSDNSSHP

360





QKDEGDNPET IMSSGNYNSS S











AAB3 Protein sequence:



Gene name: Solute carrier family 20 (phosphate transporter), member 1,


Human leukaemia virus receptor 1 (GLVR1)


Unigene number: Hs.78452


Probeset Accession #: L20859


Protein Accession #: NP_005406


Transmembrane domains: predicted 24-40, 62-78, 164-180, 198-214, 232-248,


513-529, 562-578, 604-620, 655-671


Cellular Localization: Likely a Type IIIa membrane protein (Ncyt Cexo)












MATLITSTTA ATAASGPLVD YLWMLILGFI IAFVLAFSVG ANDVANSFGT AVGSGVVTLK
60






QACILASIFE TVGSVLLGAK VSETIRKGLI DVEMYNSTQG LLMAGSVSAM FGSAVWQLVA
120





SFLKLPISGT HCIVGATIGF SLVAKGQEGV KWSELIKIVM SWFVSPLLSG IMSGILFFLV
180





RAFILHKADP VPNGLRALPV FYACTVGINL FSIMYTGAPL LGFDKLPLWG TILISVGCAV
240







FCALIVWFFV CPRMKRKIER EIKCSPSESP LMEKKNSLKE DHEETKLSVG DIENKHPVSE

300





VGPATVPLQA VVEERTVSFK LGDLEEAPER ERLPSVDLKE ETSIDSTVNG AVQLPNGNLV
360





QFSQAVSNQI NSSGHSQYHT VHKDSGLYKE LLHKLHLAKV GMGDSGDK PLRRNNSYTS
420





YTMAICGMPL DSFRAKEGEQ KGEEMEKLTW PNADSKKRIR MDYTSYCNA VSDLHSASEI
480





DMSVKAAMGL GDRKGSNGSL EEWYDQDKPE VSLLFQFLQI LTACFGSFAH GGNDVSNAIG
540





PLVALYLVYD TGDVSSKVAT PIWLLLYGGV GICVGLWVWG RRVIQTMGKD LTPITPSSGF
600





SIELASALTV VIASNIGLPI STTHCKVGSV VSVGWLRSKK AVDWRLFRNI FMAWFVTVPI
660







SGVISAAIMA IFRYVILRM












AAB4 Protein sequence:



Gene name: Matrix metalloproteinase 10 (stromelysin 2)


Unigene number: Hs.2258


Probeset Accession #: X07820


Protein Accession #: NP_002416


Signal sequence: predicted 1-17 (underlined sequence)


Cellular Localization: predicted secreted














MMHLAFLVLL CLPVCSAYPL SGAAKEEDSN KDLAQQYLEK YYNLEKDVKQ FRRKDSNLIV

60






KKIQGMQKFL GLEVTGKLDT DTLEVMRKPR CGVPDVGHFS SFPGMPKWRK THLTYRIVNY
120





TPDLPRDAVD SAIEKALKVW EEVTPLTFSR LYEGEADIMI SFAVKEHGDF YSFDGPGHSL
180





AHAYPPGPGL YGDIHFDDDE KWTEDASGTN LFLVAAHELG HSLGLFHSAN TEALMYPLYN
240





SFTELAQFRL SQDDVNGIQS LYGPPPASTE EPLVPTKSVP SGSEMPAKCD PALSFDAIST
300





LRGEYLFFKD RYFWRRSHWN PEPEFHLISA FWPSLPSYLD AAYEVNSRDT VFIFKGNEFW
360





AIRGNEVQAG YPRGIHTLGF PPTIRKIDAA VSDKEKKKTY FFAADKYWRF DENSQSMEQG
420





FPRLIADDFP GVEPKVDAVL QAFGFFYFFS GSSQFEFDPN ARMVTHILKS NSWLHC











AAB6 Protein sequence:



Gene name: Podocalyxin-like


Unigene number: Hs.16426


Probeset Accession #: U97519


Protein Accession #: NP_005388


Transmembrane domain: predicted 432-448 (underlined sequence)


Cellular Localization: predicted Type Ia membrane protein (Nexo)












MRCALALSAL LLLLSTPPLL PSSPSPSPSP SPSQNATQTT TDSSNKTAPT PASSVTIMAT
60






DTAQQSTVPT SKANEILASV KATTLGVSSD SPGTTTLAQQ VSGPVNTTVA RGGGSGNPTT
120





TIESPKSTKS ADTTTVATST ATAKPNTTSS QNGAEDTTNS GGKSSHSVTT DLTSTKAEHL
180





TTPHPTSPLS PRQPTLTHPV ATPTSSGHDH LMKISSSSST VAIPGYTFTS PGMTTTLPSS
240





VISQRTQQTS SQMPASSTAP SSQETVQPTS PATALRTPTL PETMSSSPTA ASTTHRYPKT
300





PSPTVAHESN WAKCEDLETQ TQSEKQLVLN LTGNTLCAGG ASDEKLISLI CRAVKATFNP
360





AQDKCGIRLA SVPGSQTVVV KEITIHTKLP AKDVYERLKD KWDELKEAGV SDMKLGDQGP
420





PEEAEDRFSM PLIITIVCMA SFLLLVAALY GCCHQRLSQR KDQQRLTEEL QTVENGYHDN
480





PTLEVMETSS EMQEKKVVSL NGELGDSWIV PLDNLTKDDL DEEEDTHL











AAB8 Protein sequence:



Gene name: EGF-containing fibulin-like extracellular matrix protein 1


Unigene number: Hs.76224


Probeset Accession #: U03877


Protein Accession #: NP_004096 Variant 1


Signal sequence: predicted 1-17 (underlined sequence)


Summary: This gene spans approximately 18 kb of genomic DNA and consists


of 12 exons. Two transcripts with distinct 5′ UTR have been described;


the resulting proteins have distinct N-terminal amino acid sequences.


Translation initiation from internal methionine residues was observed


with in vitro translation. A signal peptide sequence is predicted for


translation initiation sites 1, 2, and 4. The protein isoforms contain


5 or 6 calcium-binding EGF2 domains and 5 or 6 EGF2 domains. Mutations


in this gene cause the retinal disease Malattia Leventinese.


Transcript Variant: This variant (1) has a distinct 5′ UTR and


N-terminal protein sequence as compared to variant 2.














MLKALFLTML TLALVKSQDT EETITYTQCT DGYEWDPVRQ QCKDIDECDI VPDACKGGMK

60






CVNHYGGYLC LPKTAQIIVN NEQPQQETQP AEGTSGATTG VVAASSMATS GVLPGGGFVA
120





SAAAVAGPEM QTGRNNFVIR RNPADPQRIP SNPSHRIQCA AGYEQSEHNV CQDIDECTAG
180





THNCRADQVC INLRGSFACQ CPPGYQKRGE QCVDIDECTI PPYCHQRCVN TPGSFYCQCS
240





PGFQLAANNY TCVDINECDA SNQCAQQCYN ILGSFICQCN QGYELSSDRL NCEDIDECRT
300





SSYLCQYQCV NEPGKFSCMC PQGYQVVRSR TCQDINECET TNECREDEMC WNYHGGFRCY
360





PRNPCQDPYI LTPENRCVCP VSNAMCRELP QSIVYKYMSI RSDRSVPSDI FQIQATTIYA
420





NTINTFRIKS GNENGEFYLR QTSPVSAMLV LVKSLSGPRE HIVDLEMLTV SSIGTFRTSS
480





VLRLTIIVGP FSF











AAB9 Protein sequence:



Gene name: Melanoma adhesion molecule, MUC 18 glycoprotein


Unigene number: Hs.211579


Probeset Accession #: M28882


Protein Accession #: NP_006491


Signal sequence: predicted 1-17 (first underlined sequence)


Transmembrane domain: predicted 559-575 (second underlined sequence)


Cellular localization: predicted Type Ia membrane protein (Nexo)














MGLPRLVCAF LLAACCCCPR VAGVPGEAEQ PAPELVEVEV GSTALLKCGL SQSQGNLSHV


60






DWFSVHKEKR TLIFRVRQGQ GQSEPGEYEQ RLSLQDRGAT LALTQVTPQD ERIFLCQGKR
120





PRSQEYRIQL RVYKAPEEPN IQVNPLGIPV NSKEPEEVAT CVGRNGYPIP QVIWYKNGRP
180





LKEEKNRVHI QSSQTVESSG LYTLQSILKA QLVKEDKDAQ FYCELNYRLP SGNHMKESRE
240





VTVPVFYPTE KVWLEVEPVG MLKEGDRVEI RCLADGNPPP HFSISKQNPS TREAEEETTN
300





DNGVLVLEPA RKEHSGRYEC QAWNLDTMIS LLSEPQELLV NYVSDVRVSP AAPERQEGSS
360





LTLTCEAESS QDLEFQWLRE ETDQVLERGP VLQLHDLKRE AGGGYRCVAS VPSIPGLNRT
420





QLVKLAIFGP PWMAFKERKV NVKENMVLNL SCEASGHPRP TISWNVNGTA SEQDQDPQRV
480





LSTLNVLVTP ELLETGVECT ASNDLGKNTS ILFLELVNLT TLTPDSNTTT GLSTSTASPH
540





TRANSTSTER KLPEPESRGV VIVAVIVCIL VLAVLGAVLY FLYKKGKLPC RRSGKQEITL
600





PPSRKTELVV EVKSDKLPEE MGLLQGSSGD KRAPGDQGEK YIDLRH











AAC1 Protein sequence:



Gene name: Matrix metalloproteinase 1 (interstitial collagenase)


Unigene number: Hs.83169


Probeset Accession #: X54925


Protein Accession #: NP_002412


Signal sequence: predicted 1-19 (underlined sequence)


Cellular localization: predicted secreted protein














MHSFPPLLLL LFWGVVSHSF PATLETQEQD VDLVQKYLEK YYNLKNDGRQ VEKRRNSGPV

60






VEKLKQMQEF FGLKVTGKPD AETLKVMKQP RCGVPDVAQF VLTEGNPRWE QTHLTYRIEN
120





YTPDLPRADV DHAIEKAFQL WSNVTPLTFT KVSEGQADIM ISFVRGDHRD NSPFDGPGGN
180





LAHAFQPGPG IGGDAHFDED ERWTNNFREY NLHRVAAHEL GHSLGLSHST DIGALMYPST
240





TFSGDVQLAQ DDIDGIQAIY GRSQNPVQPI GPQTPKACDS KLTFDAITTI RGEVMFFKDR
300





FYMRTNPFYP EVELNFISVF WPQLPNGLEA AYEFADRDEV RFFKGNKYWA VQGQNVLHGY
360





PKDIYSSFGF PRTVKHIDAA LSEENTGKTY FFVANKYWRY DEYKRSMDPG YPKMIAHDFP
420





GIGHKVDAVF MKDGFFYFFH GTRQYKFDPK TKRILTLQKA NSWFNCRKN











AAC3 Protein sequence:



Gene name: Branched chain aminotransferase 1, cytosolic


Unigene number: Hs.157205


Probeset Accession #: AA423987


Protein Accession #: NP_005495


Cellular Localization: cytoplasmic


Summary: The lack of the cytosolic enzyme branched-chain amino acid


transaminase (BCT) causes cell growth inhibition. There may be at least 2


different clinical disorders due to a defect of branched-chain amino acid


transamination: hypervalinemia and hyperleucine-isoleucinemia. Since


there are 2 distinct BCATs, mitochondrial and cytosolic, it is possible


that one is mutant in each of these 2 conditions.












MDCSNGSAEC TGEGGSKEVV GTFKAKDLIV TPATILKEKP DPNNLVFGTV FTDHMLTVEW
60






SSEFGWEKPH IKPLQNLSLH PGSSALHYAV ELFEGLKAFR GVDNKIRLFQ PNLNMDRNYR
120





SAVRATLPVF DKEELLECIQ QLVKLDQEWV PYSTSASLYI RPAFIGTEPS LGVKXPTKAL
180





LFVLLSPVGP YFSSGTFNPV SLWANPKYVR AWKGGTGDCK MGGNYGSSLF AQCEDVDNGC
240





QQVLWLYGRD HQITEVGTMN LFLYWINEDG EEELATPPLD GIILPGVTRR CILDLAHQWG
300





EFKVSERYLT MDDLTTALEG NRVREMFSSG TACVVCPVSD ILYKGETIHI PTMENGPKLA
360





SRILSKLTDI QYGREESDWT IVLS











ACG4 Protein sequence:



Gene name: Pentaxin-related gene, rapidly induced by IL-1 beta


Unigene number: Hs.2050


Probeset Accession #: M31166


Protein Accession #: NP_002843


Signal sequence: predicted 1-17 (underlined sequence)


Cellular localization: predicted secreted


Summary: TNF-inducible member of hyaluronate binding protein family,


related to CD44














MHLLAILFCA LWSAVLAENS DDYDLMYVNL DNEIDNGLHP TEDPTPCDCG QEHSEWDKLF

60






IMLENSQMRE RMLLQATDDV LRGELQRLRE ELGRLAESLA RPCAPGAPAE ARLTSALDEL
120





LQATRDAGRR LARMEGAEAQ RPEEAGRALA AVLEELRQTR ADLHAVQGWA ARSWLPAGCE
180





TAILFPMRSK KIFGSVHPVR PMRLESFSAC IWVKATDVLN KTILFSYGTK RNPYEIQLYL
240





SYQSIVFVVG GEENKLVAEA MVSLGRWTHL CGTWNSEEGL TSLWVNGELA ATTVEMATGH
300





IVPEGGILQI GQEKNGCCVG GGFDETLAFS GRLTGFNIWD SVLSNEEIRE TGGAESCHIR
360





GNIVGWGVTE IQPHGGAQYV S











ACK5 Protein sequence:



Gene name: Von Willebrand factor; Coagulation factor VIII


Unigene number: Hs.110802


Probeset Accession #: M10321


Protein Accession #: NP_000543


Signal peptide: predicted 1-22 (underlined sequence)


Cellular localization: predicted secreted














MIPARFAGVL LALALILPGT LCAEGTRGRS STARCSLFGS DFVNTFDGSM YSFAGYCSYL

60






LAGGCQKRSF SIIGDFQNGK RVSLSVYLGE FFDIHLFVNG TVTQGDQRVS MPYASKGLYL
120





ETEAGYYKLS GEAYGFVARI DGSGNFQVLL SDRYFNKTCG LCGNFNIFAE DDFMTQEGTL
180





TSDPYDFANS WALSSGEQWC ERASPPSSSC NISSGEMQKG LWEQCQLLKS TSVFARCHPL
240





VDPEPFVALC EKTLCECAGG LECACPALLE YARTCAQEGM VLYGWTDHSA CSPVCPAGME
300





YRQCVSPCAR TCQSLHINEM CQERCVDGCS CPEGQLLDEG LCVESTECPC VHSGKRYPPG
360





TSLSRDCNTC ICRNSQWICS NEECPGECLV TGQSHFKSFD NRYFTFSGIC QYLLARDCQD
420





HSFSIVIETV QCADDRDAVC TRSVTVRLPG LHNSLVKLKH GAGVAMDGQD IQLPLLKGDL
480





RIQHTVTASV RLSYGEDLQM DWDGRGRLLV KLSPVYAGKT CGLCGNYNGN QGDDFLTPSG
540





LAEPRVEDFG NAWKLHGDCQ DLQKQHSDPC ALNPPMTRFS EEACAVLTSP TFEACHRAVS
600





PLPYLRNCRY DVCSCSDGRE CLCGALASYA AACAGRGVRV AWREPGRCEL NCPKGQVYLQ
660





CGTPCNLTCR SLSYPDEECN EACLEGCFCP PGLYMDERGD CVPKAQCPCY YDGEIFQPED
720





IFSDHHTMCY CEDGFMHCTM SGVPGSLLPD AVLSSPLSHR SKRSLSCRPP MVKLVCPADN
780





LRAEGLECTK TCQNYDLECM SMGCVSGCLC PPGMVRHENR CVALERCPCF NQGKEYAPGE
840





TVKIGCNTCV CPDRKWNCTD HVCDATCSTI GMAHYLTFDG LKYLFPGECQ YVLVQDYCGS
900





NPGTFRILVG NKGCSHPSVK CKKRVTILVE GGEIELFDGE VNVKRPMKDE THFEVVESGR
960





YIILLLGKAL SVVWDRHLSI SVVLKQTYQE KVCGLCGNFD GIQNNDLTSS NLQVEEDPVD
1020





FGNSWKVSSQ CADTRKVPLD SSPATCHNNI MKQTMVDSSC RILTSDVFQD CNKLVDPEPY
1080





LDVCIYDTCS CESIGDCACF CDTIAAYAHV CAQHGKVVTW RTATLCPQSC EERNLRENGY
1140





ECEWRYNSCA PACQVTCQHP EPLACPVQCV EGCHAHCPPG KILDELLQTC VDPEDCPVCE
1200





VAGRRFASGK KVTLNPSDPE HCQICHCDVV NLTCEACQEP GGLVVPPTDA PVSPTTLYVE
1260





DISEPPLHDF YCSRLLDLVF LLDGSSRLSE AEFEVLKAFV VDNMERLRIS QKNVRVAVVE
1320





YHDGSHAYIG LKDRKRPSEL RRIASQVKYA GSQVASTSEV LKYTLFQIFS KIDRPEASRI
1380





ALLLMASQEP QRMSRNFVRY VQGLKKKKVI VIPVGIGPHA NLKQIRLIEK QAPENKAFVL
1440





SSVDELEQQR DEIVSYLCDL APEAPPPTLP PHMAQVTVGP GLLGVSTLGP KRNSMVLDVA
1500





FVLEGSDKIG EADFNRSKEF MEEVIQRMDV GQDSIHVTVL QYSYMVTVEY PFSEAQSKGD
1560





ILQRVREIRY QGGNRTNTGL ALRYLSDHSF LVSQGDREQA PNLVYMVTGN PASDEIKRLP
1620





GDIQVVPIGV GPNANVQELE RIGWPNAPIL IQDFETLPRE APDLVLQRCC SGEGLQIPTL
1680





SPAPDCSQPL DVILLLDGSS SFPASYFDEM KSFAKAFISK ANIGPRLTQV SVLQYGSITT
1740





IDVPWNVVPE KAHLLSLVDV MQREGGPSQI GDALGFAVRY LTSEMHGARP GASKAVVILV
1800





TDVSVDSVDA AADAAPSNRV TVFPIGIGDR YDAAQLRILA GPAGDSNVVK LQRIEDLPTM
1860





VTLGMSFLHK LCSGFVRICM DEDGNEKRPG DVWTLPDQCH TVTCQPDGQT LLKSHRVNCD
1920





RGLRPSCPNS QSPVKVEETC GCRWTCPCVC TGSSTRHIVT FDGQNFKLTG SCSYVLFQNK
1980





EQDLEVILHN GACSPGARQG CMKSIEVKHS ALSVELHSDM EVTVNGRLVS VPYVGGNMEV
2040





NVYGAIMHEV RFNHLGHIFT FTPQNNEFQL QLSPKTFASK TYGLCGICDE NGANDFMLRD
2100





GTVTTDWKTL VQEWTVQRPG QTCQPILEEQ CLVPDSSHCQ VLLLPLFAEC HKVLAPATFY
2160





AICQQDSCHQ EQVCEVIASY AHLCRTNGVC VDWRTPDFCA MSCPPSLVYN HCEHGCPRHC
2220





DGNVSSCGDH PSEGCFCPPD KVMLEGSCVP EEACTQCIGE DGVQHQFLEA WVPDHQPCQI
2280





CTCLSGRKVN CTTQPCPTAK APTCGLCEVA RLRQNADQCC PEYECVCDPV SCDLPPVPHC
2340





ERGLQPTLTN PGECRPNFTC ACRKEECKRV SPPSCPPHRL PTLRKTQCCD EYECACNCVN
2400





STVSCPLGYL ASTATNDCGC TTTTCLPDKV CVHRSTIYPV GQFWEEGCDV CTCTDMEDAV
2460





NGLRVAQCSQ KPCEDSCRSG FTYVLHEGEC CGRCLPSACE VVTGSPRGDS QSSWKSVGSQ
2520





WASPENPCLI NECVRVKEEV FIQQRNVSCP QLEVPVCPSG FQLSCKTSAC CPSCRCERME
2580





ACMLNGTVIG PGKTVMIDVC TTCRCMVQVG ISGFKLECR KTTCNPCPLG YKEENNTGEC
2640





CGRCLPTACT IQLRGGQIMT LKRDETLQDG CDTHFCKVNE RGEYFWEKRV TGCPPFDEHK
2700





CLAEGGKIMK IPGTCCDTCE EPECNDITAR LQYVKVGSCK SEVEVDIHYC QGKCASKAYIY
2760





SIDINDVQDQ CSCCSPTRTE PMQVALHCTN GSVVYHEVLN AMECKCSPRK CSK











AAC7 Protein sequence:



Gene name: KIAA1294 protein


Probeset Accession #: AA432248


Protein Accession #: BAA92532


Cellular localization: predicted nuclear protein


PFAM prediction: 22-153 Band 41 domain (underlined seq). A number of


cytoskeletal-associated proteins that associate with various proteins


at the interface between the plasma membrane and the cytoskeleton


contain a conserved N-terminal domain of about 150 amino-acid residues.












MAVQLVPDSA LGLLMMTEGR RCQVHLLDDR KLELLVQPKL LAKELLDLVA SHFNLKEKEY
60








FGIAFTDETG HLNWLQLDRR VLEHDFPKKS GPVVLYFCVR FYIESISYLK DNATIELFFL


120







NAKSCIYKEL IDVDSEVVFE LASYILQEAK GDFSSNEVVR SDLKKLPALP TQALKEHPSL

180





AYCEDRVIEH YKKLNGQTRG QAIVNYMSIV ESLPTYGVHY YAVKDKQGIP WWLGLSYKGI
240





FQYDYHDKVK PRKIFQWRQL ENLYFREKKF SVEVHDPRRA SVTRRTFGHS GIAVHTWYAC
300





PALIKSIWAM AISQHQFYLD RKQSKSKIHA ARSLSEIAID LTETGTLKTS KLCLMGSKGK
360





IISGSSGSLL SSGSQESDSS QSAKKDMLAA LKSRQEALEE TLRQRLEELK KLCLREAELT
420





GKLPVEYPLD PGEEPPIVRR RIGTAFKLDE QKILPKGEEA ELERLEREFA IQSQITEAAR
480





RLASDPNVSK KLKKQRKTSY LNALKKLQEI ENAINENRIK SGKKPTQRAS LIIDDGNIAS
540





EDSSLSDALV LEDEDSQVTS TISPLHSPHK GLPPRPPSHN RPPPPQSLEG LRQMHYHRND
600





YDKSPIKPKM WSESSLDEPY EKVKKRSSHS HSSSHKRFPS TGSCAEAGGG SNSLQNSPIR
660





GLPHWNSQSS MPSTPDLRVR SPHYVHSTRS VDISPTRLHS LALHFRHRSS SLESQGKLLG
720





SENDTGSPDF YTPRTRSSNG SDPMDDCSSC TSHSSSEHYY PAQMNANYST LAEDSPSKAR
780





QRQRQRQRAA GALGSASSGS MPNLAARGGA GGAGGAGGGV YLHSQSQPSS QYRIKEYPLY
840





IEGGATPVVV RSLESDQECH YSVKAQFKTS NSYTAGGLFK ESWRGGGGDE GDTGRLTPSR
900





SQILRTPSLG REGAHDKGAG RAAVSDELRQ WYQRSTASHK EHSRLSHTSS TSSDSGSQYS
960





TSSQSTFVAH SRVTRMPQMC KATSAALPQS QRSSTPSSEI GATPPSSPHH ILTWQTGEAT
1020





ENSPILDGSE SPPHQSTDE











ACG8 Protein sequence:



Gene name: ubiquitin E3 ligase SMURF2


Unigene number: Hs.21806 (3′UTR only)


Probeset Accession #: AA398243


Protein Accession #: AF301463_1


Cellular Localization: predicted cytoplasmic


Summary: Smurf 2 Is a Ubiquitin E3 Ligase Mediating Proteasome-dependent


Degradation of Smad2 in Transforming Growth Factor-beta Signaling












MSNPGGRRNG PVKLRLTVLC AKNLVKKDFF RLPDPFAKVV VDGSGQCHST DTVKNTLDPK
60






WNQHYDLYIG KSDSVTISVW NHKKIHKKQG AGFLGCVRLL SNAINRLKDT GYQRLDLCKL
120





GPNDNDTVRG QIVVSLQSRD RIGTGGQVVD CSRLFDNDLP DGWEERRTAS GRIQYLNHIT
180





RTTQWERPTR PASEYSSPGR PLSCFVDENT PISGTNGATC GQSSDPRLAE RRVRSQRHRN
240





YMSRTHLHTP PDLPEGYEQR TTQQGQVYFL HTQTGVSTWH DPRVPRDLSN INCEELGPLP
300





PGWEIRNTAT GRVYFVDHNN RTTQFTDPRL SANLHLVLNR QNQLKDQQQQ QVVSLCPDDT
360





ECLTVPRYKR DLVQKLKILR QELSQQQPQA GHCRIEVSRE EIFEESYRQV MKMRPKDLWK
420





RLMIKFRGEE GLDYGGVARE WLYLLSHEML NPYYGLFQYS RDDIYTLQIN PDSAVNPEHL
480





SYFHFVGRIM GMAVFHGEYI DGGFTLPFYK QLLGKSITLD DMELVDPDLH NSLVWILEND
540





ITGVLDHTFC VEHNAYGEII QHELKPNGKS IPVNEENKKE YVRLYVNWRF LRGIEAQFLA
600





LQKGFNEVIP QHLLKTFDEK ELELIICGLG KIDVNDWKVN TRLKECTPDS NIVKWFWKAV
660





EFFDEERRAR LLQFVTGSSR VPLQGFKALQ GAAGPRLFTI HQIDACTNNL PKAHTCFNRI
720





DIPPYESYEK LYEKLLTAIE ETCGFAVE











ACE1 Protein sequence:



Gene name: EST


Unigene number: Hs.30089


Probeset Accession #: AA410480


CAT cluster#: cluster 96816_1


Summary: predicted open reading frame












PLWTEPPLSC CLPATYPADR GPAEPCSCAG VILGFLLFRG HNSQPTMTQT SSSQGGLGGL
60






SLTTEPVSSN PGYIPSSEAN RPSHLSSTGT PGAGVPSSGR DGGTSRDTFQ TTPPNSTTMS
120





LSMREDATIL PSPTSETVLT VAAFGVISFI VILVVVVIIL VGVVSLRFKC RKSKESGDPQ
180





KPGEREEKVG HRREPYPWN











ACJ2 Protein sequence:



Gene name: Complement component C1q receptor


Unigene number: Hs.97199


Probeset Accession #: AA487558


Protein Accession #: NP_036204


Signal sequence: 1-17 (first underlined sequence)


Transmembrane domain: 589-605 (second underlined sequence)


Cellular localization: This gene encodes a predicted type I membrane protein.


Summary: This protein acts as a receptor for complement protein C1q,


mannose-binding lectin, and pulmonary surfactant protein A. This protein


is a functional receptor involved in ligand-mediated enhancement of


phagocytosis.














MATSMGLLLL LLLLLTQPGA GTGADTEAVV CVGTACYTAH SGKLSAAEAQ NHCNQNGGNL

60






ATVKSKEEAQ HVQRVLAQLL RREAALTARM SKFWIGLQRE KGKCLDPSLP LKGFSWVGGG
120





EDTPYSNWHK ELRNSCISKR CVSLLLDLSQ PLLPNRLPKW SEGPCGSPGS PGSNIEGFVC
180





KFSFKGMCRP LALGGPGQVT YTTPFQTTSS SLEAVPFASA ANVACGEGDK DETQSHYFLC
240





KEKAPDVFDW GSSGPLCVSP KYGCNFNNGG CHQDCFEGGD GSFLCGCRPG FRLLDDLVTC
300





ASRNPCSSSP CRGGATCVLG PHGKNYTCRC PQGYQLDSSQ LDCVDVDECQ DSPCAQECVN
360





TPGGFRCECW VGYEPGGPGE GACQDVDECA LGRSPCAQGC TNTDGSFHCS CEEGYVLAGE
420





DGTQCQDVDE CVGPGGPLCD SLCFNTQGSF HCGCLPGWVL APNGVSCTMG PVSLGPPSGP
480





PDEEDKGEKE GSTVPRAATA SPTRGPEGTP KATPTTSRPS LSSDAPITSA PLKMLAPSGS
540





SGVWREPSIH HATAASGPQE PAGGDSSVAT QNNDGTDGQK LLLFYILGTV VAILLLLALA
600







LGLLVYRKRR AKREEKKEKK PQNAADSYSW VPERAESRAM ENQYSPTPGT DC












ACJ3 Protein sequence:



Gene name: FLT1/vascular endothelial growth factor receptor


Unigene number: Hs.138671


Probeset Accession #: AA047437


Transmembrane domain: predicted 764-780 (underlined sequence)


Cellular Localization: predicted cell surface tyrosine kinase












MVSYWDTGVL LCALLSCLLL TGSSSGSKLK DPELSLKGTQ HIMQAGQTLH LQCRGEAAHK
60






WSLPEMVSKE SERLSITKSA CGRNGKQFCS TLTLNTAQAN HTGFYSCKYL AVPTSKKKET
120





ESAIYIFISD TGRPFVEMYS EIPEIIHMTE GRELVIPCRV TSPNITVTLK KFPLDTLIPD
180





GKRIIWDSRK GFIISNATYK EIGLLTCEAT VNGHLYKTNY LTHRQTNTII DVQISTPRPV
240





KLLRGHTLVL NCTATTPLNT RVQMTWSYPD EKNKRASVRR RIDQSNSHAN IFYSVLTIDK
300





MQNKDKGLYT CRVRSGPSFK SVNTSVHIYD KAFITVKHRK QQVLETVAGK RSYRLSMKVK
360





AFPSPEVVWL KDGLPATEKS ARYLTRGYSL IIKDVTEEDA GNYTILLSIK QSNVFKNLTA
420





TLIVNVKPQI YEKAVSSFPD PALYPLGSRQ ILTCTAYGIP QPTIKWFWHP CNHNHSEARC
480





DFCSNNEESF ILDADSNMGN RIESITQRMA IIEGKNKMAS TLVVADSRIS GIYICIASNK
540





VGTVGRNISF YITDVPNGFH VNLEKMPTEG EDLKLSCTVN KFLYRDVTWI LLRTVNNRTM
600





HYSISKQKMA ITKEHSITLN LTIMNVSLQD SGTYACRARN VYTGEEILQK KEITIRDQEA
660





PYLLRNLSDH TVAISSSTTL DCHANGVPEP QITWFKNNHK IQQEPGIILG PGSSTLFIER
720





VTEEDEGVYH CKATNQKGSV ESSAYLTVQG TSDKSNLELI TLTCTCVAAT LFWLLLTLLI
780





RKMKRSSSEI KTDYLSIIMD PDEVPLDEQC ERLPYDASKW EFARERLKLG KSLGRGAFGK
840





VVQASAFGIK KSPTCRTVAV KMLKEGATAS EYKALMTELK ILTHIGHHLN VVNLLGACTK
900





QGGPLMVIVE YCKYGNLSNY LKSKRDLFFL NKDAALHMEP KKEKMEPGLE QGKKPRLDSV
960





TSSESFASSG FQEDKSLSDV EEEEDSDGFY KEPITMEDLI SYSFQVARGM EFLSSRKCIE
1020





RDLAARNILL SENNVVKICD FGLARDIYKN PDYVRKGDTR LPLKWMAPES IFDKIYSTKS
1080





DVWSYGVLLW EIFSLGGSPY PGVQMDEDFC SRLREGMRMR APEYSTPEIY QIMLDCWHRD
1140





PKERPRFAEL VEKLGDLLQA NVQQDGKDYI PINAILTGNS GFTYSTPAFS EDFFKESISA
1200





PKFNSGSSDD VRYVNAFKFM SLERIKTFEE LLPNATSMFD DYQGDSSTLL ASPMLKRFTW
1260





TDSKPKASLK IDLRVTSKSK ESGLSDVSRP SFCHSSCGHV SEGKRRFTYD HAELERKIAC
1320





CSPPPDYNSV VLYSTPPI











ACJ9 Protein sequence:



Gene name: Purine nucleoside phosphorylase


Unigene number: Hs.75514


Probeset Accession #: K02574


Protein Accession #: CAA25320


Cellular Localization: predicted cytoplasmic


Summary: likely to catalyze the reversible phosphorolytic cleavage of


purine ribonucleosides and 2′-deoxyribonucleosides












MENGYTYEDY KNTAEWLLSH TKHRPQVAII CGSGLGGLTD KLTQAQIFDY SEIPNFPRST
60






VPGHAGRLVF GFLNGRACVM MQGRFHMYEG YPLWKVTFPV RVFHLLGVDT LVVTNAAGGL
120





NPKFEVGDIM LIRDHINLPG FSGQNPLRGP NDERFGDRFP AMSDAYDRTM RQRALSTWKQ
180





MGEQRELQEG TYVMVAGPSF ETVAECRVLQ KLGADAVGMS TVPEVIVARH CGLRVFGFSL
240





ITNKVIMDYE SLEKANHEEV LAAGKQAAQK LEQFVSILMA SIPLPDKAS











ACK4 Protein sequence:



Gene name: EST


Probeset Accession #: R68763


Predicted amino acid seq: FGENESH exon prediction on BAC clone AC009414


Predicted nuclear target motifs: from 25 (4) RRRP (underlined); 176 (5)


RRRR (underlined); 177 (5) RRRR (underlined); 239 (5) KRKK (underlined);


399 (4) PPRARRT (underlined); 400 (5) PRARRTE (underlined)


Cellular localization: predicted nuclear












MPPEQHHQPN KVSPKLCSAQ PAPRGRRRPG GRGPAAGGRT FANARFVLGE GVAIERGADD
60






TTQPPVAGSV NPEGAAAALV PLAGARVAAA ADALHDAPRA VPGLLALGLV TGQADQRPGA
120





GARQQQQQPQ QRDQEVPAAG QPPVPRHQVH PPAPPPPPPR SRAGSGAGAL PCAGHTRRRR
180







RTSSPRSSPP LSGPPGRASP RGARPPPLLR AAPTPSPRAL APAAASPPPP PPPPGREGEK

240







RKKFPPGSSG STQTSGAAAA VAAALGSSPG RRRLLPLLLR VGRPRSGAAS GPVPASRAAE

300





WARWRSTRSA ASAPRAPLAS LLRRSSGRLF MAGASAARAA PSPILPPPPD LPPTPTRRAP
360





LIGCPPSPAR PAPSASPSPS RAAGPFLPPS HASTSSRSPP PRARRTEPAV PPSCGSGPGA
420





AGALRMGLGR TQRAARVAVS RALAGTVAAA AGLGARRARR LHLRGQIGVR RVAGTPEARG
480





RGDGCSLGRV SPDRTPGKGS KGMEPPHTG











AAA8 Protein sequence:



Gene name: ETL protein, with extended open reading frame


Unigene number: Hs.57958


Probeset Accession #: D58024


Protein Accession #: AAG33021


Transmembrane domains: predicted 454-470, 486-502, 511-527, 528-544,


556-572, 600-616, 642-661, 672-689 (underlined sequences)


Extended sequence: Residues 1-564 were added to the sequence in AAG33021


Cellular Localization: predicted cell surface serpentine receptor












MKTAALTPPR SPPPPPLRPP PMKRLPLLVV FSTLLNCSYT QNCTKTPCLP NAKCEIRNGI
60






EACYCNMGFS GNGVTICEDD NECGNLTQSC GENANCTNTE GSYYCMCVPG FRSSSNQDRF
120





ITNDGTVCIE NVNANCHLDN VCIAANINKT LTKIRSIKEP VALLQEVYRN SVTDLSPTDI
180





ITYIEILAES SSLLGYKNNT ISAKDTLSNS TLTEFVKTVN NFVQRDTFVV WDKLSVNHRR
240





THLTKLMHTV EQATLRISQS FQKTTEFDTN STDIALKVFF FDSYNMKHIH PHMNMDGDYI
300





NIFPKRKAAY DSNGNVAVAF LYYKSIGPLL SSSDNFLLKP QNYDNSEEEE RVISSVISVS
360





MSSNPPTLYE LEKITFTLSH RKVTDRYRSL CAFWNYSPDT MNGSWSSEGC ELTYSNETHT
420





SCRCNHLTHF AILMSSGPSI GIKDYNILTR ITQLGIIISL ICLAICIFTF WFFSEIQSTR
480





TTIHKNLCCS LFLAELVFLV GINTNTNKLX SVSIIAGLLH YFFLAAFAWM CIEGIHLYLI
540







VVGVIYNKGF LHKNFYIFGY LSPAVVVGFS AALGYRYYGT TKVCWLSTET HFIWSFIGPA

600







CLIILVNLLA FGVIIYKVFR HTAGLKPEVS CFENIRSCAR GALALLFLLG TTWIFGVLHV

660







VHASVVTAYL FTVSNAFQGM FIFLFLCVLS RKIQEEYYRL FKNVPCCFGC LR












AAC6 Protein sequence:



Gene name: EST


Unigene number: Hs.134797


Probeset Accession #: AA025351


Protein Accession #: BAB14599


Signal sequence: predicted 1-24 (first underlined sequence)


Extended sequence: second underlined sequence














MILSLLFSLG GPLGWGLLGA WAQASSTSLS DLQSSRTPGV WKAEAEDTSK DPVGRNWCPY

60






PMSKLVTLLA LCKTEKFLIH SQQPCPQGAP DCQKVKVMYR MAHKPVYQVK QKVLTSLAWR
120





CCPGYTGPNC EHHDSMAIPE PADPGDSHQE PQDGPVSFKP GHLAAVINEV EVQQEQQEHL
180





LGDLQNDVHR VADSLPGLWK ALPGNLTAAV MEANQTGHEF PDRSLEQVLL PHVDTFLQVH
240







FSPIWRSFNQ SLHSLTQAIR NLSLDVEANR QAISRVQDSA VARADFQELG AKFEAKVQEN


300







TQRVGQLRQD VEDRLHAQHF TLHRSISELQ ADVDTKLKRL HKAQEAPGTN GSLVLATPGA


360







GARPEPDSLQ ARLGQLQRL SELHMTTARR EEELQYTLED MRATLTRHVD EIKELYSESD


420







ETFDQISKVE RQVEELQVH TALRELRVIL MEKSLIMEEN KEEVERQLLE LNLTLQHLQG


480







GHADLIKYVK DCNCQKLYLD LDVIREGQRD ATRALEETQV SLDERRQLDG SSLQALQNAV


540







DAVSLAVDAH KAEGERARAA TSRLRSQVQA LDDEVGALKA AAAEARHEVR QLHSAFAALL


600







EDALRHEAVL AALFGEEVLE EMSEQTPGPL PLSYEQIRVA LQDAASGLQE QALGWDELAA


660







RVTALEQASE PPRPAEHLEP SHDAGREEAA TTALAGLARE LQSLSNDVKN VGRCCEAEAG


720







AGAASLNASL DGLHNALFAT QRSLEQHQRL FHSLFGNFQG LMEANVSLDL GKLQTMLSRK


780







GKKQQKDLEA PRKRDKKEAE PLVDIRVTGP VPGALGAALW EASPVAFYAS FSEGTAALQT


840







VKFNTTYINI GSSYFPEHGY FRAPERGVYL FAVSVEFGPG PGTGQLVFGG HHRTPVCTTG


900







QGSGSTATVF AMAELQKGER VWFELTQGSI TKRSLSGTAF GGFLMFKT













ACH7 Protein sequence:



Gene name: EST


Unigene number: Hs.3807


Probeset Accession #: AA292694


BAC Accession #: AL161751


FGENESH predicted aa seq: 1-647; based on BAC clone AL161751












MGKDFMTKTP KAFATKAKID KWDLIKLKSF CTAKETIIRV NSQPTDWQKT FAIYPSDKGV
60






IARIYKELEQ IYKKKKPTKT LRTHFLSRPK GNCWPLGPRG DSWQLGGPSG ARAEGKGGGT
120





GLGKPAVEGG DRAPDTALRP RAGQIQVGSS SACGASENEA GVRPVPPLAG ALARAGRRRT
180





PHCRPCWLLG LGGLLQPAPR YHEAAGGRGG LHPARWGAQH RACGRRAARC ARAPAGRPRA
240





RRGLQRPAVL GRTGAQAFPL HPGERAFAGF LLAVLRPRRS RKRHAAVGGG APTLLHRAEM
300





RGTPGHRWGR ARSWKEMRCH LRANGYLCKY QFEVLCPAPR PGAASNLSYR APFQLESAAL
360





DFSPPGTEVS ALCRGQLPIS VTCIADEIGA RWDKLSGDVL CPCPGRYLRA GKCAELPNCL
420





DDLGGFACEC ATGFELGKDG RSCVTSGEGQ PTLGGTGVPT RRPPATATSP VPQRTWPIRV
480





DEKLGETPLV PEQDNSVTSI PEIPRWGSQS TMSTLQMSLQ AESKATITPS GSVISKFNST
540





TSSATPQAFD SSSAVVFIFV STAVVVLVIL TMTVLGLVKL CFHESPSSQP RKESMGPPGL
600





ESDPEPAALG SSSAHCTNNG VKVGDCDLRD RAEGALLAES PLGSSDA











AAD4 Protein sequence



Gene name: ERG


Unigene number: Hs.45514


Probeset Accession #: R32894


Protein Accession #: AAA52398


Signal sequence: none


Transmembrane domains: none


PFAM domains: predicted Ets-domain 294-373; SAM_PNT: 122-206


Summary: ERG2 is a sequence-specific DNA-binding protein.












MIQTVPDPAA HIKEALSVVS EDQSLFECAY GTPHLAKTEM TASSSSDYGQ TSKMSPRVPQ
60






QDWLSQPPAR VTIKMECNPS QVNGSRNSPD ECSVAKGGKM VGSPDTVGMN YGSYMEEKHM
120





PPPNMTTNER RVIVPADPTL WSTDHVRQWL EWAVKEYGLP DVNILLFQNI DGKELCKMTK
180





DDFQRLTPSY NADILLSHLH YLRETPLPHL TSDDVDKALQ NSPRLMHARN TDLPYEPPRR
240





SAWTGHGHPT PQSKAAQPSP STVPKTEDQR PQLDPYQILG PTSSRLANPG SGQIQLWQFL
300





LELLSDSSNS SCITWEGTNG EFKMTDPDEV ARRWGERKSK PNMNYDKLSR ALRYYYDKNI
360





MTKVHGKRYA YKFDFHGIAQ ALQPHPPESS LYKYPSDLPY MGSYHARPQK MNFVAPHPPA
420





LPVTSSSFFA APNPYWNSPT GGIYPNTRLP TSHMPSELGT YY
462











AAD5 Protein sequence



Gene name: activin A receptor type II-like 1 (ALK-1)


Unigene number: Hs.172670


Probeset Accession #: T57112


Protein Accession #: NP_000011


Signal sequence: predicted 1-21


Transmembrane domain: predicted 119-135


PFAM domains: predicted pkinase 204-489


Summary: Type Ia membrane protein; receptor serine/threonine kinase














MTLGSPRKGL LMLLMALVTQ GDPVKPSRGP LVTCTCESPH CKGPTCRGAW CTVVLVREEG
60






RHPQEHRGCG NLHRELCRGR PTEFVNHYCC DSHLCNHNVS LVLEATQPPS EQPGTDGQLA
120







LILGPVLALL ALVALGVLGL WHVRRRQEKQ RGLHSELGES SLILKASEQG DTMLGDLLDS
180





DCTTGSGSGL PFLVQRTVAR QVALVECVGK GRYGEVWRGL WHGESVAVKI FSSRDEQSWF
240





RETEIYNTVL LRHDNILGFI ASDMTSRNSS TQLWLITHYH EHGSLYDFLQ RQTLEPHLAL
300





RLAVSAACGL AHLHVEIFGT QGKPAIAHRD FKSRNVLVKS NLQCCIADLG LAVMHSQGSD
360





YLDIGNNPRV GTKRYMAPEV LDEQIRTDCF ESYKWTDA FGLVLWEIAR RTIVNGIVED
420





YRPPFYDVVP NDPSFEDMKK VVCVDQQTPT IPNRLAADPV LSGLAQMMRE CWYPNPSARL
480





TALRIKKTLQ KISNSPEKPK VIQ











AAD8 Protein sequence



Gene name: ESTs


Unigene number: Hs.144953


Probeset Accession #: AA404418


Protein Accession #: n/a


Signal sequence: n/a


Transmembrane domains: n/a


PFAM domains: n/a


Summary: no ORF identified; possible frameshifts. Located near PCTAIRE


protein kinase 2 (PCTK2) on the genome (within 100 kb).





ACA2 Protein sequence



Gene name: EST


Unigene number: Hs.16450


Probeset Accession #: AA478778


Protein Accession #: n/a


Signal sequence: n/a


Transmembrane domains: n/a


PFAM domains: n/a


Summary: no ORF identified, possible frameshifts; although a match was


found to the HTGS genomic sequence, the sequence does not extend far


enough upstream to predict coding exons.





ACA4 Protein sequence



Gene name: alpha satellite junction DNA sequence


Unigene number: Hs.247946


Probeset Accession #: M21305


Protein Accession #: AAA88020


Signal sequence: none


Transmembrane domains: none


PFAM domains: none





MEWNGMAWNR IKWNGINSSG MEWNGMEWNA VQCNRNEWNE LELTGMEWNG MHLN





ACG6 Protein sequence



Gene name: intercellular adhesion molecule 2 (ICAM2)


Unigene number: Hs.83733


Probeset Accession #: M32334


Protein Accession #: NP_000864


Signal sequence: predicted 1-21


Transmembrane domain: predicted 224-248


PFAM domains: predicted 41-98, 127-197; immunoglobulin-like C2-type domains


Summary: a predicted Type Ia membrane protein; it plays a role


in cell adhesion and is the ligand for the LFA-1 protein. ICAM2 is also


called CD102.












MSSFGYRTLT VALFTLICCP GSDEKVFEVH VRPKKLAVEP KGSLEVNCST TCNQPEVGGL
60






ETSLNKILLD EQAQWKHYLV SNISHDTVLQ CHFTCSGKQE SNNSNVSVYQ PPRQVILTLQ
120





PTLVAVGKSF TIECRVPTVE PLDSLTLFLF RGNETLHYET FGKAAPAPQE ATATFNSTAD
180





REDGRRNFSC LAVLDLMSRG GNIFHKHSAP KMLEIYEPVS DSQMVIIVTV VSVLLSLFVT
240





SVLLCFIFGQ HLRQQRMGTY GVRAAWRRLP QAFRP











ACG7 Protein sequence



Gene name: Cadherin 5, VE-cadherin (CDH5)


Unigene number: Hs.76206


Probeset Accession #: X79981


Protein Accession #: NP_001786


Signal sequence: predicted 1-27


Transmembrane domain: predicted 604-620


PFAM domains: Cadherin domains predicted 53-141, 156-249, 263-364,


377-470, and 487-576


Summary: Likely a Type I membrane protein. Cadherins are


calcium-dependent adhesive proteins that mediate cell-to-cell


interaction. VE-cadherin is associated with intercellular junctions.












MQRLMMLLAT SGACLGLLAV AAVAAAGANP AQRDTHSLLP THRRQKRDWI WNQMHIDEEK
60






NTSLPHHVGK IKSSVSRKNA KYLLKGEYVG KVFRVDAETG DVFAIERLDR ENISEYHLTA
120





VIVDKDTGEN LETPSSFTIK VHDVNDNWPV FTHRLFNASV PESSAVGTSV ISVTAVDADD
180





PTVGDHASVM YQILKGKEYF AIDNSGRIIT ITKSLDREKQ ARYEIVVEAR DAQGLRGDSG
240





TATVLVTLQD INDNFPFFTQ TKYTFVVPED TRVGTSVGSL FVEDPDEPQN RMTKYSILRG
300





DYQDAFTIET NPAHNEGIIK PMKPLDYEYI QQYSFIVEAT DPTIDLRYMS PPAGNRAQVI
360





INITDVDEPP IFQQPFYHFQ LKENQKKPLI GTVLAMDPDA ARHSIGYSIR RTSDKGQFFR
420





VTKKGDIYNE KELDREVYPW YNLTVEAKEL DSTGTPTGKE SIVQVHIEVL DENDNAPEFA
480





KPYQPKVCEN AVHGQLVLQI SAIDKDITPR NVKFKFTLNT ENNFTLTDNH DNTANITVKY
540





GQFDREHTKV HFLPVVISDN GMPSRTGTST LTVAVCKCNE QGEFTFCEDM AAQVGVSIQA
600





VVAILLCILT ITVITLLIFL RRRLRKQARA HGKSVPEIHE QLVTYDEEGG GEMDTTSYDV
660





SVLNSVRRGG AKPPRPALDA RPSLYAQVQK PPRHAPGAHG GPGEMAANIE VKKDEADHDG
720





DGPPYDTLHI YGYEGSESIA ESLSSLGTDS SDSDVDYDFL NDWGPRFKML AELYGSDPRE
780





ELLY











ACG9 Protein sequence



Gene name: lysyl oxidase-like 2 (LOXL2)


Unigene number: Hs.83354


Probeset Accession #: U89942


Protein Accession #: NP_002309


Signal sequence: predicted 1-2


Transmembrane domains: none predicted


PFAM domains: scavenger receptor cysteine-rich domains predicted


68-159, 203-238, 336-425, 439-528; Lysyl oxidase predicted 548-749.


Summary: Likely a secreted protein. Lysyl oxidase is a copper-dependent


amine oxidase that belongs to a heterogeneous family of enzymes that


oxidize primary amine substrates to reactive aldehydes, acting on


extracellular matrix substrates, e.g., collagen and elastin.












MERPLCSHLC SCLAMLALLS PLSLAQYDSW PHYPEYFQQP APEYHQPQAP ANVAKIQLRL
60






AGQKRKHSEG RVEVYYDGQW GTVCDDDFSI HAAHVVCREL GYVEAKSWTA SSSYGKGEGP
120





IWLDNLHCTG NEATLAACTS NGWGVTDCKH TEDVGVVCSD KRIPGFKFDN SLINQIENLN
180





IQVEDIRIRA ILSTYRKRTP VMEGYVEVKE GKTWKQICDK HWTAKNSRVV CGMFGFPGER
240





TYNTKVYKMF ASRRKQRYWP FSMDCTGTEA HISSCKLGPQ VSLDPMIGNT CENGLPAVVS
300





CVPGQVFSPD GPSRFRKAYK PEQPLVRLRG GAYIGEGRVE VLKNGEWGTV CDDKWDLVSA
360





SVVCRELGFG SAKEAVTGSR LGQGIGPIHL NEIQCTGNEK SIIDCKFNAE SQGCNHEEDA
420





GVRCNTPAMG LQKKLRLNGG RNPYEGRVEV LVERNGSLVW GMVCGQNWGI VEAMVVCRQL
480





GLGFASNAFQ ETWYWHGDVN SNKVVMSGVK CSGTELSLAH CRHDGEDVAC PQGGVQYGAG
540





VACSETAPDL VLNAEMVQQT TYLEDRPMFM LQCAMEENCL SASAAQTDPT TGYRRLLRFS
600





SQIHNNGQSD FRPKNGRHAW IWHDCHRHYH SMEVFTHYDL LNLNGTKVAE GHKASFCLED
660





TECEGDIQKN YECANFGDQG ITMGCWDMYR HDIDCQWVDI TDVPPGDYLF QVVINPNFEV
720





AESDYSNNIM KCRSRYDGHR IWMYNCHIGG SFSEETEKKF EHFSGLLNNQ LSPQ











ACH2 Protein sequence



Gene name: TIE tyrosine-protein kinase


Unigene number: Hs.78824


Probeset Accession #: X60957


Protein Accession #: NP_005415


Signal sequence: predicted 1-21


Transmembrane domain: predicted 770-786


PFAM domains: laminin-EGF predicted 234-267; FN3 predicted 460-520,


548-632, and 644-729; tyrosine_kinase predicted 839-1107


Summary: Likely a Type Ia membrane protein; TIE is a tyrosine-kinase


receptor with an unknown ligand; its expression is likely necessary for


normal blood vessel development.












MVWRVPPFLL PILFLASHVG AAVDLTLLAN LRLTDPQRFF LTCVSGEAGA GRGSDAWGPP
60






LLLEKDDRIV RTPPGPPLRL ARNGSHQVTL RGFSKPSDLV GVFSCVGGAG ARRTRVIYVH
120





NSPGAHLLPD KVTHTVNKGD TAVLSARVHK EKQTDVIWKS NGSYFYTLDW HEAQDGRFLL
180





QLPNVQPPSS GIYSATYLEA SPLGSAFFRL IVRGCGAGRW GPGCTKECPG CLHGGVCHDH
240





DGECVCPPGF TGTRCEQACR EGRFGQSCQE QCPGISGCRG LTFCLPDPYG CSCGSGWRGS
300





QCQPCAPGH FGADCRLQCQ CQNGGTCDRF SGCVCPSGWH GVHCEKSDRI PQILNMASEL
360





EFNTMPRI NCAAAGNPFP VRGSIELRKP DGTVLLSTKA IVEPEKTTAE FEVPRLVLAD
420





SGFWECRVST SGGQDSRRFK VNVKVPPVPL AAPRLLTKQS RQLVVSPLVS FSGDGPISTV
480





RLHYRPQDST MDWSTIVVDP SENVTLMNLR PKTGYSVRVQ LSRPGEGGEG AWGPPTLMTT
540





DCPEPLLQPW LEGWHVEGTD RLRVSWSLPL VPGPLVGDGF LLRLWDGTRG QERRENVSSP
600





QARTALLTGL TPGTHYQLDV QLYHCTLLGP ASPPAHVLLP PSGPPAPRHL HAQALSDSEI
660





QLTWKHPEAL PGPISKYVVE VQVAGGAGDP LWIDVDRPEE TSTIIRGLNA STRYLFRMRA
720





SIQGLGDWSN TVEESTLGNG LQAEGPVQES RAAEEGLDQQ LILAVVGSVS ATCLTILAAL
780





LTLVCIRRSC LHRRRTFTYQ SGSGEETILQ FSSGTLTLTR RPKLQPEPLS YPVLEWEDIT
840





FEDLIGEGNF GQVIRAMIKK DGLKMNAAIK MLKEYASEND HRDFAGELEV LCKLGHHPNI
900





INLLGACKNR GYLYIAIEYA PYGNLLDFLR KSRVLETDPA FAREHGTAST LSSRQLLRFA
960





SDAANGMQYL SEKQFIHRDL AARNVLVGEN LASKIADFGL SRGEEVYVKK TMGRLPVRWM
1020





AIESLNYSVY TTKSDVWSFG VLLWEIVSLG GTPYCGMTCA ELYEKLPQGY RMEQPRNCDD
1080





EVYELMRQCW RDRPYERPPF AQIALQLGRM LEARKAYVNM SLFENFTYAG IDATAEEA











ACH3 Protein sequence



Gene name: placental growth factor (PGF; PlGF1; VEGF-related protein)


Unigene number: Hs.2894


Probeset Accession #: X54936


Protein Accession #: NP_002623


Signal sequence: predicted 1-21


Transmembrane domain: none predicted


PFAM domains: PDGF predicted 52-130


Summary: Likely a secreted protein; likely regulates angiogenesis by


interacting with FLT1 and FLK1.












MPVMRLFPCF LQLLAGLALP AVPPQQWALS AGNGSSEVEV VPFQEVWGRS YCRALERLVD
60






VVSEYPSEVE HMFSPSCVSL LRCTGCCGDE NLHCVPVETA NVTMQLLKIR SGDRPSYVEL
120





TFSQHVRCEC RPLREKMKPE RCGDAVPRR
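
The annotation fields in these entries (for example, "Signal sequence: predicted 1-21" and "PDGF predicted 52-130" for ACH3) appear to be 1-based, inclusive residue positions into the listed sequence. As a minimal sketch under that assumption, the following Python snippet shows one way a reader might strip the position counters from a listing and extract an annotated region. The helper names `parse_listing` and `region` are illustrative only and are not part of the original disclosure.

```python
# Sketch only: helper names are hypothetical, chosen for illustration.
# The listing text is the ACH3 sequence exactly as printed above.
ACH3_LISTING = """
MPVMRLFPCF LQLLAGLALP AVPPQQWALS AGNGSSEVEV VPFQEVWGRS YCRALERLVD
60
VVSEYPSEVE HMFSPSCVSL LRCTGCCGDE NLHCVPVETA NVTMQLLKIR SGDRPSYVEL
120
TFSQHVRCEC RPLREKMKPE RCGDAVPRR
"""


def parse_listing(text):
    """Join the 10-residue blocks, dropping whitespace and position counters."""
    return "".join(tok for tok in text.split() if not tok.isdigit())


def region(seq, start, end):
    """Return residues start..end, assuming 1-based inclusive coordinates."""
    return seq[start - 1:end]


seq = parse_listing(ACH3_LISTING)
signal = region(seq, 1, 21)    # annotated signal sequence
pdgf = region(seq, 52, 130)    # annotated PDGF domain
print(len(seq), signal, len(pdgf))
```

The same two helpers could be applied to any other entry in the listing, provided its coordinates follow the same 1-based, inclusive convention.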











ACH4 Protein sequence



Gene name: nidogen 2 (NID2)


Unigene number: Hs.82733


Probeset Accession #: D86425


Protein Accession #: NP_031387


Signal sequence: predicted 1-30


Transmembrane domain: none predicted


PFAM domains: EGF-like_domains predicted 489-524, 764-800,


806-843, 853-891, and 897-930; thyroglobulin_repeats


predicted 941-1006, and 1020-1085; LDL_receptor_repeats


predicted 1155-1197, 1199-1240, and 1242-1285.


Summary: A secreted protein; NID2 likely interacts with collagens I and


IV and laminin-1 to promote cell adhesion to the basement membrane.












MEGDRVAGRP VLSSLPVLLL LQLLMLRAAA LHPDELFPHG ESWWDQLLQE GDDVKLSRGE
60






AGESPALLTK PDSATSTWAP TASSPLRTSP GKRSMWTMIS PPTSRPSPLF WRTSTRATAE
120





AESCTERTPP PQCWAWPPAM CALASRALRA FYPHPRLPGH LGAGRRLRGG QTRALPSGEL
180





NTFQAVLASD GSDSYALFLY PANGLQFLGT RPKESYNVQL QLPARVGFCR GEADDLKSEG
240





PYFSLTSTEQ SVKNLYQLSN LGIPGVWAFH IGSTSPLDNV RPAAVGDLSA AHSSVPLGRS
300





FSHATALESD YNEDNLDYYD VNEEEAEYLP GEPEEALNGH SSIDVSFQSK VDTKPLEESS
360





TLDPHTKEGT SLGEVGGPDL KGQVEPWDER ETRSPAPPEV DRDSLAPSWE TPPPYPENGS
420





IQPYPDGGPV PSEMDVPPAH PEEEIVLRSY PASGHTTPLS RGTYEVGLED NIGSNTEVFT
480





YNAANKETCE HNHRQCSRHA FCTDYATGFC CHCQSKFYGN GKHCLPEGAP HRVNGKVSGH
540





LHVGHTPVHF TDVDLHAYIV GNDGRAYTAI SHIPQPAAQA LLPLTPIGGL FGWLFALEKP
600





GSENGFSLAG AAFTHDMEVT FYPGEETVRI TQTAEGLDPE NYLSIKTNIQ GQVPYVPANF
660





TAHISPYKEL YHYSDSTVTS TSSRDYSLTF GAINQTWSYR IHQNITYQVC RHAPRHPSFP
720





TTQQLNVDRV FALYNDEERV LRFAVTNQIG PVKEDSDPTP VNPCYDGSHM CDTTARCHPG
780





TGVDYTCECA SGYQGDGRNC VDENECATGF HRCGPNSVCI NLPGSYRCEC RSGYEFADDR
840





HTCILITPPA NPCEDGSHTC APAGQARCVH HGGSTFSCAC LPGYAGDGHQ CTDVDECSEN
900





RCHPAATCYN TPGSFSCRCQ PGYYGDGFQC IPDSTSSLTP CEQQQRHAQA QYAYPGARFH
960





IPQCDEQGNF LPLQCHGSTG FCWCVDPDGH EVPGTQTPPG STPPHCGPSP EPTQRPPTIC
1020





ERWRENLLEH YGGTPRDDQY VPQCDDLGNF IPLQCHGKSD FCWCVDKDGR EVQGTRSQPG
1080





TTPACIPTVA PPMVRPTPRP DVTPPSVGTF LLYTQGQQIG YLPLNGTRLQ ITAAKTLLSL
1140





HGSIIVGIDY DCRERMVYWT DVAGRTISPA GLELGAEPET IVNSGLISPE GLAIDHIRRT
1200





MYWTDSVLDK IESALLDGSE RKVLFYTDLV NPRAIAVDPI RGNLYWTDWN REAPKIETSS
1260





LDGENRRILI NTDIGLPNGL TFDPFSKLLC WADAGTKKLE CTLPDGTGRR VIQNNLKYPF
1320





SIVSYADHFY HTDWRRDGVV SVNKHSGQFT DEYLPEQRSH LYGITAVYPY CPTGRK











ACH5 Protein sequence



Gene name: SNL (singed-like; sea urchin fascin homolog-like)


Unigene number: Hs.118400


Probeset Accession #: U03057


Protein Accession #: NP_003079


Signal sequence: none identified


Transmembrane domain: none identified


PFAM domains: none identified


Summary: a cytoplasmic, actin-bundling protein that is likely to be


involved in the assembly of actin filament bundles present in


microspikes, membrane ruffles, and stress fibers












MTANGTAEAV QIQFGLINCG NKYLTAEAFG FKVNASASSL KKKQIWTLEQ PPDEAGSAAV
60






CLRSHLGRYL AADKDGNVTC EREVPGPDCR FLIVAHDDGR WSLQSEARRR YFGGTEDRLS
120





CFAQTVSPAE KWSVHIAMHP QVNIYSVTRK RYAHLSARPA DEIAVDRDVP WGVDSLITLA
180





FQDQRYSVQT ADHRFLRHDG RLVARPEPAT GYTLEFRSGK VAFRDCEGRY LAPSGPSGTL
240





KAGKATKVGK DELFALEQSC AQVVLQAANE RNVSTRQGMD LSANQDEETD QETFQLEIDR
300





DTKKCAFRTH TGKYWTLTAT GGVQSTASSK NASCYFDIEW RDRRITLRAS NGKFVTSKKN
360





GQLAASVETA GDSELFLMKL INRPIIVFRG EHGFIGCRKV TGTLDANRSS YDVFQLEFND
420





GAYNIKDSTG KYWTVGSDSA VTSSGDTPVD FFFEFCDYNK VAIKVGGRYL KGDHAGVLKA
480





SAETVDPASL WEY











ACH6 Protein sequence



Gene name: endothelial protein C receptor (EPCR; PROCR)


Unigene number: Hs.82353


Probeset Accession #: L35545


Protein Accession #: NP_006395


Signal sequence: predicted 1-17


Transmembrane domain: predicted 211-227


PFAM domains: none identified


Summary: a Type Ia membrane protein, EPCR likely binds to [thrombin]-


activated Protein C, a vitamin K-dependent serine protease zymogen


necessary for blood coagulation.












MLTTLLPILL LSGWAFCSQD ASDGLQRLHM LQISYFRDPY HVWYQGNASL GGHLTHVLEG
60






PDTNTTIIQL QPLQEPESWA RTQSGLQSYL LQFEGLVRLV HQERTLAFPL TIRCFLGCEL
120





PPEGSRAHVF FEVAVNGSSF VSFRPERALW QADTQVTSGV VTFTLQQLNA YNRTRYELRE
180





FLEDTCVQYV QKHISAENTK GSQTSRSYTS LVLGVLVGGF IIAGVAVGIF LCTGGRRC











ACH8 Protein sequence



Gene name: melanoma adhesion molecule (MCAM; MUC18)


Unigene number: Hs.211579


Probeset Accession #: D51069


Protein Accession #: NP_006491


Signal sequence: predicted 1-17


Transmembrane domain: predicted 559-575


PFAM domains: immunoglobulin_domains predicted 264-324, and 356-410.


Summary: a Type Ia membrane protein, associated with tumor progression


and the development of metastasis in human malignant melanoma, and may


play a role in neural crest cells during embryonic development.












MGLPRLVCAF LLAACCCCPR VAGVPGEAEQ PAPELVEVEV GSTALLKCGL SQSQGNLSHV
60






DWFSVHKEKR TLIFRVRQGQ GQSEPGEYEQ RLSLQDRGAT LALTQVTPQD ERIFLCQGKR
120





PRSQEYRIQL RVYKAPEEPN IQVNPLGIPV NSKEPEEVAT CVGRNGYPIP QVIWYKNGRP
180





LKEEKNRVHI QSSQTVESSG LYTLQSILKA QLVKEDKDAQ FYCELNYRLP SGNHMKESRE
240





VTVPVFYPTE KVWLEVEPVG MLKEGDRVEI RCLADGNPPP HFSISKQNPS TREAEEETTN
300





DNGVLVLEPA RKEHSGRYEC QAWNLDTMIS LLSEPQELLV NYVSDVRVSP AAPERQEGSS
360





LTLTCEAESS QDLEFQWLRE ETDQVLERGP VLQLHDLKRE AGGGYRCVAS VPSIPGLNRT
420





QLVKLAIFGP PWMAFKERKV WVKENMVLNL SCEASGHPRP TISWNVNGTA SEQDQDPQRV
480





LSTLNVLVTP ELLETGVECT ASNDLGKNTS ILFLELVNLT TLTPDSNTTT GLSTSTASPH
540





TRANSTSTER KLPEPESRGV VIVAVIVCIL VLAVLGAVLY FLYKKGKLPC RRSGKQEITL
600





PPSRKTELVV EVKSDKLPEE MGLLQGSSGD KRAPGDQGEK YIDLRH











ACH9 Protein sequence



Gene name: endothelin-1 (EDN1)


Unigene number: Hs.2271


Probeset Accession #: J05008


Protein Accession #: NP_001946


Signal sequence: predicted 1-17


Transmembrane domain: none predicted


PFAM domains: Endothelin domains predicted 59-73, and 108-129.


Summary: a secreted zymogen; the active protein is likely a 21-amino


acid peptide with potent mammalian vasoconstrictor activity; it is


necessary for normal vessel development.












MDYLLMIFSL LFVACQGAPE TAVLGAELSA VGENGGEKPT PSPPWRLRRS KRCSCSSLMD
60






KECVYFCHLD IIWVNTPEHV VPYGLGSPRS KRALENLLPT KATDRENRCQ CASQKDKKCW
120





NFCQAGKELR AEDIMEKDWN NHKKGKDCSK LGKKCIYQQL VRGRKIRRSS EEHLRQTRSE
180





TMRNSVKSSF HDPKLKGKPS RERYVTHNPA HW











ACJ1 Protein sequence



Gene name: BMX non-receptor tyrosine kinase


Unigene number: Hs.27372


Probeset Accession #: X83107


Protein Accession #: NP_001712


Signal sequence: none identified


Transmembrane domain: none identified


PFAM domains: pleckstrin_homology_domain predicted 6-111;


SH2_domain predicted 294-383; protein_kinase_domain


predicted 417-663.


Summary: a cytoplasmic protein; it likely plays a role in the growth


and differentiation of hematopoietic cells; it is also known to be


expressed in endothelial cells.












MDTKSILEEL LLKRSQQKKK MSPNNYKERL FVLTKTNLSY YEYDKMKRGS RKGSIEIKKI
60






RCVEKVNLEE QTPVERQYPF QIVYKDGLLY VYASNEESRS QWLKALQKEI RGNPHLLVKY
120





HSGFFVDGKF LCCQQSCKAA PGCTLWEAYA NLHTAVNEEK HRVPTFPDRV LKIPRAVPVL
180





KMDAPSSSTT LAQYDNESKK NYGSQPPSSS TSLAQYDSNS KKIYGSQPNF NMQYIPREDF
240





PDWWQVRKLK SSSSSEDVAS SNQKERNVNH TTSKISWEFP ESSSSEEEEN LDDYDWFAGN
300





ISRSQSEQLL RQKGKEGAFM VRNSSQVGMY TVSLFSKAVN DKKGTVKHYH VHTNAENKLY
360





LAENYCFDSI PKLIHYHQHN SAGMITRLRH PVSTKANKVP DSVSLGNGIW ELKREEITLL
420





KELGSGQFGV VQLGKWKGQY DVAVKMIKEG SMSEDEFFQE AQTMMKLSHP KLVKFYGVCS
480





KEYPIYIVTE YISNGCLLNY LRSHGKGLEP SQLLEMCYDV CEGMAFLESH QFIHRDLAAR
540





NCLVDRDLCV KVSDFGMTRY VLDDQYVSSV GTKFPVKWSA PEVFHYFKYS SKSDVWAFGI
600





LMWEVFSLGK QPYDLYDNSQ VVLKVSQGHR LYRPHLASDT IYQIMYSCWH ELPEKRPTFQ
660





QLLSSIEPLR EKDKH











ACJ4 Protein sequence



Gene name: prostaglandin G/H synthase 2 (COX-2; PGHS-2)


Unigene number: Hs.196384


Probeset Accession #: D28235


Protein Accession #: NP_000954


Signal sequence: predicted 1-17


Transmembrane domain: none identified


PFAM domains: EGF-like_domain predicted 18-55.


Summary: a microsomal enzyme; COX-2 is the therapeutic target of the


nonsteroidal anti-inflammatory drugs (NSAIDs), such as aspirin.












MLARALLLCA VLALSHTANP CCSHPCQNRG VCMSVGFDQY KCDCTRTGFY GENCSTPEFL
60






TRIKLFLKPT PNTVHYILTH FKGFWNVVNN IPFLRNAIMS YVLTSRSHLI DSPPTYNADY
120





GYKSWEAFSN LSYYTRALPP VPDDCPTPLG VKGKKQLPDS NEIVEKLLLR RKFIPDPQGS
180





NMMFAFFAQH FTHQFFKTDH KRGPAFTNGL GHGVDLNHIY GETLARQRKL RLFKDGKMKY
240





QIIDGEMYPP TVKDTQAEMI YPPQVPEHLR FAVGQEVFGL VPGLMMYATI WLREHNRVCD
300





VLKQEHPEWG DEQLFQTSRL ILIGETIKIV IEDYVQHLSG YHFKLKFDPE LLFNKQFQYQ
360





NRIAAEFNTL YHWHPLLPDT FQIHDQKYNY QQFIYNNSIL LEHGITQFVE SFTRQIAGRV
420





AGGRNVPPAV QKVSQASIDQ SRQMKYQSFN EYRKRFMLKP YESFEELTGE KEMSAELEAL
480





YGDIDAVELY PALLVEKPRP DAIFGETMVE VGAPFSLKGL MGNVICSPAY WKPSTFGGEV
540





GFQIINTASI QSLICNNVKG CPFTSFSVPD PELIKTVTIN ASSSRSGLDD INPTVLLKER
600





STEL











ACJ6 Protein sequence



Gene name: SEC14-like-1


Unigene number: Hs.75232


Probeset Accession #: D67029


Protein Accession #: NP_002994


Signal sequence: none identified


Transmembrane domain: none identified


PFAM domains: none identified


Summary: a cytoplasmic protein












MVQKYQSPVR VYKYPFELIM AAYERRFPTC PLIPMFVGSD TVSEFKSEDG AIHVIERRCK
60






LDVDAPRLLK KIAGVDYVYF VQKNSLNSRE RTLHIEAYNE TFSNRVIINE HCCYTVHPEN
120





EDWTCFEQSA SLDIKSFFGF ESTVEKIAMK QYTSNIKKGK EIIEYYLRQL EEEGITFVPR
180





WSPPSITPSS ETSSSSSKKQ AASMAVVIPE AALKEGLSGD ALSSPSAPEP VVGTPDDKLD
240





ADHIKRYLGD LTPLQESCLI RLRQWLQETH KGKIPKDEHI LRFLRARDFN IDKAREIMCQ
300





SLTWRKQHQV DYILETWTPP QVLQDYYAGG WHHHDKDGRP LYVLRLGQMD TKGLVRALGE
360





EALLRYVLSV NEERLRRCEE NTKVFGRPIS SWTCLVDLEG LNMRHLWRPG VKALLRIIEV
420





VEANYPETLG RLLILRAPRV FPVLWTLVSP FIDDNTRRKF LIYAGNDYQG PGGLLDYIDK
480





EIIPDFLSGE CMCEVPEGGL VPKSLYRTAE ELENEDLKLW TETIYQSASV FKGAPHEILI
540





QIVDASSVIT WDFDVCKGDI VFNIYHSKRS PQPPKKDSLG AHSITSPGGN NVQLIDKVWQ
600





LGRDYSMVES PLICKEGESV QGSHVTRWPG FYILQWKFHS MPACAASSLP RVDDVLASLQ
660





VSSHKCKVMY YTEVIGSEDF RGSMTSLESS HSGFSQLSAA TTSSSQSHSS SMISR











ACJ8 Protein sequence



Gene name: intercellular adhesion molecule 1 (ICAM1; CD54)


Unigene number: Hs.168383


Probeset Accession #: M24283


Protein Accession #: NP_000192


Signal sequence: predicted 1-27


Transmembrane domain: predicted 481-497


PFAM domains: immunoglobulin_domains predicted 128-188, and 325-373.


Summary: a Type Ia membrane protein; ICAM1 is typically expressed on


endothelial cells and cells of the immune system; ICAM1 binds to


integrins of type CD11a/CD18, or CD11b/CD18; ICAM1 is also exploited


by Rhinovirus as a receptor.












MAPSSPRPAL PALLVLLGAL FPGPGNAQTS VSPSKVILPR GGSVLVTCST SCDQPKLLGI
60






ETPLPKKELL LPGNNRKVYE LSNVQEDSQP MCYSNCPDGQ STAKTFLTVY WTPERVELAP
120





LPSWQPVGKN LTLRCQVEGG APRANLTVVL LRGEKELKRE PAVGEPAEVT TTVLVRRDHH
180





GANFSCRTEL DLRPQGLELF ENTSAPYQLQ TFVLPATPPQ LVSPRVLEVD TQGTVVCSLD
240





GLFPVSEAQV HLALGDQRLN PTVTYGNDSF SAKASVSVTA EDEGTQRLTC AVILGNQSQE
300





TLQTVTIYSF PAPNVILTKP EVSEGTEVTV KCEAHPRAKV TLNGVPAQPL GPRAQLLLKA
360





TPEDNGRSFS CSATLEVAGQ LIHKNQTREL RVLYGPRLDE RDCPGNWTWP ENSQQTPMCQ
420





AWGNPLPELK CLKDGTFPLP IGESVTVTRD LEGTYLCRAR STQGEVTREV TVNVLSPRYE
480





IVIITVVAAA VIMGTAGLST YLYNRQRKIK KYRLQQAQKG TPMKPNTQAT PP











ACK3 Protein sequence



Gene name: angiopoietin 1 receptor (TIE-2; TEK)


Unigene number: Hs.89640


Probeset Accession #: L06139


Protein Accession #: NP_000450


Signal sequence: predicted 1-18


Transmembrane domain: predicted 746-770


PFAM domains: immunoglobulin_domains predicted 44-102, 370-424;


EGF_like_domains predicted 210-252, 254-299, and 301-341;


FN3_domains predicted 444-536, 541-634, and 638-732;


protein_kinase_domain predicted 824-1096.


Summary: a Type Ia membrane protein; it is expressed almost exclusively


in endothelial cells in mice, rats, and humans; the ligand for this


receptor is angiopoietin-1; defects in TEK are associated with inherited


venous malformations; the TEK signaling pathway appears to be critical


for endothelial cell-smooth muscle cell communication in venous


morphogenesis.












MDSLASLVLC GVSLLLSGTV EGAMDLILIN SLPLVSDAET SLTCIASGWR PEEPITIGRD
60






FEALMNQHQD PLEVTQDVTR EWAKKVVWKR EKASKINGAY FCEGRVRGEA IRIRTMKMRQ
120





QASFLPATLT MTVDKGDNVN ISFKKVLIKE EDAVIYKNGS FIHSVPRHEV PDILEVELPH
180





AQPQDAGVYS ARYIGGNLFT SAFTRLIVRR CEAQKWGPEC NHLCTACMNN GVCHEDTGEC
240





ICPPGFMGRT CEKACELHTF GRTCKERCSG QEGCKSYVFC LPDPYGCSCA TGWKGLQCNE
300





ACHPGFYGPD CKLRCSCNNG EMCDRFQGCL CSPGWQGLQC EREGIPRMTP KIVDLPDHIE
360





VNSGKFNPIC KASGWPLPTN EEMTLVKPDG TVLHPKDFNH TDHFSVAIFT IHRILPPDSG
420





VWVCSVNTVA GMVEKPFNIS VKVLPKPLNA PNVIDTGHNF AVINISSEPY FGDGPIKSKK
480





LLYKPVNHYE AWQHIQVTNE IVTLNYLEPR TEYELCVQLV RRGEGGEGHP GPVRRFTTAS
540





IGLPPPRGLN LLPKSQTTLN LTWQPIFPSS EDDFYVEVER RSVQKSDQQN IKVPGNLTSV
600





LLNNLHPREQ YVVRARVNTK AQGEWSEDLT AWTLSDILPP QPENIKISNI THSSAVISWT
660





ILDGYSISSI TIRYKVQGKN EDQHVDVKIK NATIIQYQLK GLEPETAYQV DIFAENNIGS
720





SNPAFSHELV TLPESQAPAD LGGGKMLLIA ILGSAGMTCL TVLLAFLIIL QLKRANVQRR
780





MAQAFQNVRE EPAVQFNSGT LALNRKVKNN PDPTIYPVLD WNDIKFQDVI GEGNFGQVLK
840





ARIKKDGLRM DAAIKRMKEY ASKDDHRDFA GELEVLCKLG HHPNIINLLG ACEHRGYLYL
900





AIEYAPHGNL LDFLRKSRVL ETDPAFAIAN STASTLSSQQ LLHFAADVAR GMDYLSQKQF
960





IHRDLAARNI LVGENYVAKI ADFGLSRGQE VYVKKTMGRL PVRWMAIESL NYSVYTTNSD
1020





VWSYGVLLWE IVSLGGTPYC GMTCAELYEK LPQGYRLEKP LNCDDEVYDL MRQCWREKPY
1080





ERPSFAQILV SLNRMLEERK TYVNTTLYEK FTYAGIDCSA EEAA











PZA6 Protein sequence



Gene name: prostate differentiation factor (PLAB; MIC-1)


Unigene number: Hs.116577


Probeset Accession #: AB000584


Protein Accession #: NP_004855


Signal sequence: predicted 1-29


Transmembrane domain: none identified


PFAM domains: TGF-beta_domain predicted 211-308.


Summary: a secreted protein; its exact function is unclear; it inhibits


proliferation of primitive hematopoietic progenitors; it inhibits


activation of macrophages; it is highly expressed in placenta and in


serum of pregnant women; it may promote fetal survival by suppressing


the production of maternally-derived proinflammatory cytokines within


the uterus.












MPGQELRTVN GSQMLLVLLV LSWLPHGGAL SLAEASRASF PGPSELHSED SRFRELRKRY
60






EDLLTRLRAN QSWEDSNTDL VPAPAVPILT PEVRLGSGGH LHLRISRAAL PEGLPEASRL
120





HRALFRLSPT ASRSWDVTRP LRRQLSLARP QAPALHLRLS PPPSQSDQLL AESSSARPQL
180





ELHLRPQAAR GRRRARARNG DDCPLGPGRC CRLHTVRASL EDLGWADWVL SPREVQVTNC
240





IGACPSQFRA ANMHAQIKTS LHRLKPDTEP APCCVPASYN PMVLIQKTDT GVSLQTYDDL
300





LAKDCHCI











AAD2 Protein sequence:



Gene name: Thrombospondin-1


Unigene number: Hs.87409


Probeset Accession #: AA232645


Protein Accession #: NP_003237.1


Signal sequence: predicted 1-18 (first underlined sequence)


Transmembrane Domain: none identified


Summary: Thrombospondin is a large modular glycoprotein component of the


extracellular matrix and contains a variety of distinct domains,


including three repeating subunits (types I, II, and III) that share


homology with an assortment of other proteins.














MGLAWGLGVL FLMRVCGTNR IPESGGDNSV FDIFELTGAA RKGSGRRLVK GPDPSSPAFR
60






IEDANLIPPV PDDKFQDLVD AVRAEKGFLL LASLRQMKKT RGTLLALERK DHSGQVFSVV
120





SNGKAGTLDL SLTVQGKQHV VSVEEALLAT GQWKSITLFV QEDRAQLYID CEKMENAELD
180





VPIQSVFTRD LASIARLRIA KGGVNDNFQG VLQNVRFVFG TTPEDILRNK GCSSSTSVLL
240





TLDNNVVNGS SPAIRTNYIG HKTKDLQAIC GISCDELSSM VLELRGLRTI VTTLQDSIRK
300





VTEENKELAN ELRRPPLCYH NGVQYRNNEE WTVDSCTECH CQNSVTICKK VSCPIMPCSN
360





ATVPDGECCP RCWPSDSADD GWSPWSEWTS CSTSCGNGIQ QRGRSCDSLN NRCEGSSVQT
420





RTCHIQECDK RFKQDGGWSH WSPWSSCSVT CGDGVITRIR LCNSPSPQMN GKPCEGEARE
480





TKACKKDACP INGGWGPWSP WDICSVTCGG GVQKRSRLCN NPAPQFGGKD CVGDVTENQI
540





CNKQDCPIDG CLSNPCFAGV KCTSYPDGSW KCGACPPGYS GNGIQCTDVD ECKEVPDACF
600





NHNGEHRCEN TDPGYNCLPC PPRFTGSQPF GQGVEHATAN KQVCKPRNPC TDGTHDCNKN
660





AKCNYLGHYS DPMYRCECKP GYAGNGIICG EDTDLDGWPN ENLVCVANAT YHCKKDNCPN
720





LPNSGQEDYD KDGIGDACDD DDDNDKIPDD RDNCPFHYNP AQYDYDRDDV GDRCDNCPYN
780





HNPDQADTDN NGEGDACAAD IDGDGILNER DNCQYVYNVD QRDTDMDGVG DQCDNCPLEH
840





NPDQLDSDSD RIGDTCDNNQ DIDEDGHQNN LDNCPYVPNA NQADHDKDGK GDACDHDDDN
900





DGIPDDKDNC RLVPNPDQKD SDGDGRGDAC KDDFDHDSVP DIDDICPENV DISETDFRRF
960





QMIPLDPKGT SQNDPNWVVR HQGKELVQTV NCDPGLAVGY DEFNAVDFSG TFFINTERDD
1020





DYAGFVFGYQ SSSRFYVVMW KQVTQSYWDT NPTRAQGYSG LSVKVVNSTT GPGEHLRNAL
1080





WHTGNTPGQV RTLWHDPRHI GWKDFTAYRW RLSHRPKTGF IRVVMYEGKK IMADSGPIYD
1140





KTYAGGRLGL FVFSQEMVFF SDLKYECRDP











AAD9 protein sequence



Gene name: LIM homeobox protein cofactor (CLIM-1)


Unigene number: Hs.4980


Probeset Accession #: F13782


Protein Accession #: AAC83552


Pfam: LIM bind


Transmembrane Domain: none identified


Summary: The LIM homeodomain (LIM-HD) proteins, which contain two


tandem LIM domains followed by a homeodomain, are critical


transcriptional regulators of embryonic development. The LIM domain is


a conserved cysteine-rich zinc-binding motif found in LIM-HD proteins,


cytoskeletal components, LIM kinases, and other proteins. LIM domains


are protein-protein interaction motifs, can inhibit binding of LIM-HD


proteins to DNA, and can negatively regulate LIM-HD protein function.












MSSTPHDPFY SSPFGPFYRR HTPYMVQPEY RIYEMNKRLQ SRTEDSDNLW WDAFATEFFE
60






DDATLTLSFC LEDGPKRYTI GRTLIPRYFS TVFEGGVTDL YYILKHSKES YHNSSITVDC
120





DQCTMVTQHG KPMFTKVCTE GRLILEFTFD DLMRIKTWHF TIRQYRELVP RSILANHAQD
180





PQVLDQLSKN ITRMGLTNFT LNYLRLCVIL EPMQELMSRH KTYNLSPRDC LKTCLFQKWQ
240





RMVAPPAEPT RQPTTKRRKR KNSTSSTSNS SAGNNANSTG SKKKTTAANL SLSSQVPDVM
300





VVGEPTLMGG EFGDEDERLI TRLENTQYDA ANGMDDEEDF NNSPALGNNS PWNSKPPATQ
360





ETKSENPPPQ ASQ











AAE1 protein sequence



Gene name: guanine nucleotide binding protein 11


Unigene number: Hs.83381


Probeset Accession #: U31384


Protein Accession #: NP_004117.1


Pfam: G-gamma; CAAX motif (farnesylation site) prediction underlined


Summary: The G gamma proteins are a component of the trimeric G-proteins


that interact with cell surface receptors. The G protein beta and gamma


subunits directly regulate the activities of various enzymes and ion


channels after receptor ligation. Unlike most of the other known gamma


subunits, gamma 11 is modified by a farnesyl group and is not capable


of interacting with beta 2.












MPALHIEDLP EKEKLKMEVE QLRKEVKLQR QQVSKCSEEI KNYIEERSGE DPLVKGIPED
60






KNPFKEKGSC VIS











AAE2 protein sequence



Gene name: Transcription factor 4 (Immunoglobulin transcription factor


2) (ITF-2) (SL3-3 Enhancer factor 2) (SEF-2)


Unigene number: Hs.289068


Probeset Accession #: M74719


Protein Accession #: NP_003190.1


Pfam: HLH domain prediction underlined


Summary: Transcription factor 4 is a helix-loop-helix (HLH) protein


which belongs to a family of nuclear proteins, designated SL3-3


enhancer factors 2 (SEF2), that interact with an Ephrussi box-like


motif within the glucocorticoid response element in the enhancer of


the murine leukemia virus SL3-3. Various cell types display differences


both in the sets of SEF2-DNA complexes formed and in their amounts.


Molecular analysis of cDNA clones shows the existence of multiple


related mRNA species containing alternative coding regions, which are


most probably a result of differential splicing.












MHHQQRMAAL GTDKELSDLL DFSAMFSPPV SSGKNGPTSL ASGHFTGSNV EDRSSSGSWG
60






NGGHPSPSRN YGDGTPYDHM TSRDLGSHDN LSPPFVNSRI QSKTERGSYS SYGRESNLQG
120





CHQQSLLGGD MDMGNPGTLS PTKPGSQYYQ YSSNNPRRRP LHSSAMEVQT KKVRKVPPGL
180





PSSVYAPSAS TADYNRDSPG YPSSKPATST FPSSFFMQDG HHSSDPWSSS SGMNQPGYAG
240





MLGNSSHIPQ SSSYCSLEPH ERLSYPSHSS ADINSSLPPM STFHRSGTNH YSTSSCTPPA
300





NGTDSIMANR GSGAAGSSQT GDALGKALAS IYSPDHTNNS FSSNPSTPVG SPPSLSAGTA
360





VWSRNGGQAS SSPNYEGPLH SLQSRIEDRL ERLDDAIHVL RNHAVGPSTA MGGHGDMHG
420





IIGPSHNGAM GGLGSGYGTG LLSANRHSLM VGTHREDGVA LRGSHSLLPN QVPVPQLPVQ
480





SATSPDLNPP QDPYRGMPPG LQGQSVSSGS SEIKSDDEGD ENLQDTKSSE DKKLDDDKKD
540







IKSITSNNDD EDLTPEQKAE REKER
RMANN ARERLRVRDI NEAFKELGRM VQLHLKSDKP

600





QTKLLILHQA VAVILSLEQQ VRERNLNPKA ACLKRREEEK VSSEPPPLSL AGPHPGMGDA
660





SNHMGQM











AAE4 protein sequence



Gene name: phosphatidylcholine 2-acylhydrolase


Unigene number: Hs.211587


Probeset Accession #: M68874


Protein Accession #: AAA60105.1


Pfam: PLA2 B, C2 domain prediction underlined


Summary: Phospholipases A2 (PLA2s) play a key role in inflammatory


processes through production of precursors of eicosanoids and


platelet-activating factor. PLA2 is a 100 kDa protein that contains a


structural element homologous to the C2 region of protein kinase C.












MSFIDPYQHI IVEHQYSHKF TVVVLRATKV TKGAFGDMLD TPDPYVELFI STTPDSRKRT
60








RHFNNDINPV WNETFEFILD PNQENVLEIT LMDANYVMDE TLGTAT
FTVS SMKVGEKKEV

120





PFIFNQVTEM VLEMSLEVCS CPDLRFSMAL CDQEKTFRQQ RKEHIRESMK KLLGPKNSEG
180





LHSARDVPVV AILGSGGGFR AMVGFSGVMK ALYESGILDC ATYVAGLSGS TWYMSTLYSH
240





PDFPEKGPEE INEELMKNVS HNPLLLLTPQ KVKRYVESLW KKKSSGQPVT FTDIFGMLIG
300





ETLIHNRMNT TLSSLKEKVN TAQCPLPLFT CLHVKPDVSE LMFADWVEFS PYEIGMAKYG
360





TFMAPDLFGS KFFMGTVVKK YEENPLHFLM GVWGSAFSIL FNRVLGVSGS QSRGSTMEEE
420





LENITTKHIV SNDSSDSDDE SHEPKGTENE DAGSDYQSDN QASWIHRMIM ALVSDSALFN
480





TREGRAGKVH NFMLGLNLNT SYPLSPLSDF ATQDSFDDDE LDAAVADPDE FERIYEPLDV
540





KSKKIHVVDS GLTFNLPYPL ILRPQRGVDL IISFDFSARP SDSSPPFKEL LLAEKWAKMN
600





KLPFPKIDPY VFDREGLKEC YVFKPKNPDM EKDCPTIIHF VLANINFRKY KAPGVPRETE
660





EEKEIADFDI FDDPESPFST FNFQYPNQAF KRLHDLMHFN TLNNIDVIKE AMVESIEYRR
720





QNPSRCSVSL SNVEARRFFN KEFLSKPKA











ACA1 protein sequence



Gene name: tissue factor pathway inhibitor 2 (TFPI2), placental protein 5 (PP5)


Unigene number: Hs.78045


Probeset Accession #: D29992


Protein Accession #: BAA06272.1


Pfam: Kunitz BPTI


Signal sequence: underlined


Summary: ACA1 is a serine proteinase inhibitor that was originally purified from conditioned medium of the human glioblastoma cell line T98G. ACA1 is identical to placental protein 5 (PP5) and TFPI2, a placenta-derived glycoprotein with serine proteinase inhibitor activity. PP5 belongs to the Kunitz-type serine proteinase inhibitor family, having three putative Kunitz-type inhibitor domains.














MDPARPLGLS ILLLFLTEAA LG
DAAQEPTG NNAEICLLPL DYGPCRALLL RYYYDRYTQS

60






CRQFLYGGCE GNANNFYTWE ACDDACWRIE KVPKVCRLQV SVDDQCEGST EKYFFNLSSM
120





TCEKFFSGGC HRNRIENRFP DEATCMGFCA PKKIPSFCYS PKDEGLCSAN VTRYYFNPRY
180





RTCDAFTYTG CGGNDNNFVS REDCKRACAK ALKKKKKMPK LRFASRIRKI RKKQF











ACB8 protein sequence



Gene name: myosin X


Unigene number: Hs.61638


Probeset Accession #: N77151


Protein Accession #: NP_036466


Pfam: myosin head, IQ (calmodulin binding motif), PH, MyTH4


Summary: Myosins are molecular motors that move along filamentous actin. Seven classes of myosin are expressed in vertebrates: conventional myosin, or myosin-II, as well as the 6 unconventional myosin classes -I, -V, -VI, -VII, -IX, and -X.












MDNFFTEGTR VWLRENGQHF PSTVNSCAEG IVVFRTDYGQ VFTYKQSTIT HQKVTAMHPT
60






NEEGVDDMAS LTELHGGSIM YNLFQRYKRN QIYTYIGSIL ASVNPYQPIA GLYEPATMEQ
120





YSRRHLGELP PHIFAIANEC YRCLWKRYDN QCILISGESG AGKTESTKLI LKFLSVISQQ
180





SLELSLKEKT SCVERAILES SPIMEAFGNA KTVYNNNSSR FGKFVQLNIC QKGNIQGGRI
240





VDYLLEKNRV VRQNPGERNY HIFYALLAGL EHEEREEFYL STPENYHYLN QSGCVEDKTI
300





SDQESFREVI TANDVMQFSK EEVREVSRLL AGILHLGNIE FITAGGAQVS FKTALGRSAE
360





LLGLDPTQLT DALTQRSMFL RGEEILTPLN VQQAVDSRDS LAMALYACCF EWVIKKINSR
420





IKGNEDFKSI GILDIFGFEN FEVNHFEQFN INYANEKLQE YFNKHIFSLE QLEYSREGLV
480





WEDIDWIDNG ECLDLIEKKL GLLALINEES HFPQATDSTL LEKLHSQHAN NHFYVKPRVA
540





VNNFGVKHYA GEVQYDVRGI LEKNRDTFRD DLLNLLRESR FDFIYDLFEH VSSRNNQDTL
600





KCGSKHRRPT VSSQFKDSLH SLMATLSSSN PFFVRCIKPN MQKMPDQFDQ AVVLNQLRYS
660





GMLETVRIRK AGYAVRRPFQ DFYKRYKVLM RNLALPEDVR GKCTSLLQLY DASNSEWQLG
720





KTKVFLRESL EQKLEKRREE EVSHAAMVIR AHVLGFLARK QYRKVLYCVV IIQKNYRAFL
780





LRRRFLHLKK AAIVFQKQLR GQIARRVYRQ LLAEKREQEE KKKQEEEEKK KREEEERERE
840





RERREAELRA QQEEETRKQQ ELEALQKSQK EAELTRELEK QKENKQVEEI LRLEKEIEDL
900





QRMKEQQELS LTEASLQKLQ ERRDQELRRL EEEACRAAQE FLESLNFDEI DECVRNIERS
960





LSVGSEFSSE LAESACEEKP NFNFSQPYPE EEVDEGFEAD DDAFKDSPNP SEHGHSDQRT
1020





SGIRTSDDSS EEDPYNNDTV VPTSPSADST VLLAPSVQDS GSLHNSSSGE STYCMPQNAG
1080





DLPSPDGDYD YDQDDYEDGA ITSGSSVTFS NSYGSQWSPD YRCSVGTYNS SGAYRFSSEG
1140





AQSSFEDSEE DFDSRFDTDD ELSYRRDSVY SCVTLPYFHS FLYMKGGLMN SWKRRWCVLK
1200





DETFLWFRSK QEALKQGWLH KKGGGSSTLS RRNWKKRWFV LRQSKLMYFE NDSEEKLKGT
1260





VEVRTAKEII DNTTKENGID IIMADRTFHL IAESPEDASQ WFSVLSQVHA STDQEIQEMH
1320





DEQANPQNAV GTLDVGLIDS VCASDSPDRP NSFVIITANR VLHCNADTPE EMHHWITLLQ
1380





RSKGDTRVEG QEFIVRGWLH KEVKNSPKMS SLKLKKRWFV LTHNSLDYYK SSEKNALKLG
1440





TLVLNSLCSV VPPDEKIFKE TGYWNVTVYG RKHCYRLYTK LLNEATRWSS AIQNVTDTKA
1500





PIDTPTQQLI QDIKENCLNS DVVEQIYKRN PILRYTHHPL HSPLLPLPYG DINLNLLKDK
1560





GYTTLQDEAI KIFNSLQQLE SMSDPIPIIQ GILQTGHDLR PLRDELYCQL IKQTNKVPHP
1620





GSVGNLYSWQ ILTCLSCTFL PSRGILKYLK FHLKRIREQF PGTEMEKYAL FTYESLKKTK
1680





CREFVPSRDE IEALIHRQEM TSTVYCHGGG SCKITINSHT TAGEVVEKLI RGLAMEDSRN
1740





MFALFEYNGH VDKAIESRTV VADVLAKFEK LAATSEVGDL PWKFYFKLYC FLDTDNVPKD
1800





SVEFAFMFEQ AHEAVIHGHH PAPEENLQVL AALRLQYLQG DYTLHAAIPP LEEVYSLQRL
1860





KARISQSTKT FTPCERLEKR RTSFLEGTLR RSFRTGSVVR QKVEEEQMLD MWIKEEVSSA
1920





RASIIDKWRK FQGNNQEQAM AKYMALIKEW PGYGSTLFDV ECKEGGFPQE LWLGVSADAV
1980





SVYKRGEGRP LEVFQYEHIL SFGAPLANTY KIVVDERELL FETSEVVDVA KLMKAYISMI
2040





VKKRYSTTRS ASSQGSSR











ACC3 protein sequence



Gene name: calcitonin receptor-like (CALCRL)


Unigene number: Hs.152175


Probeset Accession #: L76380


Protein Accession #: NP_005786.1


Pfam: 7TM 2 (7 transmembrane receptor (Secretin family))


Transmembrane domains: predictions underlined


Signal sequence: first underlined region


Summary: Calcitonin gene-related peptide (CGRP) is a neuropeptide with diverse biological effects including potent vasodilator activity. The human CGRP1 receptor shares significant peptide sequence homology with the human calcitonin receptor, a member of the G-protein-coupled receptor superfamily. Stable expression in 293 (HEK 293) cells produces specific, high-affinity binding sites for CGRP. Exposure of these cells to CGRP results in a 60-fold increase in cAMP production.














MEKKCTLYFL VLLPFFMILV TAE
LEESPED SIQLGVTRNK IMTAQYECYQ KIMQDPIQQA

60






EGVYCNRTWD GWLCWNDVAA GTESMQLCPD YFQDFDPSEK VTKICDQDGN WFRHPASNRT
120





WTNYTQCNVN THEKVKTALN LFYLTIIGHG LSIASLLISL GIFFYFKSLS CQRITLHKNL
180





FFSFVCNSVV TIIHLTAVAN NQALVATNPV SCKVSQFIHL YLMGCNYFWM LCEGIYLHTL
240







IVVAVFAEKQ
 HLMWYYFLGW GFPLIPACIH AIARSLYYND NCWISSDTHL LYIIHGPICA

300







ALLVNLFFLL NIVRVLIT
KL KVTHQAESNL YMKAVRATLI LVPLLGIEFV LIPWRPEGKI

360





AEEVYDYIMH ILMHFQGLLV STIFCFFNGE VQAILRRNWN QYKIQFGNSF SNSEALRSAS
420





YTVSTISDGP GYSHDCPSEH L&GKSIHDIE NVLLKPENLY N











ACC5 protein sequence



Gene name: Selectin E (endothelial adhesion molecule 1)


Unigene number: Hs.89546


Probeset Accession #: M24736


Protein Accession #: NP_000441.1


Pfam: lectin c, EGF like domain, sushi (SCR domain)


Signal sequence: first underlined region


Transmembrane domain: second underlined region


Summary: Focal adhesion of leukocytes to the blood vessel lining is a key step in inflammation and certain vascular disease processes. Endothelial leukocyte adhesion molecule-1 (ELAM-1), a cell surface glycoprotein expressed by cytokine-activated endothelium, mediates the adhesion of blood neutrophils. The primary sequence of ELAM-1 predicts an amino-terminal lectin-like domain, an EGF domain, and six tandem repetitive motifs (about 60 amino acids each) related to those found in complement regulatory proteins. A similar domain structure is also found in the MEL-14 lymphocyte cell surface homing receptor, and in granule-membrane protein 140, a membrane glycoprotein of platelet and endothelial secretory granules that can be rapidly mobilized (less than 5 minutes) to the cell surface by thrombin and other stimuli. Thus, ELAM-1 may be a member of a nascent gene family of cell surface molecules involved in the regulation of inflammatory and immunological events at the interface of vessel wall and blood.














MIASQFLSAL TLVLLIKESG AW
SYNTSTEA MTYDEASAYC QQRYTHLVAI QNKEEIEYLN

60






SILSYSPSYY WIGIRKVNNV WVWVGTQKPL TEEAKNWAPG EPNNRQKDED CVEIYIKREK
120





DVGMWNDERC SKKKLALCYT AACTNTSCSG HGECVETINN YTCKCDPGFS GLKCEQIVNC
180





TALESPEHGS LVCSHPLGNF SYNSSCSISC DRGYLPSSME TMQCMSSGEW SAPIPACNVV
240





ECDAVTNPAN GFVECFQNPG SFPWNTTCTF DCEEGFELMG AQSLQCTSSG NWDNEKPTCK
300





AVTCRAVRQP QNGSVRCSHS PAGEFTFKSS CNFTCEEGFM LQGPAQVECT TQGQWTQQIP
360





VCEAFQCTAL SNPERGYMNC LPSASGSFRY GSSCEFSCEQ GFVLKGSKRL QCGPTGEWDN
420





EKPTCEAVRC DAVHQPPKGL VRCAHSPIGE FTYKSSCAFS CEEGFELYGS TQLECTSQGQ
480





WTEEVPSCQV VKCSSLAVPG KINMSCSGEP VFGTVCKFAC PEGWTLNGSA ARTCGATGHW
540





SGLLPTCEAP TESNIPLVAG LSAAGLSLLT LAPFLLWLRK CLRKAKKFVP ASSCQSLESD
600





GSYQKPSYIL











ACC8 protein sequence



Gene name: Chemokine (C-X-C motif), receptor 4 (fusin)


Unigene number: Hs.89414


Probeset Accession #: L06797


Protein Accession #: NP_003458.1


Pfam: 7TM 1 (7 transmembrane receptor (rhodopsin family))


Signal sequence: none identified


Transmembrane domains: predictions underlined


Summary: The chemokine receptor CXCR4 (also designated fusin and LESTR) is a cofactor for fusion and entry of T cell-tropic strains of HIV-1.












MEGISIYTSD NYTEEMGSGD YDSMKEPCFR EENANFNKIF LPTIYSIIFL TGIVGNGLVI
60








LV
MGYQKKLR SMTDKYRLHL SVADLLFVIT LPFWAVDAVA NWYFGNFLCK AVHVIYTVNL

120







YSSVLILAFI SL
DRYLAIVH ATNSQRPRKL LAEKVVYVGV WIPALLLTIP DFIFANVSEA

180





DDRYICDRFY PNDLWVVVFQ FQHIMVGLIL PGIVILSCYC IIISKLSHSK GHQKRKALKT
240





TVILILAFFA CWLPYYIGIS IDSFILLEII KQGCEFENTV HKWISITEAL AFFHCCLNPI
300







LYAFL
GAKFK TSAQHALTSV SRGSSLKILS KGKRGGHSSV STESESSSFH SS












ACF2 protein sequence



Gene name: Endothelial cell-specific molecule 1


Unigene number: Hs.41716


Probeset Accession #: X89426


Protein Accession #: NP_008967.1


Signal sequence: underlined


Pfam: IGFBP (Insulin-like growth factor binding proteins)


Summary: Human endothelial cell-specific molecule (called ESM-1) was cloned from a human umbilical vein endothelial cell (HUVEC) cDNA library. Constitutive ESM-1 gene expression is seen in HUVECs but not in other human cell lines. The cDNA sequence contains an open reading frame of 552 nucleotides and a 1398-nucleotide 3′-untranslated region including several domains involved in mRNA instability and five putative polyadenylation consensus sequences. The deduced 184-amino acid sequence defines a cysteine-rich protein with a functional NH2-terminal hydrophobic signal sequence.
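The nucleotide and amino acid counts quoted in this summary can be cross-checked with simple arithmetic. The sketch below is illustrative only. It assumes that the 552-nucleotide figure counts coding codons without a stop codon, which the summary does not state, and the helper name `codons_in_orf` is hypothetical.

```python
# Cross-check of the ESM-1 figures quoted in the summary.
# Assumption (not stated in the source): the 552 nt exclude the stop codon.
ORF_NUCLEOTIDES = 552
CODON_LENGTH = 3


def codons_in_orf(nucleotides: int) -> int:
    """Return the number of whole codons in an open reading frame."""
    if nucleotides % CODON_LENGTH != 0:
        raise ValueError("ORF length is not a multiple of three")
    return nucleotides // CODON_LENGTH


residues = codons_in_orf(ORF_NUCLEOTIDES)
print(residues)  # 184
```

Under that assumption the result agrees with the 184-amino acid length given for the deduced protein.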














MKSVLLLTTL LVPAHLVAA
W SNNYAVDCPQ HCDSSECKSS PRCKRTVLDD CGCCRVCAAG

60






RGETCYRTVS GMDGMKCGPG LRCQPSNGED PFGEEFGICK DCPYGTFGMD CRETCNCQSG
120





ICDRGTGKCL KFPFFQYSVT KSSNRFVSLT EHDMASGDGN IVREEVVKEN AAGSPVMRKW
180





LNPR











ACF4 protein sequence



Gene name: P53-responsive gene 2 similar to D. melanogaster peroxidasin (U11052)


Unigene number: Hs.118893


Probeset Accession #: D86983


Protein Accession #: BAA13219


Pfam: LRRNT (Leucine rich repeat N-terminal domain), LRR (Leucine Rich Repeat), LRRCT (Leucine rich repeat C-terminal domain), Ig (immunoglobulin domain), Peroxidase, VWC (von Willebrand factor type C domain)


Summary: ACF4 is a gene originally identified from KG-1 cell and brain cDNA libraries.












SRPWWLRASE RPSAPSAMAK RSRGPGRRCL LALVLFCAWG TLAVVAQKPG AGCPSRCLCF
60






RTTVRCMHLL LEAVPAVAPQ TSILDLRFNR IREIQPGAFR RLRNLNTLLL NNNQIKRIPS
120





GAFEDLENLK YLYLYKNEIQ SIDRQAFKGL ASLEQLYLHF NQIETLDPDS FQHLPKLERL
180





FLHNNRITHL VPGTFNHLES MKRLRLDSNT LHCDCEILWL ADLLKTYAES GNAQAAAICE
240





YPRRIQGRSV ATITPEELNC ERPRITSEPQ DADVTSGNTV YFTCRAEGNP KPEIIWLRNN
300





NELSMKTDSR LNLLDDGTLM IQNTQETDQG IYQCMAKNVA GEVKTQEVTL RYFGSPARPT
360





FVIQPQNTEV LVGESVTLEC SATGHPPPRI SWTRGDRTPL PVDPRVNITP SGGLYIQNW
420





QGDSGEYACS ATNNIDSVHA TAFIIVQALP QFTVTPQDRV VIEGQTVDFQ CEAKGNPPPV
480





IAWTKGGSQL SVDRRHLVLS SGTLRISGVA LHDQGQYECQ AVNIIGSQKV VAHLTVQPRV
540





TPVFASIPSD TTVEVGANVQ LPCSSQGEPE PAITWNKDGV QVTESGKFHI SPEGFLTIND
600





VGPADAGRYE CVARNTIGSA SVSMVLSVNV PDVSRNGDPF VATSIVEAIA TVDRAINSTR
660





THLFDSRPRS PNDLLALFRY PRDPYTVEQA RAGEIFERTL QLIQEHVQHG LMVDLNGTSY
720





HYNDLVSPQY LNLIANLSGC TAHRRVNNCS DMCFHQKYRT HDGTCNNLQH PMWGASLTAF
780





ERLLKSVYEN GFNTPRGINP HRLYNGHALP MPRLVSTTLI GTETVTPDEQ FTHMLMQWGQ
840





FLDHDLDSTV VALSQARFSD GQHCSNVCSN DPPCFSVMIP PNDSRARSGA RCMFFVRSSP
900





VCGSGMTSLL MNSVYPREQI NQLTSYIDAS NVYGSTEHEA RSIRDLASHR GLLRQGIVQR
960





SGKPLLPFAT GPPTECMRDE NESPIPCFLA GDHRANEQLG LTSMHTLWFR EENRIATELL
1020





KLNPHWDGDT IYYETRKIVG AEIQHITYQH WLPKILGEVG MRTLGEYHGY DPGINAGIFN
1080





AFATAAFRFG HTLVNPLLYR LDENFQPIAQ DHLPLHKAFF SPFRIVNEGG IDPLLRGLFG
1140





VAGKMRVPSQ LLNTELTERL FSMAHTVALD LAAINIQRGR DHGIPPYHDY RVYCNLSAAH
1200





TFEDLKNEIK NPEIREKLKR LYGSTLNIDL FPALVVEDLV PGSRLGPTLM CLLSTQFKRL
1260





RDGDRLWYEN PGVFSPAQLT QIKQTSLARI LCDNADNITR VQSDVFRVAE FPHGYGSCDE
1320





IPRVDLRVWQ DCCEDCRTRG QFNAFSYHFR GRRSLEFSYQ EDKPTKKTRP RKIPSVGRQG
1380





EHLSNSTSAF STRSDASGTN DFREFVLEMQ KTITDLRTQI KKLESRLSTT ECVDAGGESH
1440





ANNTKWKKDA CTICECKDGQ VTCFVEACPP ATCAVPVNIP GACCPVCLQK RAEEKP











ACF5 protein sequence



Gene name: Mitogen-activated protein kinase kinase kinase kinase 4


Unigene number: Hs.3628


Probeset Accession #: N54067


Protein Accession #: NP_004825.1


Pfam: pkinase (Eukaryotic protein kinase domain), CNH domain


Summary: The yeast serine/threonine kinase STE20 activates a signaling cascade that includes STE11 (mitogen-activated protein kinase kinase kinase), STE7 (mitogen-activated protein kinase kinase), and FUS3/KSS1 (mitogen-activated protein kinase) in response to signals from both Cdc42 and the heterotrimeric G proteins associated with transmembrane pheromone receptors. ACF5 is a human cDNA encoding a protein kinase homologous to STE20. This protein kinase, also designated HPK/GCK-like kinase (HGK), has nucleotide sequences that encode an open reading frame of 1165 amino acids with 11 kinase subdomains. HGK is a serine/threonine protein kinase that specifically activated the c-Jun N-terminal kinase (JNK) signaling pathway when transfected into 293T cells, but did not stimulate either the extracellular signal-regulated kinase or p38 kinase pathway. HGK also increased AP-1-mediated transcriptional activity in vivo. HGK may be a novel activator of the JNK pathway. The cascade may look like this: HGK -> TAK1 -> MKK4, MKK7 -> JNK kinase cascade, which may mediate the TNF-alpha signaling pathway.












MANDSPAKSL VDIDLSSLRfl PAGIFELVEV VGNGTYGQVY KGRHVKTGQL AAIKVMDVTE
60






DEEEEIKLEI NMLKKYSHWR NIATYYGAFI KKSPPGHDDQ LWLVMEFCGA GSITDLVKMT
120





KGNTLKEDWI AYISREILRG LAHLHIHHVI HRDIKGQNVL LTENAEVKLV DFGVSAQLDR
180





TVGRRNTFIG TPYWMAPEVI ACDENPDATY DYRSDLWSCG ITAIEMAEGA PPLCDMHPMR
240





ALFLIPRNPP PRLKSKKWSK KFFSFIEGCL VKNYMQRPST EQLLKHPFIR DQPNERQVRI
300





QLKDHIDRTR KKRGEKDETE YEYSGSEEEE EEVPEQEGEP SSIVNVPGES TLRRDFLRLQ
360





QENKERSEAL RRQQLLQEQQ LREQEEYKRQ LLAERQKRIE QQKEQRRRLE EQQRREREAR
420





RQQEREQRRR EQEEKRRLEE LERRRKEEEE RRRAEEEKRR VEREQEYIRR QLEEEQRHLE
480





VLQQQLLQEQ AMLLHDERRP HPQHSQQPPP PQQERSKPSF HAPEPKAHYE PADRAREVPV
540





RTTSRSPVLS RRDSPLQGSG QQNSQAGQRN STSIEPRLLW ERVEKLVPRP GSGSSSGSSN
600





SGSQPGSHPG SQSGSGERFR VRSSSKSEGS PSQRLENAVK KPEDKKEVFR PLKPAGEV
660





TALAKELRAV EDVRPPHKVT DYSSSSEESG TTDEEDDDVE QEGADESTSG PEDTRAASSE
720





NLSNGETESV KTMIVNDDVE SEPAMTPSKE GTLIVRQTQS ASSTLQKHKS SSSFTPFIDP
780





RLLQISPSSG TTVTSVVGFS CDGMRPEAIR QDPTRKGSVV NVNPTNTRPQ SDTPEIRKYK
840





KRFNSEILCA ALWGVNLLVG TESGLMLLDR SGQGKVYPLI NRRRFQQMDV LEGLNVLVTI
900





SGKKDKLRVY YLSWLRNKIL HNDPEVEKKQ GWTTVGDLEG CVHYKVVKYE RIKFLVIALK
960





SSVEVYAWAP KPYHKFMAFK SFGELVHKPL LVDLTVEEGQ RLKVIYGSCA GFHAVDVDSG
1020





SVYDIYLPTH VRKNPHSMIQ CSIKPHAIII LPNTDGMELL VCYEDEGVYV NTYGRITKDV
1080





VLQWGEMPTS VAYIRSNQTM GWGEKAIEIR SVETGHLDGV FMHKRAQRLK FLCERNDKVF
1140





FASVRSGGSS QVYFMTLGRT SLLSW











ACF8 protein sequence



Gene name: Phospholipase A2, group IVC (cytosolic, calcium-independent)


Unigene number: Hs.18858


Probeset Accession #: AA054087


Protein Accession #: NP_003697.1


Pfam: none identified


Summary: ACF8 is a membrane-bound, calcium-independent PLA2, named cPLA2-gamma. The sequence encodes a 541-amino acid protein containing a domain with significant homology to the catalytic domain of the 85-kDa cPLA2 (cPLA2-alpha). cPLA2-gamma does not contain the regulatory calcium-dependent lipid binding (CaLB) domain found in cPLA2-alpha. cPLA2-gamma does contain two consensus motifs for lipid modification, a prenylation motif (-CCLA) at the C terminus and a myristoylation site at the N terminus. cPLA2-gamma demonstrates a preference for arachidonic acid at the sn-2 position of phosphatidylcholine as compared with palmitic acid. cPLA2-gamma encodes a 3-kilobase message, which is highly expressed in heart and skeletal muscle, suggesting a specific role in these tissues.












MGSSEVSIIP GLQKEEKAAV ERRRLHVLKA LKKLRIEADE APVVAVLGSG GGLRAHIACL
60






GVLSEMKEQG LLDAVTYLAG VSGSTWAISS LYTNDGDMEA LEADLKHRFT RQEWDLAKSL
120





QKTIQAARSE NYSLTDFWAY MVISKQTREL PESHLSNMKK PVEEGTLPYP IFAAIDNDLQ
180





PSWQEARAPE TWFEFTPHHA GFSALGAFVS ITHFGSKFKK GRLVRTHPER DLTFLRGLWG
240





SALGNTEVIR EYIFDQLRNL TLKGLWRRAV ANAKSIGHLI FARLLRLQES SQGEHPPPED
300





EGGEPEHTWL TEMLENWTRT SLEKQEQPHE DPERKGSLSN LMDFVKKTGI CASKWEWGTT
360





HNFLYKHGGI RDKIMSSRKH LHLVDAGLAI NTPFPLVLPP TREVHLILSF DFSAGDPFET
420





IFATTDYCRR HKIPFPQVEE AELDLWSKAP ASCYILKGET GPVVIHFPLF NIDACGGDIE
480





AWSDTYDTFK LADTYTLDVV VLLLALAKKN VRENKKKILR ELMNVAGLYY PKDSARSCCL
540





A











ACG1 protein sequence



Gene name: carbohydrate (chondroitin 6/keratan) sulfotransferase 1


Unigene number: Hs.104576


Probeset Accession #: AA868063


Protein Accession #: NP_003645.1


Pfam: none identified


Summary: Chondroitin 6-sulfotransferase (C6ST) is the key enzyme in the biosynthesis of chondroitin 6-sulfate, a glycosaminoglycan implicated in chondrogenesis, neoplasia, atherosclerosis, and other processes. C6ST catalyzes the transfer of sulfate from 3′-phosphoadenosine 5′-phosphosulfate to carbon 6 of the N-acetylgalactosamine residues of chondroitin.












MQCSWKAVLL LALASIAIQY TAIRTFTAKS FHTCPGLAEA GLAERLCEES PTFAYNLSRK
60






THILILATTR SGSSFVGQLF NQHLDVFYLF EPLYHVQNTL IPRFTQGKSP ADRRVMLGAS
120





RflLLRSLYDC DLYFLENYIK PPPVNHTTDR IFRRGASRVL CSRPVCDPPG PAflLVLEEGD
180





CVRKCGLLNL TVAAEACRER SHVAIKTVRV PEVNDLRALV EDPRLNLKVI QLVRflPRGIL
240





ASRSETFPflT YRLWRLWYGT GRKPYNLDVT QLTTVCEDFS NSVSTGLMRP PWLKGKYMLV
300





RYEDLARNPM KKTEEIYGFL GIPLDSHVAR WIQNNTRGDP TLGKHKYGTV RNSAATAEKW
360





RFRLSYDIVA FAQNACQQVL AQLGYKIAAS EEEL~G~PSVS LVEERDFRPF S











ACG5 protein sequence



Gene name: Multimerin


Unigene number: Hs.268107


Probeset Accession #: U27109


Protein Accession #: AAC52065


Signal sequence: prediction underlined


Pfam: EGF-like domain, C1q domain


Summary: Multimerin is a massive, soluble protein found in platelets and in the endothelium of blood vessels. Multimerin is composed of variably sized, disulfide-linked multimers, the smallest of which is a homotrimer. Multimerin is a factor V/Va-binding protein and may function as a carrier protein for platelet factor V. Northern analyses show a 4.7-kilobase transcript in cultured endothelial cells, a megakaryocytic cell line, platelets, and highly vascular tissues. The multimerin cDNA can encode a protein of 1228 amino acids with the probable signal peptide cleavage site between amino acids 19 and 20. The protein is predicted to be hydrophilic and to contain 23 N-glycosylation sites. The adhesive motif RGDS (Arg-Gly-Asp-Ser) and an epidermal growth factor-like domain were identified. Multimerin contains probable coiled-coil structures in the central portion of its sequence. Additionally, the carboxyl-terminal region of multimerin resembles the globular, non-collagen-like, carboxyl-terminal domains of several other trimeric proteins, including complement C1q and collagens type VIII and X.
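The cleavage position stated in the summary (between amino acids 19 and 20) can be applied to the first row of the ACG5 listing as a simple string split. This is a minimal sketch for illustration. The function name `split_signal_peptide` is hypothetical, and only the first 60 residues of the listing are used.

```python
from typing import Tuple

# First 60 residues of the ACG5 listing, with the block spacing removed.
FIRST_ROW = (
    "MKGARLFVLL SSLWSGGIGL NNSKHSWTIP "
    "EDGNSQKTMP SASVPPNKIQ SLQILPTTRV"
).replace(" ", "")

# "between amino acids 19 and 20", counted from 1 as in the summary.
CLEAVAGE_AFTER = 19


def split_signal_peptide(sequence: str, cleavage_after: int) -> Tuple[str, str]:
    """Split a sequence into (signal peptide, remainder) at a 1-based position."""
    if not 0 < cleavage_after < len(sequence):
        raise ValueError("cleavage position lies outside the sequence")
    return sequence[:cleavage_after], sequence[cleavage_after:]


signal, remainder = split_signal_peptide(FIRST_ROW, CLEAVAGE_AFTER)
print(signal)          # MKGARLFVLLSSLWSGGIG
print(remainder[:10])  # LNNSKHSWTI
```

The 19-residue prefix matches the point at which the first row of the listing is broken, which is consistent with the signal sequence prediction being the underlined region.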














MKGARLFVLL SSLWSGGIG
L NNSKHSWTIP EDGNSQKTMP SASVPPNKIQ SLQILPTTRV

60






MSAEIATTPE ARTSEDSLLK STLPPSETSA PAEGVRNQTL TSTEKAEGW KLQNLTLPTN
120





ASIKFNPGAE SVVLSNSTLK FLQSFARKSN EQATSLNTVG GTGGIGGVGG TGGVGNRAPR
180





ETYLSRGDSS SSQRTDYQKS NFETTRGKNW CAYVHTRLSP TVTLDNQVTY VPGGKGPCGW
240





TGGSCPQRSQ KISNPVYRMQ HKIVTSLDWR CCPGYSGPKC QLRAQEQQSL IHTNQAESHT
300





AVGRGVAEQQ QQQGCGDPEV MQKMTDQVNY QAMKLTLLQK KIDNISLTVN DVRNTYSSLE
360





GKVSEDKSRE FQSLLKGLKS KSINVLIRDI VREQFKIFQN DMQETVAQLF KTVSSLSEDL
420





ESTRQIIQKV NESVVSIAAQ QKFVLVQENR PTLTDIVELR NHIVNVRQEM TLTCEKPIKE
480





LEVKQTELEG ALEQEHSRSI LYYESLNKTL SKLKEVHEQL LSTEQVSDQK NAPAAESVSN
540





NVTEYMSTLH ENIKKQSLMM LQMFEDLHIQ ESKINNLTVS LEMEKESLRG ECEDMLSKCR
600





NDFKFQLIQT EENLHVLNQT LAEVLFPMDN KMDKMSEQLN DLTYDMEILQ PLLEQGASLR
660





QTMTYEQPKE AIVIRKKIEN LTSAVNSLNF IIKELTKRHN LLRNEVQGRD DALERRINEY
720





ALEMEDGLNK TMTIINNAID FIQDNYALKE TLSTIKDNSE IHHKCTSDME TILTFIPQFH
780





RLNDSIQTLV NDNQRYNFVL QVAKTLAGIP RDEKLNQSNF QKMYQMFNET TSQVRKYQQN
840





MSHLEEKLLL TTKISKNFET RLQDIESKVT QTLIPYYISV KKGSVVTNER DQALQLQVLN
900





SRFKALEAKS IHLSINFFSL NKTLHEVLTM CHNASTSVSE LNATIPKWIK HSLPDIQLLQ
960





KGLTEFVEPI IQIKTQAALS NSTCCIDRSL PGSLANVVKS QKQVKSLPKK INALKKPTVN
1020





LTTVLIGRTQ RNTDNIIYPE EYSSCSRHPC QNGGTCINGR TSFTCACRHP FTGDNCTIKL
1080





VEENALAPDF SKGSYRYAPM VAFFASHTYG MTIPGPILFN NLDVNYGASY TPRTGKFRIP
1140





YLGVYVFKYT IESFSAHISG FLVVDGIDKL AFESENINSE IHCDRVLTGD ALLELNYGQE
1200





VWLRILAKGTI PAKFPPVTTF SGYLLYRT











ACC6 protein sequence



Gene name: Homo sapiens cDNA FLJ11502 fis, clone HEMBA1002102, weakly similar to ANKYRIN


Unigene number: Hs.213194


Probeset Accession #: AA187101


Protein Accession #: none


Pfam: ankyrin repeats












VAARPPVSRM EPRAADGCFL GDVGFWVERT PVHEAAQRGE SLQLQQLIES GACVNQVTVD
60






SITPLHAASL QGQARCVQLL LAAGAQVDAR NIDGSTPLCD ACASGSIECV KLLLSYGAKV
120





NPPLYTASPL HEASFPRLLS TLASTPWIN











ACC7 protein sequence



Gene name: Human RAL A gene


Unigene number: Hs.6906


Probeset Accession #: AA083572 cluster


Protein Accession #: P11233


Pfam: ras


Features: CAAX motif is underlined


Summary: The RALA gene encodes a low molecular mass ras-like GTP-binding protein that shares about 50% similarity with the ras proteins. GTP-binding proteins mediate the transmembrane signaling initiated by the occupancy of certain cell surface receptors. The RALA gene maps to 7p22-p15.












MAANKPKGQN SLALHKVIMV GSGGVGKSAL TLQFMYDEFV EDYEPTKADS YRKKVVLDGE
60






EVQIDILDTA GQEDYAAIRD NYFRSGEGFL CVFSITEMES FAATADFREQ ILRVKEDENV
120





PFLLVGNKSD LEDKRQVSVE EAKNRAEQWN VNYVETSAKT RANVDKVFFD LMREIRARKM
180





EDSKEKNGKK KRKSLAKRIR ERCC
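The layout used for the listings in this table (rows of ten-residue blocks, with each full row followed by a cumulative position marker) can be checked mechanically. The following is a minimal sketch, assuming that layout. The function name `parse_listing` is hypothetical, and the rows are copied from the ACC7 listing above.

```python
from typing import List, Tuple

LISTING = """\
MAANKPKGQN SLALHKVIMV GSGGVGKSAL TLQFMYDEFV EDYEPTKADS YRKKVVLDGE
60
EVQIDILDTA GQEDYAAIRD NYFRSGEGFL CVFSITEMES FAATADFREQ ILRVKEDENV
120
PFLLVGNKSD LEDKRQVSVE EAKNRAEQWN VNYVETSAKT RANVDKVFFD LMREIRARKM
180
EDSKEKNGKK KRKSLAKRIR ERCC
"""


def parse_listing(text: str) -> Tuple[str, List[Tuple[int, int]]]:
    """Join residue rows and pair each printed marker with the running count."""
    rows: List[str] = []
    markers: List[Tuple[int, int]] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # the listings are padded with blank lines
        if line.isdigit():
            counted = sum(len(row) for row in rows)
            markers.append((int(line), counted))
        else:
            rows.append(line.replace(" ", ""))
    return "".join(rows), markers


sequence, markers = parse_listing(LISTING)
# A marker that disagrees with the running count points at a damaged row.
mismatches = [(printed, counted) for printed, counted in markers if printed != counted]
print(len(sequence), mismatches)
```

A non-empty `mismatches` list flags a row whose printed position does not agree with the residues counted so far, for example a row in which a block has lost a residue.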











ACC9 protein sequence



Gene name: KIAA0955 protein


Unigene number: Hs.10031


Probeset Accession #: AA027168


Protein Accession #: BAA76799.1


Pfam: CARD (Caspase recruitment domain)


Summary: The gene was originally isolated as a brain cDNA. The coding region contains a CARD domain, suggesting involvement in apoptotic signaling pathways.












MMRQRQSHYC SVLFLSVNYL GGTFPGDICS EENQIVSSYA SKVCFEIEED YKNRQFLGPE
60






GNVDVELIDK STNRYSVWFP TAGWYLWSAT GLGFLVRDEV TVTIAFGSWS QHLALDLQHH
120





EQWLVGGPLF DVTAEPEEAV AEIHLPHFIS LQGEVDVSWF LVARFKNEGM VLEHPARVEP
180





FYAVLESPSF SLMGILLRIA SGTRLSIPIT SNTLIYYHPH PEDIKFHLYL VPSDALLTKA
240





IDDEEDRFHG VRLQTSPPME PLNFGSSYIV SNSANLKVMP KELKLSYRSP GEIQHFSKFY
300





AGQMKEPIQL EITEKRHGTL VWDTEVKPVD LQLVAASAPP PFSGAAFVKE NHRQLQARMG
360





DLKGVLDDLQ DNEVLTENEK ELVEQEKTRQ SKNEALLSMV EKKGDLALDV LFRSISERDP
420





YLVSYLRQQN L











ACF6 protein sequence



Gene name: Homo sapiens cDNA FLJ10669 fis, clone NT2RP2006275, weakly similar to Microtubule-associated protein 1B [CONTAINS: LIGHT CHAIN LC1]


Unigene number: Hs.66048


Probeset Accession #: AA609717


Protein Accession #: BAA91743.1


Pfam: none identified


Summary: The cDNA for FLJ10669 was originally isolated from NT2 neuronal precursor cells (teratocarcinoma cell line) after 2 weeks of retinoic acid (RA) treatment. The protein sequence has similarity to microtubule-associated protein 1B (MAP-1B), suggesting a function for ACF6 in regulating the cytoskeleton.












MGVGRLDMYV LHPPSAGAER TLASVCALLV WHPAGPGEKV VRVLFPGCTP PACLLDGLVR
60






LQHLRFLREP VVTPQDLEGP GRAESKESVG SRDSSKREGL LATHPRPGQE RPGVARKEPA
120





RAEAPRKTEK EAKTPRELKK DPKPSVSRTQ PREVRRAASS VPNLKKTNAQ AAPKPRKAPS
180





TSHSGFPPVA NGPRSPPSLR CGEASPPSAA CGSPASQLVA TPSLELGPIP AGEEKALELP
240





LAASSIPRPR TPSPESHRSP AEGSERLSLS PLRGGEAGPD ASPTVTTPTV TTPSLPAEVG
300





SPHSTEVDES LSVSFEQVLP PSAPTSEAGL SLPLRGPRAR RSASPHDVDL CLVSPCEFEH
360





RKAVPMAPAP ASPGSSNDSS ARSQERAGGL GAEETPPTSV SESLPTLSDS DPVPLAPGAA
420





DSDEDTEGFG VPRHDPLPDP LKVPPPLPDP SSICMVDPEM LPPKTARQTE NVSRTRKPLA
480





RPNSRAAAPK ATPVAAAKTK GLAGGDRASR PLSARSEPSE KGGRAPLSRK SSTPKTATRG
540





PSGSASSRPG VSATPPKSPV YLDLAYLPSG SSAHLVDEEF FQRVRALCYV ISGQDQRKEE
600





GMRAVLDALL ASKQHWDRDL QVTLIPTFDS VAMHTWYAET HARHQALGIT VLGSNGMVSM
660





QDDAFPACKV EF










[0329]


Claims
  • 1. A method of detecting an angiogenesis-associated transcript in a cell in a patient, the method comprising contacting a biological sample from the patient with a polynucleotide that selectively hybridizes to a sequence at least 80% identical to a sequence as shown in Table 1.
  • 2. The method of claim 1, wherein the biological sample is a tissue sample.
  • 3. The method of claim 1, wherein the biological sample comprises isolated nucleic acids.
  • 4. The method of claim 3, wherein the nucleic acids are mRNA.
  • 5. The method of claim 3, further comprising the step of amplifying nucleic acids before the step of contacting the biological sample with the polynucleotide.
  • 6. The method of claim 1, wherein the polynucleotide comprises a sequence as shown in Table 1.
  • 7. The method of claim 1, wherein the polynucleotide is labeled.
  • 8. The method of claim 7, wherein the label is a fluorescent label.
  • 9. The method of claim 1, wherein the polynucleotide is immobilized on a solid surface.
  • 10. The method of claim 1, wherein the patient is undergoing a therapeutic regimen to treat a disease associated with angiogenesis.
  • 11. The method of claim 1, wherein the patient is suspected of having cancer.
  • 12. An isolated nucleic acid molecule consisting of a polynucleotide sequence as shown in Table 1.
  • 13. The nucleic acid molecule of claim 12, which is labeled.
  • 14. The nucleic acid of claim 13, wherein the label is a fluorescent label.
  • 15. An expression vector comprising the nucleic acid of claim 12.
  • 16. A host cell comprising the expression vector of claim 15.
  • 17. An isolated nucleic acid molecule which encodes a polypeptide having an amino acid sequence as shown in Table 2.
  • 18. An isolated polypeptide which is encoded by a nucleic acid molecule having a polynucleotide sequence as shown in Table 1.
  • 19. An isolated polypeptide having an amino acid sequence as shown in Table 2.
  • 20. An antibody that specifically binds a polypeptide of claim 19.
  • 21. The antibody of claim 20, further conjugated to an effector component.
  • 22. The antibody of claim 21, wherein the effector component is a fluorescent label.
  • 23. The antibody of claim 21, wherein the effector component is a radioisotope.
  • 24. The antibody of claim 21, which is an antibody fragment.
  • 25. The antibody of claim 21, which is a humanized antibody.
  • 26. A method of detecting a cell undergoing angiogenesis in a biological sample from a patient, the method comprising contacting the biological sample with an antibody of claim 20.
  • 27. The method of claim 26, wherein the antibody is further conjugated to an effector component.
  • 28. The method of claim 27, wherein the effector component is a fluorescent label.
  • 29. A method of detecting antibodies specific to angiogenesis in a patient, the method comprising contacting a biological sample from the patient with a polypeptide comprising a sequence as shown in Table 2.
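
The "at least 80% identical" threshold recited in claim 1 can be illustrated with a simple position-by-position comparison. This is a sketch under stated assumptions, not the comparison method defined in the specification. It assumes two sequences that are already aligned to equal length without gaps. The function names and both example sequences are hypothetical, and neither sequence is taken from Table 1.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions that match in two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def meets_threshold(a: str, b: str, threshold: float = 80.0) -> bool:
    """Return True if the percent identity is at least the given threshold."""
    return percent_identity(a, b) >= threshold


# Hypothetical ten-nucleotide sequences differing at one position.
reference = "ATGGCCAAGC"
candidate = "ATGGCTAAGC"
print(percent_identity(reference, candidate))  # 90.0
print(meets_threshold(reference, candidate))   # True
```

In practice, percent identity is usually computed over an alignment produced by a sequence comparison algorithm, and the result depends on the algorithm and the parameters chosen.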
CROSS-REFERENCES TO RELATED APPLICATIONS

[0001] The present application is a continuation-in-part (CIP) of co-pending U.S. patent application “Novel Methods Of Diagnosis Of Angiogenesis, Compositions And Methods Of Screening For Angiogenesis Modulators”, Attorney Docket No. A651 10-1, filed on Aug. 11, 2000, which claims the benefit of priority to U.S. Ser. No. 60/148,425 filed Aug. 11, 1999, both of which are incorporated herein by reference.

Provisional Applications (1)
Number Date Country
60148425 Aug 1999 US
Continuations (1)
Number Date Country
Parent 09784356 Feb 2001 US
Child 10021660 Dec 2001 US
Continuation in Parts (1)
Number Date Country
Parent 09637977 Aug 2000 US
Child 10021660 Dec 2001 US