Metastasis-associated genes

Information

  • Patent Application
  • 20030054387
  • Publication Number
    20030054387
  • Date Filed
    June 25, 2002
  • Date Published
    March 20, 2003
Abstract
Many genes are identified as being metastasis associated. Identifying and profiling the expression of these genes can be used to evaluate a sample, to diagnose tumor invasive potential or metastatic development in a sample, or to screen for a test compound useful in the prevention or treatment of tumor metastasis.
Description


BACKGROUND

[0002] This application relates to metastasis-associated genes. Metastasis is a multiple-step process that includes the migration of cancer cells from a primary tumor site to a remote site. This process involves interactions between cancer cells and their surrounding microenvironment. Metastasis is a major cause of mortality for cancer patients. Many studies on cancer metastasis have been conducted and several molecules participating in tumor cell invasion and metastasis have been identified and characterized. Among these molecules, some facilitate invasion and metastasis, e.g., laminin receptor, vitronectin receptor, metalloproteinases, and CD44; while others inhibit these processes, e.g., cadherin, tissue inhibitors of metalloproteinases, and nm23. See, e.g., Wewer et al. (1986) Proc. Natl. Acad. Sci. USA 83: 7137-7141; Albelda et al. (1990) Cancer Res. 50: 6757-6764; Powell et al. (1993) Cancer Res. 53: 417-422; Sreenath et al. (1992) Cancer Res. 52: 4942-4947; Birch et al. (1991) Cancer Res. 51: 6660-6667; Ham et al. (1996) J Clin. Gastroenterol. 22: 107-110; Vleminckx et al. (1991) Cell 66: 107-119; Liotta et al. (1991) Cell 64: 327-336; Goldberg et al. (1989) Proc. Natl. Acad. Sci. USA 86: 8207-8211; and Kodera et al. (1994) Cancer 73: 259-265.


[0003] Calcyclin and AXL are also associated with metastasis (Weterman et al. (1992) Cancer Res. 52: 1291-1296; and Jacob et al. (1999) Cancer Detect. Prev. 23: 325-332). Some proteases and adhesion molecules, including disintegrin-metalloprotease, MMP-19 protein, interstitial collagenase, protocadherin, integrin α-3, and integrin α-6, have previously been reported to be associated with cancer invasion and metastasis. See, for example, Kanamori et al. (1999) Cancer Res. 59: 4225-4227; Okada et al. (1994) Clin. Exp. Metastas. 12: 305-314; Morini et al. (2000) Int. J Cancer 87: 336-342; and Friedrichs et al. (1995) Cancer Res. 55: 901-906. The tumor-associated antigen L6 is reportedly highly expressed on several carcinomas such as lung and breast cancer (Marken et al. (1992) Proc. Natl. Acad. Sci. USA 89: 3503-3507).



SUMMARY

[0004] This invention is based on the discovery that the genes listed in the four groups—Group I, II, III, and IV—are associated with tumor metastasis.


[0005] As used herein, “Group I” refers to the group consisting of the genes listed in Table I. “Group II” refers to the group consisting of the genes listed in Table II. “Group III” refers to the group consisting of the genes listed in Table III. “Group IV” refers to the group consisting of the genes listed in Table IV.


[0006] Each of the genes in these groups is either negatively or positively correlated with invasiveness. Group I, II, III, or IV genes (all or a fraction of Group I, II, III, or IV) can be used independently, or in combination. Each of the terms “Group I,” “Group II,” and “Group III” encompasses polypeptides encoded by the listed genes. Whether the terms refer to genes, polypeptides, or both will be apparent from context.


[0007] In one aspect, this invention features a first method of evaluating a sample. The method includes determining the level of the expression of at least one nucleic acid selected from Group I, II, III, or IV in a sample, and comparing the level of expression to a reference expression value to thereby evaluate the sample. The level of expression can be a value that is compared to a reference value to thereby evaluate the sample. The reference value can be arbitrary or associated with a reference sample or a reference state.


[0008] The sample used to obtain the reference expression can be one or more of: (1) a sample from a normal subject; (2) a sample suspected of having or having a disorder, e.g., a neoplastic disorder, or metastatic disorder; (3) a sample from a subject having a metastatic disorder and undergoing treatment; and (4) a sample from a subject being evaluated, e.g., an earlier sample or a normal sample of the same subject. An expression, e.g., a sample expression or a reference expression, can be a qualitative or quantitative assessment of the abundance of (1) an mRNA transcribed from a nucleic acid; or of (2) a polypeptide encoded by the nucleic acid. The mRNA expression can be determined by, for example, quantitative PCR, Northern analysis, microarray analysis, serial analysis of gene expression (SAGE), and other routine methods. Polypeptide expression can be determined by antibody probes, e.g., using an antibody array, or by quantitative mass spectroscopy.


[0009] In some embodiments, the method can further include determining the levels of expression of 10%, 20%, 30%, 50%, 75%, 80%, 90%, 99% or more nucleic acids selected from Groups I, II or III.


[0010] In another aspect, the invention features a second method of evaluating a sample. The method includes identifying a sample expression profile, wherein the sample expression profile represents the levels of expression of at least two nucleic acids selected from Group I, II, III, and/or IV; and comparing the sample expression profile to at least one reference expression profile to thereby evaluate the sample.


[0011] An “expression profile” (e.g., a reference expression profile or a sample expression profile) used herein includes a plurality of values, wherein each value corresponds to the level of expression of a different nucleic acid, splice-variant or allelic variant of a nucleic acid or a translation product thereof. The value can be a qualitative or quantitative assessment of the abundance of (1) an mRNA transcribed from a nucleic acid; or of (2) a polypeptide encoded by the nucleic acid. An expression profile has a plurality of values, at least two of which correspond to a nucleic acid of Group I, II, III, or IV.


[0012] In some embodiments, the expression profile can include values for 50%, 60%, 80%, or 90% of the nucleic acids selected from Groups I and II. Alternatively, the expression profile can include values for all the nucleic acids selected from Group II, or a fraction of the nucleic acids of Group II, e.g., 20%, 40%, 50%, 60%, 80%, or 90% of nucleic acids of Group II.


[0013] In some other embodiments, the expression profile can include values for all the nucleic acids of Group I, II, III, and/or IV (i.e., 100% of the nucleic acids), or a fraction of the nucleic acids of Group I, II, III, and/or IV, e.g., at least 20%, 40%, 50%, 60%, 80%, or 90% of the nucleic acids of Group I, II, III, and/or IV. The expression profile can be obtained from an array by a method, which may include providing an array; contacting the array with a nucleic acid mixture (e.g., a mixture of nucleic acids obtained or amplified from a cell), and detecting binding of the nucleic acid mixture to the array to produce an expression profile. Alternatively, the expression profile can be determined using a method and/or apparatus that does not require an array, such as SAGE or quantitative PCR with multiple primers.


[0014] A reference expression profile can be a profile including one or more of: (1) a profile from a sample from a normal subject, (2) a profile from a sample suspected of having or having a disorder, e.g., a neoplastic disorder, or metastatic disorder; (3) a profile from a sample from a subject having a metastatic disorder and undergoing treatment; and (4) a profile from a sample from a subject being evaluated, e.g., an earlier sample or a normal sample of the same subject. For example, the reference expression profile can be the profile of a cancerous cell line, e.g., a metastatic cancer cell line, e.g., a lung adenocarcinoma cell line, e.g., a lung adenocarcinoma cell line described herein.


[0015] A reference expression profile can also be an expression profile obtained from any suitable standard, e.g., a nucleic acid mixture. A sample expression profile is compared to a reference expression profile to produce a difference profile. Preferably, the sample expression profile is compared indirectly to the reference profile. For example, the sample expression profile is compared in multi-dimensional space to a cluster of reference expression profiles.
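
By way of illustration only, the difference profile described above can be sketched in Python as follows. The function name difference_profile is hypothetical, and the gene symbols and expression values are illustrative assumptions, not data from this application. The sketch assumes each profile is a mapping from a gene identifier to a numeric expression level and subtracts the reference value from the sample value for every gene present in both profiles.

```python
def difference_profile(sample, reference):
    """Return per-gene differences (sample minus reference).

    Only genes present in both profiles are compared; this is an
    assumption of the sketch, not a requirement stated in the text.
    """
    shared = sample.keys() & reference.keys()
    return {gene: sample[gene] - reference[gene] for gene in sorted(shared)}


# Illustrative values only (not measured data).
sample = {"AXL": 2.5, "ITGA3": 1.0, "ECM1": 0.5}
reference = {"AXL": 1.0, "ITGA3": 1.0, "ECM1": 2.0}

diff = difference_profile(sample, reference)
print(diff)
```

A positive entry indicates higher expression in the sample than in the reference, and a negative entry indicates lower expression.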


[0016] In still another aspect, the invention features an array. The array includes a substrate having a plurality of addresses. Each address of the plurality includes a capture probe, e.g., a unique capture probe. An address can have a single species of capture probe, e.g., each address recognizes a single species (e.g., a nucleic acid or polypeptide species). The addresses can be disposed on the substrate in a two-dimensional or three-dimensional configuration. At least one address of the plurality includes a capture probe such as an antibody or an antibody derivative that hybridizes specifically to a nucleic acid selected from Group I, II, III, or IV. A plurality of addresses can include addresses having nucleic acid capture probes for all the nucleic acids of Group I, II, III, and/or IV (i.e., 100% of the nucleic acids), or a fraction of the nucleic acids of Group I, II, III, and/or IV, e.g., at least 20%, 40%, 50%, 60%, 80%, or 90% of the nucleic acids of Group I, II, III, and/or IV. Alternatively, at least one address of the plurality includes a capture probe that binds specifically to a polypeptide selected from the group of polypeptides encoded by the nucleic acids of Group I, II, III, and/or IV. In some embodiments, the plurality of addresses includes addresses having polypeptide capture probes for all the nucleic acids of Group I, II, III, and/or IV (i.e., 100% of the nucleic acids) or a fraction of the nucleic acids of Group I, II, III, and/or IV, e.g., at least 20%, 40%, 50%, 60%, 80%, or 90% of the nucleic acids of Group I, II, III, and/or IV. An array can have a density of at least 10, 50, 100, 200, 500, 10³, 10⁴, 10⁵, 10⁶, 10⁷, 10⁸, or 10⁹ or more addresses per cm² and ranges between.


[0017] In yet another aspect, the invention features a set of probes that hybridize to a set of nucleic acids selected from Group I, II, III, and/or IV. The set of probes can include at least 2, 5, 10, 20, 30, 50 or more nucleic acids. The probes can be perfectly matched probes to the nucleic acids selected from Group I, II, III, and/or IV. However, mismatch probes having less than 5% mismatched nucleotides can also be used. The probes may be attached to a polymer, soluble or insoluble, naturally occurring or synthetic. The probes may be attached to a solid support (e.g., a planar array), in a gel matrix, or in solution.


[0018] Also featured is a method for diagnosis that includes providing a sample from a subject; determining the sample expression profile; comparing the sample expression profile to a reference expression profile, wherein the profile includes one or more values representing the levels of expression of one or more nucleic acids selected from Group I, II, III, and/or IV; and categorizing the subject as having tumor invasive potential or metastatic development when the sample expression profile is found to be altered from the reference expression profile. In some embodiments, the expression profile includes multiple values for the levels of expression of nucleic acids from Group I, II, III, and/or IV, e.g., all the nucleic acids of Group I, II, III, and/or IV (i.e., 100% of the nucleic acids), or a fraction of the nucleic acids of Group I, II, III, and/or IV, e.g., at least 20%, 40%, 50%, 60%, 80%, or 90% of the nucleic acids of Group I, II, III, and/or IV. The method can further include comparing the value or the profile (i.e., multiple values) to a reference value or a reference profile.


[0019] In some other embodiments, the reference profile is the profile of a cell line derived from a cancer, e.g., a metastatic cell line serially passaged. Examples of the cell lines are human lung adenocarcinoma cell lines of different invasive and metastatic capacities, e.g., CL1-0 and its sublines (e.g., CL1-1 and CL1-5).


[0020] An alteration in the expression of one or more nucleic acids of the profile is an indication that the subject has or is disposed to having tumor invasive potential or metastatic development. Preferably, expression of a plurality of nucleic acids of the profile (e.g., at least about 5%, 10%, 15%, 20%, 40%, 50%, 60%, 70%, 80%, or 90%) is altered. The expression or the expression profile can be obtained from an array by the method described herein, or from a method and/or apparatus that does not require an array, such as SAGE or quantitative PCR with multiple primers. Alternatively, the expression or the expression profile can be determined by any combination of the just described methods.
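
As a non-limiting sketch, the fraction of nucleic acids whose expression is altered relative to a reference profile might be computed as shown below. The text does not define when an expression level counts as altered; the two-fold threshold used here, the function name fraction_altered, and the gene labels and values are all assumptions made for illustration. The sketch also assumes that all expression values are positive.

```python
def fraction_altered(sample, reference, fold_threshold=2.0):
    """Fraction of shared genes whose sample/reference ratio is at least
    fold_threshold or at most 1/fold_threshold.

    The fold-change criterion is an assumption of this sketch.
    Expression values are assumed to be strictly positive.
    """
    shared = sample.keys() & reference.keys()
    if not shared:
        return 0.0
    altered = 0
    for gene in shared:
        ratio = sample[gene] / reference[gene]
        if ratio >= fold_threshold or ratio <= 1.0 / fold_threshold:
            altered += 1
    return altered / len(shared)


# Hypothetical gene labels and illustrative values.
sample = {"gene_a": 4.0, "gene_b": 1.0, "gene_c": 0.5, "gene_d": 1.5}
reference = {"gene_a": 1.0, "gene_b": 1.0, "gene_c": 1.0, "gene_d": 1.0}

print(fraction_altered(sample, reference))
```

The returned fraction can then be compared against a chosen percentage, such as those listed above.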


[0021] In addition to diagnosing tumor invasive potential or metastatic development in a sample, this method can also be used to (1) monitor a subject during tumor treatment; or (2) monitor a treatment for tumor metastasis in a subject.


[0022] The subject expression profile can be determined in a subject during treatment. The subject expression profile can be compared to a reference profile or to a profile obtained from the subject prior to treatment or prior to onset of the disorder. In a preferred embodiment, the subject expression profile is determined at intervals (e.g., regular intervals) during treatment.


[0023] Still also featured is a method for screening for a test compound useful in the prevention or treatment of tumor metastasis. The method includes: providing one or more reference expression profiles; contacting the compound to a cell; determining a compound-associated expression profile, e.g., using a method described herein; and comparing the compound-associated expression profile to at least one reference profile. The compound-associated expression profile and the reference profile, including a subject expression profile and the reference profile, include one or more values representing the level of expression of one or more nucleic acids selected from Group I, II, III, and/or IV. In some embodiments, the profiles include multiple values for the level of expression of nucleic acids from Group I, II, III, and/or IV, e.g., all the nucleic acids of Group I, II, III, and/or IV (i.e., 100% of the nucleic acids) or a fraction of the nucleic acids of Group I, II, III, and/or IV, e.g., at least 20%, 40%, 50%, 60%, 80%, or 90% of the nucleic acids of Group I, II, III, and/or IV. In some other embodiments, the profile is the profile of at least two cell lines which are clonally related. Examples of the cell lines are human lung adenocarcinoma cell lines of different invasive and metastatic capacities, e.g., CL1-0 and its sublines (e.g., CL1-1 and CL1-5).


[0024] This method for screening may also include comparing the compound-associated expression profile to a plurality of reference profiles (e.g., all reference profiles), and identifying a most similar reference profile as an indication of the efficacy and/or utility of the compound. Multiple compound-associated expression profiles can be determined at periodic intervals after contact with the agent. The expression profile can be obtained from an array by the method described herein, or from a method and/or apparatus that does not require an array, such as SAGE or quantitative PCR with multiple primers. Alternatively, the expression profile can be determined by any combination of the just described methods.


[0025] This invention also features cell lines that are clonally related and have different invasion capabilities, such as human lung adenocarcinoma cell lines, e.g., CL1-0 and its sublines (e.g., CL1-1 and CL1-5). The cell lines can be used to determine a sample expression value or expression profile or a reference expression value or expression profile.


[0026] This invention also features a transactional method of evaluating a subject. The method includes: (1) obtaining a sample from a caregiver; (2) determining a sample expression profile for the sample; and (3) transmitting a result to the caregiver. Optionally, the method further includes either or both of steps: (4) comparing the sample expression profile to one or more reference expression profiles; and (5) selecting the reference expression profile most similar to the subject expression profile. The reference expression profiles can include one or more of: (1) a profile from a sample from a normal subject; (2) a profile from a sample suspected of having or having a disorder, e.g., a neoplastic disorder, or metastatic disorder; (3) a profile from a sample from a subject having a metastatic disorder and undergoing a treatment; and (4) a profile from the subject being evaluated, e.g., an earlier profile or a normal profile of the same subject. The subject expression profile and the reference profiles include one or more values representing the level of expression of one or more nucleic acids selected from Group I, II, III, and/or IV, e.g., all the nucleic acids of Group I, II, III, and/or IV (i.e., 100% of the nucleic acids) or a fraction of the nucleic acids of Group I, II, III, and/or IV, e.g., at least 20%, 40%, 50%, 60%, 80%, or 90% of the nucleic acids of Group I, II, III, and/or IV.


[0027] The result transmitted to the caregiver can be one or more of: information about the subject expression profile, e.g., raw or processed expression profile data and/or a graphical representation of the profile; a difference expression profile obtained by comparing the subject expression profile to a reference profile; a descriptor of the most similar reference profile; the most similar reference profile; and a diagnosis or treatment associated with the most similar reference profile. The result can be transmitted across a computer network, e.g., the result can be in the form of a computer transmission (e.g., across the Internet or a private network, e.g., a virtual private network). The result can be transmitted across a telecommunications network, e.g., using a telephone or mobile phone. The results can be compressed and/or encrypted.


[0028] In the context of expression profiles herein, “most similar” refers to a profile which, for more than one value of the profile, compares favorably to a given profile. A variety of routine statistical measures can be used to compare two profiles. One possible metric is the length (i.e., Euclidean distance) of a difference vector representing the difference between the two profiles. Each of the subject and reference profiles is represented as a multi-dimensional vector, wherein the coordinate of each dimension is a value in the profile. The length of the difference vector is calculated using standard vectorial mathematics. In another embodiment, values for different nucleic acids in the profile are weighted for comparison.
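
For illustration, the Euclidean metric and the weighted variant mentioned above can be sketched in Python. The function names, the reference labels, and the numeric values are hypothetical. How the weights enter the comparison is not specified in the text; multiplying each squared difference by its weight is one common choice and is an assumption here. Profiles are assumed to be equal-length sequences with values in the same gene order.

```python
import math


def profile_distance(a, b, weights=None):
    """Length of the difference vector between profiles a and b.

    With weights=None this is the ordinary Euclidean distance. With
    weights, each squared difference is scaled by its weight (one
    possible weighting scheme, assumed for this sketch).
    """
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of values")
    if weights is None:
        weights = [1.0] * len(a)
    if len(weights) != len(a):
        raise ValueError("weights must match the profile length")
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def most_similar(subject, references, weights=None):
    """Return the label of the reference profile closest to the subject."""
    return min(
        references,
        key=lambda label: profile_distance(subject, references[label], weights),
    )


# Hypothetical reference labels and illustrative values.
references = {"low_invasive": [1.0, 2.0], "high_invasive": [5.0, 5.0]}
print(profile_distance([0.0, 0.0], [3.0, 4.0]))
print(most_similar([1.0, 1.0], references))
```

Under this sketch, the reference profile with the smallest distance to the subject profile is treated as the most similar one.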


[0029] Also within the scope of this invention is a computer medium having encoded thereon computer-readable instructions to effect the following steps: receive a subject expression profile; access a database of reference expression profiles; and either (1) select a matching reference profile most similar to the subject expression profile or (2) determine at least one comparison score for the similarity of the subject expression profile to at least one reference profile. The subject expression profile and the reference profiles include one or more values representing the levels of expression of one or more nucleic acids selected from Group I, II, III, and/or IV. In a preferred embodiment, the profiles include multiple values for the levels of expression of nucleic acids from Group I, II, III, and/or IV, e.g., all the nucleic acids of Group I, II, III, and/or IV (i.e., 100% of the nucleic acids) or a fraction of the nucleic acids of Group I, II, III, and/or IV, e.g., at least 20%, 40%, 50%, 60%, 80%, or 90% of the nucleic acids of Group I, II, III, and/or IV.


[0030] The instructions may further include instructions to create a graphical user interface that can display a sample expression profile and/or a reference profile. For example, a subset of or all values of the profile can be depicted as a graphic having a color dependent on the magnitude of the value. The graphical user interface can also allow the user to select a reference profile from a plurality of reference profiles, and can depict a comparison between the sample expression profile and the selected reference profile. The computer medium can further include, e.g., have encoded thereon, data records for one or more reference profiles.


[0031] This invention also features a computer medium having a plurality of digitally encoded data records. Each data record includes values representing the levels of expression of one or more nucleic acids selected from Group I, II, III, and/or IV, and a descriptor of the sample. The values may include multiple values for the level of expression of nucleic acids from Group I, II, III, and/or IV, e.g., all the nucleic acids of Group I, II, III, and/or IV (i.e., 100% of the nucleic acids) or a fraction of the nucleic acids of Group I, II, III, and/or IV, e.g., at least 20%, 40%, 50%, 60%, 80%, or 90% of the nucleic acids of Group I, II, III, and/or IV. The descriptor of the sample can be an identifier of the sample, a subject from which the sample was derived (e.g., a patient), a diagnosis (e.g., a metastasis disorder), or a treatment (e.g., a preferred treatment).
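
One possible in-memory layout for such a data record is sketched below in Python. The class name ExpressionRecord and its field names are hypothetical and chosen only to mirror the elements named above (expression values plus a descriptor made up of a sample identifier, subject, diagnosis, and treatment); the application does not prescribe any particular encoding.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ExpressionRecord:
    """A single data record: expression values plus a sample descriptor."""

    sample_id: str
    # Maps a gene identifier to its expression level.
    values: Dict[str, float] = field(default_factory=dict)
    subject: Optional[str] = None
    diagnosis: Optional[str] = None
    treatment: Optional[str] = None


# Illustrative record; identifiers and values are not data from this application.
record = ExpressionRecord(
    sample_id="S1",
    values={"AXL": 2.5, "ITGA3": 1.0},
    diagnosis="metastatic disorder",
)
print(record)
```

A collection of such records, e.g., a list, would correspond to the plurality of data records described above.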


[0032] In one embodiment, the data records include records for one or more samples from a normal individual, an abnormal individual (e.g., an individual having a metastasis disorder), and an abnormal individual under a treatment. The data record may further include a value representing the level of expression for each nucleic acid detected by a capture probe on an array described herein.


[0033] The details of one or more embodiments of the invention are set forth in the description below. Other features, objects, and advantages of the invention will be apparent from the description and from the claims.
1TABLE I(Group I):UnigeneIMAGEGenbankClusterIDAccession no.Putative gene nameHs.7587823633R39603early development regulator 2(homolog of polyhomeotic 2)Hs.81071229703H66472extracellular matrix protein 1Hs.89563148578H12586nuclear cap binding protein 1,80 kDHs.20294924829R38857KIAA1102 proteinHs.635300048N78918calcium channel, voltage-dependent, beta 1 subunitHs.79274344912W72244annexin A5Hs.263435345832W72690histidine ammonia-lyaseHs.278593530715AA069915interleukin 18 binding proteinHs.19131307932N93047transcription factor Dp-2 (E2Fdimerization partner 2)Hs.8117108245T69807erbb2-interacting protein ERBINHs.1501114116T79471syndecan 2 (heparan sulfateproteoglycan 1, cell surface-associated, fibroglycan)Hs.103042174835H30016microtubule-associated protein 1BHs.77313310428N98775cyclin-dependent kinase (CDC2-like) 10Hs.2361744582H07076hypothetical protein FLJ20531Hs.153549076AA083222ribosomal protein L7Hs.8123322820W15314chromobox homolog 3 (DrosophilaHP1 gamma)Hs.75716323255W42998plasminogen activator inhibitor,type II (arginine-serpin)Hs.77326323605W44341insulin-like growth factor bindingprotein 3Hs.75621200421R97243protease inhibitor 1 (anti-elastase),alpha-1-antitrypsinHs.99910202674H53527phosphofructokinase, plateletHs.24340131368R23188centaurin beta2Hs.1770203816H56135ligase I, DNA, ATP-dependentHs.150423324698W47128cyclin-dependent kinase 9 (CDC2-related kinase)Hs.145956429284AA007349zinc finger protein 226Hs.26582944280H06518integrin, alpha 3 (antigen CD49C,alpha 3 subunit of VLA-3 receptor)Hs.306838359R49463SWI/SNF related, matrixassociated, actin dependentregulator of chromatin,subfamily a, member 3Hs.103106340731W56305Homo sapiens mRNA for G7bprotein (G7b gene, located inthe class III region ofthe major histocompatibilitycomplexHs.8334149318H15718AXL receptor tyrosine kinaseHs.51147221208H91843guanine nucleotide binding protein(G protein), alpha transducingactivity polypeptide 1Hs.7457952262H22893KIAA0263 gene productHs.250687294499N69526transient receptor 
potentialchannel 1Hs.202379471256AA034479meiotic recombination (S.cerevisiae) 11 homolog BHs.8638621555T65116myeloid cell leukemia sequence 1(BCL2-related)Hs.2099125755R37131SET domain, bifurcated, 1Hs.8530267051T70335adenosine deaminase, RNA-specific, B1 (homolog ofrat RED1)Hs.78473308063N95309N-deacetylase/N-sulfotransferase(heparan glucosaminyl) 2Hs.102346840W79815aminomethyltransferase (glycinecleavage system protein T)Hs.75516109211T81117tyrosine kinase 2Hs.73769308366N93766folate receptor 1 (adult)Hs.14376110760T90619actin, gamma 1Hs.15154110740T90558sushi-repeat-containing protein,X chromosomeHs.83727309032N92864cleavage and polyadenylationspecific factor 1, 160 kD subunitHs.69771544635AA075297B-factor, properdinHs.12163544672AA074799eukaryotic translation initiationfactor 2, subunit 2 (beta, 38 kD)Hs.211539166201R87490eukaryotic translation initiationfactor 2, subunit 3 (gamma, 52 kD)Hs.80315179283H50251SH3-domain GRB2-like 3Hs.17704120850T95879PERB11 family member in MHCclass I regionHs.63489365849AA025527protein tyrosine phosphatase,non-receptor type 6Hs.80776192966H41410phospholipase C, delta 1Hs.180842323468W45605ribosomal protein L13Hs.2102247510H11603adaptor-related protein complex3, beta 2 subunitHs.27084550080H17934kinesin-like 5 (mitotic kinesin-like protein 1)Hs.7890252494H23010voltage-dependent anion channel 2Hs.85296484874AA037229Integrin beta 3 {alternativelyspliced, clone beta 3C} [human,erythroleukemia cell HEL, mRNAPartial, 409 nt]Hs.75535300051N78927myosin, light polypeptide 2,regulatory, cardiac, slowHs.80680485005AA037708major vault proteinHs.82071343685W69165Cbp/p300-interacting trans-activator, with Glu/Asp-richcarboxy-terminal domain, 2Hs.7846540991R56065v-jun avian sarcoma virus 17oncogene homologHs.99914347097W79487ribosomal protein L22Hs.73818489338AA058525ubiquinol-cytochrome creductase hinge proteinHs.7988944722H06712monocyte to macrophage differ-entiation-associatedHs.92198257128N26830calcium-regulated heat-stableprotein (24 
kD)Hs.179898321445W32294HSPC055 proteinHs.1826846408H09257adenylate kinase 5Hs.114346567001AA152406cytochrome c oxidase subunit VIIapolypeptide 1 (muscle)Hs.15534247306H11054protein kinase C, deltaHs.2961377441AA055242S100 calcium-binding protein A3Hs.799433398R44070hypothetical protein FLJ20551Hs.180069591311AA159002nuclear respiratory factor 1Hs.7675347648H16257endoglin (Osler-Rendu-Webersyndrome 1)Hs.279607592815AA158262calpastatinHs.182979323915W46302ribosomal protein L12Hs.118442595244AA164211cyclin CHs.7513949363H16576partner of RAC1 (arfaptin 2)Hs.76240324895W49669adenylate kinase 1Hs.74578416266W86149DEAD/H (Asp-Glu-Ala-Asp/His)box polypeptide 9 (RNA helicaseA, nuclear DNA helicase II;leukophysin)Hs.259776416764W86580Human 3-hydroxyacyl-CoAdehydrogenase, isoform 2 mRNA,complete cdsHs.14331327145W02662S100 calcium-binding protein A13Hs.181289328416W38424elastase 3, pancreatic (protease E)Hs.173554341640W58185ubiquinol-cytochrome c reductasecore protein IIHs.267819223243H85614protein phosphatase 1, regulatory(inhibitor) subunit 2Hs.8376552414H24390dihydrofolate reductaseHs.17320525760R37231nucleophosmin (nucleolarphosphoproteinB23, numatrin)Hs.2730306373N90702heterogeneous nuclearribonucleoprotein LHs.183994307127N91788protein phosphatase 1, catalyticsubunit, alpha isoformHs.50889307055N89677(clone PWHLC2-24) myosin lightchain 2Hs.211973109285T80836homolog of Yeast RRP4(ribosomal RNA processing 4),3′-5′-exoribonucleaseHs.184776243274H94695ribosomal protein L23aHs.2175246180N55510colony stimulating factor 3receptor (granulocyte)Hs.180946544885AA075385ribosomal protein L5Hs.32952323834W46372keratin, hair, basic, 1Hs.157777202915H54091Homo sapiens mRNA; cDNA.DKFZp434G0118 (from cloneDKFZp434G0118)Hs.8028835091R43793heat shock 70 kD protein-like 1Hs.132243278624N66195aminopeptidase puromycinsensitiveHs.75607470841AA031774myristoylated alanine-rich proteinkinase C substrate (MARCKS,80K-L)Hs.99858233358H79784ribosomal protein L7aHs.8216326094R37282monoamine oxidase 
BHs.82128155195R702625T4 oncofetal trophoblastglycoproteinHs.14655042511R60961myosin, heavy polypeptide 9,non-muscleHs.7533427524R39966exostoses (multiple) 2Hs.85087166004R87406latent transforming growth factorbeta binding protein 4Hs.19844343783H05672inositol 1,4,5-triphosphatereceptor, type 1Hs.64639310347N98760glioma pathogenesis-related proteinHs.243987124194R01769GATA-binding protein 4Hs.6980322522W15247aldo-keto reductase family 7,member A2 (aflatoxin aldehydereductase)Hs.167531195423R89611Homo sapiens mRNA full lengthinsert cDNA clone EUIROIMAGE195423Hs.585546889H10154Homo sapiens mRNA; cDNADKFZp434D0818 (from cloneDKFZp434D0818)Hs.418323181W42634fibroblast activation protein,alpha; sepraseHs.18241833899R44535endonuclease GHs.150555593061AA158728protein predicted by clone 23733Hs.180577133172R26450granulinHs.286272457N35801ribosomal protein L4Hs.75428275935R93344superoxide dismutase 1, soluble(amyotrophic lateral sclerosis 1(adult))Hs.22324150603H17646eukaryotic translation elongationfactor 1 delta (guanine nucleotideexchange protein)Hs.95011428054AA002062syntrophin, beta 1 (dystrophin-associated protein A1, 59 kD,basic component 1)Hs.7577238329R49540nuclear receptor subfamily 3,group C, member 1Hs.242463510899AA099945keratin 8Hs.2774752375H22978G protein-coupled receptor 37(endothelin receptor type B-like)Hs.65424297795N69950tetranectin (plasminogen-bindingprotein)Hs.82906129381R11676cell division cycle 20, S. 
cerevisiaehomologHs.23119485665AA041521ITBA1 geneHs.85570344436W73253hypothetical protein similar tobeta-transducin familyHs.44499344704W73034small EDRK-rich factor 2Hs.274416306510N91803NADH dehydrogenase(ubiquinone) 1 alpha sub-complex, 6 (14 kD, B14)Hs.78547345201W72342Human mRNA for zinc fingerprotein (clone 647)Hs.151706154045R48810KIAA0134 gene productHs.7644346040W72152H1 histone family, member 2Hs.9242346109W72744purine-rich element bindingprotein BHs.1076346130W72751small proline-rich protein 1B(cornifin)Hs.46405346780W78101polymerase (RNA) II (DNAdirected) polypeptide FHs.197728155738R72084carboxylesterase 2 (intestine,liver)Hs.84318308501N95782replication protein A1 (70 kD)Hs.99969347175W80463fusion, derived from t(12; 16)malignant liposarcomaHs.170027243603N49725mouse double minute 2, humanhomolog of; p53-binding proteinHs.78888308755N95211diazepam binding inhibitor (GABAreceptor modulator, acyl-Coenzyme A binding protein)Hs.258730246201N55530heme-regulated initiation factor 2-alpha kinaseHs.180946544915AA075496ribosomal protein L5Hs.88411113635T79220DNA segment on chromosome 65(unique) 49 expressed sequenceHs.3281172857H20073neuronal pentraxin IIHs.256697174619H27885histidine triad nucleotide-binding proteinHs.13456310457N99978Homo sapiens clone 24747 mRNAsequenceHs.15196510032AA053393putative receptor proteinHs.773631276R42876hypothetical proteinHs.100724510275AA053166peroxisome proliferativeactivated receptor, gammaHs.16059366695AA029690HSPC009 proteinHs.81469376052AA039353nucleotide binding protein 1(E. 
coli MinD like)Hs.80684267145N23950high-mobility group (nonhistonechromosomal) protein 2Hs.11878686004T62812metallothionein 2AHs.82251469748AA028056myosin ICHs.65114511604AA115797keratin 18Hs.8383222735H86444bromodomain adjacent to zincfinger domain, 2BHs.11638526088AA076383FACL5 for fatty acid coenzyme Aligase 5Hs.50130343578W69632necdin (mouse) homologHs.2083485849AA040427CDC-like kinase 1Hs.18053226483R39784heat shock 90 kD protein 1, alphaHs.7532342313R60946prohibitinHs.50640306873N91935JAK binding proteinHs.7703143297H05065Sp2 transcription factorHs.64593309391N94313ATP synthase, H+ transporting,mitochondrial F1F0, subunit dHs.13528143567H05938alpha-actinin-2-associated LIMproteinHs.1239309853N94637alanyl (membrane) aminopeptidase(aminopeptidase N, amino-peptidase M, microsomalaminopeptidase, CD13, p150)Hs.7906921505T65541cyclin G2Hs.16944930209R16311protein kinase C, alphaHs.7762831155R42594steroidogenic acute regulatoryprotein relatedHs.7644257094N30791H1 histone family, member 2Hs.3254259977N32606ribosomal protein L23-likeHs.82767563608AA100423sperm specific antigen 2Hs.65114563957AA101381keratin 18Hs.118893563884AA101407p53-responsive gene 2Hs.7722146367H09959choline kinaseHs.15634646597H09978topoisomerase (DNA) II alpha(170 kD)Hs.274364712871AA2819816-phosphofructo-2-kinase/fructose-2,6-biphosphatase 3Hs.42484567179AA148598hypothetical protein FLJ10618Hs.17266547384H10779methylenetetrahydrofolate de-hydrogenase (NADP+ dependent),methenyltetrahydrofolatecyclohydrolase, formyl-tetrahydrofolate sHs.35384198546R94868ring finger protein 1Hs.180655205953H58497serine/threonine kinase 12Hs.83173327182W02748cyclin D3Hs.7519937209R50927protein phosphatase 2, regulatorysubunit B (B56), beta isoformHs.24151537394R49679COX11 (yeast) homolog, cyto-chrome c oxidase assembly proteinHs.28777283919N50797H2A histone family, member LHs.75752285455N63245cytochrome c oxidase subunit VIIbHs.179735340781W56747ras homolog gene family, memberCHs.1636438571R51498hypothetical 
protein FLJ10955Hs.77274143356R74194plasminogen activator, urokinaseHs.86386146164R79099mycloid cell leukemia sequence 1(BCL2-related)Hs.155103222259H83844eukaryotic translation initiationfactor 1A, Y chromosomeHs.75478342260W61185KIAA0956 proteinHs.82316470583AA031928interferon-induced, hepatitis C-associated microtubular aggregateprotein (44 kD)Hs.7542839993R52548superoxide dismutase 1, soluble(amyotrophic lateral sclerosis 1(adult))Hs.94234297884N68934frizzled (Drosophila) homolog 1Hs.1189921759T651273-hydroxy-3-methylglutaryl-Coenzyme A reductaseHs.76152343315W68075decorinHs.165843344021W70096casein kinase 2, beta polypeptideHs.7735625291R17745transferrin receptor (p90, CD71)Hs.7523240704R55993SEC14 (S. cerevisiae)-like 1Hs.2393306175N90546phosphorylase kinase, alpha 1(muscle)Hs.876841144R59152hypothetical protein FLJ10849Hs.8498121640T65431X-ray repair complementingdefective repair in Chinesehamster cells 5 (double-strand-break rejoining; Ku autoantigen,80 kD)Hs.77550307169N91798CDC28 protein kinase 1Hs.227949345522W72422SEC 13 (S. 
cerevisiae)-like 1Hs.181028307509N95162cytochrome c oxidase subunit VaHs.85838156040R72509solute carrier family 16 (mono-carboxylic acid transporters),member 3Hs.181165307988N95281eukaryotic translation elongationfactor 1 alpha 1Hs.108854308000N92290HSPC163 proteinHs.7806042453R61298phosphorylase kinase, betaHs.2670042358R59858Homo sapiens cDNA FLJ10309fis, clone NT2RM2000287Hs.7510327100R37146tyrosine 3-monooxygenase/tryptophan 5-monooxygenaseactivation protein, zetapolypeptideHs.1600342714R60888retinoblastoma-binding protein 4Hs.119387110774T90622KIAA0792 gene productHs.77890357308W93728guanylate cyclase 1, soluble, beta 3Hs.250867160363H22334zona pellucida glycoprotein 3A(sperm receptor)Hs.180450309185N99249ribosomal protein S24Hs.118131309218N938505,10-methenyltetrahydrofolatesynthetase (5-formyltetra-hydrofolate cyclo-ligase)Hs.117729162681H28224keratin 14 (epidermolysis bullosasimplex, Dowling-Meara,Koebner)Hs.195851247237N57938actin, alpha 2, smooth muscle,aortaHs.78854166145R87471ATPase, Na+/K+ transporting,beta 2 polypeptideHs.41688250069H97140dual specificity phosphatase 8Hs.77256121554T97906enhancer of zeste (Drosophila)homolog 2Hs.18116345188H08826high-mobility group (nonhistonechromosomal) protein 17Hs.75319510231AA053076ribonucleotide reductase M2 poly-peptideHs.149436563057AA113038kinesin family member 5BHs.293446623H10294ribonucleotide reductase M1 poly-peptideHs.79914196071R89380lumicanHs.79037567278AA130632heat shock 60 kD protein 1(chaperonin)Hs.180946590007AA155824ribosomal protein L5Hs.85844323685W44537neurotrophic tyrosine kinase,receptor, type 1Hs.2632233430R43993cell cycle related kinaseHs.651747514H12216amiloride-sensitive cationchannel 1, neuronal (degenerin)Hs.4209323859W46361ribosomal protein, mitochondrial,L2Hs.83137162R493173-hydroxymethyl-3-methylglutaryl-Coenzyme A lyase(hydroxymethylglutaricaciduria)Hs.7586223275R39273MAD (mothers againstdecapentaplegic, Drosophila)homolog 4Hs.11781637540R50950sorcinHs.198287139392R65579pregnancy 
specific beta-1-glyco-protein 11Hs.7911738183R49222corticotropin releasing hormonereceptor 1Hs.25984237296R51092H91620p proteinHs.23206842396R60877transcription factor 8 (repressesinterleukin 2 expression)Hs.274344118942T92945hypothetical proteinHs.23479951558H21053breakpoint cluster regionHs.9849323330R38269X-ray repair complementingdefective repair in Chinese hamstercells 1Hs.22584123760R38172DKFZP434D193 proteinHs.9198552482H22986wingless-type MMTV integrationsite family, member 10BHs.173664147016R80150v-erb-b2 avian erythroblasticleukemia viral oncogene homolog2 (neuro/glioblastoma derivedoncogene homolog)Hs.17956653273R16162Human clone 23799 mRNAsequenceHs.18045526222R37437RAD23 (S. cerevisiae) homolog AHs.7722526790R37680ADP-ribosyltransferase (NAD+;poly (ADP-ribose) polymerase)-like 1Hs.6390843168R60538heme oxygenase (decycling) 2Hs.384328140R40503dual specificity phosphatase 7Hs.89529501617AA129588aldo-keto reductase family 1,member A1 (aldehyde reductase)Hs.10804336987R48940Friend leukemia virus integration1Hs.174007133122R26243von Hippel-Lindau syndromeHs.108110546454AA081374DKFZP547E2110 proteinHs.1908310779W19210proteoglycan 1, secretory granuleHs.26814930747R42072putative methyltransferaseHs.12743363489AA019803Homo sapiens clone 24511 mRNAsequenceHs.180255186767H50623major histocompatibility complex,class II, DR beta 1Hs.10832331669R43030ubiquitin-conjugating enzymeE2E 2 (homologous to yeastUBC4/5)Hs.48876123355T99625farnesyl-diphosphate farnesyl-transferase 1Hs.17490547304H10460KIAA0033 proteinHs.76716127599R09178pre-alpha (globulin) inhibitor, H3polypeptideHs.25615380437AA054135CGI-119 proteinHs.247565380875AA056094rhodopsin (retinitis pigmentosa 4,autosomal dominant)Hs.855134872R44450PRP4/STK/WD splicing factorHs.628935871R46034growth factor receptor-boundprotein 2Hs.363149932H29009immunoglobulin (CD79A) bindingprotein 1Hs.812850624H17907phosphatidylserine decarboxylaseHs.18322411T82477Duffy blood groupHs.92323511428AA126009FXYD domain-containing 
iontransport regulator 3Hs.33287145106R77291nuclear factor I/BHs.76293512246AA057636thymosin, beta 10Hs.17775342742W68632p75NTR-associatcd cell deathexecutor; ovarian granulosacell protein (13 kD)Hs.667352600H29041trinucleotide repeat containing 15Hs.168383145112R77293intercellular adhesion molecule 1(CD54), human rhinovirus receptorHs.1420180447R84974fibroblast growth factor receptor 3(achondroplasia, thanatophoricdwarfism)Hs.19196487778AA045185ubiquitin-conjugating enzymeHBUCE1Hs.110637487978AA054590homeo box A10Hs.180919240151H82442inhibitor of DNA binding 2,dominant negative helix-loop-helixproteinHs.275374491456AA150441aldo-keto reductase family 1,member C1 (dihydrodiol de-hydrogenase 1; 20-alpha (3-alpha)-hydroxysteroid dehydrogenase)Hs.28505502133AA126984ubiquitin-conjugating enzyme E2H(homologous to yeast UBC8)Hs.83760359187AA010108troponin I, skeletal, fastHs.274434309917N94503Homo sapiens cDNA FLJ11346fis, clone PLACE1010900Hs.55498503118AA151550geranylgeranyl diphosphatesynthase 1Hs.22028173777H23796SNARE proteinHs.86958359976AA063518interferon (alpha, beta andomega) receptor 2Hs.1368429797R42224hypothetical protein FLJ10761Hs.198951309864N94468jun B proto-oncogeneHs.278626505225AA142922Arg/Abl-interacting proteinArgBP2Hs.7776844495H07123heat shock protein, neuronalDNAJ-like 1Hs.169921548957AA115186general transcription factor II, i,pseudogene 1Hs.159608362864AA019525aldehyde dehydrogenase 10 (fattyaldehyde dehydrogenase)Hs.2320531567R42860membrane protein, palmitoylated 2(MAGUK p55 subfamilymember 2)Hs.27985045476H08115CGI-50 proteinHs.18380069255T54339Ran GTPase activating protein 1Hs.169474561918AA085589DKFZP586J0119 proteinHs.18221731798R43131succinate-CoA ligase, ADP-forming, beta subunitHs.231581561922AA085678myosin, heavy polypeptide 1,skeletal muscle, adultHs.181696364698AA024422zinc finger protein 255Hs.40193123815R01452hypothetical protein KIAA1259Hs.10804371822T52520Friend leukemia virus integration 1Hs.151461667365AA228085embryonic ectoderm 
developmentproteinHs.1384126309R06411O-6-methylguanine-DNA methyl-transferaseHs.19718126379R06556protein tyrosine phosphatase,receptor type, UHs.46423667303AA227555H4 histone family, member GHs.17245847251H10977iduronate 2-sulfatase (Huntersyndrome)Hs.250711375752AA033816dipeptidyl carboxypeptidase 1(angiotensin I converting enzyme)Hs.102135265879N21005signal sequence receptor, delta(translocon-associatedprotein delta)Hs.121559199577R96579CGI-30 proteinHs.5276366919T67474anaphase-promoting complexsubunit 7Hs.22891267666N23174solute carrier family 7 (cationicamino acid transporter, y+system), member 8Hs.64025132373R26526basonuclinHs.75253206506H60000isocitrate dehydrogenase 3 (NAD+)gammaHs.90291416587W86459laminin, beta 2 (laminin S)Hs.9552108323T70592binder of Arl TwoHs.27873653070R16049cell division cycle 42 (GTP-binding protein, 25 kD)Hs.6639450188H17943ring finger protein 4Hs.512033297R43960dynein, cytoplasmic, light poly-peptideHs.73986044T628846-phosphofructo-2-kinase/fructose-2,6-biphosphatase 1Hs.15113437433R50947oxidase (cytochrome c) assembly1-likeHs.8291621912T65288chaperonin containing TCP1,subunit 6A (zeta 1)Hs.17364119345T94329zinc finger protein 79 (pT7)Hs.7395747559H11455RAB5A, member RAS oncogenefamilyHs.15520651480H24014serine/threonine kinase 25 (Ste20,yeast homolog)Hs.8959150182H17883Kallmann syndrome 1 sequenceHs.267319144767R77223endogenous retroviral proteaseHs.9009339039R51827heat shock 70 kD protein 4Hs.69740017R52654cytochrome c-1Hs.17016024743R37588RAB2, member RAS oncogenefamily-likeHs.7953021445T65107M5-14 proteinHs.7592524592R38826proteasome (prosome, macropain)inhibitor subunit 1 (PI31)Hs.1502025700R36870homolog of mouse quaking QKI(KH domain RNA binding protein)Hs.18053225792R36846heat shock 90kD protein 1, alphaHs.36587156436R73564protein phosphatase 1, regulatorysubunit 7Hs.11151526964R37773CGI-43 proteinHs.283842363R61319malic enzyme 3, NADP(+)-dependent, mitochondrialHs.19122501557AA135645eukaryotic translation initiationfactor 
4E-like 3Hs.68318547571AA083845hypothetical protein FLJ20344Hs.8047545923H09542polymerase (RNA) II (DNAdirected) polypeptide J (13.3 kD)Hs.3088846147H09580cytochrome c oxidase subunit VIIapolypeptide 2 likeHs.1288747422H11255Soluble VEGF receptorHs.108931377054AA057615MAGUK protein p55T; ProteinAssociated with Lins 2Hs.173936202498H53121interleukin 10 receptor, betaHs.18066950107H16738conserved gene amplified in osteo-sarcomaHs.1032328897W45438regenerating islet-derived 1 alpha(pancreatic stone protein,pancreatic thread protein)Hs.119597214028H70783stearoyl-CoA desaturase (delta-9-desaturase)Hs.117848285437N66390hemoglobin, epsilon 1Hs.9566540787R56077hypothetical proteinHs.93304238821H65030phospholipase A2, group VII(platelet-activating factoracetylhydrolase, plasma)Hs.195464489918AA114828filamin A, alpha (actin-bindingprotein-280)Hs.169832490387AA120779zinc finger protein 42 (myeloid-specific retinoic acid- responsive)Hs.74471309079N92894gap junction protein, alpha 1, 43kD (connexin 43)Hs.83147491560AA115549guanine nucleotide binding protein-like 1Hs.824828502R37489NADH dehydrogenase(ubiquinone) Fe-S protein 1(75 kD) (NADH-coenzyme Qreductase)Hs.21158428422R40649neurofilament, light polypeptide(68 kD)Hs.18405028627R40473v-Ki-ras2 Kirsten rat sarcoma 2viral oncogene homologHs.79691503974AA130144LIM domain proteinHs.108139504086AA131858zinc finger protein 212Hs.7528029795R42222glycyl-tRNA synthetaseHs.1857363198AA018928phosphodiesterase 6G, cGMP-specific, rod, gammaHs.17306368616T49793transducin-like enhancer of split 2,homolog of Drosophila E(sp1)Hs.180370123883R00785cofilin 1 (non-muscle)Hs.3193967185T52650manic fringe (Drosophila) homologHs.167835365363AA025214acyl-Coenzyme A oxidaseHs.90093128251R11513heat shock 70 kD protein 4Hs.587129068R10850arylacetamide deacetylase(esterase)Hs.8225448584H16096zuotin related factor 1Hs.167839324749W47158KIAA0395 proteinHs.152818272871N36004ubiquitin specific protease 8Hs.1806936128R62434protease, cysteine, 1 
(legumain)Hs.7548521625T65577omithine aminotransferase(gyrate atrophy)Hs.8386941134R58972hypothetical proteinHs.478822490T87616KIAA0253 proteinHs.37501214636H73190MAD (mothers againstdecapentaplegic, Drosophila)homolog 5Hs.19942951483H24016Homo sapiens mRNA; cDNADKFZp434M2216 (from cloneDKFZp434M2216)Hs.16688751949H24315copine IHs.9337923234R38667eukaryotic translation initiationfactor 4BHs.24154339177R54415DKFZP586F1524 proteinHs.25333470402AA031292interleukin 1 receptor, type IIHs.11223525983AA076246isocitrate dehydrogenase 1(NADP+), solubleHs.2064529070AA064827vimentinHs.15507941356R59165protein phosphatase 2, regulatorysubunit B (B56), alpha isoformHs.77502530811AA070029methionine adenosyltransferase II,alphaHs.275924345643W71999dystrophia myotonica-containingWD repeat motifHs.227729490760AA133163FK506-binding protein 2 (13 kD)Hs.52644309290N93925SKAP55 homologueHs.11574028572R40902KIAA0210 gene productHs.8228528596R37481phosphoribosylglycinamideformyltransferase,phosphoribosylglycinamidesynthetase, phosphoribosyl-aminoimidazole synthetaseHs.7581228884R40253phosphoenolpyruvate carboxy-kinase 2 (mitochondrial)Hs.23978363287AA019362scaffold attachment factor BHs.27885745625H08426heterogeneous nuclear ribonucleo-protein H2 (H′)Hs.25035666087AA193554chloride intracellular channel 4 likeHs.234546126487R06623GMPR2 for guanosine mono-phosphate reductase isologHs.184760264287N21190CCAAT-box-binding transcriptionfactorHs.77929128773R16755excision repair cross-comple-menting rodent repairdeficiency, complementationgroup 3 (xeroderma pigmentosumgroup B complemHs.7593234298R44350N-ethylmaleimide-sensitive factorattachment protein, alphaHs.15714548300H14474tetracycline transporter-like proteinHs.2227612403AA179189CCAAT/enhancer binding protein(C/EBP), gammaHs.205842220120H82605Homo sapiens mRNA; cDNADKFZp434L231 (from cloneDKFZp434L231)Hs.99855146605R79948formyl peptide receptor-like 1Hs.69423526283AA079782kallikrein 10Hs.153678485750AA039934reproduction 
8Hs.279651346688W74647melanoma inhibitory activityHs.169886489919AA114835tenascin XAHs.1757774100AA442123L1 cell adhesion molecule (hydro-cephalus, stenosis of aqueductof Sylvius 1, MASA (mentalretardation, aphasia, shuffling gaHs.250911545323AA076582interleukin 13 receptor, alpha 1Hs.25615546600AA084517DnaJ-like heat shock protein 40Hs.7603844975H08820isopentenyl-diphosphate deltaisomeraseHs.261285550127AA101201pleiotropic regulator 1 (PRL1,Arabidopsis homolog)Hs.931562597AA086247myosin, heavy polypeptide 2,skeletal muscle, adultHs.108957321973W3780140S ribosomal protein S27 isoformHs.247309123782R01201succinate-CoA ligase, GDP-forming, beta subunitHs.96247645235AA199863translin-associated factor XHs.27994632132R42928methionine-tRNA synthetaseHs.7593232531R43287N-ethylmaleimide-sensitive factorattachment protein, alphaHs.95998126314R06415Friedreich ataxiaHs.990873531T55560nitrogen fixation cluster-likeHs.8204333109R44799D123 gene productHs.67726128000R09347macrophage receptor withcollagenous structureHs.79187265680N25352coxsackie virus and adenovirusreceptorHs.74847723H11614fibroblast growth factor receptor 1(fms-related tyrosine kinase 2,Pfeiffer syndrome)Hs.193852199655R96618ATP-binding cassette, sub-family C (CFTR/MRP), member 2Hs.79748267637N25433solute carrier family 3 (activatorsof dibasic and neutral aminoacid transport), member 2Hs.15467948114H11494synaptotagmin 1Hs.81915382295AA062858leukemia-associated phospho-protein p18 (stathmin)Hs.7398649237H15069CDC-like kinase 2Hs.19463849548H15431polymerase (RNA) II (DNAdirected) polypeptide DHs.78596208005H60553proteasome (prosome, macropain)subunit, beta type, 5Hs.94498277906N63398leukocyte immunoglobulin-likereceptor, subfamily A (withTM domain), member 2Hs.167835210862H65660acyl-Coenzyme A oxidaseHs.14989450754H18070mitochondrial translationalinitiation factor 2Hs.15443722165T66157phosphodiesterase 2A,cGMP-stimulatedHs.102876328750W45402pancreatic lipaseHs.199160511356AA086055myeloid/lymphoid or mixed-lineage 
leukemia (trithorax(Drosophila) homolog)Hs.9177323534R38106protein phosphatase 2 (formerly2A), catalytic subunit, alphaisoformHs.1337024175R39324DKFZP564G0222 proteinHs.75576232629H72599plasminogenHs.8035025263R12792protein phosphatase 2 (formerly2A), catalytic subunit, beta isoformHs.8392025457R39849peptidylglycine alpha-amidatingmonooxygenaseHs.154510529844AA070863carbonyl reductase 3Hs.7886926531R38473transcription elongation factor A(SII), 1Hs.676226689R39132hypothetical proteinHs.147176531496AA074202Homo sapiens eps15R mRNA,partial cdsHs.3147227920R40486transforming growth factor beta-activated kinase-binding protein 1Hs.7555143960H04825Ras suppressor protein 1Hs.198248320285W04613UDP-Gal:betaGlcNAc beta 1,4-galactosyltransferase, polypeptide1Hs.85937561840AA086285myosin-binding protein C, fast-typeHs.29222125674R07481zinc finger protein 76 (expressedin testis)Hs.59732936R43808glutamic-oxaloacetic transaminase1, soluble (aspartate aminotrans-ferase 1)Hs.75412196501R91550Arginine-rich proteinHs.23929837210R50928microtubule-associated protein 4Hs.11872551680H23998selenophosphate synthetase 2Hs.2043526334AA079855adenine nucleotide translocator 1(skeletal muscle)Hs.142526852AA113154sulfotransferase family 1A,phenol-preferring, member 1Hs.1122328095R40321isocitrate dehydrogenase 1(NADP+), solubleHs.19469228615R40452cysteine desulfuraseHs.10862328330R40660thrombospondin 2Hs.110165358675W94107ribosomal protein L26 homologHs.16584351528H20727casein kinase 2, beta polypeptideHs.10811244319H05287histone fold protein CHRAC17;DNA polymerase epsilon p17subunitHs.7791729799R42127ubiquitin carboxyl-terminalesterase L3 (ubiquitin thiolesterase)Hs.17958120278T96983cerebroside (3′-phospho-adenylylsulfate:galactosylcer-amide 3′) sulfotransferaseHs.20912179130H50183adenomatous polyposis coli likeHs.279591256357H94013Cyclin-dependent kinase 7Hs.3656631097R41791LIM domain kinase 1Hs.78225186308H29761annexin A1Hs.8348468607T53300SRY (sex determining region Y)-box 
4Hs.40735364722AA024552frizzled (Drosophila) homolog 3Hs.586232316R42908hypothetical proteinHs.251972193602H47573complement component 3Hs.104433665299AA195289Homo sapiens napsin 2 precursor,mRNA, partial sequenceHs.113052125148R05309RNA cyclase homologHs.76494565991AA121850proline arginine-rich end leucine-rich repeat proteinHs.33713203385H54815myo-inositol 1-phosphatesynthase A1Hs.7970949287H15647phosphotidylinositol transferproteinHs.1193082941T69406nuclear receptor subfamily 0,group B, member 2Hs.7534836060R46376proteasome (prosome, macropain)activator subunit 1 (PA28 alpha)Hs.154320273466N36891ubiquitin-activating enzyme E1C(homologous to yeast UBA3)Hs.1189208369H62837E2F transcription factor 3


[0034]

TABLE II (Group II):

| Unigene Cluster | IMAGE ID | Genbank Accession no. | Putative gene name |
|---|---|---|---|
| Hs.107600 | 51686 | H24004 | EST |
| Hs.109212 | 190915 | H39221 | ESTs, Weakly similar to Kelch motif containing protein [H. sapiens] |
| NA | 80790 | T63042 | NA |
| Hs.30868 | 236388 | H62405 | ESTs |
| NA | 322334 | NA | NA |
| NA | 129855 | NA | NA |
| Hs.20588 | 210873 | H67736 | ESTs, Moderately similar to PAHO_HUMAN PANCREATIC HORMONE PRECURSOR [H. sapiens] |
| Hs.146215 | 153571 | R48535 | ESTs |
| Hs.222195 | 301238 | N80787 | ESTs |
| Hs.29403 | 156906 | R74233 | ESTs, Weakly similar to DDX8 HUMAN PROBABLE ATP-DEPENDENT RNA HELICASE HRH1 [H. sapiens] |
| Hs.168236 | 113949 | T79758 | ESTs |
| Hs.246023 | 264571 | N20320 | EST |
| NA | 129757 | NA | NA |
| NA | 108340 | T70599 | NA |
| Hs.117565 | 142760 | R71081 | ESTs |
| Hs.103420 | 471619 | AA035469 | ESTs |
| NA | 27300 | NA | NA |
| NA | 60298 | NA | NA |
| Hs.104623 | 214600 | H71226 | ESTs, Highly similar to KIAA0940 protein [H. sapiens] |
| NA | 53082 | NA | NA |
| Hs.74052 | 307138 | N93721 | ESTs |
| Hs.8977 | 67397 | T49325 | ESTs |
| NA | 567395 | AA130843 | NA |
| Hs.113980 | 202051 | R99560 | ESTs, Weakly similar to [Human endogenous retrovirus type C oncovirus sequence.], gene product [H. sapiens] |
| Hs.270197 | 66996 | T69707 | ESTs |
| NA | 213894 | H72939 | NA |
| Hs.119756 | 38816 | R49144 | ESTs |
| Hs.183071 | 108309 | T70568 | ESTs |
| Hs.36102 | 232772 | H72723 | ESTs, Highly similar to MT1B HUMAN METALLOTHIONEIN-IB [H. sapiens] |
| NA | 67885 | T52774 | NA |
| Hs.203367 | 134322 | R31938 | ESTs |
| NA | 72672 | T50400 | NA |
| Hs.31171 | 148609 | H12612 | ESTs |
| NA | 530744 | AA069924 | NA |
| Hs.21667 | 245813 | N55301 | ESTs |
| Hs.12152 | 29602 | R42296 | ESTs, Moderately similar to SRPB_MOUSE SIGNAL RECOGNITION PARTICLE RECEPTOR BETA SUBUNIT [M. musculus] |
| Hs.93967 | 180161 | R85501 | ESTs |
| NA | 66469 | NA | NA |
| Hs.169161 | 35000 | R45094 | ESTs, Moderately similar to MAON_HUMAN NADP-DEPENDENT MALIC ENZYME, MITOCHONDRIAL PRECURSOR [H. sapiens] |
| Hs.18627 | 37178 | R49398 | ESTs, Weakly similar to GP36b glycoprotein [H. sapiens] |
| NA | 219689 | H80014 | NA |
| Hs.31848 | 525319 | AA069144 | ESTs, Weakly similar to hypothetical protein [H. sapiens] |
| NA | 526038 | AA076128 | NA |
| NA | 146842 | R80670 | NA |
| Hs.125042 | 239510 | H81265 | ESTs |
| NA | 545120 | AA075716 | NA |
| Hs.107382 | 44151 | H05814 | ESTs |
| NA | 29532 | R41566 | NA |
| Hs.125729 | 310600 | N99898 | ESTs, Weakly similar to zinc finger protein zfp31 [H. sapiens] |
| NA | 548932 | AA115168 | NA |
| NA | 124187 | R01274 | NA |
| Hs.31110 | 47703 | H12084 | ESTs, Weakly similar to MAGE-B4 [H. sapiens] |
| Hs.13155 | 67006 | T69711 | EST |
| NA | 530033 | AA070488 | NA |
| NA | 530736 | AA069921 | NA |
| NA | 531578 | AA074126 | NA |
| Hs.6820 | 28213 | R40787 | ESTs, Weakly similar to putative [C. elegan |
| NA | 546973 | AA083382 | NA |
| Hs.91785 | 31123 | R42362 | ESTs |
| Hs.230064 | 108369 | T77828 | EST |
| Hs.265200 | 624513 | AA187228 | ESTs, Highly similar to S29331 glutamate dehydrogenase - human [H. sapiens] |
| Hs.22893 | 32643 | R43166 | ESTs |
| NA | 66347 | NA | NA |
| Hs.269425 | 381260 | AA057000 | ESTs, Highly similar to dJ283E3.6.1 [H. sapiens] |
| Hs.36269 | 416646 | W86446 | ESTs, Weakly similar to ODB2 HUMAN LIPOAMIDE ACYLTRANSFERASE COMPONENT OF BRANCHED-CHAIN ALPHA-KETO ACID DEHYDROGENASE COMPL |
| NA | 53024 | R15882 | NA |
| Hs.278871 | 51017 | H19309 | ESTs, Weakly similar to AC007228_2 BC37295_1 [H. sapiens] |
| Hs.10846 | 152989 | R50746 | ESTs, Weakly similar to JH0783 diamine N-acetyltransferase [H. sapiens] |
| Hs.137361 | 488337 | AA046643 | ESTs, Weakly similar to RAS-RELATED PROTEIN RAB-7 [H. sapiens] |
| NA | 530870 | AA070154 | NA |
| NA | 531411 | AA071566 | NA |
| NA | 544846 | AA075338 | NA |
| Hs.177276 | 202575 | H53817 | ESTs |
| Hs.191112 | 67022 | T69727 | ESTs |
| NA | 68185 | T52983 | NA |
| Hs.131897 | 67064 | T70341 | ESTs |
| Hs.24889 | 41388 | R56121 | ESTs |
| NA | 66327 | NA | NA |
| NA | 544820 | AA075280 | NA |
| Hs.143333 | 193846 | H51750 | EST |
| NA | 526156 | AA076627 | NA |
| NA | 530281 | AA112048 | NA |
| Hs.44426 | 346063 | W72646 | ESTs, Weakly similar to GSHH HUMAN PHOSPHOLIPID HYDROPEROXIDE GLUTATHIONE PEROXIDASE [H. sapiens] |
| NA | 545623 | AA078835 | NA |
| Hs.43897 | 257368 | N27163 | ESTs, Weakly similar to P2CA HUMAN PROTEIN PHOSPHATASE 2C ALPHA ISOFORM [H. sapiens] |
| Hs.172080 | 129000 | R10363 | ESTs |
| Hs.146182 | 208169 | H60623 | ESTs, Weakly similar to lactase phlorizinhydrolase [H. sapiens] |
| Hs.251946 | 418006 | W90707 | ESTs, Moderately similar to PAB1_HUMAN POLYADENYLATE-BINDING PROTEIN 1 [H. sapiens] |
| Hs.177276 | 202575 | H53817 | ESTs |
| NA | 544954 | AA075353 | NA |
| Hs.269605 | 545077 | AA075680 | ESTs, Moderately similar to SUCCINATE DEHYDROGENASE [H. sapiens] |
| Hs.26481 | 43460 | H05933 | ESTs, Weakly similar to NS1-binding protein [H. sapiens] |
| Hs.9299 | 31987 | R43047 | ESTs |
| Hs.106356 | 32991 | R44770 | ESTs |
| Hs.173121 | 486296 | AA044079 | ESTs |
| NA | 545681 | AA079371 | NA |
| NA | 546318 | AA084033 | NA |
| Hs.261330 | 548702 | AA125823 | ESTs, Highly similar to dJ109F14.2 [H. sapiens] |
| NA | 69410 | NA | NA |
| NA | 67419 | T49342 | NA |
| Hs.134013 | 381968 | AA063646 | ESTs, Moderately similar to NK homeobox protein [H. sapiens] |

[0035]

3





TABLE III










(Group III):












Genbank



Unigene
IMAGE
Accession



Cluster
ID
no.
Putative gene name













Hs.183864
328527
W40254
elastase 3B


NA
119202
T94099
NA


Hs.30352
429312
AA007373
ribosomal protein S6 kinase, 52





kD, polypeptide 1


Hs.79474
38287
R49224
tyrosine 3-monooxygenase/





tryptophan 5-monooxygenase





activation protein,





epsilon polypeptide


Hs.177592
141028
R66697
ribosomal protein, large, P1


Hs.184093
510608
AA099408
HERV-H LTR-associating 1


Hs.87149
290759
N67642
integrin, beta 3 (platelet glyco-





protein IIIa, antigen





CD61)


NA
144825
R76566
NA


Hs.151123
290283
N64471
neuronal Shc


Hs.4217
145899
R79161
collagen, type VI, alpha 2


Hs.100000
122381
T99218
S100 calcium-binding protein A8





(calgranulin A)


Hs.249982
40205
R52103
cathepsin B


Hs.4814
40617
R55744
mannosidase, alpha, class 1B,





member 1


Hs.154654
25594
R15113
cytochrome P450, subfamily I





(dioxin-inducible), poly-





peptide 1 (glaucoma 3,





primary infantile)


NA
527085
AA114048
NA


Hs.2250
153025
R50354
leukemia inhibitory factor





(cholinergic differentiation





factor)


Hs.25590
153589
R48580
stanniocalcin


Hs.2633
108222
T69781
desmoglein 1


Hs.85701
345430
W72473
phosphoinositide-3-kinase,





catalytic, alpha polypeptide


Hs.211578
345935
W72201
MAD (mothers against





decapentaplegic, Drophila)





homolog 3


Hs.2030
205185
H59861
thrombomodulin


Hs.115396
155233
R70379
immunoglobulin heavy constant





delta


Hs.366
489620
AA099044
6-pyruvoyltetrahydropterin





synthase


Hs.1675
309295
N93935
gamma-glutamyltransferase-like





activity 1


Hs.84229
38040
R59398
splicing factor, arginine/serine-





rich 8





(suppressor-of-white-apricot,





Drosophila homolog)


Hs.43857
113822
T77062
similar to glucosamine-6-sulfatases


Hs.55967
309974
N99100
short stature homeobox 2


Hs.86347
31156
R42595
ESTs, Weakly similar to predicted





using Genefinder [C. elegans]


Hs.79059
45133
H07895
transforming growth factor, beta





receptor III (betaglycan, 300 kD)


NA
256612
NA
NA


Hs. 82269
258229
N30652
progestagen-associated endometrial





protein (placental protein 14,





pregnancy-associated endometrial





alpha-2-globulin, alpha u


Hs.77840
261258
H98114
annexin A4


Hs.9075
562928
AA085850
serine/threonine kinase 17a





(apoptosis-inducing)


Hs.78056
32041
R41770
cathepsin L


Hs.789
323238
W42723
GRO1 oncogene (melanoma





growth stimulating activity,





alpha)


Hs.77572
47493
H11583
BCL2/adenovirus E1B 19 kD-





interacting protein 1


Hs.107966
199180
R95740
paraoxonase 3


Hs.22026
129610
R16547
ESTs


Hs.154443
200536
R99175
minichromosome maintenance





deficient (S. cerevisiae) 4


Hs.85137
591617
AA158803
cyclin A2


Hs.31130
48276
H12262
transmembrane 7 superfamily





member 2


Hs.7645
201352
R98600
fibrinogen, B beta polypeptide


Hs.31137
66972
T67544
protein tyrosine phosphatase,





receptor type, epsilon poly-





peptide


Hs.86347
270794
N32932
ESTs, Weakly similar to predicted





using Genefinder [C. elegans]


Hs.83341
49318
H15718
AXL receptor tyrosine kinase


Hs.202362
270895
N32504
ESTs, Weakly similar to S71091





acetyl-CoA carboxylase





[H. sapiens]


Hs.170114
270560
N33237
KIAA0061 protein


Hs.24309
133914
R28671
hypothetical protein FLJ11106


NA
108296
T70555
NA


Hs.94360
82205
T68873
metallothionein 1L


Hs.181392
135225
R32850
major histocompatibility complex,





class I, E


Hs.81118
274197
H49887
leukotriene A4 hydrolase


Hs.104203
36992
R48944
ESTs


Hs.17411
327073
W02696
KIAA0699 protein


Hs.82269
327077
W02698
progestagen-associated endometrial





protein (placental protein 14,





pregnancy-associated endometrial





alpha-2-globulin, alpha u


NA
108418
T77845
NA


Hs.94382
279363
N48691
adenosine kinase


Hs.6456
211555
H56330
chaperonin containing TCP1,





subunit 2 (beta)


Hs.169907
21994
T66320
glutathione S-transferase A4


Hs.74561
428909
AA004817
alpha-2-macroglobulin


Hs.274260
108190
T69749
ATP-binding cassette, sub-family





C (CFTR/MRP), member 6


Hs.13225
22396
T87624
UDP-Gal:betaGlcNAc beta 1,4-





galactosyltransferase, polypeptide





4


Hs.79345
22304
T82485
coagulation factor VIIIc,





procoagulant component





(hemophilia A)


Hs.182490
286024
N64275
leucine-rich protein mRNA


Hs.75627
141979
R69023
CD14 antigen


Hs.153614
287575
N62125
retinitis pigmentosa GTPase





regulator


Hs.20423
143388
R74208
NOT4 (negative regulator of





transcription 4, yeast) homolog


NA
52365
H24082
NA


Hs.194720
52150
H24068
ATP-binding cassette, sub-family





G (WHITE), member 2


Hs.75410
342231
W61143
heat shock 70 kD protein 5





(glucose-regulated protein,





78 kD)


Hs.799
24365
R37964
diphtheria toxin receptor (heparin-





binding epidermal growth factor-





like growth factor)


Hs.31819
52295
H24360
ESTs, Weakly similar to





thioredoxin-like protein





[H. sapiens]


Hs.3280
297727
N69907
caspase 6, apoptosis-related





cysteine protease


Hs.18212
230060
H68190
DNA segment on chromosome X





(unique) 9879 expressed sequence


Hs.77274
143356
R74194
plasminogen activator, urokinase


Hs.6066
24792
R38781
Rho guanine nucleotide exchange





factor (GEF) 4


Hs.228572
150162
H04461
EST


Hs.198891
40240
R55052
serine/threonine-protein kinase





PRP4 homolog


NA
231916
H92881
NA


Hs.118845
301128
N81117
troponin C, slow


Hs.75517
526215
AA076664
laminin, beta 3 (nicein (125 kD),





kalinin (140 kD), BM600 (125





kD))


Hs.75819
40348
R54793
glycoprotein M6A


Hs.75354
40567
R55251
GCN1 (general control of amino-





acid synthesis 1, yeast)-like 1


Hs.63510
301301
N80830
KIAA0141 gene product


Hs.199533
21418
T65374
ESTs


NA
238357
H64393
NA


Hs.7158
307072
N89678
DKFZP566H073 protein


Hs.167292
345648
W72064
ESTs, Weakly similar to zinc





finger protein C2H2-25





[H. sapiens]


Hs.76366
345703
W71991
BCL2-antagonist of cell death


Hs.87149
200209
R97831
integrin, beta 3 (platelet glyco-





protein IIIa, antigen CD61)


Hs.251653
41413
R56886
tubulin, beta, 2


Hs.117852
238349
H64389
ATP-binding cassette, sub-family





D (ALD), member 2


Hs.115537
26189
R20620
ESTs, Weakly similar to





MICROSOMAL DIPEPTIDASE





PRECURSOR [H. sapiens]


Hs.7844
42075
R60845
golgi autoantigen, golgin subfamily





b, macrogolgin (with trans-





membrane signal), 1


Hs.1239
109164
T81089
alanyl (membrane) aminopeptidase





(aminopeptidase N, amino-





peptidase M, microsomal amino-





peptidase, CD13, p150)


Hs.77899
307949
N92266
tropomyosin 1 (alpha)


Hs.8123
346824
W78007
chromobox homolog 3 (Drosophila





HP1 gamma)


Hs.171731
240062
H82236
solute carrier family 14 (urea





transporter), member 1 (Kidd





blood group)


Hs.2551
241489
H90431
adrenergic, beta-2-, receptor,





surface


Hs.184161
310131
N98616
exostoses (multiple) 1


Hs.75596
376696
AA046615
interleukin 2 receptor, beta


Hs.1422
347751
W81586
Gardner-Rasheed feline sarcoma





viral (v-fgr) oncogene homolog


Hs.507
357785
W95595
corneodesmosin


Hs.75445
27542
R40157
SPARC-like 1 (mast9, hevin)


Hs.26691
43006
R59736
ESTs


NA
163482
H14182
NA


NA
544875
AA075381
NA


Hs.168236
113951
T79759
ESTs


Hs.256309
309944
N94512
Human beta-1D integrin mRNA,





cytoplasmic domain, partial cds


Hs.182741 | 310447 | N98462 | TIA1 cytotoxic granule-associated RNA-binding protein-like 1
Hs.215595 | 120148 | T95078 | guanine nucleotide binding protein (G protein), beta polypeptide 1
NA | 547027 | AA082916 | NA
NA | 68351 | NA | NA
Hs.8015 | 44909 | H07857 | ubiquitin specific protease 21
Hs.9475 | 67188 | T52634 | ESTs, Weakly similar to GTR5_HUMAN GLUCOSE TRANSPORTER TYPE 5, SMALL INTESTINE [H. sapiens]
Hs.234773 | 509919 | AA056421 | ecotropic viral integration site 1
Hs.34853 | 260740 | H97932 | inhibitor of DNA binding 4, dominant negative helix-loop-helix protein
Hs.155606 | 364554 | AA022577 | paired mesoderm homeo box 1
Hs.154095 | 261759 | H99156 | zinc finger protein 143 (clone pHZ-1)
NA | 322339 | NA | NA
Hs.81737 | 262750 | H99622 | palmitoyl-protein thioesterase 2
Hs.181125 | 194467 | R83196 | immunoglobulin lambda locus
Hs.278607 | 263725 | H99680 | ubiquitin activating enzyme E1-like protein
Hs.404 | 46936 | H10052 | myeloid/lymphoid or mixed-lineage leukemia (trithorax (Drosophila) homolog); translocated to, 3
Hs.1197 | 128225 | R11507 | heat shock 10 kD protein 1 (chaperonin 10)
Hs.83050 | 128937 | R10699 | phosphoinositide-3-kinase, regulatory subunit 4, p150
Hs.1012 | 129270 | R11065 | complement component 4-binding protein, alpha
Hs.274260 | 200946 | R97755 | ATP-binding cassette, sub-family C (CFTR/MRP), member 6
Hs.28532 | 68011 | T49766 | ESTs, Weakly similar to BAI1-associated protein 1 [H. sapiens]
Hs.76722 | 594019 | AA165157 | CCAAT/enhancer binding protein (C/EBP), delta
Hs.75621 | 78425 | T61377 | protease inhibitor 1 (anti-elastase), alpha-1-antitrypsin
Hs.9994 | 201995 | R99339 | lipase, hepatic
Hs.74111 | 49402 | H15565 | RNA-binding protein (autoantigenic)
Hs.848 | 610317 | AA171524 | FK506-binding protein 4 (59 kD)
Hs.73853 | 612944 | AA181547 | bone morphogenetic protein 2
Hs.76688 | 83060 | T67816 | carboxylesterase 1 (monocyte/macrophage serine esterase 1)
Hs.197728 | 85578 | T72257 | carboxylesterase 2 (intestine, liver)
Hs.75155 | 85640 | T62051 | transferrin
Hs.78036 | 137257 | R36683 | solute carrier family 6 (neurotransmitter transporter, noradrenalin), member 2
Hs.107082 | 139304 | R63714 | ESTs, Moderately similar to alternatively spliced product using exon 13A [H. sapiens]
Hs.6518 | 50866 | H17125 | ganglioside expression factor 2


Hs.198253 | 139530 | R62322 | major histocompatibility complex, class II, DQ alpha 1
Hs.265262 | 141115 | R66326 | colony stimulating factor 2 receptor, beta, low-affinity (granulocyte-macrophage)
Hs.124186 | 141298 | R63802 | ring finger protein 2
Hs.82985 | 340928 | W57799 | collagen, type V, alpha 2
Hs.155924 | 23235 | R39184 | cAMP responsive element modulator
Hs.272630 | 512924 | AA063307 | vacuolar proton pump delta polypeptide
Hs.155140 | 21658 | T65122 | casein kinase 2, alpha 1 polypeptide
NA | 145407 | R78034 | NA
Hs.182490 | 53071 | R16051 | leucine-rich protein mRNA
Hs.184669 | 153055 | R50369 | zinc finger protein 144 (Mel-18)
Hs.50651 | 171569 | H18190 | Janus kinase 1 (a protein tyrosine kinase)
Hs.83393 | 344997 | W72895 | cystatin E/M
Hs.268915 | 108235 | T69792 | ESTs
Hs.263671 | 488635 | AA044896 | Homo sapiens mRNA; cDNA DKFZp434I0812 (from clone DKFZp434I0812); partial cds
Hs.85701 | 250142 | N23534 | phosphoinositide-3-kinase, catalytic, alpha polypeptide
Hs.67397 | 530888 | AA069960 | homeo box A1
Hs.153612 | 308682 | N95462 | ATP-binding cassette, sub-family F (GCN20), member 2
Hs.9873 | 244132 | N52439 | Homo sapiens mRNA; cDNA DKFZp434E0620 (from clone DKFZp434E0620); partial cds
NA | 544792 | AA075319 | NA
Hs.75682 | 43714 | H05738 | autoantigen
Hs.251972 | 28386 | R37386 | complement component 3
Hs.155392 | 43852 | H04819 | collapsin response mediator protein 1
NA | 545884 | AA079529 | NA
Hs.98493 | 29438 | R41276 | X-ray repair complementing defective repair in Chinese hamster cells 1
Hs.64016 | 250640 | H98523 | protein S (alpha)
Hs.119251 | 546466 | AA084355 | ubiquinol-cytochrome c reductase core protein I
Hs.77448 | 44289 | H06253 | aldehyde dehydrogenase 4 (glutamate gamma-semialdehyde dehydrogenase; pyrroline-5-carboxylate dehydrogenase)
Hs.1139 | 360293 | AA013183 | cold shock domain protein A
Hs.41066 | 252534 | H87476 | ESTs, Moderately similar to EFGM_RAT ELONGATION FACTOR G, MITOCHONDRIAL PRECURSOR [R. norvegicus]


NA | 547154 | AA084870 | NA
Hs.18894 | 320170 | W04536 | adaptor-related protein complex 1, mu 2 subunit
Hs.160318 | 561873 | AA086437 | FXYD domain-containing ion transport regulator 1 (phospholemman)
Hs.234773 | 625011 | AA181023 | ecotropic viral integration site 1
Hs.151573 | 33049 | R44018 | cryptochrome 1 (photolyase-like)
Hs.179606 | 46631 | H10061 | nuclear RNA helicase, DECD variant of DEAD box family
Hs.30965 | 46843 | H10072 | neuronal Shc adaptor homolog
Hs.107444 | 46667 | H09790 | Homo sapiens cDNA FLJ20562 fis, clone KAT11992
Hs.154103 | 127614 | R09181 | LIM protein (similar to rat protein kinase C-binding enigma)
NA | 75009 | T51849 | NA
Hs.25615 | 200648 | R99249 | CGI-119 protein
Hs.78225 | 267361 | N24938 | annexin A1
Hs.118666 | 49272 | H16503 | Human clone 23759 mRNA, partial cds
Hs.166017 | 268834 | N26000 | microphthalmia-associated transcription factor
Hs.77424 | 81221 | T57079 | Fc fragment of IgG, high affinity Ia, receptor for (CD64)
NA | 611899 | AA178923 | NA
Hs.160786 | 83166 | T68162 | argininosuccinate synthetase
Hs.19121 | 36905 | R49169 | adaptor-related protein complex 2, alpha 2 subunit
Hs.89649 | 49995 | H28958 | epoxide hydrolase 1, microsomal (xenobiotic)
Hs.84981 | 36341 | R62442 | X-ray repair complementing defective repair in Chinese hamster cells 5 (double-strand-break rejoining; Ku autoantigen, 80 kD)
Hs.241567 | 139988 | R64674 | RNA binding motif, single stranded interacting protein 1
Hs.78353 | 21899 | T65211 | SFRS protein kinase 2
Hs.75613 | 429981 | AA034145 | CD36 antigen (collagen type I receptor, thrombospondin receptor)
Hs.4096 | 340876 | W57561 | KIAA0742 protein
Hs.75428 | 48198 | H11120 | superoxide dismutase 1, soluble (amyotrophic lateral sclerosis 1 (adult))
NA | 513140 | AA063384 | NA
Hs.239176 | 148379 | H13300 | insulin-like growth factor 1 receptor
NA | 235880 | H52231 | NA
Hs.203246 | 239973 | H79874 | ESTs, Moderately similar to ZN91_HUMAN ZINC FINGER PROTEIN 91 [H. sapiens]
Hs.75511 | 267256 | N23360 | connective tissue growth factor
Hs.180383 | 42464 | R59865 | dual specificity phosphatase 6
Hs.15114 | 489708 | AA099583 | ras homolog gene family, member D


NA | 530923 | AA070369 | NA
Hs.155020 | 27326 | R37020 | putative methyltransferase
Hs.131255 | 245269 | N54563 | ubiquinol-cytochrome c reductase binding protein
Hs.1252 | 356915 | W92730 | apolipoprotein H (beta-2-glycoprotein I)
Hs.2430 | 501972 | AA128103 | transcription factor-like 1
Hs.21618 | 28938 | R40823 | ESTs
Hs.76392 | 309912 | N94493 | aldehyde dehydrogenase 1, soluble
Hs.1521 | 503206 | AA148925 | immunoglobulin mu binding protein 2
Hs.2780 | 155061 | R70216 | jun D proto-oncogene
NA | 546664 | AA084411 | NA
Hs.169921 | 548957 | AA115186 | general transcription factor II, i, pseudogene 1
Hs.151236 | 547732 | AA084099 | highly charged protein
Hs.173135 | 67286 | T49194 | dual-specificity tyrosine-(Y)-phosphorylation regulated kinase 2
Hs.153998 | 363167 | AA019082 | creatine kinase, mitochondrial 1 (ubiquitous)
Hs.155597 | 257625 | N30864 | D component of complement (adipsin)
Hs.75111 | 45354 | H09725 | protease, serine, 11 (IGF binding)
Hs.70266 | 62112 | T41077 | yeast Sec31p homolog
Hs.217493 | 624382 | AA182794 | annexin A2
Hs.197289 | 187804 | H44007 | rab3 GTPase-activating protein, non-catalytic subunit (150 kD)
Hs.106674 | 46154 | H09066 | BRCA1 associated protein-1 (ubiquitin carboxy-terminal hydrolase)
Hs.77805 | 32649 | R43304 | ATPase, H+ transporting, lysosomal (vacuolar proton pump) 31 kD; Vacuolar proton-ATPase, subunit B; V-ATPase, subunit E
Hs.123122 | 667355 | AA228030 | FSH primary response (LRPR1, rat) homolog 1
Hs.66731 | 667188 | AA236353 | homeo box B13
Hs.13776 | 127047 | R07880 | ADP-ribosyltransferase 4


NA | 66315 | NA | NA
NA | 66492 | NA | NA
Hs.117077 | 586811 | AA130717 | zinc finger protein 264
Hs.267887 | 33005 | R44793 | adenylyl cyclase-associated protein 2
Hs.184771 | 265874 | N20996 | nuclear factor I/C (CCAAT-binding transcription factor)
Hs.12653 | 66810 | T64945 | ESTs
NA | 66814 | T64947 | NA
Hs.12107 | 267725 | N25578 | putative breast adenocarcinoma marker (32 kD)
Hs.250666 | 130900 | R22228 | hairy (Drosophila)-homolog
Hs.183551 | 66986 | T69703 | EST
Hs.8110 | 593119 | AA160915 | adducin 3 (gamma)
Hs.10684 | 48653 | H14599 | Homo sapiens clone 24421 mRNA sequence
Hs.13880 | 35391 | R45220 | CGI-143 protein
Hs.82911 | 49507 | H15572 | protein tyrosine phosphatase type IVA, member 2
Hs.7912 | 35271 | R45583 | neuronal cell adhesion molecule
Hs.5636 | 35530 | R45336 | RAB6, member RAS oncogene family
Hs.23016 | 35630 | R45296 | Human orphan G protein-coupled receptor (RDC1) mRNA, partial cds
Hs.142258 | 613363 | AA180403 | signal transducer and activator of transcription 3 (acute-phase response factor)
Hs.41639 | 273845 | N37094 | programmed cell death 2
Hs.74583 | 50055 | H17451 | KIAA0275 gene product
Hs.271363 | 108322 | T70583 | ESTs
Hs.174185 | 108458 | T80109 | phosphodiesterase I/nucleotide pyrophosphatase 2 (autotaxin)
Hs.280666 | 37049 | R49483 | Homo sapiens chromosome 19, cosmid R32184
NA | 108333 | T70597 | NA
Hs.192245 | 108461 | T80112 | ESTs










[0036]

TABLE 4
Group 4 Genes

Unigene Cluster | IMAGE ID | Genbank Accession no. | Putative gene name
Hs.172028 | 212359 | H69859 | a disintegrin and metalloprotease domain 10
Hs.173310 | 49387 | H15537 | protocadherin gamma subfamily C, 3
Hs.83169 | 325050 | W49497 | matrix metalloproteinase 1 (interstitial collagenase)
Hs.34073 | 165834 | R86708 | BH-protocadherin (brain-heart)
Hs.118638 | 323236 | W42722 | non-metastatic cells 1, protein (NM23A) expressed in
Hs.154057 | 154770 | R55625 | matrix metalloproteinase 19
Hs.275243 | 526111 | AA076242 | S100 calcium-binding protein A6 (calcyclin)
Hs.275163 | 325115 | W47002 | non-metastatic cells 2, protein (NM23B) expressed in
Hs.58324 | 345484 | W72552 | a disintegrin-like and metalloprotease (reprolysin type) with thrombospondin type 1 motif, 5 (aggrecanase-2)
Hs.118512 | 469969 | AA029934 | integrin, alpha V (vitronectin receptor, alpha polypeptide, antigen CD51)
Hs.2442 | 529120 | AA065135 | a disintegrin and metalloproteinase domain 9 (meltrin gamma)
Hs.2399 | 270505 | N33214 | matrix metalloproteinase 14 (membrane-inserted)
Hs.118638 | 81478 | T63504 | non-metastatic cells 1, protein (NM23A) expressed in
Hs.275243 | 326169 | W52162 | S100 calcium-binding protein A6 (calcyclin)
Hs.155392 | 43852 | H04819 | collapsin response mediator protein 1











DETAILED DESCRIPTION

[0037] Metastasis-Associated Genes


[0038] A model system containing model cell lines has been developed from a human lung adenocarcinoma cell line. The model cell lines, such as CL1-0 and its sublines (e.g., CL1-1 and CL1-5), which are clonally related, have different invasion capabilities both in vitro and in vivo. To define genetic determinants of tumor metastasis, a cDNA microarray containing 9600 putative genes has been used to compare gene expression between the clonally related high metastatic and low metastatic tumors. Hundreds of genes have been identified that are differentially expressed in those model cell lines. Some of these genes, i.e., Group I, show strong correlation, either positively or negatively, between their expression levels and the invasiveness of cell lines. These findings illustrate that those model cell lines with varying invasive capabilities, together with a cDNA microarray technique, can be a good model system for identifying invasion- or metastasis-associated genes.


[0039] Self-organizing maps (SOMs; see Tamayo et al. (1999) Proc. Natl. Acad. Sci. USA 96: 2907), one of the widely used clustering methods, can organize expression profiles into clusters of patterns. This characteristic is useful for identifying metastasis-associated genes. By using SOMs, 8,525 genes were analyzed and their expression profiles grouped into 100 clusters. Four of the clusters contained genes whose expression correlated positively with the invasiveness of tumor cell lines, while another four clusters correlated negatively with invasiveness. The genes thus identified can be further confirmed by Northern blotting and flow cytometric analysis. The sequences of these genes can be verified by re-sequencing.
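The clustering step can be illustrated with a minimal sketch. It is not the implementation used for the analysis described above: it assumes a small one-dimensional map rather than 100 clusters, and the function names (`train_som`, `best_node`, `assign_clusters`), the learning-rate schedule, and the neighborhood width are all hypothetical choices made for illustration.

```python
import math
import random


def best_node(nodes, profile):
    """Index of the map node closest (squared Euclidean) to a profile."""
    return min(
        range(len(nodes)),
        key=lambda j: sum((w - x) ** 2 for w, x in zip(nodes[j], profile)),
    )


def train_som(profiles, n_nodes=4, epochs=200, lr0=0.5, seed=0):
    """Train a 1-D self-organizing map and return the node weight vectors.

    Assumed schedule: learning rate and neighborhood width shrink linearly.
    """
    rng = random.Random(seed)
    dim = len(profiles[0])
    # Start each node on a randomly chosen expression profile.
    nodes = [list(rng.choice(profiles)) for _ in range(n_nodes)]
    sigma0 = n_nodes / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)
        sigma = max(sigma0 * (1.0 - frac), 0.5)
        x = rng.choice(profiles)
        winner = best_node(nodes, x)
        for j, w in enumerate(nodes):
            # Nodes near the winner on the map are pulled toward the profile.
            h = math.exp(-((j - winner) ** 2) / (2.0 * sigma ** 2))
            for k in range(dim):
                w[k] += lr * h * (x[k] - w[k])
    return nodes


def assign_clusters(profiles, nodes):
    """Cluster label (node index) for every profile."""
    return [best_node(nodes, p) for p in profiles]
```

Each gene's expression profile across the cell lines would be one input vector; genes assigned to the same node form one cluster of similar patterns.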


[0040] The cDNA array has identified three groups of genes as being associated with tumor metastasis. Group I genes have known cellular functions that include, but are not limited to, proteases and adhesion molecules, cell cycle regulators, signal transduction molecules, cytoskeleton and motility proteins, urokinase-type plasminogen activators, and angiogenesis-related molecules. Microarray analysis also suggests that a high expression level of tumor-associated antigen L6 is closely correlated with tumor metastasis. Group II genes are anonymous and are strongly correlated, either positively or negatively, with invasiveness. Group III genes have known or unknown cellular functions and are moderately correlated, either positively or negatively, with invasiveness.


[0041] Arrays


[0042] Arrays are useful molecular tools for characterizing a sample by multiple criteria. For example, an array having capture probes for one or more nucleic acids of Group I, II, III, or IV can be used to assess a metastasis state of a cell. Arrays can have many addresses on a substrate. The featured arrays can be configured in a variety of formats, non-limiting examples of which are described below.


[0043] A substrate can be opaque, translucent, or transparent. The addresses can be distributed on the substrate in one dimension, e.g., a linear array; in two dimensions, e.g., a planar array; or in three dimensions, e.g., a three-dimensional array. The solid substrate may be of any convenient shape or form, e.g., square, rectangular, ovoid, or circular. Non-limiting examples of two-dimensional array substrates include glass slides, quartz (e.g., UV-transparent quartz glass), single crystal silicon, wafers (e.g., silica or plastic), mass spectroscopy plates, metal-coated substrates (e.g., gold), membranes (e.g., nylon and nitrocellulose), plastics and polymers (e.g., polystyrene, polypropylene, polyvinylidene difluoride, polytetrafluoroethylene, polycarbonate, PDMS, nylon, acrylic, and the like). Three-dimensional array substrates include porous matrices, e.g., gels or matrices. Potentially useful porous substrates include: agarose gels, acrylamide gels, sintered glass, dextran, meshed polymers (e.g., macroporous crosslinked dextran, sephacryl, and sepharose), and so forth.


[0044] An array can have a density of at least 10, 50, 100, 200, 500, 1000, 2000, 10⁴, 10⁵, 10⁶, 10⁷, 10⁸, or 10⁹ or more addresses per cm², and ranges between. In some embodiments, the plurality of addresses includes at least 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. In some other embodiments, the plurality of addresses includes less than 9, 99, 499, 999, 4,999, 9,999, or 49,999 addresses. Addresses in addition to the addresses of the plurality can be disposed on the array. The center-to-center distance can be 5 mm, 1 mm, 100 µm, 10 µm, 1 µm or less. The longest diameter of each address can be 5 mm, 1 mm, 100 µm, 10 µm, 1 µm or less. Each address can contain 0.1 µg, 1 µg, 100 ng, 10 ng, 1 ng, 100 pg, 10 pg, 1 pg, 0.1 pg or less of a capture agent, i.e., the capture probe. For example, each address can contain 100, 10³, 10⁴, 10⁵, 10⁶, 10⁷, 10⁸, or 10⁹ or more molecules of the nucleic acid.
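The relation between center-to-center spacing and address density can be checked with a short calculation. The sketch below assumes addresses laid out on a regular square grid, which the text does not require; the function name `addresses_per_cm2` is hypothetical.

```python
def addresses_per_cm2(pitch_um):
    """Address density for a square grid with the given center-to-center
    distance in micrometers (assumption: one address per grid cell)."""
    per_cm = 10_000.0 / pitch_um  # 1 cm = 10,000 micrometers
    return per_cm ** 2


# Spacings listed in the text: 5 mm, 1 mm, 100 um, 10 um, 1 um.
for pitch in (5000, 1000, 100, 10, 1):
    print(f"{pitch} um spacing: {addresses_per_cm2(pitch):g} addresses per cm^2")
```

Under this assumption, a 100 µm spacing corresponds to the 10⁴ addresses per cm² figure and a 1 µm spacing to 10⁸ addresses per cm².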


[0045] A nucleic acid array can be fabricated by a variety of methods, e.g., photolithographic methods (see, e.g., U.S. Pat. Nos. 5,143,854; 5,510,270; and 5,527,681), mechanical methods (e.g., directed-flow methods as described in U.S. Pat. No. 5,384,261), pin-based methods (e.g., as described in U.S. Pat. No. 5,288,514), and bead-based techniques (e.g., as described in PCT US/93/04145). A capture probe can be a single-stranded nucleic acid, a double-stranded nucleic acid (e.g., which is denatured prior to or during hybridization), or a nucleic acid having a single-stranded region and a double-stranded region. The capture probe can be selected by a variety of criteria, and can be designed by a computer program with optimization parameters. The capture probe can be selected to hybridize to a sequence-rich (e.g., non-homopolymeric) region of a nucleic acid. The Tm of the capture probe can be optimized by prudent selection of the complementarity region and length. Ideally, the Tm of all capture probes on the array is similar, e.g., within 20, 10, 5, 3, or 2° C. of one another. A database scan of available sequence information for a species can be used to determine potential cross-hybridization and specificity problems.
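As a rough illustration of computer-assisted probe selection, the sketch below estimates Tm with the Wallace rule (2° C. per A or T, 4° C. per G or C), a common approximation for short oligonucleotides that is assumed here and is not named in the text. The helper names and the example probe sequences are hypothetical.

```python
def tm_wallace(seq):
    """Approximate melting temperature in degrees C by the Wallace rule."""
    s = seq.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    if at + gc != len(s):
        raise ValueError("sequence must contain only A, C, G, T")
    return 2 * at + 4 * gc


def tm_spread(probes):
    """Difference between the highest and lowest estimated Tm in a probe set."""
    tms = [tm_wallace(p) for p in probes]
    return max(tms) - min(tms)


def longest_run(seq):
    """Length of the longest single-base run in a non-empty sequence,
    a simple screen for homopolymeric regions."""
    best = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best


# Hypothetical 16-mer probes; both give the same Wallace estimate.
probes = ["ATGCATGCATGCATGC", "ATATATATGCGCGCGC"]
print(tm_spread(probes) <= 5)  # True when the set lies within a 5 degree window
```

A production design tool would normally use a nearest-neighbor thermodynamic model and a database search for cross-hybridization; this sketch only shows how a Tm window and a homopolymer screen could be expressed.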


[0046] A nucleic acid array can be used to hybridize a nucleic acid that is obtained as follows. RNA can be isolated by routine methods, e.g., including DNase treatment to remove genomic DNA and hybridization to an oligo-dT coupled to a solid substrate (e.g., as described in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y.). The solid substrate is washed and the RNA is eluted. The RNA can be reverse transcribed and, optionally, amplified. The amplified nucleic acid can be labeled during amplification, e.g., by the incorporation of a labeled nucleotide. Examples of labels include fluorescent labels, e.g., red-fluorescent dye Cy5 (Amersham) or green-fluorescent dye Cy3 (Amersham), chemiluminescent labels, e.g., as described in U.S. Pat. No. 4,277,437, and colorimetric detection, as described in Examples. Alternatively, the amplified nucleic acid can be labeled with biotin and detected after hybridization with labeled streptavidin, e.g., streptavidin-phycoerythrin (Molecular Probes). The labeled nucleic acid can be contacted to the array. In addition, a control nucleic acid or a reference nucleic acid can be contacted to the same array. The control nucleic acid or reference nucleic acid can be labeled with a label other than that of the sample nucleic acid, e.g., one with a different emission maximum. Labeled nucleic acids can be contacted to an array under hybridization conditions. The array can be washed, and then imaged to detect, e.g., color development or fluorescence, at each address of the array.


[0047] A polypeptide array can be used to determine the expression level of a polypeptide encoded by a nucleic acid selected from Group I, II, III, or IV. The polypeptide array can have antibody capture probes for each of the polypeptides.


[0048] A low-density (96 well format) polypeptide array has been developed in which polypeptides are spotted onto a nitrocellulose membrane (e.g., Ge, H. (2000) Nucleic Acids Res. 28: e3, I-VII). A high-density polypeptide array (100,000 samples within 222×222 mm) used for antibody screening was formed by spotting proteins onto polyvinylidene difluoride (PVDF) (e.g., Lueking et al. (1999) Anal. Biochem. 270: 103-111). Polypeptides can be printed on a flat glass plate that contains wells formed by an enclosing hydrophobic Teflon mask (e.g., Mendoza et al. (1999) Biotechniques 27: 778-788). Also, polypeptides can be covalently linked to chemically derivatized flat glass slides in a high-density array (1600 spots per square centimeter) (MacBeath, G., and Schreiber, S. L. (2000) Science 289: 1760-1763). De Wildt et al. describe a high-density array of 18,342 bacterial clones, each expressing a different single-chain antibody, in order to screen antibody-antigen interactions (De Wildt et al. (2000) Nature Biotech. 18: 989-994). These art-known methods and others can be used to generate an array of antibodies for detecting the abundance of polypeptides in a sample. The sample can be labeled, e.g., biotinylated, for subsequent detection with streptavidin coupled to a fluorescent label. The array can then be scanned to measure binding at each address.


[0049] The nucleic acid and polypeptide arrays of the invention can be used in a wide variety of applications. For example, the arrays can be used to analyze a patient sample. The sample is compared to data obtained previously, e.g., known clinical specimens or other patient samples. Further, the arrays can be used to characterize a cell culture sample, e.g., to determine a cellular state after varying a parameter, e.g., exposing the cell culture to an antigen, a transgene, or a test compound.


[0050] Evaluating Expression


[0051] The level of expression of at least one nucleic acid selected from Group I, II, III, or IV can be measured in a number of ways, including, but not limited to: measuring the abundance of an mRNA encoded by a nucleic acid selected from Group I, II, III, or IV; measuring the amount of a polypeptide encoded by such a nucleic acid; or measuring an activity of a polypeptide encoded by such a nucleic acid.


[0052] The level of mRNA corresponding to a nucleic acid selected from Group I, II, III, or IV in a cell can be determined by the following formats. The isolated mRNA can be used in hybridization or amplification assays that include, but are not limited to, Northern analyses, polymerase chain reaction analyses, and probe arrays. One method for the detection of mRNA levels involves contacting the isolated mRNA with a nucleic acid probe that can hybridize to the mRNA encoded by the nucleic acid being detected. The nucleic acid probe can be, for example, a full-length nucleic acid complementary to a nucleic acid selected from Group I, II, III, or IV, or a portion thereof, such as an oligonucleotide of at least 7, 15, 30, 50, or 100 nucleotides in length and sufficient to specifically hybridize under stringent conditions to the mRNA. In one format, mRNA is immobilized on a surface and contacted with the probes, for example by running the isolated mRNA on an agarose gel and transferring the mRNA from the gel to a membrane, such as nitrocellulose. In an alternative format, the probes are immobilized on a surface and the mRNA is contacted with the probes, for example, in a two-dimensional array.


[0053] The level of mRNA in a sample that is encoded by a nucleic acid selected from Group I, II, III, or IV can be evaluated using nucleic acid amplification, e.g., by rtPCR (Mullis (1987) U.S. Pat. No. 4,683,202), ligase chain reaction (Barany (1991) Proc. Natl. Acad. Sci. USA 88: 189-193), self-sustained sequence replication (Guatelli et al. (1990) Proc. Natl. Acad. Sci. USA 87: 1874-1878), transcriptional amplification system (Kwoh et al. (1989) Proc. Natl. Acad. Sci. USA 86: 1173-1177), Q-Beta Replicase (Lizardi et al. (1988) Bio/Technology 6: 1197), rolling circle replication (Lizardi et al., U.S. Pat. No. 5,854,033) or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques known in the art. As used herein, amplification primers are defined as being a pair of nucleic acid molecules that can anneal to 5′ or 3′ regions of a gene (plus and minus strands, respectively, or vice-versa) and contain a short region in between. In general, amplification primers are from about 10 to 30 nucleotides in length and flank a region from about 50 to 200 nucleotides in length. Under appropriate conditions and with appropriate reagents, such primers permit the amplification of a nucleic acid molecule comprising the nucleotide sequence flanked by the primers. For an in situ format, a cell or tissue sample can be prepared/processed and immobilized on a support, typically a glass slide, and then contacted with a probe that can hybridize to mRNA that encodes the nucleic acid being analyzed.
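The primer definition above can be expressed as a simple check. The sketch below assumes exact, ungapped matching of both primers to a single-stranded template written 5′ to 3′, which is a simplification; the function names and the toy sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def flanked_length(template, forward, reverse):
    """Number of template bases lying between the two primer sites,
    or None when either primer has no exact match."""
    template = template.upper()
    start = template.find(forward.upper())
    if start < 0:
        return None
    inner_start = start + len(forward)
    # The reverse primer anneals to the opposite strand, so its site on the
    # plus strand reads as the primer's reverse complement.
    site = template.find(reverse_complement(reverse), inner_start)
    if site < 0:
        return None
    return site - inner_start


def primers_in_range(template, forward, reverse):
    """True when both primers are 10 to 30 nt long and flank 50 to 200 nt."""
    length = flanked_length(template, forward, reverse)
    return (
        length is not None
        and 50 <= length <= 200
        and all(10 <= len(p) <= 30 for p in (forward, reverse))
    )
```

For a toy template built as a 10-nt forward site, 60 intervening bases, and a 10-nt reverse site, `flanked_length` reports 60 and `primers_in_range` accepts the pair.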


[0054] A variety of methods can be used to determine the abundance of a polypeptide encoded by a nucleic acid selected from Group I, II, III, or IV. In general, these methods include contacting an agent that selectively binds to the polypeptide, such as an antibody, with a sample, to evaluate the level of the polypeptide in the sample. In some embodiments, the antibody bears a detectable label or is recognizable by a labeling agent. Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab′)2) can be used. The term “labeled,” with regard to the probe or antibody, is intended to encompass direct labeling of the probe or the antibody by coupling (i.e., physically linking) a detectable substance to the probe or the antibody, as well as indirect labeling of the probe or antibody by reactivity with a detectable substance. Examples of detectable substances are provided herein.


[0055] The detection methods can be used to detect a polypeptide in a sample in vitro as well as in vivo. In vitro techniques for detection of the protein include enzyme linked immunosorbent assays (ELISAs), immunoprecipitations, immunofluorescence, enzyme immunoassay (EIA), radioimmunoassay (RIA), and Western blot analysis. In vivo techniques for detection of the protein include introducing into a subject a labeled antibody. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques. In another example, the sample can be labeled, e.g., biotinylated, and then contacted to the antibody, e.g., an antibody positioned on an antibody array (as described above). The sample can be detected, e.g., with avidin coupled to a fluorescent label.


[0056] Expression Profiles


[0057] A general scheme for producing and evaluating profiles is as follows. Nucleic acid is prepared from a sample, e.g., a sample of interest, and hybridized to an array, e.g., with multiple addresses. Hybridization of the nucleic acid to the array is detected. The extent of hybridization at an address is represented by a numerical value and stored, e.g., in a vector, a one-dimensional matrix, or one-dimensional array. The vector x = {x_a, x_b, ...} has a value for each address of the array. For example, a numerical value for the extent of hybridization at a first address is stored in the variable x_a. The numerical value can be adjusted, e.g., for local background levels, sample amount, and other variations. Nucleic acid is also prepared from a reference sample and hybridized to an array (e.g., the same or a different array), e.g., with multiple addresses. The vector y is constructed identically to vector x. The sample expression profile and the reference profile can be compared, e.g., using a mathematical equation that is a function of the two vectors. The comparison can be evaluated as a scalar value, e.g., a score representing similarity of the two profiles. Either or both vectors can be transformed by a matrix in order to add weighting values to different nucleic acids detected by the array.
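The scheme above can be sketched in a few lines. The example assumes a weighted cosine similarity as the scalar comparison and a diagonal weighting matrix (one weight per address); neither choice is prescribed by the text, and the function names and intensity values are hypothetical.

```python
import math


def adjust(raw, background, scale=1.0):
    """Subtract local background at each address and rescale for sample amount."""
    return [max(r - b, 0.0) * scale for r, b in zip(raw, background)]


def weighted_score(x, y, weights):
    """Weighted cosine similarity between sample vector x and reference vector y.

    Using one weight per address is equivalent to transforming both vectors
    by a diagonal weighting matrix (an assumption made for this sketch).
    """
    num = sum(w * a * b for w, a, b in zip(weights, x, y))
    norm_x = math.sqrt(sum(w * a * a for w, a in zip(weights, x)))
    norm_y = math.sqrt(sum(w * b * b for w, b in zip(weights, y)))
    return num / (norm_x * norm_y)


# Hypothetical raw intensities at three addresses (x_a, x_b, x_c).
x = adjust([110, 250, 90], [10, 50, 40])
y = adjust([105, 240, 95], [5, 40, 45])
print(weighted_score(x, y, [1.0, 1.0, 1.0]))
```

A score near 1 indicates that the sample and reference profiles have a similar shape; raising the weight of selected addresses makes those nucleic acids count more in the comparison.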


[0058] The expression data can be stored in a database, e.g., a relational database such as a SQL database (e.g., Oracle or Sybase database environments). The database can have multiple tables. For example, raw expression data can be stored in one table, wherein each column corresponds to a nucleic acid being assayed, e.g., an address or an array, and each row corresponds to a sample. A separate table can store identifiers and sample information, e.g., the batch number of the array used, date, and other quality control information.
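A minimal sketch of such a two-table layout is shown below, using SQLite from the Python standard library in place of the Oracle or Sybase environments mentioned above. The table names, column names, and all stored values are hypothetical and chosen only for illustration.

```python
import sqlite3


def build_expression_db():
    """Create an in-memory database with a sample table and a raw-data table."""
    con = sqlite3.connect(":memory:")
    # Identifiers and quality-control information, one row per sample.
    con.execute(
        "CREATE TABLE sample_info ("
        " sample_id TEXT PRIMARY KEY,"
        " array_batch TEXT,"
        " run_date TEXT)"
    )
    # Raw expression data: one row per sample, one column per array address.
    con.execute(
        "CREATE TABLE raw_expression ("
        " sample_id TEXT REFERENCES sample_info(sample_id),"
        " addr_001 REAL,"
        " addr_002 REAL,"
        " addr_003 REAL)"
    )
    con.executemany(
        "INSERT INTO sample_info VALUES (?, ?, ?)",
        [("CL1-0", "batch-A", "2002-01-01"), ("CL1-5", "batch-A", "2002-01-01")],
    )
    con.executemany(
        "INSERT INTO raw_expression VALUES (?, ?, ?, ?)",
        [("CL1-0", 120.0, 80.0, 15.0), ("CL1-5", 30.0, 310.0, 14.0)],
    )
    con.commit()
    return con


def profile_with_batch(con, sample_id):
    """Return the raw profile of a sample together with its array batch."""
    return con.execute(
        "SELECT r.addr_001, r.addr_002, r.addr_003, s.array_batch"
        " FROM raw_expression AS r"
        " JOIN sample_info AS s ON r.sample_id = s.sample_id"
        " WHERE r.sample_id = ?",
        (sample_id,),
    ).fetchone()
```

Keeping quality-control fields in a separate table means a batch can be flagged or excluded without touching the expression values themselves.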


[0059] Nucleic acids that are similarly regulated can be identified by clustering expression data to identify coregulated nucleic acids. Nucleic acids can be clustered using hierarchical clustering (see, e.g., Sokal and Michener (1958) Univ. Kans. Sci. Bull. 38: 1409), Bayesian clustering, k-means clustering, and self-organizing maps (see, Tamayo et al. (1999) Proc. Natl. Acad. Sci. USA 96: 2907).
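As one illustration of hierarchical clustering, the sketch below merges expression profiles bottom-up using average linkage and Euclidean distance. Both choices are assumptions made for the example, as are the function name and the toy profiles; it requires Python 3.8 or later for `math.dist`.

```python
import math


def average_linkage(profiles, n_clusters):
    """Agglomerative clustering: repeatedly merge the two clusters whose
    members have the smallest mean pairwise Euclidean distance."""
    clusters = [[i] for i in range(len(profiles))]

    def mean_distance(a, b):
        total = sum(math.dist(profiles[i], profiles[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters (ties resolved by index order).
        _, p, q = min(
            (mean_distance(clusters[p], clusters[q]), p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return [sorted(c) for c in clusters]


# Hypothetical profiles: two rising and two falling across four cell lines.
profiles = [
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.1, 2.9, 4.2],
    [4.0, 3.0, 2.0, 1.0],
    [4.2, 2.9, 2.1, 1.0],
]
print(average_linkage(profiles, 2))
```

This brute-force version recomputes all pairwise distances at every merge, which is adequate for a demonstration but not for thousands of genes.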


[0060] Expression profiles obtained from nucleic acid expression analysis on an array can be used to compare samples and/or cells in a variety of states as described in Golub et al. ((1999) Science 286: 531). For example, multiple expression profiles from different conditions and including replicates or like samples from similar conditions are compared to identify nucleic acids whose expression level is predictive of the sample and/or condition. Each candidate nucleic acid can be given a weighted “voting” factor dependent on the degree of correlation of the nucleic acid's expression and the sample identity. A correlation can be measured using a Euclidean distance or a correlation coefficient, e.g., the Pearson correlation coefficient.


[0061] The similarity of a sample expression profile to a predictor expression profile (e.g., a reference expression profile that has associated weighting factors for each nucleic acid) can then be determined, e.g., by comparing the log of the expression level of the sample to the log of the predictor or reference expression value and adjusting the comparison by the weighting factor for all nucleic acids of predictive value in the profile.
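The weighting and comparison described in the two preceding paragraphs can be sketched as follows. The example assumes that each nucleic acid's weight is its Pearson correlation with a 0/1 class label across training samples, and that the score is the weighted sum of log2 ratios against a reference profile. These are illustrative assumptions, not the specific scheme of Golub et al., and all names and expression values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def voting_weights(training, labels):
    """One weight per nucleic acid: correlation of its expression with the
    class label (e.g., 0 = low invasiveness, 1 = high invasiveness)."""
    n_genes = len(training[0])
    return [pearson([s[g] for s in training], labels) for g in range(n_genes)]


def weighted_log_score(sample, reference, weights):
    """Sum over nucleic acids of weight times log2(sample / reference)."""
    return sum(
        w * (math.log2(s) - math.log2(r))
        for s, r, w in zip(sample, reference, weights)
    )


# Hypothetical training profiles (two nucleic acids, four samples).
training = [[10.0, 100.0], [12.0, 90.0], [100.0, 10.0], [90.0, 12.0]]
labels = [0, 0, 1, 1]
weights = voting_weights(training, labels)
print(weighted_log_score([80.0, 15.0], [50.0, 50.0], weights))
```

With these toy values a positive score means the sample resembles the class labeled 1 and a negative score the class labeled 0; the reference values of 50 are an arbitrary midpoint chosen for the example.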


[0062] Transactional Methods for Evaluating a Sample


[0063] A transactional method for evaluating a sample can be performed as follows. A patient is treated by a physician. The physician obtains a sample (i.e., “patient sample”), e.g., a blood sample, from the patient. The patient sample can be delivered to a diagnostics department which can collate information about the patient, the patient sample, and results of the evaluation. A courier service can deliver the sample to a diagnostic service. Location of the sample is monitored by a courier computer system, and can be tracked by accessing the courier computer system, e.g., using a web page across the Internet. At the diagnostic service, the sample is processed to produce a sample expression profile. For example, nucleic acid is extracted from the sample, optionally amplified, and contacted to a nucleic acid microarray. Binding of the nucleic acid to the microarray is quantitated by a detector that streams data to the array diagnostic server. The array diagnostic server processes the microarray data, e.g., to correct for background, sample loading, and microarray quality. It can also compare the raw or processed data to a reference expression profile, e.g., to produce a difference profile. The raw profiles, processed profiles and/or difference profiles are stored in a database server. A network server manages the results and information flow. In one embodiment, the network server encrypts and compresses the results for electronic delivery to the healthcare provider's internal network. The results can be sent across a computer network, e.g., the Internet, or a proprietary connection. For data security, the diagnostic systems and the healthcare provider systems can be located behind firewalls. In another embodiment, an indication that the results are available can also be sent to the healthcare provider and/or the patient, for example, to an email client. The healthcare provider, e.g., the physician, can access the results, e.g., using the secure Hypertext Transfer Protocol (HTTPS) (e.g., with secure sockets layer (SSL) encryption). The results can be provided by the network server as a web page (e.g., in HTML, XML, and the like) for viewing on the physician's browser.
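The server-side processing and packaging steps can be sketched as below. The example covers only background correction, a difference profile, and compression; encryption is deliberately left out because it would rely on a transport layer or a dedicated library not shown here. The function names, the JSON layout, and the sample identifier are hypothetical.

```python
import json
import zlib


def difference_profile(raw, background, reference):
    """Background-correct raw intensities, then subtract the reference profile."""
    corrected = [max(r - b, 0.0) for r, b in zip(raw, background)]
    return [c - ref for c, ref in zip(corrected, reference)]


def pack_results(sample_id, profile):
    """Serialize results to JSON and compress them for electronic delivery."""
    payload = json.dumps(
        {"sample_id": sample_id, "difference_profile": profile}
    ).encode("utf-8")
    return zlib.compress(payload)


def unpack_results(blob):
    """Inverse of pack_results, as run on the healthcare provider's side."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


diff = difference_profile([120.0, 80.0], [20.0, 30.0], [90.0, 60.0])
blob = pack_results("P-001", diff)
print(unpack_results(blob))
```

In a deployed system the compressed payload would travel over an encrypted connection, and the raw, processed, and difference profiles would also be written to the database server.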


[0064] Further communication between the physician and the diagnostic service can result in additional tests, e.g., a second expression profile can be obtained for the sample, e.g., using the same or a different microarray.


[0065] Screening a Test Compound


[0066] The invention provides a method for screening a test compound useful in the prevention or treatment of tumor metastasis. A “test compound” can be any chemical compound, for example, a macromolecule (e.g., a polypeptide, a protein complex, or a nucleic acid) or a small molecule (e.g., an amino acid, a nucleotide, an organic or inorganic compound). The test compound can have a formula weight of less than about 10,000 grams per mole, less than 5,000 grams per mole, less than 1,000 grams per mole, or less than about 500 grams per mole. The test compound can be naturally occurring (e.g., an herb or a natural product), synthetic, or both. Examples of macromolecules are proteins, protein complexes, and glycoproteins, nucleic acids, e.g., DNA, RNA and PNA (peptide nucleic acid). Examples of small molecules are peptides, peptidomimetics (e.g., peptoids), amino acids, amino acid analogs, polynucleotides, polynucleotide analogs, nucleotides, nucleotide analogs, organic or inorganic compounds, e.g., heteroorganic or organometallic compounds. A test compound can be the only substance assayed by the method described herein. Alternatively, a collection of test compounds can be assayed either consecutively or concurrently by the methods described herein. Exemplary test compounds can be obtained from a combinatorial chemical library including peptide libraries (see, e.g., U.S. Pat. No. 5,010,175, Furka, Int. J Pept. Prot. Res. 37:487-493 (1991) and Houghton et al., Nature 354:84-88 (1991)), peptoids (e.g., PCT Publication No. WO 91/19735), encoded peptides (e.g., PCT Publication No. WO 93/20242), random bio-oligomers (e.g., PCT Publication No. WO 92/00091), benzodiazepines (e.g., U.S. Pat. No. 5,288,514), diversomers such as hydantoins, benzodiazepines and dipeptides (Hobbs et al., Proc. Nat. Acad. Sci. USA 90:6909-6913 (1993)), vinylogous polypeptides (Hagihara et al., J Amer. Chem. Soc. 114:6568 (1992)), nonpeptidal peptidomimetics with glucose scaffolding (Hirschmann et al., J Amer. Chem. Soc. 114:9217-9218 (1992)), analogous organic syntheses of small compound libraries (Chen et al., J Amer. Chem. Soc. 116:2661 (1994)), oligocarbamates (Cho et al., Science 261:1303 (1993)), and/or peptidyl phosphonates (Campbell et al., J Org. Chem. 59:658 (1994)), nucleic acid libraries (see Ausubel, Berger and Sambrook, all supra), peptide nucleic acid libraries (see, e.g., U.S. Pat. No. 5,539,083), antibody libraries (see, e.g., Vaughn et al., Nature Biotechnology, 14(3):309-314 (1996) and PCT/US96/10287), carbohydrate libraries (see, e.g., Liang et al., Science, 274:1520-1522 (1996) and U.S. Pat. No. 5,593,853), small organic molecule libraries (see, e.g., benzodiazepines, Baum C&EN, Jan 18, page 33 (1993); isoprenoids, U.S. Pat. No. 5,569,588; thiazolidinones and metathiazanones, U.S. Pat. No. 5,549,974; pyrrolidines, U.S. Pat. Nos. 5,525,735 and 5,519,134; morpholino compounds, U.S. Pat. No. 5,506,337; benzodiazepines, U.S. Pat. No. 5,288,514, and the like).


[0067] The test compound or compounds can be screened individually or in parallel. A compound can be screened by monitoring the level of expression of one or more nucleic acids selected from Group I, II, III, or IV. Comparing a compound-associated expression profile to a reference profile can identify the ability of the compound to modulate metastatic gene expression. The expression profile can be a profile of at least two cell lines which are clonally related. Examples of the cell lines are human lung adenocarcinoma cell lines of different invasive and metastatic capacities, e.g., CL1-0 and its sublines (e.g., CL1-1 and CL1-5). An example of parallel screening is a high-throughput drug screen. A high-throughput method can be used to screen large libraries of chemicals. Such libraries of test compounds can be generated or purchased, e.g., from Chembridge Corp., San Diego, Calif. Libraries can be designed to cover a diverse range of compounds. For example, a library can include 10,000, 50,000, or 100,000 or more unique compounds. Alternatively, prior experimentation and anecdotal evidence can suggest a class or category of compounds of enhanced potential. A library can be designed and synthesized to cover such a class of chemicals. A library can be tested on cell lines, such as CL1-0 and its sublines, and gene expression levels can be monitored. Regardless of the method used for screening, compounds that alter the expression level are considered “candidate” compounds or drugs. Candidate compounds are retested on metastatic cells, or tested on animals. Candidate compounds that are positive in a retest are considered “lead” compounds.
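
The comparison of a compound-associated expression profile with a reference profile described above can be sketched as follows. This is only an illustration under stated assumptions, not the procedure of the disclosure: the gene names, the expression values, the log2 fold-change measure, and the two-fold cutoff are all hypothetical and were chosen for the example.

```python
import math


def log2_fold_changes(treated, reference):
    """Per-gene log2 ratio of treated to reference expression."""
    return {gene: math.log2(treated[gene] / reference[gene]) for gene in reference}


def euclidean_distance(profile_a, profile_b):
    """Euclidean distance between two profiles that share the same genes."""
    return math.sqrt(sum((profile_a[g] - profile_b[g]) ** 2 for g in profile_a))


def is_candidate(treated, reference, log2_cutoff=1.0):
    """Flag a compound when any gene changes by at least 2**log2_cutoff fold.

    The cutoff is a hypothetical choice; the text only says that compounds
    which alter the expression level are considered candidates.
    """
    changes = log2_fold_changes(treated, reference)
    return any(abs(change) >= log2_cutoff for change in changes.values())


# Hypothetical expression values (arbitrary units) for three placeholder genes.
reference = {"gene_a": 8.0, "gene_b": 4.0, "gene_c": 2.0}
compound_x = {"gene_a": 2.0, "gene_b": 4.0, "gene_c": 2.0}  # gene_a drops four-fold
compound_y = {"gene_a": 8.0, "gene_b": 4.4, "gene_c": 2.0}  # only a small change

for name, profile in (("compound_x", compound_x), ("compound_y", compound_y)):
    print(name, is_candidate(profile, reference),
          round(euclidean_distance(profile, reference), 2))
```

In a real screen the cutoff and the distance measure would be chosen from replicate variability rather than fixed in advance.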


[0068] Once a lead compound has been identified, standard principles of medicinal chemistry can be used to produce derivatives of the compound. Derivatives can be screened for improved pharmacological properties, for example, efficacy, pharmacokinetics, stability, solubility, and clearance. The moieties responsible for a compound's activity in the assays described above can be delineated by examination of structure-activity relationships (SAR) as is commonly practiced in the art. A person of ordinary skill in pharmaceutical chemistry could modify moieties on a lead compound and measure the effects of the modification on the efficacy of the compound to thereby produce derivatives with increased potency. For an example, see Nagarajan et al. (1988) J. Antibiot. 41: 1430-8. Furthermore, if the biochemical target of the lead compound is known or determined, the structure of the target and the lead compound can inform the design and optimization of derivatives. Molecular modeling software is commercially available (e.g., Molecular Simulations, Inc.).


[0069] The specific examples below are to be construed as merely illustrative, and not limitative of the remainder of the disclosure in any way whatsoever. Without further elaboration, it is believed that one skilled in the art can, based on the description herein, utilize the present invention to its fullest extent. All publications, including patents, cited herein are hereby incorporated by reference in their entirety.







EXAMPLE

[0070] Materials


[0071] Cell Lines. Human lung adenocarcinoma cell lines of different invasive and metastatic capacities (CL1-0 and its sublines, CL1-1 and CL1-5) were grown in RPMI medium with 10% fetal bovine serum (FBS) at 37° C., 20% O2, and 5% CO2. See Chu et al. (1997) Am. J Respir. Cell Mol. Biol. 17: 353-360.


[0072] In Vitro Invasion Assay. CL1-5 cells were injected into the tail veins of SCID mice to obtain a cell line more invasive than the CL1 series just described. A highly metastatic cell line was isolated and cloned from the cancer lesions formed in the lungs of the mice. After four rounds of in vivo selection, the cell line was designated CL1-5-F4 and incorporated into the panel of cell lines for microarray analysis.


[0073] Invasiveness of the CL1 series of cell lines was examined using a membrane invasion culture system (MICS). In the MICS system, a polycarbonate membrane containing 10 μm pores (Nucleopore Corp., Pleasanton, Calif.) was coated with a mixture of laminin (50 μg/ml; Sigma Chemical Co., St. Louis, Mo.), type IV collagen (50 μg/ml; Sigma), and gelatin (2 mg/ml; Bio-Rad, Hercules, Calif.) in 10 mM glacial acetic acid solution. The membrane was placed between the upper- and lower-well plates of a MICS chamber. Cells of the CL1 series were then re-suspended in RPMI containing 10% NuSerum and seeded into the upper wells of the chamber (5×10^4 cells/well). After incubating for 24 hours at 37° C., cells that had invaded through the coated membrane were removed from the lower wells with 1 mM ethylene diamine tetraacetic acid (EDTA) in phosphate-buffered saline (PBS) and dot-blotted onto a polycarbonate membrane with 3 μm pores. After fixation in methanol, blotted cells were stained with Liu stain (Handsel Technologies, Inc., Taipei, Taiwan) and the cell number in each blot was counted under a microscope. Each experiment was repeated three times.


[0074] Tracheal Graft Invasion Assay. A tracheal graft invasion assay was carried out to confirm that the in vitro selected lung cancer cell lines with different invasive/metastatic potentials also possess invasive ability in vivo. Rat tracheas were isolated from Sprague-Dawley (SD) rats weighing around 200 g. The airway epithelial cells of the tracheas were denuded by three freeze-and-thaw cycles at −70° C. The CL1-0, CL1-1 and CL1-5 cells were cultured to sub-confluence before they were harvested. 10^6 cells from each cell line were then injected into the isolated rat tracheas. The upper and lower ends of the tracheas were tied off with threads and the tracheas were implanted subcutaneously in SCID mice. Each cell line was sealed in three different tracheal grafts and each SCID mouse was implanted with one graft. The SCID mice were sacrificed four weeks later and the tracheal grafts were taken out for histological examination. The tumor part of the tracheal graft was sliced at 1 mm intervals. At least three sections were examined for the presence of basement membrane invasion. All animal experiments were performed in accordance with the animal guidelines at The Department of Animal Care, Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan.


[0075] Biotinylated Probe Preparation and Microarray Hybridization. Five micrograms of the mRNAs derived from each lung cancer cell line were labeled with biotin during reverse transcription. See Chen et al. (1998) Genomics 51: 313-324; and Hong et al. (2000) Am. J Respir. Cell Mol. Biol. 23: 355-363. The microarrays (18 mm by 27 mm) carrying 9,600 PCR-amplified cDNA fragments were prepared on nylon membranes by an arraying machine built in-house. The 9,600 non-redundant expressed sequence tag (EST) clones were IMAGE human cDNA clones, each representing a putative gene cluster with an assigned gene name in the Unigene clustering (Schuler (1997) J Mol. Med. 75: 694-698). All hybridization experiments were performed in triplicate. The details of probe preparation, hybridization, and color development were also described previously.


[0076] The microarray images were digitized by using a drum scanner (ScanView, Foster City, Calif.). Image analysis and spot quantification were done by the MuCDA program written in-house. The program is available via anonymous ftp from Academia Sinica. The microarray images can also be processed by commercial image processing programs and other available microarray image analysis programs.


[0077] Northern hybridization. To confirm the results of gene detection by the cDNA microarray, sixteen differentially expressed cDNA clones, including ten clones with an ascending trend and six clones with a descending trend in metastasis, were selected from the cluster analysis of the array data, and the entire inserts of the clones were individually PCR-amplified to serve as probes for Northern hybridization. The hybridization and washing procedures were carried out according to standard protocols, as described in our previous report (Hong et al. (2000) Am. J Respir. Cell Mol. Biol. 23: 355-363).


[0078] Flow cytometric assay. The adenocarcinoma cell sublines, CL1-0, CL1-1, CL1-5 and CL1-5-F4, were subjected to indirect immunofluorescence staining for the expression of surface tumor-associated antigen L6, integrin α-3 and integrin α-6 using murine mAbs against human tumor-associated antigen L6 (ATCC, Manassas, Va.), integrin α-3 (Chemicon, Temecula, Calif.) and integrin α-6 (BQ16; Acell Co., Bayport, Minn.), respectively. The fluorescence intensity was analyzed by FACStar (Becton-Dickinson, Mountain View, Calif.).


[0079] Statistical Analysis. A cluster analysis was performed on the microarray data to identify invasion-associated genes. Gene expression data obtained from the microarray experiments were processed and normalized using the protocol and program described by Iyer, V. R. et al. (1999) Science 283: 83-87. Genes were clustered into groups on the basis of their expression profiles by the self-organizing map (SOM) algorithm as described by Tamayo P. et al. (1999) Proc. Natl. Acad. Sci. USA 96: 2907-2912. After cluster analysis by the SOM method, genes whose expression profiles correlate either positively or negatively with the invasiveness of the cell lines were identified.
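
The final selection step, picking out profiles that rise or fall with invasiveness, can be illustrated with a simple Pearson correlation against the invasiveness rank of the four cell lines. This sketch is a stand-in chosen for brevity and is not the SOM clustering cited above; the gene names, the expression values, and the 0.9 cutoff are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


def classify(profile, invasiveness, cutoff=0.9):
    """Label a profile by the sign of its correlation with invasiveness."""
    r = pearson(profile, invasiveness)
    if r >= cutoff:
        return "positive"
    if r <= -cutoff:
        return "negative"
    return "uncorrelated"


# Rank order of invasiveness: CL1-0 < CL1-1 < CL1-5 < CL1-5-F4.
invasiveness_rank = [1, 2, 3, 4]

# Hypothetical normalized expression values, one per cell line, in rank order.
profiles = {
    "gene_up": [0.2, 0.6, 1.3, 1.8],
    "gene_down": [1.9, 1.2, 0.7, 0.1],
    "gene_flat": [1.0, 1.1, 0.9, 1.0],
}

labels = {gene: classify(values, invasiveness_rank) for gene, values in profiles.items()}
print(labels)
```

With only four cell lines a correlation coefficient is a coarse measure, which is one reason a clustering step over the whole array is useful before individual genes are selected.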


[0080] A repeated-measures analysis of variance (ANOVA) was performed to determine whether the numbers of invasion foci formed in the tracheal grafts differed significantly. Data from three experiments in duplicate were analyzed by ANOVA (Excel, Microsoft, Taiwan).
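
For readers unfamiliar with the test, the F statistic underlying an analysis of variance can be computed as sketched below. The sketch assumes a plain one-way design, which is simpler than the repeated-measures analysis named above, and the per-graft foci counts are hypothetical placeholders, not the study's data. Converting F to a p value requires the F distribution and is omitted here.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(group) for group in groups)
    grand_mean = sum(sum(group) for group in groups) / n_total
    means = [sum(group) / len(group) for group in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(group) * (mean - grand_mean) ** 2
                     for group, mean in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean) ** 2
                    for group, mean in zip(groups, means) for x in group)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical invasion-foci counts for three grafts per cell line.
hypothetical_foci = [
    [0, 1, 0],  # placeholder for a weakly invasive line
    [1, 0, 1],  # placeholder for an intermediate line
    [4, 6, 2],  # placeholder for a highly invasive line
]

f_stat, df_between, df_within = one_way_anova_f(hypothetical_foci)
print(round(f_stat, 3), df_between, df_within)
```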


[0081] Results


[0082] Invasive ability was measured in the four human lung adenocarcinoma cell lines, CL1-0, CL1-1, CL1-5 and CL1-5-F4. Cells invading through the coated membrane were harvested and counted. The cell counts were: CL1-0: 202±16; CL1-1: 1491±202; CL1-5: 3865±530; and CL1-5-F4: 4115±507. The invasiveness of the four cell lines was as expected and followed the trend: CL1-5-F4≧CL1-5≧CL1-1≧CL1-0.
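
The trend can be checked directly from the reported counts. The short sketch below ranks the cell lines and expresses each count as a multiple of the CL1-0 count. It treats the ± values as standard deviations, which the text does not state, and the interval-overlap check is a descriptive aid rather than a significance test.

```python
# Mean invaded-cell counts and ± values as reported in the text.
invasion = {
    "CL1-0": (202, 16),
    "CL1-1": (1491, 202),
    "CL1-5": (3865, 530),
    "CL1-5-F4": (4115, 507),
}

baseline_mean = invasion["CL1-0"][0]

# Each line's mean count as a multiple of the CL1-0 mean.
fold_over_baseline = {
    line: round(mean / baseline_mean, 1) for line, (mean, _spread) in invasion.items()
}

# Cell lines ordered from least to most invasive by mean count.
ranking = sorted(invasion, key=lambda line: invasion[line][0])


def intervals_overlap(line_a, line_b):
    """True when the mean ± spread intervals of two lines overlap."""
    (mean_a, spread_a), (mean_b, spread_b) = invasion[line_a], invasion[line_b]
    return (mean_a - spread_a <= mean_b + spread_b
            and mean_b - spread_b <= mean_a + spread_a)


print(ranking)
print(fold_over_baseline)
print(intervals_overlap("CL1-5", "CL1-5-F4"))
```

The intervals for CL1-5 and CL1-5-F4 overlap, which is consistent with the "≧" rather than ">" used in the stated trend.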


[0083] The four adenocarcinoma cell lines were confirmed to have corresponding in vivo invasiveness by the tracheal graft invasion assay. After four weeks, the human airway epithelial cells were repopulated on the rat tracheal basement membrane. The repopulated airway epithelial cells revealed pseudostratified columnar epithelium with mucus and ciliary differentiation. After the tracheal graft was injected with CL1-0 cells, tumor formation was evident. However, histochemical staining of the control rat trachea without tumor cell injection and without epithelial cells on the basement membrane revealed no invasion of the basement membrane. When the tracheal graft was injected with CL1-5 cells, invasion of the basement membrane was clearly evident, in addition to tumor formation in the rat trachea. The invasion foci in three sections of the three cell lines were also counted. The total invasion foci per graft for CL1-0, CL1-1 and CL1-5 cells were 0.0, 0.7±0.5 and 4.0±2.0, respectively (ANOVA test: α=0.05, p=0.0133).


[0084] Biotin-labeled probes derived from mRNAs of cell lines of varying invasiveness were hybridized to microarrays with 9,600 putative genes to profile the gene expression patterns. Microarray images showed the gene expression patterns for a series of lung adenocarcinoma cell lines. The trend of gene expression level changes could clearly be seen. It was observed that the expression levels of the calcyclin gene correlated positively with cell line invasiveness, and the expression levels of the AXL gene also correlated positively with invasiveness.


[0085] In order to identify all possible metastasis-associated genes from the 9,600-feature microarray, a cluster analysis on the expression profiles of the four lung adenocarcinoma cell lines was performed. Of the 9,600 putative genes, 8,525 had statistically significant expression values and their expression profiles were grouped into 100 clusters. To avoid confusion from negative values in the expression patterns, the normalized scale, from −1 to +1, was shifted to a positive scale, from 0 to +2. Four exemplary clusters (No. 1-No. 4, shown in FIG. 1A of U.S. application Ser. No. 60/300,991, filed Jun. 26, 2001) correlated positively with the invasiveness of the cell lines. The four clusters contained expression profiles of 61, 50, 67, and 99 genes, respectively. Another four exemplary clusters (No. 5-No. 8, shown in FIG. 1A of U.S. application Ser. No. 60/300,991, filed Jun. 26, 2001) correlated negatively with invasiveness and contained 110, 68, 71, and 63 genes, respectively. The gene expression profiles (277 positively correlated genes and 312 negatively correlated genes) were rearranged by hierarchical cluster analysis using the average linkage method.
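
The arithmetic in this paragraph can be reproduced in a few lines. The sketch below shifts a normalized value from the −1 to +1 scale onto the 0 to +2 scale, confirms that the cluster sizes sum to the stated gene totals, and shows the average-linkage distance as it is commonly defined (the mean of all pairwise distances between two clusters). The choice of Euclidean distance and the example points are assumptions; the text does not state the distance measure used.

```python
import math


def shift_scale(value, offset=1.0):
    """Shift a normalized value from the [-1, +1] scale onto [0, +2]."""
    return value + offset


# Cluster sizes as reported in the text.
positive_clusters = [61, 50, 67, 99]    # clusters No. 1 to No. 4
negative_clusters = [110, 68, 71, 63]   # clusters No. 5 to No. 8

total_positive = sum(positive_clusters)
total_negative = sum(negative_clusters)


def average_linkage(cluster_a, cluster_b):
    """Mean pairwise Euclidean distance between members of two clusters."""
    total = sum(math.dist(p, q) for p in cluster_a for q in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))


# Hypothetical two-dimensional points used only to exercise the function.
example_a = [(0.0, 0.0)]
example_b = [(3.0, 4.0), (6.0, 8.0)]

print(shift_scale(-1.0), shift_scale(1.0))
print(total_positive, total_negative)
print(average_linkage(example_a, example_b))
```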


[0086] To substantiate the results of the microarray studies, a Northern blotting analysis was performed. Ten genes with ascending expression, comprising five sequence-verified known genes (i.e., calcyclin, AXL, tumor-associated antigen L6, Metallothionein I-B, and RTP) and five anonymous genes (i.e., EST-T40480, EST-T70568, EST-R16261, EST-N20320, and EST-T52774), whose expression correlated positively with invasiveness, were selected. These ten genes had higher expression levels in the more invasive cell line (CL1-5-F4). Another six genes with descending expression, five of which are sequence-verified known genes (i.e., proteoglycan I secretory granule, TFIID I, DnaJ-like heat shock protein 40, phosphoenolpyruvate carboxykinase 2, and soluble VEGF receptor) and one of which is an anonymous gene (EST-H04819), whose expression correlated negatively with the invasiveness of the adenocarcinoma cell lines, were also selected for Northern blotting. These six genes were highly expressed in the less invasive cell line (CL1-0). The results of the Northern blotting analysis were consistent with those from the microarray studies. Radio-labeled GAPDH and Gβ-like protein were used as internal controls.


[0087] To demonstrate that the protein expression of the identified genes was also consistent with the microarray analysis, three antibodies, against tumor-associated antigen L6, integrin α-3 and integrin α-6, were used to carry out flow cytometric analysis across all four CL1 sublines. Each experiment was carried out in triplicate. The average background fluorescence was 3.3±0.64 (arbitrary fluorescence intensity). With the antibody against tumor-associated antigen L6, the peak shifted from CL1-0 (18±10.0) to CL1-5-F4 (233±36.9), and the differential expression ratio of CL1-5-F4 to CL1-0 was 12.94. With the antibody against integrin α-3, the peak shifted from CL1-0 (3±0.6) to CL1-5-F4 (49±17.3), and the differential expression ratio was 16.33. With the antibody against integrin α-6, the peak shifted from CL1-0 (14±2.8) to CL1-5-F4 (53±21.7), and the differential expression ratio was 3.79. These results demonstrated that the flow cytometric analysis of protein expression was consistent with the microarray and Northern blotting analyses.
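
The three differential expression ratios follow directly from the reported mean fluorescence intensities, as the short check below shows. The reported ratios are reproduced only when the raw means are divided without subtracting the background value, so the sketch makes that assumption; the dictionary keys are labels chosen for the example.

```python
# Mean fluorescence intensities (arbitrary units) as reported in the text:
# (CL1-0, CL1-5-F4) for each antibody.
mean_intensity = {
    "tumor-associated antigen L6": (18, 233),
    "integrin alpha-3": (3, 49),
    "integrin alpha-6": (14, 53),
}

# Ratio of CL1-5-F4 to CL1-0, computed on raw means (no background subtraction).
ratios = {
    antigen: round(high / low, 2) for antigen, (low, high) in mean_intensity.items()
}

for antigen, ratio in ratios.items():
    print(f"{antigen}: {ratio}")
```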



OTHER EMBODIMENTS

[0088] All of the features disclosed in this specification may be used in any combination. Each feature disclosed in this specification may be replaced by an alternative feature serving the same, equivalent, or similar purpose. Thus, unless expressly stated otherwise, each feature disclosed is only an example of a generic series of equivalent or similar features.


[0089] From the above description, one skilled in the art can easily ascertain the essential characteristics of the present invention, and without departing from the spirit and scope thereof, can make various changes and modifications of the invention to adapt it to various usages and conditions. Accordingly, other embodiments are also within the scope of the following claims.


Claims
  • 1. A method for evaluating a sample, the method comprising: determining the abundance of at least one nucleic acid selected from Group I and/or II in a first sample; and determining the abundance of the at least one nucleic acid in a second sample that comprises normal cells or tumor cells; comparing the abundance of the at least one nucleic acid in the first sample to the abundance of the at least one nucleic acid in the second sample, and categorizing the sample as having tumor invasive or metastatic potential based on results of the comparing.
  • 2. A method for evaluating a sample, the method comprising: determining the abundance of at least one nucleic acid selected from Group I and/or II in a first sample; determining the abundance of the at least one nucleic acid in a second sample that comprises normal cells; determining the abundance of the at least one nucleic acid in a third sample that comprises tumor cells; comparing the abundance of the at least one nucleic acid in the first sample to the abundance of the at least one nucleic acid in the second sample and the abundance of the at least one nucleic acid in the third sample, and categorizing the first sample as having tumor invasive or metastatic potential based on results of the comparing.
  • 3. The method of claim 2 wherein increased similarity between the first sample and third sample relative to similarity between the first sample and the second sample categorizes the first sample as having tumor invasive or metastatic potential.
  • 4. The method of claim 1 or 2, wherein the at least one nucleic acid comprises at least ten nucleic acids selected from Groups I and/or II.
  • 5. The method of claim 4, wherein the at least one nucleic acid comprises at least ten nucleic acids selected from Group I.
  • 6. The method of claim 1 or 2, wherein the at least one nucleic acid comprises one or more nucleic acids selected from the group consisting of: EST-T70568, EST-N20320, EST-T52774, proteoglycan I secretory granule (W19210), DnaJ-like heat shock protein 40 (A084517), and phosphoenolpyruvate carboxykinase 2 (R40253).
  • 7. A method of evaluating a sample, the method comprising: identifying an expression profile that represents the levels of protein or mRNA expression from at least two genes selected from Group I and/or II in a sample; and comparing the sample expression profile to at least one reference expression profile; wherein each of the sample expression profile and the reference expression profile includes a plurality of values, each of the values is an assessment of the abundance of (1) an mRNA transcribed from a gene selected from Group I and/or II; or (2) a polypeptide encoded by the gene.
  • 8. The method of claim 7, wherein each of the sample expression profile and the reference expression profile includes a plurality of values for 50% of the members of Group I.
  • 9. The method of claim 8, wherein each of the sample expression profile and the reference expression profile includes a plurality of values for 80% of the members of Group I.
  • 10. The method of claim 7, wherein each of the sample expression profile and the reference expression profile includes a plurality of values for 20% of the members of Group II.
  • 11. The method of claim 9, wherein the comparing comprises evaluating a Euclidean distance.
  • 12. The method of claim 9, wherein the comparing comprises evaluating a correlation coefficient.
  • 13. The method of claim 9, wherein the reference profile is a profile of a non-tumorous cell.
  • 14. The method of claim 9, wherein the reference profile is a profile of a tumor cell.
  • 15. The method of claim 14, wherein the tumor cell is a cultured lung adenocarcinoma cell.
  • 16. A method for diagnosing tumor invasive potential or metastatic development in a subject, the method comprising: providing a sample from the subject; determining a protein or mRNA expression profile of the sample; comparing the expression profile to a reference profile for a non-metastatic cell; and categorizing the subject as having tumor invasive potential or metastatic development when the sample expression profile is found to be altered relative to the reference expression profile, wherein each of the sample expression profile and the reference expression profile includes one or more values representing the levels of expression of one or more nucleic acids selected from Group I and/or II.
  • 17. The method of claim 16, wherein each of the sample expression profile and the reference expression profile includes one or more values representing the levels of expression of 50% of nucleic acids selected from Group I.
  • 18. The method of claim 16, wherein the sample comprises a biopsy.
  • 19. The method of claim 16, wherein the sample comprises lung tissue or lung cells.
  • 20. The method of claim 19, wherein the expression profile is an mRNA expression profile and the expression profile is determined using a nucleic acid array.
  • 21. A method for screening for a test compound useful in the prevention or treatment of tumor metastasis, comprising: providing a reference expression profile; contacting the test compound to a cell; determining a compound-associated expression profile for the contacted cell; and comparing the compound-associated expression profile to the reference profile; wherein each of the compound-associated expression profile and the reference expression profile includes one or more values representing the level of expression of one or more nucleic acids selected from Group I and/or Group II.
  • 22. The method of claim 21, wherein each of the compound-associated profile and the reference expression profile includes one or more values representing the levels of expression of at least 50% of nucleic acids selected from Group I.
  • 23. The method of claim 22, wherein each of the compound-associated profile and the reference expression profile includes one or more values representing the levels of expression of at least 80% of nucleic acids selected from Group I.
  • 24. The method of claim 21, wherein each of the compound-associated profile and the reference expression profile includes one or more values representing the levels of expression of 20% of nucleic acids selected from Group II.
CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims priority to U.S. application Ser. No. 60/300,991, filed Jun. 26, 2001, the contents of which are hereby incorporated by reference in their entirety for all purposes.

Provisional Applications (1)
Number Date Country
60300991 Jun 2001 US