Breast cancer signatures

Abstract
The invention relates to the identification and use of gene expression profiles, or patterns, suitable for identification of breast cancer patient populations with different survival outcomes. The gene expression profiles may be embodied in nucleic acid expression, protein expression, or other expression formats, and may be used in the study and/or determination of the prognosis of a patient, including breast cancer survival.
Description
FIELD OF THE INVENTION

The invention relates to the identification and use of gene expression profiles, or patterns, with clinical relevance to breast cancer. In particular, the invention provides the identities of genes that are correlated with breast cancer recurrence, cancer metastasis, and patient survival. The gene expression profiles, whether embodied in nucleic acid expression, protein expression, or other expression formats, may be used to predict breast cancer recurrence and survival of subjects afflicted with breast cancer. The profiles may also be used in the study and/or diagnosis of breast cancer cells and tissue as well as for the study and/or determination of prognosis of a patient. When used for diagnosis or prognosis, the profiles are used to determine the treatment of breast cancer based upon the likelihood of recurrence, metastases, and life expectancy.


BACKGROUND OF THE INVENTION

Breast cancer is by far the most common cancer among women. Each year, more than 180,000 and 1 million women in the U.S. and worldwide, respectively, are diagnosed with breast cancer. Breast cancer is the leading cause of death for women between ages 50-55, and is the most common non-preventable malignancy in women in the Western Hemisphere. An estimated 2,167,000 women in the United States are currently living with the disease (National Cancer Institute, Surveillance Epidemiology and End Results (NCI SEER) program, Cancer Statistics Review (CSR), www-seer.ims.nci.nih.gov/Publications/CSR1973 (1998)). Based on cancer rates from 1995 through 1997, a report from the National Cancer Institute (NCI) estimates that about 1 in 8 women in the United States (approximately 12.8 percent) will develop breast cancer during her lifetime (NCI's Surveillance, Epidemiology, and End Results Program (SEER) publication SEER Cancer Statistics Review 1973-1997). Breast cancer is the second most common form of cancer, after skin cancer, among women in the United States. An estimated 250,100 new cases of breast cancer are expected to be diagnosed in the United States in 2001. Of these, 192,200 new cases of more advanced (invasive) breast cancer are expected to occur among women (an increase of 5% over last year), 46,400 new cases of early stage (in situ) breast cancer are expected to occur among women (up 9% from last year), and about 1,500 new cases of breast cancer are expected to be diagnosed in men (Cancer Facts & Figures 2001 American Cancer Society). An estimated 40,600 deaths (40,300 women, 400 men) from breast cancer are expected in 2001. Breast cancer ranks second only to lung cancer among causes of cancer deaths in women. Nearly 86% of women who are diagnosed with breast cancer are likely to still be alive five years later, though 24% of them will die of breast cancer after 10 years, and nearly half (47%) will die of breast cancer after 20 years.


Every woman is at risk for breast cancer. Over 70 percent of breast cancers occur in women who have no identifiable risk factors other than age (U.S. General Accounting Office. Breast Cancer, 1971-1991: Prevention, Treatment and Research. GAO/PEMD-92-12; 1991). Only 5 to 10% of breast cancers are linked to a family history of breast cancer (Henderson IC, Breast Cancer. In: Murphy G P, Lawrence W L, Lenhard R E (eds). Clinical Oncology. Atlanta, Ga.: American Cancer Society; 1995:198-219).


Each breast has 15 to 20 sections called lobes. Within each lobe are many smaller lobules. Lobules end in dozens of tiny bulbs that can produce milk. The lobes, lobules, and bulbs are all linked by thin tubes called ducts. These ducts lead to the nipple in the center of a dark area of skin called the areola. Fat surrounds the lobules and ducts. There are no muscles in the breast, but muscles lie under each breast and cover the ribs. Each breast also contains blood vessels and lymph vessels. The lymph vessels carry colorless fluid called lymph, and lead to the lymph nodes. Clusters of lymph nodes are found near the breast in the axilla (under the arm), above the collarbone, and in the chest.


Breast tumors can be either benign or malignant. Benign tumors are not cancerous: they do not spread to other parts of the body and are not a threat to life. They can usually be removed and, in most cases, do not come back. Malignant tumors are cancerous, and can invade and damage nearby tissues and organs. Malignant tumor cells may metastasize, entering the bloodstream or lymphatic system. When breast cancer cells metastasize outside the breast, they are often found in the lymph nodes under the arm (axillary lymph nodes). If the cancer has reached these nodes, it means that cancer cells may have spread to other lymph nodes or other organs, such as bones, liver, or lungs.


Major and intensive research has been focused on early detection, treatment and prevention. This has included an emphasis on determining the presence of precancerous or cancerous ductal epithelial cells. These cells are analyzed, for example, for cell morphology, for protein markers, for nucleic acid markers, for chromosomal abnormalities, for biochemical markers, and for other characteristic changes that would signal the presence of cancerous or precancerous cells. This has led to reports of various molecular alterations in breast cancer, few of which have been well characterized in human clinical breast specimens. Molecular alterations include presence/absence of estrogen and progesterone steroid receptors, HER-2 expression/amplification (Mark H F, et al. HER-2/neu gene amplification in stages I-IV breast cancer detected by fluorescent in situ hybridization. Genet Med; 1(3):98-103 1999), Ki-67 (an antigen that is present in all stages of the cell cycle except G0 and used as a marker for tumor cell proliferation), and prognostic markers (including oncogenes, tumor suppressor genes, and angiogenesis markers) like p53, p27, Cathepsin D, pS2, multi-drug resistance (MDR) gene, and CD31.


van't Veer et al. (Nature 415:530-536, 2002) describe gene expression profiling of clinical outcome in breast cancer. They identified genes expressed in breast cancer tumors, the expression levels of which correlated either with patients afflicted with distant metastases within 5 years or with patients that remained metastasis-free after at least 5 years.


Ramaswamy et al. (Nature Genetics 33:49-54, 2003) describe the identification of a molecular signature of metastasis in primary solid tumors. The genes of the signature were identified based on gene expression profiles of 12 metastatic adenocarcinoma nodules of diverse origin (lung, breast, prostate, colorectal, uterus) compared to expression profiles of 64 primary adenocarcinomas representing the same spectrum of tumor types from different individuals. A 128 gene set was identified.


Both of the above described approaches, however, utilize heterogeneous populations of cells found in a tumor sample to obtain information on gene expression patterns. The use of such populations may result in the inclusion or exclusion of multiple genes that are differentially expressed in cancer cells. The gene expression patterns observed by the above described approaches may thus provide little confidence that the differences in gene expression are meaningfully associated with breast cancer recurrence or survival.


Citation of documents herein is not intended as an admission that any is pertinent prior art. All statements as to the date or representation as to the contents of documents are based on the information available to the applicant and do not constitute any admission as to the correctness of the dates or contents of the documents.


SUMMARY OF THE INVENTION

The present invention relates to the identification and use of gene expression patterns (or profiles or “signatures”) which are clinically relevant to breast cancer. In particular, the identities of genes that are correlated with breast cancer recurrence, cancer metastasis, and patient survival are provided. The gene expression profiles, whether embodied in nucleic acid expression, protein expression, or other expression formats, may be used to predict breast cancer recurrence and survival of subjects afflicted with breast cancer.


The invention thus provides for the identification and use of gene expression patterns (or profiles or “signatures”) which correlate with (and are thus able to discriminate between) patients with good or poor survival outcomes. In one embodiment, the invention provides patterns that are able to distinguish patients with estrogen receptor (ER) positive breast tumors into those with poor survival outcomes, similar to that of patients with ER negative breast tumors, and those with a better survival outcome. These patterns are thus able to distinguish patients with ER positive breast tumors into at least two subtypes. Other patterns are capable of identifying subjects with ER negative tumors, and the survival outcomes associated therewith, as well as survival outcomes for some breast cancer subjects independent of the ER status of their tumors.


The invention also provides for the identification and use of gene expression patterns which correlate with the recurrence of breast cancer in the form of metastases. The patterns are able to distinguish patients with breast cancer into at least those with good or poor survival outcomes.


The present invention provides a non-subjective means for the identification of patients with breast cancer as likely to have a good or poor survival outcome by assaying for the expression patterns disclosed herein. Thus where subjective interpretation may have been previously used to determine the prognosis and/or treatment of breast cancer patients, the present invention provides objective gene expression patterns, which may be used alone or in combination with subjective criteria to provide a more accurate assessment of breast cancer patient outcomes. The expression patterns of the invention thus provide a means to determine breast cancer prognosis. Furthermore, the expression patterns can also be used as a means to assay small, node negative tumors that are not readily assayed by other means.


The gene expression patterns comprise one or more than one gene capable of discriminating between breast cancer survival outcomes with significant accuracy. The gene(s) are identified as correlated with various breast cancer survival outcomes such that the levels of their expression are relevant to a determination of the survival, and thus preferred treatment protocols, of a breast cancer patient. Thus in one aspect, the invention provides a method to determine the survival outcome of a subject afflicted with, or suspected of having, breast cancer by assaying a cell containing sample from said subject for expression of one or more than one gene disclosed herein as correlated with breast cancer survival outcomes.


Gene expression patterns of the invention are identified as described below. Generally, a large sampling of the gene expression profile of a sample is obtained by quantifying the expression levels of mRNA corresponding to many genes. This profile is then analyzed to identify genes, the expression of which is positively or negatively correlated with breast cancer survival outcomes. An expression profile of a subset of human genes may then be identified by the methods of the present invention as correlated with a particular breast cancer survival outcome. The use of multiple samples increases the confidence with which a gene may be believed to be correlated with a particular survival outcome. Without sufficient confidence, it remains unpredictable whether a particular gene is actually correlated with breast cancer survival outcomes and also unpredictable whether a particular gene may be successfully used to identify the survival outcome for a breast cancer patient.
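As a minimal sketch of the kind of analysis described above, assuming expression values have already been quantified per gene and per sample and that survival outcome is coded as 1 (good) or 0 (poor), the following ranks genes by the correlation between expression and outcome. The use of the Pearson coefficient, the gene names, the data values, and the cutoff `min_abs_r` are illustrative assumptions and are not taken from this disclosure.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant vector carries no correlation information
    return cov / sqrt(var_x * var_y)


def correlated_genes(expression, outcomes, min_abs_r=0.8):
    """Return {gene: r} for genes whose |r| with outcome meets the cutoff.

    expression: dict mapping gene name -> expression values, one per sample
    outcomes:   0/1 outcome codes in the same sample order (hypothetical coding)
    """
    selected = {}
    for gene, values in expression.items():
        r = pearson(values, outcomes)
        if abs(r) >= min_abs_r:
            selected[gene] = r
    return selected


# Hypothetical data: two good-outcome samples (1) and two poor-outcome samples (0).
outcomes = [1, 1, 0, 0]
expression = {
    "GENE_A": [8.0, 9.0, 2.0, 1.0],  # higher in good-outcome samples
    "GENE_B": [1.0, 2.0, 7.0, 8.0],  # lower in good-outcome samples
    "GENE_C": [5.0, 4.0, 5.0, 4.0],  # unrelated to outcome
}
print(correlated_genes(expression, outcomes))
```

With only four samples the coefficients are unstable, which illustrates why a larger number of samples increases the confidence with which a gene may be considered correlated with an outcome.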


A profile of genes that are highly correlated with one survival outcome relative to another may be used to assay a sample from a subject afflicted with, or suspected of having, breast cancer to predict the survival outcome of the subject from whom the sample was obtained. Such an assay may be used as part of a method to determine the therapeutic treatment for said subject based upon the breast cancer survival outcome identified.


The correlated genes may be used singly with significant accuracy or in combination to increase the ability to accurately discriminate between various stages and/or grades of breast cancer. The present invention thus provides means for correlating a molecular expression phenotype with breast cancer survival outcomes. This correlation is a way to molecularly provide for the determination of survival outcomes as disclosed herein. Additional uses of the correlated gene(s) are in the classification of cells and tissues; determination of diagnosis and/or prognosis; and determination and/or alteration of therapy.


An assay of the invention may utilize a means related to the expression level of the sequences disclosed herein as long as the assay reflects, quantitatively or qualitatively, expression of the sequence. A quantitative assay means is, however, preferred. The ability to discriminate is conferred by the identification of expression of the individual genes as relevant and not by the form of the assay used to determine the actual level of expression. An assay may utilize any identifying feature of an identified individual gene as disclosed herein as long as the assay reflects, quantitatively or qualitatively, expression of the gene. Identifying features include, but are not limited to, unique nucleic acid sequences used to encode (DNA), or express (RNA), said gene or epitopes specific to, or activities of, a protein encoded by said gene. Alternative means include detection of nucleic acid amplification as indicative of increased expression levels and nucleic acid inactivation, deletion, or methylation, as indicative of decreased expression levels. Stated differently, the invention may be practiced by assaying one or more aspects of the DNA template(s) underlying the expression of the disclosed sequence(s), of the RNA used as an intermediate to express the sequence(s), or of the proteinaceous product expressed by the sequence(s), as well as proteolytic fragments of such products. As such, the detection of the presence of, amount of, stability of, or degradation (including rate) of, such DNA, RNA and proteinaceous molecules may be used in the practice of the invention. Accordingly, all that is required is the identity of the gene(s) necessary to discriminate between breast cancer survival outcomes and an appropriate cell containing sample for use in an expression assay.


In one aspect, the invention provides for the identification of the gene expression patterns by analyzing global, or near global, gene expression from single cells or homogeneous cell populations which have been dissected away from, or otherwise isolated or purified from, contaminating cells beyond that possible by a simple biopsy. Because the expression of numerous genes fluctuates between cells from different patients as well as between cells from the same patient sample, multiple data from expression of individual genes and gene expression patterns are used as reference data to generate models which in turn permit the identification of individual gene(s), the expression of which is most highly correlated with particular breast cancer survival outcomes.


In a further aspect, the gene sequence(s) capable of discriminating between breast cancer survival outcomes based on cell or tissue samples may be used to determine the likely outcome of a patient from whom the sample was obtained. Preferably, the sample is isolated via non-invasive means. The expression of said gene(s) in said sample may be determined and compared to the expression of said gene(s) in reference data of gene expression patterns as disclosed herein. Alternatively, the expression level may be compared to expression levels in normal or non-cancerous cells, such as, but not limited to, those from the same sample or subject. In embodiments of the invention utilizing quantitative PCR, the expression level may be compared to expression levels of reference genes in the same sample or a ratio of expression levels may be used. The invention provides for ratios of the expression level of a sequence that is underexpressed to the expression level of a sequence that is overexpressed as an indicator of survival outcome or cancer recurrence, including metastatic cancer. The use of a ratio can reduce comparisons with normal or non-cancerous cells.
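For the quantitative PCR embodiments mentioned above, one common way to compare a target sequence with a reference gene in the same sample is the difference in threshold cycle (Ct) values. The sketch below is offered only as an illustration: the Ct-based calculation, the assumption of ideal doubling at every cycle, and the example values are assumptions and are not specified in this disclosure.

```python
def relative_expression(ct_target, ct_reference):
    """Expression of a target relative to a reference gene, from Ct values.

    Assumes ideal amplification (template doubles every cycle), so each cycle
    of difference corresponds to a two-fold difference in starting template.
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


# Hypothetical Ct values measured in the same sample.
print(relative_expression(ct_target=24.0, ct_reference=20.0))  # 0.0625
print(relative_expression(ct_target=20.0, ct_reference=20.0))  # 1.0
```

A target that crosses the threshold four cycles later than the reference is thus estimated at one-sixteenth of the reference level under the stated assumption.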


One advantage provided by the present invention is that contaminating, non-breast cells (such as infiltrating lymphocytes or other immune system cells) are not present to possibly affect the genes identified or the subsequent analysis of gene expression to identify the survival outcomes of patients with breast cancer. Such contamination is present where a biopsy is used to generate gene expression profiles.


While the present invention has been described mainly in the context of human breast cancer, it may be practiced in the context of breast cancer of any animal known to be potentially afflicted by breast cancer. Preferred animals for the application of the present invention are mammals, particularly those important to agricultural applications (such as, but not limited to, cattle, sheep, horses, and other “farm animals”) and for human companionship (such as, but not limited to, dogs and cats).




BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1 is a clinical outcome (overall survival) plot of two subtypes based on expression of 864 genes as listed in Tables 2 and 3.



FIG. 2 is a plot of a 297 gene signature (identities of the genes are presented in Table 5) which segregates the survival data of a patient population into “long” and “short” groups with significantly different overall survival curves. FIG. 2 also shows the comparison of this 297 gene set with that of a set of 17 genes correlated with metastasis described by Ramaswamy et al. (supra, see Table 1 therein).



FIG. 3 is a plot of clinical outcomes for four breast cancer subtypes provided by the instant invention.




DETAILED DESCRIPTION OF THE SPECIFIC EMBODIMENTS

Definitions of Terms as used Herein:


A gene expression “pattern” or “profile” or “signature” refers to the relative expression of a gene between two or more breast cancer survival outcomes which is correlated with being able to distinguish between said outcomes.


A “gene” is a polynucleotide that encodes a discrete product, whether RNA or proteinaceous in nature. It is appreciated that more than one polynucleotide may be capable of encoding a discrete product. The term includes alleles and polymorphisms of a gene that encodes the same product, or a functionally associated (including gain, loss, or modulation of function) analog thereof, based upon chromosomal location and ability to recombine during normal mitosis.


A “sequence” or “gene sequence” as used herein is a nucleic acid molecule or polynucleotide composed of a discrete order of nucleotide bases. The term includes the ordering of bases that encodes a discrete product (i.e. “coding region”), whether RNA or proteinaceous in nature, as well as the ordered bases that precede or follow a “coding region”. Non-limiting examples of the latter include 5′ and 3′ untranslated regions of a gene. It is appreciated that more than one polynucleotide may be capable of encoding a discrete product. It is also appreciated that alleles and polymorphisms of the disclosed sequences may exist and may be used in the practice of the invention to identify the expression level(s) of the disclosed sequences or the allele or polymorphism. Identification of an allele or polymorphism depends in part upon chromosomal location and ability to recombine during mitosis.


The terms “correlate” or “correlation” or equivalents thereof refer to an association between expression of one or more genes in a breast cancer cell or tissue sample and the survival outcome of the subject from whom the sample was obtained. Genes expressed at higher levels and correlated with the survival outcomes disclosed herein are provided. The invention provides for the correlation between increases, as well as decreases, in expression of gene sequences and survival outcomes and cancer recurrence, including cancer metastases, in patients. Increases and decreases may be readily expressed in the form of a ratio between expression in a non-normal cell and a normal cell such that a ratio of one (1) indicates no difference while ratios of two (2) and one-half indicate twice as much, and half as much, expression in the non-normal cell versus the normal cell, respectively. Expression levels can be readily determined by quantitative methods as described below.


For example, increases in gene expression can be indicated by ratios of or about 1.1, of or about 1.2, of or about 1.3, of or about 1.4, of or about 1.5, of or about 1.6, of or about 1.7, of or about 1.8, of or about 1.9, of or about 2, of or about 2.5, of or about 3, of or about 3.5, of or about 4, of or about 4.5, of or about 5, of or about 5.5, of or about 6, of or about 6.5, of or about 7, of or about 7.5, of or about 8, of or about 8.5, of or about 9, of or about 9.5, of or about 10, of or about 15, of or about 20, of or about 30, of or about 40, of or about 50, of or about 60, of or about 70, of or about 80, of or about 90, of or about 100, of or about 150, of or about 200, of or about 300, of or about 400, of or about 500, of or about 600, of or about 700, of or about 800, of or about 900, or of or about 1000. A ratio of 2 is a 100% (or a two-fold) increase in expression. Decreases in gene expression can be indicated by ratios of or about 0.9, of or about 0.8, of or about 0.7, of or about 0.6, of or about 0.5, of or about 0.4, of or about 0.3, of or about 0.2, of or about 0.1, of or about 0.05, of or about 0.01, of or about 0.005, of or about 0.001, of or about 0.0005, of or about 0.0001, of or about 0.00005, of or about 0.00001, of or about 0.000005, or of or about 0.000001.
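The relationship between the ratios listed above and percent changes can be sketched as follows. The function names and the example expression levels are hypothetical; the arithmetic follows the definition given above, in which a ratio of 1 indicates no difference and a ratio of 2 is a 100% increase.

```python
def expression_ratio(non_normal_level, normal_level):
    """Ratio of expression in a non-normal cell to expression in a normal cell."""
    if normal_level <= 0:
        raise ValueError("normal_level must be positive")
    return non_normal_level / normal_level


def percent_change(ratio):
    """Percent change implied by a ratio: positive for increases, negative for decreases."""
    return (ratio - 1.0) * 100.0


# Hypothetical expression levels in arbitrary units.
print(percent_change(expression_ratio(10.0, 5.0)))  # 100.0  (ratio of 2)
print(percent_change(expression_ratio(5.0, 5.0)))   # 0.0    (ratio of 1)
print(percent_change(expression_ratio(2.5, 5.0)))   # -50.0  (ratio of one-half)
```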


In some embodiments of the invention, such as those related to survival, cancer recurrence, or metastasis as possible outcome phenotypes, a ratio of the expression of a gene sequence expressed at increased levels in correlation with an outcome to the expression of a gene sequence expressed at decreased levels in correlation with the outcome may also be used as an indicator of the phenotype. As a non-limiting example, one cancer survival outcome may be correlated with increased expression of a gene sequence overexpressed in a sample of cancer cells as well as decreased expression of another gene sequence underexpressed in those cells. Therefore, a ratio of the expression levels of the underexpressed sequence to the expression levels of the overexpressed sequence may be used as an indicator or predictor of the outcome.
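A short sketch of the two-sequence ratio described above follows. It also shows one property of such a ratio: because both levels come from the same sample, a factor that scales both measurements equally (for example, the amount of input material) cancels out. The function name and all values are hypothetical.

```python
def under_over_ratio(under_level, over_level):
    """Ratio of an underexpressed sequence's level to an overexpressed sequence's level."""
    if over_level <= 0:
        raise ValueError("over_level must be positive")
    return under_level / over_level


# Hypothetical levels, in arbitrary units, for one sample associated with the outcome.
ratio = under_over_ratio(under_level=2.0, over_level=8.0)
print(ratio)  # 0.25

# Tripling both measurements (e.g. three times as much input) leaves the ratio unchanged.
print(under_over_ratio(under_level=6.0, over_level=24.0))  # 0.25

# Hypothetical sample without the outcome: neither sequence is shifted in that direction.
print(under_over_ratio(under_level=6.0, over_level=3.0))  # 2.0
```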


A “polynucleotide” is a polymeric form of nucleotides of any length, either ribonucleotides or deoxyribonucleotides. This term refers only to the primary structure of the molecule. Thus, this term includes double- and single-stranded DNA and RNA. It also includes known types of modifications including labels known in the art, methylation, “caps”, substitution of one or more of the naturally occurring nucleotides with an analog, and internucleotide modifications such as uncharged linkages (e.g., phosphorothioates, phosphorodithioates, etc.), as well as unmodified forms of the polynucleotide.


The term “amplify” is used in the broad sense to mean creating an amplification product, which can be made enzymatically with DNA or RNA polymerases. “Amplification,” as used herein, generally refers to the process of producing multiple copies of a desired sequence, particularly those of a sample. “Multiple copies” means at least 2 copies. A “copy” does not necessarily mean perfect sequence complementarity or identity to the template sequence.


By “corresponding” is meant that a nucleic acid molecule shares a substantial amount of sequence identity with another nucleic acid molecule. Substantial amount means at least 95%, usually at least 98% and more usually at least 99%, and sequence identity is determined using the BLAST algorithm, as described in Altschul et al. (1990), J. Mol. Biol. 215:403-410 (using the published default setting, i.e. parameters w=4, t=17). Methods for amplifying mRNA are generally known in the art, and include reverse transcription PCR (RT-PCR) and those described in U.S. patent application Ser. No. 10/062,857 (filed on Oct. 25, 2001), as well as U.S. Provisional Patent Applications 60/298,847 (filed Jun. 15, 2001) and 60/257,801 (filed Dec. 22, 2000), all of which are hereby incorporated by reference in their entireties as if fully set forth. Another method which may be used is quantitative PCR (or Q-PCR). Alternatively, RNA may be directly labeled as the corresponding cDNA by methods known in the art.
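The 95%, 98% and 99% identity thresholds given above for corresponding sequences can be illustrated with a simple calculation on two sequences that have already been aligned. This is not the BLAST algorithm, which both finds and scores the alignment; it is only a sketch of the percentage arithmetic, and the sequences and function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned positions with identical bases.

    Both inputs must already be aligned to the same length, with '-' marking
    gaps. Gap positions count toward the length but never as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


def corresponds(aligned_a, aligned_b, threshold=95.0):
    """True if the aligned sequences meet the stated identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical 20-base sequences differing at one position (the last base).
seq_1 = "ACGTACGTACGTACGTACGT"
seq_2 = "ACGTACGTACGTACGTACGA"
print(percent_identity(seq_1, seq_2))             # 95.0
print(corresponds(seq_1, seq_2))                  # True at the 95% threshold
print(corresponds(seq_1, seq_2, threshold=98.0))  # False at the 98% threshold
```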


A “microarray” is a linear or two-dimensional array of preferably discrete regions, each having a defined area, formed on the surface of a solid support such as, but not limited to, glass, plastic, or synthetic membrane. The density of the discrete regions on a microarray is determined by the total numbers of immobilized polynucleotides to be detected on the surface of a single solid phase support, preferably at least about 50/cm², more preferably at least about 100/cm², even more preferably at least about 500/cm², but preferably below about 1,000/cm². Preferably, the arrays contain fewer than about 500, about 1000, about 1500, about 2000, about 2500, or about 3000 immobilized polynucleotides in total. As used herein, a DNA microarray is an array of oligonucleotides or polynucleotides placed on a chip or other surfaces used to hybridize to amplified or cloned polynucleotides from a sample. Since the position of each particular group of primers in the array is known, the identities of a sample's polynucleotides can be determined based on their binding to a particular position in the microarray.


Because the invention relies upon the identification of genes that are over- or under-expressed, one embodiment of the invention involves determining expression by hybridization of mRNA, or an amplified or cloned version thereof, of a sample cell to a polynucleotide that is unique to a particular gene sequence. Preferred polynucleotides of this type contain at least about 20, at least about 22, at least about 24, at least about 26, at least about 28, at least about 30, or at least about 32 consecutive basepairs of a gene sequence that is not found in other gene sequences. The term “about” as used in the previous sentence refers to an increase or decrease of 1 from the stated numerical value. Even more preferred are polynucleotides of at least or about 50, at least or about 100, at least or about 150, at least or about 200, at least or about 250, at least or about 300, at least or about 350, or at least or about 400 basepairs of a gene sequence that is not found in other gene sequences. The term “about” as used in the preceding sentence refers to an increase or decrease of 10% from the stated numerical value. Such polynucleotides may also be referred to as polynucleotide probes that are capable of hybridizing to sequences of the genes, or unique portions thereof, described herein. Preferably, the sequences are those of mRNA encoded by the genes, the corresponding cDNA to such mRNAs, and/or amplified versions of such sequences. In preferred embodiments of the invention, the polynucleotide probes are immobilized on an array, other devices, or in individual spots that localize the probes.
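The two meanings of “about” given above for probe lengths (plus or minus 1 for the shorter lengths, plus or minus 10% for the longer lengths) can be made concrete with a small sketch. The function names are hypothetical and the code only restates the arithmetic of those two definitions.

```python
def short_probe_range(stated_length):
    """Range covered by 'about' for the shorter probes: stated value plus or minus 1."""
    return (stated_length - 1, stated_length + 1)


def long_probe_range(stated_length):
    """Range covered by 'about' for the longer probes: stated value plus or minus 10%."""
    delta = stated_length / 10
    return (stated_length - delta, stated_length + delta)


print(short_probe_range(20))  # (19, 21)
print(long_probe_range(50))   # (45.0, 55.0)
print(long_probe_range(400))  # (360.0, 440.0)
```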


In another embodiment of the invention, all or part of a disclosed sequence may be amplified and detected by methods such as the polymerase chain reaction (PCR) and variations thereof, such as, but not limited to, quantitative PCR (Q-PCR), reverse transcription PCR (RT-PCR), and real-time PCR (including as a means of measuring the initial amounts of mRNA copies for each sequence in a sample), optionally real-time RT-PCR or real-time Q-PCR. Such methods would utilize one or two primers that are complementary to portions of a disclosed sequence, where the primers are used to prime nucleic acid synthesis. The newly synthesized nucleic acids are optionally labeled and may be detected directly or by hybridization to a polynucleotide of the invention. The newly synthesized nucleic acids may be contacted with polynucleotides (containing sequences) of the invention under conditions which allow for their hybridization. Additional methods to detect the expression of expressed nucleic acids include RNAse protection assays, including liquid phase hybridizations, and in situ hybridization of cells.


Alternatively, and in yet another embodiment of the invention, gene expression may be determined by analysis of expressed protein in a cell sample of interest by use of one or more antibodies specific for one or more epitopes of individual gene products (proteins), or proteolytic fragments thereof, in said cell sample or in a bodily fluid of a subject. The cell sample may be one of breast cancer epithelial cells enriched from the blood of a subject, such as by use of labeled antibodies against cell surface markers followed by fluorescence activated cell sorting (FACS). Such antibodies are preferably labeled to permit their easy detection after binding to the gene product. Detection methodologies suitable for use in the practice of the invention include, but are not limited to, immunohistochemistry of cell containing samples or tissue, enzyme linked immunosorbent assays (ELISAs) including antibody sandwich assays of cell containing tissues or blood samples, mass spectroscopy, and immuno-PCR.


The term “label” refers to a composition capable of producing a detectable signal indicative of the presence of the labeled molecule. Suitable labels include radioisotopes, nucleotide chromophores, enzymes, substrates, fluorescent molecules, chemiluminescent moieties, magnetic particles, bioluminescent moieties, and the like. As such, a label is any composition detectable by spectroscopic, photochemical, biochemical, immunochemical, electrical, optical or chemical means.


The term “support” refers to conventional supports such as beads, particles, dipsticks, fibers, filters, membranes and silane or silicate supports such as glass slides.


As used herein, a “breast tissue sample” or “breast cell sample” refers to a sample of breast tissue or fluid isolated from an individual suspected of being afflicted with, or at risk of developing, breast cancer. Such samples are primary isolates (in contrast to cultured cells) and may be collected by any non-invasive means, including, but not limited to, ductal lavage, fine needle aspiration, needle biopsy, the devices and methods described in U.S. Pat. No. 6,328,709, or any other suitable means recognized in the art. Alternatively, the “sample” may be collected by an invasive method, including, but not limited to, surgical biopsy. A sample of the invention may also be one that has been formalin fixed and paraffin embedded (FFPE) or freshly frozen.


“Expression” and “gene expression” include transcription and/or translation of nucleic acid material.


As used herein, the term “comprising” and its cognates are used in their inclusive sense; that is, equivalent to the term “including” and its corresponding cognates.


Conditions that “allow” an event to occur or conditions that are “suitable” for an event to occur, such as hybridization, strand extension, and the like, or “suitable” conditions are conditions that do not prevent such events from occurring. Thus, these conditions permit, enhance, facilitate, and/or are conducive to the event. Such conditions, known in the art and described herein, depend upon, for example, the nature of the nucleotide sequence, temperature, and buffer conditions. These conditions also depend on what event is desired, such as hybridization, cleavage, strand extension or transcription.


Sequence “mutation,” as used herein, refers to any sequence alteration in the sequence of a gene of interest disclosed herein in comparison to a reference sequence. A sequence mutation includes single nucleotide changes, or alterations of more than one nucleotide in a sequence, due to mechanisms such as substitution, deletion or insertion. A single nucleotide polymorphism (SNP) is also a sequence mutation as used herein. Because the present invention is based on the relative level of gene expression, mutations in non-coding regions of genes as disclosed herein may also be assayed in the practice of the invention.


“Detection” includes any means of detecting, including direct and indirect detection of gene expression and changes therein. For example, “detectably less” products may be observed directly or indirectly, and the term indicates any reduction (including the absence of detectable signal). Similarly, “detectably more” product means any increase, whether observed directly or indirectly.


Increases and decreases in expression of the disclosed sequences are defined in the following terms based upon percent or fold changes over expression in normal cells. Increases may be of 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 120, 140, 160, 180, or 200% relative to expression levels in normal cells. Alternatively, fold increases may be of 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5, 8, 8.5, 9, 9.5, or 10 fold over expression levels in normal cells. Decreases may be of 10, 20, 30, 40, 50, 55, 60, 65, 70, 75, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 99 or 100% relative to expression levels in normal cells.
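By way of illustration only, the percent and fold changes defined above may be computed as in the following minimal sketch. The function names and the example expression values are hypothetical, and the sketch assumes that a fold value is the simple ratio of expression in the sample to expression in normal cells.

```python
def percent_change(sample: float, normal: float) -> float:
    """Signed percent change of sample expression relative to normal expression."""
    if normal <= 0:
        raise ValueError("normal expression level must be positive")
    return (sample - normal) / normal * 100.0


def fold_over_normal(sample: float, normal: float) -> float:
    """Ratio of sample expression to normal expression (assumed meaning of 'fold')."""
    if normal <= 0:
        raise ValueError("normal expression level must be positive")
    return sample / normal


# Hypothetical expression levels in arbitrary units.
print(percent_change(150.0, 100.0))    # 50.0  (a 50% increase)
print(percent_change(20.0, 100.0))     # -80.0 (an 80% decrease)
print(fold_over_normal(250.0, 100.0))  # 2.5
```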


Unless defined otherwise all technical and scientific terms used herein have the same meaning as commonly understood to one of ordinary skill in the art to which this invention belongs.


Specific Embodiments


The present invention relates to the identification and use of gene expression patterns (or profiles or “signatures”) which discriminate between (or are correlated with) breast cancer survival outcomes in a subject. Such patterns may be determined by the methods of the invention by use of a number of reference cell or tissue samples, such as those reviewed by a pathologist of ordinary skill in the pathology of breast cancer, which reflect breast cancer cells as opposed to normal or other non-cancerous cells. Because the overall gene expression profile differs from person to person, cancer to cancer, and cancer cell to cancer cell, correlations between certain cells and overexpressed genes may be made as disclosed herein to identify genes that are capable of discriminating between breast cancer survival outcomes.


The present invention may be practiced with any number of the genes believed, or likely to be, differentially expressed with respect to breast cancer survival outcomes. The identification may be made by using expression profiles of various homogenous breast cancer cell populations, which were isolated by microdissection, such as, but not limited to, laser capture microdissection (LCM) of 100-1000 cells. The expression level of each gene of the expression profile may be correlated with a particular survival outcome. Alternatively, the expression levels of multiple genes may be clustered to identify correlations with particular survival outcomes.


Genes with significant correlations to breast cancer survival outcomes may be used to generate models of gene expressions that would maximally discriminate between survival outcomes. Alternatively, genes with significant correlations may be used in combination with genes with lower correlations without significant loss of ability to discriminate between survival outcomes. Such models may be generated by any appropriate means recognized in the art, including, but not limited to, cluster analysis, support vector machines, neural networks or other algorithms known in the art. The models are capable of predicting the classification of an unknown sample based upon the expression of the genes used for discrimination in the models. “Leave one out” cross-validation may be used to test the performance of various models and to help identify weights (genes) that are uninformative or detrimental to the predictive ability of the models. Cross-validation may also be used to identify genes that enhance the predictive ability of the models.
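By way of illustration only, “leave one out” cross-validation may be sketched as follows. The nearest-centroid classifier, the function names, and the two-gene data set are assumptions chosen for brevity and are not the models of the invention; any of the algorithms named above could be substituted for the classifier.

```python
import math


def centroid(profiles):
    """Mean expression per gene across a list of equal-length profiles."""
    n = len(profiles)
    return [sum(p[g] for p in profiles) / n for g in range(len(profiles[0]))]


def predict(train, sample):
    """Assign the sample to the class whose centroid is nearest (Euclidean distance)."""
    by_label = {}
    for profile, label in train:
        by_label.setdefault(label, []).append(profile)
    best_label, best_dist = None, math.inf
    for label, profiles in sorted(by_label.items()):
        d = math.dist(centroid(profiles), sample)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label


def leave_one_out_accuracy(data, genes):
    """Hold out each sample in turn, train on the rest, and score the prediction."""
    hits = 0
    for i, (profile, label) in enumerate(data):
        train = [([p[g] for g in genes], l)
                 for j, (p, l) in enumerate(data) if j != i]
        hits += predict(train, [profile[g] for g in genes]) == label
    return hits / len(data)


# Hypothetical log-ratio profiles: gene 0 separates the outcomes, gene 1 does not.
data = [
    ([2.1, 0.3], "good"), ([1.8, -0.9], "good"), ([2.4, 1.1], "good"),
    ([-1.9, 0.8], "poor"), ([-2.2, -0.4], "poor"), ([-1.7, 0.2], "poor"),
]
print(leave_one_out_accuracy(data, genes=[0]))     # informative gene only
print(leave_one_out_accuracy(data, genes=[1]))     # uninformative gene only
print(leave_one_out_accuracy(data, genes=[0, 1]))  # both genes
```

Comparing the accuracy obtained with and without a given gene is one simple way to flag genes that do not contribute to, or that detract from, the predictive ability of a model.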


The gene(s) identified as correlated with particular breast cancer survival outcomes by the above models provide the ability to focus gene expression analysis to only those genes that contribute to the ability to identify a subject as likely to have a particular survival outcome relative to another. The expression of other genes in a breast cancer cell would be relatively unable to provide information concerning, and thus assist in the discrimination of, breast cancer survival outcome.


As will be appreciated by those skilled in the art, the models are highly useful with even a small set of reference gene expression data and can become increasingly accurate with the inclusion of more reference data although the incremental increase in accuracy will likely diminish with each additional datum. The preparation of additional reference gene expression data using genes identified and disclosed herein for discriminating between different survival outcomes in breast cancer is routine and may be readily performed by the skilled artisan to permit the generation of models as described above to predict the status of an unknown sample based upon the expression levels of those genes.


To determine the (increased or decreased) expression levels of genes in the practice of the present invention, any method known in the art may be utilized. In one preferred embodiment of the invention, expression based on detection of RNA which hybridizes to the genes identified and disclosed herein is used. This is readily performed by any RNA detection or amplification+detection method known or recognized as equivalent in the art such as, but not limited to, reverse transcription-PCR, the methods disclosed in U.S. patent application Ser. No. 10/062,857 (filed on Oct. 25, 2001) as well as U.S. Provisional Patent Applications No. 60/298,847 (filed Jun. 15, 2001) and 60/257,801 (filed Dec. 22, 2000), and methods to detect the presence, or absence, of RNA stabilizing or destabilizing sequences.


Alternatively, expression based on detection of DNA status may be used. Detection of the DNA of an identified gene as methylated or deleted may be used for genes that have decreased expression in correlation with survival outcomes. This may be readily performed by PCR based methods known in the art, including, but not limited to, quantitative PCR (Q-PCR). Conversely, detection of the DNA of an identified gene as amplified may be used for genes that have increased expression in correlation with survival outcomes. This may be readily performed by PCR based, fluorescent in situ hybridization (FISH) and chromosome in situ hybridization (CISH) methods known in the art.


Expression based on detection of a presence, increase, or decrease in protein levels or activity may also be used. Detection may be performed by any immunohistochemistry (IHC) based, blood based (especially for secreted proteins), antibody (including autoantibodies against the protein) based, exfoliated cell (from the cancer) based, mass spectroscopy based, and image (including use of labeled ligand) based method known in the art and recognized as appropriate for the detection of the protein. Antibody and image based methods are additionally useful for the localization of tumors after determination of cancer by use of cells obtained by a non-invasive procedure (such as ductal lavage or fine needle aspiration), where the source of the cancerous cells is not known. A labeled antibody or ligand may be used to localize the carcinoma(s) within a patient.


A preferred embodiment using a nucleic acid based assay to determine expression is by immobilization of one or more sequences of the genes identified herein on a solid support, including, but not limited to, a solid substrate as an array or to beads or bead based technology as known in the art. Alternatively, solution based expression assays known in the art may also be used. The immobilized gene(s) may be in the form of polynucleotides that are unique or otherwise specific to the gene(s) such that the polynucleotide would be capable of hybridizing to a DNA or RNA corresponding to the gene(s). These polynucleotides may be the full length of the gene(s) or be short sequences of the genes (up to one nucleotide shorter than the full length sequence known in the art by deletion from the 5′ or 3′ end of the sequence) that are optionally minimally interrupted (such as by mismatches or inserted non-complementary basepairs) such that hybridization with a DNA or RNA corresponding to the gene(s) is not affected. Preferably, the polynucleotides used are from the 3′ end of the gene, such as within about 350, about 300, about 250, about 200, about 150, about 100, or about 50 nucleotides from the polyadenylation signal or polyadenylation site of a gene or expressed sequence. Polynucleotides containing mutations relative to the sequences of the disclosed genes may also be used so long as the presence of the mutations still allows hybridization to produce a detectable signal.


Alternatively, amplification of such sequences from the 3′ end of genes by methods such as quantitative PCR may be used to determine the expression levels of the sequences. The Ct values generated by such methods may be used as indicators of expression levels.
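By way of illustration only, Ct values may be converted to relative expression levels as in the following minimal sketch. The comparative Ct calculation shown here, the use of a reference gene for normalization, the assumption of a doubling of product per cycle, and all names and values are assumptions for illustration and are not specified by the present disclosure.

```python
def relative_expression(ct_target: float, ct_reference: float) -> float:
    """2^-(delta Ct): target abundance relative to a reference gene,
    assuming the amount of product doubles with each cycle."""
    return 2.0 ** -(ct_target - ct_reference)


def fold_change(ct_target_sample, ct_ref_sample, ct_target_control, ct_ref_control):
    """2^-(delta delta Ct): expression in a sample relative to a control sample."""
    return (relative_expression(ct_target_sample, ct_ref_sample)
            / relative_expression(ct_target_control, ct_ref_control))


# Hypothetical Ct values; a lower Ct indicates a more abundant template.
print(fold_change(24.0, 18.0, 26.0, 18.0))  # 4.0
```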


The immobilized gene(s) may be used to determine the state of nucleic acid samples prepared from sample breast cell(s) for which the survival outcome of the sample's subject (e.g. patient from whom the sample is obtained) is not known or for confirmation of an outcome that is already assigned to the sample's subject. Without limiting the invention, such a cell may be from a patient suspected of being afflicted with, or at risk of developing, breast cancer. The immobilized polynucleotide(s) need only be sufficient to specifically hybridize to the corresponding nucleic acid molecules derived from the sample. While even a single correlated gene sequence may be able to provide adequate accuracy in discriminating between two breast cancer survival outcomes, two or more, three or more, four or more, five or more, six or more, seven or more, eight or more, nine or more, ten or more, or eleven or more of the genes identified herein may be used in combination, as a subset capable of discriminating, to increase the accuracy of the method. The invention specifically contemplates the selection of more than one, two or more, three or more, four or more, five or more, six or more, seven or more, eight or more, nine or more, ten or more, or eleven or more of the genes disclosed in the tables and figures herein for use as a subset in the identification of breast cancer survival outcome.


Of course 15 or more, 20 or more, 30 or more, 40 or more, 50 or more, 60 or more, 70 or more, 80 or more, 90 or more, 100 or more, 150 or more, 200 or more, 250 or more, 300 or more, 350 or more, 400 or more, 450 or more, 500 or more, 600 or more, 700 or more, 800 or more, 900 or more, 1000 or more, 1100 or more, 1200 or more, or all the genes provided in Tables 2, 3, and/or 4 below may be used. “CloneID” as used in the context of the Tables herein as well as the present invention refers to the IMAGE Consortium clone ID number of each gene, the sequences of which are hereby incorporated by reference in their entireties as they are available from the Consortium at image.llnl.gov/ as accessed on the filing date of the present application. “GeneID” as used in the context of the Tables herein as well as the present invention refers to the GenBank accession number of a sequence of each gene, the sequences of which are hereby incorporated by reference in their entireties as they are available from GenBank as accessed on the filing date of the present application.


P value refers to values assigned as described in the Example below. The indication “E-xx”, where “xx” is a two digit number, refers to an alternative notation for exponential figures in which “E-xx” is “10^-xx”. Thus, in combination with the number to the left of “E-xx”, the value being represented is the number to the left times 10^-xx. Chromosome Location refers to the human chromosome to which the gene has been assigned, and Description provides a brief identifier of what the gene encodes.


The invention may also be practiced with all or a portion of the gene sequences disclosed in Tables 6, 7, 8, and 9 herein. The gene sequences of each of these tables define one of four breast cancer subtypes based upon increased expression in correlation with particular survival outcomes as shown in FIG. 3. Therefore, the increased expression of sequences of 2 or more, 4 or more, 6 or more, 8 or more, 10 or more, 12 or more, 14 or more, 16 or more, 18 or more, 20 or more, 22 or more, 24 or more, 26 or more, 28 or more, 30 or more, 32 or more, 34 or more, 36 or more, 38 or more, 40 or more, 42 or more, 44 or more, 46 or more, 48 or more, or all 50 genes in each table can be used in the practice of the invention as indicators of a breast cancer survival outcome. Of course sequences of the 25 possible odd numbers of these genes may also be used.


Genes with a correlation identified by a p value below or about 0.02, below or about 0.01, below or about 0.005, below or about 0.001, below or about 1×10^-4, below or about 1×10^-5, below or about 1×10^-6, below or about 1×10^-7, below or about 1×10^-8, below or about 1×10^-9, below or about 1×10^-10, below or about 1×10^-11, below or about 1×10^-12, below or about 1×10^-13, below or about 1×10^-14, below or about 1×10^-15, below or about 1×10^-16, below or about 1×10^-17, below or about 1×10^-18, below or about 1×10^-19, or about 1×10^-20 are preferred for use in the practice of the invention. The present invention includes the use of genes that identify different ERα (estrogen receptor alpha) positive subtypes and breast cancer recurrence/metastases together to permit simultaneous identification of breast cancer survival outcome of a patient based upon assaying a breast cancer sample from said patient.
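By way of illustration only, genes may be selected by p value threshold as in the following minimal sketch, which also shows that p values written in the “E-xx” notation of the Tables parse directly as numbers. The gene identifiers and p values shown are hypothetical and are not taken from the Tables.

```python
def select_genes(p_values: dict, threshold: float) -> list:
    """Return gene identifiers whose p value is at or below the threshold,
    ordered from the smallest p value to the largest."""
    parsed = {gene: float(p) for gene, p in p_values.items()}
    kept = [gene for gene, p in parsed.items() if p <= threshold]
    return sorted(kept, key=lambda gene: parsed[gene])


# Hypothetical identifiers with p values written in "E-xx" notation.
table = {"gene_A": "3.2E-08", "gene_B": "4.1E-02", "gene_C": "7.5E-04"}
print(select_genes(table, 1e-3))  # ['gene_A', 'gene_C']
```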


In some embodiments of the invention, the genes used will not include HRAS-like suppressor (UNIGENE ID Hs.36761; CloneID 950667; GenBank accession # NM_020386; and GeneSymbol HRASLS) and/or origin recognition complex, subunit 6 (yeast homolog)-like, (UNIGENE ID Hs.49760; CloneID 306318; GenBank accession # NM_014321; and GeneSymbol ORC6L) as disclosed by van't Veer et al. (supra).


In embodiments where only one or a few genes are to be analyzed, the nucleic acid derived from the sample breast cancer cell(s) may be preferentially amplified by use of appropriate primers such that only the genes to be analyzed are amplified to reduce contaminating background signals from other genes expressed in the breast cell. Alternatively, and where multiple genes are to be analyzed or where very few cells (or one cell) are used, the nucleic acid from the sample may be globally amplified before hybridization to the immobilized polynucleotides. Of course RNA, or the cDNA counterpart thereof, may be directly labeled and used, without amplification, by methods known in the art.


The invention is preferably practiced with unique sequences present within the gene sequences disclosed herein. The uniqueness of a disclosed gene sequence refers to the portions or entireties of the sequences which are found in each gene to the exclusion of other genes. Such unique sequences include those found at the 3′ untranslated portion of the genes. Preferred unique sequences for the practice of the invention are those which contribute to the consensus sequences for each gene such that the unique sequences will be useful in detecting expression in a variety of individuals rather than being specific for a polymorphism present in some individuals. Alternatively, sequences unique to an individual or a subpopulation may be used. The unique sequences are preferably of the lengths of polynucleotides of the invention as discussed herein.


In particularly preferred embodiments of the invention, polynucleotides having sequences present in the 3′ untranslated and/or non-coding regions of the disclosed gene sequences are used to detect expression levels in breast cells. Such polynucleotides may optionally contain sequences found in the 3′ portions of the coding regions of the disclosed sequences. Polynucleotides containing a combination of sequences from the coding and 3′ non-coding regions preferably have the sequences arranged contiguously, with no intervening heterologous sequence(s).


Alternatively, the invention may be practiced with polynucleotides having sequences present in the 5′ untranslated and/or non-coding regions of gene sequences in breast cells to detect their levels of expression. Such polynucleotides may optionally contain sequences found in the 5′ portions of the coding regions. Polynucleotides containing a combination of sequences from the coding and 5′ non-coding regions preferably have the sequences arranged contiguously, with no intervening heterologous sequence(s). The invention may also be practiced with sequences present in the coding regions of disclosed sequences.


Preferred polynucleotides contain sequences from 3′ or 5′ untranslated and/or non-coding regions of at least about 16, at least about 18, at least about 20, at least about 22, at least about 24, at least about 26, at least about 28, at least about 30, at least about 32, at least about 34, at least about 36, at least about 38, at least about 40, at least about 42, at least about 44, or at least about 46 consecutive nucleotides. The term “about” as used in the previous sentence refers to an increase or decrease of 1 from the stated numerical value. Even more preferred are polynucleotides containing sequences of at least or about 50, at least or about 100, at least or about 150, at least or about 200, at least or about 250, at least or about 300, at least or about 350, or at least or about 400 consecutive nucleotides. The term “about” as used in the preceding sentence refers to an increase or decrease of 10% from the stated numerical value.


Sequences from the 3′ or 5′ end of the above described coding regions as found in polynucleotides of the invention are of the same lengths as those described above, except that they would naturally be limited by the length of the coding region. The 3′ end of a coding region may include sequences up to the 3′ half of the coding region. Conversely, the 5′ end of a coding region may include sequences up to the 5′ half of the coding region. Of course the above described sequences, or the coding regions and polynucleotides containing portions thereof, may be used in their entireties.


Polynucleotides combining the sequences from a 3′ untranslated and/or non-coding region and the associated 3′ end of the coding region are preferably at least or about 100, at least or about 150, at least or about 200, at least or about 250, at least or about 300, at least or about 350, or at least or about 400 consecutive nucleotides. Preferably, the polynucleotides used are from the 3′ end of the gene, such as within about 350, about 300, about 250, about 200, about 150, about 100, or about 50 nucleotides from the polyadenylation signal or polyadenylation site of a gene or expressed sequence. Polynucleotides containing mutations relative to the sequences of the disclosed genes may also be used so long as the presence of the mutations still allows hybridization to produce a detectable signal.


In another embodiment of the invention, polynucleotides containing deletions of nucleotides from the 5′ and/or 3′ end of the above disclosed sequences may be used. The deletions are preferably of 1-5, 5-10, 10-15, 15-20, 20-25, 25-30, 30-35, 35-40, 40-45, 45-50, 50-60, 60-70, 70-80, 80-90, 90-100, 100-125, 125-150, 150-175, or 175-200 nucleotides from the 5′ and/or 3′ end, although the extent of the deletions would naturally be limited by the length of the disclosed sequences and the need to be able to use the polynucleotides for the detection of expression levels.


Other polynucleotides of the invention from the 3′ end of the above disclosed sequences include those of primers and optional probes for quantitative PCR. Preferably, the primers and probes are those which amplify a region less than about 350, less than about 300, less than about 250, less than about 200, less than about 150, less than about 100, or less than about 50 nucleotides from the polyadenylation signal or polyadenylation site of a gene or expressed sequence.


In yet another embodiment of the invention, polynucleotides containing portions of the above disclosed sequences including the 3′ end may be used in the practice of the invention. Such polynucleotides would contain at least or about 50, at least or about 100, at least or about 150, at least or about 200, at least or about 250, at least or about 300, at least or about 350, or at least or about 400 consecutive nucleotides from the 3′ end of the disclosed sequences.


The above assay embodiments may be used in a number of different ways to identify or detect the breast cancer stage and/or grade, if any, of a breast cancer cell sample from a patient as well as the likely survival outcome of said patient. In many cases, this would reflect a secondary screen for the patient, who may have already undergone mammography or physical exam as a primary screen. If positive, the subsequent needle biopsy, ductal lavage, fine needle aspiration, or other analogous methods may provide the sample for use in the above assay embodiments. The present invention is particularly useful in combination with non-invasive protocols, such as ductal lavage or fine needle aspiration, to prepare a breast cell sample.


The present invention provides a more objective set of criteria, in the form of gene expression profiles of a discrete set of genes, to discriminate (or delineate) between breast cancer survival outcomes. In particularly preferred embodiments of the invention, the assays are used to discriminate between good and poor outcomes within 5, or about 5, years after surgical intervention to remove breast cancer tumors or within about 95 months after surgical intervention to remove breast cancer tumors. Comparisons that discriminate between outcomes after about 10, about 20, about 30, about 40, about 50, about 60, about 70, about 80, about 90, about 100, about 110, about 120, about 130, about 140, or about 150 months may also be performed.


While good and poor survival outcomes may be defined relatively in comparison to each other, a “good” outcome may be viewed as a better than 50% survival rate after about 60 months post surgical intervention to remove breast cancer tumor(s). A “good” outcome may also be a better than about 60%, about 70%, about 80% or about 90% survival rate after about 60 months post surgical intervention. A “poor” outcome may be viewed as an about 60% or less, or about 50% or less, survival rate after about 40 or about 50 or about 60 months post surgical intervention to remove breast cancer tumor(s). A “poor” outcome may also be about a 70% or less survival rate after about 40 months, or about an 80% or less survival rate after about 20 months, post surgical intervention.


In one embodiment of the invention, the isolation and analysis of a breast cancer cell sample may be performed as follows:

    • (1) Ductal lavage or other non-invasive procedure is performed on a patient to obtain a sample.
    • (2) Sample is prepared and coated onto a microscope slide. Note that ductal lavage results in clusters of cells that are cytologically examined as stated above.
    • (3) Pathologist or image analysis software scans the sample for the presence of non-normal and/or atypical cells.
    • (4) If non-normal and/or atypical cells are observed, those cells are harvested (e.g. by microdissection such as LCM).
    • (5) RNA is extracted from the harvested cells.
    • (6) RNA is purified, amplified, and labeled.
    • (7) Labeled nucleic acid is contacted with a microarray containing polynucleotides of the genes identified herein as correlated to discriminations between breast cancer survival outcomes under hybridization conditions, then processed and scanned to obtain a pattern of intensities of each spot (relative to a control for general gene expression in cells) which determine the level of expression of the gene(s) in the cells.
    • (8) The pattern of intensities is analyzed by comparison to the expression patterns of the genes in known samples of breast cancer cells correlated with survival outcomes (relative to the same control).
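By way of illustration only, the conversion of spot intensities into expression levels relative to a control, as in step (7), may be sketched as follows. The use of a log2 ratio, the intensity floor, and all names and values are assumptions for illustration.

```python
import math


def log_ratios(sample_intensity: dict, control_intensity: dict,
               floor: float = 1.0) -> dict:
    """log2(sample / control) per gene; intensities are floored to avoid log of zero."""
    return {
        gene: math.log2(max(sample_intensity[gene], floor)
                        / max(control_intensity[gene], floor))
        for gene in sample_intensity
        if gene in control_intensity
    }


# Hypothetical background-subtracted spot intensities.
sample = {"gene_A": 800.0, "gene_B": 50.0, "gene_C": 300.0}
control = {"gene_A": 200.0, "gene_B": 400.0, "gene_C": 300.0}
print(log_ratios(sample, control))
# {'gene_A': 2.0, 'gene_B': -3.0, 'gene_C': 0.0}
```

Positive values indicate higher expression in the sample than in the control, and negative values indicate lower expression. The resulting pattern is what would be compared against the reference patterns in step (8).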


A specific example of the above method would be performing ductal lavage following a primary screen, observing and collecting non-normal and/or atypical cells for analysis. The comparison to known expression patterns, such as that made possible by a model generated by an algorithm (such as, but not limited to, nearest neighbor type analysis, SVM, or neural networks) with reference gene expression data for the different breast cancer survival outcomes, identifies the cells as being correlated with subjects with good outcomes. Another example would be taking a breast tumor removed from a subject after surgical intervention, isolating and preparing breast cancer cells for determination/identification of atypical, non-normal, or cancer cells, and isolating said cells, followed by steps 5 through 8 above.
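By way of illustration only, a nearest neighbor type comparison of a sample pattern against reference patterns may be sketched as follows. The use of Pearson correlation as the similarity measure, the choice of three neighbors, and the reference profiles are assumptions for illustration and do not represent the reference data of the invention.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def classify(sample, references, k=3):
    """Majority vote among the k reference profiles most correlated with the sample."""
    ranked = sorted(references, key=lambda ref: pearson(sample, ref[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical reference log-ratio profiles over four genes.
references = [
    ([1.5, 1.2, -0.8, -1.1], "good"),
    ([1.1, 0.9, -1.2, -0.7], "good"),
    ([1.8, 0.7, -0.5, -1.4], "good"),
    ([-1.3, -0.9, 1.0, 1.4], "poor"),
    ([-0.8, -1.5, 1.3, 0.9], "poor"),
    ([-1.6, -0.6, 0.7, 1.2], "poor"),
]
print(classify([1.2, 1.0, -0.9, -1.0], references))  # good
```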


Alternatively, the sample may permit the collection of both normal as well as cancer cells for analysis. The gene expression patterns for each of these two samples will be compared to each other as well as the model and the normal versus individual comparisons therein based upon the reference data set. This approach can be significantly more powerful than the cancer cells only approach because it utilizes significantly more information from the normal cells and the differences between normal and non-normal or atypical or cancer cells (in both the sample and reference data sets) to determine the likely survival outcome of the patient based on gene expression in the cancer cells from the sample.


With use of the present invention, skilled physicians may prescribe, based on a prognosis determined via non-invasive samples, the treatments that they would have prescribed for a patient who had previously received a diagnosis via a solid tissue biopsy.


The above discussion is also applicable where a palpable lesion is detected followed by fine needle aspiration or needle biopsy of cells from the breast. The cells are plated and reviewed by a pathologist or automated imaging system which selects cells for analysis as described above.


The present invention may also be used, however, with solid tissue biopsies. For example, a solid biopsy may be collected and prepared for visualization followed by determination of expression of one or more genes identified herein to determine the breast cancer survival outcome. One preferred means is by use of in situ hybridization with polynucleotide or protein identifying probe(s) for assaying expression of said gene(s).


In an alternative method, the solid tissue biopsy may be used to extract molecules followed by analysis for expression of one or more gene(s). This provides the possibility of leaving out the need for visualization and collection of only cancer cells or cells suspected of being cancerous. This method may of course be modified such that only cells that have been positively selected are collected and used to extract molecules for analysis. This would require visualization and selection as a prerequisite to gene expression analysis.


In a further modification of the above, both normal cells and cancer cells are collected and used to extract molecules for analysis of gene expression. The approach, benefits and results are as described above using non-invasive sampling.


The genes identified herein may be used to generate a model capable of predicting the breast cancer survival outcomes via an unknown breast cell sample based on the expression of the identified genes in the sample. Such a model may be generated by any of the algorithms described herein or otherwise known in the art as well as those recognized as equivalent in the art using gene(s) (and subsets thereof) disclosed herein for the identification of breast cancer outcomes. The model provides a means for comparing expression profiles of gene(s) of the subset from the sample against the profiles of reference data used to build the model. The model can compare the sample profile against each of the reference profiles or against model defining delineations made based upon the reference profiles. Additionally, relative values from the sample profile may be used in comparison with the model or reference profiles.


In a preferred embodiment of the invention, breast cell samples identified as normal and cancerous from the same subject may be analyzed for their expression profiles of the genes used to generate the model. This provides an advantageous means of identifying survival outcomes based on relative differences from the expression profile of the normal sample. These differences can then be used in comparison to differences between normal and individual cancerous reference data which was also used to generate the model.
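By way of illustration only, the comparison of cancer-versus-normal differences from one subject against corresponding reference differences may be sketched as follows. The subtraction of log2 expression values, the squared-distance comparison, and all names and values are assumptions for illustration.

```python
def difference_profile(cancer: dict, normal: dict) -> dict:
    """Per-gene difference (cancer minus normal) of log2 expression values."""
    return {g: cancer[g] - normal[g] for g in cancer if g in normal}


def squared_distance(a: dict, b: dict) -> float:
    """Sum of squared differences over the genes shared by both profiles."""
    return sum((a[g] - b[g]) ** 2 for g in a.keys() & b.keys())


def closest_outcome(sample_diff: dict, reference_diffs: list) -> str:
    """Outcome label of the reference difference profile nearest to the sample's."""
    nearest = min(reference_diffs, key=lambda ref: squared_distance(sample_diff, ref[0]))
    return nearest[1]


# Hypothetical log2 expression values from matched samples of one subject.
cancer = {"gene_A": 9.0, "gene_B": 5.0}
normal = {"gene_A": 7.0, "gene_B": 6.5}
reference_diffs = [
    ({"gene_A": 1.8, "gene_B": -1.2}, "good"),
    ({"gene_A": -1.5, "gene_B": 2.0}, "poor"),
]
diff = difference_profile(cancer, normal)
print(diff)                                    # {'gene_A': 2.0, 'gene_B': -1.5}
print(closest_outcome(diff, reference_diffs))  # good
```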


The detection of gene expression from the samples may be by use of a single microarray able to assay gene expression from some or all genes disclosed herein for convenience and accuracy.


Other uses of the present invention include providing the ability to identify breast cancer cell samples as correlated with particular breast cancer survival outcomes for further research or study. This provides a particular advantage in many contexts requiring the identification of cells based on objective genetic or molecular criteria.


The materials for use in the methods of the present invention are ideally suited for preparation of kits produced in accordance with well known procedures. The invention thus provides kits comprising agents for the detection of expression of the disclosed genes for identifying breast cancer survival outcomes. Such kits, optionally comprising the agent with an identifying description or label or instructions relating to their use in the methods of the present invention, are provided. Such a kit may comprise containers, each with one or more of the various reagents (typically in concentrated form) utilized in the methods, including, for example, pre-fabricated microarrays, buffers, the appropriate nucleotide triphosphates (e.g., dATP, dCTP, dGTP and dTTP; or rATP, rCTP, rGTP and UTP), reverse transcriptase, DNA polymerase, RNA polymerase, and one or more primer complexes of the present invention (e.g., appropriate length poly(T) or random primers linked to a promoter reactive with the RNA polymerase). A set of instructions will also typically be included.


The methods provided by the present invention may also be automated in whole or in part. All aspects of the present invention may also be practiced such that they consist essentially of a subset of the disclosed genes to the exclusion of material irrelevant to the identification of breast cancer survival outcomes via a cell containing sample.


Having now generally described the invention, the same will be more readily understood through reference to the following examples which are provided by way of illustration, and are not intended to be limiting of the present invention, unless specified.


EXAMPLES
Example I
Materials and Methods

Clinical specimen collection and clinicopathological parameters. A total of 86 patients were expression profiled; 57 of these had clinical follow-up, specifically overall survival. Biomarker status is shown below in Table 1 for all 86 patients.

TABLE 1
Age and biomarker status for the 86 patients subsequently gene expression profiled

Age                            No. of Cases  Percentage
  <45                          12            14%
  45-55                        24            28%
  >55                          50            58%
Estrogen-receptor status
  positive                     41            48%
  negative                     45            52%
Progesterone-receptor status
  positive                     32            37%
  negative                     54            63%
Her2/Neu status
  positive                     16            19%
  intermediate                 23            27%
  negative                     45            54%


Example II
Identification of ER positive subtypes with different survival outcomes

Within the set of 86 patients from Example I, 41 had breast tumors that were ER+ via a biomarker test. Within this set of 41, microdissection was used to obtain breast cancer cells for identification of a molecular signature (i.e., expression of genes) that differentially categorized the ER+ group into two subgroups. This was done by (i) using unsupervised hierarchical clustering to identify two subtypes, followed by (ii) completing a t-test on every gene and (iii) extracting those genes whose differential expression was at an adjusted p < 0.05 (using a false discovery rate procedure).
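Step (iii) above can be illustrated with a short sketch. The text does not name the false discovery rate procedure that was used; the sketch assumes the Benjamini-Hochberg step-up adjustment, which is one common choice. The function names `benjamini_hochberg` and `select_genes`, the gene labels and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def select_genes(p_by_gene, alpha=0.05):
    """Keep genes whose adjusted p-value is below alpha (assumed cutoff 0.05)."""
    genes = list(p_by_gene)
    adjusted = benjamini_hochberg([p_by_gene[g] for g in genes])
    return {g: a for g, a in zip(genes, adjusted) if a < alpha}

# Hypothetical per-gene t-test p-values, for illustration only.
raw = {"gene_1": 0.001, "gene_2": 0.2, "gene_3": 0.03}
selected = select_genes(raw)
print(sorted(selected))
```

The adjustment is applied across all genes tested, so the number of genes passing the cutoff depends on how many genes were on the array as well as on the individual t-test results.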


A total of 864 genes were extracted and are listed in Tables 2 and 3. Using clinical outcome (overall survival), it was determined that these two subtypes (identified as ERa and ERb, or ER positive subtypes a and b) divided the ER+ patients into two different survival curves as shown in FIG. 1. Genes which positively correlate with (are overexpressed in) the ERa subtype are negatively correlated with (are underexpressed in) the ERb subtype. Conversely, genes which positively correlate with (are overexpressed in) the ERb subtype are negatively correlated with (are underexpressed in) the ERa subtype.
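The reciprocal pattern described above suggests one simple way a new ER+ sample might be scored against the two gene lists. The following is a hypothetical sketch, not the classification method of the Examples (which used hierarchical clustering): it assumes mean-centered log expression values, uses an arbitrary threshold of zero, and the function names and expression values are invented. The gene symbols are taken from Table 2 (ATM, HNF3A) and Table 3 (SFN, NTHL1).

```python
def subtype_score(expression, era_genes, erb_genes):
    """Mean expression of ERa-associated genes minus mean of ERb-associated genes."""
    era = [expression[g] for g in era_genes if g in expression]
    erb = [expression[g] for g in erb_genes if g in expression]
    if not era or not erb:
        raise ValueError("sample lacks measurements for one of the gene sets")
    return sum(era) / len(era) - sum(erb) / len(erb)

def call_subtype(expression, era_genes, erb_genes):
    """Assign ERa when the score is positive, otherwise ERb (assumed threshold 0)."""
    score = subtype_score(expression, era_genes, erb_genes)
    return "ERa" if score > 0 else "ERb"

# Small illustrative subsets of the gene lists.
era_genes = ["ATM", "HNF3A"]   # listed in Table 2
erb_genes = ["SFN", "NTHL1"]   # listed in Table 3

# Invented mean-centered log expression values for one sample.
sample = {"ATM": 1.2, "HNF3A": 0.8, "SFN": -0.9, "NTHL1": -0.5}
print(call_subtype(sample, era_genes, erb_genes))
```

Because the two lists move in opposite directions, their difference separates the subtypes more strongly than either list alone; in practice a threshold would need to be calibrated on reference samples.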


It is interesting to note that the ERb subtype has survival similar to that of patients whose tumors were ER negative. As such, one aspect of the invention includes the treatment of patients with breast cancer cells having the ERb subtype in the manner of treating patients with cells having an ER negative phenotype.
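Survival comparisons of this kind are commonly made with survival curves estimated per group. The text does not state how the curves of FIG. 1 were computed; the sketch below assumes a Kaplan-Meier estimate, and the function name `kaplan_meier` and the follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each time with an observed death.

    times  -- follow-up time for each patient
    events -- 1 if death was observed at that time, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve

# Invented follow-up times (arbitrary units) and event indicators for one group.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
print(curve)
```

Computing one such curve per group (for example ERa, ERb and ER negative) allows the groups to be compared on the same axes; a formal comparison would additionally require a test such as the log-rank test, which is not shown here.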

TABLE 2Genes, the expressions of which positively correlate with the ERa subtypeClone_IDP_valueGene_Description5041873.31E−02ESTs, Moderately similar to ALU8_HUMAN ALU SUBFAMILY SX SEQUENCE CONTAMINATION WARNING ENTRY[H. sapiens]717631.78E−02SIP|Siah-interacting protein20485241.67E−02JAK2|Janus kinase 2 (a protein tyrosine kinase)8982421.12E−02SRPR|signal recognition particle receptor (docking protein)17097917.86E−03BAIAP1|BAI1-associated protein 11105784.22E−02ESTs507134.35E−02KIAA1577|KIAA1577 protein5945172.44E−02SFRS6|splicing factor, arginine/serine-rich 6418262.67E−03Homo sapiens cDNA FLJ32064 fis, clone OCBBF10000808146201.83E−02FBP17|formin-binding protein 1711605584.31E−03B-DIOX-II|putative b,b-carotene-9,10-dioxygenase8098794.21E−02FLJ10307|hypothetical protein FLJ103072981342.68E−02FZD1|frizzled (Drosophila) homolog 13255153.20E−03FLJ10980|hypothetical protein FLJ109807823061.30E−02FLJ13110|hypothetical protein FLJ13110485182.11E−02Homo sapiens mRNA for KIAA1888 protein, partial cds16360352.82E−03GASC1|gene amplified in squamous cell carcinoma 11296448.78E−03SSH3BP1|spectrin SH3 domain binding protein 118660684.96E−02ESTs16856423.86E−02PMP2|peripheral myelin protein 23669663.22E−02Homo sapiens cDNA: FLJ21333 fis, clone COL025352819041.22E−02KIAA0349|KIAA0349 protein19260073.81E−02EST8250531.41E−03Homo sapiens mRNA; cDNA DKFZp434J0828 (from clone DKFZp434J0828)3466433.93E−02ESTs16830353.40E−02ESTs7953423.17E−02ESTs1301168.42E−03ESTs3473784.90E−02FLJ12492|hypothetical protein FLJ124924915454.86E−02KIAA0965|KIAA0965 protein8129643.36E−03Homo sapiens mRNA full length insert cDNA clone EUROIMAGE 206896221391522.06E−02PDZ-GEF1|PDZ domain containing guanine nucleotide exchange factor(GEF)15028181.30E−02ARHA|ras homolog gene family, member A16361111.30E−02HNRPU|heterogeneous nuclear ribonucleoprotein U (scaffold attachment factor A)14927804.52E−02ESTs, Weakly similar to I38022 hypothetical protein [H. 
sapiens]8136971.02E−02KIAA0746|KIAA0746 protein8102828.98E−03ITPK1|inositol 1,3,4-triphosphate 5/6 kinase8454544.44E−02Homo sapiens cDNA: FLJ23597 fis, clone LNG152811546573.22E−02Homo sapiens cDNA: FLJ21286 fis, clone COL019152930631.33E−02POLR2B|polymerase (RNA) II (DNA directed) polypeptide B (140 kD)7539731.98E−02NFAT5|nuclear factor of activated T-cells 5, tonicity-responsive9694951.30E−02TIGA1|TIGA17866052.18E−02APG-1|heat shock protein (hsp110 family)4178844.91E−03Homo sapiens cDNA FLJ12052 fis, clone HEMBB1002042, moderately similar to CYTOCHROME P450 4C1 (EC1.14.14.1)3256061.97E−02EST2012827.98E−03DKFZP434N126|DKFZP434N126 protein7735026.44E−03ESTs, Weakly similar to S65824 reverse transcriptase homolog [H. sapiens]8129751.42E−02KIAA0172|KIAA0172 protein1627531.29E−02DD5|progestin induced protein7124601.49E−03NKTR|natural killer-tumor recognition sequence3598361.51E−03FLJ10726|hypothetical protein FLJ107268456093.48E−02LOC90701|similar to signal peptidase complex (18 kD)2516981.02E−02FBXW1B|f-box and WD-40 domain protein 1B1369543.58E−02ESTs, Weakly similar to YEX0_YEAST HYPOTHETICAL 64.8 KDA PROTEIN IN GDI1-COX15 INTERGENIC REGION[S. cerevisiae]2834533.98E−02Homo sapiens cDNA FLJ11458 fis, clone HEMBA10015572674193.17E−02ESTs1408374.30E−02CLPX|ClpX (caseinolytic protease X, E. 
coli) homolog7539874.05E−02ADPRTL1|ADP-ribosyltransferase (NAD+; poly (ADP-ribose) polymerase)-like 18250764.96E−03APT6M8-9|ATPase, H+ transporting, lysosomal (vacuolar proton pump) membrane sector associated protein M8-98138541.28E−03PURA|purine-rich element binding protein A8120424.03E−02TSC1|tuberous sclerosis 14915653.64E−02CITED2|Cbp/p300-interacting transactivator, with Glu/Asp-rich carboxy-terminal domain, 27823314.50E−04ESTs4152882.73E−02SRP46|Splicing factor, arginine/serine-rich, 46 kD1490582.27E−02Homo sapiens cDNA FLJ10174 fis, clone HEMBA10039592877453.16E−02Homo sapiens cDNA FLJ30482 fis, clone BRAWH2000034, moderately similar to TRP-185 protein8976252.69E−02KIAA0532|KIAA0532 protein7573373.93E−02ESTs7733753.39E−02EST2842612.17E−02MDS030|uncharacterized hematopoietic stem/progenitor cells protein MDS0308430084.67E−02GC20|translation factor sui1 homolog14611204.06E−02DLEU2|deleted in lymphocytic leukemia, 219332553.24E−02DNAJA4|DnaJ (Hsp40) homolog, subfamily A, member 4506854.17E−02KIAA1414|KIAA1414 protein8243542.44E−02GRLF1|glucocorticoid receptor DNA binding factor 12592673.39E−02Homo sapiens mRNA; cDNA DKFZp586N2424 (from clone DKFZp586N2424)3610483.59E−02p100|EBNA-2 co-activator (100 kD)2798003.82E−02SLMAP|sarcolemma associated protein16035834.70E−02SH3BGRL|SH3 domain binding glutamic acid-rich protein like15585612.71E−02ATRN|attractin1353032.91E−04HT007|uncharacterized hypothalamus protein HT0072876831.12E−02KIAA1387|KIAA1387 protein8446808.98E−03TRD@|T cell receptor delta locus2796652.65E−02PDX1|Pyruvate dehydrogenase complex, lipoyl-containing component X; E3-binding protein530921.82E−02KIAA0436|putative L-type neutral amino acid transporter3766978.98E−03Homo sapiens cDNA FLJ30060 fis, clone ADRGL20000971264134.52E−02ITIH2|inter-alpha (globulin) inhibitor, H2 polypeptide2682343.74E−02DMXL1|Dmx-like 13635903.47E−02ARNT2|aryl hydrocarbon receptor nuclear translocator 28146732.44E−02DKFZP547E2110|DKFZP547E2110 protein2682403.67E−02FXC1|fracture 
callus 1 (rat) homolog3469022.06E−03Homo sapiens cDNA: FLJ21985 fis, clone HEP06226468963.10E−02PRO1331|hypothetical protein PRO13318252404.61E−02ESTs, Weakly similar to SFRB_HUMAN SPLICING FACTOR ARGININE/SERINE-RICH 11 [H. sapiens]428274.97E−02Homo sapiens cDNA FLJ31604 fis, clone NT2RI20026991385892.24E−04Homo sapiens clone 24538 mRNA sequence7970621.14E−02ESTs15878632.88E−02ACAA1|acetyl-Coenzyme A acyltransferase 1 (peroxisomal 3-oxoacyl-Coenzyme A thiolase)8412872.44E−02GNPAT|glyceronephosphate O-acyltransferase7425811.67E−02Homo sapiens cDNA FLJ10366 fis, clone NT2RM20014208235744.90E−02Homo sapiens cDNA FLJ33111 fis, clone TRACH20010853433523.45E−02KIAA1134|KIAA1134 protein20136331.33E−02STAM|signal transducing adaptor molecule (SH3 domain and ITAM motif) 12614922.69E−02LCHN|LCHN protein7126412.35E−02TPR|translocated promoter region (to activated MET oncogene)1996373.82E−02Homo sapiens cDNA FLJ31102 fis, clone IMR3220000106242914.07E−02GHITM|growth hormone inducible transmembrane protein1345253.82E−02CUL3|cullin 31418153.93E−02JAG1|jagged 1 (Alagille syndrome)1619983.97E−02FLJ23138|hypothetical protein FLJ231383450323.67E−02ESTs17121483.86E−02RNU17D|RNA, U17D small nucleolar2801541.77E−02SYNJ2|synaptojanin 28149062.91E−02KIAA0648|KIAA0648 protein7689402.28E−02KIAA0874|KIAA0874 protein8121531.60E−03FLJ13081|hypothetical protein FLJ130814909454.45E−04ESTs8121552.18E−02RABGGTB|Rab geranylgeranyltransferase, beta subunit7417953.22E−02RALGPS1A|Ral guanine nucleotide exchange factor RalGPS1A7680082.11E−02BAG2|BCL2-associated athanogene 27583182.55E−02FBXO3|F-box only protein 37533001.66E−03DKFZp586F1019|DKFZp586F1019 protein8390941.18E−02CRYBA1|crystallin, beta A17540332.07E−02LZTFL1|leucine zipper transcription factor-like 18975951.16E−02CBFA2T2|core-binding factor, runt domain, alpha subunit 2; translocated to, 27267033.48E−02Homo sapiens clone 23736 mRNA sequence16312382.28E−02KIAA1483|KIAA1483 protein8123001.36E−02FLJ20265|hypothetical protein 
FLJ202657882642.82E−02DPAGT1|dolichyl-phosphate (UDP-N-acetylglucosamine) N-acetylglucosaminephosphotransferase 1 (GlcNAc-1-Ptransferase)842292.97E−02GK003|GK003 protein1205614.28E−02KIDINS220|likely homolog of rat kinase D-interacting substance of 220 kDa7865922.72E−02ZNF265|zinc finger protein 26518841352.82E−02ESTs7313182.82E−02KIAA0981|KIAA0981 protein7005004.96E−03PCTK2|PCTAIRE protein kinase 23581512.73E−02ZNF33A|zinc finger protein 33a (KOX 31)8976701.90E−02Human transposon-like element mRNA7540402.02E−02Homo sapiens cDNA FLJ31626 fis, clone NT2RI2003317532762.06E−03Homo sapiens clone 24538 mRNA sequence4544591.93E−02Homo sapiens clone 23870 mRNA sequence15359531.44E−02ESTs2667471.07E−02Homo sapiens, Similar to RIKEN cDNA 2010001O09 gene, clone MGC: 21387 IMAGE: 4471592, mRNA, complete cds15846237.56E−03CCNC|cyclin C7265718.61E−03SMBP|SM-11044 binding protein15829568.15E−03DKFZP434O1427|hypothetical protein DKFZp434O14277574621.50E−02E2IG5|hypothetical protein, estradiol-induced17076373.71E−02ESTs8158004.87E−03FLJ21343|hypothetical protein FLJ213438253502.91E−04KIAA1040|KIAA1040 protein8406641.72E−02EST508877.87E−03RALGDS|ral guanine nucleotide dissociation stimulator5039144.28E−02KIAA1311|KIAA1311 protein8846574.35E−02TIMM8B|translocase of inner mitochondrial membrane 8 (yeast) homolog B4691729.21E−03SEC22C|vesicle trafficking protein6855162.70E−02GPCR150|putative G protein-coupled receptor7670913.45E−02Homo sapiens PAC clone RP1-130H16 from 22q12.1-qter3230742.67E−03HMG2L1|high-mobility group protein 2-like 116363491.69E−0215-Sep|15 kDa selenoprotein7534043.35E−02KIAA0887|KIAA0887 protein2919081.77E−02CTNND1|catenin (cadherin-associated protein), delta 116947752.60E−02EST10303494.78E−02DFFB|DNA fragmentation factor, 40 kD, beta polypeptide (caspase-activated DNase)348522.19E−02BIRC2|baculoviral IAP repeat-containing 22771852.88E−02PRO0461|PRO0461 protein2106103.88E−03CEP1|centrosomal protein 12771871.66E−02MKP-7|MAPK 
phosphatase-78253634.70E−02ESTs495628.44E−03KIAA0171|KIAA0171 gene product7671703.88E−02LOC51606|CGI-11 protein7840854.31E−03TUSP|tubby super-family protein16509341.78E−02Homo sapiens cDNA FLJ11472 fis, clone HEMBA100171110303513.48E−03SCYB11|small inducible cytokine subfamily B (Cys-X-Cys), member 117014021.50E−03Crk|v-crk avian sarcoma virus CT10 oncogene homolog20624293.41E−02PRO2730|hypothetical protein PRO2730284444.45E−04CRSP2|cofactor required for Sp1 transcriptional activation, subunit 2 (150 kD)1970772.86E−02GOLPH3|golgi phosphoprotein 3 (coat-protein)8262452.95E−02LOC54505|hypothetical protein15862511.80E−02LOC51030|CGI-148 protein8414851.17E−02Homo sapiens cDNA FLJ31058 fis, clone HSYRA20008287525473.01E−04Homo sapiens mRNA; cDNA DKFZp586G1520 (from clone DKFZp586G1520)5110124.21E−02AGPS|alkylglycerone phosphate synthase682252.68E−02Homo sapiens pTM5 mariner-like transposon mRNA, partial sequence1214703.30E−02BCCIP|BRCA2 and CDKN1A-interacting protein3605394.39E−02PPP3CB|protein phosphatase 3 (formerly 2B), catalytic subunit, beta isoform (calcineurin A beta)7827004.89E−02CLASP2|CLIP-associating protein 2800504.43E−02FLJ23153|likely ortholog of mouse tumor necrosis-alpha-induced adipose-related protein3435551.97E−02Homo sapiens mRNA; cDNA DKFZp586D0923 (from clone DKFZp586D0923)108425-24.52E−02ESTs, Weakly similar to JC5314 CDC28/cdc2-like kinase associating arginine-serine cyclophilin [H. sapiens]2897162.24E−04Homo sapiens mRNA; cDNA DKFZp566P1124 (from clone DKFZp566P1124)16885102.81E−02Homo sapiens CLK4 mRNA, complete cds16363608.61E−03FLJ14957|hypothetical protein FLJ149577136471.77E−02TSPAN-3|tetraspan 31363243.35E−02Homo sapiens PAK2 mRNA, complete cds518514.88E−03ESTs, Weakly similar to I78885 serine/threonine-specific protein kinase [H. sapiens]8979262.51E−03Homo sapiens clone FLB5227 PRO1367 mRNA, complete cds5883682.68E−02KIAA0947|KIAA0947 protein291852.82E−02ULK2|unc-51 (C. 
elegans)-like kinase 28254513.08E−02P115|vesicle docking protein p1151955573.08E−02FLJ10842|hypothetical protein FLJ1084214998644.31E−02ESTs2546252.11E−02KIAA0229|KIAA0229 protein14354812.07E−02Homo sapiens mRNA; cDNA DKFZp586G2222 (from clone DKFZp586G2222)19117063.17E−02GA|breast cell glutaminase7956773.36E−02Homo sapiens cDNA: FLJ21314 fis, clone COL022483435661.97E−02FLJ23342|hypothetical protein FLJ233425648479.88E−03Homo sapiens cDNA FLJ30861 fis, clone FEBRA20035413225113.35E−02Homo sapiens mRNA; cDNA DKFZp564D1462 (from clone DKFZp564D1462)15563223.36E−03EST7680642.23E−02CYP1A1|cytochrome P450, subfamily|(aromatic compound-inducible), polypeptide 13583441.04E−02KIAA0244|KIAA0244 protein15562594.47E−02ALAD|aminolevulinate, delta-, dehydratase7534301.14E−02ATRX|alpha thalassemia/mental retardation syndrome X-linked (RAD54 (S. cerevisiae) homolog)6693674.28E−02USP15|ubiquitin specific protease 158094211.76E−02PCBD|6-pyruvoyl-tetrahydropterin synthase/dimerization cofactor of hepatocyte nuclear factor 1 alpha (TCF1)7046974.98E−02HERC3|hect domain and RLD 315513171.91E−02EST7728884.03E−02KIAA1012|KIAA1012 protein8253942.28E−02DJ465N24.2.1|hypothetical protein dJ465N24.2.1739334.09E−02ESTs2618521.77E−02ESTs2415301.28E−03EPHA2|EphA216356501.02E−02KIAA0576|KIAA0576 protein7729622.36E−02Homo sapiens cDNA FLJ31149 fis, clone IMR322001491, moderately similar to Rattus norvegicus tricarboxylatecarrier-like protein mRNA7825876.22E−03UBE4A|ubiquitination factor E4A (homologous to yeast UFD2)8256151.34E−02ESTs8238713.34E−02SPARCL1|SPARC-like 1 (mast9, hevin)7690222.44E−02GNAQ|guanine nucleotide binding protein (G protein), q polypeptide15847551.22E−02ESTs8149837.10E−03FLJ11068|hypothetical protein FLJ110688108434.22E−02BM029|uncharacterized bone marrow protein BM029706061.97E−02ESTs3225371.67E−02Homo sapiens cDNA: FLJ21425 fis, clone COL041622896773.55E−02CG005|hypothetical protein from BCRA2 region7013711.27E−03Homo sapiens mRNA; cDNA DKFZp586I1518 (from clone 
DKFZp586I1518)7453602.35E−02HAT1|histone acetyltransferase 17542553.25E−02ESTs853132.99E−02KIAA1254|KIAA1254 protein1419724.44E−02ITM1|integral membrane protein 17454372.37E−02ESTs2804562.99E−02EST7885551.27E−03DKFZP564I052|DKFZP564I052 protein2025774.55E−02HNMT|histamine N-methyltransferase8131878.91E−03Homo sapiens cDNA: FLJ21264 fis, clone COL015795020969.88E−03Homo sapiens mRNA; cDNA DKFZp761K2024 (from clone DKFZp761K2024)7536023.68E−02FLJ10618|hypothetical protein FLJ106184873012.69E−02FBXL5|f-box and leucine-rich repeat protein 54880331.42E−02DNAJB9|DnaJ (Hsp40) homolog, subfamily B, member 93648652.77E−03FLJ21062|hypothetical protein FLJ210622676912.58E−04FLJ20360|hypothetical protein FLJ203607887056.25E−03USF1|upstream transcription factor 11241381.45E−02NXF1|nuclear RNA export factor 18132611.40E−02Homo sapiens clone 23645 mRNA sequence8564543.01E−04SLC3A2|solute carrier family 3 (activators of dibasic and neutral amino acid transport), member 24708614.57E−02NDUFB6|NADH dehydrogenase (ubiquinone) 1 beta subcomplex, 6 (17 kD, B17)1436613.41E−02NTN4|netrin 46654052.18E−02MYO5C|myosin 5C3031091.27E−03P2Y5|purinergic receptor (family A group 5)14703653.98E−02ST7|suppression of tumorigenicity 72203724.61E−02HS3ST1|heparan sulfate (glucosamine) 3-O-sulfotransferase 18142147.66E−03D8S2298E|reproduction 87967394.09E−02MGC10924|hypothetical protein MGC10924 similar to Nedd4 WW-binding protein 57861099.38E−04ESTs16375041.66E−03EST480331.86E−02ESTs15573184.43E−02ESTs22928073.15E−03ACAT1|acetyl-Coenzyme A acetyltransferase 1 (acetoacetyl Coenzyme A thiolase)10347769.51E−03AD037|AD037 protein2952551.78E−02KIAA0254|KIAA0254 gene product3063802.37E−03MGC4276|hypothetical protein MGC4276 similar to CG819816412452.06E−03LOC51320|hypothetical protein3030432.19E−02ESTs, Weakly similar to G02075 transcription repressor zinc finger protein 85 [H. 
sapiens]7527527.56E−03ESTs3584681.95E−02RNF11|ring finger protein 113631463.46E−02PPP3R1|protein phosphatase 3 (formerly 2B), regulatory subunit B (19 kD), alpha isoform (calcineurin B, type I)846131.67E−02DKFZP564K247|DKFZP564K247 protein15191432.28E−02RISC|likely homolog of rat and mouse retinoid-inducible serine carboxypeptidase8255824.62E−02Homo sapiens mRNA; cDNA DKFZp564O0122 (from clone DKFZp564O0122)7893831.97E−02CREM|cAMP responsive element modulator8134241.41E−02PPID|peptidylprolyl isomerase D (cyclophilin D)229171.89E−02Homo sapiens mRNA; cDNA DKFZp761M0111 (from clone DKFZp761M0111)15938293.51E−02TIA1|TIA1 cytotoxic granule-associated RNA-binding protein15784472.28E−02Homo sapiens cDNA FLJ31866 fis, clone NT2RP70017453622792.60E−02ARHGEF5|Rho guanine nucleotide exchange factor (GEF) 515409493.24E−02EST1551181.78E−02ESTs3217701.15E−02FBP17|formin-binding protein 178548741.30E−02KIAA0212|KIAA0212 gene product439774.70E−03KIAA0182|KIAA0182 protein1363998.91E−03DKFZP586F2423|hypothetical protein DKFZp586F24232299011.97E−02CTSO|cathepsin O7268904.87E−02MGC4643|hypothetical protein MGC46437438761.97E−02MBLR|MBLR protein8094882.82E−02RAI17|retinoic acid induced 1715727102.34E−02FLJ21213|hypothetical protein FLJ212131550502.58E−04MDS025|hypothetical protein MDS0257828511.70E−02FLJ12799|hypothetical protein FLJ1279920115151.98E−02DKFZP586B0923|DKFZP586B0923 protein16022842.60E−02EST7810464.95E−02ERBB2IP|erbb2-interacting protein ERBIN7674772.03E−02ANKRA2|ankyrin repeat, family A (RFXANK-like), 21798042.57E−02PWP2H|PWP2 (periodic tryptophan protein, yeast) homolog3659193.42E−02STAU|staufen (Drosophila, RNA-binding protein)503391.32E−02ESTs, Moderately similar to hypothetical protein [H. 
sapiens]15987871.32E−02FLJ20730|hypothetical protein FLJ2073021030001.74E−02ESTs8409842.53E−02CAV2|caveolin 27887451.77E−02WS-3|novel RGD-containing protein15582121.58E−03ESTs8135183.88E−02ESTs143661-21.36E−02NTN4|netrin 48119184.54E−02KIAA0952|KIAA0952 protein9511253.36E−02PECI|peroxisomal D3,D2-enoyl-CoA isomerase8118491.30E−02MGC5521|hypothetical protein MGC55212987694.52E−02KEO4|similar to Caenorhabditis elegans protein C42C1.98971421.36E−02MAP2K1IP1|mitogen-activated protein kinase kinase 1 interacting protein 17544503.27E−02ARHGEF12|Rho guanine exchange factor (GEF) 122141314.61E−02NIT2|Nit protein 21438464.77E−02LRP2|low density lipoprotein-related protein 220289163.84E−02Homo sapiens mRNA for Hmob33 protein, 3 untranslated region1957862.91E−04EST10487814.46E−02FLJ10140|hypothetical protein FLJ101407862133.97E−02AUH|AU RNA-binding protein/enoyl-Coenzyme A hydratase669312.12E−02FLJ20307|hypothetical protein FLJ20307798984.71E−02TLE1|transducin-like enhancer of split 1, homolog of Drosophila E(sp1)1152921.66E−03DKFZp586C1924|hypothetical protein DKFZp586C19243607786.76E−05ATM|ataxia telangiectasia mutated (includes complementation groups A, C and D)17320333.39E−02FLJ14427|hypothetical protein FLJ144273081633.45E−02ESTs, Weakly similar to TRHY_HUMAN TRICHOHYALI [H. sapiens]9510682.97E−02Homo sapiens, clone IMAGE: 3450973, mRNA3219453.96E−03ESTs8971533.64E−02PTD009|PTD009 protein1501371.40E−02DKFZP564O123|DKFZP564O123 protein6101033.78E−02DKFZP434N1511|hypothetical protein1242612.36E−02SNRP70|small nuclear ribonucleoprotein 70 kD polypeptide (RNP antigen)19265751.34E−02CDX2|caudal type homeo box transcription factor 2773613.57E−02LOC51119|CGI-97 protein7676411.34E−02MAPK8IP2|mitogen-activated protein kinase 8 interacting protein 216105464.45E−04HNF3A|hepatocyte nuclear factor 3, alpha5024462.22E−02DKFZP564A2416|DKFZP564A2416 protein4904491.86E−02RAD50|RAD50 (S. 
cerevisiae) homolog20148882.50E−02SRPUL|sushi-repeat protein1631743.21E−02TCEA1|transcription elongation factor A (SII), 14718632.31E−02Homo sapiens mRNA; cDNA DKFZp586C1817 (from clone DKFZp586C1817)7537438.91E−03IL6ST|interleukin 6 signal transducer (gp130, oncostatin M receptor)7685204.09E−02NCALD|neurocalcin delta15169383.55E−02HM74|putative chemokine receptor; GTP-binding protein8119414.96E−02Homo sapiens cDNA FLJ32130 fis, clone PEBLM2000248, weakly similar to ZINC FINGER PROTEIN 1578119441.41E−02ESTs2988621.27E−03ESTs7309531.36E−02FLJ13171|hypothetical protein FLJ131717708011.20E−02ESTs20106841.85E−02KIAA0640|SWAP-70 protein7121664.91E−02KIAA0855|golgin-675941722.44E−02Homo sapiens, clone MGC: 24302 IMAGE: 3996246, mRNA, complete cds263141.36E−02STXBP3|syntaxin binding protein 31284931.16E−02MLH1|mutL (E. coli) homolog 1 (colon cancer, nonpolyposis type 2)15193411.04E−02KIAA0907|KIAA0907 protein7537542.06E−03ESTs261711.44E−02KIAA0856|KIAA0856 protein16074824.52E−02CEBPG|CCAAT/enhancer binding protein (C/EBP), gamma8143503.80E−02IDE|insulin-degrading enzyme7969461.41E−02CSPG6|chondroitin sulfate proteoglycan 6 (bamacan)3448373.93E−02ESTs8142854.45E−04FLJ11240|hypothetical protein FLJ112401560433.81E−02Homo sapiens cDNA: FLJ21933 fis, clone HEP043371376021.56E−02Homo sapiens mRNA; cDNA DKFZp434G0972 (from clone DKFZp434G0972)3229149.11E−03ACP1|acid phosphatase 1, soluble3668303.22E−02ESTs3579404.24E−03FLJ22643|hypothetical protein FLJ226438980583.68E−02ESTs1324524.87E−02ESTs3439741.87E−02FLJ23445|hypothetical protein FLJ234452930013.20E−03DKFZP434E2318|hypothetical protein DKFZp434E23187820471.93E−02KIAA0268|KIAA0268 protein7677472.73E−02KIAA0999|KIAA0999 protein15582681.67E−02PTMS|parathymosin2777615.24E−03ESTs1503142.64E−02LYPLA1|lysophospholipase I20513523.01E−02KLHL2|kelch (Drosophila)-like 2 (Mayven)2417982.20E−02Homo sapiens cDNA FLJ30407 fis, clone BRACE2008553792163.76E−02AHNAK|AHNAK nucleoprotein (desmoyokin)7449521.97E−02ESTs, Moderately similar to 
UQHUR7 ubiquitin/ribosomal protein S27a, cytosolic [H. sapiens]2920681.20E−02ESTs20183323.78E−02PRKAR1A|protein kinase, cAMP-dependent, regulatory, type I, alpha (tissue specific extinguisher 1)5925921.50E−02MUC5AC|mucin 5, subtypes A and C, tracheobronchial/gastric1331972.82E−02KIAA0997|KIAA0997 protein5634513.20E−03TLK1|tousled-like kinase 18110322.11E−02PAWR|PRKC, apoptosis, WT1, regulator7861942.07E−02DCK|deoxycytidine kinase7677534.53E−03RFX5|regulatory factor X, 5 (influences HLA class II expression)5950701.49E−03SERP1|stress-associated endoplasmic reticulum protein 1; ribosome associated membrane protein 47708351.04E−02BCKDHB|branched chain keto acid dehydrogenase E1, beta polypeptide (maple syrup urine disease)2778483.73E−02Homo sapiens cDNA FLJ13900 fis, clone THYRO10017464281841.78E−02Homo sapiens, clone MGC: 18216 IMAGE: 4156235, mRNA, complete cds2079892.58E−04KIAA0022|KIAA0022 gene product8576401.12E−02COL6A2|collagen, type VI, alpha 218945191.13E−02FLJ12085|hypothetical protein FLJ120859506031.31E−03Homo sapiens clone 24670 mRNA sequence2233041.02E−02ESTs3659901.14E−02Homo sapiens cDNA FLJ11567 fis, clone HEMBA10032767708484.41E−02ESTs, Weakly similar to ALU1_HUMAN ALU SUBFAMILY J SEQUENCE CONTAMINATION WARNING ENTRY[H. sapiens]1933831.13E−02FLJ20986|hypothetical protein FLJ2098617623262.03E−02ESTs2639553.39E−02KIAA0828|KIAA0828 protein821712.11E−02Homo sapiens cDNA FLJ14041 fis, clone HEMBA10057804874992.11E−02Homo sapiens cDNA FLJ32068 fis, clone OCBBF100011415680563.84E−02ESTs, Moderately similar to I78885 serine/threonine-specific protein kinase [H. 
sapiens]2606191.33E−02USP12|ubiquitin specific protease 1217322478.78E−03ESTs8453551.78E−02CTSC|cathepsin C14228949.53E−03NOTCH2|Notch (Drosophila) homolog 24284114.45E−04KIAA1915|KIAA1915 protein1368452.11E−02Homo sapiens, clone IMAGE: 3915000, mRNA1422593.88E−02FIP2|tumor necrosis factor alpha-inducible cellular protein containing leucine zipper domains; Huntingtin interactingprotein L; transcrption factor IIIA-interacting protein7881094.64E−03ATR|ataxia telangiectasia and Rad3 related1148521.82E−02C16orf3|chromosome 16 open reading frame 37848304.32E−02D123|D123 gene product20094772.11E−02CD6|CD6 antigen









TABLE 3
Genes, the expressions of which positively correlate with the ERb subtype

Clone_ID  P_value   Gene_Description
898312    4.87E−02  TRAF4|TNF receptor-associated factor 4
2713047   3.35E−02  PVR|poliovirus receptor
739511    6.40E−03  PKMYT1|membrane-associated tyrosine- and threonine-specific cdc2-inhibitory kinase
323693    2.69E−02  AP1S1|adaptor-related protein complex 1, sigma 1 subunit
29927     1.14E−02  FLJ10737|hypothetical protein FLJ10737
770935    2.18E−02  7h3|hypothetical protein FLJ13511
1681421   3.88E−03  EGFL3|EGF-like-domain, multiple 3
50649     3.71E−02  PRKCL1|protein kinase C-like 1
203003    3.93E−02  NME4|non-metastatic cells 4, protein expressed in
795263    1.58E−02  FLJ22638|hypothetical protein FLJ22638
731020    4.17E−02  PSMF1|proteasome (prosome, macropain) inhibitor subunit 1 (PI31)
1460075   1.20E−02  PIN1|protein (peptidyl-prolyl cis/trans isomerase) NIMA-interacting 1
108377    1.22E−02  TUBG1|tubulin, gamma 1
727078    4.92E−03  Homo sapiens cDNA: FLJ23602 fis, clone LNG15735
740788    1.80E−02  ESTs, Weakly similar to CA13 MOUSE COLLAGEN ALPHA 1(III) CHAIN PRECURSOR [M. musculus]
756502    2.05E−03  NUDT1|nudix (nucleoside diphosphate linked moiety X)-type motif 1
53122     3.45E−02  Human (clone CTG-A4) mRNA sequence
1903066   8.90E−03  KRTHB1|keratin, hair, basic, 1
753021    2.95E−02  NOSIP|eNOS interacting protein
841308    4.45E−03  MYLK|myosin, light polypeptide kinase
144887    4.86E−02  DPM2|dolichyl-phosphate mannosyltransferase polypeptide 2, regulatory subunit
866712    2.67E−03  MGC14421|hypothetical protein MGC14421
2019258   3.40E−02  ESTs
743268    4.03E−02  MGC2835|hypothetical protein MGC2835
796079    2.24E−04  MGC4171|hypothetical protein MGC4171
154720    8.98E−03  ARD1|N-acetyltransferase, homolog of S. cerevisiae ARD1
324651    4.44E−02  LOC51102|CGI-63 protein
725558    3.84E−02  LOC51114|CGI-89 protein
366100    4.39E−02  MATN2|matrilin 2
51604     5.33E−03  RLUCL|ribosomal large subunit pseudouridine synthase C like
756372    9.48E−03  RARRES2|retinoic acid receptor responder (tazarotene induced) 2
756373    2.51E−03  ARHGEF16|Rho guanine exchange factor (GEF) 16
770884    1.97E−02  TIP-1|Tax interaction protein 1
591994    3.71E−02  FLJ21935|hypothetical protein FLJ21935
2018392   2.60E−02  GLIS2|Kruppel-like zinc finger protein GLIS2
813841    3.88E−02  PLAT|plasminogen activator, tissue
788209    1.29E−02  FLJ11807|hypothetical protein FLJ11807
727164    1.30E−02  MGC13114|hypothetical protein MGC13114
262251    8.91E−03  CLCN7|chloride channel 7
502753    2.16E−02  ANGPT2|angiopoietin 2
502682    3.28E−02  ENIGMA|enigma (LIM domain protein)
1409509   2.11E−02  TNNT1|troponin T1, skeletal, slow
138550    2.11E−02  FLJ11137|hypothetical protein FLJ11137
139354    1.97E−02  HSPC195|hypothetical protein
126320    4.54E−02  JUP|junction plakoglobin
195313    4.28E−02  KPNA6|karyopherin alpha 6 (importin alpha 7)
1323361   1.53E−02  NR2F6|nuclear receptor subfamily 2, group F, member 6
1473274   1.31E−02  MYRL2|myosin regulatory light chain 2, smooth muscle isoform
2028161   3.45E−02  UNC93B|unc93 (C. elegans) homolog B
433204    2.58E−04  Homo sapiens, Similar to RIKEN cDNA 2310012N15 gene, clone IMAGE: 3342825, mRNA, partial cds
1917207   1.77E−02  HIG2|hypoxia-inducible protein 2
753984    1.34E−02  FLJ10640|hypothetical protein
809974    2.15E−02  ESTs, Weakly similar to S10889 proline-rich protein [H. sapiens]
1568318   1.07E−02  DNASE1|deoxyribonuclease I
80764     4.35E−03  LOC51255|hypothetical protein
769565    3.51E−02  RER1|similar to S. cerevisiae RER1
39722     7.38E−03  ERCC2|excision repair cross-complementing rodent repair deficiency, complementation group 2 (xeroderma pigmentosum D)
49273     6.00E−03  SLC27A4|solute carrier family 27 (fatty acid transporter), member 4
1600239   4.03E−02  LOC51659|HSPC037 protein
135221    4.63E−02  S100P|S100 calcium-binding protein P
898281    4.25E−02  FLNA|filamin A, alpha (actin-binding protein-280)
841334    2.91E−03  STIP1|stress-induced-phosphoprotein 1 (Hsp70/Hsp90-organizing protein)
2027515   2.58E−04  SFN|stratifin
1323448   4.90E−02  CRIP1|cysteine-rich protein 1 (intestinal)
591143    1.44E−02  LOC51329|SRp25 nuclear protein
2017821   3.78E−05  NTHL1|nth (E. coli endonuclease III)-like 1
1968422   4.59E−02  Homo sapiens mRNA full length insert cDNA clone EUROIMAGE 1968422
841338    1.31E−02  PRNPIP|prion protein interacting protein
1473289   8.98E−03  PPGB|protective protein for beta-galactosidase (galactosialidosis)
815535    2.03E−03  TCOF1|Treacher Collins-Franceschetti syndrome 1
2017754   4.22E−03  DGSI|DiGeorge syndrome critical region gene DGSI; likely ortholog of mouse expressed sequence 2 embryonic lethal
121251    2.29E−02  MGC5576|hypothetical protein MGC5576
769712    3.00E−02  GAK|cyclin G associated kinase
66406     3.82E−02  ESTs, Highly similar to T47163 hypothetical protein DKFZp762E1312.1 [H. sapiens]
73550     2.91E−04  FLJ11773|hypothetical protein FLJ11773
2015148   9.48E−03  GIT1|G protein-coupled receptor kinase-interactor 1
767034    2.02E−03  ILVBL|ilvB (bacterial acetolactate synthase)-like
714159    1.51E−03  Homo sapiens cDNA FLJ32185 fis, clone PLACE6001925
770043    2.58E−04  NDUFV1|NADH dehydrogenase (ubiquinone) flavoprotein 1 (51 kD)
1642496   3.82E−02  MGC11266|hypothetical protein MGC11266
795522    4.96E−02  TAF1C|TATA box binding protein (TBP)-associated factor, RNA polymerase I, C, 110 kD
221846    4.57E−02  CHES1|checkpoint suppressor 1
50768     2.89E−02  DKFZp667O2416|hypothetical protein DKFZp667O2416
68950     1.77E−02  CCNE1|cyclin E1
130153    1.66E−02  SUPT5H|suppressor of Ty (S. cerevisiae) 5 homolog
338599    4.09E−02  NRBP|nuclear receptor binding protein
1859037   2.38E−02  DKFZP586J0119|DKFZP586J0119 protein
138728    4.91E−02  KIAA1696|KIAA1696 protein
897570    1.77E−02  TRAP1|heat shock protein 75
471266    1.40E−02  DGCR6L|DiGeorge syndrome critical region gene 6 like
240367    1.22E−02  CTCF|CCCTC-binding factor (zinc finger protein)
1635286   4.40E−03  ITGB4BP|integrin beta 4 binding protein
179163    4.87E−03  GRIN2C|glutamate receptor, ionotropic, N-methyl D-aspartate 2C
840556    1.93E−02  EIF4EL3|eukaryotic translation initiation factor 4E-like 3
755689    1.41E−02  RARG|retinoic acid receptor, gamma
788185-2  4.35E−02  TNFRSF10B|tumor necrosis factor receptor superfamily, member 10b
346696    8.98E−03  TEAD4|TEA domain family member 4
725672    2.58E−04  Homo sapiens, Similar to transducin (beta)-like 3, clone MGC: 8613 IMAGE: 2961321, mRNA, complete cds
81662     4.35E−02  PTD004|hypothetical protein
785847    3.39E−02  UBE2M|ubiquitin-conjugating enzyme E2M (homologous to yeast UBC12)
1635364   4.52E−02  LSM2|U6 snRNA-associated Sm-like protein
809939-2  3.34E−02  MAPK3|mitogen-activated protein kinase 3
44292     2.92E−02  Homo sapiens mRNA; cDNA DKFZp434C107 (from clone DKFZp434C107)
753153    8.88E−03  IL13RA1|interleukin 13 receptor, alpha 1
2019526   4.62E−02  FLJ14220|hypothetical protein FLJ14220
68103     3.30E−02  MLC1SA|myosin light chain 1 slow a
265853    1.94E−03  TEM8|tumor endothelial marker 8
1470048   5.20E−03  LY6E|lymphocyte antigen 6 complex, locus E
743536    3.62E−02  EST
823727    3.17E−02  Homo sapiens, clone IMAGE: 2905978, mRNA, partial cds
249672    3.30E−02  FLJ12827|hypothetical protein FLJ12827
2019387   4.54E−02  SNAPC4|small nuclear RNA activating complex, polypeptide 4, 190 kD
2519200   4.03E−02  LY6H|lymphocyte antigen 6 complex, locus H
1522696   4.80E−02  FLJ10850|hypothetical protein FLJ10850
47853     4.35E−02  ALDH4A1|aldehyde dehydrogenase 4 family, member A1
138672    4.85E−02  ESTs
35620     1.16E−03  MGC4707|hypothetical protein MGC4707
26806     1.97E−02  MGC10433|hypothetical protein MGC10433
1669672   2.72E−02  THY1|Thy-1 cell surface antigen
826138    3.80E−02  GAMT|guanidinoacetate N-methyltransferase
1612722   1.90E−02  FLJ20542|hypothetical protein FLJ20542
1703339   3.80E−02  STXBP2|syntaxin binding protein 2
171912    2.24E−04  Homo sapiens mRNA full length insert cDNA clone EUROIMAGE 703547
430928    3.64E−02  BARD1|BRCA1 associated RING domain 1
235923    3.01E−04  DKFZP434P1750|DKFZP434P1750 protein
812238    1.93E−02  MGC4692|hypothetical protein MGC4692
2013659   3.22E−02  FLJ20294|hypothetical protein FLJ20294
1654978   3.51E−02  FLJ22504|hypothetical C2H2 zinc finger protein FLJ22504
366315    4.37E−03  Homo sapiens, clone MGC: 20500 IMAGE: 4053084, mRNA, complete cds
714196    3.10E−02  WDR1|WD repeat domain 1
897745    1.12E−02  FLJ13868|hypothetical protein FLJ13868
128126    2.01E−02  DAF|decay accelerating factor for complement (CD55, Cromer blood group system)
60565     1.12E−02  LLGL2|lethal giant larvae (Drosophila) homolog 2
1142132   3.01E−02  RPIP8|RaP2 interacting protein 8
1535957   1.58E−02  SEC6|similar to S. cerevisiae Sec6p and R. norvegicus rsec6
487882    2.42E−03  DKFZP761D0211|hypothetical protein DKFZp761D0211


360436
1.42E−02
COPEB|core promoter element binding protein


1592715
1.95E−02
HOMER-3|Homer, neuronal immediate early gene, 3


1845169
2.91E−03
RAB35|RAB35, member RAS oncogene family


741954
3.83E−02

Homo sapiens cDNA FLJ14656 fis, clone NT2RP2002439



812170
4.73E−02
KIAA0657|KIAA0657 protein


166236
4.31E−03
2.19|2.19 gene


714414
2.44E−02
UQCRC1|ubiquinol-cytochrome c reductase core protein I


772912
7.87E−03
AGS3|likely ortholog of rat activator of G-protein signaling 3


1557018
9.48E−03
C21orf70|chromosome 21 open reading frame 70


235938
1.66E−03
BAK1|BCL2-antagonist/killer 1


1632120
1.70E−02
COPE|coatomer protein complex, subunit epsilon


2322079
7.56E−03
EST


358162
4.30E−02
HSU79266|protein predicted by clone 23627


756666
1.09E−03
PPP1CA|protein phosphatase 1, catalytic subunit, alpha isoform


32231
1.34E−02
FLJ12442|hypothetical protein FLJ12442


346942
2.98E−02
PIGQ|phosphatidylinositol glycan, class Q


531319
8.42E−03
STK12|serine/threonine kinase 12


2027578
1.85E−02
NAKAP95|neighbor of A-kinase anchoring protein 95


741891
4.61E−02
RAB2L|RAB2, member RAS oncogene family-like


814865
8.91E−03
MGC11102|hypothetical protein MGC11102


1569187
3.53E−02
HS3ST4|heparan sulfate (glucosamine) 3-O-sulfotransferase 4


2623626
3.98E−02
PTPRG|protein tyrosine phosphatase, receptor type, G


49485
8.04E−04

Homo sapiens, clone IMAGE: 3161564, mRNA, partial cds



1555427
1.93E−02
SPINT1|serine protease inhibitor, Kunitz type 1


780947
1.14E−02
POLD1|polymerase (DNA directed), delta 1, catalytic subunit (125 kD)


455275
3.81E−02
FLJ23469|hypothetical protein FLJ23469


209066-2
3.42E−02
STK15|serine/threonine kinase 15


1759582
4.40E−03
FN14|type I transmembrane protein Fn14


141852
3.68E−02
P2RY2|purinergic receptor P2Y, G-protein coupled, 2


897768
4.25E−02
COL7A1|collagen, type VII, alpha 1 (epidermolysis bullosa, dystrophic, dominant and recessive)


41208
1.29E−03
BMP1|bone morphogenetic protein 1


825293
3.11E−02
KIAA0082|KIAA0082 protein


1860497
2.19E−02

Homo sapiens, clone MGC: 5352 IMAGE: 3048106, mRNA, complete cds



344272
2.02E−02
EMP3|epithelial membrane protein 3


327506
1.87E−02

Homo sapiens mRNA full length insert cDNA clone EUROIMAGE 327506



430954
1.84E−02
FLJ22341|hypothetical protein FLJ22341


260015
7.21E−03
DKFZP586B0519|DKFZP586B0519 protein


2017897
3.67E−02
CINP|HeLa cyclin-dependent kinase 2 interacting protein


431759
4.39E−02
TEAD3|TEA domain family member 3


810734
3.01E−03
POLD4|polymerase (DNA-directed), delta 4


357450
1.30E−02
MTVR|Mouse Mammary Tumor Virus Receptor homolog


897770
3.34E−03
EST


26910
4.00E−02
T54|T54 protein


897774
7.38E−03
APRT|adenine phosphoribosyltransferase


1536925
1.70E−02
PDPK1|3-phosphoinositide dependent protein kinase-1


207618
1.34E−02
ARAF1|v-raf murine sarcoma 3611 viral oncogene homolog 1


756687
2.02E−02
CD36L1|CD36 antigen (collagen type I receptor, thrombospondin receptor)-like 1


1588935
4.27E−02
PHLDA3|pleckstrin homology-like domain, family A, member 3


742783
1.66E−03
DKFZp434N035|hypothetical protein DKFZp434N035


172751
1.97E−02
APBA1|amyloid beta (A4) precursor protein-binding, family A, member 1 (X11)


562080
3.04E−04
FLJ10101|hypothetical protein FLJ10101


810743
9.21E−03
MLF2|myeloid leukemia factor 2


166268
4.20E−02
SR-A1|serine arginine-rich pre-mRNA splicing factor SR-A1


1476053
1.12E−02
RAD51|RAD51 (S. cerevisiae) homolog (E. coli RecA homolog)


1947381
2.47E−02
FLJ22329|hypothetical protein FLJ22329


1731860
4.47E−02
GADD45B|growth arrest and DNA-damage-inducible, beta


2062432
4.88E−03
COMP|cartilage oligomeric matrix protein (pseudoachondroplasia, epiphyseal dysplasia 1, multiple)


128302
2.16E−02
PTMS|parathymosin


593114
4.44E−02
SIPA1|signal-induced proliferation-associated gene 1


897781
3.10E−02
KRT8|keratin 8


843091
1.73E−02
MGC20533|similar to RIKEN cDNA 2410004L22 gene (M. musculus)


611532
8.98E−03
TNNI2|troponin I, skeletal, fast


590640
2.24E−04
PDXK|pyridoxal (pyridoxine, vitamin B6) kinase


809413
1.28E−03
FLJ12875|hypothetical protein FLJ12875


878406
3.75E−02
MTX1|metaxin 1


26856
2.59E−02
FLOT2|flotillin 2


814961
4.96E−02
USP5|ubiquitin specific protease 5 (isopeptidase T)


840698
2.10E−03
FLJ20254|hypothetical protein FLJ20254


2009969
1.51E−02
20D7-FC4|hypothetical protein


1610168
2.67E−03
DMWD|dystrophia myotonica-containing WD repeat motif


41302
2.69E−02
KIAA0643|KIAA0643 protein


307069
1.93E−02
ALDH3B1|aldehyde dehydrogenase 3 family, member B1


878413
1.70E−02
SLC25A11|solute carrier family 25 (mitochondrial carrier; oxoglutarate carrier), member 11


267590
4.70E−02
KIAA0330|calcineurin binding protein 1


302996
4.50E−04
CLIC3|chloride intracellular channel 3


884692
2.74E−03
TCEB2|transcription elongation factor B (SIII), polypeptide 2 (18 kD, elongin B)


259579
2.61E−02
RAD51L3|RAD51 (S. cerevisiae)-like 3


859761
2.68E−02
PVRL2|poliovirus receptor-related 2 (herpesvirus entry mediator B)


825399
4.52E−02
TRAF3|TNF receptor-associated factor 3


74738
9.83E−03
MGC20486|hypothetical protein MGC20486


768217
2.19E−02

Homo sapiens, Similar to hypothetical protein, MGC: 7764, clone MGC: 20548 IMAGE: 3607345, mRNA, complete cds


811565
1.41E−03
KIAA1694|KIAA1694 protein


843321
1.97E−02
KRT7|keratin 7


294273
9.39E−03
PXMP2|peroxisomal membrane protein 2 (22 kD)


809503
3.20E−02
ESTs, Weakly similar to AC004858 3 U1 small ribonucleoprotein 1SNRP homolog [H. sapiens]


1609781
9.51E−03

Homo sapiens clone 24819 mRNA sequence



780989
4.09E−02
DKFZP434N061|DKFZP434N061 protein


526757
1.14E−02
CCND1|cyclin D1 (PRAD1: parathyroid adenomatosis 1)


1632247
3.38E−02
FLJ23436|hypothetical protein FLJ23436


2018941
1.09E−03
D21S2056E|DNA segment on chromosome 21 (unique) 2056 expressed sequence


809507
2.06E−03
FLJ20568|hypothetical protein FLJ20568


771089
1.07E−02
NDUFB7|NADH dehydrogenase (ubiquinone) 1 beta subcomplex, 7 (18 kD, B18)


306575-2
1.22E−02
DIPA|hepatitis delta antigen-interacting protein A


25069
1.97E−02
KIAA0462|KIAA0462 protein


502151
8.37E−04
SLC16A3|solute carrier family 16 (monocarboxylic acid transporters), member 3


784260
3.05E−03
MAN1B1|mannosidase, alpha, class 1B, member 1


814989
3.46E−04
PPM1G|protein phosphatase 1G (formerly 2C), magnesium-dependent, gamma isoform


377018
1.14E−02
FLJ20850|hypothetical protein FLJ20850


1574058
8.98E−03
AGPAT2|1-acylglycerol-3-phosphate O-acyltransferase 2 (lysophosphatidic acid acyltransferase, beta)


235056
4.45E−03
24432|hypothetical protein 24432


771233
5.17E−03

Homo sapiens, clone MGC: 16395 IMAGE: 3939387, mRNA, complete cds



291880
1.34E−02
MFAP2|microfibrillar-associated protein 2


809512
1.53E−02
FLJ10767|hypothetical protein FLJ10767


2125819
1.60E−02
BAX|BCL2-associated X protein


1837280
9.08E−03
EST


346134
3.39E−02
CRHSP-24|calcium-regulated heat-stable protein (24 kD)


1535082
4.39E−02
KIAA1271|KIAA1271 protein


1470278
2.99E−02
FLJ21841|hypothetical protein FLJ21841


246704
1.23E−02
RAI|RelA-associated inhibitor


1575008
3.48E−02
WBP1|WW domain binding protein 1


32299
3.34E−02
IMPA2|inositol(myo)-1(or 4)-monophosphatase 2


296030
2.32E−02

Homo sapiens cDNA: FLJ20944 fis, clone ADSE01780



2315207
1.94E−02
SCYB6|small inducible cytokine subfamily B (Cys-X-Cys), member 6 (granulocyte chemotactic protein 2)


1882823
2.73E−02
ESTs


810927
3.25E−03
RFXANK|regulatory factor X-associated ankyrin-containing protein


838662
1.04E−02
HCNGP|transcriptional regulator protein


2314197
3.36E−02
FLJ12671|hypothetical protein FLJ12671


809521
1.85E−02
HMT-1|beta-1,4 mannosyltransferase


41406
4.52E−02
NMA|putative transmembrane protein


796723
4.09E−02

Homo sapiens clone CDABP0014 mRNA sequence



1690762
2.60E−02
CDK10|cyclin-dependent kinase (CDC2-like) 10


1908666
3.81E−02
ZNF79|zinc finger protein 79 (pT7)


788566
2.69E−02
PCP4|Purkinje cell protein 4


1732922
6.02E−03

Homo sapiens mRNA; cDNA DKFZp762H106 (from clone DKFZp762H106)



1492426
1.49E−02
C19orf3|chromosome 19 open reading frame 3


2010543
1.07E−02
DDX28|DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 28


769986
3.75E−04
NUBP2|nucleotide binding protein 2 (E. coli MinD like)


299388
4.44E−02
PP15|nuclear transport factor 2 (placental protein 15)


2322367
4.55E−02
RTN4|reticulon 4


771323
1.33E−02
PLOD|procollagen-lysine, 2-oxoglutarate 5-dioxygenase (lysine hydroxylase, Ehlers-Danlos syndrome type VI)


897107
9.38E−04
SLC25A1|solute carrier family 25 (mitochondrial carrier; citrate transporter), member 1


184240
8.88E−03
ESTs


1551282
3.57E−02
FLJ13956|hypothetical protein FLJ13956


124143
2.58E−03
DKFZP761H1710|hypothetical protein DKFZp761H1710


770388
2.73E−02
CLDN4|claudin 4


809609
2.08E−02

Homo sapiens cDNA FLJ32583 fis, clone SPLEN2000348



815017
3.34E−02

Homo sapiens HSPC337 mRNA, partial cds



629916
2.19E−02
TIM17B|translocase of inner mitochondrial membrane 17 homolog B (yeast)


1521341
8.91E−03
HIRIP3|HIRA-interacting protein 3


251330
1.14E−02
MGC10540|hypothetical protein MGC10540


510273
3.67E−02
PLEC1|plectin 1, intermediate filament binding protein, 500 kD


810942
8.97E−03
IDH3G|isocitrate dehydrogenase 3 (NAD+) gamma


1476251
7.10E−03
FLJ20512|hypothetical protein FLJ20512


810948
1.22E−02
TRAP240|thyroid hormone receptor-associated protein, 240 kDa subunit


45632
2.99E−02
GYS1|glycogen synthase 1 (muscle)


279146
8.91E−03
ITPKC|inositol 1,4,5-trisphosphate 3-kinase C


753620
3.17E−02
IGFBP6|insulin-like growth factor binding protein 6


755228
2.54E−02
DNM1|dynamin 1


489076-2
2.61E−02
EMILIN|elastin microfibril interface located protein


347035
4.03E−02
KIAA0476|KIAA0476 gene product


1850224
1.99E−02
ESTs


825583
3.91E−04
RALY|RNA-binding protein (autoantigenic)


742125
2.23E−02
LOXL1|lysyl oxidase-like 1


504945
3.75E−04
FLJ20608|hypothetical protein FLJ20608


1947804
1.93E−02
TREX1|three prime repair exonuclease 1


1699142
1.53E−02
AP1G2|adaptor-related protein complex 1, gamma 2 subunit


343695
1.67E−02

Homo sapiens cDNA FLJ31668 fis, clone NT2RI2004916



1506046
1.74E−02
FLJ10815|hypothetical protein FLJ10815


855749
4.28E−02
TPI1|triosephosphate isomerase 1


269606
2.02E−02
MPG|N-methylpurine-DNA glycosylase


739993
4.54E−02
BRE|brain and reproductive organ-expressed (TNFRSF1A modulator)


183602
5.77E−03
KRT14|keratin 14 (epidermolysis bullosa simplex, Dowling-Meara, Koebner)


183462
3.48E−02
MAN2C1|mannosidase, alpha, class 2C, member 1


809557
9.15E−03
MCM3|minichromosome maintenance deficient (S. cerevisiae) 3


725224
2.79E−02
HES6|likely ortholog of mouse Hes6 neuronal differentiation gene


564981
9.30E−03

Homo sapiens, Similar to RIKEN cDNA 2810433K01 gene, clone MGC: 10200 IMAGE: 3909951, mRNA, complete cds


811907
1.06E−02
FLJ22056|hypothetical protein FLJ22056


323522
2.98E−02
NRBP|nuclear receptor binding protein


951117
4.34E−02
SHMT2|serine hydroxymethyltransferase 2 (mitochondrial)


511096
4.96E−03

Homo sapiens, Similar to RIKEN cDNA 2010317E24 gene, clone IMAGE: 3502019, mRNA, partial cds



502277
4.05E−02
LOC51025|CGI-136 protein


700900
4.90E−02
LOC51693|unknown


625584
3.59E−02
TRIP|TRAF interacting protein


37708
2.68E−02
MGC3101|hypothetical protein MGC3101


2508044
1.49E−02
HP|haptoglobin


150118
2.70E−02
DKFZp434F054|hypothetical protein DKFZp434F054


2018131
2.11E−02
RACGAP1|Rac GTPase activating protein 1


813514
4.12E−02
FLJ22573|hypothetical protein FLJ22573


700699
6.02E−03
IL1RL1LG|putative T1/ST2 receptor binding protein


796694
1.80E−02
BIRC5|baculoviral IAP repeat-containing 5 (survivin)


138672-2
4.54E−02
ESTs


811848
2.06E−02
LOC56912|hypothetical protein


1492463
2.42E−03
SEPX1|selenoprotein X, 1


1947827
2.95E−02
MSTP028|MSTP028 protein


839583
3.71E−02
ESTs, Moderately similar to T46386 hypothetical protein DKFZp434P011.1 [H. sapiens]


810979
2.91E−03
MRPS2|mitochondrial ribosomal protein S2


712139
3.45E−02
ARL7|ADP-ribosylation factor-like 7


592540
2.86E−02
KRT5|keratin 5 (epidermolysis bullosa simplex, Dowling-Meara/Kobner/Weber-Cockayne types)


2019011
6.76E−05
MT3|metallothionein 3 (growth inhibitory factor (neurotrophic))


241677
6.64E−03
MGC15416|hypothetical protein MGC15416


770709
2.42E−02
KIAA1089|KIAA1089 protein


740620
1.20E−02
TPM2|tropomyosin 2 (beta)


882515
3.34E−02
EIF3S9|eukaryotic translation initiation factor 3, subunit 9 (eta, 116 kD)


1574330
3.11E−02
GROS1|growth suppressor 1


503234
8.91E−03
FLJ23471|hypothetical protein FLJ23471


811923
1.07E−02
POLE|polymerase (DNA directed), epsilon


1592048
1.70E−02
SSNA1|Sjogrens syndrome nuclear autoantigen 1


810983
1.37E−02
DKFZP434H132|DKFZP434H132 protein


462961
2.17E−02
DHFR|dihydrofolate reductase


839594
4.20E−02
LTBP1|latent transforming growth factor beta binding protein 1


1534633
1.03E−03
MGC2479|hypothetical protein MGC2479


770579
1.12E−02
CLDN3|claudin 3


184362
2.49E−02
KCNJ9|potassium inwardly-rectifying channel, subfamily J, member 9


1613955
3.45E−02

Homo sapiens, clone MGC: 20633 IMAGE: 4761663, mRNA, complete cds



165921
1.80E−02
CEP2|centrosomal protein 2


810120
3.97E−02
LOC51160|VPS28 protein


814266
4.89E−02
PRKCZ|protein kinase C, zeta


810124
8.98E−03
PAFAH1B3|platelet-activating factor acetylhydrolase, isoform Ib, gamma subunit (29 kD)


244307
1.69E−02
SERPINE1|serine (or cysteine) proteinase inhibitor, clade E (nexin, plasminogen activator inhibitor type 1), member 1


951216
2.18E−02
NDUFB10|NADH dehydrogenase (ubiquinone) 1 beta subcomplex, 10 (22 kD, PDSW)


2062825
7.66E−03
KIAA0964|KIAA0964 protein


306575
2.86E−02
DIPA|hepatitis delta antigen-interacting protein A


878652
9.48E−03
PCOLCE|procollagen C-endopeptidase enhancer


1631746
5.77E−03
POLM|polymerase (DNA directed), mu


23903
1.67E−02

Homo sapiens clone 23903 mRNA sequence



743114
2.28E−02
HSPBP1|hsp70-interacting protein


123614
5.02E−03
MGC4675|hypothetical protein MGC4675


824108
4.41E−02
SCAND1|SCAN domain-containing 1


51097
3.22E−02
BAIAP3|BAI1-associated protein 3


770588
2.28E−02

Homo sapiens TTF-I interacting peptide 20 mRNA, partial cds



130835
4.52E−02

Homo sapiens, Similar to clone FLB3816, clone IMAGE: 3454380, mRNA



725407
4.97E−03
SMURF1|E3 ubiquitin ligase SMURF1


66952
1.07E−02
ZNF205|zinc finger protein 205


345487
4.70E−03

Homo sapiens, clone MGC: 23280 IMAGE: 4637504, mRNA, complete cds



1591264
4.54E−02
TALDO1|transaldolase 1


1868534
3.48E−02
MGC2408|hypothetical protein MGC2408


951080
3.24E−02
RECQL4|RecQ protein-like 4


144740
1.22E−02
SDCCAG28|serologically defined colon cancer antigen 28


625693
4.10E−02
MGC10911|hypothetical protein MGC10911


1563792
3.66E−02
LOC51333|mesenchymal stem cell protein DSC43


194214
4.39E−02
TGIF|TGFB-induced factor (TALE family homeobox)


1845744
2.17E−03
EST


356992
3.82E−02
HSPC023|HSPC023 protein


282428
3.71E−02

Homo sapiens, Similar to RIKEN cDNA 9030409E16 gene, clone MGC: 26939 IMAGE: 4796761, mRNA, complete cds


254010
3.08E−02
LOC51175|epsilon-tubulin


264646
3.76E−02
HGS|hepatocyte growth factor-regulated tyrosine kinase substrate


724615
4.54E−02
CHC1|chromosome condensation 1


647767
2.91E−03
MGC4758|similar to RIKEN cDNA 2310040G17 gene


951233
3.43E−02
PSMB3|proteasome (prosome, macropain) subunit, beta type, 3


814287
6.96E−04
XRCC3|X-ray repair complementing defective repair in Chinese hamster cells 3


2013094
1.18E−02
KIF1C|kinesin family member 1C


366834
3.25E−02
EVPL|envoplakin


51328
2.05E−02
CDC34|cell division cycle 34


842846
3.82E−02
TIMP2|tissue inhibitor of metalloproteinase 2


1640586
3.59E−02
DUSP3|dual specificity phosphatase 3 (vaccinia virus phosphatase VH1-related)


740801
6.02E−03
BCKDHA|branched chain keto acid dehydrogenase E1, alpha polypeptide (maple syrup urine disease)


68717
3.22E−02
UCK1|uridine-cytidine kinase 1


33478
4.62E−02
FPGS|folylpolyglutamate synthase


813490
1.67E−02
CORO1C|coronin, actin-binding protein, 1C


415136
7.38E−03
ESTs, Weakly similar to T00370 hypothetical protein KIAA0659 [H. sapiens]


725284
2.05E−03
PHKG2|phosphorylase kinase, gamma 2 (testis)


1868626
5.84E−03
PFKL|phosphofructokinase, liver


882488
4.21E−02
TERF2|telomeric repeat binding factor 2


785459
3.08E−02
SMTN|smoothelin


813499
3.82E−02
SSSCA1|Sjogrens syndrome/scleroderma autoantigen 1


1473131
3.07E−02
TLE2|transducin-like enhancer of split 2, homolog of Drosophila E(sp1)


632137
2.02E−02
SIVA|CD27-binding (Siva) protein


784589
4.57E−02
MMP15|matrix metalloproteinase 15 (membrane-inserted)


811897
4.55E−02
MKL1|megakaryoblastic leukemia (translocation) 1


1486099
4.00E−02
TP73|tumor protein p73


145491
1.14E−02
PCDH1|protocadherin 1 (cadherin-like 1)


1946069
3.91E−04
SPHK1|sphingosine kinase 1


854079
3.55E−02
ACTN1|actinin, alpha 1


965223
2.83E−02
TK1|thymidine kinase 1, soluble


824132
2.18E−02

Homo sapiens, Similar to cofactor required for Sp1 transcriptional activation, subunit 8 (34 kD), clone MGC: 11274 IMAGE: 3944264, mRNA, complete cds


2108077
4.87E−03
LOC51016|CGI-112 protein


22991
1.34E−02
SUPT6H|suppressor of Ty (S. cerevisiae) 6 homolog


796968
2.31E−02
KIAA1534|KIAA1534 protein


2326019
2.38E−02
COX5B|cytochrome c oxidase subunit Vb


1637732
1.76E−02
PPAN|peter pan (Drosophila) homolog


1580874
2.45E−03
CORO2A|coronin, actin-binding protein, 2A


154466
1.80E−02
STUB1|STIP1 homology and U-Box containing protein 1


1474955
3.54E−02
TAF15|TAF15 RNA polymerase II, TATA box binding protein (TBP)-associated factor, 68 kD


197727
2.95E−02
PEMT|phosphatidylethanolamine N-methyltransferase


346604
1.76E−02
AGER|advanced glycosylation end product-specific receptor


592818
6.44E−03
KIAA1437|hypothetical protein FLJ10337


2043418
3.39E−02
CRF|C1q-related factor


842794
1.86E−02
KIAA1668|KIAA1668 protein


1926769
3.16E−02
SCNN1B|sodium channel, nonvoltage-gated 1, beta (Liddle syndrome)


882571
9.94E−03
OAZIN|ornithine decarboxylase antizyme inhibitor


156211
8.98E−03
ATP6B1|ATPase, H+ transporting, lysosomal (vacuolar proton pump), beta polypeptide, 56/58 kD, isoform 1 (Renal tubular acidosis with deafness)


2307514
1.67E−02
MLC1|KIAA0027 protein


154610
3.14E−03
MGC3248|dynactin 4


80708
2.51E−03
UFD1L|ubiquitin fusion degradation 1-like


770910
3.28E−02
ELF3|E74-like factor 3 (ets domain transcription factor, epithelial-specific)


753860
4.32E−02
SLC25A13|solute carrier family 25, member 13 (citrin)


772377
3.45E−02

Homo sapiens mRNA; cDNA DKFZp761H229 (from clone DKFZp761H229); partial cds



34370
1.34E−02
PLEC1|plectin 1, intermediate filament binding protein, 500 kD


271102
7.55E−03
CCS|copper chaperone for superoxide dismutase


280934
1.77E−02
MVD|mevalonate (diphospho) decarboxylase


140574
2.08E−02
SCYD1|small inducible cytokine subfamily D (Cys-X3-Cys), member 1 (fractalkine, neurotactin)


1575410
1.51E−03

Homo sapiens, Similar to RIKEN cDNA 2700064H14 gene, clone MGC: 21390 IMAGE: 4519078, mRNA, complete cds


1509761
2.06E−03
KRTHB6|keratin, hair, basic, 6 (monilethrix)


68818
2.97E−03

Homo sapiens, clone IMAGE: 3957135, mRNA, partial cds



813807
7.03E−03
RNF25|ring finger protein 25


432075
1.05E−03
TSSC4|tumor suppressing subtransferable candidate 4


813738
3.20E−03
BRF1|BRF1 homolog, subunit of RNA polymerase III transcription initiation factor IIIB (S. cerevisiae)


857652
1.93E−02
PPT2|palmitoyl-protein thioesterase 2


898237
3.61E−02
BAT3|HLA-B associated transcript 3


770856
2.69E−02
DKFZP564D0478|hypothetical protein DKFZp564D0478


760224
1.68E−03
XRCC1|X-ray repair complementing defective repair in Chinese hamster cells 1


85804
2.70E−02
FLJ21918|hypothetical protein FLJ21918


1607741
2.44E−02
FLJ10385|hypothetical protein FLJ10385


512410
2.91E−04
RNASEHI|ribonuclease HI, large subunit


2326112
2.98E−02
RPL22|ribosomal protein L22


32927
1.89E−02
FBXL6|f-box and leucine-rich repeat protein 6


744047
2.47E−03
PLK|polo (Drosophila)-like kinase


785707
3.67E−02
PRC1|protein regulator of cytokinesis 1


471200
1.14E−02
LOC51042|zinc finger protein


263894
3.56E−02
QPRT|quinolinate phosphoribosyltransferase (nicotinate-nucleotide pyrophosphorylase (carboxylating))









Example III
Molecular Signature that Correlates with Recurrence of Breast Cancer

A molecular signature that correlates with recurrence of breast cancer after surgical removal of the cancer was identified as follows. Breast cancer tissue removed by surgery was microdissected (“laser captured”) to isolate breast cancer cells. The expression levels of multiple genes in the cells were used to identify those that correlate with cancer recurrence. The set of correlated genes was identified using a Cox proportional hazards regression model with a single gene at a time as the covariate. Genes with p<0.01 in the regression model were selected. A total of 396 genes that correlated with recurrence were selected, and they are listed in Table 4. The sign of each coefficient value in Table 4 corresponds to whether a gene is positively or negatively correlated with survival outcome. A positive coefficient means that the gene is positively correlated (overexpressed) in patients with a poor (shorter) survival outcome and negatively correlated (underexpressed) in patients with a good or better (longer) survival outcome. A negative coefficient means that the gene is positively correlated (overexpressed) in patients with a good or better (longer) survival outcome and negatively correlated (underexpressed) in patients with a poor (shorter) survival outcome.
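The single-gene screening step described above can be sketched as follows. This is a minimal illustration, not the implementation used to generate Table 4: it assumes a Newton-Raphson fit of the Cox partial likelihood with Breslow handling of tied event times and a two-sided Wald test for the p value, none of which is specified in the text. The function names (`cox_univariate`, `screen_genes`), the clamped step size, and the toy data are hypothetical.

```python
import math

def cox_univariate(times, events, x, max_iter=50, tol=1e-9):
    """Fit a Cox model with a single covariate; return (coef, Wald p value).

    Assumptions (not from the source): Breslow ties, Newton-Raphson with a
    step clamped to [-1, 1], two-sided Wald test.
    """
    n = len(times)
    beta, info = 0.0, 0.0
    for _ in range(max_iter):
        grad, info = 0.0, 0.0
        for i in range(n):
            if not events[i]:
                continue  # censored subjects enter only through risk sets
            # Risk set: everyone still under observation at times[i].
            risk = [j for j in range(n) if times[j] >= times[i]]
            w = [math.exp(beta * x[j]) for j in risk]
            s0 = sum(w)
            s1 = sum(wj * x[j] for wj, j in zip(w, risk))
            s2 = sum(wj * x[j] ** 2 for wj, j in zip(w, risk))
            grad += x[i] - s1 / s0                  # score contribution
            info += s2 / s0 - (s1 / s0) ** 2        # observed information
        if info <= 1e-12:
            return 0.0, 1.0  # covariate carries no information
        step = max(-1.0, min(1.0, grad / info))
        beta += step
        if abs(step) < tol:
            break
    z = beta * math.sqrt(info)  # Wald statistic: coef / standard error
    return beta, math.erfc(abs(z) / math.sqrt(2.0))

def screen_genes(expression, times, events, alpha=0.01):
    """Keep genes whose single-gene Cox p value is below alpha."""
    selected = {}
    for gene, values in expression.items():
        coef, p = cox_univariate(times, events, values)
        if p < alpha:
            selected[gene] = (coef, p)
    return selected

# Hypothetical toy data: follow-up in months, 1 = event observed, 0 = censored.
times = [5, 8, 12, 20, 25, 30, 40, 50]
events = [1, 1, 1, 1, 0, 1, 0, 0]
expression = {
    "gene_up": [1.2, -0.4, 0.9, 0.3, -0.2, 0.6, -1.0, -0.8],
    "gene_down": [-1.2, 0.4, -0.9, -0.3, 0.2, -0.6, 1.0, 0.8],
}
for gene, values in expression.items():
    coef, p = cox_univariate(times, events, values)
    print(gene, round(coef, 2), round(p, 3))
```

In this sketch a positive coefficient indicates that higher expression accompanies shorter time to the event, which matches the sign convention stated for Table 4.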


To validate this signature, an independent dataset of gene expression (van't Veer et al., supra) with clinical outcome (survival) was challenged with this signature. Of the 396 genes in Table 4, 297 genes overlapped with those examined by van't Veer et al. and were thus used to determine whether this 297 gene set was correlated with overall survival. The 297 gene signature (identities of the genes are presented in Table 5 via their Clone ID, GenBank ID, and Unigene ID numbers) segregates the survival data (patient population) of van't Veer et al. into “long” and “short” groups with significantly different overall survival curves, as shown by the lines identified as “AAG-Long” and “AAG-Short” in FIG. 2. As in FIG. 1, the horizontal axis of FIG. 2 is in months and the vertical axis is survival probability (where 1.0 is survival of 100% of the subjects in a group and 0.5 is survival of 50% of the subjects in a group). The line identified as “AAG-Short” is the lowest line at time points of about 60 months and higher.
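Survival curves of the kind described for FIG. 2 (months on the horizontal axis, survival probability on the vertical axis) are commonly computed with the Kaplan-Meier estimator. The text does not state which estimator was used, so the sketch below is an assumption, and the function name `kaplan_meier` and the follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  -- follow-up in months
    events -- 1 if death was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            surv *= 1.0 - deaths / at_risk  # step down at each event time
            curve.append((t, surv))
        at_risk -= leaving
    return curve

# Hypothetical follow-up data for a "short" and a "long" group.
short_group = kaplan_meier([10, 20, 30, 40], [1, 0, 1, 1])
long_group = kaplan_meier([35, 60, 80, 100], [0, 1, 0, 0])
print(short_group)  # [(10, 0.75), (30, 0.375), (40, 0.0)]
print(long_group)
```

A value of 1.0 corresponds to survival of 100% of the subjects in a group; censored subjects leave the risk set without lowering the curve.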



FIG. 2 also shows the comparison of this 297 gene set with a set of 17 genes correlated with metastasis described by Ramaswamy et al. (supra, see Table 1 therein). The curves corresponding to the Ramaswamy et al. signature are identified as “Golub-Long” and “Golub-Short”. FIG. 2 shows that the 297 gene signature separated the survival curves to a greater extent than the 17 gene set of Ramaswamy et al. The 297 gene signature also correlated with the data with a p value of 0.00106, which is more than 10-fold lower (better) than the p value of 0.0171 for the Ramaswamy et al. 17 gene set.
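A p value for the separation of two survival curves is often obtained from a two-group log-rank test. The text does not name the test behind the reported p values, so the following is only an illustrative sketch under that assumption; the function name `logrank_p` and the data are hypothetical.

```python
import math

def logrank_p(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns the p value (chi-square, 1 df)."""
    pooled = [(t, e, 0) for t, e in zip(times_a, events_a)]
    pooled += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in pooled if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for u, _, g in pooled if u >= t and g == 0)
        n_b = sum(1 for u, _, g in pooled if u >= t and g == 1)
        d_a = sum(1 for u, e, g in pooled if u == t and e and g == 0)
        d_b = sum(1 for u, e, g in pooled if u == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n  # expected deaths in group A under H0
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0.0:
        return 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))

# Hypothetical data: group A fails early, group B fails late.
p_separated = logrank_p([5, 6, 7, 8, 9], [1] * 5, [50, 60, 70, 80, 90], [1] * 5)
p_identical = logrank_p([5, 10, 15], [1, 1, 1], [5, 10, 15], [1, 1, 1])
print(p_separated, p_identical)
```

A smaller p value indicates stronger evidence that the two groups have different survival, which is the sense in which one signature's p value is described as better than another's.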

TABLE 4
Genes, the expressions of which correlate with breast cancer recurrence

CloneID    p value    coef    description
229901    9.71E−07    −1.95    CTSO|cathepsin O
1635618    1.71E−06    2.07    KIAA1115|KIAA1115 protein
142022    3.98E−06    −1.62    ESTs
774446    5.70E−06    0.79    ADM|adrenomedullin
85409    6.76E−06    −1.46    CREG|cellular repressor of E1A-stimulated genes
666169    9.91E−06    −2.43    MTR|5-methyltetrahydrofolate-homocysteine methyltransferase
2015148    1.95E−05    1.16    GIT1|G protein-coupled receptor kinase-interactor 1
628357    2.02E−05    1.95    ACTN3|actinin, alpha 3
815235    3.12E−05    2.10    RCD-8|autoantigen
491053    4.46E−05    −3.50    ARIH2|ariadne (Drosophila) homolog 2
823819    5.35E−05    −1.73
487297    5.49E−05    −1.60    CAP2|adenylyl cyclase-associated protein 2
782385-2    5.53E−05    −2.08    DKFZP566D193|DKFZP566D193 protein
26811    8.32E−05    −1.99    XRCC4|X-ray repair complementing defective repair in Chinese hamster cells 4
341316    8.81E−05    −1.38    HTATSF1|HIV TAT specific factor 1
743182    1.01E−04    1.22    DJ37E16.5|hypothetical protein dJ37E16.5
310584    1.09E−04    −2.25    ARL1|ADP-ribosylation factor-like 1
2016426    1.22E−04    2.79    KIAA0664|KIAA0664 protein
502891    1.22E−04    −1.46    FLJ11184|hypothetical protein FLJ11184
202577    1.30E−04    −0.87    HNMT|histamine N-methyltransferase
1637282    1.31E−04    1.23    HK2|hexokinase 2
150003    1.40E−04    −0.99    FLJ13187|phafin 2
366209    1.41E−04    −1.10    ESTs
810063    1.99E−04    −1.45    GFER|growth factor, erv1 (S. cerevisiae)-like (augmenter of liver regeneration)
855800    2.29E−04    −1.18    PREP|prolyl endopeptidase
781222    2.56E−04    1.48    TIAF1|TGFB1-induced anti-apoptotic factor 1
897164    2.72E−04    −0.95    CTNNA1|catenin (cadherin-associated protein), alpha 1 (102 kD)
134270    2.87E−04    −1.19    Human hbc647 mRNA sequence
745360    2.91E−04    −1.14    HAT1|histone acetyltransferase 1
2313673    2.91E−04    1.59    LOC50999|CGI-100 protein
309469    2.98E−04    1.38    KIAA1725|KIAA1725 protein
2018808    3.28E−04    −1.08    PRCP|prolylcarboxypeptidase (angiotensinase C)
108425-2    3.29E−04    −1.70    ESTs, Weakly similar to JC5314 CDC28/cdc2-like kinase associating arginine-serine cyclophilin [H. sapiens]
788745    3.30E−04    −1.72    WS-3|novel RGD-containing protein
1638827    3.49E−04    1.19    RFPL3S|ret finger protein-like 3 antisense
1670688    3.59E−04    −1.89    BACH2|BTB and CNC homology 1, basic leucine zipper transcription factor 2
75886    3.95E−04    −1.08    ESTs, Weakly similar to E54024 protein kinase [H. sapiens]
85614    4.01E−04    −1.40    LEPROTL1|leptin receptor overlapping transcript-like 1
1737724    4.12E−04    1.55    LRRN1|leucine-rich repeat protein, neuronal 1
155920    4.23E−04    1.95    FLJ10211|hypothetical protein FLJ10211
306933    4.24E−04    1.27    Homo sapiens clone 25012 mRNA sequence
1732033    4.27E−04    −1.94    FLJ14427|hypothetical protein FLJ14427
815167    4.37E−04    −1.54    PLEKHA3|pleckstrin homology domain-containing, family A (phosphoinositide binding specific) member 3
166199    4.51E−04    1.87    ADRBK1|adrenergic, beta, receptor kinase 1
50794    4.58E−04    0.74    ZNF133|zinc finger protein 133 (clone pHZ-13)
504201    4.68E−04    1.49    Homo sapiens, clone IMAGE: 3677194, mRNA, partial cds
1609748    4.92E−04    −0.82    MGC10882|hypothetical protein MGC10882
773375    5.23E−04    −1.23
40173    5.66E−04    1.42    MAST205|KIAA0807 protein
1416782    5.66E−04    0.63    CKB|creatine kinase, brain
826286    5.82E−04    1.86    IMP13|importin 13
235056    5.94E−04    1.06    24432|hypothetical protein 24432
824510    6.13E−04    1.26    LOC51647|CGI-128 protein
796255    6.27E−04    −1.13    MRPS14|mitochondrial ribosomal protein S14
785459    6.38E−04    0.92    SMTN|smoothelin
39677    6.40E−04    −2.30    FLJ10702|hypothetical protein FLJ10702
149539    6.67E−04    −1.21    MKP-7|MAPK phosphatase-7
32231    7.03E−04    0.91    FLJ12442|hypothetical protein FLJ12442
1466237    7.16E−04    1.54    TES|testis derived transcript (3 LIM domains)
155050    7.39E−04    −1.42    MDS025|hypothetical protein MDS025
84287    7.42E−04    1.47    ESTs
845513    7.46E−04    1.34    AP47|clathrin-associated protein AP47
1903067    7.48E−04    2.66    C21orf18|chromosome 21 open reading frame 18
83653    7.55E−04    −2.30    HSPC128|HSPC128 protein
1603583    7.80E−04    −0.81    SH3BGRL|SH3 domain binding glutamic acid-rich protein like
744047    8.09E−04    0.94    PLK|polo (Drosophila)-like kinase
1947381    8.56E−04    1.05    FLJ22329|hypothetical protein FLJ22329
884677    8.60E−04    −1.47    Homo sapiens, clone IMAGE: 3611719, mRNA, partial cds
84068    8.93E−04    −1.52    CL25084|hypothetical protein
529147    9.17E−04    −1.20    VPS26|vacuolar protein sorting 26 (yeast homolog)
1693357    9.35E−04    0.99    EDN2|endothelin 2
26856    9.51E−04    0.96    FLOT2|flotillin 2
767753    9.62E−04    −1.49    RFX5|regulatory factor X, 5 (influences HLA class II expression)
2322079    1.01E−03    1.02
815057    1.03E−03    −1.11    FLJ10652|hypothetical protein FLJ10652
2062453    1.05E−03    0.74    DKFZP727G051|DKFZP727G051 protein
126221    1.06E−03    1.15    TPD52L2|tumor protein D52-like 2
290536    1.07E−03    1.39    ESTs, Weakly similar to T43483 translation initiation factor IF-2 homolog [H. sapiens]
505299    1.12E−03    −2.27    BBP|beta-amyloid binding protein precursor
796694    1.12E−03    2.00    BIRC5|baculoviral IAP repeat-containing 5 (survivin)
786053    1.13E−03    1.27    Homo sapiens cDNA FLJ30898 fis, clone FEBRA2005572
145136    1.14E−03    −1.48    Homo sapiens cDNA FLJ13103 fis, clone NT2RP3002304
140951    1.17E−03    1.06    ACTN4|actinin, alpha 4
725395    1.18E−03    −1.14    UBE2L6|ubiquitin-conjugating enzyme E2L 6
295781    1.20E−03    −0.86    MGC9084|hypothetical protein MGC9084
267590    1.20E−03    1.37    KIAA0330|calcineurin binding protein 1
299388    1.21E−03    1.48    PP15|nuclear transport factor 2 (placental protein 15)
1506046    1.24E−03    1.00    FLJ10815|hypothetical protein FLJ10815
250313    1.25E−03    −1.57    ESTs
1882051    1.27E−03    −1.58    FLJ20080|hypothetical protein FLJ20080
898312    1.27E−03    1.08    TRAF4|TNF receptor-associated factor 4
712482    1.31E−03    −1.73    APTX|aprataxin
1926249    1.31E−03    1.28    LOC58509|NY-REN-24 antigen
26507    1.34E−03    1.54
758318    1.38E−03    −1.32    FBXO3|F-box only protein 3
785708    1.42E−03    −1.51    ESTs, Weakly similar to O4HUD1 debrisoquine 4-hydroxylase [H. sapiens]
842968    1.42E−03    1.38    BUB1B|budding uninhibited by benzimidazoles 1 (yeast homolog), beta
34778-2    1.45E−03    0.87    VEGF|vascular endothelial growth factor
742007    1.45E−03    −1.42    KIAA0146|KIAA0146 protein
1030351    1.48E−03    −1.50    SCYB11|small inducible cytokine subfamily B (Cys-X-Cys), member 11
741474    1.54E−03    0.79    GPI|glucose phosphate isomerase
827171    1.61E−03    −0.90    LRRC2|leucine-rich repeat-containing 2
266747    1.61E−03    −0.97    Homo sapiens, Similar to RIKEN cDNA 2010001O09 gene, clone MGC: 21387 IMAGE: 4471592, mRNA, complete cds
52103    1.62E−03    −1.49    FLJ23045|hypothetical protein FLJ23045
795893    1.63E−03    1.91    PPP1R15A|protein phosphatase 1, regulatory (inhibitor) subunit 15A
782689    1.64E−03    0.68    SLC6A8|solute carrier family 6 (neurotransmitter transporter, creatine), member 8
724615    1.66E−03    1.12    CHC1|chromosome condensation 1
138788    1.68E−03    −0.87    PRLR|prolactin receptor
815535    1.68E−03    1.37    TCOF1|Treacher Collins-Franceschetti syndrome 1
261481    1.70E−03    −1.08    CUL3|cullin 3
1475738    1.72E−03    −1.99    RPS25|ribosomal protein S25
70606    1.76E−03    −0.92    ESTs
345423    1.80E−03    −1.57    DKFZP564M112|likely ortholog of preimplantation protein 3
414992    1.84E−03    0.90    LOC57106|K562 cell-derived leucine-zipper-like protein 1
770588    1.85E−03    1.41    Homo sapiens TTF-I interacting peptide 20 mRNA, partial cds
163558    1.86E−03    1.91    SIRT6|sirtuin (silent mating type information regulation 2, S. cerevisiae, homolog) 6
840865    1.92E−03    1.66    MACS|myristoylated alanine-rich protein kinase C substrate (MARCKS, 80K-L)
23831    1.92E−03    0.51    ALDOC|aldolase C, fructose-bisphosphate
23772    1.95E−03    1.24    LZTR1|leucine-zipper-like transcriptional regulator, 1
756662    1.95E−03    1.40    KIAA0943|KIAA0943 protein
784150    1.97E−03    −1.24    RAB31|RAB31, member RAS oncogene family
242706    1.99E−03    −1.48    HSPC274|HSPC274 protein
1947804    2.04E−03    1.13    TREX1|three prime repair exonuclease 1
279085    2.07E−03    1.19    MYO9B|myosin IXB
109316    2.08E−03    −1.17    SERPINA3|serine (or cysteine) proteinase inhibitor, clade A (alpha-1 antiproteinase, antitrypsin), member 3
840506    2.08E−03    −1.53    3-Apr|apoptosis related protein APR-3
491486    2.09E−03    −1.24    LOC51578|adrenal gland protein AD-004
1734309    2.13E−03    0.75    SPAG4|sperm associated antigen 4
810983    2.16E−03    1.41    DKFZP434H132|DKFZP434H132 protein
47795    2.16E−03    −1.31    ZNF161|zinc finger protein 161
307933    2.17E−03    −2.26    NDUFB5|NADH dehydrogenase (ubiquinone) 1 beta subcomplex, 5 (16 kD, SGDH)
897971    2.18E−03    −2.23    COPB|coatomer protein complex, subunit beta
743810    2.20E−03    2.32    MGC2577|hypothetical protein MGC2577
860000    2.21E−03    1.61    RFC2|replication factor C (activator 1) 2 (40 kD)
262739    2.23E−03    −0.97    P125|Sec23-interacting protein p125
754537    2.32E−03    −0.79    Homo sapiens cDNA FLJ10229 fis, clone HEMBB1000136
37708    2.32E−03    0.79    MGC3101|hypothetical protein MGC3101
1752548    2.32E−03    −2.59    CNGB3|cyclic nucleotide gated channel beta 3
307740    2.37E−03    −1.12    ESTs
51063    2.43E−03    0.86    ESTs
277999    2.47E−03    −1.16    DKFZP434D193|DKFZP434D193 protein
768452    2.47E−03    −0.94    Homo sapiens EST from clone 491476, full insert
856164    2.48E−03    1.26    AS3|androgen-induced prostate proliferative shutoff associated protein
2009779    2.48E−03    −1.24    RAB5EP|rabaptin-5
755578    2.48E−03    0.61    SLC7A5|solute carrier family 7 (cationic amino acid transporter, y+ system), member 5
1913943    2.52E−03    0.78    ESTs, Weakly similar to I38022 hypothetical protein [H. sapiens]
767068    2.53E−03    0.54    DKFZP586G1517|DKFZP586G1517 protein
739191    2.54E−03    1.74    ZNF261|zinc finger protein 261
786674    2.59E−03    0.51    SOX2|SRY (sex determining region Y)-box 2
795936    2.60E−03    −1.62    TSN|translin
687289    2.64E−03    −2.20    Homo sapiens, clone MGC: 3245 IMAGE: 3505639, mRNA, complete cds
685516    2.67E−03    −0.59    GPCR150|putative G protein-coupled receptor
38244    2.70E−03    1.22    FLJ12587|hypothetical protein FLJ12587
855872    2.70E−03    1.62    NRD1|nardilysin (N-arginine dibasic convertase)
2125819    2.70E−03    1.22    BAX|BCL2-associated X protein
2307119    2.74E−03    1.03    INPP4A|inositol polyphosphate-4-phosphatase, type I, 107 kD
2449343    2.74E−03    0.71    PTPRH|protein tyrosine phosphatase, receptor type, H
325515    2.85E−03    −0.73    FLJ10980|hypothetical protein FLJ10980
824132    2.87E−03    1.22    Homo sapiens, Similar to cofactor required for Sp1 transcriptional activation, subunit 8 (34 kD), clone MGC: 11274 IMAGE: 3944264, mRNA, complete cds
1500241    2.88E−03    −0.51    C1orf24|chromosome 1 open reading frame 24
811790    2.89E−03    −1.19    DKFZP564G0222|DKFZP564G0222 protein
770835    2.94E−03    −1.07    BCKDHB|branched chain keto acid dehydrogenase E1, beta polypeptide (maple syrup urine disease)
796114    2.94E−03    −1.18    SIRT1|sirtuin (silent mating type information regulation 2, S. cerevisiae, homolog) 1
884438    2.96E−03    −1.18    NFE2L2|nuclear factor (erythroid-derived 2)-like 2
150897    3.00E−03    0.50    B3GNT3|UDP-GlcNAc: betaGal beta-1,3-N-acetylglucosaminyltransferase 3
1519013    3.04E−03    0.95    Homo sapiens, clone IMAGE: 3537447, mRNA, partial cds
323693    3.04E−03    1.25    AP1S1|adaptor-related protein complex 1, sigma 1 subunit
124046    3.09E−03    1.30    JAZ|double-stranded RNA-binding zinc finger protein JAZ
843091    3.10E−03    0.88    MGC20533|similar to RIKEN cDNA 2410004L22 gene (M. musculus)
165828    3.10E−03    0.75    FHOS|FH1/FH2 domain-containing protein
159535    3.14E−03    −1.22    ESTs
826256    3.18E−03    −0.68    TM7SF1|transmembrane 7 superfamily member 1 (upregulated in kidney)
68345    3.21E−03    1.43    ITPR3|inositol 1,4,5-triphosphate receptor, type 3
128426    3.27E−03    0.63    WBSCR14|Williams-Beuren syndrome chromosome region 14
1601601    3.28E−03    1.73    CSF2|colony stimulating factor 2 (granulocyte-macrophage)
1474164    3.36E−03    1.51    FLJ12886|hypothetical protein FLJ12886
1871423    3.39E−03    −1.27    CDC23|CDC23 (cell division cycle 23, yeast, homolog)
1908840    3.45E−03    −1.58    ZNF174|zinc finger protein 174
68557    3.45E−03    1.50    FABP1|fatty acid binding protein 1, liver
769712    3.46E−03    1.64    GAK|cyclin G associated kinase
767477    3.47E−03    −0.91    ANKRA2|ankyrin repeat, family A (RFXANK-like), 2
41647    3.49E−03    −0.66    PTPRT|protein tyrosine phosphatase, receptor type, T
767495    3.50E−03    −0.51    GLI3|GLI-Kruppel family member GLI3 (Greig cephalopolysyndactyly syndrome)
754582    3.50E−03    −1.05    EVI2A|ecotropic viral integration site 2A
166268    3.59E−03    1.61    SR-A1|serine arginine-rich pre-mRNA splicing factor SR-A1
769004    3.61E−03    −2.39    MPHOSPH1|M-phase phosphoprotein 1
280249    3.66E−03    1.37    KLF7|Kruppel-like factor 7 (ubiquitous)
198874    3.67E−03    1.33    FLJ10922|hypothetical protein FLJ10922
74738    3.74E−03    0.94    MGC20486|hypothetical protein MGC20486
130153    3.75E−03    1.15    SUPT5H|suppressor of Ty (S. 
cerevisiae) 5 homolog514693.82E−031.17ADPRTL2|ADP-ribosyltransferase (NAD+; poly(ADP-ribose) polymerase)-like 21227393.82E−031.28FLJ21918|hypothetical protein FLJ219187827873.83E−03−0.98FLJ21347|hypothetical protein FLJ2134718945193.84E−03−1.35FLJ12085|hypothetical protein FLJ120852443073.87E−030.92SERPINE1|serine (or cysteine) proteinase inhibitor, clade E (nexin, plasminogen activator inhibitor type 1),member 11378363.92E−03−0.99PDCD10|programmed cell death 1017027423.95E−030.63SLC7A5|solute carrier family 7 (cationic amino acid transporter, y+ system), member 58134904.00E−030.99CORO1C|coronin, actin-binding protein, 1C7705184.01E−030.99KIAA0618|KIAA0618 gene product8251764.02E−03−1.00FLJ11273|hypothetical protein FLJ112735309544.07E−031.17CFL2|cofilin 2 (muscle)15889734.08E−03−1.35IMAGE3451454|hypothetical protein IMAGE34514547695374.13E−030.94ECH1|enoyl Coenzyme A hydratase 1, peroxisomal4907534.15E−031.22FLJ20420|hypothetical protein FLJ204204885054.16E−030.73SLC6A8|solute carrier family 6 (neurotransmitter transporter, creatine), member 8209066-24.18E−030.68STK15|serine/threonine kinase 157672364.30E−03−1.07CGI-51|CGI-51 protein5030964.31E−031.10ESTs15754104.33E−031.14Homo sapiens, Similar to RIKEN cDNA 2700064H14 gene, clone MGC: 21390 IMAGE: 4519078, mRNA,complete cds7454374.33E−03−1.55ESTs5903384.33E−03−0.86LOC51065|40S ribosomal protein S27 isoform7573284.34E−031.43FLJ22678|hypothetical protein FLJ226787267864.35E−03−1.69MGC2821|hypothetical protein MGC2821510104.35E−031.13FLJ20859|hypothetical protein FLJ208597704304.40E−031.26DKFZP434D0421|hypothetical protein DKFZp434D04213659194.40E−03−1.03STAU|staufen (Drosophila, RNA-binding protein)444434.40E−03−1.08SCYE1|small inducible cytokine subfamily E, member 1 (endothelial monocyte-activating)8119074.50E−030.96FLJ22056|hypothetical protein FLJ220565021514.52E−030.56SLC16A3|solute carrier family 16 (monocarboxylic acid transporters), member 39506674.53E−03−1.02HRASLS|HRAS-like 
suppressor7427074.76E−031.33ESTs, Weakly similar to MUC2_HUMAN MUCIN 2 PRECURSOR [H. sapiens]2992744.79E−03−0.71Homo sapiens cDNA FLJ32430 fis, clone SKMUS2001129, weakly similar to NAD-DEPENDENTMETHANOL DEHYDROGENASE (EC 1.1.1.244)1353034.79E−03−0.87HT007|uncharacterized hypothalamus protein HT0077885114.80E−031.16RPS6KA1|ribosomal protein S6 kinase, 90 kD, polypeptide 120628254.82E−030.77KIAA0964|KIAA0964 protein6865524.83E−03−1.23GOLPH1|golgi phosphoprotein 15866504.85E−031.05SLC29A1|solute carrier family 29 (nucleoside transporters), member 122392904.86E−03−0.95SDF1|stromal cell-derived factor 125027224.87E−03−0.60LOH11CR2A|loss of heterozygosity, 11, chromosomal region 2, gene A5878474.88E−030.81GPX2|glutathione peroxidase 2 (gastrointestinal)20548964.89E−03−0.94FLJ21669|hypothetical protein FLJ216698121534.94E−03−1.14FLJ13081|hypothetical protein FLJ130818118884.97E−03−1.22DKFZP586F1122|hypothetical protein DKFZp586F1122 similar to axotrophin5048264.97E−03−1.31TFAM|transcription factor A, mitochondrial16356955.01E−030.55GGA2|Golgi-associated, gamma-adaptin ear containing, ARF-binding protein 216361665.07E−030.98KIAA0668|KIAA0668 protein3225115.09E−03−0.97Homo sapiens mRNA; cDNA DKFZp564D1462 (from clone DKFZp564D1462)263145.12E−03−1.13STXBP3|syntaxin binding protein 324306765.16E−031.40EZFIT|endothelial zinc finger protein induced by tumor necrosis factor alpha3465455.19E−030.93Homo sapiens cDNA FLJ30346 fis, clone BRACE200752715925305.22E−030.94IP6K2|mammalian inositol hexakisphosphate kinase 2326845.25E−03−1.15RPL32|ribosomal protein L322798005.28E−03−1.19SLMAP|sarcolemma associated protein17339355.30E−031.34DDX8|DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 8 (RNA helicase)8244875.30E−031.09MGC2594|hypothetical protein MGC25948132815.35E−03−0.72WWP1|WW domain-containing protein 11501375.38E−03−1.29DKFZP564O123|DKFZP564O123 protein1355035.38E−031.38BRD4|bromodomain-containing 47809475.39E−030.92POLD1|polymerase (DNA directed), delta 1, catalytic subunit (125 
kD)8844555.57E−031.04PRDX5|peroxiredoxin 52665005.63E−03−0.53ESTs513285.68E−031.00CDC34|cell division cycle 348977675.69E−032.04U5-100K|prp28, U5 snRNP 100 kd protein8110295.74E−030.89KIAA0365|KIAA0365 gene product8103915.74E−030.81HYAL1|hyaluronoglucosaminidase 123069195.76E−03−0.93SLC35A3|solute carrier family 35 (UDP-N-acetylglucosamine (UDP-GlcNAc) transporter), member 320188205.80E−03−1.19LRP3|low density lipoprotein receptor-related protein 34629395.82E−03−1.08Homo sapiens cDNA FLJ31303 fis, clone LIVER10000828824885.85E−031.27TERF2|telomeric repeat binding factor 22629165.87E−03−1.27PPM1B|protein phosphatase 1B (formerly 2C), magnesium-dependent, beta isoform19265755.90E−03−1.33CDX2|caudal type homeo box transcription factor 28142855.90E−03−1.34FLJ11240|hypothetical protein FLJ112402961905.92E−03−1.48KIAA0321|KIAA0321 protein348525.93E−03−1.01BIRC2|baculoviral IAP repeat-containing 214043965.95E−031.10PLCB3|phospholipase C, beta 3 (phosphatidylinositol-specific)4318696.00E−030.88Homo sapiens, clone IMAGE: 3506202, mRNA, partial cds8843886.05E−031.21FLJ21103|hypothetical protein FLJ2110323139216.14E−03−0.91NDUFB3|NADH dehydrogenase (ubiquinone) 1 beta subcomplex, 3 (12 kD, B12)8243526.14E−03−1.35RAD23B|RAD23 (S. 
cerevisiae) homolog B3219456.15E−03−1.25ESTs1405746.20E−030.42SCYD1|small inducible cytokine subfamily D (Cys-X3-Cys), member 1 (fractalkine, neurotactin)8239126.20E−03−0.96UBL3|ubiquitin-like 38541386.25E−031.01CSNK1E|casein kinase 1, epsilon4876976.26E−03−0.71CROT|carnitine O-octanoyltransferase8427656.27E−03−1.15PC326|PC326 protein7265976.35E−03−0.84Homo sapiens cDNA FLJ32642 fis, clone SYNOV20011441727856.38E−030.60LOC51754|NAG-5 protein8982516.41E−03−1.55FLJ20727|hypothetical protein FLJ207272019766.44E−03−1.82ELF1|E74-like factor 1 (ets domain transcription factor)420186.45E−03−1.09KIAA1468|KIAA1468 protein787366.47E−030.94Homo sapiens clone 24877 mRNA sequence1152926.48E−03−1.11DKFZp586C1924|hypothetical protein DKFZp586C1924229176.52E−03−0.66Homo sapiens mRNA; cDNA DKFZp761M0111 (from clone DKFZp761M0111)7552286.60E−030.66DNM1|dynamin 110756356.62E−030.85MTR1|MLSN1- and TRP-related8148266.66E−03−1.38ESTs3225616.67E−03−0.95RPL31|ribosomal protein L312398626.68E−03−1.96KIAA0962|KIAA0962 protein5905446.69E−03−1.17MAPK9|mitogen-activated protein kinase 98977686.78E−030.70COL7A1|collagen, type VII, alpha 1 (epidermolysis bullosa, dystrophic, dominant and recessive)3765516.83E−03−1.67ETAA16|ETAA16 protein20219566.84E−031.16LOC56930|hypothetical protein from EUROIMAGE 16693878776366.87E−03−1.18DCTN4|dynactin 4 (p62)7705796.87E−031.18CLDN3|claudin 33063186.91E−030.94ORC6L|origin recognition complex, subunit 6 (yeast homolog)-like8683087.01E−03−1.04ESTs, Highly similar to RS23_HUMAN 40S RIBOSOMAL PROTEIN S2 [H. 
sapiens]754157.02E−03−0.75HINT|histidine triad nucleotide-binding protein8238507.03E−030.71RAI14|retinoic acid induced 1417097867.05E−03−0.68TRPS1|trichorhinophalangeal syndrome I29196517.12E−030.57PGLYRP|peptidoglycan recognition protein9652237.12E−031.59TK1|thymidine kinase 1, soluble4902517.13E−03−1.18PPP1R2|protein phosphatase 1, regulatory (inhibitor) subunit 24691727.14E−03−1.31SEC22C|vesicle trafficking protein519817.15E−03−1.15GALNT2|UDP-N-acetyl-alpha-D-galactosamine: polypeptide N-acetylgalactosaminyltransferase 2 (GalNAc-T2)17329227.19E−030.66Homo sapiens mRNA; cDNA DKFZp762H106 (from clone DKFZp762H106)2889997.20E−030.90SPEC1|small protein effector 1 of Cdc427823397.23E−031.08PRKAB1|protein kinase, AMP-activated, beta 1 non-catalytic subunit2216327.34E−031.95EIF2B2|eukaryotic translation initiation factor 2B, subunit 2 (beta, 39 kD)16057847.34E−03−1.26SYNE-2|synaptic nuclei expressed gene 2420707.40E−030.58NT5|5′ nucleotidase (CD73)16377567.44E−031.07ENO1|enolase 1, (alpha)372057.45E−030.72ESTs16259457.46E−03−0.98NDRG3|N-myc downstream-regulated gene 3321227.46E−03−1.04FLJ10210|hypothetical protein FLJ102105952977.48E−03−0.99SNAPAP|SNARE associated protein snapin2566807.50E−03−1.08BITE|p10-binding protein16093727.50E−03−0.79RIPK3|receptor-interacting serine-threonine kinase 315347197.50E−031.05MYO1D|myosin ID22445617.52E−030.79CROC4|transcriptional activator of the c-fos promoter705337.52E−031.21HPS|Hermansky-Pudlak syndrome15626047.59E−031.25AP2A1|adaptor-related protein complex 2, alpha 1 subunit470261-27.66E−03−0.61SMA5|SMA57813417.71E−03−1.02HSPE1|heat shock 10 kD protein 1 (chaperonin 10)795657.72E−03−0.75FLJ22662|hypothetical protein FLJ22662527247.75E−030.98FLJ20241|hypothetical protein FLJ20241807277.75E−030.73ROR1|receptor tyrosine kinase-like orphan receptor 13770187.76E−031.00FLJ20850|hypothetical protein FLJ208508155077.77E−03−1.598416637.78E−030.95NARF|nuclear prelamin A recognition factor1478417.83E−03−0.82FLJ12287|hypothetical protein 
FLJ12287 similar to semaphorins7125597.91E−03−1.21SEC24A|SEC24 (S. cerevisiae) related gene family, member A10310297.92E−03−2.65Homo sapiens cDNA FLJ32971 fis, clone TESTI2008847665997.94E−03−0.38NAT1|N-acetyltransferase 1 (arylamine N-acetyltransferase)7892047.95E−03−1.20TLOC1|translocation protein 1710877.97E−031.11MAFF|v-maf musculoaponeurotic fibrosarcoma (avian) oncogene family, protein F2768167.98E−030.73KIAA1718|KIAA1718 protein8249158.00E−031.51CAPN10|calpain 102029018.07E−030.71VAV2|vav 2 oncogene6693758.10E−030.94DKK1|dickkopf (Xenopus laevis) homolog 121161888.13E−030.83HDAC5|histone deacetylase 58149138.18E−03−0.83C11orf15|chromosome 11 open reading frame 153060138.19E−03−0.88MS4A1|membrane-spanning 4-domains, subfamily A, member 19506788.21E−031.05SREBF2|sterol regulatory element binding transcription factor 222372798.25E−03−0.63LGI1|leucine-rich, glioma inactivated 1330768.33E−03−0.54LOC56994|cholinephosphotransferase 14699248.35E−031.07PCTP|phosphatidylcholine transfer protein1900218.40E−031.25PIASY|protein inhibitor of activated STAT protein PIASy7695798.42E−030.81MAP2K2|mitogen-activated protein kinase kinase 215588328.44E−03−1.08MAT2B|methionine adenosyltransferase II, beta7724558.45E−03−1.02PPP4C|protein phosphatase 4 (formerly X), catalytic subunit306738.49E−03−0.51KIAA1022|cortactin SH3 domain-binding protein4178848.49E−03−0.60Homo sapiens cDNA FLJ12052 fis, clone HEMBB1002042, moderately similar toCYTOCHROME P450 4C1 (EC 1.14.14.1)7574358.49E−03−0.49NKX3A|NK homeobox (Drosophila), family 3, A2309108.50E−031.1315591988.52E−03−0.95Homo sapiens cDNA FLJ14923 fis, clone PLACE1008244, weakly similar to VEGETATIBLEINCOMPATIBILITY PROTEIN HET-E-18093538.58E−030.99IRF3|interferon regulatory factor 35649818.66E−030.78Homo sapiens, Similar to RIKEN cDNA 2810433K01 gene, clone MGC: 10200 IMAGE: 3909951, mRNA,complete cds7860488.66E−030.90E2F4|E2F transcription factor 4, p107/p130-binding2090668.67E−030.62STK15|serine/threonine kinase 
1522140208.68E−03−1.36GRIN2D|glutamate receptor, ionotropic, N-methyl D-aspartate 2D8152768.68E−031.23NUP62|nucleoporin 62 kD8138458.75E−03−0.94RNUT1|RNA, U transporter 14715688.76E−030.89HN1|hematological and neurological expressed 18454198.77E−031.04FANCA|Fanconi anemia, complementation group A16317138.78E−03−1.02NEDD5|neural precursor cell expressed, developmentally down-regulated 525046988.83E−031.10ARRB2|arrestin, beta 219114638.90E−03−1.36ESTs14750288.94E−03−0.77RPS27|ribosomal protein S27 (metallopanstimulin 1)5021618.99E−030.75APPBP1|amyloid beta precursor protein-binding protein 1, 59 kD5094599.13E−030.99Homo sapiens cDNA FLJ14241 fis, clone OVARC10005337120499.14E−03−1.16IL24|interleukin 247855499.16E−03−1.28KIAA1902|KIAA1902 protein8094219.17E−03−0.85PCBD|6-pyruvoyl-tetrahydropterin synthase/dimerization cofactor of hepatocyte nuclear factor 1 alpha(TCF1)1544939.20E−03−0.89IFI41|interferon-induced protein 41, 30 kD1308459.25E−03−1.15PWP1|nuclear phosphoprotein similar to S. cerevisiae PWP125080449.30E−030.80HP|haptoglobin20139089.32E−03−1.0720541229.43E−03−0.39SLC11A3|solute carrier family 11 (proton-coupled divalent metal ion transporters), member 38121599.46E−031.15FLJ20337|hypothetical protein FLJ203377426959.49E−03−0.90Homo sapiens cDNA FLJ31534 fis, clone NT2RI2000671690029.50E−030.41ANGPTL4|angiopoietin-like 4328129.56E−03−0.98BCAS2|breast carcinoma amplified sequence 27530389.62E−030.76KIFC3|kinesin family member C37042999.74E−031.10TAZ|tafazzin (cardiomyopathy, dilated 3A (X-linked); endocardial fibroelastosis 2; Barth syndrome)8155019.74E−030.79MGC2721|hypothetical protein MGC272132083149.75E−03−0.58GPR27|G protein-coupled receptor 277583439.78E−031.01PPIF|peptidylprolyl isomerase F (cyclophilin F)3615879.80E−03−0.48KIAA1789|KIAA1789 protein8149519.81E−03−1.26Homo sapiens, RIKEN cDNA 2310005G07 gene, clone MGC: 10049 IMAGE: 3890955, mRNA,complete cds3237809.82E−031.34Homo sapiens cDNA FLJ11177 fis, clone PLACE100740216034049.82E−03−0.76LR8|LR8 
protein1326379.86E−03−0.97GCA|grancalcin, EF-hand calcium-binding protein1316539.87E−03−1.63MRPS12|mitochondrial ribosomal protein S128976699.87E−031.08PRKCSH|protein kinase C substrate 80K-H492739.89E−030.78SLC27A4|solute carrier family 27 (fatty acid transporter), member 45308759.97E−03−0.37TKT|transketolase (Wernicke-Korsakoff syndrome)









TABLE 5










297 gene subset of genes in Table 4









Clone_ID
GB_ID
Unigene_ID












22917
AL137346
Hs.13299


23772
NM_006767
Hs.78788


23831
NM_005165
Hs.155247


26314
NM_007269
Hs.8813


26507
AB002304
Hs.356290


26811
NM_003401
Hs.150930


26856
NM_004475
Hs.184488


30673
AB028945
Hs.12696


30673
AF141901
Hs.12696


32122
NM_018027
Hs.183639


32684
NM_000994
Hs.169793


32812
NM_005872
Hs.22960


33076
NM_020244
Hs.171889


34852
NM_001166
Hs.289107


38244
AL109693
Hs.301338


39677
NM_018184
Hs.104222


40173
AB018350
Hs.101474


41647
NM_007050
Hs.225952


42018
AB040901
Hs.23542


42070
NM_002526
Hs.153952


44443
NM_004757
Hs.333513


47795
NM_007146
Hs.6557


49273
NM_005094
Hs.248953


50794
NM_003434
Hs.78434


51328
L22005
Hs.76932


51469
AK001980
Hs.24284


51981
NM_000972
Hs.99858


52724
AK000482
Hs.181780


52724
NM_017721
Hs.181780


66599
NM_000662
Hs.155956


68345
NM_002224
Hs.77515


68557
NM_001443
Hs.351719


69002
NM_016109
Hs.9613


70533
NM_000195
Hs.83951


71087
NM_012323
Hs.51305


75415
NM_005340
Hs.256697


78736
AF131821
Hs.3964


80727
NM_005012
Hs.274243


83653
NM_014167
Hs.90527


84068
AK001913
Hs.7100


85409
NM_003851
Hs.5710


85614
NM_015344
Hs.11000


109316
NM_001085
Hs.234726


124046
NM_012279
Hs.181012


126221
NM_003288
Hs.154718


128426
AF156603
Hs.285681


130153
NM_003169
Hs.70186


130845
NM_007062
Hs.172589


132637
NM_012198
Hs.79381


134270
U68494
Hs.24385


135303
NM_018480
Hs.24371


135503
NM_014299
Hs.278675


137836
NM_007217
Hs.28866


138788
NM_000949
Hs.1906


140574
NM_002996
Hs.80420


140951
NM_004924
Hs.182485


150137
NM_014043
Hs.11449


150897
NM_014256
Hs.69009


154493
NM_004509
Hs.38125


154493
NM_004510
Hs.38125


155920
NM_018028
Hs.127240


165828
NM_013241
Hs.95231


166199
NM_001619
Hs.83636


172785
NM_016446
Hs.8087


190021
NM_015897
Hs.105779


198874
NM_018273
Hs.19039


201976
M82882
Hs.154365


202577
NM_006895
Hs.81182


221632
NM_014239
Hs.170001


229901
NM_001334
Hs.75262


235056
AF070535
Hs.78019


239862
AB023179
Hs.9059


242706
NM_014145
Hs.3576


244307
M16006
Hs.82085


244307
NM_000602
Hs.82085


262739
NM_007190
Hs.300208


262916
NM_002706
Hs.5687


267590
NM_012295
Hs.7840


277999
AL080129
Hs.225841


279085
NM_004145
Hs.159629


279800
NM_007159
Hs.4007


280249
NM_003709
Hs.21599


288999
NM_020239
Hs.22065


295781
AL035369
Hs.33922


296190
AB002319
Hs.8663


299388
NM_005796
Hs.151734


306013
X07203
Hs.89751


306318
NM_014321
Hs.49760


306933
AF131828
Hs.7961


307933
NM_002492
Hs.19236


322511
AL080078
Hs.85335


322561
NM_000993
Hs.184014


323693
NM_001283
Hs.57600


325515
AB037791
Hs.29716


345423
NM_015387
Hs.107942


365919
NM_004602
Hs.6113


365919
NM_017453
Hs.6113


365919
NM_017454
Hs.6113


376551
NM_019002
Hs.82664


377018
NM_017967
Hs.30783


469172
NM_004206
Hs.12942


469924
AF151638
Hs.285218


469924
NM_021213
Hs.285218


471568
NM_016185
Hs.109706


487297
NM_006366
Hs.296341


487697
AF073770
Hs.12743


487697
NM_021151
Hs.12743


488505
NM_005629
Hs.187958


490251
NM_006241
Hs.267819


490753
NM_017812
Hs.6693


491053
NM_006321
Hs.241558


502151
NM_004207
Hs.85838


502161
NM_003905
Hs.61828


502891
NM_018352
Hs.267446


504826
NM_003201
Hs.75133


529147
NM_004896
Hs.67052


530875
NM_001064
Hs.89643


530875
NM_005516
Hs.89643


530954
AL117457
Hs.180141


586650
NM_004955
Hs.25450


587847
NM_002083
Hs.2704


590338
NM_015920
Hs.108957


595297
NM_012437
Hs.32018


628357
NM_001104
Hs.1216


666169
NM_000254
Hs.82283


669375
NM_012242
Hs.40499


685516
NM_014373
Hs.97101


686552
AF020762
Hs.6831


704299
NM_000116
Hs.79021


712049
NM_006850
Hs.315463


712559
AJ131244
Hs.211612


724615
NM_001269
Hs.84746


725395
NM_004223
Hs.169895


739191
NM_005096
Hs.9568


741474
NM_000175
Hs.279789


742007
D63480
Hs.278634


744047
NM_005030
Hs.77597


745360
NM_003642
Hs.13340


753038
NM_005550
Hs.23131


754537
AK001091
Hs.274415


754582
NM_014210
Hs.70499


755228
NM_004408
Hs.166161


755578
NM_003486
Hs.184601


756662
AB023160
Hs.352535


756662
NM_013325
Hs.352535


757435
NM_006167
Hs.55999


758318
NM_012175
Hs.16577


758343
NM_005729
Hs.173125


767068
AL117452
Hs.44155


767495
NM_000168
Hs.72916


767753
NM_000449
Hs.166891


769004
NM_016195
Hs.240


769537
NM_001398
Hs.196176


769579
L11285
Hs.72241


769712
NM_005255
Hs.153227


770518
AL080109
Hs.295112


770518
NM_014833
Hs.295112


770579
NM_001306
Hs.25640


770588
AF000560
Hs.79531


770835
NM_000056
Hs.1265


772455
NM_002720
Hs.2903


774446
NM_001124
Hs.394


780947
NM_002691
Hs.99890


781222
NM_004740
Hs.75822


784150
NM_006868
Hs.223025


785459
AJ010306
Hs.149098


785459
NM_006932
Hs.149098


786048
NM_001950
Hs.108371


786674
Z31560
Hs.816


788511
NM_002953
Hs.149957


788745
NM_006571
Hs.39913


789204
NM_003262
Hs.8146


795893
NM_014330
Hs.76556


795936
NM_004622
Hs.75066


796114
NM_012238
Hs.31176


796255
AL049705
Hs.247324


796694
NM_001168
Hs.1578


809353
NM_001571
Hs.75254


809421
NM_000281
Hs.3192


810063
NM_005262
Hs.27184


810391
NM_007312
Hs.75619


810983
NM_015492
Hs.17936


811029
AB002363
Hs.190452


811790
NM_014044
Hs.13370


811888
AL050171
Hs.5306


812159
NM_017772
Hs.26898


813490
NM_014325
Hs.17377


813845
NM_005701
Hs.21577


814285
NM_018368
Hs.339833


815057
NM_018169
Hs.236844


815235
NM_014329
Hs.75682


815276
NM_012346
Hs.9877


815276
NM_016553
Hs.9877


815535
NM_000356
Hs.301266


823850
AB037755
Hs.15165


823912
NM_007106
Hs.173091


824352
NM_002874
Hs.178658


824510
NM_016062
Hs.9825


824915
NM_021251
Hs.112218


825176
NM_018374
Hs.3542


826256
NM_003272
Hs.15791


826286
NM_014652
Hs.158497


840506
NM_016085
Hs.9527


840865
NM_002356
Hs.75607


841663
AL137729
Hs.256526


841663
NM_012336
Hs.256526


842765
NM_018442
Hs.279882


842968
NM_001211
Hs.36708


845419
NM_000135
Hs.284153


854138
NM_001894
Hs.79658


855800
NM_002726
Hs.86978


855872
NM_002525
Hs.4099


856164
NM_015032
Hs.168625


856164
NM_015928
Hs.168625


860000
NM_002914
Hs.139226


877636
NM_016221
Hs.180952


882488
NM_005652
Hs.100030


884438
NM_006164
Hs.155396


884455
NM_012094
Hs.31731


897164
NM_001903
Hs.178452


897767
NM_004818
Hs.168103


897768
NM_000094
Hs.1640


897971
NM_016451
Hs.3059


898251
NM_017944
Hs.300700


898312
NM_004295
Hs.8375


950667
NM_020386
Hs.36761


950678
NM_004599
Hs.108689


965223
NM_003258
Hs.105097


1030351
NM_005409
Hs.103982


1075635
AJ270996
Hs.272287


1404396
Z26649
Hs.37121


1416782
NM_001823
Hs.173724


1466237
NM_015641
Hs.165986


1474164
NM_019108
Hs.10116


1475028
NM_001030
Hs.195453


1475738
NM_001028
Hs.113029


1500241
AL137572
Hs.48778


1506046
NM_018231
Hs.10499


1534719
AB018270
Hs.39871


1558832
AF182814
Hs.54642


1592530
AL117458
Hs.323432


1592530
AL137514
Hs.323432


1592530
NM_016291
Hs.323432


1601601
NM_000758
Hs.1349


1603404
NM_014020
Hs.190161


1603583
NM_003022
Hs.14368


1605784
AL080133
Hs.57749


1605784
AL117404
Hs.57749


1609372
NM_006871
Hs.268551


1631713
NM_004404
Hs.155595


1635581
NM_016539
Hs.105463


1635618
NM_014931
Hs.72172


1635695
NM_015044
Hs.155546


1636166
AB014568
Hs.5898


1637282
NM_000189
Hs.198427


1637756
M55914
Hs.254105


1637756
NM_001428
Hs.254105


1693357
NM_001956
Hs.1407


1702742
NM_003486
Hs.184601


1709786
NM_014112
Hs.26102


1732922
AL162069
Hs.140978


1733935
NM_004941
Hs.171872


1734309
AF262992
Hs.123159


1737724
NM_002319
Hs.125742


1752548
NM_019098
Hs.154433


1871423
NM_004661
Hs.153546


1882051
NM_017657
Hs.7942


1894519
AL157464
Hs.48827


1903067
NM_017438
Hs.50748


1908840
NM_003450
Hs.155204


1913943
NM_002032
Hs.62954


1926249
AF052087
Hs.128425


1926575
NM_001265
Hs.77399


1947804
NM_016381
Hs.278408


2009779
NM_004703
Hs.326056


2015148
NM_014030
Hs.318339


2016426
AB014564
Hs.22616


2018808
NM_005040
Hs.75693


2054122
NM_014585
Hs.5944


2062825
NM_014902
Hs.177425


2116188
NM_005474
Hs.9028


2125819
NM_004324
Hs.159428


2237279
NM_005097
Hs.194704


2239290
NM_000609
Hs.237356


2239290
U16752
Hs.237356


2244561
NM_006365
Hs.322469


2306919
NM_012243
Hs.159322


2307119
NM_001566
Hs.32944


2307119
NM_004027
Hs.32944


2313673
AL080084
Hs.348996


2313673
NM_016040
Hs.348996


2313921
NM_002491
Hs.109760


2502722
NM_014622
Hs.152944


2504698
NM_004313
Hs.18142


2508044
NM_005143
Hs.75990


2919651
NM_005091
Hs.137583


3208314
NM_018971
Hs.278283









Example IV
Molecular Signatures of Four Additional Breast Cancer Subtypes

Frozen breast cancer samples from 247 patients were expression profiled and classified into four subtypes (A, B, C, and D) based on the expression of gene sequences in correlation with survival outcomes of the patients from whom the samples were obtained.


Within the set of 247 samples, 143 were ER+ as determined by a biomarker test. Within this set of 143, microdissection was used to obtain breast cancer cells for identification of a molecular signature (i.e., expression of genes) that differentially categorized the ER+ group into subtypes A and B. The remaining samples were microdissected to obtain cells for identification of subtypes C and D.


The 50 genes that are overexpressed in relation to each of subtypes A, B, C, and D are shown in Tables 6, 7, 8, and 9, respectively. The numbers of samples classified into subtypes A, B, C, and D are 86, 57, 70, and 34, respectively.
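The tables for the subtypes report P values from a Wilcoxon test. As an illustration only, the following minimal sketch shows one way such a ranking could be computed: a one-sided Wilcoxon rank-sum test (normal approximation, no tie correction) comparing samples inside a subtype against all other samples, followed by selection of the genes with the smallest P values. The sidedness of the test, the function names, and the toy data are assumptions made for this sketch and are not taken from the specification.

```python
import math


def rank_with_ties(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_p_value(in_group, out_group):
    """One-sided P value that in_group tends to be larger than out_group.

    Assumption: normal approximation to the rank-sum statistic,
    without a correction for ties.
    """
    n1, n2 = len(in_group), len(out_group)
    ranks = rank_with_ties(list(in_group) + list(out_group))
    w = sum(ranks[:n1])                       # rank sum of the subtype samples
    mean_w = n1 * (n1 + n2 + 1) / 2.0
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (w - mean_w) / sd_w
    return 0.5 * math.erfc(z / math.sqrt(2.0))  # upper-tail probability


def top_overexpressed(expression, labels, subtype, n_top=50):
    """Return up to n_top (p_value, gene) pairs, smallest P value first.

    expression: dict mapping gene name -> list of values, one per sample.
    labels: list of subtype labels, one per sample, in the same order.
    """
    scored = []
    for gene, values in expression.items():
        inside = [v for v, lab in zip(values, labels) if lab == subtype]
        outside = [v for v, lab in zip(values, labels) if lab != subtype]
        scored.append((rank_sum_p_value(inside, outside), gene))
    scored.sort()
    return scored[:n_top]


if __name__ == "__main__":
    # Hypothetical toy data: 4 samples of subtype "A", 4 of subtype "B".
    toy_labels = ["A"] * 4 + ["B"] * 4
    toy_expression = {
        "GENE_UP": [9, 8, 7, 6, 1, 2, 3, 4],
        "GENE_FLAT": [1, 5, 2, 6, 3, 7, 4, 8],
    }
    for p, gene in top_overexpressed(toy_expression, toy_labels, "A", n_top=2):
        print(gene, p)
```

With real data the lists would hold one expression value per profiled sample, and `n_top=50` would yield a list of the same length as each of the subtype tables.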


Subtypes A and B are both subtypes of ER+ samples with significantly different survival outcomes, as shown in FIG. 3. Subtype C samples are ER−, and so the gene sequences that define subtype C may be viewed as, and used as, gene sequences the overexpression of which is indicative of ER− status. The survival outcomes of patients with subtype C samples are shown in FIG. 3. It is interesting to note that subtype B samples are from patients with survival similar to that of subtype C (patients whose tumors were ER negative). As such, an additional aspect of the invention is the treatment of patients with subtype B breast cancer cells in the manner of treating patients with cells having an ER negative phenotype.


Subtype D samples are independent of ER status and thus contain samples that may be ER+ or ER−. The survival outcomes of patients with subtype D samples are also shown in FIG. 3. Similar to subtype B as discussed above, the invention provides for the treatment of patients with subtype D breast cancer cells in the manner of treating patients with cells having an ER negative phenotype.
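The comparison of survival outcomes between subtypes can be illustrated with a survival curve per subtype. The sketch below assumes a Kaplan-Meier product-limit estimator, a common choice that this passage does not itself name; the function name and the follow-up data are hypothetical and serve only to show the calculation.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each time with an observed death.

    times: follow-up time for each patient.
    events: 1 if death was observed at that time, 0 if the patient was censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after time t.
        at_risk -= removed
    return curve


if __name__ == "__main__":
    # Hypothetical follow-up times (years) and event flags for two subtypes.
    follow_up = {
        "subtype_X": ([2, 3, 3, 5, 8], [1, 1, 0, 1, 0]),
        "subtype_Y": ([4, 6, 7, 9, 10], [0, 1, 0, 0, 0]),
    }
    for name, (times, events) in follow_up.items():
        print(name, kaplan_meier(times, events))
```

Curves computed this way for each subtype could then be overlaid to compare, for example, subtype B against subtype C.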

TABLE 6

50 gene sequences which define Subtype A

P values (Wilcoxon Test)
GeneID
Description

6.40592E−18
AW473119
ESR1|estrogen receptor 1

4.98711E−17
AA130089
ESTs

5.56867E−17
AL049265
Homo sapiens mRNA; cDNA DKFZp564F053 (from clone DKFZp564F053)

2.14044E−16
AL360204
Homo sapiens mRNA full length insert cDNA clone EUROIMAGE 980547

3.93903E−16
AK000158
FLJ20151|hypothetical protein FLJ20151

8.60498E−16
AI457338
Homo sapiens cDNA FLJ33115 fis, clone TRACH2001314

1.02633E−15
AL157499
Homo sapiens mRNA; cDNA DKFZp434N2412 (from clone DKFZp434N2412)

1.0264E−15
AK024999
Homo sapiens cDNA: FLJ21346 fis, clone COL02705

1.14067E−15
AF131785
KIAA0882|KIAA0882 protein

1.51026E−15
AW265341
ESTs

1.56394E−15
AI439798
FGD3|FGD1 family, member 3

1.61961E−15
AK022441
Homo sapiens cDNA FLJ12379 fis, clone MAMMA1002554

1.86262E−15
BC008317
LIV-1|LIV-1 protein, estrogen regulated

1.92875E−15
BC014948
MLPH|melanophilin

3.99501E−15
AF176012
JDP1|J domain containing protein 1

4.58544E−15
AI200852
ESTs

5.2605E−15
AW015443
ESTs, Weakly similar to JE0350 Anterior gradient-2 [H. sapiens]

6.24497E−15
R49089
ESTs, Moderately similar to T12539 hypothetical protein DKFZp434J154.1 [H. sapiens]

6.68731E−15
AW300348
Homo sapiens ovarian cancer-related protein 2 (OCR2) mRNA, complete cds

8.4916E−15
AF070632
Homo sapiens clone 24405 mRNA sequence

1.27628E−14
AI277016
ESTs

1.27636E−14
BF433570
ESTs

1.3202E−14
AL133622
KIAA0876|KIAA0876 protein

1.34262E−14
BE967259
BCL2|B-cell CLL/lymphoma 2

1.78871E−14
AI364725
KIAA0239|KIAA0239 protein

1.91317E−14
BC007997
RERG|RAS-like, estrogen-regulated, growth-inhibitor

2.50201E−14
AY009106
DKFZP434I092|DKFZP434I092 protein

3.61137E−14
AK000269
FLJ20262|hypothetical protein FLJ20262

4.05649E−14
AI263695
NME5|non-metastatic cells 5, protein expressed in (nucleoside-diphosphate kinase)

4.55599E−14
AL050116
Homo sapiens mRNA; cDNA DKFZp586A131 (from clone DKFZp586A131)

4.8679E−14
BF110928
ESTs, Weakly similar to I38022 hypothetical protein [H. sapiens]

7.97977E−14
AF035282
C1orf21|chromosome 1 open reading frame 21

8.52063E−14
AA775255
ANKHZN|ANKHZN protein

9.09746E−14
AF052504
RNB6|RNB6

1.00347E−13
AI912086
Homo sapiens cDNA FLJ30744 fis, clone FEBRA2000378

1.07127E−13
BC013732
NAT1|N-acetyltransferase 1 (arylamine N-acetyltransferase)

1.1068E−13
AF007153
Homo sapiens clone 23736 mRNA sequence

1.14343E−13
AK058158
Homo sapiens cDNA FLJ25429 fis, clone TST05630

1.34564E−13
BC017701
AD036|AD036 protein

1.39009E−13
BF129497
EST

1.6349E−13
NM_020974
CEGP1|CEGP1 protein

1.80162E−13
AL136926
DKFZP586M1120|hypothetical protein DKFZp586M1120

1.98501E−13
NM_016613
LOC51313|AD021 protein

2.05012E−13
AI128582
ESTs

2.11732E−13
AA826324
Homo sapiens cDNA FLJ32320 fis, clone PROST2003537

2.25829E−13
BC010607
Homo sapiens, clone MGC: 18216 IMAGE: 4156235, mRNA, complete cds

3.01538E−13
AK027148
FLJ23495|hypothetical protein FLJ23495

4.2846E−13
AI382972
TPBG|trophoblast glycoprotein

4.71356E−13
BC017338
FUCA1|fucosidase, alpha-L-1, tissue

5.02267E−13
BC000809
TCEAL1|transcription elongation factor A (SII)-like 1









TABLE 7










50 gene sequences which define Subtype B









P values (Wilcoxon Test)
GeneID
Description





1.38458E−08
BC007659
NQO1|NAD(P)H dehydrogenase, quinone 1


1.14979E−07
NM_012134
LMOD1|leiomodin 1 (smooth muscle)


 1.664E−07
BF436656
MFAP4|microfibrillar-associated protein 4


2.33563E−07
BC010690
FLJ14529|hypothetical protein FLJ14529


5.84863E−07
AF035408
CILP|cartilage intermediate layer protein, nucleotide pyrophosphohydrolase


5.99703E−07
NM_014890
DOC1|downregulated in ovarian cancer 1


8.49583E−07
AF068651
LDB2|LIM domain binding 2


1.32045E−06
BE671609
ESTs, Weakly similar to T28770 hypothetical protein W03D2.1 - Caenorhabditis elegans [C. elegans]


 1.3529E−06
BC005939
PTGDS|prostaglandin D2 synthase (21 kD, brain)


 1.4201E−06
BC011535
DKFZP566K1924|DKFZP566K1924 protein


1.45481E−06
BC008750
NDN|necdin homolog (mouse)


1.52693E−06
AI378647
ESTs


1.94159E−06
AI499501
ESTs, Weakly similar to FMOD_HUMAN FIBROMODULIN PRECURSOR [H. sapiens]


2.24009E−06
AL079279

Homo sapiens mRNA full length insert cDNA clone EUROIMAGE 248114



2.83756E−06
AJ295149
LOC64174|putative dipeptidase


3.42268E−06
AK024551
FLJ20898|hypothetical protein FLJ20898


3.75687E−06
AI095484

Homo sapiens cDNA FLJ32163 fis, clone PLACE6000371



3.80068E−06
U67784
RDC1|G protein-coupled receptor


 4.2186E−06
AF035269
PS-PLA1|phosphatidylserine-specific phospholipase A1alpha


4.31724E−06
AF137027
TCL1B|T-cell leukemia/lymphoma 1B


4.52117E−06
BC012160
TNFRSF7|tumor necrosis factor receptor superfamily, member 7


4.52117E−06
BC001232
C6orf32|chromosome 6 open reading frame 32


5.55831E−06
NM_003734
AOC3|amine oxidase, copper containing 3 (vascular adhesion protein 1)


5.55831E−06
AI952055
ESTs


6.15839E−06
BC018650
EDG1|endothelial differentiation, sphingolipid G-protein-coupled receptor, 1


 7.3812E−06
BC016964

Homo sapiens, clone MGC: 21621 IMAGE: 4181577, mRNA, complete cds



7.63505E−06
AL136805
KIAA1474|KIAA1474 protein


7.80877E−06
NM_001773
CD34|CD34 antigen


7.80877E−06
BC009698
APOC1|apolipoprotein C-I


8.35283E−06
BC015694
KIAA1607|KIAA1607 protein


8.54208E−06
R42463
ENTPD1|ectonucleoside triphosphate diphosphohydrolase 1


9.34072E−06
AI470943
ESTs


1.06731E−05
AJ238044
BDKRB1|bradykinin receptor B1


1.09121E−05
X86163
BDKRB2|bradykinin receptor B2


1.14056E−05
AI754777
ESTs


1.16602E−05
AW024539
ESTs


 1.1789E−05
AW295374

Homo sapiens cDNA FLJ11422 fis, clone HEMBA1001008



1.27335E−05
AA749213
GMFG|glia maturation factor, gamma


1.33048E−05
BC016755
HFL1|H factor (complement)-like 1


1.35995E−05
AI671590
C11orf21|chromosome 11 open reading frame 21


1.48413E−05
NM_001504
GPR9|G protein-coupled receptor 9


1.51683E−05
AW874252
ESTs, Moderately similar to PBK1 protein [H. sapiens]


1.51686E−05
AF052094
EPAS1|endothelial PAS domain protein 1


1.72788E−05
NM_002405
MFNG|manic fringe homolog (Drosophila)


1.76565E−05
AK025307
CPT1A|carnitine palmitoyltransferase I, liver


1.80417E−05
NM_000609
SDF1|stromal cell-derived factor 1


1.80421E−05
NM_004419
DUSP5|dual specificity phosphatase 5


1.96658E−05
BI492073
ITM2A|integral membrane protein 2A


2.00929E−05
X56210
HFL2|H factor (complement)-like 2


2.05284E−05
AF131817

Homo sapiens clone 25023 mRNA sequence

















TABLE 8

50 gene sequences which define Subtype C

P value (Wilcoxon test)  GeneID     Description
1.12657E−20              AW450675   ESTs
1.96271E−20              AW139831   Homo sapiens cDNA FLJ11796 fis, clone HEMBA1006158, highly similar to Homo sapiens transcription factor forkhead-like 7 (FKHL7) gene
1.96289E−20              NM_014211  GABRP|gamma-aminobutyric acid (GABA) A receptor, pi
6.14853E−20              AW004032   LOC56963|hypothetical protein from EUROIMAGE 363668
6.41109E−20              NM_001453  FOXC1|forkhead box C1
7.58367E−20              N31940     ESTs, Weakly similar to 2004399A chromosomal protein [H. sapiens]
2.06095E−19              NM_005044  PRKX|protein kinase, X-linked
3.82617E−19              AF257472   C21orf68|chromosome 21 open reading frame 68
3.98699E−19              AI567843   ESTs, Weakly similar to JC5314 CDC28/cdc2-like kinase associating arginine-serine cyclophilin [H. sapiens]
4.15413E−19              AI160174   ESTs
5.09939E−19              AW140023   FLJ13204|hypothetical protein FLJ13204
5.5344E−19               AI800206   STAC|src homology three (SH3) and cysteine rich domain
7.0715E−19               AA767129   PRKY|protein kinase, Y-linked
2.02758E−18              AJ404611   BCL11A|B-cell CLL/lymphoma 11A (zinc finger protein)
2.28777E−18              AI804716   ESTs
2.28777E−18              AJ010277   TBX19|T-box 19
2.91023E−18              BC017913   ART3|ADP-ribosyltransferase 3
3.15313E−18              AAI56097   ESTs, Weakly similar to LKHU proteoglycan link protein precursor [H. sapiens]
3.69992E−18              NM_032047  B3GNT5|UDP-GlcNAc: betaGal beta-1,3-N-acetylglucosaminyltransferase 5
4.0074E−18               AF118070   DKFZp762A227|hypothetical protein DKFZp762A227
4.0074E−18               AK026733   Homo sapiens cDNA: FLJ23080 fis, clone LNG06052
4.5165E−18               AW071804   ESTs
4.5165E−18               AB037813   DKFZp762K222|hypothetical protein DKFZp762K222
5.51045E−18              BC017352   TRIM29|tripartite motif-containing 29
5.73373E−18              AW204371   DSC2|desmocollin 2
6.2074E−18               BC000045   TONDU|TONDU
9.59111E−18              S72493     KRT16|keratin 16 (focal non-epidermolytic palmoplantar keratoderma)
1.79795E−17              AW206460   KIAA0481|KIAA0481 gene product
1.79795E−17              NM_002852  PTX3|pentaxin-related gene, rapidly induced by IL-1 beta
2.65568E−17              AK025251   CHST3|carbohydrate (chondroitin 6) sulfotransferase 3
2.761E−17                AK026946   FLJ23293|likely ortholog of mouse ADP-ribosylation-like factor 6 interacting protein 2
3.22481E−17              AF084830   KCNK5|potassium channel, subfamily K, member 5 (TASK-2)
4.56904E−17              AF070614   SCHIP1|schwannomin interacting protein 1
4.93528E−17              BF433019   ESTs, Weakly similar to TRHY_HUMAN TRICHOHYALI [H. sapiens]
5.54062E−17              AA622986   ESTs
7.53411E−17              NM_005401  PTPN14|protein tyrosine phosphatase, non-receptor type 14
8.78218E−17              NM_002639  SERPINB5|serine (or cysteine) proteinase inhibitor, clade B (ovalbumin), member 5
9.12461E−17              U95089     EGFR|epidermal growth factor receptor (erythroblastic leukemia viral (v-erb-b) oncogene homolog, avian)
1.0631E−16               NM_003034  SIAT8A|sialyltransferase 8A (alpha-N-acetylneuraminate: alpha-2,8-sialytransferase, GD3 synthase)
1.0631E−16               AF308297   PPP1R14C|protein phosphatase 1, regulatory (inhibitor) subunit 14C
2.02749E−16              BC016004   MARCO|macrophage receptor with collagenous structure
2.54298E−16              AI741143   Homo sapiens cDNA FLJ32401 fis, clone SKMUS2000339
3.06941E−16              H29323     SFRP1|secreted frizzled-related protein 1
3.30861E−16              AI188827   PIM1|pim-1 oncogene
3.37105E−16              AL110178   TRIM2|tripartite motif-containing 2
3.43538E−16              AI740531   MAPK4|mitogen-activated protein kinase 4
6.01505E−16              BC012107   SH2D2A|SH2 domain protein 2A
6.4813E−16               BC017918   LOC64148|17 kD fetal brain protein
6.72616E−16              AK026818   Homo sapiens cDNA: FLJ23165 fis, clone LNG09846
7.24508E−16              BC018646   PLCG2|phospholipase C, gamma 2 (phosphatidylinositol-specific)

TABLE 9

50 gene sequences which define Subtype D

P value (Wilcoxon test)  GeneID     Description
2.77034E−09              AA609183   ESTs
2.87559E−09              AA843233   ESTs, Weakly similar to I38344 titin, cardiac muscle [H. sapiens]
1.15332E−08              BF003134   CLCA2|chloride channel, calcium activated, family member 2
3.9503E−08               BC017073   Homo sapiens, Similar to RIKEN cDNA 1810054O13 gene, clone IMAGE: 3845933, mRNA, partial cds
4.23232E−08              AL117406   ABCC11|ATP-binding cassette, sub-family C (CFTR/MRP), member 11
5.5684E−08               BC005297   KMO|kynurenine 3-monooxygenase (kynurenine 3-hydroxylase)
1.13109E−07              BC002480   FLJ13352|hypothetical protein FLJ13352
1.73946E−07              BC000051   KIAA0950|lifeguard
1.79754E−07              BC005246   TM4SF3|transmembrane 4 superfamily member 3
2.18736E−07              AA991437   ESTs
2.65798E−07              AW444437   ESTs
3.43985E−07              AI090561   M160|scavenger receptor cysteine-rich type 1 protein M160 precursor
4.03622E−07              AI139456   LOC118430|small breast epithelial mucin
4.73181E−07              U63008     HGD|homogentisate 1,2-dioxygenase (homogentisate oxidase)
5.36992E−07              AI304573   CEACAM7|carcinoembryonic antigen-related cell adhesion molecule 7
6.09026E−07              BC010910   MCJ|DNAJ domain-containing
6.09026E−07              NM_001197  BIK|BCL2-interacting killer (apoptosis-inducing)
8.06728E−07              X60069     GGT1|gamma-glutamyltransferase 1
9.13192E−07              AK024899   ENPP3|ectonucleotide pyrophosphatase/phosphodiesterase 3
1.00177E−06              BF508222   ESTs
1.28014E−06              AL080207   ABCA12|ATP-binding cassette, sub-family A (ABC1), member 12
1.89723E−06              AA913512   LOC56624|mitochondrial ceramidase
2.01447E−06              M30474     GGT2|gamma-glutamyltransferase 2
2.07567E−06              AW666005   PRM3|protamine 3
2.27002E−06              AI783781   EST
2.33874E−06              NM_001445  FABP6|fatty acid binding protein 6, ileal (gastrotropin)
2.55664E−06              BC005257   MSMB|microseminoprotein, beta-
2.96382E−06              AK025757   FLJ22104|hypothetical protein FLJ22104
3.05238E−06              BF511014   CTRP2|complement-c1q tumor necrosis factor-related protein 2
3.85783E−06              AF027977   PPEF1|protein phosphatase, EF hand calcium-binding domain 1
3.97159E−06              AK024360   FLJ14298|hypothetical protein FLJ14298
4.08891E−06              X53578     FUT3|fucosyltransferase 3 (galactoside 3(4)-L-fucosyltransferase, Lewis blood group included)
5.61574E−06              BC011020   MPHOSPH6|M-phase phosphoprotein 6
5.61574E−06              AB014603   KIAA0703|KIAA0703 gene product
6.11857E−06              BC002805   GJB1|gap junction protein, beta 1, 32 kD (connexin 32, Charcot-Marie-Tooth neuropathy, X-linked)
6.47721E−06              BI711505   HLXB9|homeo box HB9
6.47735E−06              N51717     ESTs
6.85615E−06              BC017772   HT021|HT021
7.4642E−06               AF007149   Homo sapiens clone 24771 mRNA sequence
8.12347E−06              AF331643   Homo sapiens chromosome 17 open reading frame 26 (C17orf26) mRNA, complete cds
8.35512E−06              H19129     FGF12|fibroblast growth factor 12
8.59342E−06              AK025289   KLHL2|kelch-like 2, Mayven (Drosophila)
8.83782E−06              BC014209   BM040|uncharacterized bone marrow protein BM040
9.34702E−06              BC011587   Homo sapiens, Similar to RIKEN cDNA 1700018O18 gene, clone IMAGE: 4121436, mRNA, partial cds
9.61178E−06              AW410306   NXPH4|neurexophilin 4
9.61219E−06              BF108852   ERBB2|v-erb-b2 erythroblastic leukemia viral oncogene homolog 2, neuro/glioblastoma derived oncogene homolog (avian)
9.74693E−06              BC016153   Homo sapiens, Similar to hypothetical protein FLJ10134, clone MGC: 13208 IMAGE: 3841102, mRNA, complete cds
1.04507E−05              AF023676   TM7SF2|transmembrane 7 superfamily member 2
1.07451E−05              BC004925   Homo sapiens, Similar to G protein-coupled receptor, family C, group 5, member C, clone MGC: 10304 IMAGE: 3622005, mRNA, complete cds
1.10479E−05              AW299530   ESTs

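The P values in the preceding tables are reported as Wilcoxon test results. As an illustration only, the following Python sketch shows one way such a per-gene ranking could be computed. It assumes a two-group comparison (samples inside a subtype versus all other samples), a two-sided rank-sum test with the normal approximation, and no tie or continuity correction to the variance. The function names, gene labels, and expression values are hypothetical and are not taken from this disclosure.

```python
import math


def rank_sum_p(in_group, out_group):
    """Two-sided Wilcoxon rank-sum P value (normal approximation, assumed)."""
    # Label 0 marks samples inside the subtype, label 1 marks the rest.
    pooled = sorted([(v, 0) for v in in_group] + [(v, 1) for v in out_group])
    n1, n2 = len(in_group), len(out_group)
    rank_sum_in = 0.0
    i = 0
    while i < len(pooled):
        # Find the run of tied values and give each the average rank.
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        rank_sum_in += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    mean = n1 * (n1 + n2 + 1) / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (rank_sum_in - mean) / sd
    return math.erfc(abs(z) / math.sqrt(2.0))


def rank_genes(expression, in_subtype):
    """Return (P value, gene id) pairs sorted from most to least significant."""
    results = []
    for gene_id, values in expression.items():
        inside = [v for v, flag in zip(values, in_subtype) if flag]
        outside = [v for v, flag in zip(values, in_subtype) if not flag]
        results.append((rank_sum_p(inside, outside), gene_id))
    return sorted(results)


# Hypothetical expression values: four subtype samples, then four others.
expression = {
    "GENE_A": [9.1, 8.7, 9.4, 8.9, 2.1, 2.5, 1.9, 2.2],
    "GENE_B": [5.0, 5.2, 4.9, 5.1, 5.0, 5.3, 4.8, 5.1],
}
in_subtype = [True] * 4 + [False] * 4

ranked = rank_genes(expression, in_subtype)
for p_value, gene_id in ranked:
    print(gene_id, p_value)
```

In this toy input GENE_A separates the two groups completely and therefore ranks first, while GENE_B shows no separation. With only four samples per group the normal approximation is coarse; an exact test would be preferable at such sample sizes.
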
All references cited herein, including patents, patent applications, and publications, are hereby incorporated by reference in their entireties, whether previously specifically incorporated or not.


Having now fully described this invention, it will be appreciated by those skilled in the art that the same can be performed within a wide range of equivalent parameters, concentrations, and conditions without departing from the spirit and scope of the invention and without undue experimentation.


While this invention has been described in connection with specific embodiments thereof, it will be understood that it is capable of further modifications. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains and as may be applied to the essential features hereinbefore set forth.

Claims
  • 1-7. (canceled)
  • 8. A method to determine the prognosis or clinical course and aggressiveness of breast cancer of a subject comprising assaying for the expression level(s) of one or more genes in Table 2, 3, 4, 6, 7, 8, or 9 from a breast cancer cell sample from the subject.
  • 9. The method of claim 8 wherein said assaying comprises preparing RNA, optionally labeled, from said sample and optionally converting said RNA into cDNA, optionally labeled.
  • 10. The method of claim 9 wherein said RNA is not labeled and used for quantitative PCR.
  • 11. The method of claim 9 wherein said assaying comprises using an array.
  • 12. The method of claim 8 wherein said sample is a ductal lavage or fine needle aspiration or FFPE breast tissue sample.
  • 13. The method of claim 12 wherein said sample is microdissected to isolate one or more cells that are breast cancer cells or suspected of being breast cancer cells.
  • 14. The method of claim 10 wherein genes from Table 4 are used and further comprising determination of the ratio of the expression of an underexpressed gene to the expression of an overexpressed gene as an indicator of prognosis or clinical course and aggressiveness of breast cancer in said subject.
  • 15. A method of determining prognosis of a subject having breast cancer, said method comprising: assaying for the expression level(s) of one or more genes in Table 2, 3, 4, 6, 7, 8, or 9 from a breast cancer cell sample from said subject.
  • 16. The method of claim 15 wherein said assaying comprises preparing RNA, optionally labeled, from said sample and optionally converting said RNA into cDNA, optionally labeled.
  • 17. The method of claim 16 wherein said RNA is not labeled and used for quantitative PCR.
  • 18. The method of claim 15 wherein said assaying comprises using an array.
  • 19. The method of claim 15 wherein said sample is a ductal lavage or fine needle aspiration or FFPE breast tissue sample.
  • 20. The method of claim 19 wherein said sample is microdissected to isolate one or more cells that are breast cancer cells or suspected of being breast cancer cells.
  • 21. The method of claim 17 wherein genes from Table 4 are used and further comprising determination of the ratio of the expression of an underexpressed gene to the expression of an overexpressed gene as an indicator of prognosis in said subject.
  • 22. A method to determine the survival outcome of a breast cancer afflicted subject comprising assaying a sample of breast cancer cells of said subject for the expression level(s) of one or more genes listed in Table 2, 3, 4, 6, 7, 8, or 9.
  • 23. The method of claim 22 wherein said assaying comprises preparing RNA, optionally labeled, from said sample and optionally converting said RNA into cDNA, optionally labeled.
  • 24. The method of claim 23 wherein said RNA is not labeled and used for quantitative PCR.
  • 25. The method of claim 22 wherein said assaying comprises using an array.
  • 26. The method of claim 22 wherein said sample is a ductal lavage or fine needle aspiration or FFPE breast tissue sample.
  • 27. The method of claim 26 wherein said sample is microdissected to isolate one or more cells that are breast cancer cells or suspected of being breast cancer cells.
  • 28. The method of claim 24 wherein genes from Table 4 are used and further comprising determination of the ratio of the expression of an underexpressed gene to the expression of an overexpressed gene as an indicator of prognosis in said subject.
  • 29. (canceled)
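
The ratio recited in claims 14, 21, and 28 can be illustrated with a minimal Python sketch. It assumes that expression is measured by quantitative PCR as threshold cycle (Ct) values, that both assays share the same threshold, and that both amplify with the same per-cycle efficiency (2.0 for perfect doubling). The function name and the Ct values are hypothetical and serve only as an example.

```python
def expression_ratio_from_ct(ct_under, ct_over, efficiency=2.0):
    """Ratio of underexpressed-gene to overexpressed-gene abundance from Ct values."""
    # Starting abundance is proportional to efficiency ** (-Ct), so the
    # ratio under/over equals efficiency ** (ct_over - ct_under).
    return efficiency ** (ct_over - ct_under)


# Hypothetical Ct values: the underexpressed gene crosses threshold later.
ratio = expression_ratio_from_ct(ct_under=28.0, ct_over=24.0)
print(ratio)  # 2 ** (24 - 28) = 0.0625
```

Under these assumptions a smaller ratio means the underexpressed gene is less abundant relative to the overexpressed gene. How a given ratio maps to prognosis is not specified by this sketch.
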
RELATED APPLICATIONS

This application claims the benefit of priority from U.S. Provisional Patent Application No. 60/453,006, filed Mar. 7, 2003, which is hereby incorporated by reference in its entirety as if fully set forth.

Provisional Applications (1)
Number Date Country
60453006 Mar 2003 US