It is widely accepted that carcinogenesis is a multistep process involving genetic and epigenetic changes that dysregulate molecular control of cell proliferation and differentiation (Balmain, 2003, Nat. Genet. 33, 238-244). The genetic changes can include activation of proto-oncogenes and/or inactivation of tumor suppressor genes that can initiate tumorigenesis. Tumor progression and metastasis are also multi-stage processes by which tumor cells leave the site of a primary tumor, enter blood and lymph vessels, migrate to distant parts of the body and form new foci of tumor growth. Metastasis is a major cause of mortality for cancer patients. Many studies on cancer metastasis have been conducted, and several molecules participating in tumor cell invasion and metastasis have been identified and characterized. Among these molecules, some facilitate invasion and metastasis, e.g. laminin receptor, metalloproteinases, and CD44 (Hojilla, 2003, Br. J. Cancer 89, 1817-1821; Marhaba, 2004, J. Mol. Histol. 35, 211-231).
Despite use of a number of histochemical, genetic, and immunological markers, clinicians still have a difficult time predicting which tumors will progress and will finally metastasize to other organs, or whether a patient has already developed early metastasis. Some patients are in need of adjuvant therapy to prevent recurrence and metastasis and others are not. Distinguishing between these subpopulations of patients is not straightforward. There is therefore a need for new markers for distinguishing between tumors of differing metastatic potential and for new molecular targets and new therapeutic treatment options. In addition, such markers could be useful to monitor a potential anti-tumor response of a patient's body upon treatment with an anti-cancer drug.
Modern drug development typically involves the elucidation of the molecular mechanism underlying a disease or a condition, the identification of candidate target molecules and the evaluation of said target molecules. It is obvious that the identification of a candidate target molecule is essential to such process. With the sequencing of the human genome and publishing of respective sequence data, in principle, all of the coding nucleic acids of man are available. However, a serious limitation to this data is that typically no annotation of the function of said sequence is given. Furthermore, the mere knowledge of a coding nucleic acid sequence is not sufficient to predict the polypeptide's function in vivo.
In order to utilize such new markers, it is necessary to identify their molecular basis in terms of their gene nucleotide and protein sequences. To define the profile of genes whose expression is up-regulated during progression from a non-metastasizing to a metastatic phenotype, rat tumor progression models were initially used for the identification of the markers presented in this invention. Instead of starting directly from human tumor material, precisely defined clonal rodent tumor cell lines were analyzed in a first differential gene expression analysis. The use of such well-characterized tumor cell lines offers the advantage that they often exhibit a reproducible metastatic or nonmetastatic phenotype that can be retested at any stage of the analysis. Moreover, tumor cell lines are accessible to genetic manipulation and functional tests in experimental animals. Rat tumor cells have the advantage that they can be passaged in syngeneic animals, whereas human tumor cells have to be passaged in the rather artificial setting of an immunodeficient host. Furthermore, the cross-species homology between rodent and human sequences creates the opportunity for the subsequent isolation of human homologues of such candidate tumor progression genes, hereafter referred to as “cancer antigens”, and evaluation of their expression in primary human tumor material.
For the above mentioned intended molecular comparison of gene expression differences, two rat carcinoma models were used. The first model represents a rat pancreatic adenocarcinoma model which comprises several clones that differ in their metastatic potential in vivo and have been derived from a common primary tumor (Matzku, 1983, Cancer Research 49, 1294-1299). For example, BSp73-1AS cells form primary tumors that do not metastasize, whereas BSp73-ASML cells are highly metastatic and, after s.c. injection into host animals, disseminate via the lymphatic system to finally colonize the lungs. The second system, the rat mammary adenocarcinoma cell system 13762NF (Neri, 1981, Int. J. Cancer 28, 731-738), is composed of a number of cell lines derived from a parental mammary tumor and its corresponding spontaneous lung and lymph node metastases. For example, the cell line MTPa has been reported to be nonmetastatic in vivo in syngeneic animals, whereas the related MTLY cells are highly metastatic, giving rise to multiple metastases in the lymph nodes and lungs (Neri, 1981, Int. J. Cancer 28, 731-738). These systems guarantee a high reproducibility of the cellular metastatic potentials and provide a reproducible and easy access to cellular material. Thus, a high standard of quality and quantity of the critical starting material is warranted. The metastatic and the non-metastatic material is highly related, a relationship which cannot be reached using human primary or secondary tumors or human tumor derived cell lines as frequently employed in other studies.
In order to identify gene sequences (cancer antigens) in these systems which are more strongly expressed in cells displaying high metastatic potential than in related cells with a lower metastatic potential, transcripts of the non-metastatic cell line were subtracted from those of the metastatic cells using suppression subtractive hybridization (SSH) technology (Nestl, 2001, Cancer Research 61, 1569-1577). For this purpose, RNA was isolated from the metastatic cells (tester population) and the non-metastatic cells (driver population); cDNA was then generated and digested to obtain smaller, suitably sized pieces of DNA. Tester cDNA was divided into two portions, and each was ligated with a different adaptor. Each tester sample was then hybridized with an excess of driver cDNA. Only DNA fragments specifically present in the tester sample (derived from the metastatic cells) remained single-stranded. The primary hybridization samples were then mixed and hybridized again. At this stage, only the remaining equalized and subtracted single-stranded tester cDNAs are able to reassociate and form hybrids with two different adaptors. Those fragments with two different adaptor ends could then be amplified by PCR and transferred into suitable vector systems for further analysis. As a result, only the transcripts specifically expressed in metastatic cells are amplified, whereas the amplification of transcripts present in both populations is suppressed (Diatchenko, 1996, Proc. Natl. Acad. Sci. 93, 6025-6030).
Using this analysis, 981 differentially expressed cDNA clones from these rat systems were isolated, which after analysis using sequence blast and clustering analysis bioinformatics tools equated to 229 individual rat sequences. Of those, 189 could subsequently be transferred to human sequences utilizing human gene sequence data banks and advanced bioinformatics analysis. Of these 189 gene sequences, 144 represented human proteins of known function, and 45 coded for human proteins of unknown function or hypothetical proteins.
To further characterize these sequences with respect to their biological connection to tumor progression and metastasis formation, and to verify their suitability as cancer antigens or as metastasis markers, several additional analyses were applied. First, all sequences for which a connection to metastasis formation or tumor progression had already been reported were excluded. Second, the expression of the remaining gene sequences was analyzed in human tumor samples. Third, the functional involvement of these sequences in cellular metastatic processes was analyzed by (i) overexpression of the gene sequences and (ii) RNA interference studies in suitable test systems. This analytical process revealed 9 previously undescribed cancer antigens or metastasis markers which are useful as diagnostic tools or which may serve as new target structures for creating new therapeutic treatment options for cancer patients, and which are one subject of this invention.
This invention relates to these sequences and their role in the cellular process of acquiring increased metastatic potential, since their expression is found to increase in parallel with that potential. Thus, these gene sequences and the proteins encoded thereby may, alone or in combination of two or more, contribute to the establishment of, or the progression to, a more metastatic phenotype. In this respect, the pro-metastatic activities of a given sequence or of the polypeptide it encodes may be enhanced when combined with the pro-metastatic activities of another sequence or the polypeptide encoded thereby. The acquisition of pro-metastatic activities through enhanced expression of such individual sequences and polypeptides must therefore be regarded as part of a process in which a cell acquires an increasingly metastatic phenotype step by step, each single step being defined by the acquisition of upregulated expression of one of these sequences. This implies that these sequences are functionally linked, each adding one step to the process of acquiring cellular metastatic potential; they should therefore be regarded as all being part of the same process, and hence of the same underlying invention presented herein.
A first aspect of the present invention is a method for diagnosing a disease or condition, or a susceptibility to a disease or condition, comprising the step of determining the expression, activity or mutations of at least one polynucleotide or expression product thereof in a biological sample from a (first) subject, wherein said at least one polynucleotide comprises
The subject from which the biological sample was obtained may be a patient having the disease or condition, or an individual not affected by the disease or condition. In the latter case, the subject may be an individual suspected of having the disease or condition. Usually, the subject is a human.
The biological sample may be derived from or contain a body liquid obtained from said subject, for example blood or cerebrospinal fluid. In a preferred embodiment, the biological sample contains tissue material obtained through biopsy. The tissue may be a tissue affected by the disease or condition, e.g. a solid tumor. A tissue affected by the disease or condition is a tissue which differs from the corresponding tissue from a healthy individual. The difference may be a difference in morphology, histology, gene expression, response to treatment, protein composition etc.
Usually, the sample has been processed to be in a condition suitable for the method of determining the expression, activity or mutations as detailed infra. The processing may include dilution, concentration, homogenization, extraction, precipitation, fixation, washing and/or permeabilization, etc. The processing may also include reverse transcription and/or amplification of nucleic acids present in the sample.
The method of the invention may comprise only steps which are carried out in vitro. In that case, the step of obtaining the tissue material from the subject's body is not encompassed by the present invention. In another embodiment, the method further comprises the step of obtaining the biological sample from the subject's body.
The method comprises the step of determining the expression, activity or mutations of at least one polynucleotide or expression product thereof in a biological sample. The phrase “determining the expression” as used herein preferably means “determining the expression level”. The expression or expression level correlates with the amount of polynucleotide or expression product thereof in the sample. The phrase “determining the expression of polynucleotide or expression product in the biological sample” includes or consists of determining the presence and/or amount of said at least one polynucleotide or expression product thereof. As used herein, the phrase “determining the mutations” means determining the presence or absence of one or more mutations in the nucleotide sequence of said at least one polynucleotide in said biological sample. It is preferred that mutations with respect to any one of the sequences SEQ ID NO:1 through 9 are determined.
The term “polynucleotide(s)” generally refers to any polyribonucleotide or polydeoxyribonucleotide that may be RNA or DNA. The polynucleotide may be single- or double-stranded. The polynucleotide in accordance with the diagnostic method of this invention may have a sequence as shown in any one of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8 and SEQ ID NO:9. In addition, the polynucleotide may have a sequence which is a variant of these sequences. The variant may be a sequence having one or more additions, substitutions, and/or deletions of one or more nucleotides, such as an allelic variant or a single nucleotide polymorphism of the above sequences. The variant may have an identity of at least 80%, preferably of at least 85%, more preferably of at least 90%, even more preferably of at least 95%, most preferably of at least 99% to any one of the sequences SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8 and SEQ ID NO:9. The percent identity or conservation may be determined by the algorithm of Wilbur and Lipman, Proc. Natl. Acad. Sci. USA 80; 726-730 (1983), which is embodied in the MegAlign program (DNA Star), using a k-tuple of 3 and a gap penalty of 3. Alternatively, the algorithm of Myers and Miller, CABIOS (1989), which is embodied in the ALIGN program (version 2.0) or its equivalent, may be used with a gap length penalty of 12 and a gap penalty of 3 where such parameters are required. All other parameters are set to their default values. Access to ALIGN is readily available (see, e.g., http://www2.igh.cnrs.fr/bin/align-guess.cgi on the Internet).
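To make the notion of percent identity concrete, the following is a minimal sketch in Python. It deliberately uses a plain Needleman-Wunsch global alignment with illustrative scoring values; it is not the Wilbur and Lipman algorithm, not the Myers and Miller algorithm, and does not reproduce the MegAlign or ALIGN parameters named above. The function names, the scoring values and the choice of the alignment length as denominator are assumptions made for illustration.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment; returns the two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of the alignment length."""
    aligned_a, aligned_b = global_align(a, b)
    if not aligned_a:
        return 0.0
    matches = sum(x == y for x, y in zip(aligned_a, aligned_b) if x != "-")
    return 100.0 * matches / len(aligned_a)


def meets_identity(a, b, threshold=80.0):
    """True if the identity reaches the given threshold (e.g. 80, 85, 90, 95, 99)."""
    return percent_identity(a, b) >= threshold
```

Different programs define the denominator differently (alignment length, shorter sequence, or reference length), so values from this sketch need not agree with those reported by MegAlign or ALIGN.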
The variant may be a polynucleotide which hybridizes to any one of the sequences SEQ ID NO:1 through 9, preferably under stringent conditions. A specific example of stringent hybridization conditions is incubation at 42° C. for 16 hours in a solution comprising: 50% formamide, 5×SSC (1×SSC = 150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5× Denhardt's solution, 10% dextran sulfate, and 20 μg/ml of denatured, sheared salmon sperm DNA, followed by washing the hybridization support in 0.1×SSC at about 65° C. Hybridization and wash conditions are well known and exemplified in Sambrook, et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, N.Y., (1989), particularly Chapter 11 therein. Alternative hybridization conditions are described infra with respect to solid supports.
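As a worked illustration of the concentrations involved, the sketch below computes the salt concentrations of an n-fold SSC solution, taking 1×SSC as 150 mM NaCl and 15 mM trisodium citrate, and the stock volumes needed for a given final volume using C1·V1 = C2·V2. The stock concentrations, the function names and the decision to add dextran sulfate as a solid (10% taken as weight per volume, with its volume contribution ignored) are assumptions for illustration and are not specified in this document.

```python
def ssc_concentrations(fold):
    """Salt concentrations (mM) of an n-fold SSC solution, 1x = 150 / 15 mM."""
    return {"NaCl_mM": 150.0 * fold, "trisodium_citrate_mM": 15.0 * fold}


def stock_volume_ml(final_conc, stock_conc, final_volume_ml):
    """Volume of stock needed, from C1 * V1 = C2 * V2 (same units for C1, C2)."""
    return final_conc * final_volume_ml / stock_conc


# (final concentration, ASSUMED stock concentration) in matching units.
ASSUMED_STOCKS = {
    "formamide (%)": (50.0, 100.0),
    "SSC (x)": (5.0, 20.0),
    "sodium phosphate pH 7.6 (mM)": (50.0, 1000.0),
    "Denhardt's solution (x)": (5.0, 50.0),
    "salmon sperm DNA (ug/ml)": (20.0, 10000.0),
}


def hybridization_recipe(final_volume_ml):
    """Liquid volumes (ml) plus grams of solid dextran sulfate (assumed w/v)."""
    volumes = {
        name: stock_volume_ml(final, stock, final_volume_ml)
        for name, (final, stock) in ASSUMED_STOCKS.items()
    }
    water = final_volume_ml - sum(volumes.values())
    if water < 0:
        raise ValueError("assumed stocks are too dilute for this recipe")
    volumes["water"] = water
    dextran_sulfate_g = 0.10 * final_volume_ml  # 10% w/v, an assumption
    return volumes, dextran_sulfate_g


volumes, dextran_g = hybridization_recipe(10.0)
```

With these assumed stocks, 10 ml of solution takes 5 ml formamide, 2.5 ml of 20×SSC, 0.5 ml phosphate stock, 1 ml Denhardt's stock and 0.02 ml DNA stock, the remainder being water.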
In the variant, 1 to 20, preferably 1 to 10, more preferably 1 to 5, most preferably 1, 2 or 3 nucleotides may be added, substituted or inserted with respect to any one of the sequences as shown in SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8 and SEQ ID NO:9. The variants further include fragments of SEQ ID NO:1 through 9. The fragments may comprise at least 100, preferably at least 500, more preferably at least 1000 contiguous nucleotides of any one of SEQ ID NO:1 through 9. Most preferably, the fragment has a length such that fewer than 100, or fewer than 50, or fewer than 25 nucleotides are missing with respect to any one of SEQ ID NO:1 through 9.
Alternatively, the polynucleotide may have the corresponding RNA sequence. The sequence of the polynucleotide may also be complementary to any one of the above sequences.
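The relationship between a DNA sequence, its corresponding RNA sequence and its complementary strand can be sketched as follows. The helper names are hypothetical, the example input is an arbitrary short string rather than any of SEQ ID NO:1 through 9, and the complementary strand is returned in the conventional 5'-to-3' orientation (the reverse complement), which is an assumption about how "complementary" would be represented.

```python
# Translation table for the four standard bases, upper and lower case.
_DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Complementary strand, written 5' to 3'."""
    return dna.translate(_DNA_COMPLEMENT)[::-1]


def to_rna(dna):
    """Corresponding RNA sequence: thymine is replaced by uracil."""
    return dna.replace("T", "U").replace("t", "u")


example = "ATGC"                            # arbitrary illustration
example_rna = to_rna(example)               # "AUGC"
example_rc = reverse_complement(example)    # "GCAT"
```

Ambiguity codes such as N are left unchanged by this sketch; handling them would require extending the translation table.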
Preferably, the polynucleotide in accordance with the diagnostic method of this invention comprises a sequence encoding a polypeptide having a sequence selected from the group consisting of SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17 and SEQ ID NO:18.
The expression product of said polynucleotide usually is a polypeptide encoded by any one of the above polynucleotides. Preferably, the polypeptide comprises a sequence selected from the group consisting of SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17 and SEQ ID NO:18. The polypeptide may be a variant of any one of SEQ ID NO:10-18. For example, the amino acid sequence of the polypeptide may have an identity of at least 80%, preferably of at least 85%, more preferably of at least 90%, even more preferably of at least 95%, most preferably of at least 98% to any one of the sequences SEQ ID NO:10-18. The identity is to be understood as identity over the entire length of the polypeptide. The percent identity or conservation may be determined by the algorithm of Wilbur and Lipman, Proc. Natl. Acad. Sci. USA 80; 726-730 (1983), which is embodied in the MegAlign program (DNA Star), using a k-tuple of 3 and a gap penalty of 3. Alternatively, the algorithm of Myers and Miller, CABIOS (1989), which is embodied in the ALIGN program (version 2.0) or its equivalent, may be used with a gap length penalty of 12 and a gap penalty of 3 where such parameters are required. All other parameters are set to their default values. Access to ALIGN is readily available (see, e.g., http://www2.igh.cnrs.fr/bin/align-guess.cgi on the Internet).
In the variant, 1 to 10, preferably 1 to 5, more preferably 1 to 4, most preferably 1, 2 or 3 amino acids may be added, substituted or inserted with respect to any one of the sequences as shown in SEQ ID NO:10 through 18. The variants further include fragments of SEQ ID NO:10 through 18. The fragments may comprise at least 50, preferably at least 100, more preferably at least 500 contiguous amino acids of any one of SEQ ID NO:10 through 18. Most preferably, the fragment has a length such that fewer than 50, or fewer than 30, or fewer than 15 amino acids are missing with respect to any one of SEQ ID NO:10 through 18.
In some embodiments, the variant polynucleotides and/or the polypeptides they encode retain at least one activity or function of the unmodified polynucleotide and/or the polypeptide, such as hybridization, antibody binding, etc.
In one embodiment, the method comprises the use of nucleic acid hybridization technology for determining the amount or presence of the polynucleotide in the sample, or for determining the mutations in the polynucleotide. Hybridization methods for nucleic acids are well known to those of ordinary skill in the art (see, e.g. Molecular Cloning: A Laboratory Manual, J. Sambrook, et al., eds., Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989, or Current Protocols in Molecular Biology, F. M. Ausubel, et al., eds., John Wiley & Sons, Inc., New York).
According to the invention, standard hybridization techniques of microarray technology may be utilized to assess polynucleotide expression. Microarray technology, which is also known as DNA chip technology, gene chip technology, and solid-phase nucleic acid array technology, is well known to the skilled person and is based on, but not limited to, obtaining an array of identified nucleic acid probes on a fixed support, labeling target molecules with reporter molecules (e.g., radioactive, chemiluminescent, or fluorescent tags), hybridizing target nucleic acids to the probes, and evaluating target-probe hybridization. A probe with a nucleic acid sequence that perfectly matches the target sequence will, in general, result in detection of a stronger reporter-molecule signal than will probes with less perfect matches. Many components and techniques utilized in nucleic acid microarray technology are presented in “The Chipping Forecast”, Nature Genetics, Vol. 21, January 1999.
According to the present invention, microarray supports may include but are not limited to glass, silica, aluminosilicates, borosilicates, plastics, metal oxides, nitrocellulose, or nylon. The use of a glass support is preferred. According to the invention, probes are selected from the group of polynucleotides including, but not limited to: DNA, genomic DNA, cDNA, and oligonucleotides; and may be natural or synthetic. Oligonucleotide probes preferably are 20 to 25-mer oligonucleotides and DNA/cDNA probes preferably are 500 to 5000 bases in length, although other lengths may be used. Appropriate probe length may be determined by the skilled person by known procedures. Probes may be purified to remove contaminants using standard methods known to those of ordinary skill in the art such as gel filtration or precipitation. Accordingly, the polynucleotide immobilized to the solid support is preferably an isolated polynucleotide. The term “isolated” polynucleotide refers to a polynucleotide that is substantially free from other nucleic acid sequences, such as and not limited to other chromosomal and extrachromosomal DNA and RNA. Isolated polynucleotides may be purified from a host cell. Conventional nucleic acid purification methods known to skilled artisans may be used to obtain isolated polynucleotides. The term also includes recombinant polynucleotides and chemically synthesized polynucleotides.
In one embodiment, probes are synthesized directly on the support in a predetermined grid pattern using methods such as light-directed chemical synthesis, photochemical deprotection, or delivery of nucleotide precursors to the support and subsequent probe production. In embodiments of the invention one or more control polynucleotides are attached to the support. Control polynucleotides may include but are not limited to cDNA of genes such as housekeeping genes or fragments thereof.
The solid support comprises at least one polynucleotide immobilized on or attached to its surface, wherein said polynucleotide hybridizes with a polynucleotide as described supra, preferably under stringent conditions. Suitable hybridization conditions are for example described in the manufacturer's instructions of “DIG Easy Hyb Granules” (Roche Diagnostics GmbH, Germany, Cat. No. 1796895). These instructions are incorporated herein by reference. The hybridization conditions described in the following protocol may be used:
Preferably, the polynucleotide immobilized on the solid support has a sequence as shown in any one of SEQ ID NO:1 through 9; or a complement thereof; or a fragment thereof.
In one embodiment, preferred probes are sets of two or more of the nucleic acid molecules as defined. In a specific embodiment, at least 9 different isolated polynucleotides are immobilized on said solid support, and said 9 different isolated polynucleotides have the nucleotide sequences as shown in SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8 and SEQ ID NO:9, respectively, or the corresponding complementary sequences, or fragments thereof.
In another embodiment, at least 20 or at least 50 or at least 75 different isolated polynucleotides selected from the polynucleotides listed in
In another embodiment, the method comprises utilizing an antibody directed against a polypeptide described hereinabove. Preferably, the polypeptide is selected from the group consisting of SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17 and SEQ ID NO:18. The antibody may be polyclonal or monoclonal, with monoclonal antibodies being preferred. The antibody is preferably immunospecific for any one of the above polypeptides. The antibodies can be used to detect the polypeptide by any standard immunoassay technique including ELISA, immunoblotting (Western blotting), immunoprecipitation, BIACORE technology and the like, as will be appreciated by one of ordinary skill in the art.
The method of the invention usually further comprises the step of comparing said expression or activity determined as described supra and the expression or activity of said polynucleotide or expression product thereof in a second sample which was obtained from tissue which is not affected by said disease. For example, an increased expression or activity in said first sample compared to the expression or activity in said second sample may be diagnostic of the disease. The second sample may be derived from a second subject which is not affected by the disease. Alternatively, the second sample may be derived from the first subject, but from a different tissue than the first sample.
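The comparison between the first and the second sample can be sketched as a simple ratio. The function names and the two-fold cut-off below are assumptions for illustration only; this document does not specify a numerical threshold, and the input values are taken to be expression or activity levels measured on the same scale in both samples.

```python
def fold_change(first_sample_level, second_sample_level):
    """Ratio of the level in the first sample to that in the second sample."""
    if second_sample_level <= 0:
        raise ValueError("reference level must be positive")
    return first_sample_level / second_sample_level


def is_increased(first_sample_level, second_sample_level, min_fold=2.0):
    """True if the first sample exceeds the second by at least min_fold.

    min_fold=2.0 is an arbitrary illustrative cut-off, not a value
    taken from this document.
    """
    return fold_change(first_sample_level, second_sample_level) >= min_fold
```

For example, `is_increased(8.0, 2.0)` evaluates to True because the ratio is 4.0, whereas `is_increased(3.0, 2.0)` evaluates to False under the assumed cut-off.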
The disease may be a tumor disease or cancer. Preferably, the disease is any one of the following diseases and conditions: estrogen receptor-dependent breast cancer, estrogen receptor-independent breast cancer, hormone receptor-dependent prostate cancer, hormone receptor-independent prostate cancer, brain cancer, renal cancer, colon cancer, colorectal cancer, pancreatic cancer, bladder cancer, esophageal cancer, stomach cancer, genitourinary cancer, gastrointestinal cancer, uterine cancer, ovarian cancer, astrocytomas, gliomas, skin cancer, squamous cell carcinoma, keratoacanthoma, Bowen's disease, cutaneous T-cell lymphoma, melanoma, basal cell carcinoma, actinic keratosis, sarcomas, Kaposi's sarcoma, osteosarcoma, head and neck cancer, small cell lung carcinoma, non-small cell lung carcinoma, leukemias, lymphomas, or other blood cell cancers, ichthyosis, acne, acne vulgaris, thyroid resistance syndrome, diabetes, thalassemia, cirrhosis, protozoal infection, rheumatoid arthritis, rheumatoid spondylitis, all forms of rheumatism, osteoarthritis, gouty arthritis, multiple sclerosis, insulin dependent diabetes mellitus, non-insulin dependent diabetes, asthma, rhinitis, uveitis, lupus erythematosus, ulcerative colitis, Crohn's disease, inflammatory bowel disease, chronic diarrhea, psoriasis, atopic dermatitis, bone disease, fibroproliferative disorders, atherosclerosis, aplastic anemia, DiGeorge syndrome, Graves' disease, epilepsy, status epilepticus, Alzheimer's disease, depression, schizophrenia, schizoaffective disorder, mania, stroke, mood-incongruent psychotic symptoms, bipolar disorder, affective disorders, meningitis, muscular dystrophy, agitation, cardiac hypertrophy, heart failure, reperfusion injury and obesity.
Most preferably, the disease is minimal residual disease or tumor metastasis.
The genes identified herein permit, inter alia, rapid screening of biological samples by nucleic acid microarray hybridization or protein expression technology to determine the expression of the specific genes and thereby to predict the outcome of the disease. Such screening is beneficial, for example, in selecting the course of treatment to provide to the patient, and to monitor the efficacy of a treatment.
Another aspect of this invention is a method for identifying compounds which modulate the expression or activity of any of the polynucleotides or expression products thereof as defined in any one of claims 1 to 3, comprising
The step of determining the effect may comprise comparing said expression, activity, cellular localization or structural condition of said polynucleotide or polypeptide with the expression, activity, cellular localization or structural condition of said polynucleotide or polypeptide in cells which were not contacted with the candidate compound. The method may further comprise comparing the viability of the cells which were contacted with the candidate compound and the viability of cells which were not contacted with the candidate compound.
The candidate compound may be selected if the expression of said polynucleotide or polypeptide in the cells which were contacted with the candidate compound is lower than in the cells which were not contacted with the candidate compound. In such case, the compound is capable of suppressing the expression of the polynucleotide or expression product thereof. One may further compare the viability of the cells which were contacted with the candidate compound and the viability of cells which were not contacted with the candidate compound.
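The selection criterion described above can be sketched as a filter over paired measurements. The record layout, the field names and the compound labels are hypothetical; the only rule taken from the text is that a candidate is selected when expression in contacted cells is lower than in cells that were not contacted. The viability ratio is reported alongside as a further comparison and is not used as a selection criterion here.

```python
from dataclasses import dataclass


@dataclass
class ScreenResult:
    """One candidate compound with paired treated / untreated readings."""
    compound: str
    expression_treated: float
    expression_untreated: float
    viability_treated: float
    viability_untreated: float


def select_suppressors(results):
    """Return (compound, viability ratio) for compounds lowering expression."""
    selected = []
    for r in results:
        if r.expression_treated < r.expression_untreated:
            ratio = r.viability_treated / r.viability_untreated
            selected.append((r.compound, ratio))
    return selected


# Hypothetical readings, for illustration only.
screen = [
    ScreenResult("cmpd_1", 0.4, 1.0, 0.9, 1.0),
    ScreenResult("cmpd_2", 1.2, 1.0, 1.0, 1.0),
]
hits = select_suppressors(screen)
```

In this illustration only `cmpd_1` is returned, since `cmpd_2` does not lower expression relative to the untreated cells.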
The invention further concerns a compound identified by the above-described method, wherein said compound is a compound which antagonizes or agonizes any one of the polynucleotides or expression products thereof as defined in this application. Such compounds include but are not limited to antisense nucleic acid molecules capable of suppressing the expression of any one of the polynucleotides or expression products thereof as defined herein.
Yet another aspect of the invention is a solid support on which at least one isolated polynucleotide is immobilized, wherein said isolated polynucleotide has
The solid support preferably has the form of a microarray or DNA chip. Other preferred embodiments of the solid support have been described hereinabove in connection with the diagnostic methods of the invention.
Yet another aspect of the invention is the use of a polynucleotide or polypeptide as defined herein for the diagnostic method, or of a compound identified by the screening method described above, in the manufacture of a medicament for the treatment or prevention of a disease associated with increased activity or expression of a polynucleotide or polypeptide as defined herein.
Diagnostic tools based on the newly identified cancer antigens are another subject of this invention, and include test systems to analyze expression of these sequences in tumors to predict the tumor's potential to progress and to develop metastasis. In addition, these tools can be used to examine a patient's body for the presence of micrometastases or minimal residual disease which may lead to improved decisions on further treatment modalities. In this respect, a test system applied could consist of cDNAs, comprising, e.g., the cancer antigen sequences, which are contained on a carrier system, such as being spotted on, e.g., glass slides (gene or cDNA chip) which subsequently would be analysed utilizing fluorescence labelled RNA samples—derived from patients—that are hybridized to these chips to investigate the expression patterns of several metastasis markers—including the cancer antigens—at the same time.
Therefore, the present invention relates to methods for the diagnosis or screening of a subject in need, e.g., a patient suffering from a disease, e.g., but not limited to cancer, which correlates with the expression of at least one of the cancer antigens of this invention, to test whether the subject displays an enhanced activity or expression of a polynucleotide or polypeptide. Such investigations could, e.g., give information about the metastatic potential of a patient's tumor cells, or about whether a patient's body harbors minimal residual tumor disease. These investigations may comprise nucleic acid technologies, such as hybridisation methods using hybridisation samples derived from a patient's normal or diseased tissues. Such processes may also be useful for drawing prognostic conclusions about a patient's disease, or about a patient's response to a therapeutic treatment, by monitoring the clinical effectiveness of the treatment and its correlation with the expression or activity of a cancer antigen (polynucleotide or polypeptide) of this invention.
Furthermore, since the genes or gene products coding for the cancer antigens of this invention could be causally involved in the progression of tumor diseases, these gene sequences or gene products encoded by those, may represent new target structures for the development of new drugs, including but not limited to anti-cancer drugs, and the subsequent therapeutic treatment of patients with these drugs.
Therefore, this invention also comprises methods for the treatment of a subject having the need to inhibit the activity or expression of a polynucleotide or polypeptide presented herein. Such treatment could comprise one or more of the following steps targeting the expression or function of a polynucleotide or polypeptide:
This invention also comprises methods for the expression, production and/or functional analysis of specific polynucleotides and polypeptides. For this purpose, a polynucleotide covered by this invention should be defined as comprising a nucleotide sequence that has at least 80% identity over its entire length to any of the polynucleotide sequences described herein. More preferably, the identity is larger than 90%, and even more preferably, this identity is larger than 95%. A polypeptide covered by this invention should be defined as comprising an amino acid sequence that has at least 80% identity over its entire length to a polypeptide sequence described herein. More preferably, this identity is larger than 90%, and even more preferably, the identity is larger than 95%.
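As an illustration only, the identity thresholds stated above (at least 80%, larger than 90%, larger than 95%) can be sketched in Python. The text does not specify how identity is computed, so the sketch assumes two sequences that have already been aligned to equal length (gaps written as "-") and counts identical, non-gap positions over the full alignment length. The function names percent_identity and identity_tier are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the entire length of a pairwise alignment.

    Assumption: both inputs are already aligned to equal length, with
    gaps written as '-'. A gap position never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment must not be empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def identity_tier(identity: float) -> str:
    """Map a percent identity onto the three tiers named in the text."""
    if identity > 95.0:
        return "even more preferred (>95%)"
    if identity > 90.0:
        return "more preferred (>90%)"
    if identity >= 80.0:
        return "covered (>=80%)"
    return "not covered (<80%)"


if __name__ == "__main__":
    ident = percent_identity("ACGTACGTAC", "ACGTACGTAA")
    print(ident, identity_tier(ident))
```

A dedicated alignment tool would normally produce the aligned input; the sketch only shows how the stated cut-offs partition the result.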
The methods therefore included in this invention cover the use of a DNA or RNA molecule comprising an expression system, wherein said expression system is capable of producing a polynucleotide or polypeptide encoded therefrom when said expression system is present in a compatible host cell. This host cell may be a eukaryotic or bacterial host cell, and it may be used for a process for producing a polynucleotide or polypeptide by transforming or transfecting it with an expression system such that the host cell, under appropriate culture conditions, produces the encoded polynucleotide or polypeptide.
This invention also covers methods for the identification and development of compounds, agonist or antagonists, which are capable of interfering with the expression or function of a polynucleotide or polypeptide described herein. Such methods may include the following steps:
Also the following steps may be used for the identification of such compounds:
The diagnostic and therapeutic methods of this invention may be useful for diseases selected from the group of estrogen receptor-dependent breast cancer, estrogen receptor-independent breast cancer, hormone receptor-dependent prostate cancer, hormone receptor-independent prostate cancer, brain cancer, renal cancer, colon cancer, colorectal cancer, pancreatic cancer, bladder cancer, esophageal cancer, stomach cancer, genitourinary cancer, gastrointestinal cancer, uterine cancer, ovarian cancer, astrocytomas, gliomas, skin cancer, squamous cell carcinoma, keratoacanthoma, Bowen disease, cutaneous T-cell lymphoma, melanoma, basal cell carcinoma, actinic keratosis, sarcomas, Kaposi's sarcoma, osteosarcoma, head and neck cancer, small cell lung carcinoma, non-small cell lung carcinoma, leukemias, lymphomas, or other blood cell cancers, ichthyosis, acne, acne vulgaris, thyroid resistance syndrome, diabetes, thalassemia, cirrhosis, protozoal infection, rheumatoid arthritis, rheumatoid spondylitis, all forms of rheumatism, osteoarthritis, gouty arthritis, multiple sclerosis, insulin dependent diabetes mellitus, non-insulin dependent diabetes, asthma, rhinitis, uveitis, lupus erythematosus, ulcerative colitis, Morbus Crohn, inflammatory bowel disease, chronic diarrhea, psoriasis, atopic dermatitis, bone disease, fibroproliferative disorders, atherosclerosis, aplastic anemia, DiGeorge syndrome, Graves' disease, epilepsy, status epilepticus, Alzheimer's disease, depression, schizophrenia, schizoaffective disorder, mania, stroke, mood-incongruent psychotic symptoms, bipolar disorder, affective disorders, meningitis, muscular dystrophy, multiple sclerosis, agitation, cardiac hypertrophy, heart failure, reperfusion injury and/or obesity.
The following examples further describe the invention:
SEQ ID NO:1 (A8)
One rat cDNA clone, originally derived from the above described SSH analysis of the mammary tumor test system, was used to establish the corresponding EST (Expressed Sequence Tag) cluster from rat EST databases. The nucleotide sequence identity within the cluster was over 96%. The consensus sequence of this cluster was used to run a blast (Basic Local Alignment Search Tool, http://www.ncbi.nlm.nih.gov/BLAST/) analysis against mouse gene sequence databases. A sequence identity of 89% was found with the mouse mRNA BC005755, which again showed an 89% identity on the nucleotide sequence level to the mRNAs of the human MEP50 gene sequence. The corresponding NCBI (National Center for Biotechnology Information) reference sequence (http://www.ncbi.nlm.nih.gov/RefSeq/) for this locus, NM_024102, has a length of 2428 nucleotides and codes for a protein of 342 amino acids. The gene MEP50 maps on chromosome 1.
MEP50 contains a G-protein beta WD-40 repeat according to a search with the database Pfam (Protein family alignment multiple). Pfam is a large collection of protein multiple sequence alignments and profile hidden Markov models (Bateman, 2000, Nucleic Acids Res. 30, 276-280).
MEP50 also contains a Glycosyl hydrolases family 18 motif. MEP50 was shown to be part of the Methylosome (Friesen, 2002, J. Biol. Chem. 277, 8243-8247) that is involved in the assembly of snRNP. Interestingly MEP50 was also shown to interact with the phosphatase FCP1, the only Pol II Phosphatase isolated so far (Licciardo, 2003, Nucleic Acids Res. 31, 999-1005).
In
This sequence was shown to be differentially expressed in analysis of “In situ hybridization” (ISH) of matched human tumors (BioCat BA3, http://www.biocat.de), namely in cancers of the colon, stomach and breast, as exemplified in
Tumor specific expression was further analyzed by hybridization experiments with Cancer Profiling Arrays (CA) from Clontech (http://www.bdbiosciences.com). The Cancer Profiling Arrays include normalized amplified cDNA from 241 tumor and corresponding normal tissues from individual patients, along with negative and positive controls, and cDNA from nine cancer cell lines. Here, overexpression was defined as upregulation of expression in the tumor probe versus expression in the normal probe of at least 1.5 fold. Percentage of upregulation in the tissues analysed is shown in
In
In order to functionally examine whether MEP50 could be causally involved in the process of tumor progression, MEP50 was transiently overexpressed or transiently downregulated by RNA interference in HEK-293T cells and subsequently potential resulting influences on tumor cell properties were assayed. Experiments shown in
With respect to these functional analyses, the following tests were performed in detail:
A: For these tests, MEP50 and Ras cDNAs were cloned into the mammalian expression vector pCDNA3.1 (Invitrogen). HEK-293T cells were then transfected with expression vectors for the indicated proteins using Lipofectamine (Invitrogen) according to the manufacturer's instructions. 16 h after transfection, cells were seeded at 10,000 cells per well in triplicate in 96-well plates. From this time point on, viable cells were determined every 24 h using the CellTiter Kit (Promega). The graphs represent the mean values of relative growth rates of three independent experiments. Note the increased growth rate upon expression of the Ras or MEP50 gene sequences.
B: Western Blot analysis testing the expression of the expressed proteins Ras and MEP50. For this purpose cells were lysed 24 h after transfection and lysates were subjected to gel electrophoresis and subsequent Western blotting with an anti-HA-antibody (12-CA-5). Note the clear expression of the proteins upon transfection of the expression constructs.
A: Analysis of the efficiency of the interference with the target protein expression, here tested on the protein level. HEK-293T cells were transiently transfected with an expression vector for MEP50 and the indicated siRNAs. 48 h after transfection, cells were lysed and lysates were subjected to gel electrophoresis and subsequent Western blotting with an anti-HA-antibody (12-CA-5). Note that the expression of the target protein MEP50 could be strongly inhibited by using the siRNA targeting the MEP50 gene transcripts.
B: HEK-293T cells were transfected with the indicated siRNAs using Lipofectamine (Invitrogen) according to the manufacturer's instructions. 16 h after transfection, cells were seeded at 10,000 cells per well in triplicate in 96-well plates. From this time point on, viable cells were determined every 24 h using the non-radioactive cell proliferation assay "CellTiter 96" (Promega). The CellTiter 96 Assay is a colorimetric method for determining the number of viable cells. It is composed of solutions of a novel tetrazolium compound [3-(4,5-dimethylthiazol-2-yl)-5-(3-carboxymethoxyphenyl)-2-(4-sulfophenyl)-2H]-tetrazolium, inner salt; MTS. MTS is bioreduced by cells into a formazan product that is soluble in tissue culture medium. The conversion of MTS into the aqueous soluble formazan product is accomplished by dehydrogenase enzymes found in metabolically active cells. The quantity of formazan product as measured by the amount of 490 nm absorbance is directly proportional to the number of living cells in culture. The graphs represent mean values for absorbance at 490 nm of three independent experiments. Note the inhibition of proliferation upon down-regulation of MEP50 expression using MEP50 specific siRNA molecules.
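The readout described above (absorbance at 490 nm taken every 24 h, averaged over three independent experiments) can be sketched as follows. This is a minimal illustration, not the evaluation procedure actually used: the function names and the example values are hypothetical, and normalizing to the first time point is an assumption about how a relative growth rate might be derived.

```python
from statistics import mean
from typing import List, Sequence


def mean_absorbance(experiments: Sequence[Sequence[float]]) -> List[float]:
    """Average A490 readings per time point across independent experiments.

    experiments[i][t] is the absorbance of experiment i at time point t
    (e.g. one reading every 24 h). All series must have the same length.
    """
    if not experiments:
        raise ValueError("at least one experiment is required")
    n_points = len(experiments[0])
    if any(len(series) != n_points for series in experiments):
        raise ValueError("all experiments need the same number of time points")
    return [mean(series[t] for series in experiments) for t in range(n_points)]


def relative_growth(series: Sequence[float]) -> List[float]:
    """Express each time point relative to the first one (assumed baseline)."""
    baseline = series[0]
    if baseline <= 0:
        raise ValueError("baseline absorbance must be positive")
    return [value / baseline for value in series]


if __name__ == "__main__":
    # Hypothetical A490 values: three experiments, readings at 0 h and 24 h.
    runs = [[0.25, 0.50], [0.25, 0.75], [0.25, 0.25]]
    averaged = mean_absorbance(runs)
    print(averaged, relative_growth(averaged))
```

Because absorbance is stated to be directly proportional to the number of living cells, the ratio to the first reading can be read as a relative cell number under that proportionality.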
HT29 colon cancer cells and T47D breast cancer cells were stably transfected with either control vector pCDNA3.1 or a corresponding expression vector derived thereof for MEP50. Stable mass cultures were selected using Neomycin. Cells were seeded with 10,000 cells per well in triplicates in 96 well plates. From this time point on viable cells were determined every 24 h using the CellTiter Kit (Promega). The graphs represent mean values for absorbance at 490 nm of three independent experiments. Note that the growth rate of both cell types is increased upon expression of MEP50.
In summary, MEP50 shows upregulation in metastasizing tumor cells versus non metastasizing tumor cells, and also displays upregulated expression in various tumor tissues versus normal tissue samples. Moreover, MEP50 is functionally involved in processes involved in tumor progression like increased proliferation and invasion. Therefore, this sequence may particularly be useful for staging of human tumor diseases, as well as for decisions on prognosis and treatment modalities. Furthermore, the MEP50 gene and its gene products may be used as target structures to develop therapeutic anti-cancer drugs.
The combined data established for SEQ ID NO:1 together with the data for SEQ ID NO:2-9 and selected additional sequences are presented in summary in
Names and/or accession numbers (Acc. No.) of differentially expressed sequences are given. According to data derived from Microarray Analysis (gene expression analysis), in total 89 sequences were found to be differentially expressed in at least one pair of metastasizing versus non metastasizing cells (indicated as a “+” mark in the column Microarray). These Microarray Analysis experiments were performed as described in
Several sequences were also analyzed in Cancer Profiling Arrays (CA): Here, overexpression of a given gene (indicated as a "+" mark in the column CA) was defined as upregulation of expression in the tumor probe versus the normal probe in at least 50% of analyzed pairs which were derived from at least 3 of 8 different tissues analyzed.
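The scoring rule above leaves some room for interpretation. The following Python sketch assumes one possible reading: a gene receives a "+" if, in at least 3 tissues, at least 50% of the matched tumor/normal pairs show upregulation, with upregulation taken as a tumor-to-normal ratio of at least 1.5 (the fold threshold given for the Cancer Profiling Array analysis of SEQ ID NO:1). The function names, data layout and signal values are hypothetical.

```python
from typing import Dict, List, Tuple

Pair = Tuple[float, float]  # (tumor signal, normal signal) from one patient


def fraction_upregulated(pairs: List[Pair], fold_threshold: float = 1.5) -> float:
    """Fraction of matched pairs whose tumor/normal ratio meets the threshold."""
    if not pairs:
        raise ValueError("at least one tumor/normal pair is required")
    up = sum(
        1
        for tumor, normal in pairs
        if normal > 0 and tumor / normal >= fold_threshold
    )
    return up / len(pairs)


def ca_positive(
    pairs_by_tissue: Dict[str, List[Pair]],
    fold_threshold: float = 1.5,
    min_fraction: float = 0.5,
    min_tissues: int = 3,
) -> bool:
    """Assumed reading of the rule: enough tissues with enough upregulated pairs."""
    qualifying = [
        tissue
        for tissue, pairs in pairs_by_tissue.items()
        if pairs and fraction_upregulated(pairs, fold_threshold) >= min_fraction
    ]
    return len(qualifying) >= min_tissues


if __name__ == "__main__":
    # Hypothetical normalized signals, not data from the arrays.
    example = {
        "colon": [(3.0, 1.0), (1.0, 1.0)],
        "stomach": [(2.0, 1.0)],
        "breast": [(1.5, 1.0), (1.6, 1.0)],
        "lung": [(1.0, 1.0)],
    }
    print(ca_positive(example))
```

All three thresholds are keyword parameters so that a different reading of the rule can be tested without changing the functions.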
In addition,
Nine sequences were estimated as positive (“+” mark in the column functional indications) for at least three out of four criteria measured for having a relevance in metastatic processes (i.e. measurements of the following tests: Analyses in Microarray, ISH, CA, functional tests). These sequences are highlighted and refer to SEQ ID NO:1-9. Detailed descriptions of these SEQ ID NO:1-9 are given in Examples 1-9. The column “ID” lists the internal identification number, “Sequence No” gives the number of the sequence used in the text.
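The selection rule just described (positive in at least three of the four criteria Microarray, ISH, CA and functional tests) amounts to a simple count. A minimal Python sketch is given below; the dictionary keys and the function name are hypothetical and the example entries are not taken from the summary table.

```python
from typing import Dict

# The four criteria named in the text; the key spellings are an assumption.
CRITERIA = ("microarray", "ish", "ca", "functional")


def is_highlighted(results: Dict[str, bool], min_positive: int = 3) -> bool:
    """True if a sequence is positive in at least min_positive criteria.

    A criterion missing from the dictionary is treated as not positive
    (e.g. not measured).
    """
    positives = sum(1 for criterion in CRITERIA if results.get(criterion, False))
    return positives >= min_positive


if __name__ == "__main__":
    # Hypothetical entries for illustration only.
    candidate = {"microarray": True, "ish": True, "ca": True, "functional": False}
    other = {"microarray": True, "ca": True}
    print(is_highlighted(candidate), is_highlighted(other))
```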
In
SEQ ID NO:2 (E4)
Another rat cDNA clone, originally derived from the above described SSH analysis of the pancreatic tumor test system was used to establish the corresponding EST cluster from rat EST databases. Nucleotide sequence identity with an identified rat sequence cluster was over 96%. Three further clones derived from this pancreatic test system also matched to this gene sequence cluster with over 96% nucleotide sequence identity. The consensus sequence of this cluster was established by using the software DNAStar, SeqManII (http://www.dnastar.com/), and was subsequently used in blast analysis using the human genome sequence database BLAT (http://genome.ucsc.edu/cgi-bin/hgBlat?command=start). This way, a nucleotide sequence identity of 90% was identified with the human mRNA AK130372 representing the locus FAM49B (family with sequence similarity 49, member B), alias BM-009. The corresponding NCBI reference sequence for this locus, NM—016623 comprises a length of 2219 nucleotides and codes for a predicted protein of unknown function. According to the AceView application, different transcripts of this gene exist, altogether putatively encoding 19 different protein isoforms.
AceView represents an integrated view of the human genes as reconstructed by alignment of all publicly available mRNAs and ESTs on the genome sequence (http://www.ncbi.nih.gov/IEB/Research/Acembly/index.html?human).
The amino acid sequence of FAM49B was analyzed by PSORT, a computer program for the prediction of protein localization sites in cells. According to PSORT2 (http://psort.nibb.ac.jp), the proteins encoded by this RNA are most likely located in the cytoplasm. The amino acid sequence of FAM49B was also analyzed by Pfam search. According to this analysis, this protein belongs to a family of several hypothetical eukaryotic proteins (DUF1394) of around 320 residues in length. The functions of this protein family are unknown. The gene is localized in the 8q24 region, an area found to be minimally overrepresented in prostate cancer (Tsuchiya, 2000, Am. J. Pathol. 160, 1799-1806).
In
This sequence was shown to be differentially expressed in Microarray Analysis comparing samples of metastasizing versus non metastasizing cells as exemplified for SEQ ID NO:1 in
In order to functionally examine whether FAM49B could be causally involved in the process of tumor progression, it was transiently overexpressed or transiently downregulated by RNA interference in HEK-293T cells and subsequently potential resulting influences on tumor cell properties were assayed. For overexpression a sequence corresponding to the NCBI reference sequence (http://www.ncbi.nlm.nih.gov/RefSeq/) was used. Experiments as previously exemplified for SEQ ID NO:1 in
Furthermore, FAM49B also affects the invasion potential of tumor cells.
In summary, FAM49B shows upregulation in metastasizing tumor cells versus non metastasizing tumor cells, and also displays upregulated expression in various tumor tissues versus normal tissue samples. Moreover, FAM49B is functionally involved in processes involved in tumor progression like increased proliferation and invasion. Therefore, this sequence may particularly be useful for staging of human tumor diseases, as well as for decisions on prognosis and treatment modalities. Furthermore, the FAM49B gene and its gene products may be used as target structures to develop therapeutic anti-cancer drugs.
SEQ ID NO:3 (H3)
Another rat cDNA clone, originally derived from the above described SSH analysis of the pancreas tumor test system was used to establish the corresponding EST cluster from rat EST databases. Identity to the ESTs within this cluster was 98%. Identity within the cluster was over 96%. The consensus sequence of this cluster was used to blast against human genome sequence databases. An identity of 89% was found to the human mRNA NM_024085 representing the locus FLJ22169. The reference RNA has a length of 3816 nucleotides and codes for a predicted protein of unknown function with 839 amino acids. According to Pfam Search the predicted protein shares homology to Autophagy protein Apg9. In yeast, 15 Apg proteins coordinate the formation of autophagosomes. Autophagy is a bulk degradation process induced by starvation in eukaryotic cells. Apg9 plays a direct role in the formation of the cytoplasm to vacuole targeting and autophagic vesicles, possibly serving as a marker for a specialised compartment essential for these vesicle-mediated alternative targeting pathways. According to Psort2, this protein most likely localizes to the membrane. According to AceView, this gene produces, by alternative splicing, 9 different transcripts altogether encoding 9 different protein isoforms.
In
This sequence was shown to be differentially expressed in Microarray Analysis comparing samples of metastasizing versus non metastasizing tumor cells as exemplified for SEQ ID NO:1 in
In order to functionally examine whether FLJ22169 could be causally involved in the process of tumor progression, it was transiently overexpressed or transiently downregulated by RNA interference in HEK-293T cells and subsequently potential resulting influences on tumor cell properties were assayed. For its overexpression a sequence corresponding to the NCBI reference sequence (http://www.ncbi.nlm.nih.gov/RefSeq/) was used. Experiments as previously exemplified for SEQ ID NO:1 in
In summary, FLJ22169 shows upregulation in metastasizing tumor cells versus non metastasizing tumor cells, and also displays upregulated expression in various tumor tissues versus normal tissue samples. Moreover, FLJ22169 is functionally involved in processes involved in tumor progression like increased proliferation and invasion. Therefore, this sequence may particularly be useful for staging of human tumor diseases, as well as for decisions on prognosis and treatment modalities. Furthermore, the FLJ22169 gene and its gene products may be used as target structures to develop therapeutic anti-cancer drugs.
SEQ ID NO:4 (B3)
Another rat cDNA clone, derived from the above described SSH analysis of the mammary tumor test system showed 99% identity to the rat mRNA CB717750. The corresponding rat EST cluster was used for a blast analysis against human genome databases. An identity of 90% was found on the nucleotide level to the human mRNA AK000178 representing the locus FLJ20171 which maps on chromosome 8. According to AceView, this locus produces, by alternative splicing, 13 different transcripts altogether encoding 13 different protein isoforms.
The corresponding NCBI Reference sequence NM_017697 comprises 2140 nucleotides and encodes a hypothetical protein of 358 amino acids. According to SMART analysis (Simple Modular Architecture Research Tool, http://smart.embl-heidelberg.de/) this protein contains an RNA recognition motif known as the eukaryotic putative RNA-binding region RNP-1 signature or RNA recognition motif (RRM). RRMs are found in a variety of RNA binding proteins, including heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs). The motif also appears in a few single stranded DNA binding proteins. The RRM structure consists of four strands and two helices arranged in an alpha/beta sandwich, with a third helix present during RNA binding in some cases.
In
This sequence was shown to be differentially expressed in Microarray Analysis comparing samples of metastasizing versus non metastasizing tumor cells as previously exemplified for SEQ ID NO:1 in
In order to functionally examine whether FLJ20171 could be causally involved in the process of tumor progression, it was transiently overexpressed or transiently downregulated by RNA interference in HEK-293T cells and subsequently potential resulting influences on tumor cell properties were assayed. For its overexpression a sequence corresponding to the NCBI reference sequence (http://www.ncbi.nlm.nih.gov/RefSeq/) was used. Experiments as previously exemplified for SEQ ID NO:1 in
In summary, FLJ20171 shows upregulation in metastasizing tumor cells versus non metastasizing tumor cells, and also displays upregulated expression in various tumor tissues versus normal tissue samples. Moreover, FLJ20171 is functionally involved in processes involved in tumor progression like increased proliferation and invasion. Therefore, this sequence may particularly be useful for staging of human tumor diseases, as well as for decisions on prognosis and treatment modalities. Furthermore, the FLJ20171 gene and its gene products may be used as target structures to develop therapeutic anti-cancer drugs.
SEQ ID NO:5 (D2)
Another rat cDNA clone was used to establish the corresponding EST cluster from rat EST databases. Identity within the cluster was over 96%. The consensus sequence of this cluster was used for a blast analysis against human genome databases. An identity of 80% was found to the human mRNA NM_030815 representing the locus C20orf126 which maps on chromosome 20. The Ensembl Genome Browser (http://www.ensembl.org/Homo_sapiens/) predicts that it produces one transcript with a length of 1290 bp. The coding sequence of the protein between the first in frame amino acid and the stop codon contains 176 residues. The first methionine corresponds to amino acid 44. The calculated molecular weight of the protein product is 15.5 kD.
Bioinformatic analysis according to PSORTII predicts that the subcellular localization of this protein is expected to be in the nucleus. Besides a nuclear localization signal, the predicted protein contains coiled coil domains. Such coiled coil structures (Psort Motiv, http://psort.nibb.ac.jp/) are found in some structural proteins, e.g. myosins, and in some DNA binding proteins as the so-called leucine zipper. In this structure two α-helices bind each other forming a coil, in which these helices show a 3.5 residue periodicity which is slightly different from the typical value estimated at 3.6. Thus, the detection of coiled coil structure by searching for 7-residue periodicity is relatively more accurate than usual secondary structure prediction. Currently a classical detection algorithm developed by A. Lupas is used (Lupas, 1991, Science 252, 1162-1164). The function of C20orf126 is still unknown. Pfam analysis shows that this protein does not belong to any recognized protein family.
In
This sequence was shown to be differentially expressed in Microarray Analysis comparing samples of metastasizing versus non metastasizing cells as previously exemplified for SEQ ID NO:1 in
In order to functionally examine whether C20orf126 could be causally involved in the process of tumor progression, it was transiently overexpressed or transiently downregulated by RNA interference in HEK-293T cells and subsequently potential resulting influences on tumor cell properties were assayed. For its overexpression a sequence corresponding to the NCBI reference sequence (http://www.ncbi.nlm.nih.gov/RefSeq/) was used. Experiments as previously exemplified for SEQ ID NO:1 in
In summary, C20orf126 shows upregulation in metastasizing tumor cells versus non metastasizing tumor cells, and also displays upregulated expression in various tumor tissues versus normal tissue samples. Moreover, C20orf126 is functionally involved in processes involved in tumor progression like increased proliferation and invasion. Therefore, this sequence may particularly be useful for staging of human tumor diseases, as well as for decisions on prognosis and treatment modalities. Furthermore, the C20orf126 gene and its gene products may be used as target structures to develop therapeutic anti-cancer drugs.
SEQ ID NO:6 (H5)
Another rat cDNA clone, originally derived from the above described SSH analysis of the mammary tumor test system was used for a blast analysis against rat EST databases. Similarity was found to the EST BE101513 which was then used to establish the corresponding EST cluster from rat EST databases. Identity within the cluster was over 96%. The consensus sequence of this cluster was used for blast analysis against the human genome browser BLAT (http://genome.ucsc.edu/cgi-bin/hgBlat?command=start). An identity of 90% was found to the human mRNA AK025697 representing the locus FBXO45 which maps on chromosome 3. According to AceView, this gene produces, by alternative splicing, 3 different transcripts altogether encoding 3 different protein isoforms. The corresponding NCBI Reference sequence XM_117294 comprises 4159 nucleotides and encodes a hypothetical protein of 286 amino acids. According to a comparison with the InterPro Database, a database of protein families, domains and functional sites (http://www.ebi.ac.uk/interpro/index.html), a Cyclin like F box motif is present in the product of this gene. The F-box domain was first described as a sequence motif found in cyclin-F that interacts with the protein SKP1. This relatively conserved structural motif is present in numerous proteins and serves as a link between a target protein and a ubiquitin-conjugating enzyme. According to InterPro, the SPIa/RYanodine receptor SPRY motif is also found in 2 isoforms from this gene. The SPRY domain is of unknown function.
In
This sequence was shown to be differentially expressed in Microarray Analysis comparing samples of metastasizing versus non metastasizing tumor cells as previously exemplified for SEQ ID NO:1 in
In order to functionally examine whether FBXO45 could be causally involved in the process of tumor progression, it was transiently overexpressed or transiently downregulated by RNA interference in HEK-293T cells and subsequently potential resulting influences on tumor cell properties were assayed. For its overexpression a sequence corresponding to the NCBI reference sequence (http://www.ncbi.nlm.nih.gov/RefSeq/) was used. Experiments as previously exemplified for SEQ ID NO:1 in
In summary, FBXO45 shows upregulation in metastasizing tumor cells versus non metastasizing tumor cells, and also displays upregulated expression in various tumor tissues versus normal tissue samples. Moreover, FBXO45 is functionally involved in processes involved in tumor progression like increased proliferation and invasion. Therefore, this sequence may particularly be useful for staging of human tumor diseases, as well as for decisions on prognosis and treatment modalities. Furthermore, the FBXO45 gene and its gene products may be used as target structures to develop therapeutic anti-cancer drugs.
SEQ ID NO:7 (G2)
Another rat cDNA clone, originally derived from the above described SSH analysis of the mammary tumor test system was used for a blast analysis against rat EST databases. Identity of 99% was found to the rat mRNA CO568861. This sequence was used for a blast analysis against human genome databases. An identity of 84% was found to the human mRNA AK025571 representing the locus FLJ21918 which maps on chromosome 16. According to AceView, this gene produces, by alternative splicing, 7 different transcripts altogether encoding 8 different protein isoforms. The corresponding NCBI Reference sequence NM_024939 comprises 4021 nucleotides and encodes a hypothetical protein of 717 amino acids. According to InterPro, the RNA-binding region RNP-1 (RNA recognition motif) is found in 5 isoforms from this gene. Many eukaryotic proteins that are known or supposed to bind single-stranded RNA contain one or more copies of a putative RNA-binding domain of about 90 amino acids. This is known as the eukaryotic putative RNA-binding region RNP-1 signature or RNA recognition motif (RRM). RRMs are found in a variety of RNA binding proteins, including heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs). The motif also appears in a few single stranded DNA binding proteins.
In
This sequence was shown to be differentially expressed in Microarray Analysis comparing samples of metastasizing versus non metastasizing tumor cells as previously exemplified for SEQ ID NO:1 in
In order to functionally examine whether FLJ21918 could be causally involved in the process of tumor progression, it was transiently overexpressed or transiently downregulated by RNA interference in HEK-293T cells and subsequently potential resulting influences on tumor cell properties were assayed. For its overexpression a sequence corresponding to the NCBI reference sequence (http://www.ncbi.nlm.nih.gov/RefSeq/) was used. Experiments as previously exemplified for SEQ ID NO:1 in
In summary, FLJ21918 shows upregulation in metastasizing tumor cells versus non metastasizing tumor cells, and also displays upregulated expression in various tumor tissues versus normal tissue samples. Moreover, FLJ21918 is functionally involved in processes involved in tumor progression like increased proliferation and invasion. Therefore, this sequence may particularly be useful for staging of human tumor diseases, as well as for decisions on prognosis and treatment modalities. Furthermore, the FLJ21918 gene and its gene products may be used as target structures to develop therapeutic anti-cancer drugs.
SEQ ID NO:8 (L1)
Another rat cDNA clone, originally derived from the above described SSH analysis of the mammary tumor test system was used for a blast analysis against rat EST databases. 100% identity was found to the rat EST AW919679. This EST was used for a blast analysis against mouse genome databases. Identity of 90% was found to the mouse mRNA AK088107. The protein encoded by this RNA shows 90% identity on the amino acid level to the human hypothetical protein NP_620129 encoded by the locus C19orf22, alias MGC16353. The corresponding NCBI Reference sequence NM_138774 comprises 1810 nucleotides and encodes a hypothetical protein of 166 amino acids. According to AceView, it produces, by alternative splicing, 8 different transcripts altogether encoding 7 different protein isoforms. PSORT II analysis, trained on yeast data, predicts that the subcellular location of this partial protein is expected to be in the nucleus (56%). The following domain was found: PKAKGRK. Pfam analysis shows that this protein does not belong to any recognized protein family.
In
This sequence, C19orf22, was shown to be differentially expressed in Microarray Analysis comparing its expression in metastasizing versus non metastasizing tumor cells as previously exemplified for SEQ ID NO:1 in
In order to functionally examine whether C19orf22 could be causally involved in the process of tumor progression, it was transiently overexpressed or transiently downregulated by RNA interference in HEK-293T cells and subsequently potential resulting influences on tumor cell properties were assayed. For its overexpression a sequence corresponding to the NCBI reference sequence (http://www.ncbi.nlm.nih.gov/RefSeq/) was used. Experiments as previously exemplified for SEQ ID NO:1 in
In summary, C19orf22 shows upregulation in metastasizing tumor cells versus non metastasizing tumor cells, and also displays upregulated expression in various tumor tissues versus normal tissue samples. Moreover, C19orf22 is functionally involved in processes involved in tumor progression like increased proliferation and invasion. Therefore, this sequence may particularly be useful for staging of human tumor diseases, as well as for decisions on prognosis and treatment modalities. Furthermore, the C19orf22 gene and its gene products may be used as target structures to develop therapeutic anti-cancer drugs.
SEQ ID NO:9 (G4)
Another rat cDNA clone, originally derived from the above described SSH analysis of the mammary tumor test system was used for a blast analysis against rat EST databases. 84% identity was found to the rat RNA BC030338 representing the locus LOC292139. The protein encoded by this locus shows 77% identity to the hypothetical human protein NP_060800 representing the locus KIAA1598. The corresponding NCBI reference mRNA for this locus NM_018330 comprises 3417 nucleotides and encodes a hypothetical protein of 456 amino acids which maps on chromosome 10. According to AceView, this gene produces, by alternative splicing, 11 different transcripts altogether encoding 11 different protein isoforms. PSORT II analysis predicts that the subcellular location of this protein is expected to be in the nucleus (60%). Pfam Search shows that the amino-terminus of the protein shares homology with the SMC domain of Chromosome segregation ATPases.
In
This sequence was shown to be differentially expressed in microarray analysis comparing samples of metastasizing versus non-metastasizing tumor cells, as previously exemplified for SEQ ID NO:1 in
To examine functionally whether KIAA1598 could be causally involved in tumor progression, it was transiently overexpressed or transiently downregulated by RNA interference in HEK-293T cells, and any resulting effects on tumor cell properties were subsequently assayed. For its overexpression, a sequence corresponding to the NCBI reference sequence (http://www.ncbi.nlm.nih.gov/RefSeq/) was used. Experiments as previously exemplified for SEQ ID NO:1 in
In summary, KIAA1598 is upregulated in metastasizing versus non-metastasizing tumor cells and also shows upregulated expression in various tumor tissues versus normal tissue samples. Moreover, KIAA1598 is functionally involved in processes associated with tumor progression, such as increased proliferation and invasion. Therefore, this sequence may be particularly useful for staging of human tumor diseases, as well as for decisions on prognosis and treatment modalities. Furthermore, the KIAA1598 gene and its gene products may be used as target structures to develop therapeutic anti-cancer drugs.
All clones were used to perform BLAST analyses against gene sequence databases. From these investigations, 89 of 235 deduced human sequences were chosen, and the corresponding cDNAs were spotted at 4 ng per spot onto glass slides (Corning CMT UltraGAPS slides) to create a diagnostic tool, a so-called cDNA chip. Subsequent hybridization experiments showed that all of these 89 sequences are differentially expressed in at least one of several pairs of metastasizing and non-metastasizing cells, e.g., in five pairs of primary tumor and metastasis samples from colon cancer patients.
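The selection rule described above (a sequence counts as differentially expressed if it differs in at least one pair of metastasizing and non-metastasizing samples) can be illustrated with a minimal sketch. The text does not state the normalization or the cut-off that was applied, so the two-fold threshold, the function names, and the intensity values below are purely hypothetical assumptions for illustration.

```python
import math


def log2_ratio(met_signal, non_met_signal):
    """Log2 ratio of metastasizing over non-metastasizing spot intensity."""
    return math.log2(met_signal / non_met_signal)


def differentially_expressed(pairs, min_abs_log2=1.0):
    """Return True if at least one (metastasizing, non-metastasizing) pair
    differs by the assumed threshold (default: two-fold, up or down)."""
    return any(abs(log2_ratio(m, n)) >= min_abs_log2 for m, n in pairs)


# Hypothetical spot intensities for two sequences across three sample pairs.
example = {
    "seq_A": [(1200.0, 400.0), (800.0, 790.0), (650.0, 700.0)],
    "seq_B": [(500.0, 520.0), (610.0, 600.0), (480.0, 450.0)],
}

selected = [name for name, pairs in example.items()
            if differentially_expressed(pairs)]
print(selected)
```

In this sketch only `seq_A` is selected, because one of its pairs shows a three-fold difference while every pair of `seq_B` stays below the assumed two-fold cut-off.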
In addition, the expression patterns of these 89 sequences in established cell lines displaying different metastasizing potentials were analysed. The following cell lines were utilized for this purpose:
Accession numbers of all sequences that showed differential expression in at least one of these systems in microarray analysis are listed in
An example of a gene-chip hybridization experiment utilizing cDNAs from the endometrial cancer cell line HEC-1A and the metastasizing endometrial cancer cell line AN3-CA (ATCC HTB-112 and -111) is shown in
In summary, all sequences listed in
These findings show that this cDNA chip comprising the listed sequences of
A therapeutic modulation of a cancer antigen's function could be established by interfering with the expression of such a cancer antigen, e.g., by means including but not limited to anti-sense RNA, RNAi or catalytic RNA technologies, or by various DNA or modified DNA oligonucleotide approaches.
Alternatively, antibodies directed against these cancer antigens could be suitable anti-cancer drugs, as could drugs that interfere with activities of these cancer antigens, such as, but not limited to, enzymatic or structural activities, or with their specific localization. Also, drugs that act on signaling pathways influenced by these cancer antigens could give rise to potent anti-cancer drugs.
In a particular embodiment of this invention, such therapeutic approaches could be suitable for the treatment of metastatic cancer disease, or for the prevention or suppression of metastatic tumor progression, and for the treatment, prevention and suppression of minimal residual tumor disease.
Number | Date | Country | Kind |
---|---|---|---|
05103409.8 | Apr 2005 | EP | regional |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/EP2006/003713 | 4/21/2006 | WO | 00 | 10/25/2007 |