Cancer of the colorectal part of the gastrointestinal tract is a frequently occurring disorder. In a first stage a benign tumour (adenoma) occurs which can turn into a malignant cancer (adenocarcinoma). Not all adenomas progress to carcinomas. In fact this progression into a carcinoma happens in only a small subset of tumours. Initiation of genomic instability is a crucial step and occurs in two ways in colorectal cancer (Lengauer et al. (1998) Nature, 396, 643-649). DNA mismatch repair deficiency leading to microsatellite instability (abbreviated as MSI or MIN), has been most extensively studied (di Pietro et al. (2005) Gastroenterol., 129, 1047-1059), and explains about 15% of adenoma to carcinoma progression. MSI tumours occur due to defects in the DNA mismatch repair genes (MMR), which promote tumour progression due to the accumulation of mutations and epigenetic changes. In the other 85% of cases where colorectal adenomas progress to carcinomas, genomic instability occurs at the chromosomal level (CIN) giving rise to aneuploidy. While for a long time these chromosomal aberrations have been regarded as random noise, secondary to cancer development, it has now been well established that these DNA copy number changes occur in specific patterns and are associated with different clinical behaviour (Hermsen et al. (2002). Gastroenterol. 123, 1109-1119). Chromosomal aberrations frequently reported in colorectal cancers are 7pq, 8q, 13q, 20q gains and 4pq, 5q, 8p, 15q, 17p, 18q losses. It was shown that 8q, 13q and 20q gains and 8p, 15q, 17p and 18q losses, are associated with progression of colorectal adenomas to carcinomas (Hermsen et al. (2002) cited above). In summary, colorectal cancer results from the accumulation of chromosomal aberrations, mutations and epigenetic changes ultimately causing alterations in gene expression at the RNA and protein level. 
Recent investigations have shown that a novel class of non-coding RNA molecules called microRNAs (miRNAs) can also act as oncogenes and tumour suppressors during cancer development. These non-coding RNAs contribute to tumour progression through changes in their expression. Microarray-based profiling studies have shown, in a number of different tumor tissues and tumor cell lines, that miRNAs are differentially expressed. This differential expression was, in some cases, correlated with the type of tumor and could be used to distinguish normal from aberrant (tumor) tissue (Bloomston et al. (2007) JAMA 297:1901-1908; Iorio et al. (2005) Cancer Res. 65:7065-7070). In some cases, the unique small RNA profile correlated with prognosis (Yanaihara et al. (2006) Cancer Cell 9:189-198; Calin et al. (2005) N. Engl. J. Med. 353:1793-1801). Importantly, expression profiling studies in chronic lymphocytic leukemia have shown that miRNA signatures can classify leukemia samples according to their developmental history and categorize poorly differentiated tumours more accurately than mRNA expression profiles (Calin et al. (2005) N. Engl. J. Med. 353:1793-1801).
Therefore, these experiments reveal that the accumulation of specific combinations of chromosomal aberrations, epigenetic changes and mutations that characterize a tumour, determine its miRNA expression signature and vice versa. Consequently, miRNA expression profiles are a powerful tool for diagnosis, determining prognosis, progression risk and response to therapy of human cancers.
It is an object of the invention to molecularly identify and characterize the above described progression of adenomas into adenocarcinomas at as early a stage as possible, to allow early stage and more specific treatment of (pre-)malignant tumors while avoiding unnecessary surgical intervention. It is a further object of the invention to identify high-risk pre-malignant colorectal tumors at a stage when the malignant phenotype (i.e. invasion and metastasis) has not yet emerged. At such a stage there usually are no symptoms yet that evoke clinical evaluation resulting in in situ optical analysis or microscopic analysis of biopsy or resection material. Current diagnostic procedures for early detection of colorectal cancer include colonoscopy, sigmoidoscopy and faecal tests such as the faecal occult blood test (FOBT) and DNA faeces tests. Firstly, colonoscopy and sigmoidoscopy are not only quite uncomfortable for the patient, but their specificity is low since many non-progressive adenomas are detected. Secondly, sensitivity is hampered by the experience and devotion (retraction time) of the endoscopist and by the shape of the tumors (flat lesions go undetected more easily than polypoid lesions). In addition, sigmoidoscopy only covers part of the large intestine. When colorectal tumors/lesions are identified during colonoscopy, polypectomies, endoscopic resections or biopsies are performed to confirm the nature of these lesions by histological analysis. Existing commercial faecal based tests have yielded promising results but need further improvement. FOBT is the most recognized method of selecting high risk individuals, and has been demonstrated to reduce death from colorectal cancer at the population level when applied in a population based screening program. Unfortunately, with FOBT a considerable percentage of the tumours is missed and the already bleeding tumour is often discovered at a relatively late stage.
DNA faeces tests so far have mainly focused on detecting point mutations in only a limited number of genes, i.e. KRas, APC and p53. KRas mutations have proven not to be very informative, and testing for APC and p53 mutations is technically demanding, and thus expensive. In addition, substantial numbers of colorectal cancers (CRC) do not carry these mutations.
More recently, DNA microarray-based tumor gene expression profiles have been used for cancer diagnosis. However, studies have been limited to a few cancer types and have spanned multiple technology platforms complicating comparison among different datasets. Expression analysis performed at the protein level has a number of drawbacks, the first being linked to the origin of the samples, i.e. the need for an invasive colonoscopy.
The present invention shows that small RNAs (miRNAs and other small RNAs) can serve as informative biomarkers for early detection/secondary prevention of colorectal cancer and biological classification of established colorectal cancer in order to facilitate therapeutic decision making. As miRNAs and other small RNAs are produced from longer precursors it is also possible to use the precursor as a biomarker. As these RNA molecules are typically detected in groups, for instance, using (micro)arrays, the present invention also provides collections of probes for the detection of miRNAs, other small RNAs and/or precursors thereof. These collections are not only useful for the detection of said biomarkers in the context of colorectal cancer. They are also useful in other diagnostic tests to type or classify samples of nucleic acid. miRNAs and other small RNAs are oligonucleotides of between 15 and 200 bases, preferably between 15 and 100 bases. Typically these RNAs are between 18 and 27 bases.
The invention therefore provides a collection of probes or primers for detecting at least 5, preferably at least 10, more preferably at least 20 of the nucleic acid molecules of table 23 (the miRNA and star sequences) and/or table 24 (the pre-miRNA sequences). In a preferred embodiment said collection of probes or primers comprises probes or primers for at least 100, more preferably at least 500 of the nucleic acid molecules of table 23 (the miRNA and star sequences) and/or table 24 (the pre-miRNA sequences). Preferably said collection of probes or primers comprises at least one probe for each of the nucleic acid molecules of table 23 (the miRNA and star sequences) and/or table 24 (the pre-miRNA sequences). In a preferred embodiment said collection of probes or primers comprises probes or primers for detecting nucleic acid molecules of table 23 (the miRNA and star sequences). Collections of these probes or primers classify samples more accurately, presumably because these molecules and not the precursors are functional in a cell.
In a preferred embodiment a collection of probes of the invention comprises the nucleic acid sequences identified in table 21. These collections are particularly suited for diagnostic tests. The expression levels of nucleic acid molecules identified in table 23 or in table 24 can be determined by labelling the small RNA complement of a sample, for example by direct chemical crosslinking of label to the RNA backbone, end-labelling by adapter ligation or RNA extension, and subsequent hybridization to reverse complement DNA probes (for instance as depicted in table 21) under stringent conditions using state-of-the-art techniques. Other, potentially more sensitive methods for small RNA detection have been described, including the use of DNA-modified probes (e.g. locked-nucleic-acid (LNA) probes) for hybridization and nucleic acid amplification-based methods using primers specific for the nucleic acid molecules identified in table 23 or in table 24. A preferred method of amplification is real-time PCR. Means and methods for amplifying miRNA or other small RNA molecules and precursors thereof are described in Lee et al. (2006) Int. J. Cancer Vol. 120, pp. 1046-1054; Gaur et al. (2007) Cancer Res. Vol. 67, pp. 2456-2468; and Bandres et al. (2006) Molecular Cancer Vol. 5: 29. Hybridization is preferably performed under stringent conditions. Hybridization stringency regulates the percentage of nucleotides which must match on two unrelated single-stranded nucleic acid molecules before they will base pair with each other to form a duplex, given a certain set of physical and chemical conditions. The hybridization stringency is used to determine when a hybridization probe and a target nucleic acid will come together, and can be set by the researcher by varying the conditions. In general, if the percentage of matching nucleotides is lower than 90 percent, the two single-stranded nucleic acid molecules are considered nonhomologous and any hybridization is considered nonstringent.
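By way of illustration only, the derivation of a reverse complement DNA probe from a small RNA sequence can be sketched as follows. The function name and the example sequence are hypothetical and are not taken from table 21 or table 23; the sketch assumes an unmodified RNA sequence consisting of the bases A, C, G and U.

```python
# Illustrative sketch only: derive a reverse complement DNA probe for a small RNA.
# Watson-Crick partner (written as DNA) for each RNA base.
_RNA_TO_DNA_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A"}


def reverse_complement_dna(rna: str) -> str:
    """Return the DNA sequence (5'->3') that is the reverse complement of `rna`."""
    rna = rna.upper()
    unknown = set(rna) - set(_RNA_TO_DNA_COMPLEMENT)
    if unknown:
        raise ValueError(f"unexpected RNA bases: {sorted(unknown)}")
    # Walk the RNA from its 3' end to its 5' end and emit the pairing DNA base.
    return "".join(_RNA_TO_DNA_COMPLEMENT[base] for base in reversed(rna))


# Hypothetical 22-nucleotide small RNA, used purely as an example input.
example_rna = "UAGCUUAUCAGACUGAUGUUGA"
example_probe = reverse_complement_dna(example_rna)
```

The resulting probe has the same length as the input RNA; probes carrying analogues such as LNA would need additional handling that is not shown here.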
Conditions which can be used for stringent hybridization of the probes and primers of the invention are known to the person skilled in the art. Other conditions can be obtained by a person skilled in the art by comparing hybridization of the probe or primer with a sequence that has more than 10 percent mismatches with the target sequence. In this way conditions can be selected that allow discrimination between the amount of hybridization to the unmodified target sequence and the sequence having 10% mismatches. Typically, though not necessarily, not all of the probes give a signal when analysing nucleic acid samples. However, starting with a collection of probes of the invention one can easily compare levels of nucleic acid of table 23 or table 24, for instance through hybridization of the respective probes, and select probes for inclusion in the diagnostic test on the basis of the levels of hybridization. An obtained pattern for the levels of the nucleic acids, such as the pattern obtained for the levels of hybridization to the respective probes, is indicative of the type of sample tested. Different types of samples give different patterns and can thus be classified on the basis of pattern. To this end the invention provides the use of a collection of probes or primers of the invention for detecting the levels of at least 10 nucleic acid molecules of table 23 or table 24, preferably through detecting hybridization of nucleic acid from a sample to a collection of probes of the invention. The same types of samples often show a limited amount of variation with respect to the patterns they produce. It is possible to select probes or primers that produce more homogeneous patterns by testing a number of samples. To this end in a preferred embodiment the invention provides a method for selecting probes for testing nucleic acid samples, said method comprising hybridizing nucleic acid from a representative number of said samples to a collection of probes according to the invention,
The predictive value of classifiers as described above can be further improved. A first way would be to increase the sample set; this reduces noise and leads to the selection of more predictive small RNAs. This can in some cases increase the accuracy of prediction and decrease the error rate. However, in other cases the predictive value of a diagnostic test has already reached its limit and increasing the sample set will not result in a more accurate predictive set. A further way to improve the predictive value of a diagnostic test of the invention is to combine the dataset based on the microarray data with other data sets. For example, a diagnostic test of the invention may combine data based on the small RNA expression levels with array CGH data obtained from the same samples. This can result in a more accurate prediction of disease state. A further option is to use combined diagnostic tests in a step-wise procedure, where the first diagnostic test will have the best predictive value. For some individuals that are difficult to classify with a method of the invention, a further diagnostic test will follow that will ultimately result in a diagnosis of the disease state. Where the second test on its own might not have given a clear answer, in combination with a method of the invention, a defined and reliable diagnosis can be given also for individuals or samples that are difficult to classify with high confidence with a method of the invention.
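The step-wise procedure described above can be sketched, purely as an illustration, as follows. The score scale, the thresholds and the simple averaging of the two tests are assumptions made for this example and are not prescribed by the invention; the function name is hypothetical.

```python
from typing import Callable


def stepwise_diagnosis(
    first_score: float,
    second_test: Callable[[], float],
    low: float = 0.3,
    high: float = 0.7,
) -> str:
    """Classify with the first test; use a second test only for unclear cases.

    Scores are assumed to lie between 0 and 1, a higher score meaning that
    the sample looks more like the disease state (assumption of this sketch).
    """
    if first_score >= high:
        return "positive"
    if first_score <= low:
        return "negative"
    # Ambiguous first result: run the second test and combine both scores.
    # Plain averaging is an assumption; any combination rule could be used.
    combined = (first_score + second_test()) / 2.0
    return "positive" if combined >= 0.5 else "negative"
```

The second test is passed as a callable so that it is only performed for samples that the first test cannot classify with confidence.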
A sample can comprise cells. In a preferred embodiment the sample comprises colon cells, colon derived tumor cells or nucleic acid derived therefrom. Typically, however, a sample has undergone some type of manipulation prior to analysing the presence or absence therein of a miRNA, a small RNA and/or hairpin RNA according to the invention. Such manipulation typically, though not necessarily, comprises isolation of at least (part of) the nucleic acid of the cells. The nucleic acid in a sample may also have undergone some type of amplification and/or conversion prior to analysis with a method of the invention. A miRNA or a small RNA or a precursor thereof can be detected directly via a complementary probe specific for said miRNA or small RNA, or indirectly. Indirect forms include, but are not limited to, conversion into DNA or protein and subsequent specific detection of the product of the conversion. Conversion can also involve several conversions. For instance, RNA can be converted into DNA and subsequently into RNA which in turn can be translated into protein. Of course such conversions may involve adding the appropriate signal sequences such as promoters, translation initiation sites and the like. Other non-limiting examples include amplification, with or without conversion, of said miRNA or small RNA in said sample, for instance by means of PCR or NASBA or another nucleic acid amplification method. All these indirect methods have in common that the converted product retains at least some of the specificity information of the original miRNA, small RNA and/or hairpin RNA, for instance in the nucleic acid sequence or in the amino acid sequence or other sequence. Indirect methods can further comprise that nucleotides or amino acids other than those occurring in nature are incorporated into the converted and/or amplified product.
Such products are of course also within the scope of the invention as long as they comprise at least some of the specificity information of the original miRNA, a small RNA and/or hairpin RNA. By at least some of the specificity information of the original miRNA, a small RNA and/or hairpin RNA is meant that the converted product (or an essential part thereof) is characteristic for the miRNA, a small RNA and/or hairpin RNA of which the presence or absence is to be determined. A preferred small RNA is a miRNA. A hairpin RNA is herein also referred to as a precursor of a miRNA. The hairpin does not have to contain the entire sequence of the precursor but contains at least the stem loop structure.
A diagnostic test of the invention is particularly suited for distinguishing, typing and/or discriminating between different stages of disease. A diagnostic test of the invention preferably comprises a test that detects illness in an animal, preferably a primate, more preferably a human. A diagnostic test or a set/collection of probes of the invention is particularly suited to discriminate between different types of colon cells. A diagnostic test or a set/collection of probes of the invention are particularly suited to discriminate between samples comprising nucleic acid of normal colon cells, colon adenoma cells and/or colon carcinoma cells. A diagnostic test of the invention is particularly suited for distinguishing, typing and/or discriminating samples of different stages of tumorigenesis. Thus preferably at least one of said at least two types of samples comprises nucleic acid from tumor cells. In a particularly preferred embodiment said tumor cells are colon tumor cells. A tumor or tumour is an abnormal growth or mass of tissue. A tumor can be either malignant or benign. Preferred benign tumors are pre-malignant tumors, preferably adenomas. Nearly all tumors are examples of neoplasia, although certain developmental malformations or inflammatory masses may occasionally be referred to as tumors. This latter meaning is not intended herein. Tumors are typically caused by mutations in DNA of cells, which interfere with a cell's ability to regulate and limit cell division. Some of these mutations involve loss or gain of large parts of chromosomes (typically at least 100 megabases). Such loss or gain mutations often involve chromosome arms. An accumulation of mutations is needed for a tumor to emerge. Mutations that activate oncogenes or repress tumor suppressor genes can eventually lead to tumors. Cells have mechanisms that repair DNA and other mechanisms that cause the cell to destroy itself by apoptosis if DNA damage gets too severe. 
Mutations that repress the genes for these mechanisms can also eventually lead to cancer. A mutation in one oncogene or one tumor repressor gene is usually not enough for a tumor to occur. A combination of a number of mutations is necessary. As a result of the transformation tumor cells express genes and RNAs at levels that differ from their normal counterpart. The same holds true for different stages of tumorigenesis. Later stages express genes and RNAs at different levels when compared to less progressed stages. In the present invention it was further found that tumor progression can be monitored by determining the level of RNA using a collection of probes of the invention. It was found that differences in the level of RNA in samples of different stages of the same type of tumor (e.g. colorectal epithelial neoplasia) produce different signatures that can be used to predict tumor progression. Thus differences in the level of specific RNAs of a tumor at a certain stage can be used to predict progression thereof, particularly within a specific time span. The prediction is more accurate for progression within two years, particularly within a year, more in particular within 6 months from the moment the sample was taken. A diagnostic test of the invention can thus be used to determine a prognosis for the individual or the future behaviour of the tumor. A diagnostic test of the invention is particularly suited to characterise a tumor. The invention therefore provides the use of a test of the invention, preferably a diagnostic test of the invention, for characterising a tumor. A diagnostic test of the invention can also be used to predict therapy outcome or efficacy of treatment or be used to follow the effect of a therapy. The invention therefore provides the use of a test of the invention, preferably a diagnostic test of the invention for predicting therapy outcome or predicting efficacy of treatment or for following an effect of a therapy. 
The invention further provides a kit for a diagnostic test of the invention, said kit comprising a collection of probes or primers of the invention.
A tumor is preferably a tumor of the anus, bladder, bile duct, bone, brain, breast, cervix, colon/rectum, endometrium, esophagus, eye, gallbladder, head/neck, liver, kidney, larynx, lung, mediastinum (chest), mouth, ovaries, pancreas, penis, prostate, skin, small intestine, stomach, tailbone, testicles or of the thyroid. A tumor is preferably a papilloma/carcinoma, a choriocarcinoma, an endodermal sinus tumor, a teratoma, an adenoma/adenocarcinoma, dysplasia, hyperplasia, intraepithelial neoplasia, squamous cell carcinoma, undifferentiated carcinoma, transitional cell carcinoma, carcinoma not otherwise specified, a soft tissue sarcoma, a melanoma, a fibroma/fibrosarcoma, lipoma/liposarcoma, a leiomyoma/leiomyosarcoma, a rhabdomyoma/rhabdomyosarcoma, a mesothelioma, an angioma/angiosarcoma, an osteoma/osteosarcoma, a chondroma/chondrosarcoma, a glioma or a lymphoma/leukaemia.
The invention further provides a method for selecting probes or primers of the invention for testing nucleic acid samples. Subsets of probes or primers are preferably selected from the collection of probes or primers for the nucleic acids of table 23 and/or table 24. A probe or a primer for a nucleic acid molecule of table 23 or table 24 is specific for said nucleic acid molecule when used under stringent conditions. Probes are preferably selected from the collection of probes of table 21. The selection is preferably done by comparing at least two types of samples with each other. As mentioned herein above, it is preferred that at least one of said types of samples is a particular type of tumor.
Preferably said type of sample is a particular stage of tumor type. Preferred stages are the benign stage of a tumor, the progressed but not yet metastasized stage and the metastasized stage. Preferred stages thus include the progressed (i.e. locally invasive) but not yet metastasized stage and the respective metastasized stages (i.e. locoregional lymph node metastasis and metastasis to distant organs). It is preferred that at least one other type of sample is a reference sample. The reference sample can be any type of sample. It is preferred that when the test type of sample is of a human, the reference sample type is also human. In one preferred embodiment said reference sample comprises normal cells, preferably from the type of cells that the tumor originated from. The terms normal cells, counterpart etc. are used herein typically to refer to unaffected cells, counterparts etc., such as are preferably obtained from a healthy individual. In another preferred embodiment said normal cells are obtained from an affected individual but from a site that is not affected by the disease. In a preferred embodiment said normal cells are of the same type of cells that the test sample is derived from and that are suspected of containing diseased cells. In another preferred embodiment the reference sample is the same type of tumor but at a different stage of progression. In yet another preferred embodiment the reference sample is of a different tissue, preferably from blood. In a preferred embodiment the sample and the reference (sample) are blood samples. A preferred type of blood sample is whole blood, preferably peripheral whole blood or a fraction thereof. A preferred fraction is the fraction of peripheral blood mononuclear cells (PBMC). The small RNA or precursors do not have to be present in intact cells in the blood but can also be present in free form or associated with cell debris or in protein bound complexes. Thus in another preferred embodiment said blood sample comprises serum or plasma.
Plasma and serum are particularly suited to detect a nucleic acid molecule of table 23 or table 24 that is not normally present in blood cells. A tumor sample can be obtained in a number of ways. A preferred sample is a biopsy sample. Another preferred sample is a fine needle aspirate (FNA), stool or faeces sample. Such samples contain cells and cell derivatives from which RNA can be prepared. These samples are particularly useful when the individual is suffering from or suspected of suffering from a colon tumor. The invention thus further provides a method for detecting cells in a faeces sample comprising detecting small RNA or precursors thereof in said sample. The small RNA or precursors thereof do not have to be present in intact cells in faeces but can also be associated with cell debris or in protein bound complexes. The invention thus further provides a method for detecting small RNA or precursors thereof in a faeces sample comprising detecting small RNA in said sample. In a preferred embodiment said method comprises detecting the level of nucleic acid molecules identified in table 8, table 9a, table 9b, table 10a, table 10b, table 11a, table 11b, table 12a, table 12b, table 12c, table 23, table 24 and/or table 34. Preferably the level of the nucleic acid molecules identified in table 11a and/or table 11b is detected. In a particularly preferred embodiment the level of the nucleic acid molecules identified in table 12a, table 12b, table 12c and/or table 34 is detected.
Many tumors, particularly at the later stages shed cells into the blood stream. These cells can be detected. Using a method for selecting probes of the invention it is possible to select probes that distinguish, type or discriminate samples of blood from healthy individuals from samples of blood from individuals that suffer from a particular type of tumor. Further provided is a method for detecting cells in a sample of blood or a fractionated part thereof, comprising detecting small RNA in said sample. The sample itself does not have to contain cells. Particularly in blood it is possible that cells can be detected that have shed the indicative nucleic acid into the blood stream, for instance upon death of said cell. Detection of this nucleic acid in the blood, plasma or serum is indicative for the cell that has shed nucleic acid in the blood stream. The cells to be detected are preferably tumor cells, more preferably colon tumor cells. In a preferred embodiment the level of nucleic acid molecules identified in table 13, table 14a, table 14b, table 15a, table 15b, table 16a, table 16b, table 18a, table 18b, or table 18c, table 23, table 24 and/or table 35 is detected. Preferably the level of the nucleic acid molecules identified in table 16a and/or table 16b is detected. In a particularly preferred embodiment the level of the nucleic acid molecules identified in table 18a, table 18b, table 18c and/or table 35 is detected.
The selection methods described herein above yield collections of probes or primers that can be used to distinguish, type and/or discriminate samples containing nucleic acid from cells. As a result the invention provides collections of probes or primers that are obtainable by such a method. In a further aspect the invention provides a subset comprising at least ten probes for the nucleic acids of table 23 and/or table 24. The invention further provides a collection of probes comprising the nucleic acid molecules identified in table 21. A subset of probes is preferably a subset of the collection of the probes of table 21. In a preferred embodiment said subset comprises at least 10 of the probes of table 21.
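As a minimal sketch of how probes might be selected by comparing two types of samples, the example below ranks probes by a Welch-type t statistic computed from their hybridization levels and keeps the highest-ranking ones. The statistic, the function name and the data layout (a mapping from probe identifier to a list of levels, one per sample) are assumptions made for illustration; the invention does not prescribe a particular ranking method.

```python
from statistics import mean, stdev
from typing import Dict, List


def select_probes(
    group_a: Dict[str, List[float]],
    group_b: Dict[str, List[float]],
    n_probes: int = 10,
) -> List[str]:
    """Rank probes by how well they separate two sample types; return the top n.

    Each probe needs at least two measurements per group (required by stdev).
    """
    scores = {}
    for probe in group_a.keys() & group_b.keys():
        a, b = group_a[probe], group_b[probe]
        diff = abs(mean(a) - mean(b))
        # Standard error of the difference between the two group means.
        se = (stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b)) ** 0.5
        if se > 0:
            scores[probe] = diff / se
        else:
            # No spread in either group: perfect separation or no difference.
            scores[probe] = float("inf") if diff > 0 else 0.0
    # Highest score first; ties are broken by probe identifier for determinism.
    return sorted(scores, key=lambda p: (-scores[p], p))[:n_probes]
```

With the default `n_probes=10` the function returns a subset of ten probes, corresponding to the minimum subset size mentioned above; in practice the ranking would be validated on an independent set of samples.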
In a particularly preferred embodiment the invention provides a subset of probes of the collection of probes of table 21, comprising and/or consisting of the probes identified in table 1, table 2a, table 2b, table 2c, table 2d, table 3a, table 3b, table 3c, table 3d, table 4a, table 4b, table 4c, table 4d, table 5a, table 6a, table 7a, table 9a, table 9b, table 10a, table 10b, table 11a, table 11b, table 12a, table 12b, table 12c, table 14a, table 14b, table 15a, table 15b, table 16a, table 16b, table 18a, table 18b, table 18c, table 19c or table 19d, table 20, table 25a, table 26, table 27, table 28a, table 28b, table 29a, table 29b, table 30, table 31a, table 31b, table 32a, table 32b and/or table 33. In another preferred embodiment said subset of probes from the collection of probes of table 21, comprises and/or consists of the probes identified in table 8 or table 13.
In a particularly preferred embodiment the invention provides a subset of probes or primers for detecting nucleic acid molecules of table 23 or table 24, comprising and/or consisting of probes or primers for the nucleic acid molecules identified in table 1, table 2a, table 2b, table 2c, table 2d, table 3a, table 3b, table 3c, table 3d, table 4a, table 4b, table 4c, table 4d, table 5a, table 6a, table 7a, table 9a, table 9b, table 10a, table 10b, table 11a, table 11b, table 12a, table 12b, table 12c, table 14a, table 14b, table 15a, table 15b, table 16a, table 16b, table 18a, table 18b, table 18c, table 19c, table 19d, table 20, table 25a, table 26, table 27, table 28a, table 28b, table 29a, table 29b, table 30, table 31a, table 31b, table 32a, table 32b, table 33, table 34 and/or table 35. In another preferred embodiment said subset of probes or primers for detecting nucleic acid molecules of table 23 or table 24, comprises and/or consists of the probes or primers for the nucleic acid molecules identified in table 8 or table 13.
The invention further provides a method for determining whether a sample comprises nucleic acid from a particular type of cells said method comprising hybridising nucleic acid from said sample to a collection of probes for the nucleic acid molecules as identified in table 1, table 2a, table 2b, table 2c, table 2d, table 3a, table 3b, table 3c, table 3d, table 4a, table 4b, table 4c, table 4d, table 5a, table 6a, table 7a, table 9a, table 9b, table 10a, table 10b, table 11a, table 11b, table 12a, table 12b, table 12c, table 14a, table 14b, table 15a, table 15b, table 16a, table 16b, table 18, table 19c, table 19d, table 20, table 21, table 25a, table 26, table 27, table 28a, table 28b, table 29a, table 29b, table 30, table 31a, table 31b, table 32a, table 32b, table 33, table 34 and/or table 35, quantifying hybridization of said probes with nucleic acid in said samples, and comparing the result of said quantification with a reference. In a preferred embodiment said cells are tumor cells or suspected of being tumor cells. Preferably said cells are colon cells.
The invention further provides a subset of probes or primers according to the invention, comprising probes or primers for each of the nucleic acids identified in table 1, table 2a, table 2b, table 2c, table 2d, table 3a, table 3b, table 3c, table 3d, table 4a, table 4b, table 4c, table 4d, table 5a, table 6a, table 7a, table 9a, table 9b, table 10a, table 10b, table 11a, table 11b, table 12a, table 12b, table 12c, table 14a, table 14b, table 15a, table 15b, table 16a, table 16b, table 18a, table 18b, table 18c, table 19c, table 19d, table 20, table 21, table 25a, table 26, table 27, table 28a, table 28b, table 29a, table 29b, table 30, table 31a, table 31b, table 32a, table 32b, table 33, table 34 and/or table 35. Further provided is a subset of probes or primers from the collection of probes or primers of the invention, comprising probes or primers for each of the nucleic acids identified in table 8 or table 13. In a preferred embodiment said cells are tumor cells or suspected of being tumor cells. Preferably said cells are colon cells.
A nucleic acid molecule of the invention such as a probe or primer often can contain a nucleotide analogue. Many analogues are presently available that mimic at least some of the base pairing characteristics of the “standard” nucleotides A, C, G, T and U. Alternatively, nucleotide analogues such as inosine can be incorporated into a nucleic acid molecule of the invention. Other types of analogues include LNA, PNA, morpholino and the like. Further methods for the specific detection of nucleic acid include but are not limited to specific nucleic acid amplification methods such as polymerase chain reaction (PCR) and NASBA. Such amplification methods typically use one or more specific primers. A primer or probe preferably comprises between 12 and 40 nucleotides having at least 90% sequence identity to a sequence as depicted in table 23 or table 24, or the complement thereof. A probe, primer or nucleic acid molecule of the invention is preferably single stranded. However, for shipping, production and/or therapeutic purposes a double stranded molecule is sometimes preferred. For example a miRNA or small RNA is processed in the cell in double stranded form. Such analogues are considered to be a functional equivalent of a nucleic acid molecule such as a probe or a primer of the invention, when they exhibit the same hybridization characteristics under stringent conditions in kind, not necessarily in amount.
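The preferred length and sequence identity of a primer or probe stated above can be checked with a short sketch such as the following. It assumes two sequences of equal length that are already aligned position by position, written in the same alphabet and orientation; gaps, nucleotide analogues and comparison against the complement are not handled. The function names are hypothetical.

```python
def percent_identity(candidate: str, target: str) -> float:
    """Percentage of positions at which two equal-length sequences match."""
    if len(candidate) != len(target) or not candidate:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(candidate.upper(), target.upper()))
    return 100.0 * matches / len(candidate)


def meets_preferred_criteria(candidate: str, target: str) -> bool:
    """Check the 12-40 nucleotide length and the at-least-90% identity preference."""
    return 12 <= len(candidate) <= 40 and percent_identity(candidate, target) >= 90.0
```

For a 20-nucleotide candidate, for instance, two mismatches give exactly 90% identity and are accepted, whereas three mismatches give 85% and are rejected.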
The invention further provides a method for determining whether a sample comprises nucleic acid from colorectal cells said method comprising detecting and quantifying the level of the nucleic acid molecules identified in table 1 in said sample, and comparing the result of said quantification with a reference.
Further provided is a method for determining whether a sample comprises nucleic acid from adenoma cells said method comprising detecting and quantifying the level of the nucleic acid molecules identified in table 2a, table 2b, table 2c, table 2d, table 4a, table 4b, table 4c, table 4d, table 5a, table 7a, table 9a, table 9b, table 11a, table 11b, table 12a, table 12c, table 14a, table 14b, table 16a, table 16b, table 18a, table 18c, table 19c, table 19d, table 20, table 25a, table 26, table 27, table 34 and/or table 35 in said sample, and comparing the result of said quantification with a reference. In this embodiment it is preferred that said reference is a quantification of the levels of the same nucleic acid molecules for a sample comprising nucleic acid from adenocarcinoma cells or from normal cells.
The invention further provides a method for determining whether a sample comprises nucleic acid from adenocarcinoma cells said method comprising detecting and quantifying the level of the nucleic acid molecules identified in table 3a, table 3b, table 3c, table 3d, table 4a, table 4b, table 4c, table 4d, table 6a, table 7a, table 10a, table 10b, table 11a, table 11b, table 12b, table 12c, table 15a, table 15b, table 16a, table 16b, table 18b, table 18c, table 19c, table 19d, table 25a, table 26, table 27, table 28a, table 28b, table 29a, table 29b, table 30, table 31a, table 31b, table 32a, table 32b, table 33, table 34 and/or table 35 in said sample, and comparing the result of said quantification with a reference. In this embodiment it is preferred that the level of the nucleic acid molecules identified in table 4a, table 4b, table 4c, table 4d, table 7a, table 11a, table 11b, table 12c, table 16a, table 16b, table 18c, table 19c, table 19d, table 25a, table 26, table 27, table 28a, table 28b, table 29a, table 29b, table 30, table 31a, table 31b, table 32a, table 32b, table 33, table 34 and/or table 35 is quantified. In a particularly preferred embodiment the level of the nucleic acid molecules identified in table 4c, table 4d, table 7a, table 25a and/or table 26 is quantified. In this embodiment it is preferred that said reference is a quantification of the levels of the same nucleic acid molecules for a sample comprising nucleic acid from adenoma cells or from normal cells. In a preferred embodiment said normal cells are brushed cells from normal colon mucosa.
In one aspect the invention provides a method for determining whether a sample comprises nucleic acid from colorectal cells said method comprising hybridising nucleic acid from said sample to a collection of probes as identified in table 1, quantifying hybridization of said probes with nucleic acid in said sample, and comparing the result of said quantification with a reference. In another aspect the invention provides a method for determining whether a sample comprises nucleic acid from adenoma cells said method comprising hybridising nucleic acid from said sample to a collection of probes as identified in table 2a, table 2b, table 2c, table 2d, table 4a, table 4b, table 4c, table 4d, table 5a, table 7a, table 9a, table 9b, table 11a, table 11b, table 12a, table 12c, table 14a, table 14b, table 16a, table 16b, table 18a, table 18c, table 19c, table 19d, table 20, table 25a, table 26, table 27, table 34 and/or table 35, quantifying hybridization of said probes with nucleic acid in said sample, and comparing the result of said quantification with a reference. In a preferred embodiment said reference is a quantification of a hybridization of a reference sample with said collection of probes. Preferably said reference sample comprises nucleic acid from adenocarcinoma cells. In this preferred embodiment it is preferred that said collection of probes is a collection of probes as identified in table 4a, table 4b, table 4c, table 4d, table 7a, table 11a, table 11b, table 12c, table 16a, table 16b, table 18c, table 19c, table 19d, table 20, table 21, table 25a, table 26, table 27, table 34 and/or table 35. In another preferred embodiment said reference sample comprises nucleic acid from normal cells. In this preferred embodiment it is preferred that said collection of probes is a collection of probes as identified in table 2a, table 2b, table 2c, table 2d, table 5a, table 9a, table 9b, table 12a, table 14a, table 14b, table 18a, table 20 and/or table 21.
In another aspect the invention provides a method for determining whether a sample comprises nucleic acid from adenocarcinoma cells said method comprising hybridising nucleic acid from said sample to a collection of probes as identified in table 3a, table 3b, table 3c, table 3d, table 4a, table 4b, table 4c, table 4d, table 6a, table 7a, table 10a, table 10b, table 11a, table 11b, table 12b, table 12c, table 15a, table 15b, table 16a, table 16b, table 18b, table 18c, table 19c, table 19d, table 21, table 25a, table 26, table 27, table 28a, table 28b, table 29a, table 29b, table 30, table 31a, table 31b, table 32a, table 32b, table 33, table 34 and/or table 35, quantifying hybridization of said probes with nucleic acid in said sample, and comparing the result of said quantification with a reference. In a preferred embodiment said reference is a sample comprising nucleic acid from adenoma cells. In this preferred embodiment it is preferred that said collection of probes is a collection of probes as identified in table 4a, table 4b, table 4c, table 4d, table 7a, table 11a, table 11b, table 12c, table 16a, table 16b, table 18c, table 19c, table 19d, table 21, table 25a, table 26, table 27, table 28a, table 28b, table 29a, table 29b, table 30, table 31a, table 31b, table 32a, table 32b, table 33, table 34 and/or table 35. In another preferred embodiment said reference is a sample comprising nucleic acid from normal cells. In this preferred embodiment it is preferred that said collection of probes is a collection of probes as identified in table 3a, table 3b, table 3c, table 3d, table 6a, table 10a, table 10b, table 12b, table 15a, table 15b, table 18b, table 19c, table 19d and/or table 21. In a preferred embodiment said normal cells are blood cells or brush cells from normal colon.
It has been found that probes or primers for miRNAs, small RNAs and/or their precursors are particularly suited for detecting cells that have a chromosomal aberration. The invention thus further provides the use of a collection of probes or primers for miRNAs, small RNAs or their precursors, for classifying cells as cells that contain a chromosomal aberration. These aberrations typically involve a significant part of a chromosome. The aberrations can be the direct origin of the miRNA or small RNA that is being tested for. Alternatively, as miRNA and small RNA regulate the expression of other transcription units and are themselves derived from transcription, the aberration can be detected through its action on the levels of particular miRNA, small RNA or the precursor thereof in a cell. This aspect of the invention is particularly suited for detecting tumor cells that have chromosomal aberrations, and consequently are prone to progression from a premalignant to a malignant stage, or have already done so. Preferably said tumor cells are colon tumor cells. This aspect can be used to detect aberrations involving segments of DNA in a chromosome that encode one or multiple small RNAs. In principle it is possible to detect even smaller aberrations. A particularly preferred type of aberration is the loss or gain of a part of a chromosome, preferably a gain of 7pq, 8q, 13q or 20q or a loss of 4pq, 5q, 8p, 15q, 17p or 18q; particularly preferred is a loss of chromosome 8p or 17p. In a preferred embodiment said loss of a chromosome 8p or 17p is detected using at least one probe or primer for nucleic acid molecules of table 23 or table 24, as identified in table 19c (for loss of 8p) or table 19d (for loss of 17p).
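The comparison of a quantification with a reference, as used in the methods above, can be illustrated with a minimal sketch. It assumes that levels are available as plain numeric values per small RNA; the names `log2_ratios` and `fraction_consistent`, the pseudocount, the threshold and the example identifiers are all hypothetical and are not taken from the tables.

```python
import math


def log2_ratios(sample_levels, reference_levels, pseudocount=1.0):
    """Return log2(sample / reference) for every small RNA measured in both.

    The pseudocount is an illustrative assumption that avoids division by
    zero for molecules that were not detected.
    """
    ratios = {}
    for name, level in sample_levels.items():
        if name in reference_levels:
            ratios[name] = math.log2(
                (level + pseudocount) / (reference_levels[name] + pseudocount)
            )
    return ratios


def fraction_consistent(ratios, expected_direction, threshold=1.0):
    """Fraction of signature molecules that change in the expected direction.

    expected_direction maps a molecule name to +1 (expected to be higher
    than the reference) or -1 (expected to be lower). The threshold of one
    log2 unit (a two-fold change) is an arbitrary illustrative choice.
    """
    scored = [name for name in expected_direction if name in ratios]
    if not scored:
        return 0.0
    hits = sum(
        1 for name in scored
        if expected_direction[name] * ratios[name] >= threshold
    )
    return hits / len(scored)


# Hypothetical levels for three hypothetical small RNAs.
sample = {"rna_A": 7.0, "rna_B": 1.0, "rna_C": 3.0}
reference = {"rna_A": 1.0, "rna_B": 7.0, "rna_C": 3.0}
signature = {"rna_A": +1, "rna_B": -1, "rna_C": +1}

score = fraction_consistent(log2_ratios(sample, reference), signature)
```

A score close to 1 would indicate that the sample differs from the reference in the way the chosen signature predicts; how such a score is turned into a call is left open here.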
Preferably said loss of a chromosome 8p or 17p is detected using at least 5, more preferably at least 10 and most preferred all of the probes or primers for nucleic acid molecules of table 23 or table 24, as identified in table 19c (for loss of 8p) or table 19d (for loss of 17p).
A sample can be any type of sample as long as it contains nucleic acid. It is preferred that said sample contains complex nucleic acid, preferably nucleic acid derived from a cell. A collection of probes of the invention can be used to analyze any type of nucleic acid, be it DNA, RNA or an analogue thereof having the same hybridization characteristics in kind not necessarily in amount. In a preferred embodiment said nucleic acid comprises RNA derived from cells. This aspect of the invention is useful for determining levels of different RNAs in the cells. In a particularly preferred embodiment the sample is enriched for RNA, preferably for RNA that is smaller than 500 nucleotides. In a particularly preferred embodiment said sample is enriched for RNA that is about 200 nucleotides. Thus in a preferred embodiment a collection of probes or primers of the invention, or a method of the invention involving detecting of a nucleic acid molecule of table 23 or table 24, is used to detect RNA in a sample, preferably small RNA.
miRNAs and other small RNAs are found to be relevant for the tumorigenic state of a tumor. The invention provides miRNAs and precursors thereof that are up or down regulated in a tumor when compared to normal cells, or up or down regulated in a progressed form of said tumor when compared to the less progressed state. It is thus possible to counteract this level by increasing the level when it is down regulated or decreasing the level when it is upregulated. In this way it is possible to intervene with the tumor or the tumor state. Thus the invention further provides a method for the treatment and/or prevention of a disease in an individual comprising administering to said individual at least one nucleic acid molecule of table 23 or table 24, at least one nucleic acid molecule comprising the reverse complementary sequence of a nucleic acid molecule of table 23 or table 24, or a functional equivalent of said at least one nucleic acid molecule. Preferably said individual is suffering from colon cancer or is at risk of suffering therefrom. Said at least one nucleic acid molecule of table 23 or table 24 or the reverse complement thereof, is preferably a nucleic acid molecule of table 2a, table 2b, table 2c, table 2d, table 3a, table 3b, table 3c, table 3d, table 4a, table 4b, table 4c, table 4d, table 5a, table 6a, table 7a, table 9a, table 9b, table 10a, table 10b, table 11a, table 11b, table 12a, table 12b, table 12c, table 14a, table 14b, table 15a, table 15b, table 16a, table 16b, table 18a, table 18b, table 18c, table 19c, table 19d, table 20, table 25a, table 26, table 27, table 34 and/or table 35, or a functional equivalent thereof.
For the treatment of disease, it is preferred that said at least one nucleic acid molecule of table 23 or table 24 or the reverse complement thereof, is a nucleic acid molecule of table 3a, table 3b, table 3c, table 3d, table 4a, table 4b, table 4c, table 4d, table 6a, table 7a, table 10a, table 10b, table 11a, table 11b, table 12b, table 12c, table 15a, table 15b, table 16a, table 16b, table 18b, table 18c, table 19c, table 19d, table 20, table 25a, table 26, table 27, table 34 and/or table 35 or a functional equivalent thereof. For treatment it is particularly preferred that said at least one nucleic acid molecule of table 23 or table 24 or the reverse complement thereof, is a nucleic acid molecule of table 3a, table 3b, table 3c, table 3d, table 4a, table 4b, table 4c, table 4d, table 19c, table 19d, table 20, table 26, table 27 and/or table 33, or a functional equivalent thereof. In a particularly preferred embodiment it is preferred that said at least one nucleic acid molecule of table 23 or table 24 or the reverse complement thereof is a nucleic acid molecule of table 4c, table 4d, table 19c, table 26, table 27 and/or table 33. Using at least one of these nucleic acid molecules for the treatment it is possible to modify cells of a progressed tumor (preferably an adenocarcinoma tumor) such that they become less progressed (preferably resembling a more adenoma phenotype). For the prevention of disease, it is preferred that said at least one nucleic acid molecule of table 23 or table 24 or the reverse complement thereof, is a nucleic acid molecule of table 2a, table 2b, table 2c, table 2d, table 4a, table 4b, table 4c, table 4d, table 5a, table 7a, table 9a, table 9b, table 11a, table 11b, table 12a, table 12c, table 14a, table 14b, table 16a, table 16b, table 18a, table 18c, table 19c, table 19d, table 20, table 25a, table 26, table 27, table 34 and/or table 35 or a functional equivalent thereof. 
For prevention it is preferred that at least one nucleic acid molecule of table 23 or table 24 or the reverse complement thereof, comprises a nucleic acid molecule of table 2a, table 2b, table 2c, table 2d, table 4a, table 5a, table 7a, table 20, table 25a, table 26 and/or table 27 or a functional equivalent thereof. Using at least one nucleic acid molecule for prevention of disease it is possible to modify cells of a benign stage such that the propensity for tumor progression is reduced in these cells.
In case of a nucleic acid molecule that is down regulated it is preferred that said nucleic acid molecule is provided to the cell wherein it is down regulated. To this end it is preferred that said at least one nucleic acid molecule of table 23 or table 24, comprises a nucleic acid molecule of table 2d, table 3d, table 4d, table 19c, table 19d, table 27 or table 33. In a preferred embodiment said at least one nucleic acid molecule of table 23 or table 24, comprises a nucleic acid molecule of table 2d, table 3d, table 4d, or table 27 or a functional equivalent thereof. These preferred nucleic acid molecules are highly correlated with the tumor or tumor stage and are thus effective in a larger fraction of the tumors (or tumor stages).
In case of a nucleic acid molecule that is up regulated it is preferred that said nucleic acid molecule is reduced in the cell wherein it is up regulated. To this end it is preferred that said at least one nucleic acid molecule comprising the reverse complementary sequence of a nucleic acid molecule of table 23 or table 24, comprises a nucleic acid molecule of table 2c, table 3c, table 4c, table 19c or table 19d. Preferably said at least one nucleic acid molecule of table 23 or table 24, comprises a nucleic acid molecule of table 2c, table 3c or table 4c, or a functional equivalent thereof. These preferred nucleic acid molecules are highly correlated with the tumor or tumor stage and are thus effective in a larger fraction of the tumors (or tumor stages).
In one embodiment the invention provides a pharmaceutical composition, comprising as an active agent at least one nucleic acid molecule of table 23a or table 24a, or a nucleic acid molecule comprising the reverse complement of at least one nucleic acid molecule of table 23a or table 24a or a functional equivalent of such nucleic acid molecules, and optionally a pharmaceutically acceptable carrier. A pharmaceutical composition according to the invention further optionally comprises another additive. Such another additive can for example be a preservative or a colorant. Alternatively an additive is a known pharmaceutically active compound. A carrier is any suitable pharmaceutical carrier. A preferred carrier is a compound that is capable of increasing the efficacy of a nucleic acid molecule to enter a target-cell. Examples of such compounds are liposomes, particularly cationic liposomes. A composition is for example a tablet, an ointment or a cream. Preferably a composition is an injectable solution or an injectable suspension. In one embodiment the invention provides a pharmaceutical composition according to the invention for diagnostic applications. In another embodiment the invention provides a pharmaceutical composition according to the invention for therapeutic applications. In a preferred embodiment the invention provides a pharmaceutical composition according to the invention, as a modulator for a developmental or pathogenic disorder. In a preferred embodiment said developmental or pathogenic disorder is cancer. A miRNA molecule for example functions as a suppressor gene or as a regulator of translation of a gene. 
A method for the treatment of an individual to obtain therein a described effect, wherein said individual is provided with a small RNA, precursor thereof, or the reverse complement thereof as described herein, is similarly a use of said small RNA, precursor thereof, or the reverse complement for the preparation of a medicament for obtaining said described effect. A pharmaceutical preferably comprises a pharmaceutically acceptable carrier or excipient.
A functional equivalent of a nucleic acid molecule of table 23a, for the purpose as a pharmaceutical, is preferably a nucleic acid comprising at least 12 and preferably at least 18 consecutive bases of the nucleic acid molecule of table 23a. The functional equivalent nucleic acid may have 1 or 2, preferably only 1 mismatch with the nucleic acid of table 23a. Preferably there are no mismatches in said consecutive stretch.
A functional equivalent of a nucleic acid of table 24a is a nucleic acid having at least 90%, and preferably at least 95%, more preferably at least 99% sequence identity to a nucleic acid of table 24a. The sequence of the small RNA therein is preferably a nucleic acid comprising at least 12 and preferably at least 18 consecutive bases of the small RNA nucleic acid molecule of table 23a. The corresponding small RNA part preferably comprises no more than 2 nucleic acid mismatches. In a particularly preferred embodiment the sequence of the small RNA in said functional equivalent is exactly the same as the sequence thereof given in table 23a.
A functional equivalent of a nucleic acid molecule comprising the reverse complement of a nucleic acid molecule of table 23a is preferably a nucleic acid comprising the reverse complementary sequence of at least 12 and preferably at least 18 consecutive bases of the nucleic acid molecule of table 23a. The functional equivalent nucleic acid may have 1 or 2, preferably only 1 mismatch with the reverse complement nucleic acid of table 23a. Preferably there are no mismatches in said consecutive complementary stretch.
A functional equivalent of the reverse complement of a nucleic acid molecule of table 24 is a nucleic acid of which its reverse complement has at least 90%, and preferably at least 95%, more preferably at least 99% sequence identity to a nucleic acid of table 24a.
A nucleic acid molecule for use as a pharmaceutical is preferably a nucleic acid molecule as identified in table 2d, table 3d, table 4d, table 19c, table 19d, table 28a, table 28b, table 30, table 32a, table 32b or table 33. Preferably said nucleic acid molecule is also identified in table 19c, table 19d, table 20, table 26 and/or table 27.
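The criterion of a consecutive stretch with a limited number of mismatches can be sketched as follows. This is only one possible reading, in which the stretch is compared without gaps against every window of the same length in the reference molecule. The function names and the example sequence are hypothetical; the sequence is arbitrary and is not taken from table 23a.

```python
def min_mismatches_for_stretch(candidate, reference, stretch=12):
    """Smallest mismatch count between any window of `stretch` bases in the
    candidate and any window of the same length in the reference.

    Returns None when either sequence is shorter than `stretch`.
    Comparison is ungapped, which is an illustrative simplification.
    """
    best = None
    for i in range(len(candidate) - stretch + 1):
        cand_window = candidate[i:i + stretch]
        for j in range(len(reference) - stretch + 1):
            ref_window = reference[j:j + stretch]
            mismatches = sum(a != b for a, b in zip(cand_window, ref_window))
            if best is None or mismatches < best:
                best = mismatches
    return best


def is_functional_equivalent(candidate, reference, stretch=12, max_mismatches=2):
    """True when the candidate shares a stretch of at least `stretch` bases
    with the reference, allowing at most `max_mismatches` mismatches."""
    # Treat DNA and RNA spellings alike by writing every T as U.
    cand = candidate.upper().replace("T", "U")
    ref = reference.upper().replace("T", "U")
    best = min_mismatches_for_stretch(cand, ref, stretch)
    return best is not None and best <= max_mismatches


# Arbitrary 22-nucleotide example sequence (hypothetical).
small_rna = "ACGUACGGAUCCGAUUAGCAUG"
# A 14-nucleotide fragment of it carrying a single substitution.
variant = "GUACGAAUCCGAUU"
```

Setting `stretch=18` and `max_mismatches=0` would correspond to the stricter preferred values mentioned above.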
A nucleic acid molecule that comprises a reverse complement of a nucleic acid of table 23a is preferably a nucleic acid molecule as identified in table 2c, table 3c, table 4c, table 31a or table 31b. Preferably said nucleic acid molecule is also identified in table 20, table 26 and/or table 27.
A nucleic acid molecule for use as a pharmaceutical as described herein can be an RNA molecule, a DNA molecule or a hybrid thereof, or can be otherwise modified. For instance, many variants of the naturally occurring bases are known and many more will likely be developed. These synthetic bases can share the same hydrogen bonding characteristics with their natural counterparts. These synthetic bases can be incorporated into a nucleic acid of the invention. A nucleic acid molecule of the invention therefore can include one or more of such synthetic bases. A nucleic acid molecule of the invention can be modified to obtain a desired effect. Examples of such modifications are locked nucleic acid (LNA) modifications, morpholino modifications and other modifications. Furthermore, the backbone can be modified to provide resistance to RNAseH. A nucleic acid molecule of the invention may comprise modifications to obtain a variety of effects. For instance, the nucleic acid molecule may be modified to enhance stability, hybridization characteristics, cellular uptake, circulation half life, and targeting of the nucleic acid.
A nucleic acid molecule according to the invention is administered by any suitable known method. The mode of administration of a pharmaceutical composition of course depends on its form. In a preferred embodiment a solution is injected in a tissue. A nucleic acid molecule according to the invention is introduced in a target cell by any known method in vitro or in vivo. Said introduction is for example established by a gene transfer technique known to the person skilled in the art, such as electroporation, microinjection, DEAE-dextran, calcium phosphate, cationic liposomes or viral methods.
A nucleic acid of the invention, such as a small RNA, a miRNA, a precursor thereof or the reverse complement of the mentioned molecules can be administered directly to a cell or an individual, or through a gene delivery vehicle. The gene delivery vehicle preferably comprises an expression cassette containing a promoter and other transcription regulatory and signaling sequences to express the small RNA, a miRNA, a precursor thereof or the reverse complement thereof. The gene delivery vehicle is preferably a viral vector or a plasmid.
The invention further provides the use of a nucleic acid molecule of table 23 or table 24, as a drug target. Preferably provided is the use of a nucleic acid molecule of table 3a, table 4a, table 20, table 26, table 27 or table 33, as a drug target. Also provided is the use of a nucleic acid molecule of table 23 or table 24 or the reverse complement thereof, for identifying a molecular process in a cell that is contributing to a disease or a disease state. Preferably provided is the use of a nucleic acid molecule of table 3a, table 4a, table 20, table 26, table 27 or table 33, for identifying a molecular process in a cell that is contributing to a disease or a disease state.
In yet another embodiment the invention provides a collection of nucleic acid molecules comprising the nucleic acid molecules of table 23 or table 24. Also provided is a subset of this collection of nucleic acids comprising at least 10 of the nucleic acids identified in table 23 or table 24. In a preferred embodiment said subset comprises the nucleic acid molecules identified in table 1, table 2a, table 2b, table 2c, table 2d, table 3a, table 3b, table 3c, table 3d, table 4a, table 4b, table 4c, table 4d, table 5a, table 6a, table 7a, table 9a, table 9b, table 10a, table 10b, table 11a, table 11b, table 12a, table 12b, table 12c, table 14a, table 14b, table 15a, table 15b, table 16a, table 16b, table 18a, table 18b, table 18c, table 19c, table 19d, table 20, table 25a, table 26, table 27, table 28a, table 28b, table 29a, table 29b, table 30, table 31a, table 31b, table 32a, table 32b, table 33, table 34 and/or table 35. In a particularly preferred embodiment said subset comprises the nucleic acid molecules identified in table 2d, table 3d, table 4d, table 26, table 27 and/or table 33. In another preferred embodiment said subset comprises the nucleic acid molecules identified in table 8 or table 13.
A probe or primer of the invention can be longer than the region of complementarity with the sequence of table 23 or table 24. For a probe, the reverse complement of the sequence depicted in table 23 or table 24 is preferably incorporated into another nucleic acid. This separates the probe sequence from a surface to which it is attached. Such a probe typically exhibits improved hybridization characteristics. A probe or primer of the invention contains at least 7 consecutive nucleotides that together are the reverse complement of a sequence of the same length in the nucleic acid molecule of table 23 or table 24. Preferably said probe or primer contains at least 12, more preferably at least 16 of such consecutive nucleotides. In a particularly preferred embodiment said probe comprises the reverse complement of the entire sequence given for the nucleic acid molecule of table 23 or table 24. A probe or primer specific for a nucleic acid molecule of table 23 may have 1 or 2, preferably only 1 mismatch (in the consecutive stretch of complementarity) with the nucleic acid of table 23 it is specific for. Preferably there are no mismatches in said consecutive stretch.
A probe or primer that is specific for a nucleic acid of table 24 comprises at least 90%, and preferably at least 95%, more preferably at least 99% sequence identity to the reverse complement of the nucleic acid comprising the region of complementarity of the corresponding nucleic acid of table 24. A primer is typically though not necessarily between 10 and 30 nucleotides long. A primer can contain one or more nucleotides in addition to the region of complementarity. These sequences can, for instance, be used to attach a label.
A collection of probes of the invention is preferably associated or physically linked to a solid surface. The solid surface is preferably a microarray. In another preferred embodiment a collection of probes is associated or physically linked to a collection of beads. Preferably, the individual probes are associated or physically linked to different beads such that each bead contains essentially one probe.
A collection of probes or primers of the invention preferably comprises probes or primers specific for nucleic acid molecules of table 23a or table 24a. A collection of the invention of course preferably contains different nucleic acid molecules. When herein a collection is said to have at least a certain number of nucleic acid molecules given in the tables, or probes or primers specific therefor, it is preferred that at least said certain number of these nucleic acid molecules, probes or primers is different. In addition to this number of different ones there may be additional identical ones. A collection of probes or primers or a subset thereof of the invention is preferably used for the detection of one or more nucleic acid molecules of table 23 or table 24. However, such collections can also serve other purposes, for instance therapeutic purposes. When herein reference is made to a collection of probes or primers of the invention, this typically also includes subsets of collections of probes and primers. The same holds true for collections of nucleic acid molecules or the reverse complement thereof.
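The requirement that a probe or primer contains a minimum number of consecutive nucleotides that are the reverse complement of the target can be checked with a short sketch. It assumes a DNA probe directed against an RNA or DNA target written with the standard letters only; the helper names and the example sequences are hypothetical and do not come from table 23 or table 24.

```python
# Watson-Crick partners, written as DNA because the probe is assumed to be DNA.
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A", "T": "A"}


def reverse_complement_dna(seq):
    """Reverse complement of an RNA or DNA sequence, returned as DNA."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def longest_complementary_stretch(probe, target):
    """Length of the longest run of consecutive probe nucleotides that is
    the exact reverse complement of a stretch of the target.

    Computed as the longest common substring between the probe and the
    reverse complement of the target (simple dynamic programming).
    """
    wanted = reverse_complement_dna(target)
    probe_dna = probe.upper().replace("U", "T")
    best = 0
    previous = [0] * (len(wanted) + 1)
    for a in probe_dna:
        current = [0] * (len(wanted) + 1)
        for j, b in enumerate(wanted, start=1):
            if a == b:
                current[j] = previous[j - 1] + 1
                best = max(best, current[j])
        previous = current
    return best


# Arbitrary example target (hypothetical, 22 nucleotides).
target = "ACGUACGGAUCCGAUUAGCAUG"
full_probe = reverse_complement_dna(target)
# A probe embedded in additional sequence that acts as a spacer.
spaced_probe = "TTTTTTTTTT" + full_probe
```

A probe would satisfy the minimum stated above when `longest_complementary_stretch(probe, target)` is at least 7, and the preferred embodiments when it is at least 12 or 16. Mismatch tolerance is deliberately not modelled in this sketch.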
The invention further provides a collection of nucleic acid molecules comprising the reverse complement of the nucleic acid molecules of table 23 or table 24. A non-limiting use of such collection is a collection of probes for the molecules of table 23 or table 24.
A collection of probes or a subset of probes of the invention preferably comprises no more than 15,000 different probes, preferably no more than 12,000 probes, preferably no more than 10,000 and more preferably no more than 7,000 probes, particularly for probes specific for nucleic acid molecules of table 23a or table 24a. Said preferred maximum amounts are increased by 1,000 when also the probes specific for nucleic acid molecules of table 23b or table 24b are included. In a particularly preferred embodiment the collection of probes comprises a maximum of 5,000 probes, preferably a maximum of 4,000 probes, preferably a maximum of 3,000 probes, preferably a maximum of 2,000 probes, preferably a maximum of 1,000 probes, preferably a maximum of 500 probes. A subset of probes of the invention is preferably present in an array, preferably a micro-array. Preferably a collection of probes of the invention is associated with a solid surface.
In a preferred aspect the invention provides a collection of probes or primers for detecting each of the nucleic acid molecules of table 25a, table 5a and/or table 6a. These collections are particularly suited to discriminate colorectal adenoma cells from normal colon cells (table 5a), colorectal carcinoma cells from normal colon cells (table 6a), and colorectal adenoma cells from colorectal carcinoma cells (table 25a). In a preferred embodiment said collection of probes or primers comprises probes or primers for detecting each of the nucleic acid molecules of table 25a and table 5a. Particularly preferred is a collection of probes or primers, comprising probes or primers for detecting each of the nucleic acid molecules of table 25a, table 5a and table 6a. Preferably said collections of probes or primers are used for discriminating between normal colorectal tissue, colorectal adenoma and colorectal adenocarcinoma.
In the present invention it was found that the small RNA of table 23 or the precursors thereof of table 24 are particularly suited to map micro alterations in the chromosomal DNA. It was found that colon carcinomas contain various characteristic micro-deletions and micro-gains (amplifications) of chromosomal DNA. Micro in this context means deletions or gains of less than 3 megabases of chromosomal DNA. Several locations were identified in colon carcinoma cells that are subject to micro-deletions and/or micro-gains. The invention thus further provides a method for determining whether a colon cell comprises a deletion of less than 3 megabases in chromosomal region 1p32.2, 1q21.1, 1q22, 6p22.2, 7p15.2, 11p15.5, 16q12.2 or 17p13.1 comprising determining whether said cell comprises a deletion of DNA coding for a small RNA as identified in table 28a, or a precursor thereof. Said micro-deletions can be detected in various ways including PCR and other amplification strategies using primers that are specific for the deletion or the boundaries thereof. Preferably said method comprises the step that nucleic acid of said cell is contacted with a primer or a probe specific for said small RNA, a precursor thereof, a DNA encoding said small RNA and/or a precursor thereof. Preferably said small RNA is a small RNA as identified in table 28b. A method of this embodiment is particularly suited for typing a cell as a colon carcinoma cell. Thus a colon cell comprising a deletion as indicated herein above is a colon carcinoma cell. Thus a method of this embodiment preferably further comprises typing said cell as a colon carcinoma cell. Preferably said small RNA is a small RNA as identified in table 30. The micro-deletions characterised by the small RNAs of table 30 are characteristic of microsatellite stable colon carcinoma cells. A method of this embodiment thus preferably further comprises typing said cell as a microsatellite stable colon carcinoma cell.
In another preferred embodiment said small RNA is a small RNA as identified in table 32a or table 32b. Micro-deletions characterised by the absence of these small RNA or the corresponding precursors characterise microsatellite instable colon carcinoma cells. Preferably said small RNA is a small RNA as identified in table 32b. A method of this embodiment thus preferably further comprises typing said (colon) cell as a microsatellite instable colon carcinoma cell.
The invention further provides a method for determining whether a colon cell comprises an additional chromosomal copy of less than 3 megabases of chromosomal region 5q31.3, 8p21.3, 9p21.3, 10p15.1 or 16q13 comprising determining whether said cell comprises an additional copy of DNA coding for a small RNA as identified in table 29a, or a precursor thereof. Said microgains can be detected in various ways including PCR and other amplification strategies using primers that are specific for the amplified regions or the boundaries thereof. Preferably said method comprises a step wherein nucleic acid of said cell is contacted with a primer or a probe specific for said small RNA, a precursor thereof, a DNA encoding said small RNA and/or a precursor thereof. In a preferred embodiment said small RNA is a small RNA as identified in table 29b. Colon cells containing such microgains are colon carcinoma cells. Thus a method of the invention preferably further comprises typing said cell as a colon carcinoma cell. In a preferred embodiment said small RNA is a small RNA as identified in table 31a or table 31b. Preferably a small RNA is a small RNA as identified in table 31b. Microgains characterised by gains of these latter small RNA or precursors thereof are characteristic for microsatellite stable colon carcinoma cells. Thus preferably said embodiment further comprises typing said cell as a microsatellite stable colon carcinoma cell.
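The size criterion for micro-deletions and micro-gains (less than 3 megabases) can be expressed as a small sketch. It assumes that an aberration has already been called elsewhere, for example from copy number data, and is available with start and end coordinates in base pairs. The `Aberration` record, the `classify` function and all coordinates shown are hypothetical.

```python
from dataclasses import dataclass

# Upper size limit for a "micro" alteration as defined in the text above.
MICRO_LIMIT_BP = 3_000_000


@dataclass(frozen=True)
class Aberration:
    region: str   # cytogenetic label, e.g. "17p13.1"
    start: int    # start coordinate in base pairs (hypothetical)
    end: int      # end coordinate in base pairs (hypothetical)
    kind: str     # "loss" or "gain"

    @property
    def length(self):
        return self.end - self.start


def classify(aberration):
    """Label an aberration as a micro-deletion, a micro-gain or a large event."""
    if aberration.kind not in ("loss", "gain"):
        raise ValueError("kind must be 'loss' or 'gain'")
    if aberration.length < MICRO_LIMIT_BP:
        return "micro-deletion" if aberration.kind == "loss" else "micro-gain"
    return "large " + aberration.kind


# Illustrative calls with made-up coordinates.
calls = [
    Aberration("17p13.1", 7_000_000, 8_200_000, "loss"),
    Aberration("16q13", 100, 2_000_100, "gain"),
    Aberration("8p", 0, 40_000_000, "loss"),
]
labels = [classify(call) for call in calls]
```

Whether a given small RNA of table 28a or table 29a lies inside such a segment would then be a matter of comparing its genomic position with the segment boundaries, which is not shown here.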
The invention further provides a method for the treatment of an individual suffering from a colon tumor comprising administering to said individual a nucleic acid comprising a nucleic acid as identified in table 28a, table 28b, table 30, table 32a, table 32b or table 33, or the precursor of said nucleic acid, or the reverse complement of a nucleic acid as identified in table 31b, or a functional equivalent of said nucleic acid. By administering this nucleic acid a small RNA that is present in said (micro)deletion (table 28, table 30, table 32 and table 33) is provided to the colon carcinoma cell, thereby supplementing the reduced level of said small RNA in said cell and thereby reducing a tumorigenic state thereof. Preferably said nucleic acid is a nucleic acid as identified in table 28b, table 32b or table 33.
By administering the reverse complement of a small RNA that is present in said microgain (table 31), the colon carcinoma cell is provided with a means for reducing the expression of the small RNA that is overexpressed therein. This reduces the tumorigenic state of said colon carcinoma cell. Preferably said nucleic acid comprises the reverse complement of a small RNA of table 31, more preferably the reverse complement of a small RNA of table 31b.
In a preferred embodiment the nucleic acid is provided as a nucleic acid encoding a precursor for the nucleic acid. The nucleic acid may be administered as indicated herein above. In brief, it may be administered as such and/or in the context of a gene delivery vehicle.
67 snap frozen human colorectal tumours (36 non-progressed adenomas and 31 adenocarcinomas) and ten samples of human normal colorectal epithelium obtained from histopathologically tumor free resection margins of colorectal cancer resection specimens were collected prospectively at the VU-University Medical Center (VUmc), Amsterdam, The Netherlands (Table 22). In addition, stool and blood samples from two other patients with colorectal cancer were collected. All samples were used in compliance with the institution's ethical regulations.
RNA isolation was done using TRIzol, following the manufacturer's protocol. Briefly, tumour samples were homogenized in 2 ml TRIzol using a homogenizer. 0.4 ml chloroform was added to each sample and the tube was shaken vigorously. After 3 min of incubation at room temperature, the samples were centrifuged for 10 min at 12,000 rpm at 4° C. The upper aqueous phase was transferred to a new tube and 1 ml of isopropanol was added. Then the samples were incubated at room temperature for 10 min and centrifuged for 10 min at 12,000 rpm at 4° C. The supernatant was removed by decanting and the pellets were washed by adding 1 ml of 70% ethanol and centrifuging for 5 min at 7,500 rpm at 4° C. After the 70% ethanol was removed by decanting, the pellet was briefly spun down and the remaining ethanol was removed using a sharp tip. Last, the pellet was dissolved in 100 μl RNase-free water. Total RNA cleanup was performed using the RNeasy Mini Kit (QIAgen) according to the manufacturer's protocol. Briefly, a lysis buffer (RLT) was added and mixed well. After adding 100% ethanol, the samples were transferred to an RNeasy Mini spin column and centrifuged for 15 s at >10,000 rpm. The flow-through was discarded and the spin column was washed twice with RPE Buffer by centrifuging for 15 s (first wash) and 2 min (second wash) at >10,000 rpm. RNA was finally eluted into a new 1.5 ml tube in 2× 30 μl of RNase-free water by centrifuging for 1 min at >10,000 rpm and stored at −80° C. until use. The quantity of the RNA samples was measured using a spectrophotometer and the quality was evaluated by visual inspection on an agarose gel (1%).
RNA labeling and small RNA enrichment were done according to Kreatech Biotechnologies' protocol (Kreatech, Amsterdam, The Netherlands). Labelled total RNA was prepared using 500 ng of total RNA as starting material, 0.5 μl of Cy3-ULS/Cy5-ULS, 0.33 μl of 10× labeling buffer solution and water to a final volume of 3.3 μl. The mix was incubated for 15 min at 85° C. and then placed on ice. Clean-up of the samples was carried out using KREApure columns. Columns were prepared for use by resuspending the column material using a vortex, inserting the column in a 2 ml collection tube and centrifuging for 1 min at 20,800 g. Then the flow-through was discarded and the column was washed by adding 300 μl of water and centrifuging for 1 min at 20,800 g. After cleaning, the column was placed into a new 1.5 ml tube and the ULS-labelled total material was pipetted onto the column and centrifuged for 1.5 min at 20,800 g. The flow-through containing the labeled total RNA was placed on ice and 1 μl was taken to determine the degree of labelling by measuring the OD260 and OD550 (for Cy3-ULS) or OD650 (for Cy5-ULS). After labeling, the material was subjected to small RNA isolation by adding RNase-free water to a 50 μl volume, 250 μl lysis buffer, 1.75 μl β-mercaptoethanol, 25 μl 2M sodium acetate (pH 4) and 175 μl ethanol (100%). The samples were then loaded onto an RNA-binding column placed in a 2 ml collection tube and centrifuged at 16,000 g for 30-60 sec. Following centrifugation, the column was discarded and 583 μl of 100% ethanol was added to the flow-through. Then, each sample was loaded onto a new RNA-binding column placed in a 2 ml collection tube and centrifuged for 30-60 sec at 16,000 g. Two loads of approximately 700 μl each were needed. Next, samples were washed first with 500 μl of low-salt wash buffer and centrifuged at 16,000 g for 30 sec; after discarding the flow-through, 300 μl of low-salt wash buffer was added and the column was centrifuged at 16,000 g for 30 sec.
After discarding the flow-through, 500 μl of fresh 80% ethanol was added to the column, which was centrifuged for 30-60 sec followed by an additional 2 min centrifugation to dry the membrane. Next, 20 μl of RNase-free water pre-heated to 60° C. was applied directly onto the center of the column, incubated at room temperature for 2 min and centrifuged at 16,000 g for 1 min. Last, the flow-through was reloaded onto the center of the column, incubated for 2 min at room temperature and centrifuged at 16,000 g for 1 min. The flow-through containing the labeled small RNA fraction was kept on ice until used.
Each of the small RNA enriched samples labeled with Cy3 was hybridized against a reference pool labeled with Cy5 containing total RNA from the above described tissue samples, 61 tumor cell lines (59 tumor cell lines from the NCI-60 panel plus 2 other colorectal tumor cell lines) and placenta tissue. The rationale behind this common pool is to provide a reference signal to which all experiments can be compared. To make sure that a fold-change can be calculated for every candidate miRNA, the reference pool should give a positive signal for every candidate, which is ensured by including every sample in this pool.
The samples were prepared for hybridization by mixing equimolar amounts of Cy3- and Cy5-labelled samples. Hybridization took place on a single subgrid of the 8×15K Agilent microarray under standard conditions. Prior to hybridization, each sample was heated at 60° C. in the dark for 2 min, centrifuged at 13,000 rpm for 1 min and placed on ice until it was loaded on the array. Hybridization took place at 37° C. for 17 hours at 10 rpm. Post-hybridization washes were done at room temperature: after careful removal of the cover, the slides were placed in wash solution 1, where they were kept for 1 min, and then transferred to wash solution 2 and washed for 1 min. Last, the slides were immersed in wash solution 3 for 30 sec and then air-dried. Images of the arrays were acquired by scanning at 5 micron resolution (Agilent DNA Microarray Scanner G2505B, Agilent Technologies, Palo Alto, USA).
Design of the microarray was based on 474 human pre-miRNA sequences from miRBase 9.1 and 3315 human pre-miRNA sequences from PCT/NL2006/0000010. Positions of mature miRNA sequences within pre-miRNA were known from previous public and private cloning data. In cases where only the mature sequence corresponding to one of the arms of the pre-miRNA was known, the corresponding star sequence was predicted by folding the pre-miRNA using RNAfold (Vienna RNA package, Hofacker I.L. (2003) Vienna RNA secondary structure server. Nucleic Acids Res. 31(13):3429-31) and selecting the sequence from the hairpin stem based on Dicer/Drosha cleaving properties (2 nt 3′ overhang) and the position of the cloned mature miRNA sequence. The first 22 nt of every cloned and predicted mature miRNA were used for designing probes complementary to the respective miRNAs. In cases where the mature miRNA was shorter than 22 nt, it was extended with the corresponding sequence from the pre-miRNA. Agilent custom microarrays (Wolber P K, Collins P J, Lucas A B, De Witte A, Shannon K W. (2006). The Agilent in situ-synthesized microarray platform. Methods Enzymol. 410:28-57) were designed with 60 nt probe sequences. Every probe carries spacers (CGATCTTT) and two copies of the miRNA-specific sequence:
5′-22 nt-miR-CGATCTTT-22 nt-miR-CGATCTTT-3′
where 22 nt-miR is the reverse-complemented sequence of the mature miRNA selected as described above. In addition, 13 non-overlapping 5S and 13 U6 22 nt sequences were included in the array as controls. These sequences were manually selected from proprietary small RNA cloning data. In total the array contained (2×474)+(2×3315)+13+13=7604 probe sequences that were put in duplicate on every subgrid of the 8×15k Agilent custom microarrays. A complete list of all probes included in the microarray is given in Table 21.
Array processing and spot-calling were performed automatically using Agilent's FeatureExtraction software (9.5.3). Background subtraction, array normalization and extraction of gene expression values were performed with the limma package from Bioconductor (Tables 1, 8 and 13).
Supervised cluster analysis was done using the TIGR MultiExperiment Viewer (TMeV), applying complete linkage and using Euclidean distance as the metric.
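The probe layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the example sequences are invented, and the extension of a short mature sequence is assumed to run in the 3′ direction along the pre-miRNA, which the text does not specify.

```python
SPACER = "CGATCTTT"  # spacer sequence given in the probe layout
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A", "T": "A"}


def reverse_complement_dna(rna: str) -> str:
    """Return the DNA reverse complement of an RNA (or DNA) sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def design_probe(mature: str, pre_mirna: str, target_len: int = 22) -> str:
    """Build a 60 nt probe: two copies of (22 nt reverse complement + spacer).

    The first 22 nt of the mature sequence are used. A shorter mature
    sequence is extended with the adjacent pre-miRNA sequence (assumption:
    extension towards the 3' end).
    """
    start = pre_mirna.find(mature)
    if start < 0:
        raise ValueError("mature sequence not found in pre-miRNA")
    target = pre_mirna[start:start + target_len]
    if len(target) < target_len:
        raise ValueError("pre-miRNA too short to extend the mature sequence")
    unit = reverse_complement_dna(target) + SPACER
    return unit * 2


# Invented sequences, for illustration only.
example_pre = "GG" + "ACGUACGUACGUACGUACGU" + "AAUUCCGG"
example_probe = design_probe("ACGUACGUACGUACGUACGU", example_pre)

# Probe count stated in the text: two arms per pre-miRNA plus 5S and U6 controls.
total_probes = 2 * 474 + 2 * 3315 + 13 + 13
```

Each unit is 22 + 8 = 30 nt, so two units give the 60 nt probe length stated for the Agilent custom arrays.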
For comparing miRNA expression levels in colorectal adenomas and colorectal carcinomas, colorectal adenomas and controls, and colorectal carcinomas and controls a Wilcoxon signed-rank sum test was used with a threshold p-value of 0.2.
For comparing miRNA expression levels in colorectal tumors with and without defined chromosomal aberrations (i.e. 8p loss, 13q loss, 15q loss, 17p loss, 18q loss and 20q gain) a Wilcoxon signed-rank sum test was used with a threshold p-value of 0.3.
To correct for multiple comparisons a Benjamini-Hochberg correction was performed in all analyses.
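As an illustration of the multiple-testing step, the Benjamini-Hochberg adjustment can be written out as follows. This is a minimal sketch, not the software used in the study; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four small RNAs.
example_adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
```

A small RNA would then be called differentially expressed when its adjusted value falls below the chosen threshold (0.2 or 0.3 in the comparisons above).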
Indicator miRNAs in Blood or Faeces (Tables 9-11 and 14-16)
Differentially expressed miRNAs (Tables 2-4) were compared to miRNAs expressed in faeces (Table 8) or blood (Table 13). Small RNAs that were expressed differentially in a significant manner (BH<0.2) in tissue and detectable in faeces or blood (signal/background>2) were listed.
For multivariate classification of samples based on small RNA expression levels, PAM (prediction analysis for microarrays) software was used. PAM produces a classification rule using multiple variables (i.e. small RNAs) for optimal classification of predefined categories of samples (e.g. adenomas versus adenocarcinomas). PAM (Tibshirani et al. (2002) Proc Natl Acad Sci USA. 99:6567-72.) is a nearest centroid classification-type algorithm. The training data is used to compute so-called class centroids. The centroid is a multivariate (since all expressed small RNAs are used) generalization of a median. Basically, a new sample is assigned to the class whose centroid lies at the smallest squared distance from that sample.
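The selection rule of this paragraph amounts to intersecting two filters. A minimal sketch follows; the function name, the dictionary layout and the identifiers in the example are hypothetical and serve only to illustrate the two cut-offs named in the text.

```python
def indicator_small_rnas(tissue_bh, fluid_ratio, bh_cutoff=0.2, ratio_cutoff=2.0):
    """List small RNAs that pass both the tissue and the body-fluid filter.

    tissue_bh   : small RNA name -> BH-adjusted p-value in tissue
    fluid_ratio : small RNA name -> signal/background ratio in faeces or blood
    """
    return sorted(
        name
        for name, q_value in tissue_bh.items()
        if q_value < bh_cutoff and fluid_ratio.get(name, 0.0) > ratio_cutoff
    )


# Invented values, for illustration only.
example_hits = indicator_small_rnas(
    {"rna_a": 0.05, "rna_b": 0.35, "rna_c": 0.10},
    {"rna_a": 3.1, "rna_b": 5.0, "rna_c": 1.4},
)
```

In this invented example only rna_a passes both filters: rna_b is not significant in tissue and rna_c is not detectable above background.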
Nearest shrunken centroid classification makes one important modification to standard nearest centroid classification. It “shrinks” each of the class centroids toward the overall centroid for all classes by an amount we call the threshold. This shrinkage consists of moving the centroid towards zero by a threshold, setting it equal to zero if it hits zero. After shrinking the centroids, the new sample is classified by the usual nearest centroid rule, but using the shrunken class centroids. This shrinkage does automatic selection of the most relevant small RNAs.
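The shrinkage step can be illustrated with a deliberately simplified sketch. It uses per-class means as centroids and omits the per-feature standardization by the pooled within-class standard deviation that the published PAM method applies, so it is not the PAM software itself; all names and numbers are hypothetical.

```python
def soft_threshold(value, delta):
    """Move value towards zero by delta, stopping at zero."""
    magnitude = abs(value) - delta
    if magnitude <= 0:
        return 0.0
    return magnitude if value > 0 else -magnitude


def shrunken_centroids(samples, labels, delta):
    """Shrink each class centroid towards the overall centroid by delta."""
    n_features = len(samples[0])
    overall = [sum(s[j] for s in samples) / len(samples) for j in range(n_features)]
    centroids = {}
    for cls in sorted(set(labels)):
        members = [s for s, lab in zip(samples, labels) if lab == cls]
        centroids[cls] = [
            overall[j]
            + soft_threshold(sum(s[j] for s in members) / len(members) - overall[j], delta)
            for j in range(n_features)
        ]
    return centroids


def classify(sample, centroids):
    """Assign the sample to the class with the nearest (squared distance) centroid."""
    def squared_distance(centroid):
        return sum((x - c) ** 2 for x, c in zip(sample, centroid))
    return min(centroids, key=lambda cls: squared_distance(centroids[cls]))


# Invented expression values for two small RNAs in four samples.
train = [[0.0, 5.0], [0.2, 5.1], [4.0, 5.0], [4.2, 4.9]]
train_labels = ["adenoma", "adenoma", "carcinoma", "carcinoma"]
example_centroids = shrunken_centroids(train, train_labels, delta=0.5)
```

In this toy data the second small RNA differs by less than the threshold between classes, so its centroid component is shrunk to the overall value in both classes and it no longer contributes to the classification. This is the automatic feature selection referred to above.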
The error rates are computed by 10-fold cross-validation: in each round roughly one tenth of the samples is left out of the training procedure and the class labels of the left-out samples are predicted as described above. The procedure is set up such that each sample is left out exactly once. The confusion table then displays the number of correct and erroneous classifications for both classes.
Pairwise comparisons were performed (control vs adenoma, control vs carcinoma and adenoma vs carcinoma) followed by feature selection. Then, cross-validated misclassification errors were computed (see the confusion tables 5b-7b). This shows which miRNAs were vital for a high accuracy in predicting the type of tissue (control vs adenoma vs carcinoma), as well as the accuracy of the prediction (%). Subsequently, the selected feature miRNAs were eliminated from the data and the classification was redone. The cross-validated misclassification errors were computed again. An increase in error shows that the accuracy of the classification relies on the presence of the featured miRNAs.
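The cross-validation bookkeeping can be sketched as follows. The fold assignment, the function names and the `fit`/`predict` callables are assumptions made for illustration; any classifier with that shape could be plugged in.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k=10, seed=0):
    """Split sample indices into k folds so each sample is held out exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validated_confusion(samples, labels, fit, predict, k=10, seed=0):
    """Tally (true label, predicted label) pairs over all held-out samples."""
    table = Counter()
    for held_out in k_fold_indices(len(samples), k, seed):
        held = set(held_out)
        train_x = [s for i, s in enumerate(samples) if i not in held]
        train_y = [lab for i, lab in enumerate(labels) if i not in held]
        model = fit(train_x, train_y)  # train without the held-out samples
        for i in held_out:
            table[(labels[i], predict(model, samples[i]))] += 1
    return table


# Toy check with a trivial "majority class" classifier (illustration only).
toy_labels = ["adenoma"] * 36 + ["carcinoma"] * 31
toy_samples = [[0.0]] * len(toy_labels)
toy_table = cross_validated_confusion(
    toy_samples,
    toy_labels,
    fit=lambda x, y: Counter(y).most_common(1)[0][0],
    predict=lambda model, sample: model,
)
```

The overall error rate is the sum of the off-diagonal entries of the table divided by the number of samples.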
A large set of novel small RNAs has recently been experimentally identified at the Hubrecht Institute, Utrecht, The Netherlands, and classified as candidate miRNAs using computational analysis (e.g. stable hairpin formation of the precursor) (Berezikov et al. (2005) Cell 120:21-24, Berezikov et al. (2006) Nature Gen. 38: 1375-1377, Berezikov et al. (2006) Genome Res. 16: 1289-1298, PCT/NL2006/0000010). Earlier studies have shown that expression profiling of small RNAs can be used to classify tumors and can serve as a diagnostic tool. In colorectal cancer, early diagnosis could make an enormous difference in life expectancy, morbidity and mortality. To determine whether these novel small RNAs, which are suspected to be miRNAs, are differentially expressed in different types of cancer, analysis of the expression of these small RNAs was performed in different colorectal tissues. The signals detected for the miRNAs and small RNAs obtained for the tissue samples were compared to signals obtained using the reference sample. Table 1 shows the result of this comparison. In total, 1296 probes produced a signal with a difference between maximum and minimum values above 4 on a log2 scale, indicating substantial differential expression of the corresponding small RNAs in at least some of the investigated colorectal tissues.
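The range criterion used for Table 1 can be expressed as a simple filter. The sketch below assumes that each probe has a list of log2 ratios across samples; the function name and the example values are hypothetical.

```python
def variable_probes(log2_values_by_probe, min_range=4.0):
    """Return probes whose log2 signal spans more than min_range across samples."""
    return [
        probe
        for probe, values in log2_values_by_probe.items()
        if max(values) - min(values) > min_range
    ]


# Invented log2 ratios for two probes across three samples.
example_variable = variable_probes(
    {"probe_1": [-2.5, 0.1, 2.0], "probe_2": [0.3, 0.9, 1.1]}
)
```

A range of 4 on a log2 scale corresponds to a 16-fold difference between the highest and lowest signal.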
In order to determine which small RNAs might serve as disease markers and may consequently be used for diagnostic purposes, detection signals of the analyzed small RNAs were compared between colorectal adenomas and control samples, colorectal adenocarcinomas and control samples and colorectal adenomas and adenocarcinomas. This comparison resulted in a list of small RNAs that are differentially expressed between the respective categories analysed (Tables 2-4). Moreover, Tables 2b-4b demonstrate which of the differentially expressed small RNAs are upregulated and which are downregulated. In summary, a comparison of colorectal adenomas and control samples showed that 1554 small RNAs are differentially expressed, of which 50 are upregulated (Table 2c) and 1504 are downregulated (Table 2d) in adenomas. The comparison of colorectal adenocarcinomas and control samples demonstrated that 105 small RNAs are differentially expressed (Table 3a), of which 23 are upregulated (Table 3c) and 82 are downregulated (Table 3d) in adenocarcinomas. Lastly, comparing colorectal adenomas and adenocarcinomas resulted in a list of 1080 small RNAs that are differentially expressed (Table 4a), of which 1 is upregulated (Table 4c) and 1079 are downregulated (Table 4d) in adenocarcinomas. When the statistical analysis was performed under more stringent criteria (p<0.1), a list of 384 small RNAs was selected, that are most significantly differentially expressed between adenoma and adenocarcinoma (Table 26). All of these small RNAs were downregulated in adenocarcinomas. The significance of these results is reiterated by the fact that, when cluster analysis (Euclidean) is performed, using all differentially expressed small RNAs, two distinct clusters are formed (Figure I-III). Figure Ia shows cluster analysis based on all differentially expressed small RNAs between control and adenomas. 
Cluster analysis shows two distinct clusters: one which included mostly adenomas (Cluster II) and one including mostly controls (Cluster I). This demonstrates that on the basis of the differentially expressed small RNAs it is possible to distinguish these two different tissue types. Statistical analysis (
In conclusion, these results show differences in small RNA content between the respective categories of samples (normal colorectal epithelium, adenomas and adenocarcinomas) tested. They further show that the selected small RNAs that are differentially expressed have great value for clinical diagnostic purposes and will be further developed as such. Based on the expression of the different small RNAs identified above, it becomes possible to classify an unknown tissue sample into a normal category, an adenoma category or an adenocarcinoma category depending on its small RNA expression profile. Such a comparison may then result in an accurate diagnosis of normal vs adenoma vs adenocarcinoma. In addition, individual small RNAs or groups of small RNAs may actually drive (the set of upregulated small RNAs, table 2c, table 3c and table 4c) or inhibit (downregulated small RNAs, table 2d, table 3d, table 4d) the transition from adenoma to carcinoma and may therefore function as targets for therapy.
The subpopulations of small RNAs that are differentially expressed as identified above are still relatively large. This might hamper their use as a diagnostic tool in practice. Therefore, we determined which of the differentially expressed small RNAs are actually most predictive for the classification of a tissue sample in the categories normal, adenoma or adenocarcinoma. For the selection of the most predictive small RNAs for multivariate classification of samples based on small RNA expression levels, PAM (prediction analysis for microarrays) software was used. PAM produces a classification rule using multiple variables (i.e. small RNAs) for optimal classification of predefined categories of samples (e.g. adenomas versus adenocarcinomas). It is well-known that with these high dimensional data one may obtain competitive classifiers with (almost) no overlapping features (small RNAs in this case). To assess the importance of the selected small RNAs as a set, we deleted these from the data set and repeated the classification procedure. We observe that the error rates increase substantially, which indicates that the initial set of small RNAs does contain features which are essential for discriminating the two groups.
The results of these analyses are shown in Tables 5-7. These tables show lists of small RNAs that have a high predictive value. They are ranked, with the first being the small RNA with the highest predictive value. This predictive value is represented by the largest difference between the adenoma score and the adenocarcinoma score as listed in Tables 5a-7a. The predictive value and power of the complete set of small RNAs selected is due to the combination of these particular small RNAs. Table 5a shows that 55 from the selection of 1554 differentially expressed small RNAs in colorectal adenoma vs control are highly predictive and classify the tissue samples accurately in 80% of the cases (overall error rate 0.212, table 5b). This is exemplified by the fact that the accuracy rate decreases to 65% (overall error rate 0.349) when these 55 small RNAs are not included in the small RNA population used to classify tissue, demonstrating that these 55 contain good classifiers, to accurately predict the type of tissue analyzed. Table 6a shows that 79 from the selection of 105 differentially expressed small RNAs in adenocarcinoma vs control samples are highly predictive and classify the tissue samples accurately in 80% of the cases (overall error rate 0.198, Table 6b). This is exemplified by the fact that the accuracy rate decreases to 65% (overall error rate 0.365) when these 79 small RNAs are not included in the small RNA population used to classify tissue. Importantly, when these good predictors are excluded, it results in wrongly classifying 8 controls as an adenocarcinoma compared to 2 wrongly classified controls when the good predictors are included: the false positive rate increases significantly from 0.222 to 0.888. These data show that these 79 probably contain good classifiers, to accurately predict the type of tissue analyzed. 
Table 7a shows that 26 from the selection of 105 differentially expressed small RNAs in colorectal adenomas vs adenocarcinomas are highly predictive and classify the tissue samples accurately in 80% of the cases (overall error rate 0.198, table 7b). This is exemplified by the fact that the accuracy rate decreases to 65% (overall error rate 0.365) when these 26 small RNAs are not included in the small RNA population used to classify tissue. Moreover, exclusion of these small RNAs results in the increase of the false negative rate of 0.258 to 0.483 as indicated by double the adenocarcinomas that are wrongly classified as adenomas when excluding the 26 small RNAs. These results show that these 26 contain good classifiers to accurately predict the type of tissue analyzed. In conclusion, the described small RNAs (Tables 5a-7a) are minimal sets of small RNAs part or all of which are included in a diagnostic test, which could classify an unknown sample into the different categories (normal, colorectal adenoma, adenocarcinoma). The above described PAM analysis resulted in a classification rule that may be used in a diagnostic application. Such an application makes use of the small RNA content and expression profiles of an unknown sample which will be compared to the above identified combination of small RNAs using advanced algorithms. This calculates whether the final value is below or above a certain threshold that determines the consequent diagnosis. The threshold depends on the required specificity and sensitivity of the final diagnostic test.
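The rates quoted from the confusion tables follow from simple counts. The sketch below shows the arithmetic; the function name is hypothetical, and the counts are assumptions chosen to be consistent with the reported rates (2 of 9 and 8 of 9 controls misclassified, 8 of 31 adenocarcinomas misclassified), not values taken from Tables 5b-7b.

```python
def confusion_rates(true_pos, false_pos, true_neg, false_neg):
    """Derive error, accuracy, false positive and false negative rates from counts."""
    total = true_pos + false_pos + true_neg + false_neg
    error_rate = (false_pos + false_neg) / total
    return {
        "error_rate": error_rate,
        "accuracy": 1.0 - error_rate,
        "false_positive_rate": false_pos / (false_pos + true_neg),
        "false_negative_rate": false_neg / (false_neg + true_pos),
    }


# Assumed counts: 9 controls with 2 misclassified, 31 carcinomas with 8 misclassified.
with_classifiers = confusion_rates(true_pos=23, false_pos=2, true_neg=7, false_neg=8)
# Assumed counts after removing the classifiers: 8 of 9 controls misclassified.
without_classifiers = confusion_rates(true_pos=23, false_pos=8, true_neg=1, false_neg=8)
```

With these assumed counts the false positive rate rises from about 0.22 to about 0.89 and the false negative rate is about 0.258, in line with the rates quoted in the text. The accuracy is one minus the overall error rate, so an error rate of 0.212 corresponds to an accuracy of about 79%.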
To design a functional and practical diagnostic test it is important to take into account the type of sample that is to be analyzed. Existing commercial tests are based on the analysis of stool samples. They have yielded promising results but need further improvement. The fecal occult blood test (FOBT) is the most recognized method of selecting high risk individuals. Unfortunately, with FOBT a considerable percentage of the tumours is missed and the already bleeding tumour is often discovered at a relatively late stage. A diagnostic test based on the detection of small RNA can be performed using tissue samples (biopsies) as described above. However, it is preferred to use test samples that can be obtained by less invasive measures. Stool samples and blood samples obtained from suspected colorectal patients could qualify. We therefore tested whether small RNAs derived from the tumor tissue are traceable in either blood or faeces. We first determined whether it was possible to extract small RNAs from blood and faeces. The results are shown in tables 8 and 13. They show the ratio of the detection signal of the different small RNAs over background. For this calculation, processed background and signal values produced by FeatureExtraction software (Agilent) were used. We considered all signals with a ratio of more than 2 over background to be significant. The results demonstrate that 3759 small RNAs can be detected in a stool sample and 3644 small RNAs in whole blood obtained from a patient with a colorectal adenocarcinoma. Interestingly, for some hairpin RNAs both arms of the hairpin could be detected, with one arm expressed at a higher level than the other. This clearly resembles endogenous miRNA processing, where the mature strand gives stronger signals than the star sequence (see tables 8b and 13b).
Secondly, we analyzed which small RNAs, expressed in faeces and blood, are differentially expressed in the three comparisons (colorectal adenomas and control samples, colorectal adenocarcinomas and control samples, and colorectal adenomas and adenocarcinomas). These analyses should indicate which small RNAs are indicative of abnormality and can be detected in faeces or blood. They are based on the combination of Tables 8 and 13, including all small RNAs expressed twice above background, and Tables 2-4, including all differentially expressed small RNAs (p<0.2). A summary of Tables 9-11 for faeces and 14-16 for blood is given below (Table 17). The table shows that it is possible to recover small RNAs in blood and faeces that are also differentially expressed in tissue. In conclusion, both blood and faeces obtained from suspected colorectal patients are suitable samples for use in diagnostic small RNA based microarray tests. This is once again exemplified by the fact that part of the group of small RNAs identified as good classifiers can also be detected in faeces (Table 12) and blood (Table 18). Importantly, a comparison of these subsets, presumably derived from colorectal cancer cells and detected in blood and faeces, shows that blood and faeces have 85-90% of these predictive small RNAs in common. This indicates that these small RNAs are indeed derived from colorectal adenocarcinoma cells and, more importantly, that both blood and faeces are suitable for detecting colorectal carcinoma derived small RNAs.
Correlation of Expression of Small RNAs with Chromosomal Aberrations
The colorectal tissue samples used in this study have further been characterized on other molecular levels. By means of array CGH analysis it had been determined which of the above studied adenomas and adenocarcinomas are characterized by chromosome instability, and which aberrations specifically. It was shown that 8q, 13q and 20q gains and 8p, 15q, 17p and 18q losses, are associated with progression of colorectal adenomas to carcinomas (Hermsen et al. (2002) cited above). Therefore, we determined whether the chromosomal aberrations frequently reported in colorectal cancer are correlated with the expression of small RNAs. Table 19 shows that two typical chromosomal aberrations, loss of X8p and loss of X17p, correspond to significant expression of 219 and 15 small RNAs respectively. One could view the significance of this correlation of small RNA expression with certain chromosomal aberration in two manners; the first would regard small RNAs correlated with certain chromosomal aberrations as indicators for those aberrations and could therefore be used in diagnostic tests based on such chromosomal instabilities. The second way of regarding this correlation is to take into account the biological consequences. The loss of X8p and X17p affected expression of these particular small RNAs. This could either be a direct consequence: these small RNAs are encoded by the part of these chromosomes that is lost. It could also be an indirect effect: the chromosomes that are lost encode for genes that regulate the expression of these small RNAs. Moreover, the fact that loss of these chromosomes affects clinical behaviour of tumors may suggest that the small RNAs, correlated with this loss, have a biological function in the process of tumorigenesis.
Currently, the small RNA expression profiles identified in these 68 colorectal tissue samples have been analyzed to identify good clinical classifiers and correlated with small RNAs recovered from faeces and blood obtained from colorectal adenocarcinoma patients, to determine whether a diagnostic small RNA based test would be feasible. Moreover, these expression profiles were combined with array CGH data to determine whether chromosomal aberrations can be correlated with certain small RNAs. It is hypothesized that the small RNAs correlated with loss of X8p and X17p (as identified in table 19) will also have a biological function, since loss of these chromosomal regions is correlated with clinical prognosis. Therefore, in the future, these small RNAs will be studied in more detail to test that hypothesis. Importantly, the correlation between small RNA expression profiles and already available mRNA expression profiles will be determined shortly. This may give indications on which genes are targeted by the small RNAs that are identified in colorectal adenomas and adenocarcinomas. It is of particular interest to determine the target genes of small RNAs that are differentially expressed between adenomas and adenocarcinomas. These small RNAs may be involved in progression to adenocarcinomas and may form interesting targets for drug discovery. Consequently, the gene targets of these small RNAs may also be of great interest.
For materials and methods for the PAM analysis, represented in table 25, we refer to the materials and methods as presented under example 1.
To confirm the classifier as listed in table 7a, the PAM analysis was performed again, this time using a double cross validation to assure accurate prediction of the most predictive small RNAs. The small RNAs as listed in table 25a are the result of this analysis. 41 small RNAs were selected from the list of differentially expressed small RNAs. The combination of these 41 small RNAs is able to distinguish adenomas from adenocarcinomas in a highly predictive manner with a 72% accuracy (overall error rate 0.283, Table 25b). This is exemplified by the fact that the accuracy rate decreases to 66% (overall error rate 0.34) when these 41 small RNAs are not included in the small RNA population used to classify tissue, demonstrating that these 41 contain good classifiers, to accurately predict the type of tissue analyzed. The above described PAM analysis resulted in a classification rule that may be used in a diagnostic application. Such an application makes use of the small RNA content and expression profiles of an unknown sample which will be compared to the above identified combination of small RNAs using advanced algorithms. This calculates whether the final value is below or above a certain threshold that determines the consequent diagnosis. The threshold depends on the required specificity and sensitivity of the final diagnostic test.
To improve the predictive value of such classifiers as described above, a number of strategies can be followed. The first strategy could be to increase the sample set to ensure the correct selection of predictive small RNAs. This strategy could in some cases increase the accuracy of prediction and decrease the error rate. However, in other cases the predictive value of a diagnostic test has already reached its limit and increasing the sample set will not result in a more accurate predictive set. Therefore, a second strategy to improve the predictive value of the diagnostic test may be to combine the dataset based on the microarray data with other data sets. For example, a diagnostic test may combine data based on the small RNA expression levels with array CGH data obtained from the same samples. This could result in a more accurate prediction of disease state. A third option may be to use combined diagnostic tests in a step-wise procedure, where the first diagnostic test will have the best predictive value. For patients that fall within a category for which no clarity as to diagnosis can be given, a second diagnostic test will follow that will ultimately result in a diagnosis of the disease state. The second test on its own would not have given a clear answer, but in combination with the first test, this resulted in a defined and reliable diagnosis.
Importantly, a portion of these 41 classifier small RNAs can be detected in faeces (Table 34) or blood (Table 35). As mentioned, both blood and faeces obtained from suspected colorectal patients are suitable samples for use in diagnostic small RNA based microarray tests. The presence of small RNAs identified as good classifiers in faeces and blood suggests that the classifier can be used as a diagnostic test using faeces or blood as a source for small RNAs derived from colorectal adenomas or adenocarcinomas.
For materials and methods for the cluster analysis, represented in Fig IV, we refer to the materials and methods as presented under example 1.
From the 384 small RNAs that were found to be most significantly differentially expressed between adenomas and adenocarcinomas (Fig IIIc, d and table 26), 91 were selected and believed to be highly predictive for classification of adenoma vs adenocarcinoma (Table 27). This was confirmed by the fact that, when cluster analysis (Euclidean) was performed using the 91 small RNAs, two distinct clusters were formed (Fig IVa). Moreover, a Chi-square test showed that the clustering based on the 91 selected small RNAs is highly significant (P<0.0001) (Fig IVb).
Fresh colorectal tissue samples from 58 patients with colorectal tumors were collected prospectively at the department of pathology of the VU-University medical center (VUmc), Amsterdam, the Netherlands. In total, 27 adenomas and 31 adenocarcinomas were collected, of which 12 adenomas and 12 carcinomas were also analyzed by small RNA based microarray (see example 1). From all samples, total RNA was isolated using TRIzol (Invitrogen) following the manufacturer's guidelines with some modifications (http://www.vumc/nl/microarrays/index.html). Total RNA quantity was determined with a Nanodrop ND-1000 spectrophotometer (Isogen, Hackensack, N.J., USA) and quality was assessed on a 1% agarose gel stained with ethidium bromide. The study was carried out in accordance with the ethical guidelines of our institution concerning the use of patient material.
TaqMan PCR
Quantification of the expression levels of the left arm of hsa-miR-24-2 (hsa-miR-24-2-1) was done by real-time RT-PCR using TaqMan® microRNA assays (Applied Biosystems, Foster City, Calif., USA) directed to hsa-mir-24 (ABI 4373072) and an endogenous reference, the RNU48 gene (ABI 437338). Reactions were performed following the manufacturer's protocol using 10 ng of total RNA as input material. All reactions were carried out in duplicate in a 7300 Real-time PCR System (Applied Biosystems, Foster City, Calif., USA).
Expression levels of hsa-miR-24-2-1 were calculated from the obtained Ct values using the delta Ct method as previously described. Box and scatter plots were used to examine the descriptive statistics of the data (SPSS 14.0 for Windows, SPSS Inc. Chicago, Ill., USA). The significance of differences in expression levels between adenomas and carcinomas was computed by the Mann-Whitney U non-parametric test for independent samples (SPSS 14.0 for Windows, SPSS Inc. Chicago, Ill., USA).
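The two calculations named above can be illustrated with a minimal Python sketch. It is not the SPSS procedure used in the study. The function names and Ct values are hypothetical, and the conversion of delta Ct to relative expression as 2 to the power of minus delta Ct is the commonly used form, which assumes an amplification efficiency of approximately 100%. Only the U statistic is computed here; deriving a p value from it is left to a statistics package.

```python
def delta_ct(ct_target, ct_reference):
    """Delta Ct: Ct of the target minus Ct of the endogenous reference."""
    return ct_target - ct_reference


def relative_expression(ct_target, ct_reference):
    """Relative expression as 2^(-delta Ct), assuming ~100% PCR efficiency."""
    return 2.0 ** (-delta_ct(ct_target, ct_reference))


def mann_whitney_u(group_a, group_b):
    """Mann-Whitney U statistic for group_a versus group_b.

    Counts the pairs (a, b) in which a exceeds b; ties count as 0.5.
    """
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Hypothetical Ct values: target assay versus the reference assay.
print(delta_ct(25.0, 22.0), relative_expression(25.0, 22.0))

# Hypothetical relative expression levels in two independent groups.
adenomas = [0.10, 0.12, 0.15]
carcinomas = [0.20, 0.25, 0.30]
print(mann_whitney_u(carcinomas, adenomas))
```

A U statistic equal to the product of the two group sizes, or equal to zero, means that the two groups do not overlap at all.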
To validate the microarray expression data as presented in
Colorectal cancer (CRC) results from the accumulation of DNA copy number gains and losses, promoter methylation changes, mutations and alterations in microRNA (miRNA) expression. It is well documented that 85% of colorectal tumours show genetic instability through specific chromosomal gains and losses of small or large portions of chromosomes or of whole chromosomes. These tumours are known as chromosomal instability (CIN) tumours. The biological consequences of the DNA copy number changes that characterize CRC have not been fully established, since the genes or miRNAs with tumour suppressor or oncogenic function that are located in those regions have not been fully identified. In recent years, small RNA molecules have been shown to play an important role in the pathogenesis of human cancers as oncogenes and tumour suppressor genes. Small RNAs contribute to the pathogenesis of human cancers through their altered expression. The cause of this altered expression has only been partially characterized. It is known that DNA copy number changes, epigenetic changes and transcription factors involved in human carcinogenesis lead to increased or decreased expression of small RNAs during cancer initiation and progression. Therefore, small RNAs located in loss and gain regions characteristic of CRC, as well as small RNAs whose expression is associated with DNA copy number changes, may play a role as causative oncogenes or tumour suppressor genes in CRC. Identification of the small RNAs that act as causative oncogenes and tumour suppressor genes in DNA copy number gain and loss regions will lead to a better understanding of the molecular mechanisms underlying CRC pathogenesis, and these small RNAs may serve either as highly specific biomarkers for CRC or as possible targets for pharmaceutical intervention in the development of CRC.
38 colorectal carcinomas with array CGH data and microsatellite instability status available were collected at the Zaans Hospital, Zaandam, the Netherlands and Leeds University, UK. The study was carried out in accordance with the ethical guidelines of our institution concerning the use of patient material.
DNA from healthy mucosa and tumor DNA of each patient were isolated using the QIAamp microkit (Qiagen, Westburg, Leusden, the Netherlands) as previously described (Weiss M M, Hermsen M A, Meijer G A, et al. Comparative genomic hybridisation. Mol Pathol 1999; 52:243-51; Buffart T E, Carvalho B, Hopmans E, et al. Gastric cancers in young and elderly patients show different genomic profiles. J Pathol 2007; 211:45-51).
Microsatellite instability status was determined using the MSI Analysis System, MSI Multiplex System Version 1.2 (Promega, Madison, Wis., USA) containing five monomorphic markers (BAT-25, BAT-26, NR-21, NR-24, MONO-27). Reactions were performed following the manufacturer's instructions. The obtained PCR products were separated on an ABI 3130 DNA sequencer (Applied Biosystems, Foster City, Calif., USA) and analyzed by GeneScan 3100 (Applied Biosystems, Foster City, Calif., USA). An internal lane size standard was added to the PCR samples for accurate sizing of alleles and to adjust for run-to-run variations. Tumours were considered microsatellite instable (MSI) when two or more markers showed length changes (instability) (MSI-H). Tumours showing no instable marker or only one instable marker (MSI-L) were considered microsatellite stable (MSS).
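The classification rule stated above can be written out as a minimal Python sketch. The function name and the input representation (a collection of the names of markers that showed length changes) are hypothetical; the marker panel and the cut-off of two or more instable markers are taken from the text.

```python
MARKERS = ("BAT-25", "BAT-26", "NR-21", "NR-24", "MONO-27")


def classify_msi(instable_markers):
    """Return "MSI" when two or more markers are instable, otherwise "MSS".

    `instable_markers` is an iterable of marker names that showed length
    changes; names outside the five-marker panel are rejected.
    """
    observed = set(instable_markers)
    unknown = observed - set(MARKERS)
    if unknown:
        raise ValueError(f"unknown markers: {sorted(unknown)}")
    return "MSI" if len(observed) >= 2 else "MSS"


print(classify_msi([]))
print(classify_msi(["BAT-25"]))
print(classify_msi(["BAT-25", "NR-21"]))
```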
Array CGH was performed on 30K oligonucleotide arrays as described before (Snijders A M, Nowak N, Segraves R, et al. Assembly of microarrays for genome-wide measurement of DNA copy number. Nat Genet. 2001; 29:263-4. Van den Ijssel P, Tijssen M, Chin S F, et al. Human and mouse oligonucleotide-based array CGH. Nucleic Acids Res 2005; 33:e192). Briefly, 600 ng of tumour and normal DNA were differentially labelled by random priming (Bioprime DNA Labeling System, Invitrogen, Breda, the Netherlands). Unincorporated nucleotides were removed with Sephadex columns (ProbeQuant G-50 Micro Columns, Amersham Biosciences) and 50 μl of tumour and normal DNA were combined with 10 μg Cot-1 DNA and precipitated by adding 2.5 volumes of ice-cold 100% ethanol and 0.1 volume of 3M sodium acetate (pH 5.2). The DNA was collected by centrifugation at 14000 rpm for 30 minutes at 4° C. The pellet was dissolved in 130 μl hybridization solution and incubated at 73° C. for 10 minutes to denature the DNA, followed by a 60-120 minute incubation at 37° C. Next, the hybridization mixture was added to the array in a hybridization station (HybArray12™, Perkin Elmer Life Sciences, Zaventem, Belgium) and incubated for 38 hours at 37° C. After hybridization the slides were washed in six washing steps and dried by centrifugation for 3 minutes at 1000 g.
Images of the arrays were acquired by scanning (Agilent DNA Microarray scanner, Agilent Technologies, Palo Alto, USA) and Bluefuse software version 3.4 (BlueGnome, Cambridge, UK) was used for automatic feature extraction. Spots were excluded when the quality flag was below 1 or the confidence value was below 0.1. The log2 tumour to normal ratio was calculated for each spot and normalized against the mode of the ratios of all autosomes. For determining copy number gains and losses, the R package CGHcall was used (van de Wiel M A, Kim K I, Vosse S J, et al. CGHcall: calling aberrations for array CGH tumor profiles. Bioinformatics 2007; 23:892-4).
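The spot filtering and normalization steps described above can be sketched in Python as follows. This is an illustration only and does not reproduce the Bluefuse or CGHcall implementations. The dictionary keys, the function names, the estimation of the mode from a histogram with a fixed bin width, and the normalization by subtracting the mode in log2 space are all assumptions; the input is assumed to contain autosomal spots only.

```python
import math
from collections import Counter


def log2_ratios(spots, min_quality=1, min_confidence=0.1):
    """Return log2(tumour/normal) for every spot that passes both filters.

    Each spot is a dict with the hypothetical keys "tumour", "normal",
    "quality" and "confidence".
    """
    kept = []
    for spot in spots:
        if spot["quality"] < min_quality or spot["confidence"] < min_confidence:
            continue  # excluded spot
        kept.append(math.log2(spot["tumour"] / spot["normal"]))
    return kept


def mode_normalize(ratios, bin_width=0.05):
    """Subtract the modal log2 ratio from every ratio.

    Assumption: the mode is estimated as the centre of the most populated
    histogram bin of width `bin_width`.
    """
    bins = Counter(round(r / bin_width) for r in ratios)
    modal_bin = max(bins, key=bins.get)
    mode = modal_bin * bin_width
    return [r - mode for r in ratios]


# Hypothetical spot intensities; the last spot fails the confidence filter.
spots = [
    {"tumour": 200.0, "normal": 100.0, "quality": 1, "confidence": 0.9},
    {"tumour": 100.0, "normal": 100.0, "quality": 1, "confidence": 0.9},
    {"tumour": 100.0, "normal": 100.0, "quality": 1, "confidence": 0.05},
]
print(log2_ratios(spots))
```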
ACE-it (Array CGH Expression integration tool) was applied to statistically test whether gene dosage affects microRNA expression (van Wieringen et al., 2006). This tool was applied to the whole genome data. We used a cut-off value of 0.15 for gains and losses, a default group value of 9 and an FDR for significance of 0.10.
Eight microdeleted regions, smaller than 3 megabases, were observed in at least two of the 38 colorectal carcinomas. To determine whether the changes in small RNA expression in colorectal carcinomas may be a consequence of their location in the previously mentioned microdeletions or microgains (smaller than 3 Mb), the localization of small RNAs in such regions detected by array CGH was ascertained. A substantial number of small RNAs appear to be located in these eight deleted regions (table 28a). Of these, 6 small RNAs were shown to be differentially expressed between adenomas and carcinomas (table 28b). Because of their location in these specific regions, these small RNAs could contribute causally to colorectal carcinogenesis as tumor suppressor genes or oncogenes, and could form targets for therapeutic intervention or be used as highly specific biomarkers. Reintroduction of such previously deleted small RNAs may benefit the patient by affecting tumor progression or response to therapy, and therefore clinical outcome. Moreover, they could be used as biomarkers for early diagnosis, therapy response or clinical outcome of colorectal carcinomas. Five micro-gain regions, smaller than 3 megabases, were observed in at least two of the 38 colorectal carcinomas. In these 5 regions, 22 small RNAs are located (Table 29a), one of which has been shown to be differentially expressed between adenomas and carcinomas (Table 29b). Gain of these small RNAs may be causatively involved in the initiation and maintenance of colorectal tumours. As such, inhibition of these small RNAs could counteract this effect and provide a novel therapeutic strategy for the treatment of colorectal carcinomas. In addition, these microRNAs could be used as biomarkers for molecular diagnosis or tumor classification regarding therapy response, therapeutic intervention or clinical outcome.
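The localization step, in which small RNAs are matched to micro-deleted or micro-gained regions smaller than 3 megabases, amounts to an interval overlap test. The following Python sketch shows one way to do this. The function names, the tuple layout and the coordinates and small RNA names in the usage example are hypothetical and do not correspond to entries in the tables.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True when the closed intervals [a_start, a_end] and [b_start, b_end] overlap."""
    return a_start <= b_end and b_start <= a_end


def small_rnas_in_regions(small_rnas, regions, max_size=3_000_000):
    """Return the names of small RNAs located in regions smaller than `max_size`.

    small_rnas: iterable of (name, chromosome, start, end)
    regions:    iterable of (chromosome, start, end)
    """
    hits = []
    for name, chrom, start, end in small_rnas:
        for r_chrom, r_start, r_end in regions:
            if r_end - r_start >= max_size:
                continue  # only regions smaller than 3 Mb are considered
            if chrom == r_chrom and overlaps(start, end, r_start, r_end):
                hits.append(name)
                break  # report each small RNA once
    return hits


# Hypothetical coordinates, for illustration only.
small_rnas = [
    ("smallRNA-A", "chr1", 1_500_000, 1_500_080),
    ("smallRNA-B", "chr1", 9_000_000, 9_000_075),
    ("smallRNA-C", "chr2", 1_500_000, 1_500_090),
]
regions = [("chr1", 1_000_000, 2_000_000), ("chr1", 5_000_000, 12_000_000)]
print(small_rnas_in_regions(small_rnas, regions))
```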
Within the population of colorectal carcinomas, two groups can be distinguished: one group with stable microsatellites and one group defined by instable microsatellites. Microsatellite instability is often a result of DNA mismatch repair deficiencies, and the stability status of the microsatellites therefore determines the sensitivity of these tumors to the accumulation of mutations or epigenetic changes. To determine whether microdeletions or microgains are found specifically in one group or the other, array CGH data were used to determine such changes in these two groups.
In at least two out of 26 microsatellite stable carcinomas, two micro-deletions, smaller than 3 megabases, were observed. Four small RNAs are located in these regions (table 30). Three micro-gain regions, smaller than 3 megabases, were observed in at least two of the 26 microsatellite stable colorectal carcinomas. In these regions, 10 small RNAs are localized (table 31a), of which 1 is differentially expressed in adenomas compared to carcinomas (table 31b). It is known that MSI and MSS tumours have different clinical outcomes and respond differently to the current chemotherapy regimens. On this basis, small RNAs located in microdeleted or microgained chromosomal regions that are most frequently present in MSS tumours might be responsible for the different clinical behaviour seen in MSS colorectal tumors. Therefore, small RNAs that are located in these gain and loss regions are highly suitable for use as biomarkers to distinguish microsatellite stable from microsatellite instable carcinomas, for early diagnosis of these tumors, for tumor classification regarding therapy response, for selecting the most suitable therapeutic regimen for the patient, or for predicting clinical outcome.
Eight micro-deletions, smaller than 3 megabases, were observed in at least two of the 12 microsatellite instable colorectal carcinomas. In these deleted regions, 41 small RNAs are localized (table 32a), of which 4 are differentially expressed between adenomas and carcinomas (table 32b). No microgains could be detected in this group of colorectal carcinomas. These 41 small RNAs that are localized in deleted regions, and in particular the 4 that are differentially expressed, may contribute to the maintenance of the tumor phenotype. Therefore, they could be suitable for therapeutic approaches in which reintroduction of these small RNAs may interfere with tumor growth or other tumor specific characteristics. Moreover, these small RNAs may serve as highly specific biomarkers, since it is known that MSI and MSS tumours have different clinical outcomes and respond differently to the current chemotherapy regimens. On this basis, small RNAs located in microdeleted chromosomal regions that are most frequently present in MSI tumours might be responsible for the different clinical behaviour seen in MSI colorectal tumors. Therefore, small RNAs that are located in these microdeleted regions are highly suitable for use as biomarkers to distinguish microsatellite stable from microsatellite instable carcinomas, for early diagnosis of these tumors, for tumor classification regarding therapy response, for selecting the most suitable therapeutic regimen for the patient, or for predicting clinical outcome.
BAC array CGH data were related to the small RNA expression array data presented in Example 1, independently of adenoma or carcinoma status. To this end, a dedicated integration tool called ACE-it was applied (van Wieringen et al., 2006). We obtained a list of five small RNAs for which DNA dosage affected expression levels (FDR<0.1) (Table 33). The expression of these five small RNAs is associated with DNA copy number changes, and they are located in chromosomal regions implicated in colorectal pathogenesis (Beatriz Carvalho, Cindy Postma, Sandra Mongera, Erik Hopmans, Sharon Diskin, Mark A. van de Wiel, Wim van Criekinge, Olivier Thas, Anja Matthai, Miguel A. Cuesta, Jochim S. Terhaar sive Droste, Mikael Edward Craanen, Evelin Schrock, Bauke Ylstra, and Gerrit A. Meijer. Multiple putative oncogenes at the chromosome 20q amplicon contribute to colorectal adenoma to carcinoma progression (in press)). This indicates that these 5 small RNAs may be involved and causal in colorectal carcinogenesis. Small RNAs whose expression is associated with DNA copy number changes may contribute causally to colorectal carcinogenesis as tumor suppressor genes or oncogenes, and may form targets for therapeutic intervention or be used as highly specific biomarkers, for example for early diagnosis, therapy response or clinical outcome of colorectal carcinomas.
Table footnotes:
1 Cut-off: log2 range > 4.
1 Wilcoxon analysis: cut-off adjusted p value < 0.2.
2 Upregulation of a small RNA when the expression ratio is > 1; downregulation when the expression ratio is < 1 (upregulated small RNAs appear below the line in the table).
1 Cut-off: signal/background > 2.
2 Cut-off: adjusted p value < 0.2.
Number | Date | Country | Kind |
---|---|---|---|
07075567.3 | Jul 2007 | EP | regional |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/NL08/50459 | 7/7/2008 | WO | 00 | 9/27/2010 |
Number | Date | Country | |
---|---|---|---|
60958733 | Jul 2007 | US | |
60961001 | Jul 2007 | US |