The present invention relates to methods for tumour diagnosis, more particularly for the diagnosis of colorectal adenomas and adenocarcinomas. The present invention further provides marker genes, probes and arrays for performing these methods.
Cancer of the colorectal part of the gastrointestinal tract is a frequently occurring disorder. In a first stage a benign tumour (adenoma) occurs which can turn into a malignant cancer (adenocarcinoma). Not all adenomas progress to carcinomas. Indeed this progression into carcinomas occurs only in a small subset of tumours. Initiation of genomic instability is a crucial step and occurs in two ways in colorectal cancer (Lengauer et al. (1998) Nature, 396, 643-649). DNA mismatch repair deficiency leading to microsatellite instability (abbreviated as MSI or MIN), has been most extensively studied (di Pietro et al. (2005) Gastroenterology, 129, 1047-1059), but explains only about 15% of adenoma to carcinoma progression. In the other 85% of cases where colorectal adenomas progress to carcinomas, genomic instability occurs at the chromosomal level (CIN) giving rise to aneuploidy. While for a long time these chromosomal aberrations have been regarded as random noise, secondary to cancer development, it has now been well established that these DNA copy number changes occur in specific patterns and are associated with different clinical behaviour (Hermsen et al. (2002). Gastroenterology 123, 1109-1119). Chromosomal aberrations frequently reported in colorectal cancers are 7pq, 8q, 13q, 20q gains and 4pq, 5q, 8p, 15q, 17p, 18q losses. It was shown that 8q, 13q and 20q gains and 8p, 15q, 17p and 18q losses, are associated with progression of colorectal adenomas to carcinomas (Hermsen et al. (2002) cited above). Gain of chromosome arm 20q is the most frequent gain observed in colorectal cancer, being altered in more than 65% of the cases (Meijer et al. (1998) J. Clin. Pathol. 51, 901-909). Gains on 20q, in particular the region 20q12-q13, are also commonly described in other types of solid tumours and have been associated with poor outcome in both gastric and colorectal cancers.
It is of utmost clinical importance to identify the above described progression of adenomas into adenocarcinomas at as early a stage as possible, to allow early stage treatment of carcinomas while avoiding unnecessary surgical intervention for adenomas. Ideally, the adenocarcinomas can be identified at a stage where the presence of malignant cells is not yet detectable by classical microscopic analyses.
Different studies refer to individual or limited sets of genes which show a different expression level in adenomas compared to adenocarcinomas, mostly without correlation with the underlying chromosomal instability (Habermann et al. (2007) Genes Chromosomes Cancer. 46, 10-26, US20040258761).
The present invention is based on the observation that there is a significant difference in the expression pattern of genes in colorectal adenocarcinoma cells when compared with the expression pattern of genes of colorectal adenoma cells and that these differences are correlated with the occurrence of chromosomal aberrations in adenocarcinomas. Accordingly adenocarcinomas can be classified according to their chromosomal aberration and marker genes for these chromosomal aberrations have been identified. The large number of samples investigated in the present analysis resulted in a set of marker genes for all occurring chromosomal aberrations.
The present invention thus relates to methods and tools for cancer diagnosis, more specifically for the discrimination between adenomas and carcinomas in the colon and/or the rectum. More particularly, the present invention relates to the detection of the progression of an adenoma into an adenocarcinoma. It is an advantage of the methods of the present invention that they allow detection of the presence of adenocarcinoma cells at a very early stage, i.e. before their presence can be detected by echography, radiography or MRI (magnetic resonance imaging).
The present invention is based on the finding that the progression of a colorectal adenoma into an adenocarcinoma is often caused by the appearance of a chromosomal gain or loss in an adenoma cell. Accordingly the present invention provides methods and tools for detecting adenocarcinomas by detecting a chromosomal gain or loss by indirect analysis methods.
According to the present invention, chromosomal gain or loss, correlated with the progression from colorectal adenoma into adenocarcinoma, is detected indirectly by detection of altered expression levels of marker genes for the chromosomal gain or loss. Marker genes of which the expression is linked to chromosomal gain or loss can be divided into two classes. The first class of genes are those genes which are located in the regions of the chromosomes which are gained or lost. While a gene which is located in a chromosomal region which is lost will not be expressed, regulatory mechanisms in the cell may upregulate the expression of the corresponding gene on the other intact chromosome. Contrary to what may be expected, not all genes which are located on a region of chromosomal gain are overexpressed. Thus the mere knowledge of the location of a gene within a region of chromosomal gain or loss is not sufficient to predict whether, and if so how, the expression of the gene is affected. The second class of genes are those genes which are themselves not located in a region of chromosomal gain or loss, but of which the expression is influenced by one or more genes located on a chromosomal aberration.
The present invention provides combinations of marker genes which allow a reliable detection of the presence of adenocarcinoma cells in a sample.
In one aspect, the present invention discloses marker genes which have not previously been identified as marker genes for the progression of colorectal adenoma cells into carcinoma cells (Table 17).
Accordingly, the present invention provides in vitro methods for detecting the presence of colorectal adenocarcinoma cells in a patient, the method comprising the steps of (a) detecting in a test sample of that patient the expression level of one or more marker genes, selected from the group listed in Table 17, and (b) comparing the expression level of these one or more marker genes used in step (a) to that of a control. An elevated or decreased expression level of the marker genes in the test sample compared to the control sample, is indicative of the presence of colorectal adenocarcinoma cells in the patient. According to a particular embodiment, the methods of the invention comprise detecting the expression level of one or more marker genes, selected from the group listed in Table 17, whereby the marker genes are genes of which the expression is upregulated in adenocarcinoma cells compared to adenoma cells.
In another aspect, the present invention provides an extensive list of marker genes such that a representative number of marker genes can be used to reliably determine the progression of colorectal adenoma cells into carcinoma cells (Table 1).
Accordingly, the present invention provides in vitro methods for detecting the presence of colorectal adenocarcinoma cells in a patient, the method comprising the steps of (a) detecting in a test sample of said patient the expression level of at least 12 of the marker genes, selected from the group listed in Table 1, and (b) comparing the expression level of the marker genes used in step a) to that of a control sample. An elevated or decreased expression level of these marker genes in the test sample compared to the control sample, is indicative of the presence of colorectal adenocarcinoma cells in the patient.
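Steps (a) and (b) above can be illustrated with a minimal Python sketch. It assumes that expression levels are available as log2 values per marker gene; the function name, the placeholder marker identifiers and the two-fold cut-off are hypothetical and are not taken from Table 1.

```python
MIN_MARKERS = 12  # minimum number of marker genes stated in the text


def altered_markers(test, control, min_abs_log2=1.0):
    """Return {marker: log2 difference} for markers that differ from the control.

    test, control: dicts mapping a marker ID to its log2 expression level.
    min_abs_log2: illustrative cut-off (1.0 means two-fold); an assumption,
    not a value prescribed by the text.
    """
    shared = sorted(set(test) & set(control))
    if len(shared) < MIN_MARKERS:
        raise ValueError(f"need at least {MIN_MARKERS} markers, got {len(shared)}")
    # Step (b): compare each marker's level in the test sample to the control.
    diffs = {marker: test[marker] - control[marker] for marker in shared}
    # Both elevated (positive) and decreased (negative) levels are reported.
    return {m: d for m, d in diffs.items() if abs(d) >= min_abs_log2}


# Placeholder data: twelve hypothetical markers, two of which are altered.
control_levels = {f"marker_{i:02d}": 5.0 for i in range(1, 13)}
test_levels = dict(control_levels, marker_01=7.0, marker_02=3.5)
flagged = altered_markers(test_levels, control_levels)
```

How many altered markers are needed before a sample is called positive is left open here, since the text does not fix such a number.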
In yet another aspect, the present invention discloses marker genes the expression level of which is correlated with the presence of a specific type of chromosomal aberration in a colorectal adenoma cell (Tables 2 to 9).
Accordingly, the present invention provides in vitro methods for detecting the presence of colorectal adenocarcinoma cells in a patient based on the detection of marker genes the expression of which is correlated with the presence of a chromosomal aberration. More specifically, the methods according to this aspect of the invention comprise the steps of (a) detecting in a test sample of the patient the expression level of one or more marker genes, the expression of which is altered upon the presence of a chromosomal aberration, and (b) comparing the expression level of the one or more marker genes used in step (a) to that of control sample(s). An elevated or decreased expression level of the one or more marker genes in the test sample compared to the control sample, is indicative of the presence of adenocarcinoma cells in the patient.
In particular embodiments of this aspect of the invention, the one or more marker genes comprise:
two or more marker genes selected from the group of marker genes depicted in Table 2 of which the expression level is altered by chromosomal loss at chromosome 8p and/or,
one or more marker genes selected from the group of marker genes depicted in Table 3 of which the expression level is altered by chromosomal gain at chromosome 8q and/or,
three or more marker genes selected from the group of marker genes depicted in Table 4 of which the expression level is altered by chromosomal gain at chromosome 13q and/or
one or more marker genes selected from the group of marker genes depicted in Table 5 of which the expression level is altered by chromosomal loss at chromosome 15q and/or
one or more marker genes selected from the group of marker genes depicted in Table 6 of which the expression level is altered by chromosomal loss at chromosome 17p and/or,
three or more marker genes selected from the group of marker genes depicted in Table 7 of which the expression level is altered by chromosomal loss at chromosome 18q and/or,
nine or more marker genes selected from the group of marker genes depicted in Table 8 of which the expression level is altered by chromosomal gain at chromosome 20q and/or
two or more marker genes selected from the group of marker genes depicted in Table 9 of which the expression level is altered by chromosomal gain at chromosome 20q.
In yet another aspect, the present invention discloses sets of marker genes which are representative for the different chromosomal aberrations linked to colorectal adenocarcinoma. These aberrations are known to occur in at least 85% of all occurring colorectal adenocarcinomas, thus representing a reliable screening assay for the detection of adenocarcinomas.
Accordingly, the present invention provides in vitro methods for detecting the presence of colorectal adenocarcinoma cells in a patient, the method comprising the steps of (a) detecting in a test sample of the patient the expression level of a plurality of marker genes representative of the occurrence of chromosomal aberrations linked to colorectal adenocarcinomas and (b) comparing the expression level of the plurality of marker genes used in step a) to that of a control sample. An elevated or decreased expression level of the plurality of markers in the test sample compared to the control sample, is indicative of the presence of adenocarcinoma cells in the patient. More particularly, the plurality of marker genes comprises:
at least one marker gene of which the expression level is altered by chromosomal loss at chromosome 8p, selected from the group of marker genes depicted in Table 2, and
at least one marker gene of which the expression level is altered by chromosomal gain at chromosome 8q, selected from the group of marker genes depicted in Table 3, and
at least one marker gene of which the expression level is altered by chromosomal gain at chromosome 13q, selected from the group of marker genes depicted in Table 4, and
at least one marker gene of which the expression level is altered by chromosomal loss at chromosome 15q, selected from the group of marker genes depicted in Table 5, and
at least one marker gene of which the expression level is altered by chromosomal loss at chromosome 17p, selected from the group of marker genes depicted in Table 6, and
at least one marker gene of which the expression level is altered by chromosomal loss at chromosome 18q, selected from the group of marker genes depicted in Table 7, and
at least one marker gene of which the expression level is altered by chromosomal gain at chromosome 20q, selected from the group of marker genes depicted in Table 8 or in Table 9.
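The requirement that a panel contains at least one marker gene per chromosomal aberration can be checked mechanically. The following Python sketch assumes that the membership of each Table is supplied by the user as a set of identifiers; the function name and the data layout are hypothetical, and only the aberration-to-Table assignment is taken from the list above.

```python
# Aberrations and the Tables associated with them in the list above.
ABERRATION_TABLES = {
    "8p loss": (2,),
    "8q gain": (3,),
    "13q gain": (4,),
    "15q loss": (5,),
    "17p loss": (6,),
    "18q loss": (7,),
    "20q gain": (8, 9),  # Table 8 or Table 9
}


def missing_aberrations(panel, table_members):
    """List the aberrations for which the panel contains no marker gene.

    panel: iterable of marker IDs selected for the assay.
    table_members: dict mapping a Table number to the set of marker IDs in it.
    """
    chosen = set(panel)
    missing = []
    for aberration, tables in ABERRATION_TABLES.items():
        candidates = set().union(*(table_members.get(t, set()) for t in tables))
        if not chosen & candidates:
            missing.append(aberration)
    return missing
```

An empty result means that the panel satisfies the "at least one marker gene per aberration" condition; any returned entry names an aberration that is not yet represented.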
In yet another aspect, the present invention discloses marker genes the expression level of which is altered upon the presence of a specific type of chromosomal aberration in a colorectal adenoma cell and which are located within that chromosomal aberration (Tables 10-16).
Accordingly, the present invention provides in vitro methods for detecting the presence of colorectal adenocarcinoma cells in a patient, the method comprising the steps of (a) detecting in a test sample of the patient the expression level of a plurality of marker genes representative of the occurrence of chromosomal aberrations linked to colorectal adenocarcinomas and (b) comparing the expression level of the plurality of marker genes used in step a) to that of a control sample. An elevated or decreased expression level of the plurality of markers in the test sample compared to the control sample, is indicative of the presence of adenocarcinoma cells in the patient. More particularly, the plurality of marker genes comprises:
at least one marker gene located in the region of chromosomal loss at chromosome 8p, selected from the group of marker genes depicted in Table 10, and
at least one marker gene located in the region of chromosomal gain at chromosome 8q, selected from the group of marker genes depicted in Table 11, and
at least one marker gene located in the region of chromosomal gain at chromosome 13q, selected from the group of marker genes depicted in Table 12, and
at least one marker gene located in the region of chromosomal loss at chromosome 15q, selected from the group of marker genes depicted in Table 13, and
at least one marker gene located in the region of chromosomal loss at chromosome 17p, selected from the group of marker genes depicted in Table 14, and
at least one marker gene located in the region of chromosomal loss at chromosome 18q, selected from the group of marker genes depicted in Table 15, and
at least one marker gene located in the region of chromosomal gain at chromosome 20q, selected from the group of marker genes depicted in Table 16.
In particular embodiments of the above methods of the invention, the marker genes used are selected such that the one or more marker genes, the expression level of which is altered by a chromosomal loss at chromosome 8p or which are located in a region of chromosomal loss at chromosome 8p, are selected from the group of marker genes consisting of NM_020749, NM_004315, NM_003747, NM_016353, NM_152415, NM_006197, NM_000662, NM_000015, D31887, NM_017884, NM_004462, NM_006765, NM_001715, NM_012331, NM_139167, NM_013354 and NM_005144, depicted in Table 10, and/or
the one or more marker genes, the expression level of which is altered by a chromosomal gain at chromosome 8q or which are located in a region of chromosomal gain at chromosome 8q, are selected from the group of marker genes consisting of NM_138455, NM_032611, NM_032862, AL713790, BC030520, NM_024035, NM_017767, NM_002346, AF289596, NM_012162 and AB051475, depicted in Table 11 and/or
the one or more marker genes, the expression level of which is altered by a chromosomal gain at chromosome 13q or which are located in a region of chromosomal gain at chromosome 13q, are selected from the group of marker genes consisting of NM_145293, NM_005358, U50531, NM_012158, NM_017817, NM_003899, NM_003903, NM_018386, BC008975, NM_023011, NM_001260, NM_006646, U50524, NM_033111, NM_024808, NM_014832, NM_006002, NM_015057, NM_024546, BC026126, NM_006493, NM_018210 and NM_017664, depicted in Table 12 and/or
the one or more marker genes, the expression level of which is altered by a chromosomal loss at chromosome 15q or which are located in a region of chromosomal loss at chromosome 15q, are selected from the group of marker genes consisting of NM_030574, NM_004255, NM_002573, AB033025, NM_033240, NM_000126, NM_015079, NM_015969 and NM_016073, depicted in Table 13 and/or
the one or more marker genes, the expression level of which is altered by a chromosomal loss at chromosome 17p or which are located in a region of chromosomal loss at chromosome 17p, are selected from the group of marker genes consisting of NM_130766, NM_015721 and NM_031430, depicted in Table 14 and/or
the one or more marker genes, the expression level of which is altered by a chromosomal loss at chromosome 18q or which are located in a region of chromosomal loss at chromosome 18q, are selected from the group of marker genes consisting of NM_004715, NM_006701 and NM_014913, depicted in Table 15 and/or
the one or more marker genes, the expression level of which is altered by a chromosomal gain at chromosome 20q or which are located in a region of chromosomal gain at chromosome 20q, are selected from the group of marker genes consisting of NM_016397, NM_018270, NM_006602, NM_080476, NM_017896, NM_006097, NM_021809, NM_018840, NM_003600, NM_017495, NM_007002, NM_016354, NM_014071, NM_002212, NM_003185, NM_152255, NM_022082, NM_018244, NM_014902, NM_032013, NM_020182, NM_006886, NM_020673, BC003122, NM_012325, NM_014183, NM_021100, NM_004738, NM_016045, NM_014054, NM_022105, NM_015666, NM_032527, BC025345, NM_033405, NM_006892, NM_005225, NM_000687, BC035639, NM_018677, NM_006047, NM_016436, NM_015511, NM_016082, NM_007238, NM_003908, NM_003610, NM_153360, NM_080425, NM_000114, NM_001853, NM_144498, NM_017798 and NM_012384, depicted in Table 16, and more particularly are selected from the group of marker genes consisting of NM_016397, NM_018270, NM_006602, NM_080476, NM_017896, NM_006097, NM_021809, NM_018840, NM_003600, NM_017495, NM_007002, NM_016354, NM_014071, NM_002212, NM_003185, NM_152255, NM_022082, NM_018244, NM_014902, NM_032013, NM_020182, NM_006886, NM_020673, BC003122, NM_012325, NM_014183, NM_021100, NM_004738, NM_016045, NM_014054, NM_022105, NM_015666, NM_032527, BC025345 and NM_033405, depicted in Table 16.
The present invention further discloses a set of marker genes of which a difference in expression level is correlated with the presence of three, four or even five different chromosomal aberrations (Table 18).
Accordingly, the present invention provides in vitro methods for detecting the presence of colorectal adenocarcinoma cells in a patient, the method comprising the steps of (a) detecting in a test sample of the patient the expression level of one or more marker genes selected from the marker genes depicted in Table 18 and (b) comparing the expression level of these marker genes used in step (a) to that of a control sample. An elevated or decreased expression level of these one or more marker genes in the test sample compared to the control sample, is indicative of the presence of adenocarcinoma cells in the patient.
Particular embodiments of the methods of the invention described above relate to methods wherein the marker genes are further selected based on their p-value or FDR value. More specifically, the marker genes have a p or FDR value below 0.05 or more particularly below 0.01, as indicated in Tables 1 to 9.
Further particular embodiments of the methods of the invention described above, relate to methods wherein the marker genes are further selected based on the size of the difference in expression between the adenocarcinoma and adenoma cells. More particularly, those markers are selected which are indicated as having a difference in expression level of at least ‘2’, more particularly, of at least ‘4’, in accordance with Tables 2 to 9.
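The two selection criteria (statistical significance and size of the expression difference) can be combined in a simple filter. The Python sketch below assumes that each Table row provides a marker identifier, an effect expressed as a log2 ratio and a p or FDR value, and that the thresholds ‘2’ and ‘4’ denote fold changes; the row layout, the names and the example rows are hypothetical.

```python
from typing import NamedTuple


class MarkerRow(NamedTuple):
    marker: str         # marker gene identifier
    effect_log2: float  # log2 expression ratio, adenocarcinoma vs. adenoma
    p_value: float      # p-value or FDR value


def select_markers(rows, max_p=0.05, min_fold=2.0):
    """Keep markers that are significant and sufficiently up- or down-regulated."""
    selected = []
    for row in rows:
        fold = 2.0 ** abs(row.effect_log2)  # size of the change, either direction
        if row.p_value < max_p and fold >= min_fold:
            selected.append(row.marker)
    return selected


# Invented example rows; they do not correspond to entries of the Tables.
example_rows = [
    MarkerRow("gene_a", 2.0, 0.001),
    MarkerRow("gene_b", -1.0, 0.03),
    MarkerRow("gene_c", 0.5, 0.001),
    MarkerRow("gene_d", 3.0, 0.2),
]
lenient = select_markers(example_rows)                             # p < 0.05, >= 2-fold
strict = select_markers(example_rows, max_p=0.01, min_fold=4.0)    # p < 0.01, >= 4-fold
```

With these invented rows, the lenient setting keeps gene_a and gene_b, while the stricter setting keeps only gene_a.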
Typically, in the in vitro methods of the present invention, the test sample is a sample which comprises or is suspected to comprise cells of the colorectal lesion. In one embodiment, the test sample is a sample from a biopsy or resection of the colorectal lesion. Alternatively, the test sample is selected from the group consisting of a sample of urine, blood, saliva, sweat and stool.
In particular embodiments of the methods of the present invention the control sample is a sample of the same tissue of a patient comprising colorectal adenoma cells and not colorectal carcinoma cells. In alternative embodiments the control sample is an adenoma standard.
In a particularly preferred embodiment the present invention relates to an in vitro method for detecting the presence of colorectal adenocarcinoma cells in a patient, the method comprising the steps of:
a) detecting in a test sample of said patient the expression level of at least the marker genes NM_017495 and NM_006602 of Table 8, 9 or 16, and
b) comparing the expression level of said marker genes used in step a) to that of a control sample,
wherein an elevated expression level of said at least 2 marker genes in said test sample, compared to said control sample, is indicative of the presence of colorectal adenocarcinoma cells in said patient.
A further elaboration of this preferred aspect of the invention relates to an in vitro method as described above comprising the steps of:
a) detecting in a test sample of said patient the expression level of at least the marker genes NM_017495, NM_006602, NM_018840, NM_003600, NM_018270, NM_007002 and NM_016397 of Table 8, 9 or 16, and
b) comparing the expression level of said marker genes used in step a) to that of a control sample,
wherein an elevated expression level of said at least 7 marker genes in said test sample, compared to said control sample, is indicative of the presence of colorectal adenocarcinoma cells in said patient.
Such methods allow adenomas to be correctly classified and distinguished from adenocarcinomas in at least 85%, preferably in at least 88%, of the cases examined.
In particular embodiments of the in vitro methods of the present invention, the expression level is determined at the DNA or RNA level. Alternative embodiments encompass the determination of marker gene expression at the protein level.
The methods of the present invention are particularly suitable for diagnosing the progression of a colorectal adenoma into a colorectal carcinoma.
A further aspect of the invention provides kits for detecting colorectal adenocarcinoma cells comprising agents for specifically detecting marker genes the expression of which is indicative of the presence of adenocarcinoma cells. The agents are optionally functionalised with a label, such as a label selected from a radioactive label, a magnetic label, an MRI contrast label, a chromophoric label, and an ultrasound label. In a particular embodiment, the kits comprise agents for the specific detection of the expression of at least 12 of the marker genes depicted in Table 1.
Alternatively, kits are provided for detecting colorectal adenocarcinoma cells comprising agents for specifically detecting the expression of at least one of the marker genes depicted in Table 17.
Specific embodiments of the invention relate to kits for detecting colorectal adenocarcinoma cells comprising agents for specifically detecting the expression of one or more marker genes, said kit comprising:
agents for specifically detecting three or more marker genes selected from the group depicted in Table 2 and/or,
agents for specifically detecting one or more marker genes selected from the group depicted in Table 3 and/or,
agents for specifically detecting two or more marker genes selected from the group depicted in Table 4 and/or,
agents for specifically detecting one or more marker genes selected from the group depicted in Table 5 and/or,
agents for specifically detecting one or more marker genes selected from the group depicted in Table 6 and/or,
agents for specifically detecting three or more marker genes selected from the group depicted in Table 7 and/or,
agents for specifically detecting seven or more marker genes selected from the group depicted in Table 8 and/or,
agents for specifically detecting seven or more marker genes selected from the group depicted in Table 9.
Further embodiments of the kits of the present invention relate to kits for detecting colorectal adenocarcinoma cells comprising a set of agents for specifically detecting the expression of at least one marker gene of each of the following groups: the marker genes depicted in Table 2, the marker genes depicted in Table 3, the marker genes depicted in Table 4, the marker genes depicted in Table 5, the marker genes depicted in Table 6, the marker genes depicted in Table 7, the marker genes depicted in Table 8 and the marker genes depicted in Table 9.
Further specific embodiments of the kits of the present invention described above provide agents for the specific detection of marker genes wherein said marker genes have a p or FDR value below 0.05, more specifically below 0.01, according to Tables 1 to 9.
Further specific embodiments of the kits of the present invention described above provide agents for the specific detection of marker genes wherein said marker genes have a difference in expression level between an adenoma and an adenocarcinoma cell of at least “2”, more particularly at least “4”, according to Tables 2 to 9 or 17.
In further specific embodiments of the kits described herein, the agents are oligonucleotides or antibodies. The oligonucleotides are optionally arrayed on a support. Further specific embodiments of the kits described herein relate to kits comprising antibodies, wherein the antibodies bind to proteins encoded by marker genes. In a particular embodiment, the antibodies bind to proteins encoded by the marker genes whereby the proteins are secreted proteins or proteins on the cell surface.
Yet a further aspect of the invention relates to in vivo methods for the detection of adenocarcinoma cells in colorectal tissue of a patient.
More specifically, the present invention provides in vivo methods for detecting adenocarcinoma cells in a colorectal lesion, said method comprising the steps of (a) contacting the colorectal lesion with a labelled agent capable of specifically binding or interacting with a protein expressed by a marker gene selected from the genes depicted in Table 17 and (b) detecting the binding or interaction of the agent with said colorectal lesion. The detection of a difference in the binding or interaction of said agent within loci of the colorectal lesion is indicative of the presence of adenocarcinoma cells in the lesion.
In particular embodiments, the marker gene used in the in vivo methods of the present invention is a gene encoding a protein located at the cell membrane, a secreted protein or an enzyme.
The present invention thus provides agents binding or interacting specifically with a marker gene depicted in Table 17 or 18, for use as an in vivo diagnostic for detecting the presence of an adenocarcinoma cell in a colorectal adenoma tissue.
Accordingly, the present invention also relates to the use of an agent binding or interacting specifically with a marker gene depicted in Table 17 or 18, in the manufacture of a diagnostic for detecting the presence of an adenocarcinoma cell in a colorectal tumour tissue.
The above and other characteristics, features and advantages of the present invention will become apparent from the following detailed description. This description is given for the sake of example only, without limiting the scope of the invention.
The present invention will be described with respect to particular embodiments and with reference to certain drawings but the invention is not limited thereto but only by the claims. Any reference signs in the claims shall not be construed as limiting the scope. The drawings described are only schematic and are non-limiting. In the drawings, the size of some of the elements may be exaggerated and not drawn to scale for illustrative purposes. Where the term “comprising” is used in the present description and claims, it does not exclude other elements or steps. Where an indefinite or definite article is used when referring to a singular noun e.g. “a” or “an”, “the”, this includes a plural of that noun unless something else is specifically stated.
Furthermore, the terms first, second, third and the like in the description and in the claims, are used for distinguishing between similar elements and not necessarily for describing a sequential or chronological order. It is to be understood that the terms so used are interchangeable under appropriate circumstances and that the embodiments of the invention described herein are capable of operation in other sequences than described or illustrated herein.
The following terms or definitions are provided solely to aid in the understanding of the invention. These definitions should not be construed to have a scope less than understood by a person of ordinary skill in the art.
“Tumour” or “neoplasm” as used herein refers to an abnormal tissue that grows by cellular proliferation more rapidly than normal, and continues to grow after the stimuli that initiated the new growth cease. The term “lesion”, generally referring to an abnormality involving any tissue or organ due to any disease or any injury, is also used herein to refer to a neoplasm. Tumours, neoplasms or lesions can be either benign or malignant.
“Cancer” is a general term referring to any type of malignant neoplasm.
“Adenoma” (or non-progressed adenoma), as used herein, relates to a benign epithelial neoplasm. Adenomas are usually well circumscribed; they can be flat or polypoid, and the neoplastic cells do not infiltrate or invade adjacent tissue.
“Adenocarcinoma”, as used herein, relates to a malignant neoplasm of epithelial cells. Most commonly carcinomas form a glandular or glandlike pattern. Synonyms are “glandular cancer” and “glandular carcinoma”. Malignant cells are often characterized by progressive and uncontrolled growth. They can spread locally or through the blood stream and lymphatic system to other parts of the body.
“Progressed adenoma” refers to an adenoma that harbours a focus of a cancer. This is also called a “malignant polyp”. Colorectal adenomas are common in the elderly population, but only a small proportion of these pre-malignant tumours (estimated approximately 5%) progresses to malignant tumours (i.e. colorectal adenocarcinoma).
“Colorectal” relates to the colon and/or the rectum, i.e. the complete large intestine.
“Chromosomal aberration”, as used herein refers to a chromosomal loss or gain, i.e. a region in the chromosome which has been deleted or duplicated.
“Marker gene”, as used herein is a gene of which expression differs (decreased or increased) between adenoma and adenocarcinoma cells, and which can thus be used to differentiate between adenoma and carcinoma cells.
“Expression profile” refers to the expression level of a number of marker genes in a cell or a sample.
The present invention provides diagnostic methods and tools for distinguishing between malignant and benign colorectal lesions and for diagnosing the presence of colorectal cancer in a subject. The diagnostic methods of the present invention allow a reliable detection of very early-stage colorectal cancers.
One aspect of the present invention relates to in vitro methods for detecting the presence of colorectal adenocarcinoma cells in a patient, the method comprising the steps of detecting in a test sample of that patient the expression level of one or more marker genes. The expression level of the marker gene(s) is compared to that of a control sample, whereby a difference in expression level of the marker gene(s) in the test sample compared to the expression level in the control sample, is indicative of the presence of colorectal adenocarcinoma cells in the patient.
In general, the marker genes used in the methods of the invention are differentially expressed between adenoma and carcinoma cells. A further selection can be made based on the size of the difference, i.e. the extent to which up- or down-regulation of a marker gene occurs (e.g. two-fold, four-fold or eight-fold or even more) in an adenocarcinoma sample. The expression levels are indicated in the Tables under the header “effect” as log-2 values. Accordingly a value of 1 refers to a 2-fold higher expression of a gene in an adenocarcinoma compared to an adenoma, a value of 2 refers to a 4-fold higher expression, a value of 3 refers to an 8-fold higher expression, etc. Similarly, a value of −1 refers to a 2-fold lower expression of a gene in an adenocarcinoma compared to an adenoma, a value of −2 refers to a 4-fold lower expression, a value of −3 refers to an 8-fold lower expression, etc. Alternatively, a further selection of the marker genes may also be made based on a calculation of the statistical significance (using p-value or FDR (False Discovery Rate) value) of the differential expression in adenoma and carcinoma cells of the marker. In particular embodiments, both selection criteria are used.
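The relation between the log-2 “effect” values and fold changes described above can be sketched as follows. This is a minimal illustration only; the function name and the convention of returning a negative number for lower expression are assumptions made for the example and do not appear in the Tables.

```python
def effect_to_fold_change(effect: float) -> float:
    """Convert a log-2 "effect" value into a signed fold change.

    A positive result means higher expression in adenocarcinoma than in
    adenoma; a negative result means lower expression (assumed convention).
    """
    fold = 2.0 ** abs(effect)
    return fold if effect >= 0 else -fold


# Effect values quoted in the text: 1, 2, 3 and -1, -2, -3.
for effect in (1, 2, 3, -1, -2, -3):
    print(effect, effect_to_fold_change(effect))
```

An effect of 3 thus corresponds to an 8-fold higher expression and an effect of −2 to a 4-fold lower expression, in line with the values given above.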
In one embodiment of the present invention, the presence of adenocarcinoma cells in a patient is identified by determining the expression of one or more marker genes of which the expression is changed (increased or decreased) more than two-fold, four-fold or eight-fold when comparing adenoma and adenocarcinoma cells, and/or of which the p-value for the correlation between marker gene expression and occurrence of adenoma or adenocarcinoma cells is lower than 0.01 (is statistically significant).
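The two selection criteria (size of the expression change and statistical significance) can be combined as sketched below. The gene labels and values are hypothetical and are not entries from the Tables; the class and function names are likewise assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Marker:
    name: str        # hypothetical gene label, not an entry from the Tables
    effect: float    # log-2 difference, adenocarcinoma versus adenoma
    p_value: float


def select_markers(markers, min_fold=2.0, max_p=0.01, require_both=True):
    """Keep markers changed more than min_fold and/or with p below max_p."""
    selected = []
    for m in markers:
        passes_fold = 2.0 ** abs(m.effect) > min_fold
        passes_p = m.p_value < max_p
        if require_both:
            keep = passes_fold and passes_p
        else:
            keep = passes_fold or passes_p
        if keep:
            selected.append(m)
    return selected


candidates = [
    Marker("GENE_A", 1.5, 0.001),
    Marker("GENE_B", 0.5, 0.001),
    Marker("GENE_C", -3.2, 0.2),
]
print([m.name for m in select_markers(candidates)])
print([m.name for m in select_markers(candidates, require_both=False)])
```

Setting `require_both=False` corresponds to the "or" reading of "and/or"; `min_fold` can be raised to 4 or 8 for the stricter thresholds.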
The sample used for detection in the in vitro methods of the present invention may be collected in any clinically acceptable manner, but is collected such that nucleic acids (in particular RNA) or proteins are preserved. The samples which are analysed according to the present invention are generally colorectal biopsies or resections. Intact cells or lysed cells from tumour tissue may also detach from the colon without intervention and will end up in the faeces. Accordingly, stool samples are also considered as a suitable source for isolating RNA. Furthermore, colorectal adenocarcinoma cells may migrate into other tissues. Consequently, blood and other types of sample can also be used. As it is the aim to detect the progression from colorectal adenoma to adenocarcinoma as early as possible, a biopsy or resection may contain a majority of adenoma cells and only a minority of adenocarcinoma cells. To increase the signal/background ratio, a resection can be divided into different sub-samples prior to analysis. Even if the total number of carcinoma cells in the biopsy or resection is limited, it can be expected that at least one of the sub-samples will contain an increased ratio of adenocarcinoma versus adenoma cells.
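The reasoning behind sub-sampling can be illustrated with simple arithmetic: the adenocarcinoma fraction of the whole resection is a weighted average of the sub-sample fractions, so at least one sub-sample has a fraction equal to or above it. The sketch below uses hypothetical cell counts that are not taken from any measurement.

```python
def carcinoma_fraction(adenoma_cells: int, carcinoma_cells: int) -> float:
    """Fraction of adenocarcinoma cells among all tumour cells counted."""
    return carcinoma_cells / (adenoma_cells + carcinoma_cells)


# Hypothetical (adenoma, adenocarcinoma) cell counts for four sub-samples
# of one resection; the numbers are illustrative only.
sub_samples = [(1000, 0), (900, 100), (1000, 0), (1000, 0)]

whole = carcinoma_fraction(
    sum(a for a, _ in sub_samples),
    sum(c for _, c in sub_samples),
)
best = max(carcinoma_fraction(a, c) for a, c in sub_samples)

print(f"whole resection: {whole:.3f}, richest sub-sample: {best:.3f}")
```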
In the in vitro methods of the present invention, the expression of one or more marker genes is determined, at RNA or protein level, in a sample of the patient and compared to the expression of these genes in a control sample. The control sample can be an adenoma sample from the same patient. Alternatively, the control sample is an adenoma sample obtained from another individual or obtained from pooled samples of colorectal adenomas from one or more other individuals. Controls can also be made artificially by pooling a set of nucleic acids or proteins to mimic the RNA/protein content of a colorectal adenoma tissue.
In a first embodiment, the marker genes characterised as being either up- or down-regulated in colorectal carcinoma cells compared to a control such as colorectal adenoma cells, are selected from Table 1. While any of the marker genes listed in Table 1 is suitable for use in the methods of the present invention for identifying the presence of colorectal adenocarcinoma cells in a sample, an important advantage of the present invention is the provision of an extensive list of suitable markers so as to allow an increased reliability of detection. Accordingly, particular embodiments of the invention relate to the use of at least 2, at least 5, at least 10, 12 or 15, at least 20, at least 50, at least 100, at least 200, at least 500 or all of the marker genes of Table 1. In a particular embodiment, a subset of marker genes of Table 1 is used, namely those marker genes of Table 1, which have a p value below 0.01.
In another embodiment, the marker genes characterised as being either up-regulated or down-regulated in colorectal carcinoma cells compared to a control such as colorectal adenoma cells, are selected from Table 17. None of the marker genes of Table 17 have previously been identified as marker genes of colorectal adenocarcinomas and each of the marker genes of Table 17 is suitable for use in the methods of the present invention for identifying the presence of colorectal adenocarcinoma cells in a sample. In particular embodiments the expression of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more of the genes identified in Table 17 is determined in a sample of the patient and compared to a control to identify the presence of adenocarcinoma cells in the patient.
According to a further embodiment of the methods of the present invention, subsets of marker genes of Table 1 are used of which the expression level is directly correlated with a particular chromosomal aberration in colorectal tumour cells.
In one embodiment, the present invention provides a set of marker genes of which the expression level is altered when a chromosomal loss occurs in colorectal adenoma cells at chromosome 8p. These marker genes are listed in Table 2. Accordingly any of the marker genes listed in Table 2 can be used in the methods and tools of the present invention for identifying colorectal carcinoma cells. In the methods and tools of the present invention at least 1, at least 2, at least 3, at least 5, at least 10, at least 20, at least 50, at least 100, at least 200 or all of the marker genes of Table 2 are used. Additionally or alternatively, the marker gene(s) used in the methods of the invention is/are selected from a subset of Table 2, namely those marker genes of Table 2, which have a FDR (False Discovery Rate) value below 0.05 or, more particularly below 0.01. Additionally or alternatively, the marker gene(s) used in the methods of the present invention is/are selected from a subset of Table 2 corresponding to those marker genes for which the expression level is increased in adenocarcinoma cells compared to adenoma cells. Additionally or alternatively, the marker gene(s) used in the methods of the present invention is/are selected from a subset of Table 2 corresponding to those marker genes for which the expression level in adenoma and adenocarcinoma cells differs with at least a factor 2, at least a factor 4 or at least a factor 8. Equally a subset can be created from Table 2 of marker genes having a difference in expression level (increase or decrease) between adenoma and adenocarcinoma of at least a factor 2, 4 or 8 and for which the FDR value is below 0.05 or below 0.01.
In another embodiment, the present invention provides a set of marker genes of which the expression level is altered when a chromosomal gain occurs in colorectal adenoma cells at chromosome 8q. These marker genes are listed in Table 3. Accordingly any of the marker genes listed in Table 3 can be used in the methods and tools of the present invention for identifying colorectal carcinoma cells. In the methods and tools of the present invention at least 1, at least 2, at least 3, at least 5, at least 10, at least 20, or all of the marker genes of Table 3 are used. Additionally or alternatively, the marker gene(s) used in the methods of the invention is/are selected from a subset of Table 3, namely those marker genes of Table 3, which have a FDR (False Discovery Rate) value below 0.05 or, more particularly below 0.01. Additionally or alternatively, the marker gene(s) used in the methods of the present invention is/are selected from a subset of Table 3 corresponding to those marker genes for which the expression level is increased in adenocarcinoma cells compared to adenoma cells. Additionally or alternatively, the marker gene(s) used in the methods of the present invention is/are selected from a subset of Table 3 corresponding to those marker genes for which the expression level in adenoma and adenocarcinoma cells differs with at least a factor 2, at least a factor 4 or at least a factor 8. Equally a subset can be created from Table 3 of marker genes having a difference in expression level (increase or decrease) between adenoma and adenocarcinoma of at least a factor 2, 4 or 8 and for which the FDR value is below 0.05 or below 0.01.
In another embodiment, the present invention provides a set of marker genes of which the expression level is altered when a chromosomal gain occurs in colorectal adenoma cells at chromosome 13q. These marker genes are listed in Table 4. Accordingly any of the marker genes listed in Table 4 can be used in the methods and tools of the present invention for identifying colorectal carcinoma cells. In one embodiment of the methods and tools of the present invention at least 1, at least 2, at least 3, at least 5, at least 10, at least 20, at least 50, at least 100 or all of the marker genes of Table 4 are used. Additionally or alternatively, the marker gene(s) used in the methods of the invention is/are selected from a subset of Table 4, namely those marker genes of Table 4, which have a FDR (False Discovery Rate) value below 0.05 or, more particularly below 0.01. Additionally or alternatively, the marker gene(s) used in the methods of the present invention is/are selected from a subset of Table 4 corresponding to those marker genes for which the expression level is increased in adenocarcinoma cells compared to adenoma cells. Additionally or alternatively, the marker gene(s) used in the methods of the present invention is/are selected from a subset of Table 4 corresponding to those marker genes for which the expression level in adenoma and adenocarcinoma cells differs with at least a factor 2, at least a factor 4 or at least a factor 8. Equally a subset can be created from Table 4 of marker genes having a difference in expression level (increase or decrease) between adenoma and adenocarcinoma of at least a factor 2, 4 or 8 and for which the FDR value is below 0.05 or below 0.01.
In another embodiment, the present invention provides a set of marker genes of which the expression level is altered when a chromosomal loss occurs in colorectal adenoma cells at chromosome 15q. These marker genes are listed in Table 5. Accordingly any of the marker genes listed in Table 5 can be used in the methods and tools of the present invention for identifying colorectal carcinoma cells. In a particular embodiment of the methods and tools of the present invention at least 1, at least 2, at least 3, at least 5, at least 10, at least 20, at least 30 or all of the marker genes of Table 5 are used. Additionally or alternatively, the marker gene(s) used in the methods of the invention is/are selected from a subset of Table 5, namely those marker genes of Table 5, which have a FDR (False Discovery Rate) value below 0.05 or, more particularly below 0.01. Additionally or alternatively, the marker gene(s) used in the methods of the present invention is/are selected from a subset of Table 5 corresponding to those marker genes for which the expression level is increased in adenocarcinoma cells compared to adenoma cells. Additionally or alternatively, the marker gene(s) used in the methods of the present invention is/are selected from a subset of Table 5 corresponding to those marker genes for which the expression level in adenoma and adenocarcinoma cells differs with at least a factor 2, at least a factor 4 or at least a factor 8. Equally a subset can be created of marker genes from Table 5 having a difference in expression level (increase or decrease) between adenoma and adenocarcinoma of at least a factor 2, 4 or 8 and for which the FDR value is below 0.05 or below 0.01.
In another embodiment, the present invention provides a set of marker genes of which the expression level is altered when a chromosomal loss occurs in colorectal adenoma cells at chromosome 17p. These marker genes are listed in Table 6. Accordingly, any of the marker genes listed in Table 6 can be used in the methods and tools of the present invention for identifying colorectal carcinoma cells. In a particular embodiment of the methods and tools of the present invention at least 1, at least 2, at least 3, at least 4, at least 6 or all of the marker genes of Table 6 are used. Additionally or alternatively, the marker gene(s) used in the methods of the invention is/are selected from a subset of Table 6, namely those marker genes of Table 6, which have a FDR (False Discovery Rate) value below 0.1. Additionally or alternatively, the marker gene(s) used in the methods of the present invention is/are selected from a subset of Table 6 corresponding to those marker genes for which the expression level is increased in adenocarcinoma cells compared to adenoma cells. Additionally or alternatively, the marker gene(s) used in the methods of the present invention is/are selected from a subset of Table 6 corresponding to those marker genes for which the expression level in adenoma and adenocarcinoma cells differs with at least a factor 2, at least a factor 4 or at least a factor 8. Equally a subset can be created of Table 6, of marker genes having a difference in expression level (increase or decrease) between adenoma and adenocarcinoma of at least a factor 2, 4 or 8 and for which the FDR value is below 0.1.
In another embodiment, the present invention provides a set of marker genes of which the expression level is altered when a chromosomal loss occurs in colorectal adenoma cells at chromosome 18q. These marker genes are listed in Table 7. Accordingly any of the marker genes listed in Table 7 can be used in the methods and tools of the present invention for identifying colorectal carcinoma cells. In a particular embodiment of the methods and tools of the present invention at least 1, at least 2, at least 3, at least 5, at least 10, at least 20, at least 50, at least 100, at least 200 or all of the marker genes of Table 7 are used. Additionally or alternatively, the marker gene(s) used in the methods of the invention is/are selected from a subset of Table 7, namely those marker genes of Table 7, which have a FDR (False Discovery Rate) value below 0.05 or, more particularly below 0.01. Additionally or alternatively, the marker gene(s) used in the methods of the present invention is/are selected from a subset of Table 7 corresponding to those marker genes for which the expression level is increased in adenocarcinoma cells compared to adenoma cells. Additionally or alternatively, the marker gene(s) used in the methods of the present invention is/are selected from a subset of Table 7 corresponding to those marker genes for which the expression level in adenoma and adenocarcinoma cells differs with at least a factor 2, at least a factor 4 or at least a factor 8. Equally a subset can be created of Table 7, of marker genes having a difference in expression level (increase or decrease) between adenoma and adenocarcinoma of at least a factor 2, 4 or 8 and for which the FDR value is below 0.05 or, more particularly below 0.01.
In another embodiment, the present invention provides a set of marker genes of which the expression level is altered when a chromosomal gain occurs in colorectal adenoma cells at chromosome 20q. These marker genes are listed in Tables 8 and 9, wherein the data in Table 8 are derived from a univariate analysis and the data in Table 9 from a multivariate analysis. Accordingly, any of the marker genes listed in Table 8 or 9 can be used in the methods and tools of the present invention for identifying colorectal carcinoma cells. In a particular embodiment of the methods and tools of the present invention at least 1, at least 2, at least 4, at least 6, at least 8, at least 10, at least 15, at least 20, at least 25, at least 50, at least 100, at least 200, at least 400, at least 600 or all of the marker genes of Table 8 or 9 are used. Additionally or alternatively, the marker gene(s) used is/are selected from a subset of Table 8 or 9, namely those marker genes of Table 8 or 9, which have a FDR value below 0.05 or more particularly below 0.01. Additionally or alternatively, the marker gene(s) used in the methods of the present invention is/are selected from a subset of Table 8 or 9 corresponding to those marker genes for which the expression level is increased in colorectal adenocarcinoma cells compared to adenoma cells. Additionally or alternatively, the marker gene(s) is/are selected from another subset of Table 8 or 9, namely those marker genes for which the expression level in adenoma and adenocarcinoma differs by at least a factor 2, at least a factor 4 or at least a factor 8. Equally subsets of marker genes can be created of Table 8 or 9, wherein the marker genes have a difference in expression levels (increase or decrease) between adenoma and adenocarcinoma of at least a factor 2, 4 or 8 and an FDR value below 0.05 or below 0.01.
In a particularly preferred embodiment the present invention relates to an in vitro method for detecting the presence of colorectal adenocarcinoma cells in a patient, the method comprising the steps of:
a) detecting in a test sample of said patient the expression level of at least the marker genes NM_017495 and NM_006602 of Table 8, 9 or 16, and
b) comparing the expression level of said marker genes used in step a) to that of a control sample,
wherein an elevated expression level of said at least 2 marker genes in said test sample, compared to said control sample, is indicative of the presence of colorectal adenocarcinoma cells in said patient.
A further elaboration of this preferred aspect of the invention relates to an in vitro method according to claim 1 comprising the steps of:
a) detecting in a test sample of said patient the expression level of at least the marker genes NM_017495, NM_006602, NM_018840, NM_003600, NM_018270, NM_007002 and NM_016397 of Table 8, 9 or 16, and
b) comparing the expression level of said marker genes used in step a) to that of a control sample,
wherein an elevated expression level of said at least 7 marker genes in said test sample, compared to said control sample, is indicative of the presence of colorectal adenocarcinoma cells in said patient.
A gain of chromosomal arm 20q can be observed in colorectal adenoma cells in more than 50% or in even more than 60% of cases. For such a gain of chromosomal arm 20q, the above-mentioned marker genes NM_017495, NM_006602, NM_018840, NM_003600, NM_018270, NM_007002 and NM_016397 of Tables 8, 9 or 16 were found to be overexpressed in adenocarcinomas vs. adenomas to allow correctly distinguishing adenomas from adenocarcinomas in at least 85%, preferably in at least 88% of cases examined.
The above described sets of marker genes listed in Tables 2 to 9 comprise marker genes which are linked to a specific type of chromosomal aberration. It is however possible that a certain gene is upregulated by different chromosomal aberrations. Accordingly Tables 2 to 9 show a certain degree of redundancy. The marker genes which occur more than once in Tables 2 to 9 are particularly suitable for use in the methods of the present invention, as they are more generally characteristic of the occurrence of ‘a’ chromosomal aberration. Marker genes of which the expression level is altered by 3, 4 or 5 different chromosomal aberrations are presented in Table 18. Thus, a particular embodiment of the methods of the present invention relates to the use of at least 1, at least 2, at least 4, at least 6, at least 8, at least 10, at least 15, at least 20, at least 25, at least 50, at least 100, at least 200, at least 400, at least 600 or all of the marker genes of Table 18, for the identification of colorectal adenocarcinoma cells in a patient. Additionally or alternatively, the marker gene(s) used in the methods of the present invention is/are selected from a subset of Table 18 corresponding to those marker genes for which the expression level is increased in adenocarcinoma cells compared to adenoma cells.
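The identification of marker genes that recur across several chromosomal aberrations can be sketched as a simple count. The gene labels below are hypothetical and the function name is an assumption; the example only shows how genes linked to 3 or more aberrations could be collected, and does not reproduce how Table 18 was actually compiled.

```python
from collections import Counter


def recurrent_markers(genes_by_aberration, min_aberrations=3):
    """Return genes linked to at least min_aberrations different aberrations."""
    counts = Counter()
    for genes in genes_by_aberration.values():
        counts.update(set(genes))  # count each gene once per aberration
    return {gene: n for gene, n in counts.items() if n >= min_aberrations}


# Hypothetical gene lists keyed by aberration (illustrative only).
genes_by_aberration = {
    "8p loss": ["GENE_A", "GENE_B"],
    "8q gain": ["GENE_A", "GENE_C"],
    "13q gain": ["GENE_A", "GENE_B"],
    "20q gain": ["GENE_B", "GENE_C", "GENE_C"],
}
print(recurrent_markers(genes_by_aberration))
```

The lists are keyed by aberration rather than by table, so that genes appearing in both Table 8 and Table 9 (both 20q gain) would be counted once.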
The present invention not only provides marker genes differentially expressed between colorectal adenoma and adenocarcinoma cells, but further correlates this differential expression to the presence of a chromosomal aberration. This has allowed the development of a diagnostic tool using a set of marker genes representative for each of the chromosomal aberrations.
Thus further particular embodiments relate to sets of marker genes which comprise marker genes indicative of each of the chromosomal aberrations. More particularly, such sets of marker genes comprise:
at least 1, at least 2, at least 5, at least 10 or more marker genes selected from Table 2 (marker genes correlated with 8p loss) and
at least 1, at least 2, at least 5, at least 10 or more marker genes selected from Table 3 (marker genes correlated with 8q gain) and
at least 1, at least 2, at least 5, at least 10 or more marker genes selected from Table 4 (marker genes correlated with 13q gain) and
at least 1, at least 2, at least 5, at least 10 or more marker genes selected from Table 5 (marker genes correlated with 15q loss), and
at least 1, at least 2, at least 5, 10 or more marker genes selected from Table 6 (marker genes correlated with 17p loss) and
at least 1, at least 2, at least 5, at least 10 or more marker genes selected from Table 7 (marker genes correlated with 18q loss) and
at least 1, at least 2, at least 5, at least 10 or more marker genes selected from Table 8 or 9 (marker genes correlated with 20q gain).
Additionally or alternatively, the marker gene(s) used in the methods of the present invention is/are selected from a subset of the sets of marker genes listed above corresponding to those marker genes for which the expression level is increased in colorectal adenocarcinoma cells compared to adenoma cells.
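Assembling a set that represents each of the chromosomal aberrations listed above can be sketched as follows. The sketch assumes that each list is already ordered by preference (for example by FDR value); the gene labels and the function name are hypothetical.

```python
def build_panel(markers_by_aberration, per_aberration=2):
    """Pick the first per_aberration genes from every aberration-specific list.

    Assumes each list is pre-sorted so that preferred markers come first.
    """
    panel = {}
    for aberration, genes in markers_by_aberration.items():
        if len(genes) < per_aberration:
            raise ValueError(
                f"{aberration}: only {len(genes)} marker gene(s) available"
            )
        panel[aberration] = list(genes[:per_aberration])
    return panel


# Hypothetical ranked lists (illustrative only, not taken from the Tables).
ranked = {
    "8p loss": ["G1", "G2", "G3"],
    "20q gain": ["G4", "G5"],
}
print(build_panel(ranked, per_aberration=2))
```

Raising an error when a list is too short makes it explicit that a smaller aberration-specific list cannot supply the requested number of genes.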
In other embodiments of the invention, the sets of marker genes indicative of each of the chromosomal aberrations used in the methods and tools of the present invention comprise genes which are located within the chromosomal region itself which is gained or lost. Accordingly, the sets of marker genes comprise:
at least 1, at least 2, at least 5, at least 10 or all of the genes associated with chromosomal loss at 8p, listed in Table 10. More particularly, one or more marker genes are selected from the group consisting of NM_020749 and NM_004315 and/or selected from the group consisting of NM_003747, NM_016353, NM_152415, NM_006197, NM_000662, NM_000015 and D31887, and
at least 1, at least 2, at least 5, at least 10 or all of the genes associated with chromosomal gain at 8q, listed in Table 11. More particularly one or more genes are selected from the group consisting of NM_138455, NM_032611, NM_032862, and
at least 1, at least 2, at least 5, at least 10 or all of the genes associated with chromosomal gain at 13q, listed in Table 12, particularly NM_145293 and/or NM_005358. More particularly one or more genes selected from the group consisting of U50531, NM_012158, NM_017817, NM_003899, NM_003903, NM_018386, BC008975 and NM_023011, and
at least 1, at least 2, at least 5, at least 10 or all of the genes associated with chromosomal loss at 15q listed in Table 13. More particularly NM_030574 or one or more of the genes selected from NM_004255, NM_002573 and AB033025, and
at least 1, at least 2, at least 5, at least 10 or all of the genes associated with chromosomal loss at 17p listed in Table 14. More particularly one or more genes selected from the group consisting of NM_130766, NM_015721 and NM_031430, and
at least 1, at least 2, at least 5, at least 10 or all of the genes associated with chromosomal loss at 18q and listed in Table 15. More particularly one or more marker genes selected from the group consisting of NM_004715, NM_006701 and NM_014913, and
at least 1, at least 2, at least 5, at least 10, at least 25, at least 50, at least 100 or all of the genes associated with chromosomal gain at 20q and listed in Table 16. More particularly one or more marker genes selected from NM_016397 and NM_018270 and NM_006602. Other particular marker genes are one or both of NM_080476 and NM_017896. Other particular marker genes are one or more selected from the group consisting of NM_006097, NM_021809, NM_018840, NM_003600, NM_017495, NM_007002 and NM_016354. Other particular marker genes are one or more selected from the group consisting of NM_014071, NM_002212, NM_003185, NM_152255 and NM_022082. Yet other particular marker genes are one or more selected from the group consisting of NM_018244, NM_014902, NM_032013, NM_020182, NM_006886, NM_020673 and BC003122.
Additionally or alternatively, the marker gene(s) used in the methods of the present invention is/are selected from a subset of the genes listed above, corresponding to those marker genes for which the expression level is increased in colorectal adenocarcinoma cells compared to adenoma cells.
In general the marker genes described above are used in an in vitro method for detecting the presence of colorectal adenocarcinoma cells in a patient, the method comprising the steps of detecting in a test sample of that patient the expression level of one or more marker genes as defined in the above sets and subsets, and comparing the expression level of the used marker gene(s) to that of a control sample (e.g. colorectal adenoma cells). An elevated or decreased expression level of the used marker gene(s) in the test sample, compared to the control sample, is indicative of the presence of colorectal adenocarcinoma cells in the patient.
The marker genes of the present invention are used in DNA, RNA or protein based expression analysis. A significant advantage of the marker genes of the present invention is that they are suitable for use in objective quantitative diagnostics. Methods based on markers identified in the prior art were limited because they relied on the examination of small numbers of histological samples comprising carcinoma cells.
In a particular embodiment, the marker genes are used in DNA or RNA-based expression analysis, which provides the advantage of increased sensitivity typical for DNA or RNA-based diagnostics. The methods and marker genes of the present invention allow discrimination of colorectal adenoma cells from carcinoma cells at a very early stage of the carcinogenic process.
The determination of the expression level of marker genes in a patient sample may be accomplished by any means known in the art. For example, expression levels of various marker genes may be assessed by separation of nucleic acid molecules (e.g. RNA or cDNA) obtained from the sample in agarose or polyacrylamide gels, followed by hybridisation with marker gene specific oligonucleotide probes. Alternatively, the difference in expression level may be determined by the labelling of nucleic acid obtained from the sample followed by separation on a sequencing gel. Nucleic acid samples are placed on the gel such that patient and control or standard nucleic acid are in adjacent lanes. Comparison of expression levels is accomplished visually or by means of a densitometer.
In a particular embodiment, the expression of all marker genes used in the assay is assessed simultaneously by hybridisation to a DNA array (also called “microarray” or “DNA chip”). Microarray based expression profiling can be carried out, for example, by the method as disclosed in “Microarray Biochip Technology” (Schena M., Eaton Publishing, 2000). A DNA array comprises immobilised high-density probes to detect a number of genes. The probes on the array are complementary to one or more parts of the sequence of a marker gene, or to the entire coding region of the marker gene. In the present invention, any type of polynucleotide can be used as probes for the DNA array. Typically, cDNAs, PCR products, and oligonucleotides are useful as probes. Thus, expression levels of a plurality of genes can be estimated at the same time by a single-round analysis.
A DNA array-based detection method generally comprises the following steps:
1) Isolating mRNA from a sample and optionally converting the mRNA to cDNA, and subsequently labelling this RNA or cDNA. Methods for isolating RNA, converting it into cDNA and for labelling nucleic acids are described in manuals for microarray technology.
2) Hybridising the nucleic acids from step 1 with probes for the marker genes. The nucleic acids from a sample can be labelled with a dye, such as the fluorescent dyes Cy3 (red) or Cy5 (blue). Generally a control sample is labelled with a different dye.
3) Detecting the hybridisation of the nucleic acids from the sample with the probes and determining at least qualitatively, and more particularly quantitatively, the amounts of mRNA in the sample for the different marker genes investigated. The difference in the expression level between sample and control can be estimated based on a difference in the signal intensity. These can be measured and analysed by appropriate software such as, but not limited to, the software provided by Affymetrix.
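The estimation of expression differences from signal intensities in step 3 can be sketched as a per-probe log-2 ratio of sample signal to control signal. The background subtraction, the lower floor on corrected signals and the probe names are assumptions made for this illustration; they do not describe the processing performed by any particular vendor software.

```python
import math


def log2_ratios(sample_signal, control_signal, background=0.0, floor=1.0):
    """Per-probe log-2 ratio of sample to control signal intensity.

    A constant background is subtracted from both channels and corrected
    values are clipped at floor to avoid taking the logarithm of zero or
    of a negative number (both are assumptions for this sketch).
    """
    ratios = {}
    for probe, raw_sample in sample_signal.items():
        sample = max(raw_sample - background, floor)
        control = max(control_signal[probe] - background, floor)
        ratios[probe] = math.log2(sample / control)
    return ratios


# Hypothetical intensities for two probes (illustrative only).
sample = {"probe_1": 850.0, "probe_2": 150.0}
control = {"probe_1": 150.0, "probe_2": 450.0}
print(log2_ratios(sample, control, background=50.0))
```

The resulting values are on the same log-2 scale as the “effect” values, so a ratio of 3 corresponds to an 8-fold higher signal in the sample than in the control.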
In a particular embodiment, the probes on an array can be arranged, for example according to marker genes correlated with a particular type of chromosomal aberration. Alternatively different arrays can be developed, each carrying probes for the detection of a particular type of chromosomal aberration.
There is no limitation on the number of probes corresponding to marker genes which are spotted on a DNA array. For example, one may select 1% or more, 5% or more, 20% or more, 50% or more, 70% or more of any of the sets of the marker genes of the present invention. Also a marker gene can be represented by two or more probes hybridising to different parts of a gene. Probes are designed for each selected marker gene. Such a probe is typically an oligonucleotide comprising 5-50 nucleotide residues. Longer DNAs can be synthesised by PCR or chemically. Methods for synthesising such oligonucleotides and applying them on a substrate are well known in the field of micro-arrays.
Genes other than the marker genes may be also spotted on the DNA array. For example, a probe for a gene whose expression level is not significantly altered may be spotted on the DNA array to normalise assay results or to compare assay results of multiple arrays or different assays.
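The use of reference probes for normalisation can be sketched as dividing each marker signal by the mean signal of the reference probes on the same array. The probe names and intensities are hypothetical, and scaling by the mean reference signal is only one of several possible normalisation schemes.

```python
def normalise_to_reference(signals, reference_probes):
    """Scale marker signals by the mean signal of the reference probes."""
    ref_mean = sum(signals[p] for p in reference_probes) / len(reference_probes)
    return {
        probe: value / ref_mean
        for probe, value in signals.items()
        if probe not in reference_probes
    }


# Two hypothetical arrays measuring the same sample at different overall
# brightness; after normalisation the marker values become comparable.
array_1 = {"ref_1": 100.0, "ref_2": 300.0, "marker_1": 400.0}
array_2 = {"ref_1": 50.0, "ref_2": 150.0, "marker_1": 200.0}
references = ("ref_1", "ref_2")

print(normalise_to_reference(array_1, references))
print(normalise_to_reference(array_2, references))
```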
In particular embodiments of the methods of the present invention, the expression level of particular marker genes is assessed by determining the amount of protein expressed by the respective marker genes. For the analysis at the protein level, every marker gene described in the present invention can in principle be used, although some proteins may be less suitable, because of factors such as limited solubility, very high or low molecular weight or extreme isoelectric point.
Determination of expression level of a marker gene at the protein level can be accomplished, for example, by the separation of proteins from a sample on a polyacrylamide gel, followed by identification of a specific marker gene-derived protein using antibodies in a Western blot. Alternatively, proteins can be separated by two-dimensional gel electrophoresis systems. Two-dimensional gel electrophoresis is well-known in the art and typically involves isoelectric focusing along a first dimension followed by SDS-PAGE along a second dimension. The analysis of 2D SDS-PAGE gels can be performed by determining the intensity of protein spots on the gel, or can be performed using immune detection. In other embodiments, protein samples are analysed by mass spectrometry.
The samples used in in vitro assays generally will be colorectal biopsies or resections, more particularly adenomatous polyp biopsies or resections. For in vitro protein expression analysis, cells or cell lysates of biopsies or resections can be used. Accordingly, the localisation of the protein in the cell or the function of the protein to be assayed is of no importance for the analysis. The presence of adenocarcinoma cells in a patient is expected to be reflected by the presence of elevated or decreased levels of certain proteins secreted by adenocarcinoma cells. Such proteins can be present in blood, urine, sweat and other parts of the body. Equally, adenocarcinoma cells will release proteins to the colon lumen. In addition, intact adenocarcinoma cells or their lysed content may be released to the intestinal tract, and will be present in the faeces which can be used as a source for in vitro protein analysis. However, contrary to nucleic acids, proteins cannot be amplified. Accordingly it is envisaged that, in particular embodiments, the methods of the invention comprise an enrichment step, more particularly an enrichment of adenocarcinoma material. For instance a sample can be contacted with ligands specific for the cell membrane or organelles of adenoma and adenocarcinoma cells, functionalised for example with magnetic particles. The material concentrated by the magnetic particles can then be analysed for the detection of marker proteins.
In another aspect of the invention, the marker genes of the present invention are used for the differential detection of colorectal adenomas and adenocarcinomas in an in vivo analysis. The invention accordingly provides in vivo methods for detecting adenocarcinoma cells in a colorectal adenoma tissue, said method comprising the steps of:
contacting a labelled agent capable of specifically binding or interacting with a protein expressed by a marker gene with a colorectal lesion, and
detecting the binding or interaction of the labelled agent with cells in the lesion;
whereby the detection of a locus in the lesion characterized by a difference in the binding or interaction of the labelled agent is indicative for the presence of adenocarcinoma cells in the colorectal adenoma.
The present invention provides a set of markers which have not been previously identified as markers for colorectal carcinomas and which can be used in the in vivo methods of the present invention (shown in Table 17). Any one of these markers is suitable for use in the in vivo diagnostic methods of the present invention. A particular embodiment of the invention provides a set of marker gene(s) which is/are selected from a subset of Table 17, corresponding to those marker genes for which the expression level is increased in colorectal adenocarcinoma cells compared to adenoma cells.
In a particular embodiment of the methods of the invention, a set of marker genes is analysed, so as to increase the reliability of detection. More particularly, according to a particular embodiment, a set of agents is used whereby each agent detects a marker associated with a different type of chromosomal aberration linked to colorectal adenocarcinoma.
Thus, according to one embodiment, a set of marker genes is used for the in vivo identification of the presence of carcinoma cells, which set of marker genes comprises marker genes indicative of each of the chromosomal aberrations. More particularly, such a set of marker genes comprises:
one or more genes selected from Table 2 (marker genes correlated with 8p loss) and
one or more genes selected from Table 3 (marker genes correlated with 8q gain) and
one or more genes selected from Table 4 (marker genes correlated with 13q gain) and
one or more genes selected from Table 5 (marker genes correlated with 15q loss) and
one or more genes selected from Table 6 (marker genes correlated with 17p loss) and
one or more genes selected from Table 7 (marker genes correlated with 18q loss) and
one or more genes selected from Table 8 or 9 (marker genes correlated with 20q gain).
Additionally or alternatively, sets of marker genes are provided for use in the in vivo methods of the invention which is/are selected from the sets of marker genes described above, corresponding to those marker genes for which the expression level is increased in colorectal adenocarcinoma cells compared to adenoma cells.
In other embodiments of the in vivo diagnostic methods of the invention, the sets of marker genes indicative of each of the chromosomal aberrations comprise genes which are located within the chromosomal regions themselves which are gained or lost. Accordingly, the set of marker genes comprises:
one or more genes selected from the marker genes listed in Table 10. More particularly, one or more marker genes are selected from the group consisting of NM_020749 and NM_004315 and/or selected from the group consisting of NM_003747, NM_016353, NM_152415, NM_006197, NM_000662, NM_000015 and D31887, and
one or more genes selected from the marker genes listed in Table 11. More particularly one or more genes are selected from the group consisting of NM_138455, NM_032611 and NM_032862, and
one or more genes selected from the marker genes listed in Table 12, particularly NM_145293 and/or NM_005358. More particularly one or more genes selected from the group consisting of U50531, NM_012158, NM_017817, NM_003899, NM_003903, NM_018386, BC008975 and NM_023011, and
one or more genes selected from the marker genes listed in Table 13. More particularly NM_030574 or one or more of the genes selected from NM_004255, NM_002573 and AB033025, and
one or more genes selected from the marker genes listed in Table 14. More particularly one or more genes selected from the group consisting of NM_130766, NM_015721 and NM_031430, and
one or more genes selected from the marker genes listed in Table 15. More particularly one or more marker genes selected from the group consisting of NM_004715, NM_006701 and NM_014913, and
one or more genes selected from the marker genes listed in Table 16. More particularly one or more marker genes selected from NM_016397 and NM_018270 and NM_006602. Other particular marker genes are one or both of NM_080476 and NM_017896. Other particular marker genes are one or more selected from the group consisting of NM_006097, NM_021809, NM_018840, NM_003600, NM_017495, NM_007002 and NM_016354. Other particular marker genes are one or more selected from the group consisting of NM_014071, NM_002212, NM_003185, NM_152255 and NM_022082. Yet other particular marker genes are one or more selected from the group consisting of NM_018244, NM_014902, NM_032013, NM_020182, NM_006886, NM_020673 and BC003122.
Additionally or alternatively, sets of marker genes are provided for use in the in vivo methods of the invention which is/are selected from the sets of marker genes described above, corresponding to those marker genes for which the expression level is increased in colorectal adenocarcinoma cells compared to adenoma cells.
Analysis of the data of the present invention shows that the altered expression of one marker gene can be caused by different chromosomal aberrations. Table 18 shows marker genes which are altered by three, four, or even five different aberrations. According to a particular embodiment, the in vivo methods of the present invention are carried out using one or more marker genes selected from the marker genes listed in Table 18. Additionally or alternatively, the in vivo methods of the present invention are carried out using one or more marker genes selected from the marker genes listed in Table 18, corresponding to those marker genes for which the expression level is increased in colorectal adenocarcinoma cells compared to adenoma cells.
The digestive tract is well accessible for the administration of diagnostic compounds in a patient. In addition, colonoscopy can be used for the detection of labelled compounds. In vivo diagnostic techniques further allow the analysis of different polyps without collection of individual biopsies.
Marker genes which are particularly suitable for in vivo imaging are marker genes encoding proteins which are located on the cell surface or which are secreted. Alternatively, the detection of proteins encoded by marker genes is envisaged for proteins which remain in the cytoplasm or nucleus. Proteins can be detected by antibodies, binding peptides/proteins (e.g. receptor ligands, or parts thereof), and if appropriate by metabolites, enzyme substrates, substrate analogues and enzyme inhibitors. A number of proteins which are located inside a cell can be indirectly detected using compounds which are internalised by a cell, e.g. enzyme substrates or analogues, enzyme inhibitors or certain metabolites which are incorporated into a cell. Accordingly, adenocarcinoma cells can be identified in vivo by detecting a decreased or increased expression or activity of a protein encoded by a marker gene when compared to adenoma cells.
Any of the compounds described above as suitable for the in vivo detection of proteins can be further functionalised with chromophoric agents, MRI labels, labels for ultrasound detection, radioactive compounds or other tools for facilitating in vivo imaging.
Yet another aspect of the present invention relates to kits for performing the in vivo and/or in vitro detection methods of the present invention. Typically, the kits of the present invention contain one or more agents allowing the specific detection of one or more marker genes disclosed herein. In particular embodiments the kits comprise a set of agents which allow detection of a set of marker genes described herein. The nature of the agents is determined by the method of detection for which the kit is intended. Where detection at the DNA/RNA level is intended, the agents are typically marker-specific primers or probes, which are optionally labelled. Where detection is at the protein level, agents are typically antibodies or compounds containing an antigen-binding fragment of an antibody. However, as described above, protein expression can also be detected using other compounds which specifically interact with the marker of interest, such as specific substrates (in case of enzymes) or ligands (for receptors).
In particular embodiments the kits of the invention also comprise a control sample, in accordance with the methods of the present invention.
Particular and preferred aspects of the invention are set out in the accompanying independent and dependent claims. Features from the dependent claims may be combined with features of the independent claims and with features of other dependent claims as appropriate and not merely as explicitly set out in the claims.
Other arrangements of the systems and methods embodying the invention will be obvious to those skilled in the art.
It is to be understood that although preferred embodiments, specific constructions and configurations, as well as materials, have been discussed herein for devices according to the present invention, various changes or modifications in form and detail may be made without departing from the scope and spirit of this invention.
Seventy-three snap-frozen colorectal tumours (37 non-progressed adenomas and 36 carcinomas) were collected prospectively at the VU-University Medical Center (VUmc), Amsterdam, The Netherlands. All samples were used in compliance with the institution's ethical regulations.
The 73 frozen specimens corresponded to 65 patients (31 females and 34 males). Of these, 6 patients had multiple tumours: 4 patients had multiple adenomas and 2 patients had 1 or more adenomas next to a carcinoma. The mean age of the patients was 69 (range 47-89).
Array-CGH and expression microarrays were done on the frozen set.
RNA from snap-frozen tissues was isolated with TRIzol reagent (Invitrogen, Breda, NL) following the supplier's instructions. Both RNA and DNA concentration and purity were measured in a Nanodrop ND-1000 spectrophotometer (Isogen, IJsselstein, NL) and integrity was evaluated in a 1% agarose gel, stained with ethidium bromide.
Expression microarrays
The Human Release 2.0 oligonucleotide library, containing 60-mer oligonucleotides representing 28830 unique genes, designed by Compugen (San Jose, Calif., USA) was obtained from Sigma-Genosys (Zwijndrecht, The Netherlands). The oligonucleotides were dissolved at 10 mM concentration in 50 mM sodium phosphate buffer pH 8.5 and single spotted onto Codelink™ slides (Amersham Biosciences, Roosendaal, NL), using the OmniGrid® 100 microarrayer (Genomic Solutions, Ann Arbor, Mich., USA) equipped with SMP3 pins (TeleChem International, Sunnyvale, Calif., USA). After printing, slides were processed according to the manufacturer's protocol (Codelink™ slides; Amersham BioSciences, Roosendaal, NL).
First, 30 μg messenger RNA was reverse-transcribed to cDNA using SuperScript™ II Reverse transcriptase (Invitrogen, Breda, NL) with oligo-dT priming (Isogen, IJsselstein, NL). (In samples with a limiting amount of RNA, less was used as starting material, but at least 15 μg.) cDNA was coupled to Fluorolink Cy3 and Cy5 Monofunctional Dye 5-pack (Amersham BioSciences, Roosendaal, NL). Cy3 labelled tumour cDNA and Cy5 labelled reference cDNA were combined and co-precipitated with 12 μg pd(A)40-60 (Amersham BioSciences, Roosendaal, NL), 60 μg of tRNA (Sigma-Aldrich, Zwijndrecht, NL) and 24 μg of human Cot-1 DNA (Invitrogen, Breda, NL) by adding 0.1 volume of 3 M sodium acetate (pH 5.2) and 2.5 volumes of ice-cold 100% ethanol. The precipitate was collected by centrifugation at 14,000 rpm for 10 minutes at 4° C. After air-drying, the pellet was dissolved in 126.7 μl hybridisation mixture with a final concentration of 50% formamide, 2×SSC, 10% dextran sulfate and 4% SDS. The cDNA samples were denatured for 10 minutes at 73° C. followed by a 60-minute incubation at 37° C. to allow the Cot-1 DNA to block repetitive sequences. The arrays were incubated for 14 h at 37° C. with the denatured and blocked hybridisation mixture in a hybridisation station (HybArray12™, Perkin Elmer Life Sciences, Zaventem, BE). After hybridisation, slides were washed in a solution containing 50% formamide, 2×SSC, pH 7 for 3 minutes at 45° C., followed by 1-minute wash steps at room temperature with PN buffer (PN: 0.1 M sodium phosphate, 0.1% nonidet P40, pH 8), 0.2×SSC, 0.1×SSC and 0.01×SSC. Slides were dried by centrifugation at 1000 rpm for 3 min at room temperature.
Images of the arrays were acquired by scanning (Agilent DNA Microarray scanner, Agilent Technologies, Palo Alto, USA) and Imagene 5.6 software (Biodiscovery Ltd, Marina del Rey, Calif.) was used for automatic feature extraction (segmentation of the spots and quantification of the signal and background intensities for each spot for the two channels Cy3 and Cy5). Default settings for the flagging of poor quality spots were used. A Microsoft Excel sheet was used to subtract local background from the signal median intensities of both test and reference cDNA. MA-plots of intensities of raw data were made to judge overall quality and flag out poor quality experiments. Normalization was done either with TIGR Midas, using “loess” correction, or with “median” normalization as implemented in the maNorm function (marray R Bioconductor package), with identical results. Inter-array normalization was also performed for clustering purposes. Low intensity values were replaced by the intensity value of 50. Genes with more than 20% missing values in all tumours were excluded from further analysis.
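The pre-processing steps described above can be illustrated with a short sketch. It is not the Imagene, Excel, TIGR Midas or maNorm implementation: the function names and spot intensities are hypothetical, only the median variant of the normalization is shown, and it is assumed that the low-intensity replacement is applied to the background-corrected value.

```python
import math
from statistics import median

LOW_INTENSITY_FLOOR = 50.0  # replacement value for low intensities, as stated in the text

def background_correct(signal, background):
    """Subtract local background; values below the floor are replaced by the floor (assumption)."""
    return max(signal - background, LOW_INTENSITY_FLOOR)

def log_ratio(tumour, reference):
    """M value: log2 of tumour (Cy3) over reference (Cy5) intensity."""
    return math.log2(tumour / reference)

def median_normalise(m_values):
    """Centre the log ratios of one array on their median."""
    centre = median(m_values)
    return [m - centre for m in m_values]

def keep_gene(values, max_missing_fraction=0.20):
    """Keep a gene unless more than 20% of its values across tumours are missing (None)."""
    missing = sum(1 for v in values if v is None)
    return missing / len(values) <= max_missing_fraction

# Hypothetical spots: (Cy3 signal, Cy3 background, Cy5 signal, Cy5 background)
spots = [
    (1700.0, 100.0, 500.0, 100.0),
    (500.0, 100.0, 300.0, 100.0),
    (120.0, 100.0, 150.0, 50.0),  # Cy3 falls below the floor after subtraction
]
m_raw = [log_ratio(background_correct(s3, b3), background_correct(s5, b5))
         for s3, b3, s5, b5 in spots]
m_norm = median_normalise(m_raw)
```

Working on log ratios keeps up- and down-regulation symmetric around zero, so that centring on the median removes an overall dye or labelling bias of the array.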
Unsupervised cluster analysis was done using the software TIGR MultiExperiment Viewer (TMeV). Complete linkage and Euclidean distances were applied.
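For illustration, complete-linkage clustering with Euclidean distances can be written out in a few lines. This is a plain sketch of the general technique, not the TMeV implementation, and the sample names and two-gene profiles are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two expression profiles of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def complete_linkage(profiles, n_clusters):
    """Agglomerative clustering; the distance between two clusters is the
    largest pairwise distance between their members (complete linkage)."""
    clusters = [[name] for name in profiles]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = max(euclidean(profiles[a], profiles[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical normalised expression profiles (two genes per sample)
profiles = {
    "adenoma_1": [0.0, 0.1],
    "adenoma_2": [0.1, 0.0],
    "carcinoma_1": [1.0, 0.9],
    "carcinoma_2": [0.9, 1.1],
}
result = complete_linkage(profiles, 2)
```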
As all hybridisations were performed against a common reference, all comparisons were relative between different groups of colorectal neoplasias; no differences compared to normal colorectal mucosa were considered. Supervised analysis for comparison of expression between carcinomas and adenomas was done using the Wilcoxon ranking test and a new method which takes into account subpopulations.
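A rank-based comparison of one gene between two groups can be sketched as below. It is assumed here that the Wilcoxon test is applied in its rank-sum form for two independent groups; the sketch uses the normal approximation without continuity or tie correction, the expression values are hypothetical, and the second method that accounts for subpopulations is not reproduced.

```python
import math

def rank_with_ties(values):
    """Assign 1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks

def rank_sum_test(group_a, group_b):
    """Two-sided Wilcoxon rank-sum test (normal approximation).
    Returns the rank sum of group_a, the z score and the p-value."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = rank_with_ties(list(group_a) + list(group_b))
    w = sum(ranks[:n_a])
    mean_w = n_a * (n_a + n_b + 1) / 2
    sd_w = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))
    return w, z, p

# Hypothetical log ratios of one gene in carcinomas and in adenomas
carcinomas = [5.1, 6.3, 7.2, 8.0, 6.8]
adenomas = [1.2, 2.4, 3.1, 2.2, 4.0]
w, z, p = rank_sum_test(carcinomas, adenomas)
```

Because only ranks are used, the test does not depend on the expression values following a normal distribution.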
The genes which were identified by microarray to be differentially expressed between adenomas and carcinomas are provided in Table 1.
To investigate the effects of chromosomal instability on gene expression in colorectal adenoma to carcinoma progression, whole-genome copy number changes were analysed, by array-CGH, on a series of 114 colorectal tumours (37 non-progressed adenomas, 41 progressed adenomas (malignant polyps) and 36 carcinomas).
The determination of the SROs as disclosed in the present invention is illustrated in detail herein for the region of chromosomal gain at 20q. For the 41 progressed adenomas, the adenoma and the carcinoma components were analysed for DNA copy number alterations. Losses of 1p, 4, 8p, 14q, 15q, 17p and 18 and gains of 1q, 6p, 7, 8q, 13q, 17q, 19p, 20q and 22q were observed in >20% of cases, of which 8p and 18 loss and 13q and 20q gains were the most frequent, occurring in more than 35% of the cases. Gain of chromosome 20 alone occurred in more than 60% of the cases. Genome wide, the pattern of copy number changes did not differ between adenoma and carcinoma components in progressed adenomas, i.e. the aberrations found in the carcinoma component were already present on the adenoma component, although with lower frequencies or amplitudes.
Next, the copy number changes of the 37 non-progressed adenomas and 36 carcinomas were analysed. From the 73 tumours, 67 (34 adenomas and 33 carcinomas) showed high quality genomic profiles (corresponding to an 8% drop-out). In adenomas the frequency of aberrations obviously was very low. In contrast, carcinomas showed frequent (>20% of cases) 1p, 4, 8p, 14q, 15q, 17p and 18 losses and 1q, 6p, 7, 8q, 13q, 17q, 19p, 20q and 22q gains, with 8p and 18 deletions and 13q and 20q gains present in more than 35% of the cases (like in the progressed adenomas). Chromosome 20 gains occurred in less than 15% of the adenomas but in more than 60% of the carcinomas, mostly affecting either the whole chromosome or the long arm, like in the progressed adenomas.
Hierarchical clustering of these 67 tumours (non-progressed adenomas and carcinomas) on DNA copy number profiles showed a clear separation of carcinomas and adenomas into two different clusters, cluster 1 and 2, respectively, with χ2 test p<0.001. In searching for those DNA copy number changes that were significantly different (p<0.05) between non-progressed adenomas and carcinomas, we observed that 4q, 8p, 8q, 13q, 15q, 18 and 20 were the relevant regions, of which loci on 20q differed most significantly (p<0.00001).
In order to determine the most relevant regions harbouring putative oncogenes with a role in colorectal cancer progression, STAC was applied to the combined set of paraffin-embedded malignant polyps (n=41) and frozen carcinomas (n=33). For 20q, analysis of these samples revealed 3 relevant regions of aberrant copy gains, one spanning 4 Mb (32-36 Mb), one spanning 3 Mb (56-59 Mb) and the third one spanning 2 Mb (61-64 Mb). These three regions (smallest regions of overlap—SROs) still contained 80, 35 and 94 genes, respectively. Similar analyses were performed for the other regions of chromosomal aberration linked with adenocarcinomas and the SROs identified for each of these regions.
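The idea of a smallest region of overlap can be illustrated with a simple coverage sweep over gained segments. This is not the STAC algorithm, which assesses significance by permutation; it only reports where a minimum number of tumours share a gain. The segment coordinates (in Mb) are hypothetical and one gained segment per tumour is assumed.

```python
def regions_of_overlap(gains, min_samples):
    """Return (start, end) intervals covered by at least min_samples gained segments."""
    events = []
    for start, end in gains:
        events.append((start, 1))
        events.append((end, -1))
    # Ends sort before starts at the same coordinate, so segments that merely
    # touch are not counted as overlapping.
    events.sort(key=lambda e: (e[0], e[1]))
    regions = []
    depth = 0
    region_start = None
    for position, delta in events:
        previous = depth
        depth += delta
        if previous < min_samples <= depth:
            region_start = position
        elif depth < min_samples <= previous:
            if position > region_start:
                regions.append((region_start, position))
            region_start = None
    return regions

# Hypothetical gained segments on one chromosome arm, one per tumour (Mb)
gains = [(30, 40), (32, 36), (31, 37), (55, 60), (56, 59), (50, 64)]
shared = regions_of_overlap(gains, min_samples=3)
```

Raising `min_samples` narrows the reported regions, which mirrors how recurrent aberrations across many tumours point to a smaller candidate region than any single tumour does.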
Microarray expression analysis was performed in the 37 non-progressed adenomas and 36 carcinomas of which snap-frozen material was available. High quality expression data were obtained in 68 cases (37 adenomas and 31 carcinomas, 7% drop-out).
The array-CGH data were related to the microarray expression data genome-wide, independently of adenoma or carcinoma status. To compare expression between tumours with a certain chromosomal aberration and tumours without such aberration, in order to disclose genes whose expression is influenced by the chromosomal aberration, a statistical algorithm (R environment) was developed (de Wiel, unpublished data).
For each region of chromosomal aberration, a list of genes for which gene dosage affected expression levels was obtained (Tables 2-9). Genes were identified which mapped within the regions of chromosomal aberration and showed differential expression between colorectal adenoma and adenocarcinoma cells with the relevant chromosomal aberration (Tables 10-16).
This process is described in more detail hereafter for the region of chromosomal gain at 20q. Supervised analysis of expression data, with the aim of identifying putative oncogenes on 20q, was done in two different ways. First, differential expression of genes was investigated between carcinomas and adenomas (Table 1), and secondly the expression of genes in tumours with 20q gain was compared with that in tumours without 20q gain (Table 8), to determine for which genes the expression level was influenced by the occurrence of 20q gain. The first approach revealed, genome-wide, 122 up-regulated genes and 219 down-regulated genes in carcinomas when compared to adenomas (Wilcoxon test p-value <1e-5 or Thas score >10.47). Of the 122 up-regulated genes, 14 map at chromosome 20q. For the second approach, only tumours (adenomas and carcinomas) for which both array-CGH data and expression data were available were used (n=64). As a pre-selection, genes were used which were differentially expressed (both up and down) between carcinomas and adenomas (cut-off p-value <0.05) in the first approach, to focus on genes on 20q which are involved in the progression from adenoma to carcinoma. With this analysis 127 genes were identified, throughout the genome, whose expression levels are affected by the occurrence of 20q gain. When we compared the genes mapping on 20q which are common between the two approaches, that is, which are up-regulated in carcinomas as a result of the gain of this chromosome arm, 9 genes were found, namely TPX2, C20orf24, AURKA (STK6), RNPC1, TH1L, ADRM1, C20orf20, TCFL5 and C20orf11.
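The final selection step, taking the genes common to both approaches and located on 20q, amounts to a set intersection. In the sketch below the three gene symbols are taken from the list above, whereas the PLACEHOLDER entries and the composition of each input list are hypothetical, since the full lists are given only in the tables.

```python
def common_candidates(up_in_carcinoma, affected_by_gain, on_20q):
    """Genes up-regulated in carcinomas, affected by 20q gain and mapping to 20q."""
    return sorted(set(up_in_carcinoma) & set(affected_by_gain) & set(on_20q))

# Hypothetical excerpts of the three gene lists
up_in_carcinoma = ["TPX2", "AURKA", "ADRM1", "PLACEHOLDER_A"]
affected_by_gain = ["TPX2", "AURKA", "ADRM1", "PLACEHOLDER_B"]
on_20q = ["TPX2", "AURKA", "ADRM1", "PLACEHOLDER_A"]

candidates = common_candidates(up_in_carcinoma, affected_by_gain, on_20q)
```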
A similar analysis was performed for the other regions of chromosomal loss or gain linked to colorectal adenocarcinoma.
##: “Number of hits” in Tables 10 to 16 refers to the number of independent assays wherein a correlation between altered expression and the presence of colorectal carcinoma cells is observed for a given gene (altered expression regardless of the genomic background of the sample, altered expression in sample pools selected on the type of chromosomal aberration).
Number | Date | Country | Kind |
---|---|---|---|
07105722.8 | Apr 2007 | EP | regional |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/IB2008/051248 | 4/3/2008 | WO | 00 | 10/2/2009 |