The present invention relates to the identification of biomarkers associated with cancer, in particular colorectal cancer and precursors thereof, i.e. high risk adenomas, and their use in screening methods, arrays for use in colorectal cancer screening methods, methods of treating colorectal cancer, and compounds for use in treating colorectal cancer, and kits for use in colorectal cancer screening methods.
Colorectal cancer (CRC) is a significant health problem. It is the third most common cancer worldwide, with over 1,200,000 new cases each year and a fatal outcome for almost half of the patients (Karsa L V, et al. Best Pract Res Clin Gastroenterol 2010; 24:381-96). Because CRC develops over many years, there is an excellent window of opportunity to detect the disease in an early, curable, or even premalignant stage. This can be achieved by screening of asymptomatic individuals. However, the currently available screening tests are either cumbersome or, like colonoscopy, carry a risk of complications and overtreatment.
The immunochemical fecal occult blood test (iFOBT), further referred to as fecal immunochemical test (FIT), is a widely used test that detects small traces of blood in stool, derived from lesions such as tumors that bleed in the colon. The FIT is a non-invasive method that makes use of an antibody directed against hemoglobin. However, not all colon tumors bleed and therefore the FIT has a sensitivity that leaves room for improvement (Duffy M J, et al. Int J Cancer 2011; 128:3-11; van Veen W, Mali W P. [Colorectal cancer screening: advice from the Health Council of the Netherlands]. Ned Tijdschr Geneeskd 2009; 153:A1441). Additional markers that detect other tumor characteristics besides bleeding in the stool could increase this sensitivity and the chance of identifying a colon tumor. Molecular changes resulting from the neoplastic process include changes in protein expression. The proteins for which expression is increased have the potential to serve as informative biomarkers with high diagnostic performance.
Several studies have been performed to identify proteins in stool and blood that can be used for the early detection of CRC and high risk colorectal adenomas, but most of these proteins have not been validated in a screening setting or failed to improve current tests for early detection, such as Carcinoembryonic antigen (CEA) or Calprotectin (Bosch L J, et al. Molecular tests for colorectal cancer screening. Clin Colorectal Cancer 2011; 10:8-23). There therefore remains a need for further tumor-specific protein markers to improve the currently available non-invasive screening possibilities for CRC and high risk adenomas.
Recent technological advances in mass spectrometry can boost the discovery of novel protein markers (de Wit M, et al. Gut 2011; Jimenez C R, et al. J Proteomics 2010; 73: 1873-95). Tandem mass spectrometry-based approaches now have the power to analyze complex protein samples and to detect proteins in low concentrations (Cox J, Mann M. Annu Rev Biochem 2011; 80: 273-99). Although most biomarker discovery studies have been performed using tissue and/or cell line material followed by validation in stool or blood, the chemical composition of the biological sample used for screening may significantly affect the nature of biomarkers that can be identified. This holds especially for stool samples, in which the low pH, the protease- and glycosidase activities of bacteria, enzymes and other substances can disturb the specific detection of these markers (Young G P, Bosch L J W. Curr Colorectal Cancer Rep 2011; 7: 62-70). Measuring molecules directly in the biological sample that may be used for screening, i.e. stool, could therefore be a valuable approach to provide us with reliable biomarkers that are stable in the fecal environment. A recent study by Ang et al. has shown the feasibility of protein biomarker discovery in human stool samples using mass spectrometry (Ang C S, Nice E C. J Proteome Res 2010; 9: 4346-55).
The present invention aims to overcome some or all of the problems associated with the prior art.
According to a first aspect of the invention there is provided a method for screening for colorectal cancer, the method comprising:
In a further aspect of the invention, there is provided a method for screening for colorectal cancer, the method comprising: screening a biological sample obtained from an individual for one or more biomarkers selected from the group defined in Table 1 and/or Table 6, wherein the presence of or increased expression of the one or more biomarkers relative to a control sample is indicative that the individual is at risk of suffering from or is suffering from colorectal cancer, and wherein the one or more biomarkers were selected from secretomes significantly more abundant in colon tumor secretomes compared to normal colon secretomes.
In yet a further aspect of the invention, there is provided a method for determining the significance of one or more biomarkers for the determination whether an individual is at risk of suffering from or is suffering from colorectal cancer; comprising:
In one embodiment, the one or more biomarkers is a protein or proteins.
Thus, the present invention provides protein signatures for diagnosing or predicting colorectal cancer in a subject. The present invention advantageously allows for detection of advanced and high-risk colonic adenomas and adenocarcinomas.
In one embodiment, the presence of the one or more biomarkers is indicative that the individual is at risk of suffering from or is suffering from colorectal cancer.
In an alternative embodiment, the increased expression of the one or more biomarkers relative to a control sample is indicative that the individual is at risk of suffering from or is suffering from colorectal cancer. The one or more biomarkers may be selected from the group defined in Table 2. The biomarkers in Table 2 represent a preferred subset of the gene products of Table 1 and/or Table 6, for which the levels of differential expression in CRC or high risk colorectal adenoma samples relative to control samples are considered statistically significant.
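By way of illustration only, the comparison of a biomarker level against control samples may be sketched as follows. The function name `is_elevated`, the use of the median of the control abundances as the reference level and the two-fold cutoff are hypothetical assumptions made for this sketch; the present description does not specify a particular cutoff, and statistical significance would in practice be assessed with an appropriate test.

```python
from statistics import median

def is_elevated(sample_abundance, control_abundances, min_fold_change=2.0):
    """Return True when a biomarker is present or increased relative to controls.

    Assumptions (not taken from the description): the reference level is the
    median of the control abundances, and min_fold_change is an arbitrary,
    hypothetical cutoff.
    """
    reference = median(control_abundances)
    if reference == 0:
        # Marker not detected in controls: any detectable signal counts as "presence".
        return sample_abundance > 0
    return sample_abundance / reference >= min_fold_change

# Hypothetical abundance values (arbitrary units) for a single biomarker.
controls = [3.0, 4.0, 5.0]
print(is_elevated(12.0, controls))  # three-fold above the control median
print(is_elevated(5.0, controls))   # within the control range
```

The branch for a zero reference level corresponds to the embodiment in which the mere presence of a biomarker, rather than its increased expression, is indicative.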
The one or more biomarkers may be selected from the group defined in Table 3. The biomarkers of Table 3 are differentially expressed in FIT-negative CRC samples relative to a control sample.
This particular subset of the markers of Table 1 and/or Table 6 is of particular importance, as these markers have the potential to be used for the detection of CRC or susceptibility to CRC (i.e. detection of high risk colorectal adenomas) in those patients for whom the fecal immunochemical test, or any other test that aims to detect hemoglobin (such as the guaiac-based Fecal Occult Blood Test), gives a negative result. The same applies to the markers of Tables 5 and 7, wherein the latter show the overlap between markers found in stool samples and those found in secretomes of cancerous tumor tissue.
In this respect, Table 6 lists proteins that were found to be significantly more abundant in colon tumor secretomes compared to normal colon secretomes.
The one or more biomarkers may have a higher discriminative power than hemoglobin. By “higher discriminative power” it is meant that a biomarker has a higher sensitivity and specificity than hemoglobin. Thus, in one embodiment, the one or more biomarkers may be selected from the group consisting of: S100 calcium binding protein A8 (S100A8), complement component C4B (Chido blood group) 2 (C4A/C4B), transferrin (TF), alpha-2-macroglobulin (A2M), S100 calcium binding protein A9 (S100A9), proteinase 3 (PRTN3), Azurocidin (AZU1), lactotransferrin (LTF), hemopexin (HPX) and defensin, alpha 1 (DEFA1).
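The definition of "higher discriminative power" given above may be illustrated by the following minimal sketch, which computes sensitivity and specificity from binary test results and checks whether a biomarker exceeds a reference test such as hemoglobin on both measures. The names `Performance`, `performance` and `outperforms`, and the example data, are hypothetical and are not taken from the study; the sketch assumes that each cohort contains at least one CRC case and at least one control.

```python
from dataclasses import dataclass

@dataclass
class Performance:
    sensitivity: float
    specificity: float

def performance(results, has_crc):
    """Compute sensitivity and specificity.

    results[i] is True when the test is positive for individual i;
    has_crc[i] is True when individual i is a confirmed CRC case.
    """
    pairs = list(zip(results, has_crc))
    tp = sum(1 for r, d in pairs if r and d)          # true positives
    fn = sum(1 for r, d in pairs if not r and d)      # false negatives
    tn = sum(1 for r, d in pairs if not r and not d)  # true negatives
    fp = sum(1 for r, d in pairs if r and not d)      # false positives
    return Performance(tp / (tp + fn), tn / (tn + fp))

def outperforms(marker, reference):
    """True when the marker has both higher sensitivity and higher specificity."""
    return (marker.sensitivity > reference.sensitivity
            and marker.specificity > reference.specificity)

# Hypothetical cohort: four CRC cases followed by four controls.
has_crc = [True] * 4 + [False] * 4
hemoglobin = performance([True, True, False, False, True, False, False, False], has_crc)
candidate = performance([True, True, True, False, False, False, False, False], has_crc)
print(hemoglobin, candidate, outperforms(candidate, hemoglobin))
```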
The one or more biomarkers may be selected from the group defined in Table 4. The group of biomarkers in Table 4 represent a subset of Table 1 and/or Table 6, and are markers which have been found to be expressed only in CRC samples and not in control samples.
In one embodiment, the biological sample is a stool sample.
The biological sample preferably may be screened for the one or more biomarkers using (targeted) mass spectrometry.
The biological sample may be screened for the one or more biomarkers using a binding agent capable of binding to the one or more biomarkers.
The binding agent may be an antibody or fragment thereof. The antibody or fragment thereof may be a recombinant antibody or fragment thereof. The antibody or fragment thereof may be selected from the group consisting of: scFv; Fab; a binding domain of an immunoglobulin molecule.
In one embodiment, the binding agent may be an aptamer.
The screening may be performed using an array. Any suitable array may advantageously be employed. The array may be a bead-based array. The array may be a surface-based array.
The control sample may comprise a biological sample, e.g. a stool sample from an individual known to be free from colorectal cancer or high risk colorectal adenomas.
The biological sample may also be analysed by a fecal immunochemical test.
According to a second aspect of the present invention there is provided the use of one or more biomarkers selected from the group defined in Table 1 and/or Table 6 for diagnosing or predicting colorectal cancer in an individual.
The one or more biomarkers of the second aspect may be selected from the group defined in Table 2.
The one or more biomarkers of the second aspect may be selected from the group defined in Table 3.
The one or more biomarkers of the second aspect may have a higher discriminative power than hemoglobin.
Thus, the one or more biomarkers of the second aspect may be selected from the group consisting of: S100 calcium binding protein A8 (S100A8), complement component C4B (Chido blood group) 2 (C4A/C4B), transferrin (TF), alpha-2-macroglobulin (A2M), S100 calcium binding protein A9 (S100A9), proteinase 3 (PRTN3), Azurocidin (AZU1), lactotransferrin (LTF), hemopexin (HPX) and defensin, alpha 1 (DEFA1).
The one or more biomarkers of the second aspect may be selected from the group defined in Table 4.
The one or more biomarkers of the second aspect may be selected from the group defined in Table 5.
The one or more biomarkers of the second aspect may be selected from the group defined in Table 6.
The one or more biomarkers of the second aspect may be selected from the group defined in Table 7.
According to a third aspect of the invention there is provided an array for determining whether an individual is at risk of suffering from or is suffering from colorectal cancer, the array comprising one or more binding agent as defined according to certain embodiments of the first aspect of the invention.
The array may be for use in a method according to the first aspect of the invention.
According to a fourth aspect of the invention there is provided a method for treating colorectal cancer, the method comprising:
The one or more biomarkers of the fourth aspect may be selected from the group defined in Table 2.
The one or more biomarkers of the fourth aspect may be selected from the group defined in Table 3.
The one or more biomarkers of the fourth aspect may have a higher discriminative power than hemoglobin.
Thus, the one or more biomarkers of the fourth aspect may be selected from the group consisting of: S100 calcium binding protein A8 (S100A8), complement component C4B (Chido blood group) 2 (C4A/C4B), transferrin (TF), alpha-2-macroglobulin (A2M), S100 calcium binding protein A9 (S100A9), proteinase 3 (PRTN3), Azurocidin (AZU1), lactotransferrin (LTF), hemopexin (HPX) and defensin, alpha 1 (DEFA1).
The one or more biomarkers of the fourth aspect may be selected from the group defined in Table 4.
According to a fifth aspect of the invention there is provided a cancer therapeutic agent for use in a method for treating colorectal cancer, the method comprising:
The cancer therapeutic agent of the fourth or fifth aspect may comprise one or more therapeutic monoclonal antibody, one or more small molecule inhibitor or one or more chemotherapeutic agent or any combination thereof.
The one or more therapeutic monoclonal antibody may comprise one or more of bevacizumab, cetuximab or panitumumab or any combination thereof.
The one or more small molecule inhibitor may comprise one or more of erlotinib, sorafenib or alisertib or any combination thereof.
The one or more chemotherapeutic agent may comprise one or more of 5-FU, capecitabine, irinotecan, oxaliplatin or leucovorin or any combination thereof.
The one or more biomarkers of the fifth aspect may be selected from the group defined in Table 2.
The one or more biomarkers of the fifth aspect may be selected from the group defined in Table 3.
The one or more biomarkers of the fifth aspect may have a higher discriminative power than hemoglobin.
Thus, the one or more biomarkers of the fifth aspect may be selected from the group consisting of: S100 calcium binding protein A8 (S100A8), complement component C4B (Chido blood group) 2 (C4A/C4B), transferrin (TF), alpha-2-macroglobulin (A2M), S100 calcium binding protein A9 (S100A9), proteinase 3 (PRTN3), Azurocidin (AZU1), lactotransferrin (LTF), hemopexin (HPX) and defensin, alpha 1 (DEFA1).
The one or more biomarkers of the fifth aspect may be selected from the group defined in Table 4. The one or more biomarkers of the fifth aspect may be selected from the group defined in Table 5 and/or 7.
According to a sixth aspect of the invention there is provided a kit for screening for colorectal cancer in an individual, the kit comprising:
The kit may, for example, be an ELISA kit. The one or more binding agent may, for example, comprise an antibody. The one or more binding agent may comprise an aptamer.
Any one or more features described for any aspect of the present invention or preferred embodiments or examples thereof, described herein, may be used in conjunction with any one or more other features described for any other aspect of the present invention or preferred embodiments or examples thereof described herein. The fact that a feature may only be described in relation to one aspect or embodiment or example does not limit its relevance to only that aspect or embodiment or example if it is technically relevant to one or more other aspect or embodiment or example.
Colorectal Cancer
The most common colon cancer cell type is adenocarcinoma which accounts for 95% of cases. Other, rarer types include lymphoma and squamous cell carcinoma. Colorectal adenocarcinoma arises from precursor lesions called adenomas, of which only a minority progress to cancer. Adenomas that progress to cancer are referred to as high risk adenomas.
Protein markers have great potential to be applied for stool-based CRC screening, because they can be measured in small sample volumes with simple and relatively cheap assays, of which the widely used FIT is an excellent example (Bosch L J, et al. Molecular tests for colorectal cancer screening. Clin Colorectal Cancer 2011; 10:8-23; Young G P, Bosch L J W. Curr Colorectal Cancer Rep 2011; 7: 62-70; Oort F A, et al. Aliment Pharmacol Ther 2010; 31: 432-9).
The present study has identified novel protein biomarkers by applying in-depth proteomics to stool samples. From a total of 830 detected human proteins, 134 were significantly enriched in stool samples from CRC patients compared to control stool samples, of which several showed higher discriminative power than hemoglobin and/or complementarity to hemoglobin.
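Complementarity to hemoglobin may be illustrated by the following minimal sketch, in which an individual is called positive when either the FIT or an additional biomarker is positive. The "either positive" rule, the function names and the example data are assumptions made for illustration and do not reproduce results of the study.

```python
def combined_positive(fit_positive, marker_positive):
    """Call an individual positive when the FIT or the marker is positive."""
    return [f or m for f, m in zip(fit_positive, marker_positive)]

def sensitivity(results, has_crc):
    """Fraction of confirmed CRC cases that the test calls positive."""
    cases = [r for r, d in zip(results, has_crc) if d]
    return sum(cases) / len(cases)

# Hypothetical results for five confirmed CRC cases.
has_crc = [True, True, True, True, True]
fit = [True, True, True, False, False]
marker = [False, True, False, True, False]

print(sensitivity(fit, has_crc))
print(sensitivity(marker, has_crc))
print(sensitivity(combined_positive(fit, marker), has_crc))
```

In this hypothetical example the marker detects a case that the FIT misses, so the combined sensitivity exceeds that of either test alone. A rule of this kind can never increase specificity, because every control that is positive on either test is also positive on the combination, so any gain in sensitivity has to be weighed against additional false positives.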
The approach of measuring molecules directly in stool is of significance to reveal biomarkers that are stable in the fecal environment and detectable in the background of bacterial- and food-related molecules.
The present invention is advantageously used for screening for colorectal cancer, that is adenocarcinoma found in the colon. However, the methods of the invention should not be considered as being limited solely to the detection of colonic adenocarcinomas. Rather, the methods of the invention are also useful in the detection of advanced or high-risk colonic adenomas, thus enabling the identification of an individual at risk of developing colorectal cancer due to the presence of an advanced or a high-risk adenoma.
References herein to screening for colorectal cancer thus may include screening for advanced colonic adenomas and high-risk adenomas as well as colonic adenocarcinoma.
It is also expected that the biomarkers identified by the present invention may find application in the diagnosis of adenocarcinomas present higher up the gastrointestinal tract.
Thus, the present invention may also provide a method for screening for gastrointestinal disease or gastrointestinal cancer, the method comprising: screening a biological sample, for example a stool sample, from an individual for one or more biomarkers selected from the group defined in Table 1, wherein the presence of or increased expression of the one or more biomarkers relative to a control sample is indicative that the patient is at risk of suffering from or is suffering from gastrointestinal disease or gastrointestinal cancer.
Sample for Screening
The sample for screening may include cell lines, biopsies, whole blood, blood serum, sputum, stool, urine, synovial fluid, wound fluid, cerebral-spinal fluid, tissue from eyes, intestine, kidney, brain, skin, heart, prostate, lung, breast, liver, muscle or connective tissue, the said tissue being optionally embedded in paraffin, histologic object slides, and all possible combinations thereof.
The preferred biological sample is stool.
The sample may be prepared by any conventional method for extracting proteins from a biological sample. One exemplary method can be found in Ang C S, Nice E C. J Proteome Res. 2010; 9:4346-55, the contents of which are incorporated herein by reference.
Biomarkers
The present invention provides a set of biomarkers which may be detected directly from a stool sample and which have been shown to be reliable indicators for the presence of advanced colonic adenomas or adenocarcinomas in an individual.
The biomarkers identified are listed in Table 1 and/or Table 6. In one embodiment, the methods of the invention screen for more than one biomarker from the group defined in Table 1, for example two, three, four, five, six, seven, eight, nine, ten of the biomarkers of the group defined in Table 1 and/or Table 6. In an alternative embodiment, the methods of the invention screen for more than ten of the biomarkers of the group defined in Table 1, for example, eleven, twelve, thirteen, fourteen, fifteen, sixteen, seventeen, eighteen, nineteen, twenty, thirty, forty, fifty, sixty, seventy, eighty, ninety, one hundred of the biomarkers of the group defined in Table 1 and/or Table 6.
Thus, the methods of the invention screen for the presence of or increased expression of one or more of: complement component C4B (Chido blood group) 2 (C4A/C4B); glutamic-oxaloacetic transaminase 2, mitochondrial (aspartate aminotransferase 2) (GOT2); glucose-6-phosphate isomerase (GPI); transketolase (TKT); N-acylaminoacyl-peptide hydrolase (APEH); histone cluster 1, H4c (HIST4H4 (includes others)); Fatty acid-binding protein 5 (psoriasis-associated) (FABP5); hexosaminidase B (beta polypeptide) (HEXB); epithelial cell adhesion molecule (EPCAM); NME1-NME2 readthrough (NME1-NME2); Superoxide dismutase 2, mitochondrial (SOD2); Tu translation elongation factor, mitochondrial (TUFM); Glutathione synthetase (GSS); annexin A2 (ANXA2); ATP synthase, H+ transporting, mitochondrial F1 complex, beta polypeptide (ATP5B); 10 kDa heat shock protein (chaperonin 10) (HSPE1); glyoxalase I (GLO1); histone cluster 2, H2be (HIST2H2BE (includes others)); S100 calcium binding protein A4 (S100A4); S100 calcium binding protein A11 (S100A11); latexin (LXN); dehydrogenase/reductase (SDR family) member 11 (DHRS11); N-acetylglucosaminidase, alpha (NAGLU); Translin (TSN); Proteasome (prosome, macropain) subunit alpha type-4 (PSMA4); Proteasome (prosome, macropain) subunit alpha type-6 (PSMA6); ras-related C3 botulinum toxin substrate 1 (rho family, small GTP binding protein Rac1) (RAC1); Adenosylhomocysteinase (AHCY); fucosidase, alpha-L-1, tissue (FUCA1); S100 calcium binding protein P (S100P); Proteasome (prosome, macropain) subunit beta type-2 (PSMB2); X-prolyl aminopeptidase (aminopeptidase P) 1 (XPNPEP1); Keratin 18 (KRT18); Nuclear cap-binding protein subunit 1 80 kDa (NCBP1); mannosidase, alpha, class 2B, member 1 (MAN2B1); S100 calcium binding protein A6 (S100A6); valosin containing protein (VCP); quinolinate phosphoribosyltransferase (QPRT); major histocompatibility complex, class I, B (HLA-B); phosphoglycerate mutase 1 (brain) (PGAM1); ectonucleotide 
pyrophosphatase/phosphodiesterase 3 (ENPP3); serpin peptidase inhibitor, clade B (ovalbumin), member 10 (SERPINB10); myeloperoxidase (MPO); creatine kinase, mitochondrial 1B (CKMT1A/CKMT1B); proteinase 3 (PRTN3); elastase, neutrophil expressed (ELANE); MORC family CW-type zinc finger 1 (MORC1); ubiquitin B (UBB); phospholipase A2, group IIA (platelets, synovial fluid) (PLA2G2A); carbonic anhydrase IV (CA4); G elongation factor, mitochondrial 2 (GFM2); S100 calcium binding protein A7 (S100A7); Bactericidal permeability-increasing protein (BPI); collagen, type VI, alpha 5 (COL6A5); LIM homeobox 8 (LHX8); cysteine-rich secretory protein 3 (CRISP3); Azurocidin (AZU1); hemicentin 1 (HMCN1); Transglutaminase 3 (E polypeptide, protein-glutamine gamma-glutamyltransferase) (TGM3); CDC42 binding protein kinase alpha (DMPK-like) (CDC42BPA); Cathepsin G (CTSG); Resistin (RETN); methylmalonyl CoA mutase (MUT); armadillo repeat containing, X-linked 4 (ARMCX4); Integrin alpha-M (complement component 3 receptor 3 subunit) (ITGAM); Calcium channel, voltage dependent, R-type alpha-1E subunit (CACNA1E); T-cell lymphoma invasion and metastasis 2 (TIAM2); HIR histone cell cycle regulation defective homolog A (S. 
cerevisiae) (HIRA); dopey family member 2 (DOPEY2); integrin beta 1 binding protein 3 (ITGB1BP3); Sodium channel, voltage-gated, type VII, alpha (SCN7A); Rab3C, member RAS oncogene family (RAB3C); chromosome 9 open reading frame 79 (C9orf79); nuclear factor of activated T-cells, cytoplasmic, calcineurin-dependent 4 (NFATC4); UDP-glucose glycoprotein glucosyltransferase 2 (UGGT2); Cornulin (CRNN); kielin/chordin-like protein (KCP); CD1E molecule (CD1E); coiled-coil domain-containing 18 (CCDC18); leukotriene A-4 hydrolase (LTA4H); albumin (ALB); alpha-2-macroglobulin (A2M); complement component 3 (C3); hemoglobin, beta (HBB); transferrin (TF); hemoglobin, alpha 1 (HBA1/HBA2); lactotransferrin (LTF); ceruloplasmin (ferroxidase) (CP); catalase (CAT); group-specific component (vitamin D-binding protein) (GC); serpin peptidase inhibitor, clade C (antithrombin), member 1 (SERPINC1); fibrinogen gamma chain (FGG); S100 calcium binding protein A8 (S100A8); ferritin, light polypeptide (FTL); actin, beta (ACTB); fibronectin 1 (FN1); defensin, alpha 1 (DEFA1 (includes others)); serpin peptidase inhibitor, clade G (C1 inhibitor), member 1 (SERPING1); retinol binding protein 4, plasma (RBP4); peroxiredoxin 2 (PRDX2); fibrinogen alpha chain (FGA); serpin peptidase inhibitor, clade F (alpha-2 antiplasmin, pigment epithelium derived factor), member 2 (SERPINF2); carbonic anhydrase II (CA2); orosomucoid 1 (ORM1/ORM2); lactate dehydrogenase A (LDHA); vitronectin (VTN); kininogen-1 (KNG1); actin, alpha, cardiac muscle 1 (ACTC1); leucine-rich alpha-2-glycoprotein 1 (LRG1); gamma-glutamyl hydrolase (conjugase, folylpolygammaglutamyl hydrolase) (GGH); enolase 1, (alpha) (ENO1); profilin 1 (PFN1); serpin peptidase inhibitor, clade A (alpha-1 antiproteinase, antitrypsin), member 7 (SERPINA7); alpha-1-microglobulin/bikunin precursor (AMBP); lamin A/C (LMNA); apolipoprotein D (APOD); thyroid hormone receptor interactor 11 (TRIP11); complement component 4 binding protein, alpha (C4BPA); 
tropomyosin 4 (TPM4); filamin A, alpha (FLNA); haptoglobin (HP); hemopexin (HPX); hemoglobin, delta (HBD); fibrinogen beta chain (FGB); S100 calcium binding protein A9 (S100A9); complement component 5 (C5); solute carrier family 26, member 3 (SLC26A3); complement component 9 (C9); amyloid P component, serum (APCS); alpha-1-B glycoprotein (A1BG); complement C3-like (LOC100133511); inter-alpha (globulin) inhibitor H4 (plasma Kallikrein-sensitive glycoprotein) (ITIH4); complement component C8, alpha polypeptide (C8A); inter-alpha (globulin) inhibitor H1 (ITIH1); acyl-CoA dehydrogenase, very long chain (ACADVL); cDNA FLJ60317, highly similar to Aminoacylase-1 (ACY1); Ankyrin repeat domain-containing protein 35 (ANKRD35); baculoviral IAP repeat-containing 6 (BIRC6); Bleomycin hydrolase (BLMH); bone marrow stromal cell antigen 12 (BST1); hypothetical protein LOC643677 (C13orf40); Cytidine deaminase (CDA); chitinase 1 (chitotriosidase) (CHIT1); cathepsin C (CTSC); Cathepsin S (CTSS); Isoform 2 of Dedicator of cytokinesis protein 4 (DOCK4); Glutathione reductase (GSR); hect (homologous to the E6-AP (UBE3A) carboxyl terminus) domain and RCC1 (CHC1)-like domain (RLD) 1 (HERC1); hect domain and RLD 2 (HERC2); major histocompatibility complex, class II, DR beta 5 (HLA-DRB5); isocitrate dehydrogenase 1 (NADP+), soluble (IDH1); inter-alpha (globulin) inhibitor H2 (ITIH2); Uncharacterized protein KIAA1797 (KIAA1797); Lysozyme C (LYZ); Nebulin (NEB); NIMA (never in mitosis gene a)-related kinase 10 (NEK10); peptidase D (PEPD); quiescin Q6 sulfhydryl oxidase 1 (QSOX1); ribonuclease T2 (RNASET2); serpin peptidase inhibitor, clade A (alpha-1 antiproteinase, antitrypsin), member 1 (SERPINA1); serpin peptidase inhibitor, clade A (alpha-1 antiproteinase, antitrypsin), member 3 (SERPINA3); serpin peptidase inhibitor, clade B (ovalbumin), member 3 (SERPINB3); SET domain containing 2 (SETD2); Shugoshin-like 2 (SGOL2); sialic acid acetylesterase (SIAE); spectrin repeat containing, nuclear 
envelope 1 (SYNE1); Transaldolase 1 (TALDO1); Taste receptor type 2 member 42 (TAS2R42); triosephosphate isomerase 1 (TPI1); Vinculin (VCL); Zymogen granule membrane protein 16 (ZG16); hypothetical protein LOC79887 (PLBD1); Isoform 1 of Serine/threonine-protein phosphatase 6 regulatory ankyrin repeat subunit A (ANKRD28); Cystatin-C(CST3); D-dopachrome decarboxylase (DDT); Synapse-associated protein 1 (SYAP1); Proteasome subunit alpha type-2 (PSMA2); SUB1 homolog (S. cerevisiae) (SUB1); Microfibril-associated glycoprotein 3 (MFAP3); Cathepsin D (CTSD); proteasome (prosome, macropain) subunit, beta type, 1 (PSMB1); proteasome (prosome, macropain) subunit, beta type, 5 (PSMB5); cDNA FLJ61112, highly similar to BTB/POZ domain-containing protein KCTD15 (KCTD15); prolyl 4-hydroxylase, beta polypeptide (P4HB); glutathione peroxidase 1 (GPX1); serpin peptidase inhibitor, clade B (ovalbumin), member 5 (SERPINB5); Isoform 1 of collagen, type IV, alpha 3 (Goodpasture antigen) binding protein (COL4A3BP); proteasome (prosome, macropain) subunit, beta type, 6 (PSMB6); Keratin 20 (KRT20); Calpain small subunit 1 (CAPNS1); peroxiredoxin 3 (PRDX3); NACC family member 2, BEN and BTB (POZ) domain containing (NACC2); Rho GDP-dissociation inhibitor 2 (ARHGDIB); Macrophage migration inhibitory factor (MIF); Ran-binding protein 6 (RANBP6); spinster homolog 3 (Drosophila) (SPNS3); minichromosome maintenance complex component 2 (MCM2); Fumarylacetoacetase (FAH); heat shock 70 kDa protein 8 (HSPA8); brain abundant, membrane attached signal protein 1 (BASP1); Branched-chain-amino-acid aminotransferase (BCAT2); Moesin (MSN); serpin peptidase inhibitor, clade B (ovalbumin), member 8 (SERPINB8); glucose-6-phosphate dehydrogenase (G6PD); Isoform 1 of UPF0557 protein C10orf119 (C10orf119); Prosaposin (PSAP); eukaryotic translation elongation factor 1 gamma (EEF1G); four and a half LIM domains 1 (FHL1); carboxypeptidase, vitellogenic-like (CPVL); tubulin tyrosine ligase-like family, member 3 
(TTLL3); IPI:IPI00942608.1|ENSEMBLENSP0000 (unmapped; 26 kDa protein); proline-rich protein BstNI subfamily 2 (PRB1/PRB2); Protocadherin-8 (PCDH8); Alpha-2-macroglobulin-like protein 1 (A2ML1); Guanine deaminase (GDA); Lipocalin-1 (LCN1); Histone H1.4 (HIST1H1E); IPI:IPI00937064.1|REFSEQ:XP_002342720 (ZAN); heterogeneous nuclear ribonucleoprotein A2/B1 (HNRNPA2B1); Endoplasmic reticulum aminopeptidase 2 (ERAP2); 14-3-3 protein zeta/delta (YWHAZ); G-protein coupled receptor 39 (GPR39); similar to KIAA1783 protein (KIAA1783 protein); apolipoprotein A-I binding protein (APOA1BP); pleckstrin and Sec7 domain containing 2 (PSD2); prolylcarboxypeptidase (angiotensinase C) (PRCP); Tubulin alpha-1C chain (TUBA1C); Calmodulin-like protein 5 (CALML5); ARP3 actin-related protein 3 homolog (yeast) (ACTR3); myosin, light chain 6, alkali, smooth muscle and non-muscle (MYL6); Vasodilator-stimulated phosphoprotein (VASP); ARP2 actin-related protein 2 homolog (yeast) (ACTR2); Rheumatoid factor (RF-IP18); Phosphoglycerate kinase 1 (PGK1); Solute carrier family 35 member F1 (SLC35F1); alkaline phosphatase, liver/bone/kidney (ALPL); tropomyosin 3 (TPM3); Hexokinase-3 (HK3); Vimentin (VIM); Annexin A1 (ANXA1); IPI:IPI100930073.1|TREMBL:B2R853 (KRT6C); Keratin, type II cytoskeletal 6C (KRT6C); myosin, heavy chain 13, skeletal muscle (MYH13); cell cycle progression 1 (CCPG1); Hypothetical protein (H-INV); calcium channel, voltage-dependent, L type, alpha 1D subunit (CACNA1D); LY6/PLAUR domain containing 5 (LYPD5); aarF domain containing kinase 2 (ADCK2); Myosin-Ic (MYO1C); amyloid beta precursor protein (cytoplasmic tail) binding protein 2 (APPBP2); integrin, alpha 2b (platelet glycoprotein IIb of IIb/IIIa complex, antigen CD41) (ITGA2B); tubulin, beta 6 (TUBB6); synaptotagmin-like 4 (SYTL4); aquaporin 4 (AQP4); cell division cycle 42 (GTP binding protein, 25 kDa) (CDC42); myosin, light chain 12B, regulatory (MYL12B); protein L-Myc-2-like 
(LOC100293553); RAP1B, member of RAS oncogene family (RAP1B); glycoprotein IX (platelet) (GP9); Destrin (DSTN); complement component 1, q subcomponent, C chain (C1QC); epidermal growth factor receptor pathway substrate 8 (EPS8); dual specificity phosphatase 3 (DUSP3); ras homolog gene family, member A (RHOA); myosin, light chain 9, regulatory (MYL9); peptidylprolyl isomerase A (cyclophilin A) (PPIA); Cofilin-1 (CFL1); and/or lactotransferrin (LTF), collagen, type XII, alpha 1 (COL12A1), agrin (AGRN), MYB binding protein (P160) 1a (MYBBP1A), transformation/transcription domain-associated protein (TRRAP), annexin A6 (ANXA6), cytoskeleton associated protein 5 (CKAP5), minichromosome maintenance complex component 5 (MCM5), importin 4 (IPO4), neurobeachin-like 2 (NBEAL2), minichromosome maintenance complex component 4 (MCM4), 2′-5′-oligoadenylate synthetase 3, 100 kDa (OAS3), minichromosome maintenance complex component 3 (MCM3), NEDD8 activating enzyme E1 subunit 1 (NAE1), tripartite motif containing 28 (TRIM28), fused in sarcoma (FUS), phenylalanyl-tRNA synthetase, alpha subunit (FARSA), myeloid cell nuclear differentiation antigen (MNDA), suppressor of Ty 16 homolog (S. cerevisiae) (SUPT16H), DEAD (Asp-Glu-Ala-Asp) box polypeptide 5 (DDX5), tenascin C (TNC), nuclear import 7 homolog (S. 
cerevisiae) (NIP7), chromodomain helicase DNA binding protein 4 (CHD4), regulator of chromosome condensation 2 (RCC2), DNA (cytosine-5-)-methyltransferase 1 (DNMT1), exportin 4 (XPO4), chaperonin containing TCP1, subunit 5 (epsilon) (CCT5), serine/arginine-rich splicing factor 9 (SRSF9), spectrin, beta, non-erythrocytic 2 (SPTBN2), TIMP metallopeptidase inhibitor 1 (TIMP1), nidogen 1 (NID1), ribonucleotide reductase M1 (RRM1), eukaryotic translation initiation factor 4 gamma, 1 (EIF4G1), component of oligomeric golgi complex 4 (COG4), polymerase (DNA directed), delta 1, catalytic subunit 125 kDa (POLD1), splicing factor 3b, subunit 2, 145 kDa (SF3B2), exosome component 2 (EXOSC2), minichromosome maintenance complex component 6 (MCM6), plastin 3 (PLS3), aldolase B, fructose-bisphosphate (ALDOB), SMG1 homolog, phosphatidylinositol 3-kinase-related kinase (C. elegans) (SMG1), G1 to S phase transition 1 (GSPT1), KH-type splicing regulatory protein (KHSRP), DEAD (Asp-Glu-Ala-Asp) box polypeptide 21 (DDX21), phosphatidylinositol transfer protein, beta (PITPNB), aquarius homolog (mouse) (AQR), heterogeneous nuclear ribonucleoprotein D-like (HNRPDL), annexin A3 (ANXA3), processing of precursor 1, ribonuclease P/MRP subunit (S. cerevisiae) (POP1), structural maintenance of chromosomes 2 (SMC2), dynein, cytoplasmic 1, light intermediate chain 2 (DYNC1LI2), peptidylprolyl isomerase D (PPID), vacuolar protein sorting 37 homolog B (S. cerevisiae) (VPS37B), adrenergic, beta, receptor kinase 1 (ADRBK1), DIS3 mitotic control homolog (S. cerevisiae) (DIS3), polymerase (RNA) I polypeptide A, 194 kDa (POLR1A), t-complex 1 (TCP1), plakophilin 3 (PKP3), La ribonucleoprotein domain family, member 1B (LARP1B), poly (ADP-ribose) polymerase 1 (PARP1), CD46 molecule, complement regulatory protein (CD46),
p21 protein (Cdc42/Rac)-activated kinase 2 (PAK2), ATP-binding cassette, sub-family E (OABP), member 1 (ABCE1), ubiquitin specific peptidase 14 (tRNA-guanine transglycosylase) (USP14), chaperonin containing TCP1, subunit 3 (gamma) (CCT3), Ran GTPase activating protein 1 (RANGAP1), deoxythymidylate kinase (thymidylate kinase) (DTYMK), N-myristoyltransferase 1 (NMT1), dynamin 1-like (DNM1L), interferon induced transmembrane protein 2 (1-8D) (IFITM2), fermitin family member 1 (FERMT1), tubulin folding cofactor D (TBCD), serine/arginine-rich splicing factor 10 (LOC100505793/SRSF10), STE20-like kinase (SLK), mucin 5AC, oligomeric mucus/gel-forming (MUC5AC/MUC5B), methionyl-tRNA synthetase (MARS), SMEK homolog 1, suppressor of mek1 (Dictyostelium) (SMEK1), high mobility group box 2 (HMGB2), non-POU domain containing, octamer-binding (NONO), transforming growth factor, beta-induced, 68 kDa (TGFBI), fibulin 2 (FBLN2), high density lipoprotein binding protein (HDLBP), collagen, type IV, alpha 2 (COL4A2), copine I (CPNE1), N(alpha)-acetyltransferase 50, NatE catalytic subunit (NAA50), LSM7 homolog, U6 small nuclear RNA associated (S. cerevisiae) (LSM7), structure specific recognition protein 1 (SSRP1), importin 8 (IPO8), yippee-like 5 (Drosophila) (YPEL5), phosphoglucomutase 3 (PGM3), ring finger protein 40 (RNF40), structural maintenance of chromosomes 3 (SMC3), regenerating islet-derived family, member 4 (REG4), splicing factor 3a, subunit 3, 60 kDa (SF3A3), thrombospondin 1 (THBS1), chaperonin containing TCP1, subunit 6A (zeta 1) (CCT6A), PRP8 pre-mRNA processing factor 8 homolog (S. 
cerevisiae) (PRPF8), symplekin (SYMPK), far upstream element (FUSE) binding protein 1 (FUBP1), U2 small nuclear RNA auxiliary factor 1 (U2AF1), huntingtin (HTT), eukaryotic translation initiation factor 5B (EIF5B), nuclear autoantigenic sperm protein (histone-binding) (NASP), heterogeneous nuclear ribonucleoprotein K (HNRNPK), Y box binding protein 1 (YBX1), annexin A11 (ANXA11), RecQ protein-like (DNA helicase Q1-like) (RECQL), cortactin (CTTN), tubulin, beta 3 (TUBB3), pyrroline-5-carboxylate reductase-like (PYCRL), periplakin (PPL), phosphoglucomutase 2-like 1 (PGM2L1), chromosome 17 open reading frame 49 (C17orf49), mRNA turnover 4 homolog (S. cerevisiae) (MRTO4), methyltransferase like 1 (METTL1), squamous cell carcinoma antigen recognized by T cells 3 (SART3), S100 calcium binding protein A13 (S100A13), aminopeptidase-like 1 (NPEPL1), cyclin-dependent kinase 1 (CDK1), ubiquitin protein ligase E3 component n-recognin 1 (UBR1), Rho GTPase activating protein 18 (ARHGAP18), signal recognition particle 14 kDa (homologous Alu RNA binding protein) (SRP14), cathelicidin antimicrobial peptide (CAMP), splicing factor proline/glutamine-rich (SFPQ), RAS p21 protein activator (GTPase activating protein) 1 (RASA1), Ral GTPase activating protein, beta subunit (non-catalytic) (RALGAPB), laminin, beta 1 (LAMB1), RAB3 GTPase activating protein subunit 2 (non-catalytic) (RAB3GAP2), chaperonin containing TCP1, subunit 8 (theta) (CCT8), heterogeneous nuclear ribonucleoprotein L-like (HNRPLL), RAN binding protein 1 (RANBP1), kinetochore associated 1 (KNTC1), dyskeratosis congenita 1, dyskerin (DKC1), casein kinase 2, alpha 1 polypeptide (CSNK2A1), CAP-GLY domain containing linker protein 1 (CLIP1), chaperonin containing TCP1, subunit 2 (beta) (CCT2), tubulin tyrosine ligase-like family, member 12 (TTLL12), ataxia telangiectasia mutated (ATM), splicing factor 3a, subunit 1, 120 kDa (SF3A1), ribosomal protein S20 (RPS20), ubiquitin-conjugating enzyme E2O (UBE2O), translocated 
promoter region (to activated MET oncogene) (TPR), BRCA2 and CDKN1A interacting protein (BCCIP), gem (nuclear organelle) associated protein 5 (GEMIN5), ribonuclease P/MRP 30 kDa subunit (RPP30), loss of heterozygosity, 12, chromosomal region 1 (LOH12CR1), syntaxin binding protein 2 (STXBP2), ubiquitin-conjugating enzyme E2H (UBE2H), DIP2 disco-interacting protein 2 homolog B (Drosophila) (DIP2B), RAP1, GTP-GDP dissociation stimulator 1 (RAP1GDS1), heterogeneous nuclear ribonucleoprotein M (HNRNPM), LIM domain 7 (LMO7), RNA binding motif protein 25 (RBM25), aldehyde dehydrogenase 7 family, member A1 (ALDH7A1), cleavage and polyadenylation specific factor 1, 160 kDa (CPSF1), calponin 2 (CNN2), chaperonin containing TCP1, subunit 7 (eta) (CCT7), lysyl-tRNA synthetase (KARS), UDP-N-acetylglucosamine pyrophosphorylase 1 (UAP1), heat shock 70 kDa protein 4-like (HSPA4L), 138 kDa protein (138 kDa protein), thimet oligopeptidase 1 (THOP1), glutaredoxin 3 (GLRX3), phosphoglycerate dehydrogenase (PHGDH), CDV3 homolog (mouse) (CDV3), structural maintenance of chromosomes 4 (SMC4), RNA binding motif (RNP1, RRM) protein 3 (RBM3), hepatoma-derived growth factor (HDGF), heterogeneous nuclear ribonucleoprotein U (scaffold attachment factor A) (HNRNPU), nuclear receptor binding protein 1 (NRBP1), polymerase (RNA) I polypeptide B, 128 kDa (POLR1B), protein phosphatase 5, catalytic subunit (PPP5C), glucose-6-phosphate dehydrogenase (G6PD), arginase, liver (ARG1), 3-hydroxy-3-methylglutaryl-CoA synthase 1 (soluble) (HMGCS1), ubiquitin-like modifier activating enzyme 2 (UBA2), KIAA1033 (KIAA1033), annexin A4 (ANXA4), DEAD (Asp-Glu-Ala-Asp) box polypeptide 17 (DDX17), acidic (leucine-rich) nuclear phosphoprotein 32 family, member E (ANP32E), glucosamine (UDP-N-acetyl)-2-epimerase/N-acetylmannosamine kinase (GNE), KIAA0368 (KIAA0368), vacuolar protein sorting 4 homolog B (S. 
cerevisiae) (VPS4B), replication protein A1, 70 kDa (RPA1), eukaryotic translation initiation factor 2, subunit 1 alpha, 35 kDa (EIF2S1), eukaryotic translation initiation factor 3, subunit J (EIF3J), suppressor of Ty 6 homolog (S. cerevisiae) (SUPT6H), heat shock 105 kDa/110 kDa protein 1 (HSPH1), exportin 5 (XPO5), transcription elongation factor A (SII), 1 (TCEA1), Sjogren syndrome antigen B (autoantigen La) (SSB), AE binding protein 1 (AEBP1), LIM and cysteine-rich domains 1 (LMCD1), interleukin enhancer binding factor 3, 90 kDa (ILF3), WD repeat domain 61 (WDR61), N(alpha)-acetyltransferase 15, NatA auxiliary subunit (NAA15), serine/arginine-rich splicing factor 4 (SRSF4), ring finger protein 20 (RNF20), lactamase, beta 2 (LACTB2), NHP2 ribonucleoprotein homolog (yeast) (NHP2), chromosome 17 open reading frame 28 (C17orf28), CTP synthase II (CTPS2), fascin homolog 1, actin-bundling protein (Strongylocentrotus purpuratus) (FSCN1), tRNA nucleotidyl transferase, CCA-adding, 1 (TRNT1), splicing regulatory glutamine/lysine-rich protein 1 (SREK1), stromal antigen 1 (STAG1), oxysterol binding protein (OSBP), deoxyuridine triphosphatase (DUT), coiled-coil domain containing 25 (CCDC25), DEK oncogene (DEK), coiled-coil domain containing 72 (CCDC72), polymerase (RNA) II (DNA directed) polypeptide E, 25 kDa (POLR2E), phosphoserine phosphatase (PSPH), structural maintenance of chromosomes 1A (SMC1A), DEAD (Asp-Glu-Ala-Asp) box polypeptide 23 (DDX23), tRNA methyltransferase 11-2 homolog (S. 
cerevisiae) (TRMT112), COP9 constitutive photomorphogenic homolog subunit 2 (Arabidopsis) (COPS2), programmed cell death 5 (PDCD5), cyclin-dependent kinase 2 (CDK2), proteasome (prosome, macropain) 26S subunit, non-ATPase, 3 (PSMD3), RAN binding protein 2 (RANBP2), SERPINE1 mRNA binding protein 1 (SERBP1), O-linked N-acetylglucosamine (GlcNAc) transferase (UDP-N-acetylglucosamine:polypeptide-N-acetylglucosaminyl transferase) (OGT), non-SMC condensin I complex, subunit D2 (NCAPD2), SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily c, member 2 (SMARCC2), NOP10 ribonucleoprotein homolog (yeast) (NOP10), LPS-responsive vesicle trafficking, beach and anchor containing (LRBA), apoptosis inhibitor 5 (API5), signal recognition particle receptor (docking protein) (SRPR), transferrin receptor (p90, CD71) (TFRC), basic leucine zipper and W2 domains 2 (BZW2), ribonuclease, RNase A family, 3 (RNASE3), diazepam binding inhibitor (GABA receptor modulator, acyl-CoA binding protein) (DBI), FK506 binding protein 4, 59 kDa (FKBP4), chromosome 6 open reading frame 130 (C6orf130), cofactor of BRCA1 (COBRA1), flap structure-specific endonuclease 1 (FEN1), glucan (1,4-alpha-), branching enzyme 1 (GBE1), small nuclear ribonucleoprotein polypeptide B (SNRPB2), NSFL1 (p97) cofactor (p47) (NSFL1C), acyl-CoA thioesterase 7 (ACOT7), NOP2/Sun domain family, member 2 (NSUN2), chaperonin containing TCP1, subunit 4 (delta) (CCT4), kallikrein-related peptidase 6 (KLK6), glutaminyl-peptide cyclotransferase (QPCT), BCL2-associated athanogene 6 (BAG6), eukaryotic translation initiation factor 3, subunit C (EIF3C/EIF3CL), ATPase, H+ transporting, lysosomal 56/58 kDa, V1 subunit B2 (ATP6V1B2), matrix metallopeptidase 8 (neutrophil collagenase) (MMP8), proteasome (prosome, macropain) 26S subunit, ATPase, 5 (PSMC5), GTP cyclohydrolase I feedback regulator (GCHFR), poly(A) polymerase alpha (PAPOLA), hippocalcin-like 1 (HPCAL1), GTPase activating protein (SH3 domain) 
binding protein 1 (G3BP1), polymerase (RNA) III (DNA directed) polypeptide A, 155 kDa (POLR3A), superkiller viralicidic activity 2-like 2 (S. cerevisiae) (SKIV2L2), polymerase (RNA) II (DNA directed) polypeptide A, 220 kDa (POLR2A), collagen, type I, alpha 2 (COL1A2), fibrillarin (FBL), glutamyl-prolyl-tRNA synthetase (EPRS), ELAV (embryonic lethal, abnormal vision, Drosophila)-like 1 (Hu antigen R) (ELAVL1), nuclear cap binding protein subunit 2, 20 kDa (NCBP2), GCN1 general control of amino-acid synthesis 1-like 1 (yeast) (GCN1L1), histone acetyltransferase 1 (HAT1), stromal antigen 2 (STAG2), sorbitol dehydrogenase (SORD), REX2, RNA exonuclease 2 homolog (S. cerevisiae) (REXO2), heterogeneous nuclear ribonucleoprotein F (HNRNPF), thymopoietin (TMPO), ubiquitin specific peptidase 24 (USP24), KIAA1967 (KIAA1967), complement component 1, r subcomponent (C1R), annexin A7 (ANXA7), RuvB-like 2 (E. coli) (RUVBL2), acireductone dioxygenase 1 (ADI1), eukaryotic translation initiation factor 4A3 (EIF4A3), heterogeneous nuclear ribonucleoprotein U-like 2 (HNRNPUL2), ubiquitin protein ligase E3 component n-recognin 4 (UBR4), SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, member 2 (SMARCA2), cytochrome b5 reductase 2 (CYB5R2), splicing factor 3b, subunit 3, 130 kDa (SF3B3), G protein pathway suppressor 1 (GPS1), MAD2 mitotic arrest deficient-like 1 (yeast) (MAD2L1), phospholipase C, gamma 1 (PLCG1), eukaryotic translation initiation factor 4H (EIF4H), U2 small nuclear RNA auxiliary factor 2 (U2AF2), tankyrase 1 binding protein 1, 182 kDa (TNKS1BP1), transglutaminase 2 (C polypeptide, protein-glutamine-gamma-glutamyltransferase) (TGM2), heterogeneous nuclear ribonucleoprotein L (HNRNPL), inositol polyphosphate-5-phosphatase, 145 kDa (INPP5D), annexin A10 (ANXA10), BUD31 homolog (S. 
cerevisiae) (BUD31), phosphatidylinositol transfer protein, alpha (PITPNA), leucyl-tRNA synthetase (LARS), nicotinamide N-methyltransferase (NNMT), proteasome (prosome, macropain) 26S subunit, non-ATPase, 12 (PSMD12), v-crk sarcoma virus CT10 oncogene homolog (avian) (CRK), proteoglycan 2, bone marrow (natural killer cell activator, eosinophil granule major basic protein) (PRG2), versican (VCAN), exportin, tRNA (nuclear export receptor for tRNAs) (XPOT), EMG1 nucleolar protein homolog (S. cerevisiae) (EMG1), chromosome 11 open reading frame 73 (C11orf73), transportin 1 (TNPO1), latent transforming growth factor beta binding protein 2 (LTBP2), cold shock domain containing E1, RNA-binding (CSDE1), sulfiredoxin 1 (SRXN1), paraspeckle component 1 (PSPC1), ribosomal protein S3A (RPS3A), ISG15 ubiquitin-like modifier (ISG15), polymerase (RNA) II (DNA directed) polypeptide B, 140 kDa (POLR2B), general transcription factor IIi (GTF2I), NHP2 non-histone chromosome protein 2-like 1 (S. cerevisiae) (NHP2L1), proteasome (prosome, macropain) 26S subunit, non-ATPase, 10 (PSMD10), signal transducer and activator of transcription 1, 91 kDa (STAT1), elongation factor Tu GTP binding domain containing 1 (EFTUD1), mediator complex subunit 23 (MED23), eukaryotic translation initiation factor 2C, 2 (EIF2C2), RNA binding motif protein 4B (RBM4B), KIAA0664 (KIAA0664), core-binding factor, beta subunit (CBFB),
poly(A) binding protein, cytoplasmic 1 (PABPC1), nicotinamide phosphoribosyltransferase (NAMPT), cellular retinoic acid binding protein 2 (CRABP2), thyroid hormone receptor interactor 12 (TRIP12), DnaJ (Hsp40) homolog, subfamily C, member 9 (DNAJC9), StAR-related lipid transfer (START) domain containing 10 (STARD10), ring finger protein 213 (RNF213), eukaryotic translation initiation factor 2B, subunit 5 epsilon, 82 kDa (EIF2B5), bolA homolog 2 (E. coli) (BOLA2/BOLA2B), meningioma expressed antigen 5 (hyaluronidase) (MGEA5), Rab geranylgeranyltransferase, alpha subunit (RABGGTA), pyridoxal-dependent decarboxylase domain containing 1 (PDXDC1), exosome component 8 (EXOSC8), phosphoserine aminotransferase 1 (PSAT1), eukaryotic translation initiation factor 6 (EIF6), chromosome 16 open reading frame 13 (C16orf13), signal transducer and activator of transcription 3 (acute-phase response factor) (STAT3), EGF containing fibulin-like extracellular matrix protein 1 (EFEMP1), defensin, alpha 6, Paneth cell-specific (DEFA6), pyrophosphatase (inorganic) 1 (PPA1), glutathione peroxidase 2 (gastrointestinal) (GPX2), unc-13 homolog D (C. elegans) (UNC13D), protein tyrosine phosphatase, non-receptor type 6 (PTPN6), myosin XVIIIA (MYO18A), fibulin 1 (FBLN1), ribosomal protein L19 (RPL19), diaphanous homolog 1 (Drosophila) (DIAPH1), MMS19 nucleotide excision repair homolog (S. cerevisiae) (MMS19), nudix (nucleoside diphosphate linked moiety X)-type motif 21 (NUDT21), splicing factor 3b, subunit 5, 10 kDa (SF3B5), SAP domain containing ribonucleoprotein (SARNP), ADP-ribosylation factor guanine nucleotide-exchange factor 2 (brefeldin A-inhibited) (ARFGEF2), guanylate binding protein 1, interferon-inducible (GBP1), HECT, UBA and WWE domain containing 1 (HUWE1), processing of precursor 7, ribonuclease P/MRP subunit (S. 
cerevisiae) (POP7), eukaryotic translation initiation factor 2B, subunit 3 gamma, 58 kDa (EIF2B3), N(alpha)-acetyltransferase 10, NatA catalytic subunit (NAA10), DnaJ (Hsp40) homolog, subfamily B, member 1 (DNAJB1), eukaryotic translation initiation factor 2-alpha kinase 2 (EIF2AK2), elongation factor Tu GTP binding domain containing 2 (EFTUD2), grancalcin, EF-hand calcium binding protein (GCA), neuronal cell adhesion molecule (NRCAM), DEAH (Asp-Glu-Ala-His) box polypeptide 16 (DHX16), small glutamine-rich tetratricopeptide repeat (TPR)-containing, alpha (SGTA), serine/threonine kinase 10 (STK10), smu-1 suppressor of mec-8 and unc-52 homolog (C. elegans) (SMU1), PRP4 pre-mRNA processing factor 4 homolog (yeast) (PRPF4), heterogeneous nuclear ribonucleoprotein D (AU-rich element RNA binding protein 1, 37 kDa) (HNRNPD), CTP synthase (CTPS), cleavage stimulation factor, 3′ pre-RNA, subunit 3, 77 kDa (CSTF3), cleavage stimulation factor, 3′ pre-RNA, subunit 1, 50 kDa (CSTF1), heterogeneous nuclear ribonucleoprotein U-like 1 (HNRNPUL1), ribosomal protein S5 (RPS5), protein tyrosine phosphatase, non-receptor type 11 (PTPN11), ladinin 1 (LAD1), component of oligomeric golgi complex 2 (COG2), cullin 2 (CUL2), ribosomal protein S17 (RPS17/RPS17L), proteasome (prosome, macropain) 26S subunit, non-ATPase, 4 (PSMD4), annexin A1 (ANXA1), annexin A2 (ANXA2), 5-aminoimidazole-4-carboxamide ribonucleotide formyltransferase/IMP cyclohydrolase (ATIC), azurocidin 1 (AZU1), baculoviral IAP repeat containing 6 (BIRC6), chitinase 3-like 1 (cartilage glycoprotein-39) (CHI3L1), carbamoyl-phosphate synthase 1, mitochondrial (CPS1), cathepsin G (CTSG), defensin, alpha 1 (DEFA1 (includes others)), deleted in malignant brain tumors 1 (DMBT1), elastase, neutrophil expressed (ELANE), integrin, alpha M (complement component 3 receptor 3 subunit) (ITGAM), lipocalin 2 (LCN2), lectin, galactoside-binding, soluble, 3 binding protein (LGALS3BP), minichromosome maintenance complex component 2 (MCM2), 
matrix metallopeptidase 9 (gelatinase B, 92 kDa gelatinase, 92 kDa type IV collagenase) (MMP9), myeloperoxidase (MPO), mucin 5AC, oligomeric mucus/gel-forming (MUC5AC/MUC5B), nuclear cap binding protein subunit 1, 80 kDa (NCBP1), neurofibromin 1 (NF1), olfactomedin 4 (OLFM4), PDS5, regulator of cohesion maintenance, homolog A (S. cerevisiae) (PDS5A), peptidoglycan recognition protein 1 (PGLYRP1), proteinase 3 (PRTN3), quiescin Q6 sulfhydryl oxidase 1 (QSOX1), regenerating islet-derived 1 alpha (REG1A), S100 calcium binding protein A9 (S100A9), serpin peptidase inhibitor, clade B (ovalbumin), member 10 (SERPINB10), serpin peptidase inhibitor, clade B (ovalbumin), member 5 (SERPINB5), unc-45 homolog A (C. elegans) (UNC45A).
In one embodiment, the methods of the invention screen for one or more biomarkers, the presence of which in a sample is indicative that the individual is at risk of suffering from or is suffering from colorectal cancer.
In an alternative embodiment, the methods of the invention screen for one or more biomarkers, the increased expression of which in a sample relative to a control sample is indicative that the individual is at risk of suffering from or is suffering from colorectal cancer.
In a preferred embodiment, the methods of the invention screen for one or more biomarkers from the group defined in Table 2. The biomarkers provided in Table 2 represent a preferred subset of those defined in Table 1, the differential expression for which relative to control samples is considered statistically significant. Therefore, this group of biomarkers represents a panel of biomarkers from which any number of biomarkers may be selected for screening.
Thus, in one embodiment, the methods of the invention may screen for more than one biomarker from the group defined in Table 2, for example two, three, four, five, six, seven, eight, nine, ten of the biomarkers of the group defined in Table 2. In an alternative embodiment, the methods of the invention screen for more than ten of the biomarkers of the group defined in Table 2, for example, eleven, twelve, thirteen, fourteen, fifteen, sixteen, seventeen, eighteen, nineteen, twenty, thirty, forty, fifty, sixty, seventy, eighty, ninety, one hundred of the biomarkers of the group defined in Table 2. Thus, the methods of the invention screen for the presence of or increased expression of one or more of: complement component C4B (Chido blood group) 2 (C4A/C4B); glutamic-oxaloacetic transaminase 2, mitochondrial (aspartate aminotransferase 2) (GOT2); glucose-6-phosphate isomerase (GPI); transketolase (TKT); N-acylaminoacyl-peptide hydrolase (APEH); histone cluster 1, H4c (HIST4H4 (includes others)); Fatty acid-binding protein 5 (psoriasis-associated) (FABP5); hexosaminidase B (beta polypeptide) (HEXB); epithelial cell adhesion molecule (EPCAM); Nucleoside diphosphate kinase (NME1-NME2); Superoxide dismutase 2, mitochondrial (SOD2); Tu translation elongation factor, mitochondrial (TUFM); Glutathione synthetase (GSS); annexin A2 (ANXA2); ATP synthase, H+ transporting, mitochondrial F1 complex, beta polypeptide (ATP5B); 10 kDa heat shock protein (chaperonin 10) (HSPE1); glyoxalase I (GLO1); histone cluster 2, H2be (HIST2H2BE (includes others)); S100 calcium binding protein A4 (S100A4); S100 calcium binding protein A11 (S100A11); latexin (LXN); dehydrogenase/reductase (SDR family) member 11 (DHRS11); N-acetylglucosaminidase, alpha (NAGLU); Translin (TSN); Proteasome (prosome, macropain) subunit alpha type-4 (PSMA4); Proteasome (prosome, macropain) subunit alpha type-6 (PSMA6); ras-related C3 botulinum toxin substrate 1 (rho family, small GTP binding protein Rac1) (RAC1); 
Adenosylhomocysteinase (AHCY); fucosidase, alpha-L-1, tissue (FUCA1); S100 calcium binding protein P (S100P); Proteasome (prosome, macropain) subunit beta type-2 (PSMB2); X-prolyl aminopeptidase (aminopeptidase P) 1 (XPNPEP1); Keratin 18 (KRT18); Nuclear cap-binding protein subunit 1 80 kDa (NCBP1); mannosidase, alpha, class 2B, member 1 (MAN2B1); S100 calcium binding protein A6 (S100A6); valosin containing protein (VCP); quinolinate phosphoribosyltransferase (QPRT); major histocompatibility complex, class I, B (HLA-B); phosphoglycerate mutase 1 (brain) (PGAM1); ectonucleotide pyrophosphatase/phosphodiesterase 3 (ENPP3); serpin peptidase inhibitor, clade B (ovalbumin), member 10 (SERPINB10); myeloperoxidase (MPO); creatine kinase, mitochondrial 1B (CKMT1A/CKMT1B); proteinase 3 (PRTN3); elastase, neutrophil expressed (ELANE); MORC family CW-type zinc finger 1 (MORC1); ubiquitin B (UBB); phospholipase A2, group IIA (platelets, synovial fluid) (PLA2G2A); carbonic anhydrase IV (CA4); G elongation factor, mitochondrial 2 (GFM2); S100 calcium binding protein A7 (S100A7); Bactericidal permeability-increasing protein (BPI); collagen, type VI, alpha 5 (COL6A5); LIM homeobox 8 (LHX8); cysteine-rich secretory protein 3 (CRISP3); Azurocidin (AZU1); hemicentin 1 (HMCN1); Transglutaminase 3 (E polypeptide, protein-glutamine gamma-glutamyltransferase) (TGM3); CDC42 binding protein kinase alpha (DMPK-like) (CDC42BPA); Cathepsin G (CTSG); Resistin (RETN); methylmalonyl CoA mutase (MUT); armadillo repeat containing, X-linked 4 (ARMCX4); Integrin alpha-M (ITGAM); Calcium channel, voltage dependent, R-type alpha-1E subunit (CACNA1E); T-cell lymphoma invasion and metastasis 2 (TIAM2); HIR histone cell cycle regulation defective homolog A (S. 
cerevisiae) (HIRA); dopey family member 2 (DOPEY2); integrin beta 1 binding protein 3 (ITGB1BP3); Sodium channel, voltage-gated, type VII, alpha (SCN7A); Rab3C, member RAS oncogene family (RAB3C); chromosome 9 open reading frame 79 (C9orf79); nuclear factor of activated T-cells, calcineurin-dependent 4 (NFATC4); UDP-glucose glycoprotein glucosyltransferase 2 (UGGT2); Cornulin (CRNN); kielin/chordin-like protein (KCP); CD1E molecule (CD1E); coiled-coil domain-containing 18 (CCDC18); leukotriene A-4 hydrolase (LTA4H); albumin (ALB); alpha-2-macroglobulin (A2M); complement component 3 (C3); hemoglobin, beta (HBB); transferrin (TF); hemoglobin, alpha 1 (HBA1/HBA2); lactotransferrin (LTF); ceruloplasmin (ferroxidase) (CP); catalase (CAT); group-specific component (vitamin D-binding protein) (GC); serpin peptidase inhibitor, clade C (antithrombin), member 1 (SERPINC1); fibrinogen gamma chain (FGG); S100 calcium binding protein A8 (S100A8); ferritin, light polypeptide (FTL); actin, beta (ACTB); fibronectin 1 (FN1); defensin, alpha 1 (DEFA1 (includes others)); serpin peptidase inhibitor, clade G (C1 inhibitor), member 1 (SERPING1); retinol binding protein 4, plasma (RBP4); peroxiredoxin 2 (PRDX2); fibrinogen alpha chain (FGA); serpin peptidase inhibitor, clade F (alpha-2 antiplasmin, pigment epithelium derived factor), member 2 (SERPINF2); carbonic anhydrase II (CA2); orosomucoid 1 (ORM1/ORM2); lactate dehydrogenase A (LDHA); vitronectin (VTN); kininogen-1 (KNG1); actin, alpha, cardiac muscle 1 (ACTC1); leucine-rich alpha-2-glycoprotein 1 (LRG1); gamma-glutamyl hydrolase (conjugase, folylpolygammaglutamyl hydrolase) (GGH); enolase 1, (alpha) (ENO1); profilin 1 (PFN1); serpin peptidase inhibitor, clade A (alpha-1 antiproteinase, antitrypsin), member 7 (SERPINA7); alpha-1-microglobulin/bikunin precursor (AMBP); lamin A/C (LMNA); apolipoprotein D (APOD); thyroid hormone receptor interactor 11 (TRIP11); complement component 4 binding protein, alpha (C4BPA); tropomyosin 4 
(TPM4); filamin A, alpha (FLNA); haptoglobin (HP); hemopexin (HPX); hemoglobin, delta (HBD); fibrinogen beta chain (FGB); S100 calcium binding protein A9 (S100A9); complement component 5 (C5); solute carrier family 26, member 3 (SLC26A3); complement component 9 (C9); amyloid P component, serum (APCS); alpha-1-B glycoprotein (A1BG); complement C3-like (LOC100133511); inter-alpha (globulin) inhibitor H4 (plasma Kallikrein-sensitive glycoprotein) (ITIH4); complement component C8, alpha polypeptide (C8A); inter-alpha (globulin) inhibitor H1 (ITIH1).
In one embodiment, the method of the invention screens for one or more biomarkers capable of diagnosing or predicting CRC in an individual who tested negative in the fecal immunochemical test.
Thus, the methods of the invention may screen for one or more biomarkers selected from the group defined in Table 3. The biomarkers provided in Table 3 represent a preferred subset of those defined in Table 1, which have been found to be present in significantly higher levels in CRC samples which came back negative from the fecal immunochemical test. Thus, the group of Table 3 represents a panel of biomarkers from which any number of biomarkers may be selected for screening.
Thus, in one embodiment, the methods of the invention may screen for more than one biomarker from the group defined in Table 3, for example two, three, four, five, six, seven, eight, nine, ten of the biomarkers of the group defined in Table 3. In an alternative embodiment, the methods of the invention screen for more than ten of the biomarkers of the group defined in Table 3, for example, eleven, twelve, thirteen, fourteen, fifteen, sixteen, seventeen, eighteen, nineteen, twenty, thirty, forty, fifty, sixty, seventy, eighty, ninety, one hundred of the biomarkers of the group defined in Table 3. Thus, the methods of the invention may screen for the presence of or increased expression of one or more of: complement component C4B (Chido blood group) 2 (C4A/C4B); glutamic-oxaloacetic transaminase 2, mitochondrial (aspartate aminotransferase 2) (GOT2); glucose-6-phosphate isomerase (GPI); transketolase (TKT); N-acylaminoacyl-peptide hydrolase (APEH); hexosaminidase B (beta polypeptide) (HEXB); epithelial cell adhesion molecule (EPCAM); NME1-NME2 readthrough (NME1-NME2); Tu translation elongation factor, mitochondrial (TUFM); Glutathione synthetase (GSS); glyoxalase I (GLO1); latexin (LXN); Proteasome (prosome, macropain) subunit alpha type-4 (PSMA4); fucosidase, alpha-L-1, tissue (FUCA1); Keratin 18 (KRT18); Nuclear cap-binding protein subunit 1 80 kDa (NCBP1); mannosidase, alpha, class 2B, member 1 (MAN2B1); S100 calcium binding protein A6 (S100A6); major histocompatibility complex, class I, B (HLA-B); ectonucleotide pyrophosphatase/phosphodiesterase 3 (ENPP3); serpin peptidase inhibitor, clade B (ovalbumin), member 10 (SERPINB10); myeloperoxidase (MPO); proteinase 3 (PRTN3); phospholipase A2, group IIA (platelets, synovial fluid) (PLA2G2A); carbonic anhydrase IV (CA4); G elongation factor, mitochondrial 2 (GFM2); S100 calcium binding protein A7 (S100A7); Bactericidal permeability-increasing protein (BPI); collagen, type VI, alpha 5 (COL6A5); LIM homeobox 8 (LHX8); 
Azurocidin (AZU1); hemicentin 1 (HMCN1); methylmalonyl CoA mutase (MUT); armadillo repeat containing, X-linked 4 (ARMCX4); Integrin alpha-M (ITGAM); Calcium channel, voltage dependent, R-type alpha-1E subunit (CACNA1E); T-cell lymphoma invasion and metastasis 2 (TIAM2); HIR histone cell cycle regulation defective homolog A (S. cerevisiae) (HIRA); dopey family member 2 (DOPEY2); Sodium channel, voltage-gated, type VII, alpha (SCN7A); chromosome 9 open reading frame 79 (C9orf79); UDP-glucose glycoprotein glucosyltransferase 2 (UGGT2); Cornulin (CRNN); coiled-coil domain-containing 18 (CCDC18); alpha-2-macroglobulin (A2M); complement component 3 (C3); hemoglobin, beta (HBB); transferrin (TF); hemoglobin, alpha 1 (HBA1/HBA2); lactotransferrin (LTF); ceruloplasmin (ferroxidase) (CP); catalase (CAT); fibrinogen gamma chain (FGG); S100 calcium binding protein A8 (S100A8); ferritin, light polypeptide (FTL); fibronectin 1 (FN1); defensin, alpha 1 (DEFA1 (includes others)); serpin peptidase inhibitor, clade G (C1 inhibitor), member 1 (SERPING1); retinol binding protein 4, plasma (RBP4); peroxiredoxin 2 (PRDX2); serpin peptidase inhibitor, clade F (alpha-2 antiplasmin, pigment epithelium derived factor), member 2 (SERPINF2); lactate dehydrogenase (LDHA); vitronectin (VTN); kininogen-1 (KNG1); leucine-rich alpha-2-glycoprotein 1 (LRG1); gamma-glutamyl hydrolase (conjugase, folylpolygammaglutamyl hydrolase) (GGH); serpin peptidase inhibitor, clade A (alpha-1 antiproteinase, antitrypsin), member 7 (SERPINA7); alpha-1-microglobulin/bikunin precursor (AMBP); thyroid hormone receptor interactor 11 (TRIP11); complement component 4 binding protein, alpha (C4BPA); filamin A, alpha (FLNA); hemopexin (HPX); S100 calcium binding protein A9 (S100A9); complement component 5 (C5); solute carrier family 26, member 3 (SLC26A3); complement component 9 (C9); amyloid P component, serum (APCS); complement C3-like (LOC100133511); serpin peptidase inhibitor, clade A (alpha-1 antiproteinase, 
antitrypsin), member 3 (SERPINA3); Cathepsin S (CTSS); Cytidine deaminase (CDA); Shugoshin-like 2 (SGOL2); serpin peptidase inhibitor, clade B (ovalbumin), member 3 (SERPINB3); hect domain and RLD 2 (HERC2); triosephosphate isomerase 1 (TPI1); Isoform 1 of collagen, type IV, alpha 3 (Goodpasture antigen) binding protein (COL4A3BP); ribonuclease T2 (RNASET2); Glutathione reductase (GSR); spinster homolog 3 (Drosophila) (SPNS3); cDNA FLJ60317, highly similar to Aminoacylase-1 (ACY1); serpin peptidase inhibitor, clade B (ovalbumin), member 8 (SERPINB8); Branched-chain-amino-acid aminotransferase (BCAT2); sialic acid acetylesterase (SIAE); peptidase D (PEPD); major histocompatibility complex, class II, DR beta 5 (HLA-DRB5); SET domain containing 2 (SETD2); hect (homologous to the E6-AP (UBE3A) carboxyl terminus) domain and RCC1 (CHC1)-like domain (RLD) 1 (HERC1); isocitrate dehydrogenase 1 (NADP+), soluble (IDH1); cathepsin C (CTSC); serpin peptidase inhibitor, clade A (alpha-1 antiproteinase, antitrypsin), member 1 (SERPINA1); spectrin repeat containing, nuclear envelope 1 (SYNE1); Ankyrin repeat domain-containing protein 35 (ANKRD35); NIMA (never in mitosis gene a)-related kinase 10 (NEK10); inter-alpha (globulin) inhibitor H2 (ITIH2); acyl-CoA dehydrogenase, very long chain (ACADVL); Nebulin (NEB); Zymogen granule membrane protein 16 (ZG16); Vinculin (VCL); Isoform 2 of Dedicator of cytokinesis protein 4 (DOCK4); hypothetical protein LOC643677 (C13orf40); Uncharacterized protein KIAA1797 (KIAA1797); baculoviral IAP repeat-containing 6 (BIRC6); Transaldolase 1 (TALDO1); Taste receptor type 2 member 42 (TAS2R42); chitinase 1 (chitotriosidase) (CHIT1); quiescin Q6 sulfhydryl oxidase 1 (QSOX1); bone marrow stromal cell antigen 12 (BST1); Bleomycin hydrolase (BLMH); Lysozyme C (LYZ).
In one embodiment, the method of the invention screens for one or more biomarkers having a higher discriminatory power than haemoglobin. In other words, the one or more biomarkers have a higher sensitivity and higher specificity than haemoglobin in detecting the presence of advanced adenoma or adenocarcinoma in a sample.
Thus, in one embodiment, the method of the invention screens for one or more biomarkers, for example two, three, four, five, six, seven, eight, nine or ten of the biomarkers selected from the group consisting of: S100 calcium binding protein A8 (S100A8), complement component C4B (Chido blood group) 2 (C4A/C4B), transferrin (TF), alpha-2-macroglobulin (A2M), S100 calcium binding protein A9 (S100A9), proteinase 3 (PRTN3), Azurocidin (AZU1), lactotransferrin (LTF), hemopexin (HPX) and defensin, alpha 1 (DEFA1).
Thus, in one embodiment, the method of the invention screens for one or more biomarkers, for example two, three, four, five, six, seven, eight, nine or ten of the biomarkers selected from the group consisting of: annexin A2 (ANXA2), azurocidin 1 (AZU1), cathepsin G (CTSG), defensin, alpha 1 (DEFA1 (includes others)), elastase, neutrophil expressed (ELANE), integrin, alpha M (complement component 3 receptor 3 subunit) (ITGAM), myeloperoxidase (MPO), nuclear cap binding protein subunit 1, 80 kDa (NCBP1), proteinase 3 (PRTN3), S100 calcium binding protein A9 (S100A9), serpin peptidase inhibitor, clade B (ovalbumin), member 10 (SERPINB10), annexin A1 (ANXA1), baculoviral IAP repeat containing 6 (BIRC6), minichromosome maintenance complex component 2 (MCM2), quiescin Q6 sulfhydryl oxidase 1 (QSOX1), serpin peptidase inhibitor, clade B (ovalbumin), member 5 (SERPINB5).
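By way of illustration only, the comparison of discriminatory power described above, in which a candidate biomarker has both a higher sensitivity and a higher specificity than haemoglobin, can be sketched as follows. The function names and the sample data are hypothetical and are not taken from the study; the sketch assumes each sample has been reduced to a binary detected/not-detected call per marker and a known lesion status.

```python
def sensitivity_specificity(detected, has_lesion):
    """Return (sensitivity, specificity) for paired lists of booleans.

    detected[i]   -- marker called positive in sample i
    has_lesion[i] -- sample i comes from an individual with advanced
                     adenoma or adenocarcinoma
    """
    pairs = list(zip(detected, has_lesion))
    tp = sum(1 for d, l in pairs if d and l)          # true positives
    fn = sum(1 for d, l in pairs if not d and l)      # false negatives
    tn = sum(1 for d, l in pairs if not d and not l)  # true negatives
    fp = sum(1 for d, l in pairs if d and not l)      # false positives
    return tp / (tp + fn), tn / (tn + fp)


def outperforms(candidate, reference, has_lesion):
    """True if the candidate beats the reference on both measures."""
    c_sens, c_spec = sensitivity_specificity(candidate, has_lesion)
    r_sens, r_spec = sensitivity_specificity(reference, has_lesion)
    return c_sens > r_sens and c_spec > r_spec


# Hypothetical calls for eight stool samples (four cases, four controls).
has_lesion = [True, True, True, True, False, False, False, False]
haemoglobin = [True, True, False, False, False, False, True, False]
candidate = [True, True, True, False, False, False, False, False]

print(sensitivity_specificity(haemoglobin, has_lesion))
print(sensitivity_specificity(candidate, has_lesion))
print(outperforms(candidate, haemoglobin, has_lesion))
```

In practice the detected/not-detected call would depend on an assay-specific threshold, which this sketch does not model.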
In one embodiment, the method of the invention screens for one or more biomarkers, the presence of which in a sample is indicative that the individual is at risk of suffering from or is suffering from colorectal cancer.
Thus, in one embodiment, the method of the invention screens for one or more biomarkers, for example two, three, four, five, six, seven, eight, nine or ten of the biomarkers selected from the group defined in Table 4. The biomarkers provided in Table 4 represent a preferred subset of those defined in Table 1, which have been found to be expressed only in CRC samples, not in control samples. Thus, the group of Table 4 represents a panel of biomarkers from which any number of biomarkers may be selected for screening.
In one embodiment, the method of the invention screens for one or more biomarkers, the presence of which in a sample is indicative that the individual is at risk of suffering from or is suffering from colorectal cancer.
Thus, in one embodiment, the method of the invention screens for one or more biomarkers, for example two, three, four, five, six, seven, eight, nine or ten of the biomarkers selected from the group defined in Table 5. The biomarkers provided in Table 5 represent a preferred subset of those defined in Table 1, which were identified in an independent verification set of stool samples.
Thus, in a further embodiment, the method of the invention screens for one or more biomarkers, for example two, three, four, five, six, seven, eight, nine or ten of the biomarkers selected from the group consisting of: alpha-1-B glycoprotein (A1BG), alpha-2-macroglobulin (A2M), actin, beta (ACTB), actin, alpha, cardiac muscle 1 (ACTC1), albumin (ALB), amyloid P component, serum (APCS), N-acylaminoacyl-peptide hydrolase (APEH), apolipoprotein D (APOD), azurocidin 1 (AZU1), complement component 3 (C3), complement component 4B (Chido blood group) (C4A/C4B), complement component 5 (C5), carbonic anhydrase II (CA2), carbonic anhydrase IV (CA4), catalase (CAT), ceruloplasmin (ferroxidase) (CP), cathepsin G (CTSG), defensin, alpha 1 (DEFA1 (includes others)), elastase, neutrophil expressed (ELANE), enolase 1, (alpha) (ENO1), filamin A, alpha (FLNA), fibronectin 1 (FN1), ferritin, light polypeptide (FTL), fucosidase, alpha-L-1, tissue (FUCA1), gamma-glutamyl hydrolase (conjugase, folylpolygammaglutamyl hydrolase) (GGH), glutamic-oxaloacetic transaminase 2, mitochondrial (aspartate aminotransferase 2) (GOT2), glucose-6-phosphate isomerase (GPI), glutathione synthetase (GSS), hemoglobin, alpha 1 (HBA1/HBA2), hemoglobin, beta (HBB), hemoglobin, delta (HBD), hexosaminidase B (beta polypeptide) (HEXB), histone cluster 1, H4c (HIST4H4 (includes others)), major histocompatibility complex, class I, B (HLA-B), haptoglobin (HP), hemopexin (HPX), heat shock 10 kDa protein 1 (chaperonin 10) (HSPE1), lactate dehydrogenase A (LDHA), complement C3-like (LOC100133511), leucine-rich alpha-2-glycoprotein 1 (LRG1), lactotransferrin (LTF), latexin (LXN), mannosidase, alpha, class 2B, member 1 (MAN2B1), myeloperoxidase (MPO), N-acetylglucosaminidase, alpha (NAGLU), orosomucoid 1 (ORM1/ORM2), profilin 1 (PFN1), phospholipase A2, group IIA (platelets, synovial fluid) (PLA2G2A), proteinase 3 (PRTN3), proteasome (prosome, macropain) subunit, alpha type, 4 (PSMA4), proteasome (prosome, macropain) subunit, 
alpha type, 6 (PSMA6), proteasome (prosome, macropain) subunit, beta type, 2 (PSMB2), retinol binding protein 4, plasma (RBP4), resistin (RETN), S100 calcium binding protein A8 (S100A8), S100 calcium binding protein A9 (S100A9), serpin peptidase inhibitor, clade B (ovalbumin), member 10 (SERPINB10), serpin peptidase inhibitor, clade C (antithrombin), member 1 (SERPINC1), serpin peptidase inhibitor, clade F (alpha-2 antiplasmin, pigment epithelium derived factor), member 2 (SERPINF2), serpin peptidase inhibitor, clade G (C1 inhibitor), member 1 (SERPING1), superoxide dismutase 2, mitochondrial (SOD2), transferrin (TF), transketolase (TKT), ubiquitin B (UBB), Unmapped by Ingenuity (IP100884078), Unmapped by Ingenuity (26kDaprotein, IP100942608), alpha-2-macroglobulin-like 1 (A2ML1), acyl-CoA dehydrogenase, very long chain (ACADVL), Unmapped by Ingenuity (ACY1), alkaline phosphatase, liver/bone/kidney (ALPL), bleomycin hydrolase (BLMH), bone marrow stromal cell antigen 1 (BST1), calcium channel, voltage-dependent, L type, alpha 1D subunit (CACNA1D), creatine kinase, mitochondrial 1B (CKMT1A/CKMT1B), cathepsin S (CTSS), fumarylacetoacetate hydrolase (fumarylacetoacetase) (FAH), guanine deaminase (GDA), glutathione reductase (GSR), heterogeneous nuclear ribonucleoprotein A2/B1 (HNRNPA2B1), heat shock 70 kDa protein (HSPA8), keratin 6C (KRT6C), keratin 6C (KRT6C), lipocalin 1 (tear prealbumin) (LCN1), lysozyme (LYZ), minichromosome maintenance complex component 2 (MCM2), peptidase D (PEPD), phospholipase B domain containing 1 (PLBD1), proteasome (prosome, macropain) subunit, alpha type, 2 (PSMA2), proteasome (prosome, macropain) subunit, beta type, 1 (PSMB1), proteasome (prosome, macropain) subunit, beta type, 6 (PSMB6), quiescin Q6 sulfhydryl oxidase 1 (QSOX1), ribonuclease T2 (RNASET2), serpin peptidase inhibitor, clade A (alpha-1 antiproteinase, antitrypsin), member 1 (SERPINA1), serpin peptidase inhibitor, clade A (alpha-1 antiproteinase, antitrypsin), member 3 
(SERPINA3), serpin peptidase inhibitor, clade B (ovalbumin), member 3 (SERPINB3), sialic acid acetylesterase (SIAE), triosephosphate isomerase 1 (TPI1), zymogen granule protein 16 homolog (rat) (ZG16).
In one embodiment, the method of the invention screens for one or more biomarkers, the presence of which in a sample is indicative that the individual is at risk of suffering from or is suffering from colorectal cancer, which biomarkers have been selected on the basis of correlation with secretomes of colon tumor tissue.
Thus, in one embodiment, the method of the invention screens for one or more biomarkers, for example two, three, four, five, six, seven, eight, nine or ten of the biomarkers selected from the group defined in Table 7. The biomarkers provided in Table 7 represent a preferred subset of those defined in Table 6, which have been found to be expressed both in tumor tissue and stool samples. Thus, the group of Table 7 represents a panel of biomarkers from which any number of biomarkers may be selected for screening. Thus, in a further embodiment, the method of the invention screens for one or more biomarkers, for example two, three, four, five, six, seven, eight, nine or ten of the biomarkers selected from the group consisting of: annexin A2 (ANXA2), azurocidin 1 (AZU1), cathepsin G (CTSG), defensin, alpha 1 (DEFA1 (includes others)), elastase, neutrophil expressed (ELANE), integrin, alpha M (complement component 3 receptor 3 subunit) (ITGAM), myeloperoxidase (MPO), nuclear cap binding protein subunit 1, 80 kDa (NCBP1), proteinase 3 (PRTN3), S100 calcium binding protein A9 (S100A9), serpin peptidase inhibitor, clade B (ovalbumin), member 10 (SERPINB10), annexin A1 (ANXA1), baculoviral IAP repeat containing 6 (BIRC6), minichromosome maintenance complex component 2 (MCM2), quiescin Q6 sulfhydryl oxidase 1 (QSOX1), serpin peptidase inhibitor, clade B (ovalbumin), member 5 (SERPINB5).
Preferably, the method of the invention has an accuracy of at least 75%, for example 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% accuracy.
Preferably, the method of the invention has a sensitivity of at least 75%, for example 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sensitivity.
Preferably, the method of the invention has a specificity of at least 75%, for example 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% specificity.
By “accuracy” it is meant the proportion of all diagnoses that are correct; by “sensitivity” it is meant the proportion of individuals who actually have the condition that are correctly classified as positive; and by “specificity” it is meant the proportion of individuals who do not have the condition that are correctly classified as negative. The methods of the invention may be verified by subsequent colonoscopy.
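Purely by way of illustration, the three measures defined above may be computed from the numbers of true positive, false positive, true negative and false negative screening outcomes, with colonoscopy taken as the reference. The following Python sketch makes that assumption; the function name `diagnostic_metrics` and the example counts are hypothetical and are not data from the present study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return accuracy, sensitivity and specificity from outcome counts.

    tp/fn: diseased individuals classified positive/negative by the test.
    tn/fp: non-diseased individuals classified negative/positive by the test.
    """
    total = tp + fp + tn + fn
    if total == 0 or (tp + fn) == 0 or (tn + fp) == 0:
        raise ValueError("need at least one diseased and one non-diseased individual")
    return {
        "accuracy": (tp + tn) / total,      # correct outcomes / all outcomes
        "sensitivity": tp / (tp + fn),      # detected cases / all true cases
        "specificity": tn / (tn + fp),      # correct negatives / all non-cases
    }


# Hypothetical counts, chosen only to exercise the function.
metrics = diagnostic_metrics(tp=9, fp=1, tn=9, fn=3)
meets_threshold = all(value >= 0.75 for value in metrics.values())
```

In this made-up example the sensitivity is exactly 75%, so the "at least 75%" criterion is met for all three measures.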
Screening Methods
In one embodiment, the biological sample, for example a stool sample, may be screened for the one or more biomarkers using (targeted) mass spectrometry.
Protein marker discovery and screening by LC-MS/MS and hypothesis-based protein marker detection by SRM-MS have been applied before on murine and human stool samples, respectively (Ang C S, Nice E C. J Proteome Res 2010; 9: 4346-55; Ang C S, et al. Electrophoresis 2011; 32: 1926-38; Ang C S, et al. J Chromatogr A 2010; 1217: 3330-40).
Despite the complexity of stool material, the present invention has successfully detected several of these previously reported human stool proteins. In addition, the number of included samples and corresponding identified proteins in the current study exceeds that of previous studies. Therefore the data presented here is the largest overview of the human stool proteome to date.
While sample screening by the mass spectrometric techniques described herein has been proven to be effective, a preferred screening method comprises an antibody based screening assay. The fecal immunochemical test (FIT) comprises an antibody based screening assay and so an antibody based screening assay for one or more of the biomarkers identified in the present application provides a complementary screen which can be readily incorporated into the existing FIT assay.
Thus, in one embodiment of the methods of the invention, the biological sample is screened for the one or more biomarkers using a binding agent capable of binding to the one or more biomarkers.
Binding agents (also referred to as binding molecules) can be selected from a library, based on their ability to bind a given motif, as discussed below.
In one embodiment, the first binding agent is an antibody or a fragment thereof. Thus, a fragment may contain one or more of the variable heavy (VH) or variable light (VL) domains. For example, the term antibody fragment includes Fab-like molecules (Better et al Science 1988; 240, 1041); Fv molecules (Skerra et al Science 1988; 240, 1038); single-chain Fv (ScFv) molecules where the VH and VL partner domains are linked via a flexible oligopeptide (Bird et al Science 1988; 242,423; Huston et al Proc. Natl. Acad. Sci. USA 1988; 85, 5879) and single domain antibodies (dAbs) comprising isolated V domains (Ward et al Nature 1989; 341,544).
The term “antibody variant” includes any synthetic antibodies, recombinant antibodies or antibody hybrids, such as but not limited to, a single-chain antibody molecule produced by phage-display of immunoglobulin light and/or heavy chain variable and/or constant regions, or other immunointeractive molecule capable of binding to an antigen in an immunoassay format that is known to those skilled in the art.
A general review of the techniques involved in the synthesis of antibody fragments which retain their specific binding sites is to be found in Winter & Milstein Nature 1991; 349, 293-299.
In one embodiment, the antibody or fragment thereof is a recombinant antibody or fragment thereof (such as an scFv). By “ScFv molecules” it is meant molecules wherein the VH and VL partner domains are linked via a flexible oligopeptide.
The advantages of using antibody fragments, rather than whole antibodies, are several-fold. Effector functions of whole antibodies, such as complement binding, are removed. Fab, Fv, ScFv and dAb antibody fragments can all be expressed in and secreted from E. coli, thus allowing the facile production of large amounts of the said fragments.
Whole antibodies, and F(ab′)2 fragments are “bivalent”. By “bivalent” it is meant that the said antibodies and F(ab′)2 fragments have two antigen combining sites. In contrast, Fab, Fv, ScFv and dAb fragments are monovalent, having only one antigen combining site.
The antibodies may be monoclonal or polyclonal. Suitable antibodies may be prepared by known techniques and need no further discussion.
Additionally or alternatively the binding agent may be an aptamer. Suitable aptamers may be prepared by known techniques and need no further discussion.
In one embodiment, the one or more biomarker(s) in the test sample is labelled with a detectable moiety. In one embodiment, the one or more biomarker(s) in the control sample is labelled with a detectable moiety (which may be the same or different from the detectable moiety used to label the test sample).
A “detectable moiety” is one which may be detected and the relative amount and/or location of the moiety (for example, the location on an array) determined.
Detectable moieties are well known in the art. A detectable moiety may be a fluorescent and/or luminescent and/or chemiluminescent moiety which, when exposed to specific conditions, may be detected. For example, a fluorescent moiety may need to be exposed to radiation (i.e. light) at a specific wavelength and intensity to cause excitation of the fluorescent moiety, thereby enabling it to emit detectable fluorescence at a specific wavelength that may be detected.
Alternatively, the detectable moiety may be an enzyme which is capable of converting a (preferably undetectable) substrate into a detectable product that can be visualised and/or detected. Examples of suitable enzymes are discussed in more detail below in relation to, for example, ELISA assays.
Alternatively, the detectable moiety may be a radioactive label, which may be incorporated by methods well known in the art.
Arrays
In one embodiment, the methods of the present invention may be carried out on an array.
Arrays per se are well known in the art. Typically they are formed of a linear or two-dimensional structure having spaced apart (i.e. discrete) regions (“spots”), each having a finite area, formed on the surface of a solid support. An array can also be a bead structure where each bead can be identified by a molecular code or colour code or identified in a continuous flow. Analysis can also be performed sequentially where the sample is passed over a series of spots each adsorbing the class of molecules from the solution.
The solid support is typically glass or a polymer, the most commonly used polymers being cellulose, polyacrylamide, nylon, polystyrene, polyvinyl chloride or polypropylene. The solid supports may be in the form of tubes, beads, discs, silicon chips, microplates, polyvinylidene difluoride (PVDF) membrane, nitrocellulose membrane, nylon membrane, other porous membrane, non-porous membrane (e.g. plastic, polymer, perspex, silicon, amongst others), a plurality of polymeric pins, or a plurality of microtitre wells, or any other surface suitable for immobilising proteins, antibodies and other suitable molecules and/or conducting an immunoassay.
The binding processes are well known in the art and generally consist of cross-linking, covalently binding or physically adsorbing a protein molecule, antibody or the like to the solid support. By using well-known techniques, such as contact or non-contact printing, masking or photolithography, the location of each spot can be defined.
Once suitable binding molecules (discussed above) have been identified and isolated, the skilled person can manufacture an array using methods well known in the art of molecular biology.
In one embodiment, the screening may comprise using an assay comprising a second binding agent capable of binding to the one or more biomarkers, the second binding agent having a detectable moiety.
In one embodiment, the second binding agent is an antibody or a fragment thereof (for example, as described above in relation to the first binding agent).
Typically, the assay is an ELISA (Enzyme Linked Immunosorbent Assay) which usually involves the use of enzymes which give a coloured reaction product, usually in solid phase assays. Enzymes such as horseradish peroxidase and phosphatase have been widely employed. A way of amplifying the phosphatase reaction is to use NADP as a substrate to generate NAD which now acts as a coenzyme for a second enzyme system. Pyrophosphatase from Escherichia coli provides a good conjugate because the enzyme is not present in tissues, is stable and gives a good reaction colour. Chemi-luminescent systems based on enzymes such as luciferase can also be used.
Conjugation with the vitamin biotin is also employed, since this can readily be detected by its reaction with enzyme-linked avidin or streptavidin, to which it binds with great specificity and affinity.
It will be appreciated by persons skilled in the art that there is a degree of fluidity in the biomarker composition of the signatures of the invention. Thus, different combinations of the biomarkers may be equally useful in the diagnosis, prognosis and/or characterisation of colorectal cancer. In this way, each biomarker (either alone or in combination with one or more other biomarkers) makes a contribution to the signature.
Compounds and Methods for Treating CRC
The identification of the biomarkers as defined in Table 1 and/or Table 6 allows not only the detection of advanced colonic adenomas and colonic adenocarcinomas (colorectal cancer), but enables also methods of treating colorectal cancers as defined in the fourth aspect of the invention, and also provides for compounds for use in methods of treating colorectal cancers as defined in the fifth aspect of the invention.
While early diagnosis of colorectal cancer often allows for curative surgical removal of the tumour, later diagnosis may result in a (chemo)therapeutic treatment instead. Therapeutic agents used to treat colorectal cancer include monoclonal antibodies, small molecule inhibitors and chemotherapeutic agents.
Typical therapeutic monoclonal antibodies include but are not limited to bevacizumab, cetuximab or panitumumab. Typical small molecule inhibitors include but are not limited to erlotinib, sorafenib or alisertib. Typical chemotherapeutic agents include but are not limited to 5-FU, capecitabine, irinotecan, oxaliplatin or leucovorin, or any combination thereof. Combination therapies of, for example, a therapeutic monoclonal antibody and a small molecule inhibitor may be used. Thus, any combination of two or more of a monoclonal antibody, a small molecule inhibitor and a chemotherapeutic agent is envisaged.
Kit for Performing the Method
The kit for performing the method according to the invention may be selected from any suitable assay and data processing apparatus and equipment.
The suitable selection will be well within the ability of those skilled in the art, and further description is not necessary here.
“Comprising”
The term “comprising” and related terms herein are to be interpreted as embracing “consisting essentially of” and “consisting of”, these two expressions being interchangeable with “comprising” in all definitions and discussion in this patent in order to specify alternative extents of exclusion of unspecified elements additional to the recited elements.
The term “comprising” and related expressions means “including” and therefore leaves open the option of including unspecified elements, whether essential or not. The term “consisting essentially of” and related expressions permits the presence of elements that do not materially affect the basic and novel or functional characteristic(s) of that embodiment of the invention. The term “consisting of” and related expressions means “consisting only of” and therefore excludes any element not recited in that description of the embodiment.
The invention shall now be further described by the following example with reference to the attached figures. The example is provided by way of example only, without any intended limitation of the scope of the invention. All cited references are incorporated herein by reference in their entireties.
The following tables list the biomarkers according to the invention, and may continue over several pages.
The invention is further described by the following, illustrative examples. These are not intended to limit the scope of the invention.
Stool Samples and Protein Extraction
Written informed consent was obtained from all subjects who provided stool samples and this study was approved by the Medical Ethical Committee of the VU University Medical Center, The Netherlands.
Partial stool samples were collected from referral subjects who underwent colonoscopy between November 2003 and June 2006 at the VU University Medical Center in Amsterdam, The Netherlands. The samples were collected before colonoscopy, or following diagnosis at colonoscopy and prior to surgical resection of the tumors (see Table 8 for patient characteristics).
Independent (Verification) Set of Stool Samples
Homogenized whole stool samples were collected from subjects who were referred for and underwent colonoscopy between July 2009 and April 2011 at the VU University Medical Center in Amsterdam, The Netherlands. Whole stool samples were collected before colonoscopy (see Table 1 for patient characteristics). The study participants added stabilization buffer to the stool samples immediately after defecation; the samples were processed in the lab to a final stool:buffer w/v ratio of 1:7 within 72 hours and stored at −80° C. until use. Protein extracts were prepared as described above using 2 ml homogenized stool sample as starting material. Equal amounts of protein extracts were mixed and further treated as a single sample. Four pools from different categories were composed, i.e. controls (n=5) and individuals with non-advanced adenomas (n=5), advanced adenomas (n=5) and CRC (n=5) (see Table 8 for patient characteristics).
a This column represents an anonymization number code.
b As per the Modified Astler Coller classification (Compton and Greene, CA Cancer J Clin 2004).
c Cut-off >75 ng/ml (van Veen, Ned Tijdschr Geneeskd 2009).
At collection, stool samples were immediately stored at 4° C. and transferred to −20° C. within 36 hours. Stool samples were thawed and, after performing the fecal immunochemical test (FIT), ~1 g of stool was taken from each sample for protein extraction. For this study, stool samples from 10 subjects without colon neoplasia, 4 stool samples from CRC patients with a negative FIT score and 8 stool samples from CRC patients with a positive FIT score were selected.
Stool proteins were extracted as described before (Ang C S, Nice E C. Targeted in-gel MRM: a hypothesis driven approach for colorectal cancer biomarker discovery in human feces. J Proteome Res 2010; 9: 4346-55) with a few adaptations. In short, samples were homogenized in a two-fold excess volume of PBS by vortexing and centrifuged at 4° C. for 15 minutes at 16,000×g. The supernatants were centrifuged once more at 4° C. for 10 minutes at full speed. Following the last spin cycle, the supernatants were cleared of remaining particles by filtering through a 0.22 μm PVDF filter (Millipore, Billerica, Mass., USA). Finally, the samples were concentrated to approximately 200 μl using a 3 kDa cut-off filter (Amicon Ultra, Millipore Corporation, Billerica, Mass., USA).
1D-SDS Gel Electrophoresis and Sample Processing for Proteomics Analysis
Equal protein amounts (~30 μg) were loaded and separated on precast 4-12% gradient gels (Invitrogen, Carlsbad, USA), fixed in 50% ethanol containing 3% phosphoric acid, washed and stained overnight with Coomassie R-250. After staining, the gels were washed in MilliQ water and stored at 4° C. until processing for in-gel digestion. Each lane was cut into 10 equal individual bands and each band was further processed into tryptic peptides as described before (Albrethsen J, et al. Mol Cell Proteomics 2010; 9: 988-1005; Piersma S R, et al. J Proteome Res 2010; 9: 1913-22).
For samples from an independent stool collection (verification experiment) equal amounts of the samples (20 μl) were loaded on a 12.5% SDS-PAGE gel and run into the gel until the proteins entered the running gel. Then the gel was fixed and stained as described above. The samples were cut out of the gel as a single band and further processed to tryptic peptides as described before (Albrethsen J, et al. Mol Cell Proteomics 2010; 9: 988-1005; Piersma S R, et al. J Proteome Res 2010; 9: 1913-22). The peptides were extracted and the volume of the desalted peptide fractions was reduced to 50 μl in a vacuum centrifuge.
nanoLC-MS/MS Proteomics Analysis
Peptides were separated by an Ultimate 3000 nanoLC system (Dionex LC-Packings, Amsterdam, The Netherlands) equipped with a 20 cm×75 μm ID fused silica column custom packed with 3 μm 120 Å ReproSil Pur C18 aqua (Dr Maisch GMBH, Ammerbuch-Entringen, Germany) as described previously (Piersma S R, et al. J Proteome Res 2010; 9: 1913-22). Peptides were trapped on a 5 mm×300 μm ID Pepmap C18 cartridge (Dionex LC-Packings, Amsterdam, The Netherlands) and separated at 300 nl/min in a 60 min gradient. Intact peptide mass spectra and fragmentation spectra (top 5) were acquired on an LTQ-FT hybrid mass spectrometer (Thermo Fisher, Bremen, Germany). Dynamic exclusion was applied with a repeat count of 1 and an exclusion time of 30 seconds. Peptides from the four pooled samples (verification experiment) were separated in triplicate on a 75 μm×50 cm custom packed Reprosil C18 aqua column (1.9 μm, 120 Å) in a 240 min gradient (5-32% acetonitrile+0.5% acetic acid at 300 nl/min) using a U3000 RSLC high pressure nanoLC (Dionex). Eluting peptides were measured on-line by a Q Exactive mass spectrometer (ThermoFisher Scientific) operating in data-dependent acquisition mode. Peptides were ionized using a stainless-steel emitter at a potential of +2 kV (ThermoScientific). Intact peptide ions were detected at a resolution of 35000 and fragment ions at a resolution of 17500; the MS mass range was 350-1500 Da. AGC target settings were 3E6 charges for MS and 2E5 charges for MS/MS. Peptides were selected for HCD fragmentation at an underfill ratio of 1% and a quadrupole isolation window of 1.5 Da, and were fragmented at a normalized collision energy of 30 eV.
SRM Analysis
LC-SRM analysis was performed on the four pools from the verification set. Chromatographic separation of peptides was performed by an Ultimate 3000 RSLC Nano LC system (Dionex) equipped with custom packed nano-LC columns consisting of 20 cm×75 μm ID fused silica packed with 3 μm 100 Å ReproSil Pur C18 aqua (Dr Maisch GMBH, Ammerbuch-Entringen, Germany) as described before (ref. 16). Samples were analyzed in triplicate on a QTRAP 5500 instrument (AB SCIEX, Foster City, Calif.) operated in positive SRM mode and equipped with a nano-electrospray source with an applied voltage of 2.2 kV and a capillary heater temperature of 225° C. The scheduled SRM mode comprised the following parameters: SRM detection window of 900 sec, target scan time of 3.0 s, curtain gas of 15, ion source gas 1 of 25, declustering potential of 100, entrance potential of 10. Q1 and Q3 resolution were both set to unit. Pause between mass ranges was set to 1 ms. The collision cell exit potential was set to 36 for all transitions.
SRM Assay Development
An SRM assay for the target proteins was developed using the MRMPilot™ software version 2.1 (AB SCIEX, Concord, ON, Canada). For each included protein, 5 peptides were selected and purchased from JPT Peptide Technologies (Berlin, Germany) as non-purified ‘SpikeTides’. For each protein one μl of a mixture of the 5 selected peptides was injected at a concentration of approximately 50 fmol per peptide. For each of the peptides, the MRMPilot™ software was set to generate up to 20 theoretically possible SRM transitions, each consisting of the calculated m/z of the precursor ion (at any charge state predicted by the software) in combination with the predicted fragment ions for each predicted precursor. The highest responding peptides/transitions at a theoretically calculated optimum collision energy were determined, as well as the identity of the peptide via SRM triggered MS/MS. Each verification analysis was set up to detect 100 of all theoretically predicted transitions and their theoretically predicted optimum collision energy corresponding to the 5 peptides assessed for each candidate protein. The combined information from each SRM—Information Dependent Acquisition experiment was used to perform Mascot searches against the human Swiss-prot protein database and MultiQuant™ software version 2.1 (AB SCIEX). MultiQuant™ software was also used to generate method files for peptide verification and collision energy optimization and to integrate the results of all optimization cycles. MRMPilot was used to schedule three transitions at the experimentally found optimum and the experimentally found retention time for each peptide. The final assay contained 114 scheduled transitions, 3 for each peptide (1-5 peptides for each of the candidate proteins) and 6 for 2 external control peptides. SRM analyses on each pool were carried out in triplicate.
SRM Data Analysis
The area under the curve (AUC) and retention time of each transition were determined using MultiQuant software (AB SCIEX). Obtained transitions were then verified on two characteristics. Firstly, a retention time check was carried out to reveal false positives or false negatives in the assignment of transitions. If the retention times of the three transitions belonging to one peptide differed by >3 seconds, the three transitions were manually inspected to assess whether they had been correctly assigned by the MultiQuant software. If this occurred in more than one of the technical replicates, the peptide for that sample was disregarded. Secondly, the consistency of the ratios between the areas under the curve of pairs of transitions (Transition1/Transition2, Transition2/Transition3 and Transition1/Transition3) for each peptide in the CRC pool was verified. A relative standard deviation (RSD) percentage of each of these ratios was calculated over the triplicate analysis. If one transition caused the RSD percentage of two ratios to be >20%, the transitions were manually inspected and, if only one transition appeared to be incorrect, the two transitions resulting in <20% RSD over all analyses were selected for further analysis. If all three transition ratios had an RSD percentage above 20%, the peptide for that sample was disregarded.
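The ratio consistency check described above may be sketched as follows. This is an illustrative Python example only and not the MultiQuant implementation: it assumes that the RSD is computed with the sample standard deviation over the technical replicates, and the function names, transition labels and peak areas are hypothetical.

```python
from itertools import combinations
from statistics import mean, stdev


def ratio_rsd_percent(areas_by_replicate):
    """RSD (%) of each pairwise area ratio across technical replicates.

    areas_by_replicate: list of dicts {transition_label: peak_area},
    one dict per replicate, all with the same transition labels.
    """
    labels = sorted(areas_by_replicate[0])
    result = {}
    for a, b in combinations(labels, 2):
        ratios = [rep[a] / rep[b] for rep in areas_by_replicate]
        # Assumption: sample standard deviation (n - 1 in the denominator).
        result[(a, b)] = 100.0 * stdev(ratios) / mean(ratios)
    return result


def consistent_pairs(areas_by_replicate, max_rsd=20.0):
    """Transition pairs whose area ratio has an RSD below max_rsd percent."""
    rsd = ratio_rsd_percent(areas_by_replicate)
    return [pair for pair, value in rsd.items() if value < max_rsd]


# Hypothetical triplicate: T1 and T2 track each other, T3 is erratic.
replicates = [
    {"T1": 1000.0, "T2": 500.0, "T3": 100.0},
    {"T1": 1100.0, "T2": 550.0, "T3": 200.0},
    {"T1": 900.0, "T2": 450.0, "T3": 50.0},
]
kept = consistent_pairs(replicates)
```

In this made-up triplicate only the T1/T2 ratio stays below the 20% limit, which corresponds to the case in which one transition (T3) is set aside and the remaining two are retained after manual inspection.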
Database Searching and Statistical Analysis
MS/MS spectra were searched against the human IPI database 3.62 (83,947 entries) using Sequest (version 27, rev 12). Scaffold version 3.00 (Proteomesoftware, Portland, Oreg.) was used to organize the data and to validate peptide identifications using PeptideProphet (probability >95%) and ProteinProphet (probability of >99% with 2 peptides or more in one sample) (Keller A, et al. Anal Chem 2002; 74: 5383-92; Nesvizhskii A I, et al. Anal Chem 2003; 75: 4646-58). The software programs SecretomeP and SignalP were used for prediction of non-classical secretion and presence of a signal peptide (Bendtsen J D, et al. Protein Eng Des Sel 2004; 17: 349-56; Bendtsen J D, et al. J Mol Biol 2004; 340: 783-95). Q Exactive MS/MS spectra were searched against the human IPI database 3.68 using MaxQuant 1.2.2.5. Maximum allowed mass deviation was set to 20 ppm for MS and MS/MS. Cysteine carboxamidomethylation was set as fixed modification and methionine oxidation and protein N-terminal acetylation were set as variable modifications. Identifications were filtered to 1% FDR at both the peptide and protein level. For each protein the number of assigned MS/MS spectra was summed across the 10 fractions and exported to Excel. Normalized counts were calculated by dividing the counts per protein by the sum of all counts per sample and multiplying by the average sum across all samples. Differential analysis of proteins present in stool samples from CRC patients versus stool samples from healthy controls was performed using the beta-binomial test. The beta-binomial test takes into account the within-sample variation and the between-sample variation in a single statistical model (Pham T V, et al. Bioinformatics 2010; 26: 363-9).
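The spectral count normalization described above may, for example, be expressed as follows. The Python sketch below is illustrative only; the function name, the nested-dictionary layout and the example counts are assumptions and do not reproduce data from the study.

```python
def normalize_spectral_counts(counts):
    """Normalize spectral counts per sample.

    counts: {sample_id: {protein: spectral_count}}.
    Each count is divided by the total count of its sample and multiplied
    by the average total count across all samples.
    """
    totals = {sample: sum(proteins.values()) for sample, proteins in counts.items()}
    average_total = sum(totals.values()) / len(totals)
    return {
        sample: {
            protein: count * average_total / totals[sample]
            for protein, count in proteins.items()
        }
        for sample, proteins in counts.items()
    }


# Hypothetical counts for two samples and two proteins.
raw_counts = {
    "sample_A": {"HBB": 30, "S100A9": 70},      # total 100
    "sample_B": {"HBB": 150, "S100A9": 150},    # total 300
}
normalized = normalize_spectral_counts(raw_counts)
```

After normalization every sample sums to the same total (here the average of 100 and 300), so that counts for a given protein can be compared between samples of different overall depth.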
Fecal Immunochemical Test Analysis
FIT samples (OC-Sensor®, Eiken Chemical Co., Tokyo, Japan) were processed with the OC sensor MICRO desktop analyser (Eiken Chemical) and analyzed according to the manufacturer's instructions. A cut-off of 75 ng/ml was used to determine a positive test result (van Veen W, Mali W P. [Colorectal cancer screening: advice from the Health Council of the Netherlands]. Ned Tijdschr Geneeskd 2009; 153:A1441).
Results
Human Protein Identification in Stool Samples
In total 830 human proteins were identified in at least one of the 22 stool samples. Of these, 624 proteins were identified both in CRC and control stool samples, 164 proteins were detected only in CRC stool samples, and 42 proteins were detected only in control stool samples (see
The primary annotation of subcellular localization of the proteins was mainly the cytoplasm (35%) and the extracellular space (16%) and did not differ between CRC and control samples (see
Verification of LC-MS/MS Protein Quantification
The fecal immunochemical test (FIT) is used in many countries as a non-invasive test for the early detection of CRC, and is based on the detection of hemoglobin. To verify the results obtained by LC-MS/MS, hemoglobin spectral counts were compared with FIT values (ng/ml) determined in the same stool samples. The adult hemoglobin protein is a heterodimer consisting of two α and two β chains (Schechter A N. Blood 2008; 112: 3927-38). As can be seen in
Another protein that has frequently been detected in stool and associated with CRC is calprotectin (Bosch L J, et al. Molecular tests for colorectal cancer screening. Clin Colorectal Cancer 2011; 10:8-23). Calprotectin is a heterodimer as well, consisting of S100A8 and S100A9 subunits (Yui S, et al. Biol Pharm Bull 2003; 26: 753-60). A strong correlation was observed between spectral counts of S100A8 and S100A9 (see
These results on hemoglobin and calprotectin show that LC-MS/MS on stool samples provides us with robust quantification of proteins, and is a valid approach for protein marker discovery.
Origin of Human Stool Proteins
The mechanisms by which tumor-derived biomarkers end up in stool can be broadly divided into leaked markers, secreted markers and exfoliated markers (Osborn N K, Ahlquist D A. Gastroenterology 2005; 128: 192-206). Hemoglobin is a typical example of a leaked marker from disturbed blood vessels in a neoplastic lesion. Secreted and exfoliated markers can be derived from the epithelial cells lining the colorectal lumen, but can also originate from cells in the surrounding stroma, such as immune cells. An overlap analysis with a previously obtained dataset of plasma proteins (unpublished data) together with the publicly available Human Proteome Organisation (HUPO) Human Plasma Proteome Project (HPPP) database (www.hupo.org/research/hppp; Omenn G S, et al. Proteomics 2005; 5: 3226-45; States D J, et al. Nat Biotechnol 2006; 24: 333-8) (high confidence list) revealed that 21.6% of the 830 identified proteins possibly originate from blood. To estimate how many of the 830 identified human proteins originate from epithelial cells, an overlap analysis was performed with proteins detected in CRC cell lines (unpublished data). Almost half of these proteins were also identified in the CRC cell lines, suggesting an epithelial origin.
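An overlap analysis of this kind amounts to intersecting two sets of protein identifiers. The following minimal Python sketch assumes that both datasets use the same identifier scheme; the function name and the identifiers P1 to P9 are hypothetical placeholders.

```python
def overlap_fraction(identified, reference):
    """Return the shared identifiers and their fraction of `identified`."""
    identified = set(identified)
    shared = identified & set(reference)
    return shared, len(shared) / len(identified)


# Hypothetical identifiers standing in for stool and plasma proteins.
stool_proteins = {"P1", "P2", "P3", "P4", "P5"}
plasma_proteins = {"P2", "P5", "P9"}
shared, fraction = overlap_fraction(stool_proteins, plasma_proteins)
# Two of the five stool identifiers also occur in the plasma set.
```

The same function could be applied to the comparison with the CRC cell line dataset by passing that dataset as the reference set.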
Stool Proteins that Discriminate CRC Patients from Control Subjects
Unsupervised hierarchical cluster analysis of human proteins identified in all CRC and control stool samples revealed two clusters. Cluster one grouped nine CRC stool samples together, and cluster two grouped all ten control stool samples and three CRC stool samples together (data not shown). Stool samples from CRC patients thus show a specific protein expression pattern as compared to stool from control subjects. This protein expression pattern can therefore be applied to discriminate most of the CRC stool samples from control stool samples without further selection of specific proteins. The complete stool protein dataset was analyzed with beta-binomial statistics to identify potential cancer-associated proteins (Pham T V, et al. Bioinformatics 2010; 26: 363-9). From the 830 human stool proteins, 221 proteins were differentially detected, of which the levels of 134 proteins were significantly higher in CRC compared to control stool samples (p<0.05; Table 2), while 87 proteins were significantly lower in CRC compared to control stool samples.
Cluster analysis based on the 221 differentially detected proteins in CRC and control stool samples revealed two clusters as well (data not shown). Again, the nine CRC stool samples clustered together in one cluster, and the same three CRC stool samples mentioned above clustered together with the control stool samples in the second cluster.
Within the second cluster, the three CRC stool samples grouped together with one control sample. The protein expression pattern of these three CRC stool samples contained aspects of both the control stool samples and the other nine CRC stool samples, indicating that a selection of these proteins would be able to discriminate these CRC samples from controls.
In addition, three out of four FIT negative CRC patients clustered together with FIT positive CRC patients. Therefore this group of proteins contains potential candidate biomarkers that complement FIT.
Verification by Semi-High Throughput nanoLC-MS/MS
In order to verify these initial results, four pools were created from an independent set of 20 stool samples. In addition to CRC cases and negative controls, samples from patients with adenomas were included. Since most adenomas do not progress to cancer, only the detection of CRC and of adenomas with a high risk of progression matters for the purpose of CRC screening (Levin B et al., Gastroenterology 2008; 134:1570-1595). Therefore, pools of stool samples from patients with non-advanced adenomas as well as advanced adenomas (generally regarded as being at higher risk of progression) were included. In these verification samples, 414 human proteins were identified. Out of the 830 human proteins identified in the discovery set, 331 (40%) were also detected in the verification set (see table 5). From the 134 human proteins significantly enriched in CRC stool samples, 63 (47%) were also detected in the verification set. The proteins detected in both the discovery and the verification set had a significantly higher abundance (average spectral count of 15) than the proteins that were not recovered in the verification set (average spectral count of 4) (p=3.5×10⁻²⁷).
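The recovery percentages quoted above follow directly from the reported protein counts. The short Python sketch below reproduces that arithmetic; the helper name `percent` and the variable names are hypothetical.

```python
def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Counts as reported for the discovery and verification sets.
all_recovered = percent(331, 830)      # all discovery-set proteins
enriched_recovered = percent(63, 134)  # proteins enriched in CRC stool
```

Rounded to whole numbers these values correspond to the 40% and 47% stated in the text.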
Proof of Concept of Candidate Protein Biomarkers for Validation by Targeted MS
From the 134 proteins significantly enriched in CRC stool samples, a subset of proteins with a potential epithelial origin (i.e. detected in CRC cell lines), that were significantly enriched in FIT-negative CRC patients compared to controls, and were recovered in the verification set in the advanced adenoma and/or CRC pools (n=29) were selected for further validation. First results for 13 of these proteins confirmed the results obtained by LC-MS/MS in the verification set. These included 6 proteins detected in CRC and/or advanced adenoma pools while not detected in controls or adenomas (e.g. C4B and Glucose-6-phosphate isomerase (GPI) (
Proteins that Complement FIT for Detecting CRC
Novel biomarkers for early detection of CRC should perform significantly better than FIT, or should complement FIT, in order to increase diagnostic accuracy. As expected from their correlation with FIT, hemoglobin α and β were both significantly more abundantly present in CRC compared to control stool samples. The data revealed several proteins of which levels were significantly higher in CRC stool samples compared to control stool samples with a higher discriminative power than hemoglobin. It is of interest to see if these proteins detect the same CRC samples as hemoglobin does, or if they detect other CRC samples.
For this reason, proteins that showed significantly higher levels in the FIT negative CRC stool samples compared to all controls were investigated. Indeed, out of the 134 proteins with significantly higher levels in CRC versus control stool samples, around 90% also showed significantly higher levels in the FIT negative CRC stool samples. This indicates that these candidates have high potential to be of added value to the current detection of hemoglobin.
The present study has delivered in-depth proteomic analysis on human stool samples, providing a list of 134 human proteins that were significantly enriched in stool samples from CRC patients, of which several showed highly significant discriminative power. Thus, discriminative markers for improving current FIT tests have been identified.
Material and Methods
Patients
A total of four patients who underwent surgical resection at the VU University medical center (Amsterdam, the Netherlands) were included in this study. Collection, storage, and use of tissue and patient data were performed in accordance with the Code for Proper Secondary Use of Human Tissue in the Netherlands (Societies DFOBS. http://www.federa.org/). A pathologist inspected all samples to obtain information on tumor size, tumor and nodal stage, differentiation grade, and mucinous differentiation. For an overview of the clinicopathological characteristics, see table 1.
Tissue Handling and Tissue Secretome Collection
The tissue secretome collection was performed as described before (Celis J E. et al. Mol. Cell Proteomics 2004; 3:327-44). In short, following surgical resection, the specimen was immediately transferred to the pathology department, where a pathologist excised a representative part of the tumor and adjacent normal colon mucosa (near an unaffected resection margin). These pieces of tissue were cut into cubes of approximately 1 mm³ and rinsed in PBS to remove blood and stool particles. Subsequently the tissue particles were incubated in 100 μl PBS for 1 hour at 37° C. Following this incubation the samples were briefly centrifuged (2000 rpm at 4° C. for 2 minutes) and the supernatant was transferred to a new eppendorf tube. The supernatants were centrifuged at maximum speed (13.200 rpm at 4° C. for 20 minutes) to remove all remaining cells and debris. The soluble fractions, further referred to as the ‘tissue secretomes’, were stored at −80° C. until further use. The tissues were processed by standard formalin fixation and paraffin embedding for histological evaluation (supplementary
Gel Electrophoresis and Sample Preparation for Proteomics Analysis
Protein concentrations were determined using the BCA protein assay (Pierce, Thermo Fisher Scientific, Rockford, USA). Twenty μg of proteins were separated by gel electrophoresis using a pre-cast 1D 4-12% gradient SDS-PAGE gel (Invitrogen, Carlsbad, USA). For gel images see supplementary
nanoLC-MS/MS Proteomics Analysis
Peptides were separated by an Ultimate 3000 nanoLC system (Dionex LC-Packings, Amsterdam, The Netherlands) equipped with a 20 cm×75 μm ID fused silica column custom packed with 3 μm 120 Å ReproSil Pur C18 aqua (Dr Maisch GMBH, Ammerbuch-Entringen, Germany). After injection, peptides were trapped at 6 μl/min on a 1 cm×100 μm ID trap column packed with 5 μm 120 Å ReproSil C18aqua at 2% buffer B (buffer A: 0.05% formic acid in MQ; buffer B: 80% acetonitrile+0.05% formic acid in MQ) and separated at 300 nl/min in a 10-40% buffer B gradient in 60 min. Eluting peptides were ionized at 1.7 kV in a Nanomate Triversa Chip-based nanospray source using a Triversa LC coupler (Advion, Ithaca, N.Y.). Intact peptide mass spectra and fragmentation spectra were acquired on an LTQ-FT hybrid mass spectrometer (Thermo Fisher, Bremen, Germany). Intact masses were measured at resolution 50.000 in the ICR cell using a target value of 1×10⁶ charges. In parallel, following an FT pre-scan, the top 5 peptide signals (charge-states 2+ and higher) were submitted to MS/MS in the linear ion trap (3 amu isolation width, 30 ms activation, 35% normalized activation energy, Q value of 0.25 and a threshold of 5000 counts). Dynamic exclusion was applied with a repeat count of 1 and an exclusion time of 30 seconds.
Database Searching
To identify proteins from the acquired data, MS/MS spectra were searched against the human IPI database v3.59 (80128 entries) using Sequest (version 27, rev 12), which is part of the BioWorks 3.3 data analysis package (Thermo Fisher, San Jose, Calif.). Following database searching, the DTA and OUT files were imported into Scaffold version Scaffold_2_06_01 (Proteome Software, Portland, Oreg.). Scaffold was used to organize the gel-band data and to validate peptide identifications using the PeptideProphet algorithm (Keller A, et al., Anal. Chem. 2002; 74:5383-92). Only identifications with a probability >95% were retained. Subsequently, the ProteinProphet algorithm (Nesvizhskii A I, et al., Anal. Chem. 2003; 75:4646-58) was applied, and protein identifications with a probability of >99% with 2 peptides or more in at least one of the samples were retained. Proteins that contained similar peptides and could not be differentiated based on MS/MS analysis alone were grouped.
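The retention criteria described above can be illustrated with a minimal Python sketch. It is not the Scaffold implementation: the function name, the record layout and all sample, protein and peptide names are hypothetical, and the probabilities are assumed to have been computed beforehand and expressed as fractions between 0 and 1.

```python
from collections import defaultdict


def retained_proteins(peptide_ids, protein_probability,
                      min_peptide_prob=0.95, min_protein_prob=0.99,
                      min_peptides=2):
    """Apply the peptide and protein probability criteria.

    peptide_ids: iterable of (sample, protein, peptide, probability).
    protein_probability: dict mapping (sample, protein) to a probability.
    A protein is retained if, in at least one sample, it has at least
    `min_peptides` distinct peptides above the peptide threshold and a
    protein probability above the protein threshold.
    """
    peptides = defaultdict(set)
    for sample, protein, peptide, probability in peptide_ids:
        if probability > min_peptide_prob:
            peptides[(sample, protein)].add(peptide)
    kept = set()
    for key, confident_peptides in peptides.items():
        if (len(confident_peptides) >= min_peptides
                and protein_probability.get(key, 0.0) > min_protein_prob):
            kept.add(key[1])
    return kept


# Hypothetical identifications.
ids = [
    ("tumor_1", "P1", "pepA", 0.99),
    ("tumor_1", "P1", "pepB", 0.97),
    ("tumor_1", "P2", "pepC", 0.99),   # only one peptide per sample
    ("normal_1", "P2", "pepD", 0.96),
    ("tumor_1", "P3", "pepE", 0.99),
    ("tumor_1", "P3", "pepF", 0.90),   # below the peptide threshold
]
probabilities = {
    ("tumor_1", "P1"): 0.999,
    ("tumor_1", "P2"): 0.999,
    ("normal_1", "P2"): 0.999,
    ("tumor_1", "P3"): 0.999,
}
kept = retained_proteins(ids, probabilities)
```

In this made-up example only P1 is retained, because P2 never has two confident peptides within a single sample and P3 has one peptide below the 95% threshold.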
Data Mining and Statistical Analysis
For each protein identified, the number of assigned spectra was exported to Excel. Differential analysis of samples was performed using a paired beta-binomial test as described previously (Pham T V et al., Expert Rev. Mol. Diagn. 2012; 12:343-59). This modification of the beta-binomial test resulted in a paired test, thereby taking into account the origin of the samples, e.g. when comparing protein signatures between two tissues derived from the same patient. Additional general protein information was retrieved using Ingenuity Pathway Analysis (Ingenuity® Systems, www.ingenuity.com). Known and predicted protein-protein interactions were investigated using STRING version 9.0 (www.string-db.org) (Szklarczyk D, et al., Nucleic Acids Res. 2011; 39:D561-8). For cluster and gene ontology analyses we used the Cytoscape platform for network analysis (www.cytoscape.org) (Smoot M E et al., Bioinformatics 2011; 27:431-2), using the plug-ins ClusterONE version 0.93 for clustering and BiNGO version 2.44 for the analysis of biological processes within our obtained networks based on GO annotations (Maere S et al., Bioinformatics 2005; 21:3448-9 and Nepusz T et al., Nat Meth 2012; 9:471-2). Additionally, transmembrane domains and signal peptide sequences were investigated using SecretomeP server 2.0 (Bendtsen J D et al., Protein Eng. Des. Sel. 2004; 17:349-56).
Results
Identification of Secreted Proteins in CRC and Normal Tissue Secretomes
Tissue secretomes were collected from four CRCs and matched adjacent normal colon tissue as described before by Celis et al. (Celis J E et al., Mol. Cell Proteomics 2004; 3:327-44) and processed for mass spectrometry. The tumor of patient 4 was found to be MSI and the tumors of patients 1-3 were microsatellite stable (patient details in table 8). A total of 2703 non-redundant proteins were identified in the tumor secretomes of the four CRCs and their matched normal tissue secretomes. On average 1986 proteins were identified per sample, ranging from 1264 to 2292 proteins.
Of the total of 2703 proteins, 2366 were identified in the tumor secretomes as well as in the normal tissue secretomes, 283 only in the tumor secretomes and 54 only in the normal tissue secretomes. The number of identified proteins was significantly higher in the CRC secretomes than in the normal tissue secretomes (2198 and 1775 on average, respectively, p=0.03), while the number of unique identifications in the samples did not differ significantly (supplementary
Proteins Enriched in the Cancer Secretomes
To obtain an overview of the dataset, an unsupervised hierarchical clustering was performed using the normalized spectral count data from all 2703 identified proteins. Two major clusters were identified: one containing the normal tissue sample of patient 3 and the other containing all other samples. This could be explained by the lower number of proteins identified in this sample compared to the others (N=1264). Within the second cluster, two sub-clusters were formed: one containing the other three normal tissue samples and the other containing all cancer samples. Within the cancer cluster the samples of patients 1-3 formed a separate cluster from the sample of patient 4, which can be explained by the different molecular background of this sample, i.e. this latter tumor was MSI whereas the tumors of patients 1-3 were microsatellite stable (table 8). Overall, the unsupervised cluster analysis indicated that the protein expression pattern was more similar among the four tumor samples than between paired tumor and normal tissues from the same patient. Potential biomarkers should be enriched in the tumor secretomes compared to the normal tissue secretomes. To identify such CRC-associated proteins, the complete protein dataset was further analyzed with a statistical test for paired samples, i.e. taking into account the relationship between normal and cancer tissue samples from individual patients (Pham T V et al., Expert Rev. Mol. Diagn. 2012; 12:343-59). From the 2703 proteins, 522 proteins were significantly differentially present (P<0.05), 409 of which were more abundant in tumor secretomes compared to normal tissue secretomes, thus representing the proteins of likely highest value for discrimination between CRC patients and normal controls (see table 6).
Overlap Analysis of Proteins Enriched in the Cancer Secretomes and Stool Proteins
Protein biomarkers that can be detected in blood or stool are of interest since they could be applied in a standard clinical setting alongside routinely used markers such as CEA, CA19-9 or hemoglobin. Measuring molecules directly in the biological sample that will be used for screening, i.e. stool, can yield biomarkers that are stable and can reliably be detected in stool samples. An overlap analysis with a previously obtained dataset of stool proteins (unpublished data) revealed that 383 of the 2703 identified proteins are possibly detectable in stool. Out of the 409 proteins that were significantly more abundant in CRC secretomes compared to normal secretomes, 16 were also detected in stool (see table 7).
The present invention thus provides a set of biomarkers for screening for colorectal cancer, or susceptibility to colorectal cancer. The biomarkers enable early diagnosis such that early stage surgical removal of a tumor before metastasis is possible.
The foregoing broadly describes the present invention without limitation to particular embodiments. Variations and modifications as will be readily apparent to those skilled in the art are intended to be within the scope of the invention as defined by the following claims.
Number | Date | Country | Kind |
---|---|---|---|
2008707 | Apr 2012 | NL | national |
2010276 | Feb 2013 | NL | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/NL13/50316 | 4/26/2013 | WO | 00 |