The invention relates to materials and methods for diagnosing tumor types, and assessing patient prognosis. In particular, the invention concerns the determination of marker proteins which enable primary liver tumors to be identified and classified.
The liver is a complex organ capable of regeneration after damage. It is highly structured, with a number of specialised cells required to form, amongst other features, the bile ducts and liver parenchyma. The most common cell type is the hepatocyte, which forms the bulk of the liver parenchyma. Cholangiocytes are a much less common cell type forming the bile ducts of the intrahepatic biliary tree.
Primary liver tumors are classified as epithelial, mesenchymal, germ cell, lymphoid, or of mixed or uncertain origin according to the latest WHO classification [1]. Epithelial tumors are the most common and are generally divided into hepatocellular and cholangiocellular types on the basis of their phenotypic similarity to, and presumed derivation from, hepatocytes and biliary epithelium, respectively. Hepatocellular carcinoma (HCC) and cholangiocarcinoma (CC) are the most common malignant types. HCC is the fifth most common cancer worldwide, and usually develops in the context of chronic liver disease [2]. CC can arise from any portion of the intrahepatic biliary tree, and is classified as peripheral or hilar/perihilar based on the predominant location, probably different biological characteristics, and pathogenesis. This classification is supported by an association with risk factors such as viral hepatitis or alcoholic liver disease in peripheral CC. In contrast, multi-step carcinogenesis through intraepithelial neoplasia, often in the context of a chronic cholangiopathy (e.g. primary sclerosing cholangitis (PSC)), appears to be behind the development of hilar/perihilar CC [3].
Some primary carcinomas show a mixed phenotype, with areas of hepatocellular differentiation alternating with areas of cholangiocellular differentiation.
An origin from hepatic progenitor cells has been proposed for these tumors, on the broader basis of the cancer stem cell theory that all primary liver tumors and in particular the epithelial ones may be part of a phenotypic spectrum with “pure” HCC and CC at either end, and mixed cancers somewhere in the middle [4, 5]. In this respect, we reported recently that local ablation therapy with transarterial chemoembolization (TACE) is associated with cholangiocellular differentiation in HCC [6]. A potential explanation for this observation is that TACE provides selection pressure in favor of a minor progenitor cell population that is resistant to TACE and capable of multipotent differentiation including biliary lineage. The hepatocellular and cholangiocellular/progenitor cell components were identified by single or double immunostainings or gene expression analysis (RT-PCR) from microdissected tissue, using a relatively limited number of known conventional markers [6]. More markers are required to help better define the details of the phenotype and pathogenesis of the different HCC/CC components of post-TACE tumors, their similarities to their normal and typical malignant counterparts, and aid in diagnosis, prognosis and potentially identify new selective therapeutic targets and predictive markers.
Liquid chromatography-mass spectrometry (LC-MS/MS) based proteomics has proven to be superior to conventional biochemical methods at identifying and quantifying thousands of marker proteins extracted from complex samples including cultured cells and clinical tissue [7-8]. Recently, the application of mass spectrometry based proteomic analysis to formalin fixed paraffin embedded (FFPE) tissue has gained particular focus because of the enormous collections of highly characterized FFPE tissue derived from both human and model organisms [9-10] and its compatibility with laser capture microdissection to enrich tumor cell populations from heterogeneous tissue sections.
Large scale global proteomic analysis of laser microdissected FFPE tissue has been successfully employed to discover differentially expressed marker proteins between histological tissue types that can serve as novel protein biomarkers of disease [10-13]. Many of these studies utilized label free quantitative proteomic strategies, such as spectral counting and signal intensities of peptide precursor ions. Both approaches benefit from reduced spectral complexity, enhanced analytical depth and good linear dynamic range (over two orders of magnitude for spectral counting) and consequently provide a high level of quantitative proteome coverage [11-15].
Standard liver histology and immunohistochemistry for tumor marker proteins have provided some means of differentiating between HCC and CC but are prone to inter-operator variability and lack of sensitivity. There therefore remains a need for more informative markers for characterising liver tumors in terms of the predominant cellular type (hepatocytes or cholangiocytes), potentially incorporating molecular markers of drug responsiveness. Such biomarkers of tumor cell lineage can aid earlier diagnosis, prognostic monitoring of disease and optimised treatment selection, and may potentially identify new selective therapeutic targets for future drug development.
The present invention, therefore, provides for novel biomarkers for use in the classification of primary liver tumors and particularly distinction between hepatocellular carcinoma and cholangiocellular carcinoma. Such classification allows treatment regimens and prognosis to be specifically tailored to the patient.
In a first aspect, the present invention provides for a method of determining the cellular phenotype of a liver tissue sample said method comprising
In a second aspect, there is provided a method of identifying the cellular phenotype of a liver cell, said method comprising
In embodiments of these aspects, the cellular phenotype is selected from normal liver epithelium cells (hepatocytes), normal biliary epithelium cells (cholangiocytes), hepatocellular carcinoma cells, peripheral cholangiocellular carcinoma cells or hilar cholangiocellular carcinoma cells.
In some other embodiments the methods further comprise comparing said expression levels with a second reference set of expression levels representing a second cellular phenotype.
In some embodiments, the liver cell is a liver tumor cell.
In some embodiments, the biomarker panel is represented by Table 5 and/or Table 7 and the cellular phenotype is selected from hepatocellular carcinoma cells and cholangiocellular carcinoma cells, preferably the plurality of marker proteins is selected from part A of Table 5.
In some other embodiments, the liver tumor cell is obtained from a liver tumor biopsy sample, preferably obtained from a patient having previously been treated with transarterial chemoembolization.
In yet some other embodiments, the plurality of marker proteins are selected from Table 7, preferably the plurality of marker proteins are selected from Table 7 part A.
In yet some other embodiments of these aspects, the step of determining the expression levels of a plurality of marker proteins comprises
The specific binding member is an antibody or antibody fragment which selectively binds to one of said plurality of marker proteins or a nucleic acid sequence which selectively binds to nucleic acid encoding one of said plurality of marker proteins.
Optionally, the specific binding member is an aptamer or the binding member is immobilised on a solid support.
In some other embodiments of these aspects, the step of determining expression levels of a plurality of marker proteins is performed by mass spectrometry or by Selected Reaction Monitoring using one or more transitions for protein derived peptides; and comparing the peptide levels in the liver cell or the liver tissue sample under test with peptide levels previously determined to represent a cellular phenotype.
Preferably, comparing the peptide levels includes determining the amount of protein derived peptides from the liver cell or the liver tissue sample with known amounts of corresponding synthetic peptides, wherein the synthetic peptides are identical in sequence to the peptides obtained from the liver cell or the liver tissue sample except for a label. More preferably, the label is a tag of a different mass or a heavy isotope.
In a third aspect, the present invention provides for a method for the diagnosis or prognostic monitoring of a liver tumor in an individual, said method comprising
In a fourth aspect, the present invention provides for a method for determining a treatment regimen for an individual having a liver tumor, said method comprising
In some embodiments of these third and fourth aspects, the liver tumor cell is from a liver tumor biopsy.
In some other embodiments of these aspects, the biomarker panel is represented by Table 5, preferably by Part A of Table 5.
In some further embodiments of these aspects the individual had previously been treated with transarterial chemoembolization. Preferably, the biomarker panel is represented by Table 7, more preferably by Part A of Table 7.
In a fifth aspect, the present invention provides for a method of diagnosing liver cancer in an individual comprising detecting one or more marker proteins or fragments thereof selected from Table 1A, Tables 2 to 11 in a blood, tissue, saliva or urine sample obtained from said individual. Preferably, said one or more marker proteins or fragments thereof are detected using a specific binding member, more preferably said binding member is an antibody specific for said one or more marker proteins.
In some embodiments of all these aspects, the plurality of marker proteins are selected from any one of Collagen alpha 1 (XVIII) chain, Plastin-3, AKR1B10, Fibronectin, Beta 3 tubulin, Asporin, 14-3-3 protein eta or Dihydropyrimidinase-related protein 3 or combinations thereof, preferably the plurality of marker proteins comprises AKR1B10 and/or Beta 3 tubulin.
In another aspect, the present invention provides for the use of one or more marker proteins selected from Table 1A, Tables 2 to 11 as a diagnostic marker for liver cancer.
In yet another aspect, the present invention provides for a method for diagnosing recurrent or primary liver tumor in a subject, the method comprising determining the presence or absence of one or more marker proteins selected from the group consisting of Collagen alpha 1 (XVIII) chain, Plastin-3, AKR1B10, Fibronectin, Beta 3 tubulin, Asporin, 14-3-3 protein eta, and Dihydropyrimidinase-related protein 3 in a sample. Preferably, the liver tumor is selected from the group consisting of hepatocellular carcinoma, peripheral cholangiocellular carcinoma or hilar cholangiocellular carcinoma cells.
In one embodiment of this aspect the marker protein is Beta 3 tubulin and/or AKR1B10, preferably Beta 3 tubulin.
In another embodiment the sample is selected from any one of blood, plasma, serum, liver tissue, liver cells or combinations thereof, preferably the sample is liver tissue, optionally formalin-fixed paraffin-embedded liver tissue section.
In another embodiment, the determining the presence or absence of one or more marker proteins in the sample is performed by either Immunohistochemistry (IHC) or mass spectrometry.
In another aspect the invention provides for a kit for diagnosing recurrent or primary liver tumor in a subject, the kit comprising reagents for determining the presence or absence of one or more marker proteins selected from the group consisting of Collagen alpha 1 (XVIII) chain, Plastin-3, AKR1B10, Fibronectin, Beta 3 tubulin, Asporin, 14-3-3 protein eta, and Dihydropyrimidinase-related protein 3 in a sample. Preferably, the liver tumor is selected from the group consisting of hepatocellular carcinoma, peripheral cholangiocellular carcinoma or hilar cholangiocellular carcinoma cells.
In one embodiment, the marker protein is Beta 3 tubulin and/or AKR1B10, preferably Beta 3 tubulin.
In another embodiment, the kit comprises reagents suitable for preparing the sample, wherein the sample is selected from any one of blood, plasma, serum, liver tissue, liver cells or combinations thereof.
In yet another embodiment, the sample is liver tissue and the kit comprises reagents suitable for preparing liver tissue, optionally for preparing formalin-fixed paraffin-embedded liver tissue sections.
In another embodiment, the determining the presence or absence of one or more marker proteins in the sample is performed by either Immunohistochemistry (IHC) or mass spectrometry.
In yet another aspect, the present invention provides for a kit for use in determining the cellular phenotype of a liver cell, said kit allowing the user to determine the presence or level of expression of a plurality of analytes selected from proteins or fragments thereof provided in biomarker panels as represented by any one of Table 1A, Tables 2 to 11, a plurality of antibodies against said marker proteins and a plurality of nucleic acid molecules encoding said marker proteins or fragments thereof, in a cell under test; the kit comprising
The present invention also provides for a kit for use in determining the cellular phenotype of a liver cell in vitro, said kit allowing the user to determine the presence or level of expression of a plurality of proteins or fragments thereof provided in biomarker panels represented by Table 1A, Tables 2 to 11, in a cell under test; the kit comprising
In yet a further aspect, the present invention provides for a kit for the diagnosis, prognostic monitoring of a liver tumor in an individual or for determining a treatment regimen for an individual having a liver tumor, the kit comprising
Preferably, the biomarker panel is represented by Table 5 or by Part A of Table 4 or by Table 7 or Part A of Table 7.
In yet a further aspect, the present invention provides for a plurality of synthetic peptides each having a sequence identical to a fragment of one of a plurality of proteins selected from a biomarker panel selected from any one of Table 1A, Tables 2 to 11, said fragment resulting from digestion of the protein by trypsin, ArgC, AspN or Lys-C, wherein one or more of the plurality of synthetic peptides comprises a label, optionally for use in Selected Reaction Monitoring. Preferably, the label is a heavy isotope.
The present invention also provides for a liver cellular classification system comprising a liver cellular classification apparatus and an information communication terminal apparatus, said liver cellular classification apparatus including a control component and a memory component, said apparatuses being communicatively connected to each other via a network;
Preferably, the memory unit contains data of a plurality of proteins selected from Table 5 or Table 11 and wherein the classification is between Hepatocellular carcinoma and peripheral cholangiocarcinoma; alternatively the memory unit contains data of a plurality of proteins selected from Table 7 or Table 11 and wherein the classification is between Hepatocellular carcinoma and cholangiocarcinoma in post-TACE liver tumors.
In some embodiments, the liver cellular classification system according to the invention is connected to an apparatus for determining the protein expression levels in a liver tissue sample, preferably the apparatus can process multiple samples using liquid chromatography-mass spectrometry (LC-MS/MS).
In yet a further aspect of the present invention, there is provided a liver tissue cellular classification program that makes an information processing apparatus including a control component and a memory component execute a method of determining and/or classifying the liver tissue of a subject, the method comprising:
The term “plurality of marker proteins” means at least two marker proteins as disclosed herein.
The term “marker protein” includes all biologically relevant forms of the protein identified, including post-translational modifications. For example, the marker protein can be present in a glycosylated, phosphorylated, multimeric, fragmented or precursor form. A marker protein fragment may be naturally occurring or, for example, enzymatically generated, and may retain the biologically active function of the full marker protein. Fragments will typically be at least about 10 amino acids, usually at least about 50 amino acids in length, and can be as long as 300 amino acids in length or longer.
The term “cellular phenotype” refers to the characteristics or traits of a cell or group of cells. Cellular phenotype refers to the cell's anatomical location, morphology, development, biochemical or physiological properties, behaviour, and products of biochemistry/behaviour. Cellular phenotype results from the expression of cell genes as well as the influence of environmental factors and the interactions between the two.
The term “liver tissue sample” includes, but is not limited to, a specimen of liver tissue removed by resection or core needle biopsy.
The term “expression level” refers to the relative amount of protein in a liver tissue sample, for example as determined by LC-MS/MS label free quantification approaches such as area under the curve and spectral counting.
The term “comparing” means measuring the relative amount of a protein or proteins in a sample relative to other samples (for example protein amounts stored in our database).
The term “reference set” refers to the samples (for example in our database) used as classifiers (e.g. classic examples of HCC or CC). These classifiers can be used to help diagnose non-classic specimens from new cases.
The terms “reference level”, “reference set of expression levels”, “reference expression level” and “reference amount” are used herein as synonyms and refer to a pre-determined level, which may, for example, be provided in the form of an accessible data record from a public database.
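The comparison of a test sample with reference sets can be illustrated with a minimal sketch. It assumes, purely for illustration, that each reference set is summarised as a mean expression level per marker protein and that similarity is measured as Euclidean distance on log2-transformed levels; neither choice is prescribed by this specification. The function names, marker names and numbers are hypothetical.

```python
import math

def profile_distance(sample, reference):
    """Euclidean distance between log2-transformed expression levels.

    Only marker proteins present in both profiles are compared.
    The +1 offset (an assumption) avoids log2(0) for undetected proteins.
    """
    shared = sample.keys() & reference.keys()
    if not shared:
        raise ValueError("no marker proteins in common")
    return math.sqrt(sum(
        (math.log2(sample[m] + 1) - math.log2(reference[m] + 1)) ** 2
        for m in shared
    ))

def closest_phenotype(sample, reference_sets):
    """Return the name of the reference set nearest to the sample."""
    return min(reference_sets,
               key=lambda name: profile_distance(sample, reference_sets[name]))

# Hypothetical reference levels (e.g. mean spectral counts) per phenotype.
reference_sets = {
    "HCC": {"marker_1": 2.0, "marker_2": 1.0},
    "CC": {"marker_1": 15.0, "marker_2": 12.0},
}
test_sample = {"marker_1": 14.0, "marker_2": 10.0}
result = closest_phenotype(test_sample, reference_sets)
```

In this sketch a second reference set simply adds another entry to `reference_sets`; a sample resembling neither set would still be assigned to the nearer one, so a practical implementation would likely also impose a maximum distance.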
The term “antibody” includes polyclonal antiserum, monoclonal antibodies, fragments of antibodies such as single chain and Fab fragments, and genetically engineered antibodies. The antibodies may be chimeric or of a single species.
The terms “marker protein” and “biomarker”, which are used interchangeably herein, include all biologically relevant forms of the protein identified, including post-translational modifications. For example, the marker protein can be present in a glycosylated, phosphorylated, multimeric or precursor form.
The term “control” refers to a cultured cell line, primary culture of cells taken from a human or animal subject, or biopsy material taken from a human or animal subject that is free of HCC or CC.
The term “antibody array” or “antibody microarray” means an array of unique addressable elements on a continuous solid surface whereby at each unique addressable element an antibody with defined specificity for an antigen is immobilised in a manner allowing its subsequent capture of the target antigen and subsequent detection of the extent of such binding. Each unique addressable element is spaced from all other unique addressable elements on the solid surface so that the binding and detection of specific antigens does not interfere with any adjacent such unique addressable element.
The term “bead suspension array” means an aqueous suspension of one or more identifiably distinct particles, whereby each particle contains coding features relating to its size and colour or fluorescent signature, and whereby all of the beads having a particular combination of such coding features are coated with an antibody with a defined specificity for an antigen in a manner allowing its subsequent capture of the target antigen and subsequent detection of the extent of such binding. Examples of such arrays can be found at www.luminexcorp.com, where application of the xMAP® bead suspension array on the Luminex® 100™ System is described.
The terms “selected reaction monitoring”, “SRM” and “MRM” mean a mass spectrometry assay whereby precursor ions of known mass-to-charge ratio representing known biomarkers are preferentially targeted for analysis by tandem mass spectrometry in an ion trap or triple quadrupole mass spectrometer.
During the analysis the parent ion is fragmented and the number of daughter ions of a second predefined mass-to-charge ratio is counted. Typically, an equivalent precursor ion bearing a predefined number of stable isotope substitutions but otherwise chemically identical to the target ion is included in the method to act as a quantitative internal standard.
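Quantification against a stable isotope labelled internal standard can be sketched as follows. The sketch assumes that the endogenous ("light") peptide and the spiked labelled ("heavy") peptide give equal signal per unit amount, so that the ratio of their transition peak areas equals the ratio of their amounts. The function name, units and numbers are hypothetical.

```python
def endogenous_amount(light_area, heavy_area, heavy_spiked_fmol):
    """Estimate the amount of endogenous peptide in fmol.

    light_area: transition peak area of the unlabelled (endogenous) peptide
    heavy_area: transition peak area of the labelled internal standard
    heavy_spiked_fmol: known amount of labelled standard added to the sample
    """
    if heavy_area <= 0:
        raise ValueError("internal standard was not detected")
    return (light_area / heavy_area) * heavy_spiked_fmol

# Hypothetical example: the light signal is twice the heavy signal and
# 50 fmol of labelled peptide was spiked in.
amount = endogenous_amount(light_area=2.0e6, heavy_area=1.0e6,
                           heavy_spiked_fmol=50.0)
```

When several transitions are monitored for one peptide, the areas would typically be summed or the per-transition ratios averaged before this step.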
“Differential expression”, as used herein, refers to at least one recognisable difference in protein expression. It may be a quantitatively measurable, semi-quantitatively estimable or qualitatively detectable difference in tissue protein expression. Thus, a differentially expressed protein may be strongly expressed in tissue in one cellular phenotype (e.g. HCC) and less strongly expressed or not expressed at all in another cellular phenotype (e.g. CC). Further, expression may be regarded as differential if the protein undergoes any recognisable change such as cleavage or post-translational modification between two cellular phenotypes under comparison.
The term “isolated” means throughout this specification, that the marker protein, antibody or polynucleotide, as the case may be, exists in a physical milieu distinct from that in which it may occur in nature.
As used herein, the term “subject” includes any human or non-human animal. The term “non-human animal” includes all vertebrates, e.g., mammals and non-mammals, such as non-human primates, sheep, dogs, cats, horses, cows, chickens, amphibians, reptiles, etc.
The term “treat”, “treating”, “treatment”, “prevent”, “preventing” or “prevention” includes therapeutic treatments, prophylactic treatments and applications in which one reduces the risk that a subject will develop a disorder or other risk factor. Treatment does not require the complete curing of a disorder and encompasses the reduction of the symptoms or underlying risk factors.
Abbreviations
LMD: laser microdissection; TACE: trans-arterial chemo-embolization; HCC: hepatocellular carcinoma; CC: cholangiocellular carcinoma
PSC: primary sclerosing cholangitis; FFPE: Formalin Fixed Paraffin embedded;
AUC: area under the curve; PSM: peptide spectrum match.
The inventors have identified marker proteins that demonstrate statistically significant differences in protein expression levels between different cellular phenotypes of liver cells, including liver tumor cells. In particular, the inventors have determined marker proteins having different expression levels between the components (HCC and CC) of post-TACE HCC. Cases diagnosed with HCC are often treated with transarterial chemoembolization (TACE); however, the tumors generally recur and no longer show the classic HCC phenotype, having some regions that look like classic HCC, some that look like classic CC, and some that are undefinable. The present invention allows for the identification of marker proteins more specific for HCC than CC, or vice versa, in a patient who has already undergone TACE. The inventors have further explored their similarities or dissimilarities compared to their normal and typical malignant counterparts. The inventors also found significant differences in other tissue type comparisons. These differentially expressed marker proteins provide useful biomarkers to help diagnose tumor types, assess patient prognosis and determine appropriate treatment regimens.
The identification of marker protein sets (or biomarker panels) specific to the hepatocellular and cholangiocellular phenotype of post-TACE mixed tumors, and their similarity to their normal and typical neoplastic counterparts confirms that the differentiation process is truly divergent, despite a probable origin from a common progenitor. Of equal importance is the identification by the inventors of marker proteins differentially expressed between normal and neoplastic hepatocytes and biliary epithelial cells, as they provide new markers of malignant transformation or tumor differentiation; and between HCC and peripheral CC, which often overlap in both clinical presentation, and appearance on imaging and histology (22, 23).
The present invention provides herein marker proteins which are differentially expressed between two cell types tested and allow a particular cellular phenotype to be determined.
Table 1A shows the preferred marker proteins (including their synonyms) according to the invention, namely Beta 3 tubulin, AKR1B10, Collagen alpha 1 (XVIII) chain, Plastin-3, Fibronectin, Asporin, 14-3-3 protein eta and Dihydropyrimidinase-related protein 3.
Table 1B indicates the numbers of proteins that showed statistically significant differential expression levels between two types of liver tissues (p-value <0.05 and Log 2 [fold change]≧2 or ≦−2) using shared and unique peptides. These numbers illustrate the number of differentially modulated proteins that were common to both area under the curve and spectral counting datasets per tissue type comparison (467 proteins common to both).
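The selection criteria described for Table 1B can be sketched as a simple filter. The thresholds (p-value <0.05 and log2 fold change of at least 2 in either direction) are taken from the text; the data layout, function names and example values are hypothetical, and the sketch does not check that the direction of change agrees between the two datasets.

```python
def significant_proteins(results, p_cutoff=0.05, log2_fc_cutoff=2.0):
    """Return proteins with p < p_cutoff and |log2 fold change| >= cutoff.

    results maps a protein identifier to a (p_value, log2_fold_change) pair.
    """
    return {
        protein
        for protein, (p_value, log2_fc) in results.items()
        if p_value < p_cutoff and abs(log2_fc) >= log2_fc_cutoff
    }

def common_significant(auc_results, spectral_count_results):
    """Proteins passing the filter in both quantification datasets."""
    return (significant_proteins(auc_results)
            & significant_proteins(spectral_count_results))

# Hypothetical results for one tissue type comparison.
auc = {"prot_A": (0.001, -3.1), "prot_B": (0.03, 2.4), "prot_C": (0.20, 4.0)}
counts = {"prot_A": (0.004, -2.2), "prot_B": (0.01, 1.5), "prot_C": (0.01, 3.0)}
shared = common_significant(auc, counts)
```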
The marker proteins indicated in the Tables 2 to 10 allow the following cell types to be distinguished:—
Table 2: Normal hepatocytes from HCC.
Table 3: Peripheral cholangiocarcinoma from normal bile duct.
Table 4: Hilar cholangiocarcinoma from normal bile duct.
Table 5: Hepatocellular carcinoma from peripheral cholangiocarcinoma.
Table 6: Hepatocytes from cholangiocytes.
Table 7: Hepatocellular carcinoma and cholangiocarcinoma in post-TACE liver tumors.
Table 8: Peripheral cholangiocarcinoma from metastatic colorectal cancer.
Table 9: Hilar cholangiocarcinoma from hilar cholangiocarcinoma with primary sclerosing cholangitis.
Table 10: Hilar cholangiocarcinoma from metastatic colorectal cancer.
This determination provides clinicians for the first time with knowledge of the cellular phenotype of the liver tumor and, as a result, accurate decisions regarding type and assessment of treatment, and prognosis, can be provided. For each Table, all negative values for effect size (g) relate to marker proteins that were present at a lower concentration in the first tissue type versus the second tissue type. All positive values for effect size (g) relate to marker proteins that were present at a higher concentration in the first tissue type versus the second tissue type.
The statistical significance of each protein regulation is shown as a p-value calculated by performing an unrelated t-test comparing the number of spectral counts for each protein between the two named tissue types. Hedges' g unbiased standardized effect size estimates were calculated, along with 95% confidence intervals for these estimates. Values of g<0.2 are regarded as very small differences, g=0.5 as average differences, and g>0.8 as large differences. Unstandardized effect size estimates (i.e., differences in the mean spectral counts in the two compared tissue types) were also calculated, along with 95% confidence intervals for these estimates. The tables display both the unbiased standardized and the unstandardized effect size estimates. Q-values (adjusted p-values) provide a more stringent measure of statistical significance than p-values and were computed using a direct False Discovery Rate approach. Individual q-values are not shown here, but all marker proteins meeting the q-value threshold of 0.05 are listed in section A of each of Tables 2 to 10, while all marker proteins meeting the p-value threshold of 0.05 are displayed in section B of each of Tables 2 to 10.
Protein expression levels for marker proteins shown in Tables 2 to 10 were determined using label free LC-MS/MS quantification based on spectral counting (shared and unique peptides), which is well known in the art. All marker proteins showing statistically significant differences in mean spectral counts between two tissue types are displayed in Tables 2-10. We have also used an alternate method of data analysis based on the area under the curve (AUC) of the MS1 peak of the three most intense peptides for each protein. All marker proteins in Table 11 (
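The effect size calculation can be sketched as follows. The specification does not state which formulas were used, so this sketch assumes the usual pooled-standard-deviation form of Hedges' g with the small-sample correction factor 1 − 3/(4·df − 1), and a common large-sample approximation for the 95% confidence interval. The function name and the spectral counts in the example are hypothetical.

```python
import math
from statistics import mean, variance

def hedges_g(first, second, z=1.96):
    """Hedges' g for two groups of spectral counts, with an approximate CI.

    Returns (g, ci_low, ci_high). A negative g means the mean of `first`
    is lower than the mean of `second`.
    """
    n1, n2 = len(first), len(second)
    df = n1 + n2 - 2
    pooled_sd = math.sqrt(
        ((n1 - 1) * variance(first) + (n2 - 1) * variance(second)) / df
    )
    if pooled_sd == 0:
        raise ValueError("pooled standard deviation is zero")
    d = (mean(first) - mean(second)) / pooled_sd   # Cohen's d
    g = d * (1 - 3 / (4 * df - 1))                 # small-sample correction
    # Large-sample approximation of the standard error (an assumption).
    se = math.sqrt((n1 + n2) / (n1 * n2) + g ** 2 / (2 * (n1 + n2)))
    return g, g - z * se, g + z * se

# Hypothetical spectral counts for one protein in two tissue types.
g, ci_low, ci_high = hedges_g([1, 2, 3], [4, 5, 6])
mean_difference = mean([1, 2, 3]) - mean([4, 5, 6])  # unstandardized effect
```

With these numbers the pooled standard deviation is 1, so d is −3 and the correction factor 0.8 gives g of −2.4, while the unstandardized effect size is −3.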
In Tables 2 to 10 the significantly modulated marker proteins were filtered by the stringent q-values (section A), then the less stringent p-values (section B). In Table 11 (
Table 2 provides protein markers for use in distinguishing normal hepatocytes from hepatocellular carcinoma cells (HCC).
P39060* | Collagen alpha-1(XVIII) chain/Name = COL18A1 | 2.28E−07 | −5.26 | 5.26 | −6.71 | 6.71
P07355* | Annexin A2/Name = ANXA2; Synonyms = ANX2, ANX2L4, CAL1H, LPC2D | 3.36E−05 | −3.7 | 3.7 | −10.57 | 10.57
P02751* | Fibronectin/Name = FN1; Synonyms = FN | 1.58E−04 | −3.61 | 3.61 | −19.43 | 19.43
P01876* | Ig alpha-1 chain C region/Name = IGHA1 | 1.03E−03 | −2.16 | 2.16 | −6.57 | 6.57
Table 2 provides information as to whether the marker proteins are relatively over-expressed (identified in bold) or under-expressed in HCC versus normal hepatocytes. Accordingly, by determining the presence, absence or change in expression levels of a plurality of these marker proteins and comparing these changes with a reference of known expression levels, one is able to determine whether the cells under test are HCC or normal hepatocytes. The plurality of marker proteins may be selected from Table 2 as a whole, or preferably from Part A which lists those marker proteins showing a higher statistically significant difference between the two cell types.
Table 3 provides information as to whether the marker proteins are relatively over-expressed (identified in bold) or under-expressed in peripheral cholangiocarcinoma versus normal cholangiocytes. Accordingly, by determining the presence, absence or change in expression levels of a plurality of these marker proteins and comparing these changes with a reference of known expression levels, one is able to determine whether the cells under test are peripheral cholangiocarcinoma or normal cholangiocytes.
P21333* | Filamin-A/Name = FLNA; Synonyms = FLN, FLN1 | 9.79E−04 | 2.3 | 2.3 | 37.69 | 37.69
Table 4 provides information as to whether the marker proteins are relatively over-expressed (identified in bold) or under-expressed in hilar cholangiocarcinoma versus normal cholangiocytes. Accordingly, by determining the presence, absence or change in expression levels of a plurality of these marker proteins and comparing these changes with a reference of known expression levels, one is able to determine whether the cells under test are hilar cholangiocarcinoma or normal cholangiocytes.
P13797* | Plastin-3/Name = PLS3 | 5.11E−04 | 2.68 | 2.68 | 8.52 | 8.52
Table 5 provides information as to whether the marker proteins are relatively over-expressed (identified in bold) or under-expressed in peripheral cholangiocarcinoma versus hepatocellular carcinoma. Accordingly, by determining the presence, absence or change in expression levels of a plurality of these marker proteins and comparing these changes with a reference of known expression levels, one is able to determine whether the cells under test are hepatocellular carcinoma or peripheral cholangiocarcinoma.
P18206* | Vinculin/Name = VCL | 1.65E−08 | −7.07 | 7.07 | −13.14 | 13.14
P15311* | Ezrin/Name = EZR; Synonyms = VIL2 | 8.38E−07 | −5.26 | 5.26 | −11.14 | 11.14
P63104* | 14-3-3 protein zeta/delta/Name = YWHAZ | 3.33E−05 | −3.22 | 3.22 | −13.29 | 13.29
P14618* | Pyruvate kinase isozymes M1/M2/Name = PKM2; Synonyms = OIP3, PK2, PK3, PKM | 3.76E−05 | −4.42 | 4.42 | −39.86 | 39.86
P09525* | Annexin A4/Name = ANXA4; Synonyms = ANX4 | 4.58E−05 | −3.8 | 3.8 | −33.14 | 33.14
P31949* | Protein S100-A11/Name = S100A11; Synonyms = MLN70, S100C | 4.81E−05 | −5.17 | 5.17 | −3.71 | 3.71
P04075* | Fructose-bisphosphate aldolase A/Name = ALDOA; Synonyms = ALDA | 6.82E−05 | −3.58 | 3.58 | −25 | 25
P00558* | Phosphoglycerate kinase 1/Name = PGK1; Synonyms = PGKA; ORFNames = MIG10, OK/SW-cl.110 | 7.65E−05 | −3.1 | 3.1 | −19.43 | 19.43
P46940* | Ras GTPase-activating-like protein IQGAP1/Name = IQGAP1; Synonyms = KIAA0051 | 7.86E−05 | −4.74 | 4.74 | −14.29 | 14.29
Q15019* | Septin-2/Name = SEPT2; Synonyms = DIFF6, KIAA0158, NEDD5 | 8.46E−05 | −4.68 | 4.68 | −5 | 5
P08238* | Heat shock protein HSP 90-beta/Name = HSP90AB1; Synonyms = HSP90B, HSPC2, HSPCB | 1.09E−04 | −2.97 | 2.97 | −12.71 | 12.71
P50454* | Serpin H1/Name = SERPINH1; Synonyms = CBP1, CBP2, HSP47, SERPINH2; ORFNames = PIG14 | 1.10E−04 | −3.45 | 3.45 | −8.29 |
P13010* | X-ray repair cross-complementing protein 5/Name = XRCC5; Synonyms = G22P2 | 1.20E−04 | −2.91 | 2.91 | −8.29 | 8.29
P27348* | 14-3-3 protein theta/Name = YWHAQ | 1.30E−04 | −2.99 | 2.99 | −11.71 | 11.71
P21333* | Filamin-A/Name = FLNA; Synonyms = FLN, FLN1 | 1.57E−04 | −3.96 | 3.96 | −55.43 | 55.43
P61158* | Actin-related protein 3/Name = ACTR3; Synonyms = ARP3 | 1.66E−04 | −2.74 | 2.74 | −6.29 | 6.29
P12429* | Annexin A3/Name = ANXA3; Synonyms = ANX3 | 2.25E−04 | −3.14 | 3.14 | −4.57 | 4.57
P68032* | Actin, alpha cardiac muscle 1/Name = ACTC1; Synonyms = ACTC | 2.30E−04 | −2.62 | 2.62 | −55.86 | 55.86
Q12905* | Interleukin enhancer-binding factor 2/Name = ILF2; Synonyms = NF45; ORFNames = PRO3063 | 2.58E−04 | −2.58 | 2.58 | −3.86 | 3.86
P21810* | Biglycan/Name = BGN; Synonyms = SLRR1A | 2.80E−04 | −3.62 | 3.62 | −21.57 | 21.57
P12956* | X-ray repair cross-complementing protein 6/Name = XRCC6; Synonyms = G22P1 | 3.17E−04 | −2.71 | 2.71 | −10.57 | 10.57
P67936* | Tropomyosin alpha-4 chain/Name = TPM4 | 3.48E−04 | −2.88 | 2.88 | −9.43 | 9.43
O60664* | Perilipin-3/Name = PLIN3; Synonyms = M6PRBP1, TIP47 | 5.02E−04 | −3.39 | 3.39 | −6.43 | 6.43
P06396* | Gelsolin/Name = GSN | 6.42E−04 | −2.49 | 2.49 | −8.43 | 8.43
P50995* | Annexin A11/Name = ANXA11; Synonyms = ANX11 | 8.64E−04 | −2.47 | 2.47 | −6.57 | 6.57
Table 6 provides information as to whether the marker proteins are relatively over-expressed (identified in bold) or under-expressed in normal cholangiocytes versus normal hepatocytes. Accordingly, by determining the presence, absence or change in expression levels of a plurality of these marker proteins and comparing these changes with a reference of known expression levels, one is able to determine whether the cells under test are normal cholangiocytes or normal hepatocytes.
P30740* | Leukocyte elastase inhibitor/Name = SERPINB1; Synonyms = ELANH2, MNEI, PI2 | 3.97E−07 | −6.1 | 6.1 | −9.9 | 9.9
Q96QK1* | Vacuolar protein sorting-associated protein 35/Name = VPS35; Synonyms = MEM3; ORFNames = TCCCTA00141 | 2.15E−06 | −5.61 | 5.61 | −5.38 | 5.38
Q9UHD8* | Septin-9/Name = SEPT9; Synonyms = KIAA0991, MSF | 3.32E−06 | −5.38 | 5.38 | −5.21 | 5.21
Q01995* | Transgelin/Name = TAGLN; Synonyms = SM22, WS3-10 | 6.29E−06 | −8.08 | 8.08 | −13.71 | 13.71
P50995* | Annexin A11/Name = ANXA11; Synonyms = ANX11 | 1.11E−05 | −9.87 | 9.87 | −8.67 | 8.67
Q07960* | Rho GTPase-activating protein 1/Name = ARHGAP1; Synonyms = CDC42GAP, RHOGAP1 | 1.15E−05 | −9.79 | 9.79 | −3.67 | 3.67
P08670* | Vimentin/Name = VIM | 1.42E−05 | −3.89 | 3.89 | −33.31 | 33.31
P08758* | Annexin A5/Name = ANXA5; Synonyms = ANX5, ENX2, PP4 | 1.70E−05 | −5.87 | 5.87 | −18.48 | 18.48
Q13838* | Spliceosome RNA helicase DDX39B/Name = DDX39B; Synonyms = BAT1, UAP56 | 2.36E−05 | −3.75 | 3.75 | −7.12 | 7.12
P09211* | Glutathione S-transferase P/Name = GSTP1; Synonyms = FAEES3, GST3 | 2.59E−05 | −7.63 | 7.63 | −39.45 | 39.45
O14979* | Heterogeneous nuclear ribonucleoprotein D-like/Name = HNRPDL; Synonyms = JKTBP | 2.78E−05 | −8.18 | 8.18 | −5.83 | 5.83
P37802* | Transgelin-2/Name = TAGLN2; Synonyms = KIAA0120; ORFNames = CDABP0035 | 2.84E−05 | −4.52 | 4.52 | −7.26 | 7.26
P60660* | Myosin light polypeptide 6/Name = MYL6 | 3.13E−05 | −3.8 | 3.8 | −6.83 | 6.83
P06396* | Gelsolin/Name = GSN | 3.33E−05 | −5.01 | 5.01 | −10.38 | 10.38
P08729* | Keratin, type II cytoskeletal 7/Name = KRT7; Synonyms = SCL | 3.41E−05 | −7.85 | 7.85 | −12.17 | 12.17
P08107* | Heat shock 70 kDa protein 1A/1B/Name = HSPA1A; Synonyms = HSPA1 | 3.76E−05 | −4.78 | 4.78 | −22.98 | 22.98
O00299* | Chloride intracellular channel protein 1/Name = CLIC1; Synonyms = G6, NCC27 | 4.23E−05 | −4.64 | 4.64 | −11.76 | 11.76
P31943* | Heterogeneous nuclear ribonucleoprotein H/Name = HNRNPH1; Synonyms = HNRPH, HNRPH1 | 4.37E−05 | −3.33 | 3.33 | −7.24 | 7.24
Q05707* | Collagen alpha-1(XIV) chain/Name = COL14A1; Synonyms = UND | 4.98E−05 | −7.27 | 7.27 | −19.33 | 19.33
P05455* | Lupus La protein/Name = SSB | 5.30E−05 | −3.69 | 3.69 | −4.43 | 4.43
P68366* | Tubulin alpha-4A chain/Name = TUBA4A; Synonyms = TUBA1 | 6.27E−05 | −3.82 | 3.82 | −40 | 40
Q08211* | ATP-dependent RNA helicase A/Name = DHX9; Synonyms = DDX9, LKP, NDH2 | 6.61E−05 | −5.12 | 5.12 | −10.1 | 10.1
O43390* | Heterogeneous nuclear ribonucleoprotein R/Name = HNRNPR; Synonyms = HNRPR | 6.75E−05 | −3.34 | 3.34 | −6.38 | 6.38
P19105* | Myosin regulatory light chain 12A/Name = MYL12A; Synonyms = MLCB, MRLC3, RLC | 7.33E−05 | −3.86 | 3.86 | −5.64 | 5.64
O14950* | Myosin regulatory light chain 12B/Name = MYL12B; Synonyms = MRLC2, MYLC2B | 7.33E−05 | −3.86 | 3.86 | −5.64 | 5.64
P61978* | Heterogeneous nuclear ribonucleoprotein K/Name = HNRNPK; Synonyms = HNRPK | 8.02E−05 | −4.28 | 4.28 | −12.88 | 12.88
P67936* | Tropomyosin alpha-4 chain/Name = TPM4 | 8.10E−05 | −4.86 | 4.86 | −7.21 | 7.21
P21810* | Biglycan/Name = BGN; Synonyms = SLRR1A | 8.14E−05 | −6.13 | 6.13 | −39.6 | 39.6
O00571* | ATP-dependent RNA helicase DDX3X/Name = DDX3X; Synonyms = DBX, DDX3 | 1.11E−04 | −3.66 | 3.66 | −4.93 | 4.93
P13010* | X-ray repair cross-complementing protein 5/Name = XRCC5; Synonyms = G22P2 | 1.25E−04 | −4.51 | 4.51 | −8.6 | 8.6
P09525* | Annexin A4/Name = ANXA4; Synonyms = ANX4 | 1.30E−04 | −5.9 | 5.9 | −68.21 | 68.21
P08727* | Keratin, type I cytoskeletal 19/Name = KRT19 | 1.31E−04 | −5.96 | 5.96 | −33 | 33
Q01105* | Protein SET/Name = SET | 1.33E−04 | −2.98 | 2.98 | −3.95 | 3.95
Q9NR45* | Sialic acid synthase/Name = NANS; Synonyms = SAS | 1.38E−04 | −3.09 | 3.09 | −3.43 | 3.43
Q00839* | Heterogeneous nuclear ribonucleoprotein U/Name = HNRNPU; Synonyms = HNRPU, SAFA, U21.1 | 1.43E−04 | −4.08 | 4.08 | −11.9 | 11.9
P12111* | Collagen alpha-3(VI) chain/Name = COL6A3 | 1.55E−04 | −3.89 | 3.89 | −57.57 | 57.57
Q13263* | Transcription intermediary factor 1-beta/Name = TRIM28; Synonyms = KAP1, RNF96, TIF1B | 1.58E−04 | −5.72 | 5.72 | −5.67 | 5.67
O60506* | Heterogeneous nuclear ribonucleoprotein Q/Name = SYNCRIP; Synonyms = HNRPQ, NSAP1 | 1.70E−04 | −2.89 | 2.89 | −5.83 | 5.83
P43243* | Matrin-3/Name = MATR3; Synonyms = KIAA0723 | 1.80E−04 | −2.85 | 2.85 | −4.69 | 4.69
O95994* | Anterior gradient protein 2 homolog/Name = AGR2; Synonyms = AG2; ORFNames = UNQ515/PRO1030 | 1.83E−04 | −5.55 | 5.55 | −12.83 | 12.83
P40121* | Macrophage-capping protein/Name = CAPG; Synonyms = AFCP, MCP | 1.91E−04 | −5.5 | 5.5 | −5.5 | 5.5
P15311* | Ezrin/Name = EZR; Synonyms = VIL2 | 1.93E−04 | −5.49 | 5.49 | −13.5 | 13.5
P62258* | 14-3-3 protein epsilon/Name = YWHAE | 2.02E−04 | −2.96 | 2.96 | −9.52 | 9.52
Q04917* | 14-3-3 protein eta/Name = YWHAH; Synonyms = YWHA1 | 2.20E−04 | −3.17 | 3.17 | −10.83 | 10.83
Q16555* | Dihydropyrimidinase-related protein 2/Name = DPYSL2; Synonyms = CRMP2, ULIP2 | 2.26E−04 | −4.73 | 4.73 | −10.88 | 10.88
O75367* | Core histone macro-H2A.1/Name = H2AFY; Synonyms = MACROH2A1 | 3.15E−04 | −4.24 | 4.24 | −11.52 | 11.52
P26599* | Polypyrimidine tract-binding protein 1/Name = PTBP1; Synonyms = PTB | 3.31E−04 | −3.17 | 3.17 | −10.24 | 10.24
P56470* | Galectin-4/Name = LGALS4 | 3.40E−04 | −2.63 | 2.63 | −18.6 | 18.6
P68371* | Tubulin beta-2C chain/Name = TUBB2C | 3.46E−04 | −2.61 | 2.61 | −42.24 | 42.24
Q9Y3I0* | tRNA-splicing ligase RtcB homolog/Name = C22orf28; ORFNames = HSPC117 | 4.35E−04 | −4.63 | 4.63 | −3 | 3
P04083* | Annexin A1/Name = ANXA1; Synonyms = ANX1, LPC1 | 4.62E−04 | −3.05 | 3.05 | −9.62 | 9.62
P22314* | Ubiquitin-like modifier-activating enzyme 1/Name = UBA1; Synonyms = A1S9T, UBE1 | 4.70E−04 | −2.74 | 2.74 | −10.45 | 10.45
P07195* | L-lactate dehydrogenase B chain/Name = LDHB | 4.77E−04 | −2.58 | 2.58 | −7.24 | 7.24
P12956* | X-ray repair cross-complementing protein 6/Name = XRCC6; Synonyms = G22P1 | 4.80E−04 | −3.94 | 3.94 | −10.98 | 10.98
Q96PK6* | RNA-binding protein 14/Name = RBM14; Synonyms = SIP | 4.93E−04 | −4.5 | 4.5 | −2.67 | 2.67
P55786* | Puromycin-sensitive aminopeptidase/Name = NPEPPS; Synonyms = PSA | 5.70E−04 | −3.16 | 3.16 | −4.93 | 4.93
P07384* | Calpain-1 catalytic subunit/Name = CAPN1; Synonyms = CANPL1; ORFNames = PIG30 | 5.70E−04 | −4.07 | 4.07 | −14.43 | 14.43
P60842* | Eukaryotic initiation factor 4A-I/Name = EIF4A1; Synonyms = DDX2A, EIF4A | 6.15E−04 | −2.67 | 2.67 | −7.1 | 7.1
P23528* | Cofilin-1/Name = CFL1; Synonyms = CFL | 6.28E−04 | −2.4 | 2.4 | −4.81 | 4.81
P00558* | Phosphoglycerate kinase 1/Name = PGK1; Synonyms = PGKA; ORFNames = MIG10, OK/SW-cl.110 | 6.29E−04 | −2.61 | 2.61 | −13.74 | 13.74
P06748* | Nucleophosmin/Name = NPM1; Synonyms = NPM | 6.64E−04 | −2.89 | 2.89 | −5.83 | 5.83
Q13151* | Heterogeneous nuclear ribonucleoprotein A0/Name = HNRNPA0; Synonyms = HNRPA0 | 7.14E−04 | −4.16 | 4.16 | −4.83 | 4.83
P50991* | T-complex protein 1 subunit delta/Name = CCT4; Synonyms = CCTD, SRB | 7.55E−04 | −3.16 | 3.16 | −4.21 | 4.21
Q15019* | Septin-2/Name = SEPT2; Synonyms = DIFF6, KIAA0158, NEDD5 | 7.71E−04 | −4.09 | 4.09 | −4.5 | 4.5
P20774* | Mimecan/Name = OGN; Synonyms = OIF, SLRR3A | 8.69E−04 | −3.99 | 3.99 | −20.33 | 20.33
P54578* | Ubiquitin carboxyl-terminal hydrolase 14/Name = USP14; Synonyms = TGT | 8.82E−04 | −3.97 | 3.97 | −2.83 | 2.83
P51888* | Prolargin/Name = PRELP; Synonyms = SLRR2A | 9.02E−04 | −3.64 | 3.64 | −30.1 | 30.1
P09960* | Leukotriene A-4 hydrolase/Name = LTA4H; Synonyms = LTA4 | 9.14E−04 | −2.52 | 2.52 | −4 | 4
P27348* | 14-3-3 protein theta/Name = YWHAQ | 9.53E−04 | −2.97 | 2.97 | −7.31 | 7.31
Table 7 provides information as to whether the marker proteins are relatively over-expressed (identified in bold) or under-expressed in hepatocellular carcinoma versus cholangiocarcinoma in post-TACE liver tumours. Accordingly, by determining the presence, absence or change in expression levels of a plurality of these marker proteins and comparing these changes with a reference of known expression levels, one is able to determine whether the cells under test in post-TACE liver tumours are hepatocellular carcinoma or cholangiocarcinoma.
All the marker proteins in section A of Table 7 are proteins with q-values less than or equal to 0.05. Those marker proteins in bold text (and with a negative effect size (g)) were less abundant in the HCC regions of the post-TACE tumours relative to the CC regions of the post-TACE tumours. All the marker proteins in Table 7B are marker proteins with p-values less than or equal to 0.05.
P68032* | Actin, alpha cardiac muscle 1/Name = ACTC1; Synonyms = ACTC | 5.30E−06 | −5 | 5 | −36.86 | 36.86
P00491* | Purine nucleoside phosphorylase/Name = PNP; Synonyms = NP | 1.19E−05 | −3.7 | 3.7 | −6.71 | 6.71
Q9UHD8* | Septin-9/Name = SEPT9; Synonyms = KIAA0991, MSF | 4.03E−05 | −3.18 | 3.18 | −5.43 | 5.43
Q04917* | 14-3-3 protein eta/Name = YWHAH; Synonyms = YWHA1 | 4.45E−05 | −3.11 | 3.11 | −10.71 | 10.71
Q14974* | Importin subunit beta-1/Name = KPNB1; Synonyms = NTF97 | 1.01E−04 | −3.14 | 3.14 | −4.43 | 4.43
Q00839* | Heterogeneous nuclear ribonucleoprotein U/Name = HNRNPU; Synonyms = HNRNPU, SAFA, U21.1 | 5.13E−04 | −2.36 | 2.36 | −9.14 | 9.14
P13010* | X-ray repair cross-complementing protein 5/Name = XRCC5; Synonyms = G22P2 | 5.80E−04 | −2.36 | 2.36 | −6 | 6
P07737* | Profilin-1/Name = PFN1 | 6.04E−04 | −2.52 | 2.52 | −6.14 | 6.14
P04632* | Calpain small subunit 1/Name = CAPNS1; Synonyms = CAPN4, CAPNS | 6.49E−04 | −2.39 | 2.39 | −3.71 | 3.71
P52907* | F-actin-capping protein subunit alpha-1/Name = CAPZA1 | 6.74E−04 | −2.41 | 2.41 | −4.29 | 4.29
P52565* | Rho GDP-dissociation inhibitor 1/Name = ARHGDIA; Synonyms = GDIA1 | 7.34E−04 | −2.28 | 2.28 | −3.57 | 3.57
O15144* | Actin-related protein 2/3 complex subunit 2/Name = ARPC2; Synonyms = ARC34; ORFNames = PRO2446 | 8.69E−04 | −2.59 | 2.59 | −5.57 | 5.57
P63261* | Actin, cytoplasmic 2/Name = ACTG1; Synonyms = ACTB, ACTG | 8.89E−04 | −2.74 | 2.74 | −93.86 | 93.86
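By way of illustration only, the selection of markers by significance and by the sign of the effect size (g) may be sketched as follows. The sketch assumes that the first numeric value of each row is the p-value and the second is the effect size (g); this column assignment, the names MarkerRow and split_by_direction, and the group labels are assumptions made for the example and are not taken from the Tables themselves. The three rows are copied from the listing above.

```python
from typing import NamedTuple


class MarkerRow(NamedTuple):
    accession: str
    gene: str
    p_value: float        # assumed: first numeric column of the row
    effect_size_g: float  # assumed: second numeric column of the row


def split_by_direction(rows, alpha=0.05):
    """Keep rows with p <= alpha and group gene names by the sign of g."""
    groups = {"negative_g": [], "positive_g": []}
    for row in rows:
        if row.p_value > alpha:
            continue  # not significant at the chosen cut-off
        if row.effect_size_g < 0:
            groups["negative_g"].append(row.gene)
        elif row.effect_size_g > 0:
            groups["positive_g"].append(row.gene)
    return groups


rows = [
    MarkerRow("P68032", "ACTC1", 5.30e-06, -5.0),
    MarkerRow("P00491", "PNP", 1.19e-05, -3.7),
    MarkerRow("Q9UHD8", "SEPT9", 4.03e-05, -3.18),
]
groups = split_by_direction(rows)
print(groups)
```

All three example rows carry a negative effect size and therefore fall into the negative_g group; rows with an effect size of exactly zero are left in neither group.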
Table 8 provides information as to whether the marker proteins are relatively over-expressed (identified in bold) or under-expressed in peripheral cholangiocarcinoma versus metastatic colorectal cancer in post-TACE liver tumours. Accordingly, by determining the presence, absence or change in expression levels of a plurality of these marker proteins and comparing these changes with a reference of known expression levels, one is able to determine whether the cells under test are peripheral cholangiocarcinoma or metastatic colorectal cancer.
P17987* | T-complex protein 1 subunit alpha/Name = TCP1; Synonyms = CCT1, CCTA | 9.41E−05 | −2.98 | 2.98 | −5.29 | 5.29
Q8NFW8* | N-acylneuraminate cytidylyltransferase/Name = CMAS | 1.05E−04 | −2.91 | 2.91 | −2.71 | 2.71
P15374* | Ubiquitin carboxyl-terminal hydrolase isozyme L3/Name = UCHL3 | 2.53E−04 | −2.61 | 2.61 | −3.57 | 3.57
Q99829* | Copine-1/Name = CPNE1; Synonyms = CPN1 | 6.96E−04 | −2.32 | 2.32 | −3.29 | 3.29
P48643* | T-complex protein 1 subunit epsilon/Name = CCT5; Synonyms = CCTE, KIAA0098 | 9.45E−04 | −2.27 | 2.27 | −3.86 | 3.86
Table 9 provides information as to whether the marker proteins are relatively over-expressed or under-expressed in hilar cholangiocarcinoma versus HCC plus primary sclerosing cholangitis in post-TACE liver tumours. Accordingly, by determining the presence, absence or change in expression levels of a plurality of these marker proteins and comparing these changes with a reference of known expression levels, one is able to determine whether the cells under test are hilar cholangiocarcinoma or primary sclerosing cholangitis.
Table 10 lists marker proteins that showed a significant difference between hilar cholangiocarcinoma and colorectal metastasis.
Table 11 (
Accordingly, the invention provides for the first time marker proteins which, by determining their relative expression by any appropriate means, can distinguish between different liver cell phenotypes. The disclosure herein provides details of how such expression levels may be determined and/or quantified, but any appropriate method may be employed and the method of quantification itself is not a limiting component of the invention.
The invention provides for a method of determining the cellular phenotype of a liver tissue sample said method comprising
The invention also provides for a method of identifying the cellular phenotype of a liver cell, said method comprising
Preferably the known cellular phenotypes comprise normal liver epithelium cells (hepatocytes), normal biliary epithelium cells (cholangiocytes), hepatocellular carcinoma cells, peripheral cholangiocellular carcinoma cells and hilar cholangiocellular carcinoma cells.
In all cases, the plurality of marker proteins may be selected from a biomarker panel as represented by the relevant Table as a whole or from Part A of the Table which contains those marker proteins showing the highest statistically significant difference. Alternatively the plurality of marker proteins may be selected from those shown to be over-expressed (identified in bold) or under-expressed as compared to the two cell types from either the whole Table or part A.
Alternatively, the plurality of marker proteins may be selected from Table 11 for the relevant tissue comparison.
In all cases, the plurality of marker proteins may comprise 2, 3, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 120 or more protein markers provided in the respective Tables. With respect to Table 5, the plurality of marker proteins may comprise 2, 3, 5, 10, 20, 30, 40, 50, 60, 70, or more protein markers.
Alternatively, the method may comprise determining the presence or change in level of expression of at least 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80% or 90% of the protein markers provided in any one of Tables 2 to 10.
In one embodiment, the method may determine the presence or change in level of expression of 100% of the protein markers provided in any one of Tables 2 to 10 or Table 11 or the relevant section of Table 11.
Preferably, the plurality of marker proteins are selected from any one of Collagen alpha 1 (XVIII) chain, Plastin-3, AKR1B10, Fibronectin, Beta 3 tubulin, Asporin, 14-3-3 protein eta or Dihydropyrimidinase-related protein 3 or combinations thereof, preferably the plurality of marker proteins comprises AKR1B10 and/or Beta 3 tubulin.
The liver tissue sample may be a biopsy sample taken from an individual suspected of having a liver tumor. Alternatively, the biopsy may be taken from an individual having previously received treatment for a liver tumor, such as surgery or transplantation, with or without transarterial chemoembolization.
The step of determining expression levels of said plurality of marker proteins includes determining the presence or absence of the marker proteins in said sample as well as the degree of change in expression levels. When compared to a reference or standard of known expression levels for the cellular phenotype, the presence, absence or change in degree of expression will be indicative of the cellular phenotype.
For this and all other aspects of the invention, the reference or standard protein expression levels may be determined from non-tumor liver tissue from the same subject. In this way, the difference in protein expression levels may be used to determine the cellular phenotype of the liver tumor. Alternatively, the reference levels may be a database comprising data representing expression levels for the marker proteins of interest as selected from any one or more of Tables 2 to 10 or the relevant section of Table 11. Ideally, the reference levels are provided by a liver tumor classification system, such as according to the present invention. The data representing expression levels may be a collection of data obtained from multiple liver samples and presented as an average or range. The data may relate to the levels of specific peptides each being unique to a protein of interest.
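As a minimal sketch of the comparison against matched non-tumor tissue, the presence, absence or change in expression of each marker may be called from the ratio of tumor to reference levels. The function name classify_changes, the two-fold cut-off, the marker names and all numeric levels below are hypothetical and chosen only for illustration; the disclosure does not prescribe a particular threshold.

```python
import math


def classify_changes(tumor, reference, min_fold=2.0):
    """Call presence, absence or change for each marker in the reference."""
    threshold = math.log2(min_fold)
    calls = {}
    for marker, ref_level in reference.items():
        level = tumor.get(marker, 0.0)
        if level <= 0 and ref_level <= 0:
            calls[marker] = "absent in both"
        elif ref_level <= 0:
            calls[marker] = "present in tumor only"
        elif level <= 0:
            calls[marker] = "absent in tumor"
        else:
            log2_ratio = math.log2(level / ref_level)
            if log2_ratio >= threshold:
                calls[marker] = "over-expressed"
            elif log2_ratio <= -threshold:
                calls[marker] = "under-expressed"
            else:
                calls[marker] = "unchanged"
    return calls


# Hypothetical levels in arbitrary units (not measured data).
tumor = {"marker_1": 8.0, "marker_2": 1.0, "marker_3": 0.0, "marker_4": 0.5}
reference = {"marker_1": 1.0, "marker_2": 1.2, "marker_3": 3.0, "marker_4": 4.0}
calls = classify_changes(tumor, reference)
print(calls)
```

Working on the log2 scale treats an increase and a decrease of the same fold symmetrically, which is why the same threshold is applied in both directions.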
The biomarkers provided in Table 5 allow for the first time accurate and reliable distinction to be made between HCC and CC cells.
In particular the marker proteins selected from Table 5 or fragments thereof, or antibodies against said proteins or nucleic acids encoding said proteins or fragments thereof, can be used as a marker for the determination of cellular phenotype of a liver cell wherein said cellular phenotype is selected from HCC or CC.
Preferably, there is provided a method of determining the cellular phenotype of a liver tumor cell, said method comprising
Where the liver tumor cell is from a biopsy taken from an individual having previously received treatment for a liver tumor, such as surgery or transplantation with transarterial chemoembolization, it may be preferable to determine the protein expression levels of a plurality of marker proteins selected from Table 7, in addition to, or instead of, those selected from Table 5.
Hence, in one embodiment, the method of determining the cellular phenotype of a liver tumor cell, comprises:
It will also be appreciated that the marker proteins determined herein may also be used as tumor antigens for the purpose of diagnostic and/or prognostic methods and/or for selecting or determining a treatment regimen for an individual based on determination of a cellular phenotype of the liver tumour cell. For example, the marker proteins or fragments thereof may be secreted or lost into the bloodstream as a result of cell death and may therefore be detected from blood, urine or saliva samples using standard techniques, e.g. antibodies. The detection of such protein markers (e.g. tumor antigens) will enable the clinician to determine whether the individual has liver cancer and the cellular classification of the tumor. Thus, it is envisaged that the same methods may be used to diagnose liver tumor at an early stage using samples, including but not exclusively, from blood, saliva or urine.
Hence, there is provided a method for the diagnosis or prognostic monitoring of a liver tumor in an individual, said method comprising
Furthermore, there is provided a method for determining a treatment regimen for an individual having a liver tumor, said method comprising
In particular, the methods according to the invention may be based on the determination of cellular phenotypes of the liver tumor cells based on the protein expression levels identified in respect of a plurality of protein markers provided in Table 5 and/or Table 7.
The method may comprise comparing the determined expression levels with a previously determined reference level for said plurality of marker proteins.
The reference level is preferably a pre-determined level, which may, for example, be provided in the form of an accessible data record. The reference level is preferably representative of the expression levels of a number of the biomarkers identified in Table 5 and/or Table 7, each one being the derived mean and range of values obtained from known cellular phenotypes. It will be appreciated that in other embodiments the reference level may be representative of the expression levels of marker proteins selected from other biomarker panels represented by the Tables provided herein depending on the cellular phenotype under investigation.
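One possible way of holding such a reference as a data record, sketched here under stated assumptions, is to store the mean and range of each marker for each known phenotype and to count, for a test sample, how many markers fall within each phenotype's range. The function names build_reference and score_sample, the marker names and all values are hypothetical; a counting rule of this kind is only one of many ways a comparison could be made.

```python
from statistics import mean


def build_reference(samples_by_phenotype):
    """Derive mean and range per marker from samples of known phenotype."""
    reference = {}
    for phenotype, samples in samples_by_phenotype.items():
        markers = samples[0].keys()
        reference[phenotype] = {
            m: {
                "mean": mean(s[m] for s in samples),
                "low": min(s[m] for s in samples),
                "high": max(s[m] for s in samples),
            }
            for m in markers
        }
    return reference


def score_sample(sample, reference):
    """Count the markers of a sample lying within each phenotype's range."""
    return {
        phenotype: sum(
            1
            for m, r in panel.items()
            if m in sample and r["low"] <= sample[m] <= r["high"]
        )
        for phenotype, panel in reference.items()
    }


# Hypothetical expression levels in arbitrary units (not measured data).
known = {
    "HCC": [{"marker_1": 1.0, "marker_2": 9.0}, {"marker_1": 2.0, "marker_2": 11.0}],
    "CC": [{"marker_1": 8.0, "marker_2": 2.0}, {"marker_1": 10.0, "marker_2": 3.0}],
}
reference = build_reference(known)
scores = score_sample({"marker_1": 9.0, "marker_2": 2.5}, reference)
print(scores)
```

In this made-up example both markers of the test sample lie within the CC ranges and neither lies within the HCC ranges, so the CC phenotype receives the higher count.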
The liver tumor cell is preferably from a liver tumor biopsy from the individual and more preferably the biomarker panel is represented by Table 5, more preferably Table 5, Part A.
Still further, it is preferred that the plurality of marker proteins are selected from the biomarker panel of Table 11 section 2_5.
Where the biopsy is from a patient having previously been treated with transarterial chemoembolization (TACE), it is preferred that the plurality of marker proteins are selected from the biomarker panel of Table 7 or Table 7, Part A. More preferably, the plurality of marker proteins are selected from Table 11 section 3_4.
For all methods provided herein, it is envisaged that a further step of determining expression levels for a second set of marker proteins may be performed. The second set of marker proteins may be selected from the same biomarker panel as the first set, or may be selected from a different biomarker panel as represented by the Tables herein. For example, the method may include firstly determining the expression levels for a plurality of marker proteins selected from Table 5 or relevant section of Table 11, and then determining the expression levels of a plurality of marker proteins selected from Table 7 (or relevant section of Table 11). The expression levels for the first and second set of marker proteins may be measured sequentially or at the same time.
Determining the presence or change in expression level of the plurality of marker proteins may be achieved in many ways all of which are well within the capabilities of the skilled person.
The determination may involve direct quantification of marker protein levels, of nucleic acid encoding those marker proteins or it may involve indirect quantification, e.g. using an assay that provides a measure that is correlated with the amount of marker protein present.
Accordingly, determining the presence or level of expression of the plurality of marker proteins may comprise
The binding member may be an antibody specific for a marker protein or a part thereof, or it may be a nucleic acid molecule which binds to a nucleic acid molecule representing the presence, increase or decrease of expression of a marker protein, e.g. an mRNA sequence.
The antibodies raised against specific marker proteins may be directed against any biologically relevant state of the marker protein. Thus, for example, they can be raised against the unglycosylated form of a protein which exists in the body in a glycosylated form, against a precursor form of the protein, or a more mature form of the precursor protein, e.g. minus its signal sequence, or against a peptide carrying a relevant epitope of the marker protein. The detection and/or quantification may include preparing a standard curve using standards of known expression levels of the one or more marker proteins and comparing to the level of complex obtained in step (b) above.
A variety of methods may be suitable for determining the presence or changes in level of the plurality of marker proteins: by way of non-limiting example, these include Western blot, ELISA (Enzyme-Linked Immunosorbent Assay), RIA (Radioimmunoassay), Competitive EIA (Competitive Enzyme Immunoassay), DAS-ELISA (Double Antibody Sandwich-ELISA), Liquid Immunoarray technology, immunocytochemical or immunohistochemical techniques, techniques based on the use of protein microarrays that include specific antibodies, “dipstick” assays, affinity chromatography techniques and liquid binding assays.
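The use of a standard curve may be sketched as follows, assuming for simplicity that the signal is linear in the amount of marker protein over the range of the standards. That linearity, the function names fit_standard_curve and estimate_concentration, and the standard values are assumptions made for the example; immunoassay curves are often fitted with non-linear models in practice, and the values shown are not measured data.

```python
def fit_standard_curve(concentrations, signals):
    """Least-squares straight line: signal = slope * concentration + intercept."""
    n = len(concentrations)
    mean_x = sum(concentrations) / n
    mean_y = sum(signals) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(concentrations, signals))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_concentration(signal, slope, intercept):
    """Invert the fitted line to read a concentration from a measured signal."""
    return (signal - intercept) / slope


# Hypothetical standards: known amounts and the signal each one produced.
standard_concentrations = [0.0, 1.0, 2.0, 4.0]
standard_signals = [0.1, 0.6, 1.1, 2.1]

slope, intercept = fit_standard_curve(standard_concentrations, standard_signals)
unknown = estimate_concentration(1.6, slope, intercept)
print(round(slope, 3), round(intercept, 3), round(unknown, 3))
```

A reading outside the range spanned by the standards would be an extrapolation and would ordinarily be repeated at a different sample dilution rather than relied upon.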
Antibodies may be obtained using techniques which are standard in the art. Methods of producing antibodies include immunising a mammal (e.g. mouse, rat, rabbit, horse, goat, sheep or monkey) with the protein or a fragment thereof. Antibodies may be obtained from immunised animals using any of a variety of techniques known in the art, and screened, preferably using binding of antibody to antigen of interest. For instance, Western blotting techniques or immunoprecipitation may be used (Armitage et al, Nature, 357:80-82, 1992). Isolation of antibodies and/or antibody-producing cells from an animal may be accompanied by a step of sacrificing the animal. As an alternative or supplement to immunising a mammal with a peptide, an antibody specific for a protein may be obtained from a recombinantly produced library of expressed immunoglobulin variable domains, e.g. using lambda bacteriophage or filamentous bacteriophage which display functional immunoglobulin binding domains on their surfaces; for instance see WO92/01047. The library may be naive, that is constructed from sequences obtained from an organism which has not been immunised with any of the proteins (or fragments), or may be one constructed using sequences obtained from an organism which has been exposed to the antigen of interest.
Antibodies according to the present invention may be modified in a number of ways that are well known in the art. Indeed the term “antibody” should be construed as covering any binding substance having a binding domain with the required specificity. Thus the invention covers antibody fragments, derivatives, functional equivalents and homologues of antibodies, including synthetic molecules and molecules whose shape mimics that of an antibody enabling it to bind an antigen or epitope. Humanised antibodies in which CDRs from a non-human source are grafted onto human framework regions, typically with the alteration of some of the framework amino acid residues, to provide antibodies which are less immunogenic than the parent non-human antibodies, are also included within the present invention.
A hybridoma producing a monoclonal antibody according to the present invention may be subject to genetic mutation or other changes. It will further be understood by those skilled in the art that a monoclonal antibody can be subjected to the techniques of recombinant DNA technology to produce other antibodies or chimeric molecules which retain the specificity of the original antibody. Such techniques may involve introducing DNA encoding the immunoglobulin variable region, or the complementarity determining regions (CDRs), of an antibody to the constant regions, or constant regions plus framework regions, of a different immunoglobulin. See, for instance, EP 0 184 187 A, GB 2 188 638 A or EP 0 239 400 A. Cloning and expression of chimeric antibodies are described in EP 0 120 694 A and EP 0 125 023 A.
Preferred antibodies for use in accordance with the methods disclosed herein are isolated, in the sense of being free from contaminants such as antibodies able to bind other polypeptides and/or free of serum components. Monoclonal antibodies are preferred for some purposes, though polyclonal antibodies are within the scope of the present invention. For example, the primary monoclonal antibodies used herein were anti-AKR1B10 (clone 1A6; 1:500; Abcam, Cambridge, UK) and anti-tubulin beta 3 (clone TU20; 1:500; Abcam).
The binding of antibodies on a sample may be determined by any appropriate means. Tagging with individual reporter molecules is one possibility. The reporter molecules may directly or indirectly generate detectable, and preferably measurable, signals. The linkage of reporter molecules may be direct or indirect, and covalent (e.g. via a peptide bond) or non-covalent. Linkage via a peptide bond may be as a result of recombinant expression of a gene fusion encoding antibody and reporter molecule. One favoured mode is by covalent linkage of each antibody with an individual fluorochrome, phosphor or laser exciting dye with spectrally isolated absorption or emission characteristics. Suitable fluorochromes include fluorescein, rhodamine, phycoerythrin and Texas Red. Suitable chromogenic dyes include diaminobenzidine.
Other reporters include macromolecular colloidal particles or particulate material such as latex beads that are coloured, magnetic or paramagnetic, and biologically or chemically active agents that can directly or indirectly cause detectable signals to be visually observed, electronically detected or otherwise recorded. These molecules may be enzymes which catalyse reactions that develop or change colours or cause changes in electrical properties, for example. They may be molecularly excitable, such that electronic transitions between energy states result in characteristic spectral absorptions or emissions. They may include chemical entities used in conjunction with biosensors. Biotin/avidin or biotin/streptavidin and alkaline phosphatase detection systems may be employed.
The determination of over-expression of the one or more (or plurality of) marker proteins according to the present invention may be carried out in many different ways well known to those skilled in the art that include, by way of example, determining the presence or amount of expression of said marker protein, or a fragment thereof, in the sample (tissue or blood) obtained from the individual, or determining the expression of the marker protein gene, for example by examining the marker protein mRNA levels expressed from the marker protein gene.
Preferably, the methods comprise detecting the expression levels of the marker proteins. Such detection may involve the step of contacting an antibody or antibody fragment capable of recognising said polypeptide, or fragment thereof, with said sample (tissue or blood).
The analysis may comprise a qualitative analysis, e.g. by monitoring the presence of the one or more marker proteins by microscopy, e.g. using immunohistochemical staining. Immunohistochemical analysis can be performed on either formalin-fixed, paraffin-embedded samples or on frozen tissue samples. Examples of possible IHC methods which could be used to detect and quantify the one or more marker proteins are as described in the present invention.
In one aspect, the present invention provides for a method for diagnosing recurrent or primary liver tumor in a subject, the method comprising determining the presence or absence of one or more marker proteins selected from the group consisting of Collagen alpha 1 (XVIII) chain, Plastin-3, AKR1B10, Fibronectin, Beta 3 tubulin, Asporin, 14-3-3 protein eta, and Dihydropyrimidinase-related protein 3 in a sample. Preferably, the liver tumor is selected from the group consisting of hepatocellular carcinoma, peripheral cholangiocellular carcinoma or hilar cholangiocellular carcinoma cells.
In one embodiment of this aspect the marker protein is Beta 3 tubulin and/or AKR1B10, preferably Beta 3 tubulin.
In another embodiment, the sample is selected from any one of blood, plasma, serum, liver tissue, liver cells or combinations thereof, preferably the sample is liver tissue, optionally formalin-fixed paraffin-embedded liver tissue section.
In another embodiment, determining the presence or absence of one or more marker proteins in the sample is performed by either immunohistochemistry (IHC) or mass spectrometry.
In one preferred embodiment, the method for diagnosing recurrent or primary liver tumor in a subject comprises determining the presence or absence of Beta 3 tubulin, and optionally, AKR1B10, in a sample, wherein the liver tumor is selected from the group consisting of hepatocellular carcinoma, peripheral cholangiocellular carcinoma or hilar cholangiocellular carcinoma cells and wherein the sample is liver tissue, optionally formalin-fixed paraffin-embedded liver tissue section and wherein determining the presence or absence of one or more marker proteins in the sample is performed by Immunohistochemistry (IHC).
More preferably, the method comprises determining Beta 3 tubulin with a primary antibody.
By way of further example, a primary antibody that is capable of specifically binding to a marker protein, e.g. Beta 3 tubulin and/or AKR1B10, in a binding assay may be labelled with a detectable molecule such as, but not limited to, a radioactive or fluorescent label or an enzyme which utilises a chromogenic substrate. Examples of radiolabels of use in this technique are 32P, 3H or 14C. Examples of fluorescent molecules of use in this technique are green fluorescent protein, fluorescein isothiocyanate (FITC), tetramethylrhodamine isothiocyanate (TRITC), and Cy3 and Cy5 dyes. Examples of enzymes with chromogenic substrates of possible use in this technique are peroxidase, alkaline phosphatase or glucose oxidase.
Instead of detecting the signal from the primary antibody itself (as described above), a secondary antibody which binds to the primary antibody can be utilised.
The secondary antibody may be labelled with a suitable molecule for detection purposes examples of which are described above.
In an alternative method of detection the primary or secondary antibody may be labelled with a biotin molecule which can then be bound by a streptavidin or avidin linked enzyme with a suitable chromogenic substrate for detection.
Additional variations of the above techniques exist that will be apparent to someone skilled in the art.
In the context of this invention antibodies which could be used in such a technique could be generated by standard techniques involving immunisation of animals or could be generated in vitro by recombinant techniques. Antibodies could in this context be whole immunoglobulins or fragments of antibodies (Fab fragments) that correspond to the anti-idiotype. Such antibodies can be readily produced by the skilled person as discussed above.
The invention demonstrates the use of histological analysis to detect marker proteins, from which the cellular phenotype, and the appropriate diagnosis and prognosis for the individual, may be determined.
In one embodiment, the method comprises the measurement of a plurality of marker proteins, preferably including tubulin beta 3 and/or AKR1B10 proteins in a liver tumour tissue section. The section may be a fresh-frozen section or formalin-fixed, paraffin embedded section such as is routine in the art of histology. Staining of sections may require a step of antigen retrieval prior to detection with a primary antibody specific for the target protein. Accordingly, the invention provides a method of determining the expression level of one or more marker proteins (preferably a plurality) using a binding member such as an antibody. Materials and methods relating to such assays are described in more detail below.
Alternatively, antibodies to the plurality of marker proteins may be detected in the blood or saliva of patients suspected of having liver cancer, using the marker proteins or fragments thereof as a detection agent.
In a further embodiment, the determination of the one or more (or plurality) of marker proteins in a sample from the individual may comprise the detection and quantification of autoantibodies. The marker protein or fragment thereof must be capable of specifically binding to such an autoantibody. Techniques such as ELISA may be used. An altered concentration of the plurality of marker proteins may be identified by detecting the presence or altered levels of autoantibody thereto, compared to the level in a reference or control sample.
The level of autoantibody may be detected by Western blot (from 1D or 2D electrophoresis) against liver cell or liver tumor cells obtained from a biopsy or cell lines grown in vitro; or by ELISA, protein microarray or bead suspension array using purified marker proteins.
By way of example, detection of autoantibodies to marker proteins in different liver cell phenotypes can be carried out as follows. Recombinant marker proteins are expressed in baculovirus infected insect cells and used to coat the surface of microtitre plates. A blood or saliva sample, preferably a blood plasma sample and more preferably a blood serum sample, is added to duplicate wells of each microtitre plate and incubated at 37° C. for 1 hour. Plates are aspirated and washed prior to the addition of a horse-radish peroxidase (HRP) labelled anti-human IgG antiserum and incubated for 1 hour at 37° C. Finally, binding of the anti-human antiserum is revealed by aspirating the plates, washing, and then adding tetra-methylbenzidine (TMB), which in the presence of HRP produces a coloured product, the intensity of which is measured by reading the plates at 450 nm. An identical set of plates is tested with the exception that the second antibody is a HRP labelled anti-human IgM antiserum. The levels of IgG, IgE, IgA, IgD and/or IgM autoantibodies to each of the liver cell or liver tumor cell marker proteins are altered when compared to the levels found in reference standards or control samples.
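The comparison of duplicate-well readings against a control can be illustrated with a minimal sketch. It assumes blank-corrected absorbance at 450 nm and a simple fold-change cutoff; the function names, the example readings and the two-fold cutoff are illustrative assumptions and are not taken from the description.

```python
from statistics import mean


def autoantibody_ratio(sample_od450, control_od450, blank_od450=0.0):
    """Blank-corrected mean OD450 of the sample wells divided by that of the control wells."""
    sample = mean(sample_od450) - blank_od450
    control = mean(control_od450) - blank_od450
    if control <= 0:
        raise ValueError("control signal must exceed the blank")
    return sample / control


def is_altered(ratio, fold_cutoff=2.0):
    """Flag a level as altered if it differs from control by fold_cutoff in either direction.

    The cutoff is a hypothetical value chosen only for illustration.
    """
    return ratio >= fold_cutoff or ratio <= 1.0 / fold_cutoff


# Hypothetical duplicate-well readings for one marker protein.
ratio = autoantibody_ratio([0.80, 0.90], [0.20, 0.30], blank_od450=0.05)
print(round(ratio, 2), is_altered(ratio))
```

In practice the cutoff would be set from the distribution of readings in reference standards or control samples rather than fixed in advance.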
In other embodiments, autoantibodies to the plurality of protein markers may be detected using the Western blotting approach using cells from the liver tumor sample, and then detecting the presence of antibodies specific for the protein markers that are present in the tumor.
It is contemplated within the invention to use (i) an antibody chip or array of chips, or a bead suspension array capable of detecting the plurality of marker proteins that interact with that antibody; or (ii) a protein chip or array of chips, or bead suspension array capable of detecting one or more autoantibodies that interact with the marker proteins; or (iii) a combination of both antibody arrays and protein arrays.
A further class of specific binding members contemplated herein in accordance with any aspect of the invention comprise aptamers (including nucleic acid aptamers and peptide aptamers). Advantageously, an aptamer directed to a protein marker may be provided by a technique known as SELEX (Systematic Evolution of Ligands by Exponential Enrichment), described in U.S. Pat. Nos. 5,475,096 and 5,270,163.
Alternatively, differential expression of nucleic acids encoding marker proteins may be used as a detection method. Expression of nucleic acids may be detected by methods known in the art, such as RT-PCR, Northern blotting or in situ hybridisation such as FISH.
Gene expression technologies such as reverse transcriptase-polymerase chain reaction (RT-PCR) can give accurate measurement of mRNA expression levels, and the presence, as opposed to the absence, of mRNA encoding the one or more marker proteins in a sample could also be used to provide the cellular phenotype classification. RT-PCR can be performed in a range of formats including quantitative versions and with sensitivities that enable the determination of mRNA levels in a single cell.
In one embodiment, the expression of the marker protein gene can be assessed by determining the presence or amount of marker protein mRNA in the sample and methods for doing this are well known to the skilled person. By way of example, they include determining the presence of marker protein mRNA in the sample (i) using a labelled probe capable of hybridising to the marker protein nucleic acid; and/or (ii) using PCR involving one or more primers based on a marker protein nucleic acid sequence to determine whether the marker protein transcript is present in a sample. The probe may also be immobilised as a sequence included in a microarray.
In accordance with these and other aspects of the invention, the plurality of marker proteins are selected from any one of Collagen alpha 1 (XVIII) chain, Plastin-3, AKR1B10, Fibronectin, Beta 3 tubulin, Asporin, 14-3-3 protein eta or Dihydropyrimidinase-related protein 3 or combinations thereof, preferably the plurality of marker proteins comprises AKR1B10 and/or Beta 3 tubulin.
In some embodiments, the determination of the presence or amount of the plurality of protein markers comprises measuring the presence or amount of mRNA derived from the cell under test. The presence or level of mRNA encoding the protein marker in the liver cells under examination will allow the cell to be classified according to its phenotype.
Techniques suitable for measuring the level of protein marker encoding mRNA are readily available to the skilled person and include “real-time” reverse transcriptase PCR or Northern blots. The method of measuring the level of a protein marker encoding mRNA may comprise using a plurality of primers or probes that are each independently directed to the sequence of one of the plurality of protein marker encoding genes or complement thereof. Each of the primers or probes may comprise a nucleotide sequence of at least 10, 15, 20, 25, 30 or 50 contiguous nucleotides that has at least 70%, 80%, 90%, 95%, 98%, 99% or 100% identity to a nucleotide sequence encoding the protein marker provided in Table 5 (or any of Tables 2 to 10 or the relevant section of combined Table 11).
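The length and identity criteria for a primer or probe can be checked with a short sketch. It assumes ungapped comparison of the primer against every window of the target sequence, which is a simplification of a full alignment, and it ignores the complementary strand; the sequences and function names are hypothetical.

```python
def percent_identity(a, b):
    """Percentage of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def best_window_identity(primer, target):
    """Slide the primer along the target without gaps and return the highest identity."""
    n = len(primer)
    if n == 0 or n > len(target):
        raise ValueError("primer must be non-empty and no longer than the target")
    return max(
        percent_identity(primer, target[i:i + n])
        for i in range(len(target) - n + 1)
    )


def meets_criteria(primer, target, min_length=10, min_identity=70.0):
    """True if the primer has the minimum length and the minimum identity to the target."""
    return len(primer) >= min_length and best_window_identity(primer, target) >= min_identity


# Hypothetical sequences used only to exercise the functions.
target = "ATGGCCAAGCTTGGATCCTAG"
print(best_window_identity("GCCAAGCTTA", target), meets_criteria("GCCAAGCTTA", target))
```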
Preferably, the probes or primers according to the invention hybridise under stringent conditions to their specific protein marker encoding nucleic acid sequence.
The methods of the invention may comprise contacting the liver cell with a binding member as described above, but also includes contacting the binding member with cell lysate to increase contact directly or indirectly with the one or more of the marker proteins.
The binding members may be immobilised on a solid support. This may be in the form of an antibody array or a nucleic acid microarray. Arrays such as these are well known in the art. The solid support may be contacted with the cell lysate, thereby allowing the binding members to bind to the cell products representing the presence or amount of the one or more marker proteins.
In some embodiments, the binding member is an antibody or fragment thereof which is capable of binding to a marker protein or part thereof. In other embodiments, the binding member may be a nucleic acid molecule capable of binding (i.e. complementary to) the sequence of the nucleic acid to be detected.
The methods may further comprise contacting the solid support with a developing agent that is capable of binding to the one or more marker proteins, antibody or nucleic acid.
The developing agent may comprise a label and the method may comprise detecting the label to obtain a value representative of the presence or amount of the one or more marker proteins, antibody or nucleic acid in the cell, cell culture medium or cell lysate.
The label may be, for example, a radioactive label, a fluorophor, a phosphor, a laser dye, a chromogenic dye, a macromolecular colloidal particle, a latex bead which is coloured, magnetic or paramagnetic, an enzyme which catalyses a reaction producing a detectable result, or a tag.
The methods preferably comprise determining the presence or level of expression of a plurality of marker proteins or nucleic acids encoding said marker proteins in a single sample. For example, a plurality of binding members, each specific for one of a plurality of protein markers selected from Table 5 (or any one of Tables 2 to 10 or relevant section of combined Table 11), may be immobilised at predefined locations on the solid support. The number of binding members on the solid support may make up 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or 100% of the total number of binding members on the support.
Additional methodologies to detect the one or more marker protein gene expression will be apparent to those skilled in the art.
In some embodiments, the determination of the presence or the level of expression of one or more of the marker proteins may be performed by mass spectrometry. Techniques suitable for measuring the level of a protein marker selected from Table 5 (or any other of Table 2 to 10 or relevant section of combined Table 11) include, but are not limited to techniques related to Selected Reaction Monitoring (SRM) and Multiple Reaction Monitoring (MRM) isotope dilution mass spectrometry including SILAC, AQUA (as disclosed in WO 03/016861, the entire content of which is specifically incorporated herein by reference) and TMTcalibrator (as disclosed in WO 2008/110581; the entire content of which is specifically incorporated herein by reference).
WO 2008/110581 discloses a method using isobaric mass tags to label separate aliquots of all marker proteins in a reference sample which can, after labelling, be mixed in quantitative ratios to deliver a standard calibration curve. A test sample is then labelled with a further independent member of the same set of isobaric mass tags and mixed with the calibration curve. This mixture is then subjected to tandem mass spectrometry and peptides derived from specific marker proteins can be identified and quantified based on the appearance of unique mass reporter ions released from the isobaric mass tags in the MS/MS spectrum.
By way of a reference level, a known or predicted protein marker derived peptide may be created by trypsin, ArgC, AspN or Lys-C digestion of said protein marker. In some cases, when employing mass spectrometry based determination of protein markers, the methods of the invention comprise providing a calibration sample comprising at least two different aliquots comprising the protein marker and/or at least one protein marker derived peptide, each aliquot being of known quantity and wherein said biological sample and each of said aliquots are differentially labelled with one or more isobaric mass labels. Preferably, the isobaric mass labels each comprise a different mass spectrometrically distinct mass marker group.
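How a calibration curve built from aliquots of known quantity might be used to estimate the amount in a test sample can be sketched as follows. The sketch assumes a linear relationship between reporter ion intensity and amount and fits it by ordinary least squares; the function names and all numerical values are illustrative assumptions, not measured data.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need at least two paired calibration points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantify(test_intensity, known_amounts, reporter_intensities):
    """Estimate the amount in the test channel by inverting the fitted calibration line."""
    slope, intercept = fit_line(known_amounts, reporter_intensities)
    return (test_intensity - intercept) / slope


# Hypothetical four-point calibration (amounts in arbitrary units) and one test channel.
amounts = [1.0, 2.0, 5.0, 10.0]
intensities = [100.0, 200.0, 500.0, 1000.0]
print(quantify(350.0, amounts, intensities))
```

A test intensity outside the range spanned by the calibration aliquots would be an extrapolation and should be treated with caution.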
Accordingly, the method of determining the cellular phenotype of a liver cell may comprise determining the presence or expression level of one or more of the marker proteins selected from Table 5 (or from any one of Tables 2 to 10 or relevant section of combined Table 11) in a liver cell by Selected Reaction Monitoring using one or more determined transitions for known protein marker derived peptides; comparing the determined expression levels with a reference set of expression levels previously determined to represent a particular cellular phenotype, e.g. HCC or CC; and determining or identifying the cellular phenotype based on changes in expression of said one or more, preferably plurality of, marker proteins. The comparison step may include comparing the amount of marker protein derived peptides from the liver cell with known amounts of corresponding synthetic peptides. The synthetic peptides are identical in sequence to the peptides obtained from the cell, but may be distinguished by a label such as a tag of a different mass or a heavy isotope.
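The comparison with a known amount of labelled synthetic peptide can be illustrated by a minimal isotope-dilution sketch. It assumes that the endogenous (light) and synthetic (heavy) peptides respond equally, and that peak areas are summed over the monitored transitions; the function name, the units and the peak areas are hypothetical.

```python
def endogenous_amount(light_areas, heavy_areas, heavy_spike_fmol):
    """Estimate the endogenous peptide amount from light/heavy peak areas.

    light_areas and heavy_areas hold one peak area per monitored transition;
    heavy_spike_fmol is the known amount of labelled synthetic peptide added.
    """
    heavy_total = sum(heavy_areas)
    if heavy_total <= 0:
        raise ValueError("heavy peptide signal must be positive")
    return sum(light_areas) / heavy_total * heavy_spike_fmol


# Hypothetical peak areas for two transitions of one peptide, 50 fmol heavy spike.
print(endogenous_amount([2000.0, 1000.0], [1000.0, 500.0], 50.0))
```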
More preferably, the determination and/or quantification is made by mass spectrometry.
One or more of these synthetic protein marker derived peptides with or without label form a further aspect of the present invention. These synthetic peptides may be provided in the form of a kit for the purpose of determining the cellular phenotype of a liver cell, in particular HCC or CC phenotype.
Other suitable methods for determining levels of protein expression include surface-enhanced laser desorption ionization-time of flight (SELDI-TOF) mass spectrometry; matrix assisted laser desorption ionization-time of flight (MALDI-TOF) mass spectrometry, including LC/MS/MS; electrospray ionization (ESI) mass spectrometry; as well as the preferred SRM and TMT-SRM. Each of these methods may be preceded by a step of marker protein enrichment by immunoprecipitation or affinity chromatography performed in column or batch mode. Any binding agent with the required specificity for the marker proteins may be employed in such enrichment including but not limited to polyclonal antibodies, monoclonal antibodies and aptamers.
Liquid chromatography-mass spectrometry (LC-MS/MS) based proteomics has proven to be superior to conventional biochemical methods at identifying and precisely quantifying thousands of marker proteins from complex samples including cultured cells (prokaryotes/eukaryotes), and tissue (fresh frozen/formalin fixed paraffin embedded), leading to the identification of novel biomarkers in an unbiased manner [7, 8, 9]. The present inventors have used laser microdissection (LMD) of specific formalin fixed tissue types, thereby allowing regions of archival tumor material enriched for normal hepatocytes, normal cholangiocytes, and their respective transformed equivalents to be independently analysed by LC-MS proteomics. Spectral counting was used for relative quantification due to its good linear dynamic range (two to three orders of magnitude) and high quantitative proteome coverage [10, 11].
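Relative quantification by spectral counting can be sketched as below. This is a simplified illustration that assumes each protein's count is normalised to the total count of its sample and that an optional pseudocount guards against division by zero; it is not necessarily the normalisation used by the inventors, and the counts and the entry "Other" are invented.

```python
import math


def normalised_count(counts, protein, pseudocount=0.0):
    """Spectral count of one protein as a fraction of all spectra in the sample."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("sample contains no spectra")
    return (counts.get(protein, 0) + pseudocount) / total


def log2_fold_change(counts_a, counts_b, protein, pseudocount=0.0):
    """log2 ratio of normalised spectral counts, sample A over sample B."""
    a = normalised_count(counts_a, protein, pseudocount)
    b = normalised_count(counts_b, protein, pseudocount)
    if a <= 0 or b <= 0:
        raise ValueError("use a positive pseudocount when a protein has zero spectra")
    return math.log2(a / b)


# Invented spectral counts for two microdissected regions.
region_a = {"Beta 3 tubulin": 40, "Other": 60}
region_b = {"Beta 3 tubulin": 10, "Other": 90}
print(log2_fold_change(region_a, region_b, "Beta 3 tubulin"))
```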
Thus, as detailed above, a differentially expressed protein which is a member of the plurality of protein markers described herein and illustrated in Tables 1A and Tables 2 to 11 may qualitatively have its expression activated or completely inactivated in a first cellular phenotype versus a second cellular phenotype. Such a qualitatively regulated protein will exhibit an expression pattern within a given cell type which is detectable in one phenotype, e.g. HCC or CC, but not detectable in both. ‘Detectable’, as used herein, refers to a protein expression pattern, which is detectable using techniques described herein.
Alternatively, a differentially expressed protein which is a member of the plurality of marker proteins described herein may have its expression modulated, i.e. quantitatively increased or decreased, in a first cellular phenotype versus a second cellular phenotype. The degree to which expression differs between cellular phenotypes under comparison, e.g. HCC and CC, need only be large enough to be visualised via standard characterisation techniques, such as silver staining of 2D-electrophoretic gels. Other such standard characterisation techniques by which expression differences may be visualised are well known to those skilled in the art. These include successive chromatographic separations of fractions and comparisons of the peaks, capillary electrophoresis, separations using micro-channel networks, including on a micro-chip, SELDI analysis and qPST analysis.
Chromatographic separations can be carried out by high performance liquid chromatography as described in Pharmacia literature, the chromatogram being obtained in the form of a plot of absorbance of light at 280 nm against time of separation. The material giving incompletely resolved peaks is then re-chromatographed and so on.
Capillary electrophoresis is a technique described in many publications, for example in the literature “Total CE Solutions” supplied by Beckman with their P/ACE 5000 system. The technique depends on applying an electric potential across the sample contained in a small capillary tube. The tube has a charged surface, such as negatively charged silicate glass. Oppositely charged ions (in this instance, positive ions) are attracted to the surface and then migrate to the appropriate electrode of the same polarity as the surface (in this instance, the cathode). In this electroosmotic flow (EOF) of the sample, the positive ions move fastest, followed by uncharged material and negatively charged ions. Thus, marker proteins are separated essentially according to the charge on them.
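The migration order described above follows from the standard relationship in which the apparent mobility of an analyte is the sum of its own electrophoretic mobility and the electroosmotic mobility. The sketch below applies that relationship; the mobilities, voltage and capillary dimensions are arbitrary illustrative values, not parameters of any particular instrument.

```python
def migration_time(mu_ep, mu_eo, voltage, total_length, length_to_detector):
    """Migration time in seconds for an analyte in capillary electrophoresis.

    mu_ep: electrophoretic mobility of the analyte (m^2 V^-1 s^-1), signed.
    mu_eo: electroosmotic mobility (m^2 V^-1 s^-1).
    voltage: applied potential (V); lengths are in metres.
    """
    mu_apparent = mu_ep + mu_eo
    if mu_apparent <= 0:
        raise ValueError("analyte does not reach the detector under these conditions")
    return (length_to_detector * total_length) / (mu_apparent * voltage)


# Illustrative values only: a cation, a neutral species and an anion.
mobilities = {"cation": 2e-8, "neutral": 0.0, "anion": -2e-8}
times = {
    name: migration_time(mu, mu_eo=5e-8, voltage=25000.0,
                         total_length=0.6, length_to_detector=0.5)
    for name, mu in mobilities.items()
}
print(sorted(times, key=times.get))
```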
Micro-channel networks function somewhat like capillaries and can be formed by photoablation of a polymeric material. In this technique, a UV laser is used to generate high energy light pulses that are fired in bursts onto polymers having suitable UV absorption characteristics, for example polyethylene terephthalate or polycarbonate. The incident photons break chemical bonds within a confined space, leading to a rise in internal pressure, mini-explosions and ejection of the ablated material, leaving behind voids which form micro-channels. The micro-channel material achieves a separation based on EOF, as for capillary electrophoresis. It is adaptable to micro-chip form, each chip having its own sample injector, separation column and electrochemical detector: see J. S. Rossier et al., 1999, Electrophoresis 20: pages 727-731.
Surface enhanced laser desorption ionisation time of flight mass spectrometry (SELDI-TOF-MS) combined with ProteinChip technology can also provide a rapid and sensitive means of profiling marker proteins and is used as an alternative to 2D gel electrophoresis in a complementary fashion. The ProteinChip system consists of aluminium chips to which protein samples can be selectively bound on the surface chemistry of the chip (eg. anionic, cationic, hydrophobic, hydrophilic etc). Bound marker proteins are then co-crystallised with a molar excess of small energy-absorbing molecules. The chip is then analysed by short intense pulses of N2 320 nm UV laser with protein separation and detection being by time of flight mass spectrometry. Spectral profiles of each group within an experiment are compared and any peaks of interest can be further analysed using techniques as described below to establish the identity of the protein.
Isotopic or isobaric Tandem Mass Tags® (TMT®) (Thermo Scientific, Rockford, USA) technology may also be used to detect differentially expressed marker proteins which are members of a biomarker panel described herein. Briefly, the marker proteins in the samples for comparison are optionally digested, labelled with a stable isotope tag and quantified by mass spectrometry. In this way, expression of equivalent marker proteins in the different samples can be compared directly by comparing the intensities of their respective isotopic peaks or of reporter ions released from the TMT reagents during fragmentation in a tandem mass spectrometry experiment.
Differentially expressed marker proteins which are members of the plurality of protein markers described herein may be further described as target marker proteins and/or fingerprint marker proteins. ‘Fingerprint marker proteins’, as used herein, refer to a differentially expressed protein whose expression pattern may be utilised as part of a prognostic or diagnostic cellular phenotype evaluation. A fingerprint protein may also have characteristics of a target protein or a pathway protein. For example, the one or more marker proteins described herein may be used as liver tumor markers as well as determining the cellular phenotype of the liver cell. For example, it is contemplated that any of the markers provided in Tables 2 to 11, but at least tubulin beta 3 and/or AKR1B10 proteins may be used as markers for liver tumor. The detection of these proteins in blood may well provide a diagnostic tool for liver cancer. The marker proteins may be secreted or lost into the blood stream following cell death and may serve as circulating tumor antigens.
As described above, the invention provides a number of methods by which the one or more marker proteins may be determined in a liver tissue sample, blood or saliva sample from an individual. The method comprises detecting the expression levels of the one or more (preferably plurality) of marker proteins selected from any one of Tables 2 to 10 or the relevant section of Table 11.
Unless context dictates otherwise, the descriptions and definitions of the features set out above are not limited to any particular aspect or embodiment of the invention and apply equally to all aspects and embodiments which are described. Thus, the features set out above are disclosed in all combinations and permutations.
One or more of the marker proteins selected from Tables 2 to 11 may be used as a diagnostic marker for liver cancer in the methods described above, and kits for use in carrying out these methods, in particular determining the cellular phenotype of a liver cell, preferably a liver tumor cell, in vitro, are encompassed herein.
Preferably, the kit allows the determination/identification of a cellular phenotype selected from normal liver epithelium cells (hepatocytes), normal biliary epithelium cells (cholangiocytes), hepatocellular carcinoma cells, peripheral cholangiocellular carcinoma cells and hilar cholangiocellular carcinoma cells.
More preferably, the kit allows the liver tumor cell to be identified as an HCC cell or a CC cell.
The kit allows the user to determine the presence or level of expression of a plurality of analytes selected from
Suitable binding members have been described herein. In particular, for detection of a marker protein or fragment thereof, the binding member may be an antibody which is capable of binding to one or more of the marker proteins selected from Table 5 (or any one of Tables 2 to 10 or relevant section of combined Table 11), or a combination thereof.
Kits according to the invention may be used for diagnosing recurrent or primary liver tumor in a subject and comprise reagents for determining the presence or absence of one or more marker proteins selected from the group consisting of Collagen alpha 1 (XVIII) chain, Plastin-3, AKR1B10, Fibronectin, Beta 3 tubulin, Asporin, 14-3-3 protein eta, and Dihydropyrimidinase-related protein 3 in a sample. Preferably, the liver tumor is selected from the group consisting of hepatocellular carcinoma, peripheral cholangiocellular carcinoma or hilar cholangiocellular carcinoma cells.
In one embodiment, the marker protein is Beta 3 tubulin and/or AKR1B10, preferably Beta 3 tubulin.
In another embodiment, the kit comprises reagents suitable for preparing the sample, wherein the sample is selected from any one of blood, plasma, serum, liver tissue, liver cells or combinations thereof.
In yet another embodiment, the sample is liver tissue and the kit comprises reagents suitable for preparing liver tissue, optionally for preparing formalin-fixed paraffin-embedded liver tissue sections.
In another embodiment, the determining the presence or absence of one or more marker proteins in the sample is performed by Immunohistochemistry (IHC).
In one preferred embodiment, the kit for diagnosing recurrent or primary liver tumor in a subject comprises reagents for determining the presence or absence of Beta 3 tubulin, and optionally, AKR1B10, in a sample, wherein the liver tumor is selected from the group consisting of hepatocellular carcinoma, peripheral cholangiocellular carcinoma or hilar cholangiocellular carcinoma cells and wherein the kit comprises reagents suitable for preparing liver tissue, optionally for preparing formalin-fixed paraffin-embedded liver tissue sections and wherein the kit is suitable for determining the presence or absence of one or more marker proteins in the sample by Immunohistochemistry (IHC).
More preferably, the kit comprises a primary antibody for Beta 3 tubulin.
As mentioned above, various methodologies are known in the art for determining the presence or amount of a marker protein, antibody or nucleic acid molecule in a sample. Various suitable assays are described below in more detail and each form embodiments of the invention.
The kit may additionally provide a standard or reference which provides a quantitative measure by which determination of an expression level of one or more marker proteins can be compared. The standard may indicate the levels of marker protein expression which indicate the cellular phenotype of the liver cell, e.g. HCC or CC.
The kit may also comprise printed instructions for performing the method.
In one embodiment, the kit may be for performance of a mass spectrometry assay and may comprise a set of reference peptides (e.g. SRM peptides) in an assay compatible format wherein each peptide in the set is uniquely representative of each of the plurality of marker proteins provided in Table 5, (or any one of Tables 2 to 10 or relevant section of combined Table 11). Preferably two and more preferably three such unique peptides are used for each protein for which the kit is designed, and wherein each set of unique peptides are provided in known amounts which reflect the levels of such proteins in a standard preparation of said cell of known phenotype, e.g. HCC or CC cells. Optionally, the kit may also provide protocols and reagents for the isolation and extraction of proteins from a sample, a purified preparation of a proteolytic enzyme such as trypsin and a detailed protocol of the method including details of the precursor mass and specific transitions to be monitored. The peptides may be synthetic peptides and may comprise one or more heavy isotopes of carbon, nitrogen, oxygen and/or hydrogen.
Optionally, the kits of the present invention may also comprise appropriate cells, vessels, growth media and buffers.
The invention also includes the use of a plurality of binding members each capable of independently binding to one or more of a plurality of marker proteins or fragments thereof provided in Table 5, one or more antibodies against said marker proteins and one or more nucleic acid molecules encoding said marker proteins or fragments thereof, for the in vitro diagnosis or prognostic monitoring of an individual having or suspected of having a liver tumor, or following treatment for a liver tumor.
The kit may comprise reagents for the detection of the plurality of protein markers in a liver tumor sample, wherein said plurality of protein markers are selected from Table 5 or part A of Table 5, or section 2_5 of Table 11.
A kit may comprise a plurality of primary antibodies, each antibody binding specifically to a different individual protein marker of the plurality of protein markers selected from Table 5 or section 2_5 of Table 11.
The antibodies may be immobilised on an assay plate, beads, microspheres or particles. Optionally, beads, microspheres or particles may be dyed, tagged or labelled.
A kit may further comprise one or more secondary antibodies which bind specifically to the primary antibodies. The secondary antibodies may be labelled, for example fluorescently labelled or tagged.
A kit may further comprise one or more detection reagents for detecting the presence of the tagged secondary antibodies.
Furthermore, the invention provides for a kit for classifying the cellular phenotype of a liver tumor cell or for determining a liver tumor in an individual in line with the methods described herein. Preferably, the kit comprises the reagents necessary for carrying out the determination of the presence or level of expression of one or more (preferably a plurality) of the marker proteins selected from one or more of Tables 2 to 11 on a sample (tissue or blood) and instructions for carrying out the test and interpreting the results. Preferred types of kit may comprise one or more of the following reagents:
As for antibody reagents, the probes may conveniently be directly or indirectly labelled to enable them to be detected.
The invention also provides for a liver cellular classification system comprising a liver cellular classification apparatus and an information communication terminal apparatus, said liver cellular classification apparatus including a control component and a memory component, said apparatuses being communicatively connected to each other via a network;
The data derived from the liver tissue sample of the subject is preferably expression level data such as that obtained from methods described herein, e.g. LC-MS/MS and other proteomic approaches. The data may be derived just from the tissue, being either a normal tissue or a tumor (or suspected tumor) tissue sample.
The protein data received by the data-receiving unit may be the actual protein levels, or it may be peptide levels from which the protein levels can be calculated. The peptide is unique to the at least one (preferably plurality) protein. In some embodiments it is preferable to use multiple, i.e. 2, 3, 4, or 5 peptides which are all unique to said protein. Where multiple peptides are used, data may be collated and optionally a median value used in the data comparison step.
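Collating several unique peptides into a single protein level by taking the median can be sketched as follows. The function name and the peptide levels are hypothetical and serve only to show the calculation.

```python
from statistics import median


def protein_levels(peptide_levels):
    """Collapse the levels of each protein's unique peptides to their median.

    peptide_levels maps a protein name to a list of measured peptide levels;
    proteins with no measured peptides are omitted from the result.
    """
    return {
        protein: median(levels)
        for protein, levels in peptide_levels.items()
        if levels
    }


# Hypothetical peptide levels in arbitrary units.
measured = {
    "AKR1B10": [10.0, 12.0, 50.0],
    "Beta 3 tubulin": [4.0, 6.0],
    "Asporin": [],
}
print(protein_levels(measured))
```

The median is less sensitive than the mean to a single outlying peptide, such as the third AKR1B10 value in this invented example.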
The memory unit preferably includes data sets relating to protein expression levels representative of liver tissue or tumor sample. In a preferred embodiment, the protein expression levels are derived from actual peptide levels in the sample. This is particularly so if the data has been obtained using proteomic methods such as the LC-MS/MS method described herein. The data sets may provide a representative (e.g. average) level of protein expression levels found in liver tissue (normal or tumor) from a collection of data sets, e.g. as provided herein by Table 11. Alternatively, it may be preferable for the data sets to include a value representing a ratio of the protein expression level as compared to the protein expression level of a different cellular phenotype (e.g. HCC v peripheral CC) tissue obtained from the same source.
In this way, the system can compare the protein expression levels obtained from liver tissue samples (non-tumor or tumor) with protein expression levels representative of a particular liver cellular phenotype for the same protein and thereby classify the tissue by its cell type.
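One possible form of the comparison step is sketched below. It assumes that classification is made by choosing the stored reference profile closest to the sample in log2 expression space (Euclidean distance over shared proteins); the description does not prescribe this particular metric, and the phenotype labels, marker names and levels are placeholders rather than values from Table 11.

```python
import math


def classify(sample, references):
    """Return the name of the reference profile closest to the sample.

    sample: mapping of protein name to a positive expression level.
    references: mapping of phenotype name to a reference profile of the same form.
    """
    def distance(profile):
        shared = sample.keys() & profile.keys()
        if not shared:
            return math.inf  # nothing to compare against
        return math.sqrt(sum(
            (math.log2(sample[p]) - math.log2(profile[p])) ** 2 for p in shared
        ))

    return min(references, key=lambda name: distance(references[name]))


# Placeholder reference profiles and sample; all numbers are invented.
references = {
    "phenotype_A": {"marker_1": 100.0, "marker_2": 5.0},
    "phenotype_B": {"marker_1": 5.0, "marker_2": 80.0},
}
print(classify({"marker_1": 80.0, "marker_2": 8.0}, references))
```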
The system may further comprise the means to add the inputted data via the data sending unit to the stored data already held in the memory unit so that this new data can be included in the analysis performed by the determining unit. In this way the data representative of liver cellular phenotype (tumor or non-tumor) is constantly updated.
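The comparison and updating steps described above could, under simple assumptions, be sketched as below. The distance measure (mean absolute difference over shared proteins), the running-mean update, the function names and all numerical values are assumptions made for illustration; the specification does not prescribe any particular metric or update rule.

```python
def classify(sample, references):
    """Return the phenotype whose reference profile is closest to the sample.

    sample: dict protein -> expression level
    references: dict phenotype -> dict protein -> representative level
    Closeness is the mean absolute difference over shared proteins
    (an assumed metric, chosen only for simplicity).
    """
    best_label, best_score = None, float("inf")
    for label, profile in references.items():
        shared = [p for p in sample if p in profile]
        if not shared:
            continue  # nothing to compare against this phenotype
        score = sum(abs(sample[p] - profile[p]) for p in shared) / len(shared)
        if score < best_score:
            best_label, best_score = label, score
    return best_label

def update_reference(profile, count, sample):
    """Fold a new sample into a running-mean reference profile.

    count is the number of samples already represented in profile.
    Proteins absent from the profile are simply initialised to the new
    value, which is a simplification.
    """
    updated = dict(profile)
    for protein, level in sample.items():
        old = profile.get(protein)
        updated[protein] = level if old is None else old + (level - old) / (count + 1)
    return updated, count + 1

# Hypothetical reference values, for illustration only
references = {
    "HCC": {"AKR1B10": 8.0, "Beta 3 tubulin": 1.0},
    "peripheral CC": {"AKR1B10": 2.0, "Beta 3 tubulin": 7.0},
}
sample = {"AKR1B10": 7.0, "Beta 3 tubulin": 2.0}
label = classify(sample, references)
references[label], n = update_reference(references[label], 3, sample)
print(label, references[label], n)
```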
The liver tissue classification system may be connected to an apparatus for determining protein expression levels in a liver tissue (tumor or non-tumor) sample and feeding this data to the protein data sending unit.
Ideally the apparatus can process multiple samples using LC-MS/MS as described herein.
In accordance with this aspect of the invention, there is also provided a liver tissue (tumor or non-tumor) cellular classification program that makes an information processing apparatus including a control component and a memory component execute a method of determining and/or classifying the liver tissue of a subject, the method comprising:
In accordance with this aspect of the invention, there is also provided a computer-readable recording medium, comprising the liver tissue cellular classification program described above recorded thereon.
The data representing protein expression levels may be derived from peptide levels in the sample where said peptides are each unique to a particular protein selected from any one of Tables 2 to 11. It will be appreciated that peptides may be designed which will be unique for the protein from which they are derived, e.g. by proteolytic enzyme digestion such as trypsin, aspN, gluC and other such enzymes well known in the art.
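As a sketch of how peptides unique to a protein might be identified in silico, the following example digests each sequence with the commonly used trypsin rule (cleavage after K or R, except before P) and keeps only peptides that occur in no other protein of the set. The two sequences are invented toy sequences, and the minimum peptide length of 6 is an assumed parameter.

```python
import re

def tryptic_peptides(sequence, min_length=6):
    """Cleave after K or R unless the next residue is P (common trypsin rule)."""
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    # Discard short fragments (and the empty string left after a terminal K/R)
    return {p for p in pieces if len(p) >= min_length}

def unique_peptides(proteins, min_length=6):
    """Return, per protein, the tryptic peptides found in no other protein."""
    digests = {name: tryptic_peptides(seq, min_length) for name, seq in proteins.items()}
    unique = {}
    for name, peptides in digests.items():
        others = set().union(*(d for n, d in digests.items() if n != name))
        unique[name] = peptides - others
    return unique

# Invented toy sequences, for illustration only
toy_proteins = {
    "protein_X": "MSTAGKLLEEPRVVNDWK",
    "protein_Y": "MSTAGKAAGGHHRPLLK",
}
print(unique_peptides(toy_proteins))
```

In this toy set the shared peptide MSTAGK is dropped for both proteins, and the R followed by P in protein_Y is not cleaved.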
In accordance with all aspects and embodiments of the invention the plurality of marker proteins are selected from any one of Collagen alpha 1 (XVIII) chain, Plastin-3, AKR1B10, Fibronectin, Beta 3 tubulin, Asporin, 14-3-3 protein eta or Dihydropyrimidinase-related protein 3 or combinations thereof, preferably the plurality of marker proteins comprises AKR1B10 and/or Beta 3 tubulin.
Certain aspects and embodiments of the invention will now be illustrated by way of example and with reference to the figures and tables described above.
All documents mentioned in this specification are incorporated herein by reference in their entirety for all purposes.
4.1 Material and Methods
Liver Tissue
This study consists of 9 types of liver tissue taken from a total of 55 archived specimens: mixed HCC/CC after TACE (areas of HCC and areas of CC separately examined), non-treated HCC, normal liver parenchyma, normal bile duct, non-treated peripheral CC, non-treated hilar/perihilar CC, PSC-associated hilar CC, and metastatic colorectal cancer. All specimens were surgically resected or explanted livers from adult patients ranging from 27 to 80 years in age. Details of tissues in each group are as follows:
Tissue Sampling
Fresh liver specimens, which were surgically resected, were immediately received at our pathology laboratory. After macroscopic examination, samples were extensively taken, and were fixed in 10% formalin for at least 4 hours before being embedded in paraffin.
Microdissection of FFPE Tissue
Sections 10 μm thick were prepared from FFPE tissue blocks. After deparaffinization with xylene and alcohol, a target area of 1.5×10⁷ μm² (0.15 mm³) was selectively cut using the Laser Capture Microdissection System (LMD6500, Leica Microsystems, Wetzlar, Germany). Dissected tissues were directly immersed in 50 μL of Qproteome® FFPE Tissue Extraction Buffer (QIAGEN, Valencia, Calif.) and stored at −80° C. until protein extraction. Samples were prepared in batches: in week 1 (batch 1), the 1st biological replicate of each tissue/tumor type was prepared and analysed. In week 2 (batch 2), the 2nd biological replicate of each tissue/tumor type was prepared and analysed. This process was continued up until week 7 (batch 7), when the 7th biological replicate of each tissue/tumor type was prepared and analysed. Samples were prepared and analysed in this way to ensure that differences in protein expression levels between different tissue types were due to the biology/pathology of the sample, rather than any sample preparation variability.
Protein Extraction from FFPE Liver Tissue
Following storage at −80° C., samples were thawed on ice then homogenised, vortexed and centrifuged. Samples were transferred to 1.5 ml collection tubes and sealed with a collection tube sealing clip, as provided in the Qproteome® kit. Samples were incubated on a heating block at 100° C. for 20 min, then for a further 2 hours at 80° C. with agitation at 750 rpm. After heating, the sample tubes were placed on ice for 1 min and the collection tube sealing clip removed. Each tube was centrifuged for 15 min at 14,000 g at 4° C. The supernatants were then transferred to new siliconised collection tubes. The protein concentration of each sample was then determined using the Bradford protein assay and microplate luminometer.
1D Electrophoresis Gels
Stacking gels were constructed to comprise a 1 cm height 4% w/v polyacrylamide matrix on top of a 20% w/v polyacrylamide matrix. Protein samples and pre-stained molecular weight markers were each prepared in Sigma 2× Laemmli sample buffer (1:1) and run into the gels in Tris-glycine running buffer (Invitrogen, Loughborough, UK) for 20 min at 150 V, or until the protein sample and molecular weight markers were observed to concentrate at the 4-20% w/v gel interface. Each sample was loaded onto the gel at 100 μg/well. Following electrophoresis the gels were briefly stained with Imperial protein stain (Pierce, Ill., USA) then de-stained in water to visualize the proteins and to confirm their migration as a homogeneous population. The protein band visible at the 4-20% w/v gel interface was excised from each lane.
For separation gels we used the 1 mm thick 10 well Nu-PAGE 4 to 12% Bis-Tris gels from Invitrogen, Carlsbad, Calif., USA.
Reduction/Alkylation/Trypsin Digestion
Gel bands were chopped into small 1 mm³ pieces then destained and dehydrated with ACN. Proteins were subsequently reduced with 10 mM dithiothreitol in 25 mM ammonium bicarbonate at 56° C. for 1 h and alkylated with 55 mM iodoacetamide in 25 mM ammonium bicarbonate at room temperature for 45 min. Gel pieces were then washed, dried, rehydrated on ice for 10 min in 2 μg of sequence grade trypsin, reconstituted in 100 μL of 25 mM ammonium bicarbonate, then covered with an additional 20 μL ammonium bicarbonate solution, and incubated overnight at 37° C. The resulting proteolytic peptides were subjected to one aqueous extraction (30 μL ammonium bicarbonate, 20 min vortex) and two hydrophobic extractions (100 μL of 50% ACN, 5% formic acid, 20 min vortex, 10 min sonication). Samples were quickly vortexed and centrifuged then frozen at −80° C. The frozen samples were then concentrated under vacuum to ~20 μL, then topped up with 0.1% formic acid to 70 μL, gel particulates were filtered out using 30000 MW filters (Millipore), and the samples were finally stored at −80° C. until used for LC-MS/MS.
Liquid Chromatography Mass Spectrometry (LC-MS/MS).
After freeze/thaw, 10 μL of each sample was injected onto a Thermo pre-column (EASY-Column, 2 cm, ID 100 μm, 5 μm C18-A1), using the Proxeon EASY-nLC II system (Thermo Fisher Scientific). Peptides were then resolved using an increasing gradient of 0.1% formic acid in acetonitrile (5 to 50% over 80 minutes) through a Thermo analytical column (EASY-Column, 10 cm, ID 75 μm, 3 μm C18-A2) at a flow rate of 300 nL/min. Mass spectra were acquired on an LTQ-Orbitrap Velos (Thermo Fisher Scientific) throughout the chromatographic run (115 minutes), using 20×CID scans following each FTMS scan (2× μScans at 30000 resolving power @400 m/z). CID was carried out on the 20 most intense ions from each FTMS scan, which were then put on a dynamic exclusion list for 30 secs (20 ppm m/z window). The AGC ion injection target for each FTMS scan was 1000000 (500 ms max injection time). The AGC ion injection target for each MSA CID scan was 10000 (50 ms max ion injection time).
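The precursor selection logic described above (the 20 most intense ions per survey scan, followed by 30 seconds of dynamic exclusion within a 20 ppm window) can be illustrated with the following simplified sketch. It is not the instrument's control software; the function name is hypothetical, the 20 ppm window is assumed to be applied as a plus or minus tolerance, and the peak values are invented.

```python
def select_precursors(scan_time, peaks, exclusion,
                      top_n=20, exclusion_s=30.0, tol_ppm=20.0):
    """Pick up to top_n precursors from one survey (FTMS) scan.

    scan_time: acquisition time of the scan, in seconds
    peaks: list of (mz, intensity) tuples from the survey scan
    exclusion: list of (mz, expiry_time) kept across scans; modified in place
    Returns the m/z values chosen for fragmentation.
    """
    # Drop exclusion entries that have expired
    exclusion[:] = [(mz, t) for mz, t in exclusion if t > scan_time]

    def excluded(mz):
        return any(abs(mz - ex) / ex * 1e6 <= tol_ppm for ex, _ in exclusion)

    chosen = []
    # Walk through the peaks from most to least intense
    for mz, _intensity in sorted(peaks, key=lambda p: p[1], reverse=True):
        if len(chosen) == top_n:
            break
        if not excluded(mz):
            chosen.append(mz)
            # Exclude immediately so near-duplicates in this scan are skipped too
            exclusion.append((mz, scan_time + exclusion_s))
    return chosen

# Invented peaks; top_n is reduced to 2 to keep the example small
exclusion_list = []
survey = [(500.0, 1e5), (600.0, 5e4), (700.0, 1e4)]
first = select_precursors(0.0, survey, exclusion_list, top_n=2)
second = select_precursors(10.0, survey, exclusion_list, top_n=2)
third = select_precursors(31.0, survey, exclusion_list, top_n=2)
print(first, second, third)
```

In the example, the two most intense ions are fragmented first, the third ion gets its turn while they are excluded, and the first two become eligible again once their 30 seconds have elapsed.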
Data Pre-Processing
Peptide identification. Peak lists were extracted from Xcalibur Raw data file format using Proteome Discoverer 1.4 and searched using Mascot 2.2 and Sequest HT search engines.
Protein quantification. For each of the 62 tissue specimen data files (n=7 for 8 of the tissue types, n=6 for normal bile duct), Proteome Discoverer 1.4 was used to export the list of identified proteins to Excel. For quantification purposes we utilized the node ‘The Precursor Ions Area Detector’ (
Normalization: Both spectral counts and AUC for each protein from each sample were normalized to compensate for any artifact differences between samples such as unequal loading of protein onto the gel, variable in-gel digestion and peptide extraction and variable injection volume into the LC-MS/MS system.
The protein area estimates were log10 transformed prior to normalization. Normalization was done using the following equation;
Where
Statistical Analysis
Principal component analysis (PCA) was performed to investigate the multivariate datasets and identify outliers and groups/clusters nested within the datasets. Normalized protein values were used for PCA, which was performed using Simca v. 11, MKS Umetrics AB, Sweden [17].
Hierarchical clustering to build a class hierarchy for tissue types in relation to normalized protein values, alongside statistical analyses to observe differential regulation of proteins between tissue types, were both carried out in MATLAB (The MathWorks Inc., R2012a) [18]. Two types of hierarchical clustering were performed to group the normalized protein abundances using agglomerative clustering. In the first approach Pearson's correlation coefficients were obtained by comparing all normalized protein levels in all the samples (62) across all other samples (62), which resulted in a square data matrix consisting of 62×62 r² Pearson's correlation coefficients. The second clustering was performed using the ‘city-block’ distance metric (also known as the Manhattan distance) with un-weighted average distance (UPGMA) linkage to generate a hierarchical tree. The process clustered all data points first along all the columns (producing row-clustered data), and then along all the rows in the data matrix, where rows corresponded to marker proteins and columns corresponded to the samples.
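The two clustering approaches can be illustrated with a small pure-Python sketch. This is not the MATLAB analysis used in the study: the function names are hypothetical, the four sample profiles are invented, and the correlation matrix here holds r rather than r² values.

```python
from math import sqrt

def pearson(x, y):
    """Pearson's correlation coefficient r between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_matrix(profiles):
    """Square sample-by-sample matrix of Pearson coefficients."""
    labels = list(profiles)
    return [[pearson(profiles[a], profiles[b]) for b in labels] for a in labels]

def cityblock(x, y):
    """City-block (Manhattan) distance."""
    return sum(abs(a - b) for a, b in zip(x, y))

def upgma(profiles):
    """Agglomerative clustering with unweighted average linkage.

    Returns the merges in order as (members_a, members_b, average_distance).
    """
    clusters = [(label,) for label in profiles]
    merges = []

    def avg_dist(a, b):
        total = sum(cityblock(profiles[i], profiles[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        pairs = [(avg_dist(a, b), a, b)
                 for k, a in enumerate(clusters) for b in clusters[k + 1:]]
        d, a, b = min(pairs, key=lambda t: t[0])  # closest pair of clusters
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b, d))
    return merges

# Invented normalized values for three proteins in four samples
samples = {
    "liver_1": [9.0, 8.0, 1.0],
    "liver_2": [8.5, 8.2, 1.1],
    "duct_1": [2.0, 1.5, 7.0],
    "duct_2": [2.2, 1.0, 7.5],
}
for a, b, d in upgma(samples):
    print(a, "+", b, "at distance", round(d, 2))
```

With these invented values the two liver samples merge first, then the two bile duct samples, and the two groups join last.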
For Tables 2 to 10, the statistical analyses were run using R and the following R packages: qvalue and MBESS. For each group comparison and each protein, an unrelated (independent-samples) t-test was computed to obtain the p value. Then q values (adjusted p values) were computed using the direct False Discovery Rate approach proposed by Storey (A direct approach to false discovery rates. Journal of the Royal Statistical Society, 2002, Series B, 64: 479-498). Hedges' g unbiased standardized effect size estimates were calculated, along with 95% confidence intervals for these estimates (Hedges, L. V. & Olkin, I. (1985). Statistical methods for meta-analysis. New York: Academic Press). Values of g<0.2 are regarded as very small differences, g=0.5 as average differences, and g>0.8 as large differences. Unstandardized effect size estimates (i.e., mean difference) were calculated, along with 95% confidence intervals for these estimates.
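For orientation, the two effect size estimates mentioned above can be computed as in the following sketch. It uses the widely quoted approximate small-sample correction 1 − 3/(4·df − 1) with df = n1 + n2 − 2, which is an assumption; the R packages named above may apply a different (exact) correction, and confidence intervals are not computed here. The function name and the numbers are illustrative only.

```python
from math import sqrt
from statistics import mean, variance

def effect_sizes(group_a, group_b):
    """Return (mean difference, Hedges' g) for two independent groups."""
    n1, n2 = len(group_a), len(group_b)
    diff = mean(group_a) - mean(group_b)  # unstandardized effect size
    df = n1 + n2 - 2
    # Pooled standard deviation from the two sample variances
    pooled_sd = sqrt(((n1 - 1) * variance(group_a)
                      + (n2 - 1) * variance(group_b)) / df)
    d = diff / pooled_sd  # Cohen's d
    correction = 1 - 3 / (4 * df - 1)  # approximate small-sample correction
    return diff, d * correction

# Invented values, for illustration only
diff, g = effect_sizes([2.0, 4.0, 6.0], [1.0, 3.0, 5.0])
print(diff, round(g, 3))
```

For the invented groups the mean difference is 1.0, Cohen's d is 0.5 and the corrected g is 0.4, which would fall between the very small and average bands quoted above.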
For Table 11 (
A protein was considered to be differentially modulated between the two tissue types when it had a p-value<0.05 and a log2 fold ratio>2 or <−2 (representing more than fourfold up- and down-regulation, respectively). Volcano plots were created using these p-values and log2 fold changes as described by Cui, X et al and Best, C. J. M et al [19-20].
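The selection rule stated above can be written as a short function. This is a minimal sketch that assumes the two inputs are positive mean abundances on a linear (not log-transformed) scale; the function name and example values are hypothetical.

```python
from math import log2

def modulation_status(mean_a, mean_b, p_value, p_cutoff=0.05, log2_cutoff=2.0):
    """Classify a protein as 'up', 'down' or 'unchanged' in tissue A versus B.

    mean_a, mean_b: positive mean abundances on a linear scale (assumption)
    p_value: p-value of the comparison between the two tissue types
    """
    log2_ratio = log2(mean_a / mean_b)
    if p_value < p_cutoff and log2_ratio > log2_cutoff:
        return "up"
    if p_value < p_cutoff and log2_ratio < -log2_cutoff:
        return "down"
    return "unchanged"

# Invented values, for illustration only
print(modulation_status(50.0, 10.0, 0.01))  # fivefold higher in A, significant
print(modulation_status(10.0, 50.0, 0.01))  # fivefold lower in A, significant
print(modulation_status(50.0, 10.0, 0.20))  # large ratio but not significant
```

Because the inequalities are strict, a ratio of exactly fourfold (log2 ratio of 2) is not counted as differentially modulated.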
Immunohistochemistry (IHC)
Four tissue types (normal liver (n=7), normal bile duct (n=6), HCC (n=7), and peripheral CC (n=7)) were used for validation IHC, as these are clinically most important to differentiate. One representative section selected from each case was used for immunostaining. Sections for IHC were taken from the same cases that were analysed by LC-MS. Immunostaining on FFPE specimens was performed using an autostainer Bond Max (Leica Microsystems, Wetzlar, Germany). The deparaffinized sections were heat-treated in a pH 6.0 buffer for 10 mins. The primary monoclonal antibodies used were anti-AKR1B10 (clone 1A6; 1:500; Abcam, Cambridge, UK) and anti-tubulin beta 3 (clone TU20; 1:500; Abcam).
Tissue type number key (as used in the first column of Table 11):
1)=Normal liver epithelium (Hepatocytes).
2)=Hepatocellular carcinoma.
Combined hepato-cholangiocellular carcinoma after TACE therapy, i.e.:
3)=areas of hepatocellular differentiation, and
4)=areas of cholangiocellular differentiation.
5)=Peripheral (intrahepatic) cholangiocarcinoma.
6)=Hilar cholangiocarcinoma originated in patients without primary sclerosing cholangitis.
7)=Hilar cholangiocarcinoma originated in patients with primary sclerosing cholangitis.
8)=Metastatic colo-rectal carcinoma.
9)=Normal biliary epithelium (Cholangiocytes).
4.2 Results
Protein Markers Identification
In total 2864 proteins were identified using rank 1 peptides at 1% FDR at peptide level (2 rank 1 peptides per protein ID). Of the 2864 proteins, 2628 (92%) had at least 1 unique peptide sequence and 2009 (70%) proteins had only unique peptide sequences. It was further observed that 236 (8%) proteins out of 2864 proteins had only shared peptide sequences. Of the 619 proteins with unique and shared peptides, the inventors performed quantification using only unique peptides and compared this quantification to using the unique and shared peptides, and found a correlation of 0.99 when comparing the fold change values from the two datasets (0.992 for spectral counting and 0.999 for area under the curve). Thus, as there appears to be no detrimental effect on accuracy using the shared and unique peptides, compared to only unique peptides, and due to the additional coverage gained by using shared and unique peptides, the inventors present results here obtained using shared and unique peptide sequences from both spectral counting and AUC forms of quantitation in the main text (Table 1, Table 11).
Protein Quantification and Hierarchical Clustering
The inventors found 1072 proteins significantly regulated in at least one of the tissue type comparisons when using the area under the curve dataset, while 611 proteins were significantly regulated using the spectral counting dataset (in at least one of the tissue type comparisons). A total of 467 marker proteins were found to be significantly regulated in at least one of the tissue type comparisons, as observed in both quantification methods (e.g. common to both spectral counting (right) and AUC (left) in the Venn-Diagram in
Hierarchical clustering of the same 467 common marker proteins also supported the results obtained using PCA, which separates hepatocellular tissue types from glandular epithelium. Clustering of these 467 marker proteins based on Pearson's correlation coefficients and on the protein data matrix using normalized protein area values clustered samples that originated from tissue types 1, 8 and 9 within single nested sub-groups (data not shown). Although using spectral counting as a data matrix produced similar results, it was found that the area under the curve data matrix produced better separation between groups when hierarchical clustering was performed. Table 1 illustrates the number of differentially modulated marker proteins that were common to both area under the curve and spectral counting datasets per tissue type comparison.
Difference in Protein Expression Profiling Among 9 Tissue Types
Post-TACE mixed cancer: Although the HCC and CC components of post-TACE cancer are theoretically the same in origin, these two areas showed significantly different marker protein profiles, as clearly demonstrated by PCA (
Normal liver parenchyma vs. normal bile duct: (See Table 6) Over 200 marker proteins were expressed at significantly different levels between normal liver and bile duct (Table 1 and Table 6). About half of those marker proteins were liver enzymes, which were more abundantly present in normal liver parenchyma. In contrast, marker proteins that were more strongly expressed in normal bile ducts were diverse, including keratins 7 and 19, annexins, and galectins (Table 11).
Normal liver parenchyma vs. HCC: (See Table 2) Among 11 marker proteins that showed a statistically significant difference between normal liver parenchyma and HCC, 5 marker proteins (14-3-3 protein eta, Aldo-keto reductase family 1 member B10 [AKR1B10], Heterogeneous nuclear ribonucleoprotein R, Histone H1.5, Keratin type II cytoskeletal 6B) appeared overexpressed in the cancer tissue. The remaining six, which were less abundant in HCC, were mostly liver enzymes supposed to represent mature hepatocyte functions. In accordance with the invention the one or more, or plurality of marker proteins may be selected from the group consisting of 14-3-3 protein eta, Aldo-keto reductase family 1 member B10 [AKR1B10], Heterogeneous nuclear ribonucleoprotein R, Histone H1.5, Keratin type II cytoskeletal 6B.
Normal bile duct vs. peripheral or hilar CC: (See Tables 3 and 4) The numbers of marker proteins that showed a statistically significant difference between normal and neoplastic bile ducts were 37 for peripheral CC and 32 for hilar CC (Table 1, Table 11). Six and eight marker proteins were significantly overexpressed in peripheral and hilar CC, respectively. Among them, 3 marker proteins (Tubulin-beta 3 chain, Periostin, Collagen alpha-1(XII) chain) were up-regulated in both types of CC. Fourteen marker proteins were significantly less abundant in both types of CCs. In accordance with the invention the one or more, or plurality of marker proteins may be selected from the group consisting of Tubulin-beta 3 chain, Periostin, and Collagen alpha-1(XII) chain.
The one or more, or plurality of marker proteins may also be selected from argininosuccinate lyase, N(G),N(G)-dimethylarginine dimethylaminohydrolase 1A and 1B, Filamin-A and plastin-3.
HCC vs. peripheral CC: (See Table 5) One hundred and sixty-five marker proteins showed statistically significant differences between these two types of cancers, which develop in the liver parenchyma (Table 1, Table 4 and Table 11). Most marker proteins that were overexpressed in HCC were liver enzymes or mitochondrial marker proteins, whereas marker proteins that were up-regulated in peripheral CC were diverse in function, including cell-cell adhesion, cell migration, and signal transduction. Multi-functional marker proteins such as annexins and S100-A11 were also more abundantly present in peripheral CC. Ninety-six marker proteins (58%) overlapped with the marker proteins that were identified in the comparison between normal liver parenchyma and normal bile duct.
Peripheral CC vs. hilar CC: These two types of CC are both adenocarcinomas originating from the biliary epithelium. Interestingly, 14 marker proteins showed significant differences between peripheral and hilar CC. For example, MUC5AC, a gastric type mucin, was significantly more abundant in hilar CC, while Tenascin was upregulated in peripheral CC. In accordance with the present invention, the protein markers may include MUC5AC and Tenascin.
PSC-associated CC vs. hilar CC: (Table 9) These two types of CC are histologically indistinguishable. However, 5 marker proteins (Alpha-1B-glycoprotein, Asporin, Decorin, Methyl-CpG-binding protein 2, and Mimecan) were significantly different in abundance between PSC-associated and conventional hilar CC. All of these were more abundant in hilar CC unrelated to PSC. In accordance with the present invention, the one or more, or plurality of marker proteins may be selected from the group consisting of Alpha-1B-glycoprotein, Asporin, Decorin, Methyl-CpG-binding protein 2, and Mimecan.
Peripheral or hilar CC vs. colorectal metastasis: (See Table 10) There were only 29 marker proteins expressed at significantly different levels in peripheral CC vs. colorectal metastasis and 63 marker proteins expressed at significantly different levels in hilar CC vs. colorectal metastasis (Table 1, Table 10, Table 11). Keratin 20, which is the most commonly used intestinal marker in routine pathological examination, did not reach statistical significance in this tissue comparison. Marker proteins that were significantly more abundant in CCs included annexins A4 and A5, protein-glutamine gamma-glutamyltransferase 2, and plasma protease C1 inhibitor.
AKR1B10 and Tubulin-beta 3 were investigated further by volcano plots and immunohistochemistry as a validation study. AKR1B10 was chosen as it is significantly upregulated in HCC compared with normal liver or peripheral CC, suggesting that this may become a diagnostic marker specific to HCC. Tubulin-beta 3 was significantly up-regulated in peripheral CC compared with normal liver, HCC, or normal bile duct, suggesting that Tubulin-beta 3 has a diagnostic value specific to peripheral CC. The inventors focused on four tissue types: normal liver parenchyma, HCC, normal bile duct, and peripheral CC, as they are clinically most important to differentiate.
AKR1B10 (O60218) is up-regulated in HCC (tissue type 2) (
Tubulin-beta 3 chain (Q13509) was found to be up-regulated in peripheral CC (tissue type 5) when compared to normal liver parenchyma or normal bile duct. Tubulin-beta 3 chain was surprisingly completely negative in normal liver, HCC, and normal bile duct, while it was diffusely expressed in 5 of 7 cases of peripheral CC (
Finally,
4.3 Discussion
The inventors have shown that the combination of laser microdissection and LC-MS/MS proteomics is a powerful approach which allows extensive profiling of protein expression in selected tumor sub-populations. This technique can be applied to FFPE histological archival material, a major advantage in the design of both prospective and retrospective tissue based studies. The identification of marker proteins already known to be specific to certain lineages (e.g. keratins 7 and 19 in biliary epithelium [21]), supports the robustness of the technique. The inventors have identified sets of marker proteins specific to well characterised hepato-biliary lineages and their neoplastic counterparts, and which could be used as biomarkers with diagnostic and prognostic potential, therapeutic targets or to understand the underlying carcinogenetic processes.
The identification of protein sets specific to the hepatocellular and cholangiocellular phenotype of post-TACE mixed tumors, and their similarity to their normal and typical neoplastic counterparts, confirms that the differentiation process is truly divergent, despite a probable origin from a common progenitor. Of equal importance is the identification of marker proteins differentially expressed between normal and neoplastic hepatocytes and biliary epithelial cells, as these provide markers of malignant transformation or tumor differentiation; and between HCC and peripheral CC, which often overlap in clinical presentation and in appearance on imaging and histology [22-23]. Of note, alpha-fetoprotein (AFP), a marker commonly increased in the serum of patients with HCC [2], was not identified in any tissue type in this study. This is probably due to expression levels in tissue samples being below the LC-MS/MS detection threshold. Serum AFP levels are known to be elevated in about 75% of patients with HCC, but its expression in tissue is detectable in fewer than 40% of patients even by IHC [24-25].
Interestingly, one (14-3-3 protein eta) of the five marker proteins (14-3-3 protein eta, AKR1B10, Heterogeneous nuclear ribonucleoprotein R, Histone H1.5 and Keratin type II cytoskeletal 6B) shown to be significantly over-expressed in HCC compared to normal liver parenchyma is known to play a role in mechanisms that contribute to the cancer phenotype, as the abnormal expression of 14-3-3 protein eta has been reported in some human neoplasms [26-27]. Another two (Heterogeneous nuclear ribonucleoprotein R and Histone H1.5) are involved in gene transcription through chromatin remodeling, DNA methylation, and processing of precursor mRNA in the nucleus. The inventors also identified AKR1B10 as a significantly upregulated protein in HCC, which was validated by additional IHC. This finding is in keeping with a previous study, where a random-based gene fishing approach identified AKR1B10 as a significantly up-regulated gene in HCC compared to non-neoplastic liver tissue [28].
The inventors were also interested in molecules that were specifically up-regulated in CCs. Three marker proteins (Tubulin-beta 3 chain, Periostin, Collagen alpha-1(XII) chain) were up-regulated in CC compared to normal bile duct. Tubulin-beta 3 is the major constituent of microtubules and plays a critical role in proper axon guidance and maintenance. Periostin induces cell attachment and spreading and plays a role in cell adhesion. Collagen alpha-1(XII) interacts with type I collagen-containing fibrils, which are known to be overexpressed in invasive breast carcinoma [11]. Increased deposition and aberrant cross-linking of collagen is associated with the development of invasive breast cancer, the result of which contributes to stiffening of the extracellular matrix and is a factor that has been shown to drive progression of in situ disease [11]. The overexpression of these three marker proteins in CCs, and their known functional roles in biology and pathology means they will be useful markers of CC.
The three types of CCs are histologically very similar. Only a small number of markers that show significant difference in abundance between peripheral and hilar CCs have been identified [29]. The invention provides at least 5 such marker proteins, which may represent different underlying carcinogenetic processes. PSC-associated CC is supposedly different from conventional CC in underlying molecular events. However, these two types of CCs are histologically almost identical with no reliable molecular discriminators. No oncogenes or tumor suppressor genes specifically involved in PSC-associated carcinogenesis have been identified to the best of the inventors' knowledge. The inventors have identified 5 significantly modulated marker proteins between these two tissue types, all less abundant in PSC-associated CC.
In conclusion, the inventors have shown that the combination of laser microdissection and LC-MS/MS allows comprehensive proteomic profiling of tumor cell subpopulations and is applicable to FFPE archival tissue. The inventors have identified biomarkers, in particular Collagen alpha 1 (XVIII) chain, Plastin-3, AKR1B10, Fibronectin, Beta 3 tubulin, Asporin, 14-3-3 protein eta and Dihydropyrimidinase-related protein 3, to be used in the distinction between non-neoplastic and neoplastic hepatocytes and biliary epithelial cells, to refine grading of tumor differentiation, in the differential diagnosis of primary liver tumors, and to investigate the pathogenesis of sub-types of cholangiocarcinoma.
Number | Date | Country | Kind |
---|---|---|---|
1320061.3 | Nov 2013 | GB | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/GB2014/053368 | 11/13/2014 | WO | 00 |