The present invention relates to methods for selecting a treatment for a subject with cancer, predicting responsiveness of a subject with cancer to therapeutic agents including MAPK, EMT and SRC pathway inhibitors and determining clinical prognosis based on assessing the expression level of biomarkers.
Mitogen-activated protein kinases (MAPKs) are components of a vital intracellular pathway which controls a vast array of physiologic processes. These enzymes are regulated by a characteristic phosphorelay system, in which a series of three protein kinases phosphorylate and activate one another. The extracellular signal-regulated kinases (ERKs) function in the control of cell division. The c-Jun amino-terminal kinases (JNKs) are critical regulators of transcription. The p38 MAPKs are activated by inflammatory cytokines and environmental stresses. Because of its role in cell proliferation and carcinogenesis, the best-characterised MAPK pathway is the RAS/RAF/MEK/ERK pathway. Mutations of each of these genes are mutually exclusive, but all lead to constitutive activation of the MAPK signal transduction pathway. This pathway is one of the most frequently dysregulated signal transduction pathways in human cancers, often through gain-of-function mutations of RAS and RAF family members. Mutations in KRAS have been found in 90% of pancreatic cancers, 20% of non-small cell lung cancers (NSCLC), and up to 50% of colorectal and thyroid cancers (Jose et al., 1984), whereas mutations of BRAF have been identified in more than 60% of melanomas and 40% to 60% of papillary thyroid cancers (Cohen et al., 2003; Davies et al., 2002; Xu et al., 2003). Although MEK1/2 is rarely mutated, constitutively active MEK has been found in more than 30% of primary tumour cell lines tested (Hoshino et al., 1999). With regard to ovarian cancer, low-grade serous ovarian carcinoma, which accounts for a small proportion of all ovarian serous carcinomas (<20%), is characterized by mutations of the KRAS, BRAF and ERBB2 genes (Lopez et al., 2013). In addition, it is hypothesised that alterations other than mutation exist in high-grade serous ovarian cancer (HGSOC).
It has been found that 11% of HGSOC have KRAS amplification and 12% have BRAF amplification, suggesting that the RAS/RAF/MEK/ERK pathway may have utility outside low-grade serous ovarian cancer (The Cancer Genome Atlas (TCGA), Nature 2011). In addition, 12% of these tumours also have alterations in the NF1 gene, which encodes a RAS GTPase-activating protein and negative regulator of RAS (TCGA, Nature 2011).
The RAS/RAF/MEK/ERK pathway is activated by a wide array of growth factors and cytokines, such as EGF, IGF, and TGF, acting through receptor tyrosine kinases. The activated receptors recruit nucleotide exchange proteins which activate RAS through a conversion from the inactive GDP-bound form to the active GTP-bound form. Activated RAS recruits RAF kinase to the membrane, where it is activated by multiple phosphorylation events and where it activates MEK1/2 kinase. MEK1/2 are dual-specificity kinases, catalysing the phosphorylation of both tyrosine and threonine on ERK1 and ERK2. Phosphorylated ERK can translocate to the nucleus where it phosphorylates and activates various transcription factors (Marshall, 1996). Activated ERK1/2 catalyse the phosphorylation of numerous cytoplasmic and nuclear substrates, regulating diverse cellular responses such as mitosis, embryogenesis, cell differentiation, motility, metabolism, and programmed death, as well as angiogenesis (Shaul et al., 2007; Lewis et al., 1998; Johnson et al., 1994; D'Angelo et al., 1995; Na et al., 2010).
Factors associated with resistance to platinum include those that limit the formation of cytotoxic platinum-DNA adducts and those that prevent cell death occurring after platinum-adduct formation (Davis et al., 2014). The former may result from reduced uptake of cisplatin into cells, increased efflux via alterations to transport proteins, or inactivation of intracellular cisplatin by conversion into cisplatin-thiol conjugates. The latter form of resistance may occur by increased DNA repair after adduct formation. Alterations in various proteins associated with these repair mechanisms have been associated with platinum resistance, for example high levels of excision repair cross-complementation 1 (ERCC1) protein, mutations or down-regulation of MLH1, MSH2 and MSH1, and secondary mutations of BRCA1 or 2, which can cause reversion to the BRCA genotype and re-establishment of BRCA function, hence increasing homologous recombination (HR) (Lord and Ashworth, 2013). These various factors may either be present at diagnosis or acquired over time.
A number of studies have tried to characterise the mechanisms of acquired resistance in ovarian cancer. Analysis of 135 spatially and temporally separated samples from 14 patients with HGSOC who received platinum-based chemotherapy found that NF1 deletion showed a progressive increase in tumour allele fraction during chemotherapy (Schwarz et al., 2015). This suggested that subclonal tumour populations are present in pre-treatment biopsies in HGSOC and can undergo expansion during chemotherapy, causing clinical relapse (Schwarz et al., 2015). Additionally, alteration of the NF1 gene has been associated with innate cisplatin resistance in HGSOC, whereby 20% of primary tumours showed inactivation of the NF1 gene by mutation or gene breakage (Patch et al., 2015). Furthermore, mutation of the RAS-MAPK pathway has been associated with chemotherapy resistance in relapsed neuroblastomas (Eleveld et al., 2015). Finally, in cell line models, the MAPK pathway has been implicated in cisplatin resistance in ovarian cancer (Benedetti et al., 2008) and in squamous cell carcinoma (Kong et al., 2015).
A cancer with a given histopathological diagnosis may represent multiple diseases at a molecular level. The present inventors have defined a molecular subgroup of cancer characterised by misregulation of the MAPK signalling pathway and the epithelial-mesenchymal transition (EMT) pathway. Biomarker signatures devised by the present inventors can be used to identify cancers within the molecular subgroup. The signatures are also useful for identifying the treatment that is best suited for a given patient.
Thus, in a first aspect the invention provides a method for selecting a treatment for a subject having a cancer, comprising:
According to a related aspect of the invention there is provided a method for selecting a treatment for a subject having a cancer, comprising:
According to all aspects of the invention, in specific embodiments the methods of the invention comprise measuring the expression level of THBS1. In further specific embodiments the methods of the invention comprise measuring the expression levels of COL5A1 and THBS1. In yet further specific embodiments the methods of the invention comprise measuring the expression level of COL5A1. COL5A1 and THBS1 are found in Table B herein. Thus, in addition to measuring the expression levels of at least COL5A1 and/or THBS1 the methods of the invention may include measuring one or more additional, up to all, of the biomarkers listed in Table B (optionally together with one or more biomarkers from Table A and/or one or more additional biomarkers).
The present inventors have identified that treatment of tumour cells resistant to a platinum-based chemotherapeutic agent (and positive for the biomarker signature) with a MAPK pathway inhibitor can re-sensitise the tumour cells to the platinum-based chemotherapeutic agent. Thus, in the methods described herein the MAPK pathway inhibitor may be combined with a platinum-based chemotherapeutic agent. The platinum-based chemotherapeutic agent may be administered before, together with, or after the MAPK pathway inhibitor. Preferably, the platinum-based chemotherapeutic agent is administered together with, or after, the MAPK pathway inhibitor.
Furthermore, the present inventors have identified that the MAPK and SRC pathways act in parallel such that the inhibition of one signaling cascade leads to the activation of the other. Thus, in the methods described herein the MAPK pathway inhibitor may be combined with a SRC pathway inhibitor. The SRC pathway inhibitor may be administered before, together with, or after the MAPK pathway inhibitor. Preferably, the SRC pathway inhibitor is administered together with, or after, the MAPK pathway inhibitor.
By “indicated” is meant “indicated for treatment”, i.e. that the therapeutic agent is predicted to positively treat the cancer. A therapeutic agent is thus “indicated” if the cancer's rate of growth is expected to, or will, decelerate as a result of contact with the therapeutic agent, compared to its growth in the absence of contact with the therapeutic agent. A therapeutic agent can also be considered “indicated” if the subject's overall prognosis (progression free survival and/or overall survival) is expected to, or will, improve by administration of the therapeutic agent.
By “not indicated” is meant “not indicated for treatment”, i.e. that the therapeutic agent is predicted not to positively treat the cancer. A therapeutic agent is thus “not indicated” if the cancer's rate of growth is expected not to, or will not, decelerate as a result of contact with the therapeutic agent, compared to its growth in the absence of contact with the therapeutic agent. A therapeutic agent can also be considered “not indicated” if the subject's overall prognosis (progression free survival and/or overall survival) is expected not to, or will not, improve by administration of the therapeutic agent. A therapeutic agent can also be considered “not indicated” if it is “contraindicated”. By “contraindicated” is meant that a worse outcome is expected for the subject than if the subject were not treated with the therapeutic agent.
According to a related aspect of the invention there is provided a method for predicting the responsiveness of a subject with cancer to a therapeutic agent comprising:
There is also provided a method for predicting the responsiveness of a subject with cancer to a therapeutic agent comprising:
A cancer is “responsive” to a therapeutic agent if its rate of growth is inhibited as a result of contact with the therapeutic agent, compared to its growth in the absence of contact with the therapeutic agent. Growth of a cancer can be measured in a variety of ways, for instance by measuring the size of a tumour or the expression of tumour markers appropriate for that tumour type. A cancer can also be considered responsive to a therapeutic agent if the subject's overall prognosis (progression free survival and/or overall survival) is improved by the administration of the therapeutic agent.
A cancer is “non-responsive” to a therapeutic agent if its rate of growth is not inhibited, or is inhibited only to a very low degree or to a non-statistically significant degree, as a result of contact with the therapeutic agent when compared to its growth in the absence of contact with the therapeutic agent. As stated above, growth of a cancer can be measured in a variety of ways, for instance by measuring the size of a tumour or the expression of tumour markers appropriate for that tumour type. A cancer can also be considered non-responsive to a therapeutic agent if the subject's overall prognosis (progression free survival and/or overall survival) is not improved by the administration of the therapeutic agent. Still further, non-responsiveness can be assessed using additional criteria beyond the growth or size of a tumour, such as, but not limited to, patient quality of life and degree of metastases.
In yet a further aspect, the present invention relates to a method of determining clinical prognosis of a subject with cancer comprising:
In some embodiments, the at least 1 biomarker comprises or is COL5A1 and/or THBS1.
“Poor prognosis” may indicate decreased progression free survival and/or overall survival rates compared to samples that are negative for the biomarker signature and/or good prognosis may indicate increased progression free survival or overall survival rates compared to samples that are positive for the biomarker signature. Poor prognosis may indicate increased likelihood of recurrence or metastasis compared to samples that are negative for the biomarker signature and/or good prognosis may indicate decreased likelihood of recurrence or metastasis compared to samples that are positive for the biomarker signature. Metastasis, or metastatic disease, is the spread of a cancer from one organ or part to another non-adjacent organ or part. The new occurrences of disease thus generated are referred to as metastases.
In certain embodiments the cancer of the subject whose prognosis is determined is not a glioblastoma. In specific embodiments the cancer of the subject whose prognosis is determined is colon, lung (optionally lung adenocarcinoma) or prostate cancer (and the subject is not receiving a taxane), optionally wherein the subject with prostate cancer has been treated with radical prostatectomy and/or radical radiotherapy. The cancer may also be bladder cancer, cervical cancer, colorectal cancer, glioblastoma, head and neck cancer, renal cancer (optionally renal clear cell or renal papillary cancer), glioma (optionally lower grade glioma), pancreatic cancer, melanoma, ovarian cancer and/or stomach cancer. In certain embodiments the subject is receiving, has received and/or will receive treatment with the standard of care treatment. A skilled practitioner is aware of the standard of care treatment for the particular cancer. In some embodiments, the standard of care treatment incorporates a platinum-based chemotherapeutic agent (as defined herein).
The signatures disclosed herein provide a prognostic indication. This may apply to untreated patients. It may also apply to patients treated with standard of care treatment.
However, the signatures disclosed herein also predict responsiveness to particular targeted, or indicated, therapeutic agents. Accordingly, in specific embodiments a subject with cancer whose sample is positive for the biomarker signature may have a better prognosis when treated with an indicated therapeutic agent to which they are predicted to be responsive than a subject with a cancer whose sample is negative for the biomarker signature and who is treated with the same therapeutic agent. For example, it is shown herein that subjects with (de novo) metastatic prostate cancer who are signature positive have a good prognosis (increased overall survival) relative to signature negative subjects when treated with a taxane. Thus, according to all aspects of the invention, a subject that is signature positive may be selected for therapy and this improves their prognosis. Alternatively, a subject that is signature negative has an improved prognosis and thus is not selected for therapy.
According to a further aspect the invention provides a method for selecting a treatment for a subject having a cancer, comprising:
In some embodiments, the at least 1 biomarker comprises or is COL5A1 and/or THBS1. By “treatment” is meant any therapy or surgery that may be provided to a subject in order to improve, stabilise, or minimise deterioration of a medical condition. The condition relevant to the present invention is cancer (in particular a cancer of the type indicated herein). Means of administration of therapeutic agents include oral, rectal, sublingual, sublabial, intravenous, intraatricular, intracardiac, intracavernous, intramuscular, epidural, intracerebral, intracerebroventricular, epicutaneous, nasal, intrathecal, or via a gastral or duodenal feeding tube and are known in the art. Similarly dosage forms and dosage regimes are known for therapeutic agents and can be determined by a practising physician. Therapeutic agents are approved and marketed for administration in a given dosage form, together with detailed prescribing instructions. Thus, the invention is not limited in relation to how, or in what form, the therapeutic agent is administered since the skilled person would be in a position to determine this based on the therapeutic agent of interest.
According to all aspects of the invention an increased expression level and/or a decreased expression level of the biomarker(s) may contribute to the determination that the sample is positive or negative for the biomarker signature. As shown in Tables C and D, a threshold level of gene expression can be set and a value above or below that threshold may then indicate increased or decreased expression levels. Of course, the invention is not limited to the specific values; the skilled person would appreciate that the suitable values may be determined depending upon the data set in question.
The biomarker signature may be defined by the probesets listed in Tables E and F and by the expression levels of the corresponding genes, which may be measured using the probesets. The biomarker signature may include the expression levels of one or more additional biomarkers, which may be measured in any suitable way, for example using one or more additional probesets.
By “biomarker signature” is meant an identifier comprised of one or more biomarkers (such as a DNA or RNA sequence, a protein or other biological molecule, a cell etc.). The expression level of the one or more biomarkers is measured and the measured expression levels allow the sample to be defined as signature positive or signature negative. Thus, at its simplest, an increased level of expression of one or more biomarkers defines a sample as positive for the biomarker signature. For certain biomarkers, a decreased level of expression of one or more biomarkers defines a sample as positive for the biomarker signature. However, where the expression level of a plurality of biomarkers is measured, some biomarkers may display increased expression and some biomarkers may display decreased expression, so the combination of expression levels is typically aggregated in order to determine whether the sample is positive for the biomarker signature. This aggregation can be achieved in various ways, as discussed in detail herein.
In a general sense, in some embodiments, the biomarker signature may be considered as indicative of a particular biological state (such as the presence of a disease condition or developmental state or belonging to a particular biological subgroup). “Positive” for a biomarker signature thus may be interpreted to mean that the sample reflects the relevant biological state that the biomarker signature identifies. Similarly, “negative” for a biomarker signature means that the sample is not in (or reflective of) the relevant biological state. In the present invention, the biological state indicated by the biomarker signature is a molecular subgroup of cancer characterised by misregulation of the MAPK signalling pathway and the epithelial-mesenchymal transition (EMT) pathway. Thus, the cancer identified by the signature may have increased MAPK signalling. The cancer identified by the signature may have increased expression of both immune response and angiogenesis/vascular development genes. The cancer identified by the signature may display higher expression of EMT associated genes. This may include increased expression of VIMENTIN, AXL, TWIST1, SNAIL and/or SLUG. The increased signalling or expression is as compared to other cancers of the same type. So, for example, the cancer may be an ovarian cancer and the subgroup displays increased signalling or expression as compared to other ovarian cancers. Genes defining the EMT/Angio-Immune/MAPK pathway molecular subgroup of cancer are listed in Tables 9 and 10 below. The expression level of the genes may be measured using the probesets in Table 11. In Table 9 up-regulation and down-regulation are presented relative to gene expression levels in the overall sample set.
The biomarker signature is also correlated with particular end points as discussed in detail herein. The biomarker signature may permit selection of appropriate therapeutic interventions for example.
According to all aspects of the invention assessing whether the sample is positive or negative for the biomarker signature may comprise use of classification trees.
According to all aspects of the invention assessing whether the sample is positive or negative for the biomarker signature may comprise:
The skilled person will be aware that threshold expression scores may be set in a number of ways, as discussed in greater detail herein below, for example in order to maximise sensitivity and/or specificity. Thus, the sample expression score and threshold score may also be determined such that if the sample expression score is below or equal to the threshold expression score the sample is positive for the biomarker signature and/or if the sample expression score is above the threshold score the sample is negative for the biomarker signature.
“Expression levels” of biomarkers may be numerical values or directions of expression. By “directions” is meant increased or decreased expression, which may be determined as against a control or threshold expression level as explained further herein.
In the methods the sample expression score (or “signature score”) may be derived according to the formula:
The sample expression score may be derived using the expression level(s) of any of the genes or groups of genes described herein. The sample expression score may be derived using the expression level of one or more additional genes.
According to all aspects of the invention the expression score may be calculated using a weight value and a bias value for each biomarker. For example, the weight value and the bias value may be as defined for each biomarker in Table A and/or Table B. The expression score may be calculated using a weight value for each biomarker.
As used herein, the term “weight” refers to the absolute magnitude of an item in a mathematical calculation. The weight of each biomarker in a gene expression classifier or signature may be determined on a data set of patient samples using learning methods known in the art. As used herein the term “bias” or “offset” refers to a constant term derived using the mean expression of the signature genes in a training set and is used to mean-center each gene analyzed in the test dataset.
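As a worked illustration of how the weight and bias terms could interact, assume (purely for illustration; this is not a formula taken from the tables) that each measured expression level $x_i$ is mean-centred by subtracting its bias $b_i$ before being multiplied by its weight $w_i$. The symbols $x_i$, $w_i$ and $b_i$ are introduced here only for this sketch. The per-biomarker biases then collapse into a single constant:

```latex
\[
\sum_{i} w_i \,(x_i - b_i) \;=\; \sum_{i} w_i\, x_i \;-\; \sum_{i} w_i\, b_i
\]
```

Under that assumption, mean-centering each gene with training-set values is equivalent to forming a plain weighted sum of the raw expression levels and then adding the fixed constant $-\sum_i w_i b_i$, which depends only on the training set and not on the sample being scored.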
By “expression score” is meant a compound decision score that summarizes the expression levels of the biomarkers. This may be compared to a threshold score that is mathematically derived from a training set of patient data. The threshold score is established with the purpose of maximizing the ability to separate cancers into those that are positive for the biomarker signature and those that are negative. The patient training set data is preferably derived from cancer tissue samples having been characterized by sub-type, prognosis, likelihood of recurrence, long term survival, clinical outcome, treatment response, diagnosis, cancer classification, or personalized genomics profile. Expression profiles, and corresponding decision scores from patient samples may be correlated with the characteristics of patient samples in the training set that are on the same side of the mathematically derived score decision threshold. In certain example embodiments, the threshold of the (linear) classifier scalar output is optimized to maximize the sum of sensitivity and specificity under cross-validation as observed within the training dataset.
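The threshold selection described above can be sketched as follows. This is a minimal illustration in Python, assuming a sample is called positive when its score is at or above the cut-off; the function name `best_threshold` and the toy scores and labels are hypothetical, and the cross-validation mentioned in the text is deliberately left out to keep the sketch short.

```python
def best_threshold(scores, labels):
    """Return (cut-off, sensitivity + specificity) for the best cut-off.

    scores: decision scores from a training set (list of float).
    labels: True where the sample is truly signature positive.
    A sample is called positive when score >= cut-off (an assumption).
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative sample")
    best = None
    # Every observed score is tried as a candidate cut-off.
    for candidate in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y and s >= candidate)
        tn = sum(1 for s, y in zip(scores, labels) if not y and s < candidate)
        total = tp / positives + tn / negatives  # sensitivity + specificity
        # Strict comparison keeps the lowest cut-off when totals tie.
        if best is None or total > best[1]:
            best = (candidate, total)
    return best


# Hypothetical training scores and true classes.
scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2]
labels = [False, False, True, True, True, False]
threshold, total = best_threshold(scores, labels)
print(threshold, round(total, 3))
```

In practice the same search would be repeated inside each cross-validation fold so that the chosen cut-off is not tuned to a single split of the training data.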
The overall expression data for a given sample may be normalized using methods known to those skilled in the art in order to correct for differing amounts of starting material, varying efficiencies of the extraction and amplification reactions, etc.
In one embodiment, the biomarker expression levels in a sample are evaluated by a (linear) classifier. As used herein, a (linear) classifier refers to a weighted sum of the individual biomarker intensities combined into a compound decision score (“decision function”). The decision score is then compared to a pre-defined cut-off score threshold, corresponding to a certain set-point in terms of sensitivity and specificity, which indicates whether the sample's score is equal to or above the score threshold (decision function positive) or below it (decision function negative).
Using a (linear) classifier on the normalized data to make a call (e.g. positive or negative for a biomarker signature) effectively means to split the data space, i.e. all possible combinations of expression values for all genes in the classifier, into two disjoint segments by means of a separating hyperplane. This split is empirically derived on a (large) set of training examples. Without loss of generality, one can assume a certain fixed set of values for all but one biomarker, which would automatically define a threshold value for this remaining biomarker where the decision would change from, for example, positive to negative for the biomarker signature. The precise value of this threshold depends on the actual measured expression profile of all other biomarkers within the classifier, but the general indication of certain biomarkers remains fixed. Therefore, in the context of the overall gene expression classifier, relative expression can indicate whether up- or down-regulation of a certain biomarker is indicative of being positive for the signature. In certain example embodiments, a sample expression score above the threshold expression score indicates the sample is positive for the biomarker signature. In certain other example embodiments, a sample expression score above a threshold score indicates the subject has a poor clinical prognosis compared to a subject with a sample expression score below the threshold score.
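The single-biomarker threshold described above can be made concrete with a short Python sketch. It assumes a linear score of the form "sum of weight times expression, plus an offset" compared against a cut-off; the gene names, weights, offset and expression values below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def implied_threshold(weights, offset, expression, gene, cutoff):
    """Expression level of `gene` at which the linear score equals `cutoff`,
    with every other biomarker held at its measured value."""
    if weights[gene] == 0:
        raise ValueError("a zero-weight biomarker never changes the call")
    # Contribution of all the other biomarkers to the score.
    rest = sum(w * expression[g] for g, w in weights.items() if g != gene)
    return (cutoff - offset - rest) / weights[gene]


# Hypothetical two-gene classifier.
weights = {"GENE_A": 0.5, "GENE_B": -0.25}
offset = 0.1
expression = {"GENE_A": 2.0, "GENE_B": 4.0}

for gene, w in weights.items():
    level = implied_threshold(weights, offset, expression, gene, cutoff=0.0)
    # The sign of the weight fixes the direction, whatever the other genes do.
    direction = "at or above" if w > 0 else "at or below"
    print(f"{gene}: score reaches the cut-off when expression is {direction} {level:.2f}")
```

The numeric level moves whenever the other expression values change, but the direction printed for each gene depends only on the sign of its weight, which is the sense in which the indication of a biomarker stays fixed.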
In certain other example embodiments, the expression signature is derived using a decision tree (Hastie et al. The Elements of Statistical Learning, Springer, New York 2001), a random forest (Breiman, 2001, Random Forests, Machine Learning 45:5), a neural network (Bishop, Neural Networks for Pattern Recognition, Clarendon Press, Oxford 1995), discriminant analysis (Duda et al. Pattern Classification, 2nd ed., John Wiley, New York 2001), including, but not limited to, linear, diagonal linear, quadratic and logistic discriminant analysis, a Prediction Analysis for Microarrays (PAM; Tibshirani et al., 2002, Proc. Natl. Acad. Sci. USA 99:6567-6572) or a Soft Independent Modeling of Class Analogy analysis (SIMCA; Wold, 1976, Pattern Recogn. 8:127-139).
Classification trees (Breiman, Leo; Friedman, J. H.; Olshen, R. A.; Stone, C. J. (1984). Classification and regression trees. Monterey, Calif.: Wadsworth & Brooks/Cole Advanced Books & Software. ISBN 978-0-412-04841-8) provide a means of predicting outcomes based on logic and rules. A classification tree is built through a process called binary recursive partitioning, which is an iterative procedure of splitting the data into partitions/branches. The goal is to build a tree that distinguishes among pre-defined classes. Each node in the tree corresponds to a variable. To choose the best split at a node, each variable is considered in turn, where every possible split is tried and considered, and the best split is the one which produces the largest decrease in diversity of the classification label within each partition. This is repeated for all variables, and the winner is chosen as the best splitter for that node. The process is continued at the next node and in this manner, a full tree is generated.
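The split search at a single node can be sketched as below. This is a minimal Python illustration that assumes Gini impurity as the measure of "diversity" (one common choice; the text does not name a specific measure) and numeric variables only; the function names and the toy data are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure partition)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Try every variable and every midpoint between adjacent observed values;
    return (impurity decrease, variable index, cut point) for the best split."""
    n = len(labels)
    parent = gini(labels)
    best = None
    for j in range(len(rows[0])):
        values = sorted(set(row[j] for row in rows))
        for low, high in zip(values, values[1:]):
            cut = (low + high) / 2
            left = [y for row, y in zip(rows, labels) if row[j] <= cut]
            right = [y for row, y in zip(rows, labels) if row[j] > cut]
            # Size-weighted impurity of the two child partitions.
            child = (len(left) * gini(left) + len(right) * gini(right)) / n
            decrease = parent - child
            if best is None or decrease > best[0]:
                best = (decrease, j, cut)
    return best


# Hypothetical expression values for two variables across four samples.
rows = [[1.0, 5.0], [2.0, 1.0], [3.0, 4.0], [4.0, 2.0]]
labels = ["neg", "neg", "pos", "pos"]
decrease, variable, cut = best_split(rows, labels)
print(decrease, variable, cut)
```

Building a full tree would apply `best_split` again to the samples on each side of the chosen cut until the partitions are pure or a stopping rule is met.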
One of the advantages of classification trees over other supervised learning approaches, such as discriminant analysis, is that the variables that are used to build the tree can be categorical, numeric, or a mix of both. In this way it is possible to generate a classification tree for predicting outcomes based on, say, the directionality of gene expression. Random forest algorithms (Breiman, Leo (2001). “Random Forests”. Machine Learning 45 (1): 5-32. doi:10.1023/A:1010933404324) provide a further extension to classification trees, whereby a collection of classification trees is randomly generated to form a “forest” and an average of the predicted outcomes from each tree is used to make inference with respect to the outcome.
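The averaging step of a forest, applied to categorical "direction of expression" inputs, can be sketched as follows. This Python illustration covers only the aggregation of tree outputs; the three trees are hand-written stand-ins (a real random forest would generate them from bootstrap samples and random variable subsets), and the gene names and rules are hypothetical.

```python
def tree_a(sample):
    # Hypothetical rule: GENE_A up-regulated -> vote positive.
    return 1 if sample["GENE_A"] == "up" else 0


def tree_b(sample):
    # Hypothetical two-level rule on GENE_B, then GENE_C.
    if sample["GENE_B"] == "down":
        return 1
    return 1 if sample["GENE_C"] == "up" else 0


def tree_c(sample):
    # Hypothetical rule needing both GENE_A and GENE_C up-regulated.
    return 1 if sample["GENE_A"] == "up" and sample["GENE_C"] == "up" else 0


def forest_vote(trees, sample):
    """Average of the individual tree outputs (fraction voting positive)."""
    votes = [tree(sample) for tree in trees]
    return sum(votes) / len(votes)


sample = {"GENE_A": "up", "GENE_B": "up", "GENE_C": "down"}
fraction = forest_vote([tree_a, tree_b, tree_c], sample)
call = "positive" if fraction >= 0.5 else "negative"
print(fraction, call)
```

The 0.5 cut on the averaged vote is an assumption of this sketch; any other operating point could be chosen from training data.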
Biomarker expression values may be defined in combination with corresponding scalar weights on the real scale with varying magnitude, which are further combined through linear or non-linear, algebraic, trigonometric or correlative means into a single scalar value via an algebraic, statistical learning, Bayesian, regression, or similar algorithms which together with a mathematically derived decision function on the scalar value provide a predictive model by which expression profiles from samples may be resolved into discrete classes of responder or non-responder, resistant or non-resistant, to a specified drug, drug class, molecular subtype, or treatment regimen. Such predictive models, including biomarker membership, are developed by learning weights and the decision threshold, optimized for sensitivity, specificity, negative and positive predictive values, hazard ratio or any combination thereof, under cross-validation, bootstrapping or similar sampling techniques, from a set of representative expression profiles from historical patient samples with known drug response and/or resistance.
In one embodiment, the biomarkers are used to form a weighted sum of their signals, where individual weights can be positive or negative. The resulting sum (“expression score”) is compared with a pre-determined reference point or value. The comparison with the reference point or value may be used to diagnose, or predict a clinical condition or outcome.
As described above, one of ordinary skill in the art will appreciate that the biomarkers included in the classifier provided in Table A and/or Table B will carry unequal weights in a classifier. Therefore, while as few as one biomarker may be used to diagnose or predict a clinical prognosis or response to a therapeutic agent, the specificity and sensitivity of the diagnosis, or the prediction accuracy, may increase using more biomarkers.
In certain example embodiments, the expression signature is defined by a decision function. A decision function is a set of weighted expression values derived using a (linear) classifier. All linear classifiers define the decision function using the following equation:
$$f(x) = w' \cdot x + b = \sum_{i} w_i \cdot x_i + b \qquad (1)$$
All measurement values, such as the microarray gene expression intensities x_i, for a certain sample are collected in a vector x. Each intensity is then multiplied by a corresponding weight w_i, and the offset term b is added to the sum of these products to obtain the value of the decision function f(x). In deriving the decision function, the linear classifier will further define a threshold value that splits the gene expression data space into two disjoint sections. Example (linear) classifiers include but are not limited to partial least squares (PLS) (Nguyen et al., Bioinformatics 18 (2002) 39-50), support vector machines (SVM) (Schölkopf et al., Learning with Kernels, MIT Press, Cambridge 2002), and shrinkage discriminant analysis (SDA) (Ahdesmäki et al., Annals of applied statistics 4, 503-519 (2010)). In one example embodiment, the (linear) classifier is a PLS linear classifier.
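Equation (1) and the subsequent comparison with a threshold can be sketched in a few lines of Python. The weights, offset, intensities and threshold below are hypothetical values chosen for illustration, not values from any table; the convention that a score equal to or above the threshold is called positive is taken as an assumption.

```python
def decision_function(weights, offset, intensities):
    """f(x) = sum_i w_i * x_i + b for one sample."""
    return sum(w * intensities[gene] for gene, w in weights.items()) + offset


def call_sample(score, threshold):
    """Assumed convention: equal to or above the threshold is positive."""
    return "positive" if score >= threshold else "negative"


# Hypothetical classifier and sample.
weights = {"GENE_A": 0.5, "GENE_B": -0.25}   # w_i, may be positive or negative
offset = 0.1                                  # b
intensities = {"GENE_A": 2.0, "GENE_B": 4.0}  # x_i

score = decision_function(weights, offset, intensities)
result = call_sample(score, threshold=0.0)
print(score, result)
```

Keeping the weights in a dictionary keyed by biomarker name, rather than relying on list position, avoids silently mismatching a weight with the wrong intensity when the measurement order changes.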
The decision function is empirically derived on a large set of training samples, for example from patients showing a good or poor clinical prognosis. The threshold separates a patient group based on different characteristics such as, but not limited to, clinical prognosis before or after a given therapeutic treatment. The interpretation of this quantity, i.e. the cut-off threshold, is derived in the development phase (“training”) from a set of patients with known outcome. The corresponding weights and the responsiveness/resistance cut-off threshold for the decision score are fixed a priori from training data by methods known to those skilled in the art. In one example embodiment, Partial Least Squares Discriminant Analysis (PLS-DA) is used for determining the weights. (L. Ståhle, S. Wold, J. Chemom. 1 (1987) 185-196; D. V. Nguyen, D. M. Rocke, Bioinformatics 18 (2002) 39-50).
Effectively, this means that the data space, i.e. the set of all possible combinations of biomarker expression values, is split into two mutually exclusive groups corresponding to different clinical classifications or predictions, for example, one corresponding to good clinical prognosis and the other to poor clinical prognosis. In the context of the overall classifier, relative over-expression of a certain biomarker can either increase the decision score (positive weight) or reduce it (negative weight) and thus contribute to an overall decision of, for example, a good clinical prognosis.
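By way of illustration only, the decision function of equation (1) and the subsequent threshold comparison can be sketched as follows. The weights, bias, threshold and expression values are hypothetical and are not taken from Table A or Table B; the convention that a score above the threshold is called "signature positive" is likewise an assumption made for the sketch.

```python
from typing import Sequence


def decision_score(x: Sequence[float], w: Sequence[float], b: float) -> float:
    """Return f(x) = sum_i(w_i * x_i) + b for one sample."""
    if len(x) != len(w):
        raise ValueError("expression vector and weight vector must have equal length")
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(score: float, threshold: float) -> str:
    """Assign one of two mutually exclusive groups from the decision score."""
    # Assumption: scores above the threshold are called positive.
    return "signature positive" if score > threshold else "signature negative"


# Hypothetical values for illustration only.
weights = [0.5, -0.25, 1.0]   # a negative weight lowers the score on over-expression
bias = -1.0
threshold = 0.0
sample = [2.0, 4.0, 1.5]      # expression intensities for three biomarkers

score = decision_score(sample, weights, bias)
label = classify(score, threshold)
```

In this sketch the second biomarker carries a negative weight, so its relative over-expression reduces the decision score, while the first and third increase it.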
In certain example embodiments of the invention, the data is transformed non-linearly before applying a weighted sum as described above. This non-linear transformation might include increasing the dimensionality of the data. The non-linear transformation and weighted summation might also be performed implicitly, for example, through the use of a kernel function (Schölkopf et al., Learning with Kernels, MIT Press, Cambridge 2002).
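As a non-limiting sketch of how a kernel function can perform the non-linear transformation implicitly, the following compares an explicit degree-2 feature map (which raises two input features to three dimensions) with the corresponding polynomial kernel. The choice of a homogeneous degree-2 polynomial kernel and the input values are assumptions made purely for illustration; the specification does not prescribe a particular kernel.

```python
import math
from typing import Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain inner product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))


def feature_map(x: Tuple[float, float]) -> Tuple[float, float, float]:
    """Explicit non-linear map from 2 to 3 dimensions: (x1^2, sqrt(2)*x1*x2, x2^2)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


def poly2_kernel(x: Sequence[float], z: Sequence[float]) -> float:
    """Homogeneous degree-2 polynomial kernel k(x, z) = (x . z)^2."""
    return dot(x, z) ** 2


# Hypothetical two-feature samples.
x = (1.0, 2.0)
z = (3.0, 4.0)

explicit = dot(feature_map(x), feature_map(z))  # inner product in the expanded space
implicit = poly2_kernel(x, z)                   # same quantity without expanding
```

Both routes give the same inner product, which is why a weighted sum in the transformed space can be evaluated through the kernel alone.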
In certain example embodiments, the patient training set data is derived by isolating RNA from a corresponding cancer tissue sample set and determining expression values by hybridizing the (cDNA amplified from) isolated RNA to a microarray. In certain example embodiments, the microarray used in deriving the expression signature is a transcriptome array. As used herein a “transcriptome array” refers to a microarray containing probe sets that are designed to hybridize to sequences that have been verified as expressed in the diseased tissue of interest. Given alternative splicing and variable poly-A tail processing between tissues and biological contexts, it is possible that probes designed against the same gene sequence derived from another tissue source or biological context will not effectively bind to transcripts expressed in the diseased tissue of interest, leading to a loss of potentially relevant biological information. Accordingly, it is beneficial to verify what sequences are expressed in the disease tissue of interest before deriving a microarray probe set. Verification of expressed sequences in a particular disease context may be done, for example, by isolating and sequencing total RNA from a diseased tissue sample set and cross-referencing the isolated sequences with known nucleic acid sequence databases to verify that the probe set on the transcriptome array is designed against the sequences actually expressed in the diseased tissue of interest. Methods for making transcriptome arrays are described in United States Patent Application Publication No. 2006/0134663, which is incorporated herein by reference. In certain example embodiments, the probe set of the transcriptome array is designed to bind within 300 nucleotides of the 3′ end of a transcript. Methods for designing transcriptome arrays with probe sets that bind within 300 nucleotides of the 3′ end of target transcripts are disclosed in United States Patent Application Publication No. 2009/0082218, which is incorporated by reference herein. In certain example embodiments, the microarray used in deriving the gene expression profiles of the present invention is the Almac Ovarian Cancer DSA™ microarray (Almac Group, Craigavon, United Kingdom).
An optimal (linear) classifier can be selected by evaluating a (linear) classifier's performance using such diagnostics as “area under the curve” (AUC). AUC refers to the area under the curve of a receiver operating characteristic (ROC) curve, both of which are well known in the art. AUC measures are useful for comparing the accuracy of a classifier across the complete data range. (Linear) classifiers with a higher AUC have a greater capacity to classify unknowns correctly between two groups of interest (e.g., ovarian cancer samples and normal or control samples). ROC curves are useful for plotting the performance of a particular feature (e.g., any of the biomarkers described herein and/or any item of additional biomedical information) in distinguishing between two populations (e.g., individuals responding and not responding to a therapeutic agent). Typically, the feature data across the entire population (e.g., the cases and controls) are sorted in ascending order based on the value of a single feature. Then, for each value for that feature, the true positive and false positive rates for the data are calculated. The true positive rate is determined by counting the number of cases above the value for that feature and then dividing by the total number of positive cases. The false positive rate is determined by counting the number of controls above the value for that feature and then dividing by the total number of controls. Although this definition refers to scenarios in which a feature is elevated in cases compared to controls, this definition also applies to scenarios in which a feature is lower in cases compared to the controls (in such a scenario, samples below the value for that feature would be counted). ROC curves can be generated for a single feature as well as for other single outputs; for example, a combination of two or more features can be mathematically combined (e.g., added, subtracted, multiplied, etc.) to provide a single sum value, and this single sum value can be plotted in a ROC curve. Additionally, any combination of multiple features, in which the combination derives a single output value, can be plotted in a ROC curve. These combinations of features may comprise a test. The ROC curve is the plot of the true positive rate (sensitivity) of a test against the false positive rate (1-specificity) of the test.
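The calculation of true positive and false positive rates described above, and the resulting AUC, can be sketched as follows for a feature that is elevated in cases. This is a minimal illustration under stated assumptions: samples at or above each candidate value are counted (rather than strictly above), thresholds are visited in descending order so that the curve runs from (0, 0) to (1, 1), the area is obtained by the trapezoidal rule, and the feature values are hypothetical.

```python
from typing import List, Sequence, Tuple


def roc_points(values: Sequence[float], is_case: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) pairs, starting at (0, 0)."""
    n_cases = sum(1 for c in is_case if c)
    n_controls = len(is_case) - n_cases
    if n_cases == 0 or n_controls == 0:
        raise ValueError("both cases and controls are required")
    points = [(0.0, 0.0)]
    # Visit each observed feature value as a candidate threshold, highest first.
    for t in sorted(set(values), reverse=True):
        tp = sum(1 for v, c in zip(values, is_case) if c and v >= t)
        fp = sum(1 for v, c in zip(values, is_case) if not c and v >= t)
        points.append((fp / n_controls, tp / n_cases))
    return points


def auc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical single-feature values for three cases and three controls.
values = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
is_case = [True, True, False, True, False, False]

curve = roc_points(values, is_case)
area = auc(curve)
```

For these hypothetical values, eight of the nine case/control pairs are ordered correctly, so the area is 8/9. The same functions apply unchanged to a single combined output value in place of a single feature.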
Alternatively, an optimal classifier can be selected by evaluating performance against time-to-event endpoints using methods such as Cox proportional hazards (PH) and measures of performance across all possible thresholds assessed via the concordance-index (C-index) (Harrell, Jr. 2010). The C-index is analogous to the “area under the curve” (AUC) metric (used for dichotomised endpoints), and it is used to measure performance with respect to association with survival data. Note that the C-index is the extension of AUC to time-to-event endpoints, with threshold selection optimised to maximise the hazard ratio (HR) under cross-validation. In this instance, the partial Cox regression algorithm (Li and Gui, 2004) was chosen for the biomarker discovery analyses. It is analogous to principal components analysis in that the first few latent components explain most of the information in the data. Implementation is as described in Ahdesmäki et al., 2013.
C-index values can be generated for a single feature as well as for other single outputs, for example, a combination of two or more features can be mathematically combined (e.g., added, subtracted, multiplied, etc.) to provide a single sum value, and this single sum value can be evaluated for statistical significance. Additionally, any combination of multiple features, in which the combination derives a single output value, can be evaluated as a C-index for assessing utility for time-to-event class separation. These combinations of features may comprise a test. The C-index (Harrell, Jr. 2010, see Equation 4) of the continuous cross-validation test set risk score predictions was evaluated as the main performance measure.
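For illustration, a concordance index of the kind commonly attributed to Harrell can be computed from continuous risk scores as sketched below. This is an assumption-laden sketch rather than the implementation used in the analyses: a pair is treated as comparable only when the subject with the shorter time had an observed event, tied risk scores count as one half, tied times are ignored, and the survival data are hypothetical.

```python
from typing import Sequence


def concordance_index(times: Sequence[float],
                      events: Sequence[bool],
                      risk_scores: Sequence[float]) -> float:
    """Fraction of comparable pairs in which the higher risk score fails earlier."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Comparable pair: subject i had an observed event strictly before time j.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up data: time, whether the event was observed, risk score.
times = [5.0, 8.0, 12.0, 20.0]
events = [True, True, False, True]   # the third subject is censored
risk = [0.9, 0.3, 0.6, 0.1]

c_index = concordance_index(times, events, risk)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, mirroring the interpretation of AUC for dichotomised endpoints.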
In one example embodiment an expression signature is directed to the biomarkers detailed in Table A and/or Table B with corresponding ranks, and weights and associated bias detailed in the tables or alternative rankings, and weightings and bias, depending, for example, on the disease setting. The methods of the invention may rely upon measuring one or more, up to all, of the biomarkers listed in Table A and/or Table B (optionally together with one or more additional biomarkers).
The invention provides for patient selection for therapy and thus may contribute to improved outcomes in response to particular classes of therapy. Accordingly, the invention also relates to a method of treating cancer comprising administering a MAPK pathway inhibitor, an EMT pathway inhibitor, an SRC pathway inhibitor, an anti-angiogenic therapeutic agent, a taxane and/or a platinum-based chemotherapeutic agent to a subject wherein the subject is selected for treatment on the basis of a method as described herein.
In a related aspect, the present invention provides a method of treating cancer comprising administering a therapeutic agent to a subject wherein the subject is selected for treatment by
In a further aspect the invention provides a method of treating cancer comprising administering a therapeutic agent to a subject wherein the subject is selected for treatment by
The invention also relates to a MAPK pathway inhibitor, an EMT pathway inhibitor, an SRC pathway inhibitor, an anti-angiogenic therapeutic agent, a taxane and/or a platinum-based chemotherapeutic agent for use in treating cancer in a subject, wherein the subject is selected for treatment on the basis of a method as described herein.
In yet a further related aspect, the present invention provides a therapeutic agent for use in treating cancer in a subject wherein the subject is selected for treatment by:
According to a further aspect of the invention there is provided a therapeutic agent for use in treating cancer in a subject wherein the subject is selected for treatment by:
According to a further aspect of the invention there is provided a method of treating cancer comprising administering a therapeutic agent to a subject wherein
Also provided is a method of treating cancer comprising administering a therapeutic agent to a subject wherein if the subject is positive for a biomarker signature comprising the expression level(s) of at least COL5A1 and/or THBS1 the therapeutic agent is an anti-angiogenic therapeutic agent.
The invention also relates to a therapeutic agent for use in treating cancer in a subject, wherein
Also provided is a therapeutic agent for use in treating cancer in a subject, wherein if the sample is positive for a biomarker signature comprising the expression level(s) of at least COL5A1 and/or THBS1 the therapeutic agent is an anti-angiogenic therapeutic agent.
In yet a further aspect, the present invention relates to a method of treating cancer comprising administering a therapeutic agent to a subject, wherein:
According to a further aspect of the invention there is provided a therapeutic agent for use in treating cancer in a subject, wherein
By “EMT cancer” is meant a cancer falling within the molecular subgroup identified by the present inventors, which is detectable using the biomarker signatures of the invention and described herein, for example based on the expression levels of one or more biomarkers from Tables A and B. The cancer may thus display epithelial-mesenchymal transition (EMT), which may contribute to angiogenic processes and disease progression. An EMT cancer can also be termed an Angio-Immune cancer or a MAPK pathway (MEK) cancer in view of the contributing pathways to the subgroup. Genes defining the EMT/Angio-Immune/MAPK pathway molecular subgroup of cancer are listed in Tables 9 and 10 below. In Table 9 up-regulation and down-regulation are presented relative to gene expression levels in the overall sample set.
According to all aspects of the invention the therapeutic agent may be a MAPK pathway inhibitor combined with a platinum-based chemotherapeutic agent and/or an SRC pathway inhibitor.
The invention also relates to a method of treating cancer comprising administering a combination of a platinum-based chemotherapeutic agent and a MAPK pathway inhibitor, wherein:
In a further aspect, the present invention relates to a combination of a platinum-based chemotherapeutic agent and a MAPK pathway inhibitor for use in a method of treating cancer, wherein:
According to all relevant aspects of the invention, the platinum-based chemotherapeutic agent and the MAPK pathway inhibitor may be administered together and/or sequentially in time in either order.
According to all aspects of the invention, a therapeutic agent may be a chemically synthesized pharmaceutical, a biologic, vaccine or small molecule. Biologics include antibodies and derivatives thereof as discussed further herein, recombinant therapeutic proteins, sugars and nucleic acids.
By “MAPK pathway inhibitor” is meant a therapeutic agent, such as a pharmaceutical drug, that inhibits signalling via the MAPK pathway. The inhibitor may be specific for the MAPK pathway. Thus, in certain embodiments the MAPK pathway inhibitor is not a multi-pathway inhibitor. In further embodiments the MAPK pathway inhibitor is a RAS/RAF/MEK/ERK pathway inhibitor. In specific embodiments the MAPK pathway inhibitor is a (specific) RAS, RAF, MEK and/or MAPK inhibitor. By MEK inhibitor is meant a therapeutic agent, such as a pharmaceutical drug, that (specifically) inhibits the mitogen-activated protein kinase kinase enzymes MEK1 and/or MEK2.
In certain embodiments the MAPK pathway inhibitor is selected from Table G and/or H. In certain embodiments the MAPK pathway inhibitor (specifically) inhibits one or more of the targets listed in Table H. In specific embodiments the MAPK pathway inhibitor is trametinib. In further specific embodiments the MAPK pathway inhibitor is selumetinib (synonyms: AZD6244 and ARRY-142886).
By “EMT pathway inhibitor” is meant a therapeutic agent, such as a pharmaceutical drug, that acts to inhibit the epithelial-mesenchymal transition (EMT). The inhibitor may be specific for the EMT pathway. Thus, in certain embodiments the EMT pathway inhibitor is not a multi-pathway inhibitor. In certain embodiments, the EMT pathway inhibitor is selected from Table I.
In further embodiments, the EMT pathway inhibitor is an FKBP-L polypeptide or a biologically active peptide fragment thereof. In preferred embodiments, the biologically active peptide fragment of FKBP-L comprises the amino acid sequence IRQQPRDPPTETLELEVSPDPAS (SEQ ID NO:791; referred to herein also as ALM201), or a sequence at least 90% identical thereto. In further embodiments, the FKBP-L polypeptide comprises the amino acid sequence shown as SEQ ID NO:789 or SEQ ID NO:790, or a sequence at least 90% identical thereto. In further embodiments, the biologically active peptide fragment of FKBP-L comprises the amino acid sequence shown as any one of SEQ ID Nos 792 to 811, or a sequence at least 90% identical thereto.
As used herein, the term “biologically active FKBP-L peptide” (e.g., fragment and/or modified polypeptides) is used to refer to a peptide or polypeptide that displays the same or similar amount and type of activity as the full-length FKBP-L polypeptide. In this context “biological activity” of an FKBP-L polypeptide, fragment or derivative refers to the ability to inhibit and/or reverse the EMT pathway (and/or the ability to down-regulate the MAPK pathway). MAPK is known to induce EMT via phosphorylation of the SNAIL/SLUG transcription factors (Virtakoivu et al., 2015).
Biological activity of FKBP-L fragments or derivatives may be tested in comparison to full length FKBP-L using any of the in vitro or in vivo assays described in the accompanying examples, including cell-based assays of the mesenchymal phenotype, such as for example the colony formation assay, migration assay or invasion assay. In other embodiments, “biological activity” of an FKBP-L polypeptide, fragment or derivative may be demonstrated by assaying expression of one or more biomarkers of the EMT pathway (e.g. mesenchymal markers), or one or more biomarkers of the MAPK pathway.
The term “FKBP-L” refers to the protein FK506 binding protein-like (McKeen et al. Endocrinology, 2008, Vol 149(11), 5724-34; Gene ID: 63943). FKBP-L and peptide fragments thereof have previously been demonstrated to possess potent anti-angiogenic activity (WO 2007/141533). The anti-angiogenic activity of FKBP-L peptide fragments appears to be dependent on an amino acid sequence located between amino acids 34-57, in the N-terminal region of the full-length protein. This anti-angiogenic activity suggested a clinical utility of the peptide in the treatment of cancers, particularly solid tumours.
The expression “FKBP-L polypeptide” is used in the specification according to its broadest meaning. It designates the naturally occurring full-length protein as shown in SEQ ID NO:789, together with homologues due to polymorphisms, other variants, mutants and portions of said polypeptide which retain their biological activities. For example, in certain embodiments, the FKBP-L polypeptide comprises SEQ ID NO:789 (GenBank Accession No. NP_071393; NM_022110; [gi:34304364]), or SEQ ID NO:790 with a Threonine at position 181 and a Glycine at position 186 of the wild-type sequence. Example constructs of other FKBP-L polypeptides (e.g., fragments and other modifications) and polynucleotide constructs encoding for FKBP-L polypeptides are described in WO 2007/141533, the contents of which are incorporated herein in their entirety by reference, expressly for this purpose.
In SEQ ID NO: 790, the FKBP-L insert (originally cloned into PUC18 by Cambridge Bioscience and now cloned into pcDNA3.1) had two inserted point mutations compared to the sequence that is deposited on the PUBMED database (SEQ ID NO: 789). There is a point mutation at 540 bp (from start codon): TCT to ACT, which therefore converts a Serine (S) to a Threonine (T) (amino acid: 181). There is also a point mutation at 555 bp (from start codon): AGG to GGG, which therefore converts an Arginine (R) to a Glycine (G) (amino acid: 186). Both FKBP-L polypeptides (SEQ ID NO: 789 and SEQ ID NO: 790) display biological activity.
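The relationship between the stated nucleotide positions, codon changes and amino acid numbers can be checked with a short sketch. It assumes that the positions 540 and 555 are zero-based offsets counted from the first base of the start codon, which is the reading under which they fall on the first base of codons 181 and 186; the codon table is deliberately limited to the four codons named above (standard genetic code).

```python
# Standard genetic code entries for the four codons named in the text.
CODON_TABLE = {"TCT": "S", "ACT": "T", "AGG": "R", "GGG": "G"}


def residue_number(offset_from_start: int) -> int:
    """1-based amino acid number for a 0-based nucleotide offset from the start codon."""
    return offset_from_start // 3 + 1


def describe(offset: int, ref_codon: str, alt_codon: str) -> str:
    """Format a substitution as <reference residue><position><variant residue>."""
    return f"{CODON_TABLE[ref_codon]}{residue_number(offset)}{CODON_TABLE[alt_codon]}"


# Assumption: offsets are zero-based, as explained in the lead-in.
changes = [describe(540, "TCT", "ACT"), describe(555, "AGG", "GGG")]
```

Under that assumption the two substitutions are written S181T and R186G, in agreement with the amino acid positions given in the text.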
An FKBP-L polypeptide or peptide may include natural and/or chemically synthesized or artificial FKBP-L peptides, peptide mimetics, modified peptides (e.g., phosphopeptides, cyclic peptides, peptides containing D- and unnatural amino-acids, stapled peptides, peptides containing radiolabels), or peptides linked to antibodies, carbohydrates, monosaccharides, oligosaccharides, polysaccharides, glycolipids, heterocyclic compounds, nucleosides or nucleotides or parts thereof, and/or small organic or inorganic molecules (e.g., peptides modified with PEG or other stabilizing groups). Thus, the FKBP-L (poly)peptides of the invention also include chemically modified peptides or isomers and racemic forms.
As described herein, the methods and therapeutic agents for use according to the present invention may utilize a full-length FKBP-L polypeptide, or biologically active fragments of the polypeptide. Thus, certain embodiments of the present invention comprise a FKBP-L derivative which comprises or consists of a biologically active portion of the N-terminal amino acid sequence of naturally occurring FKBP-L. This sequence may comprise, consist essentially of, or consist of an active N-terminal portion of the FKBP-L polypeptide. In alternate embodiments, the polypeptide may comprise, consist essentially of, or consist of amino acids 1 to 57 of SEQ ID NO: 790 (i.e., SEQ ID NO: 796), or amino acids 34-57 of SEQ ID NO:790 (i.e., SEQ ID NO: 792), or amino acids 35-57 of SEQ ID NO:790 (i.e. SEQ ID NO:791). Or, the peptide may comprise, consist essentially of, or consist of a sequence that comprises at least 18 contiguous amino acids of SEQ ID NO: 792 (e.g., SEQ ID NOs: 798, 800, or 807). In alternate embodiments, the polypeptide used in the methods and compositions of the present invention may comprise, consist essentially of, or consist of one of the amino acid sequences shown in any one of SEQ ID NOs: 789-811. In certain embodiments, the present invention comprises a biologically active fragment of FKBP-L, wherein said polypeptide includes no more than 200 consecutive amino acids of the amino acid sequence shown in SEQ ID NO:789, or SEQ ID NO:790, with the proviso that said polypeptide includes the amino acid sequence shown as SEQ ID NO:791.
As described herein, the peptides may be modified (e.g., to contain PEG and/or His tags, albumin conjugates or other modifications). Or, the present invention may comprise isolated polypeptides having a sequence at least 70%, or 75%, or 80%, or 85%, or 90%, or 95%, or 96%, or 97%, or 98%, or 99% identical to the amino acid sequences as set forth in any one of SEQ ID NOS: 789-811, including in particular sequences at least 70%, or 75%, or 80%, or 85%, or 90%, or 95%, or 96%, or 97%, or 98%, or 99% identical to the amino acid sequence shown as SEQ ID NO:791. In this regard, deliberate amino acid substitutions may be made in the peptide on the basis of similarity in polarity, charge, solubility, hydrophobicity, or hydrophilicity of the residues, as long as the specific biological activity (i.e. function) of the peptide is retained. The FKBP-L peptide may be of variable length as long as it retains its biological activity and can be used according to the various aspects of the invention described above.
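As a simple illustration of a percentage identity calculation, the sketch below compares the 23-mer of SEQ ID NO:791 with a hypothetical variant carrying one conservative substitution. It assumes an ungapped, position-by-position comparison of equal-length sequences; the specification does not fix a particular alignment method, and the variant sequence is invented for the example.

```python
def percent_identity(a: str, b: str) -> float:
    """Ungapped identity between two equal-length sequences, as a percentage."""
    if len(a) != len(b):
        raise ValueError("this sketch only compares sequences of equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


ALM201 = "IRQQPRDPPTETLELEVSPDPAS"   # SEQ ID NO:791 as recited in the text
# Hypothetical variant: conservative Ser -> Thr substitution at the last position.
variant = "IRQQPRDPPTETLELEVSPDPAT"

identity = percent_identity(ALM201, variant)
meets_90 = identity >= 90.0
```

With 22 of 23 positions matching, the hypothetical variant is about 95.7% identical and so would fall within the "at least 90% identical" range on this simplified measure.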
Certain regions of the N-terminus of the FKBP-L protein may display biological activity, therefore the invention encompasses biologically active fragments of FKBP-L, in particular any fragment which exhibits biological activity substantially equivalent to that of the 23-mer peptide (SEQ ID NO:791). In certain embodiments, the biological activity of the FKBP-L 23mer peptide (SEQ ID NO:791; referred to herein also as ALM201) is exhibited as a reduction in expression of mesenchymal markers in Kuramochi cells or OVCAR3 cisplatin resistant cells. In further embodiments, the biological activity of the FKBP-L 23mer peptide (SEQ ID NO:791; referred to herein also as ALM201) is exhibited as a reversal of the mesenchymal phenotype in OVCAR3 or OVCAR4 cisplatin resistant cells.
A “fragment” of a FKBP-L polypeptide means an isolated peptide comprising a contiguous sequence of at least 6 amino acids, preferably at least 10 amino acids, or at least 15 amino acids, or at least 20 amino acids, or at least 23 amino acids of FKBP-L. The “fragment” preferably contains no more than 50, or no more than 45, or no more than 40, or no more than 35, or no more than 30, or no more than 25, or no more than 23 contiguous amino acids of FKBP-L. Preferred fragments for use according to the invention are those having the amino acid sequences shown in any one of SEQ ID Nos: 792-811, or minor sequence variants thereof (e.g. variants containing one or more conservative amino acid substitutions).
In certain embodiments the EMT pathway inhibitor (specifically) inhibits Vimentin, AXL, TWIST1, SNAIL and/or SLUG.
The EMT pathway inhibitor may be used together with a platinum-based chemotherapeutic agent as a first line treatment. Alternatively, the EMT pathway inhibitor may be used as a second line treatment after treatment with a platinum-based chemotherapeutic agent.
By “SRC pathway inhibitor” is meant a therapeutic agent, such as a pharmaceutical drug, that acts to inhibit signalling by the SRC pathway. The inhibitor may be specific for the SRC pathway. Thus, in certain embodiments the SRC pathway inhibitor is not a multi-pathway inhibitor. In further embodiments the SRC pathway inhibitor is a (specific) inhibitor of an SRC family kinase.
In certain embodiments the SRC pathway inhibitor is selected from Table J. In further embodiments the SRC pathway inhibitor is not Dasatinib and/or pazopanib (hydrochloride), which are multi-targeted pathway inhibitors.
According to all aspects of the invention the platinum-based chemotherapeutic agent may comprise one or more of, or be selected from carboplatin, cisplatin, oxaliplatin, satraplatin, picoplatin, Nedaplatin, Triplatin and/or Lipoplatin.
According to all aspects of the invention the taxane may comprise Paclitaxel and/or Docetaxel. In specific embodiments the therapeutic agent is a taxane and the cancer is prostate cancer. The prostate cancer may be metastatic prostate cancer, in particular de novo metastatic prostate cancer.
The inhibitors described herein may act by inhibiting the expression (reducing the levels) and/or the function of one (or more) targets. Inhibition of function can include inhibiting interactions with one (or more) binding partners.
By “anti-angiogenic therapeutic agent” is meant a therapeutic agent, such as a pharmaceutical drug, that acts to inhibit angiogenesis. Examples of anti-angiogenic therapeutic agents include VEGF pathway-targeted therapeutic agents, including multi-targeted pathway inhibitors (VEGF/PDGF/FGF/EGFR/FLT-3/c-KIT), Angiopoietin-TIE2 pathway inhibitors, endogenous angiogenic inhibitors, and immunomodulatory agents. VEGF specific inhibitors include, but are not limited to, Bevacizumab (Avastin), Aflibercept (VEGF Trap), IMC-1121B (Ramucirumab). Multi-targeted pathway inhibitors include, but are not limited to, Imatinib (Gleevec), Sorafenib (Nexavar), Gefitinib (Iressa), Sunitinib (Sutent), Erlotinib, Tivozanib, Cediranib (Recentin), Pazopanib (Votrient), BIBF 1120 (Vargatef), Dovitinib, Semaxanib (Sugen), Axitinib (AG013736), Vandetanib (Zactima), Nilotinib (Tasigna), Dasatinib (Sprycel), Vatalanib, Motesanib, ABT-869, TKI-258. Angiopoietin-TIE2 pathway inhibitors include, but are not limited to, AMG-386 (Trebananib), PF-4856884 (CVX-060), CEP-11981, CE-245677, MEDI-3617, CVX-241, Trastuzumab (Herceptin). Endogenous angiogenic inhibitors include, but are not limited to, Thrombospondin, Endostatin, Tumstatin, Canstatin, Arrestin, Angiostatin, Vasostatin, Interferon alpha. Immunomodulatory agents include, but are not limited to, Thalidomide and Lenalidomide. The inhibitor may be specific for angiogenesis processes or pathways. In certain embodiments the anti-angiogenic therapeutic agent is not a multi-pathway inhibitor.
In certain embodiments the anti-angiogenic therapeutic agent is selected from Table K.
According to all aspects of the invention the method may further comprise obtaining a test sample from the subject. In certain embodiments the methods involving determining gene expression are in vitro methods performed on an isolated sample.
According to all aspects of the invention samples may be of any suitable form including any material, biological fluid, tissue, or cell obtained or otherwise derived from an individual. Typically, the sample includes cancer cells or genetic material (DNA or RNA) derived from the cancer cells, to include cell-free genetic material (e.g. found in the peripheral blood). In specific embodiments the sample comprises, consists essentially of or consists of a formalin-fixed paraffin-embedded biopsy sample. In further embodiments the sample comprises, consists essentially of or consists of a fresh/frozen (FF) sample. The sample may comprise, consist essentially of or consist of tumour (cancer) tissue, optionally ovarian tumour (cancer) tissue. The sample may comprise, consist essentially of or consist of tumour (cancer) cells, optionally ovarian tumour (cancer) cells. The tissue sample may be obtained by any suitable technique.
Examples include a biopsy procedure, optionally a fine needle aspirate biopsy procedure. Body fluid samples may also be utilised. Suitable sample types include blood (including whole blood, leukocytes, peripheral blood mononuclear cells, buffy coat, plasma, and serum), sputum, tears, mucus, nasal washes, nasal aspirate, breath, urine, semen, saliva, meningeal fluid, amniotic fluid, glandular fluid, lymph fluid, nipple aspirate, bronchial aspirate, synovial fluid, joint aspirate, ascites, cells, a cellular extract, and cerebrospinal fluid. This also includes experimentally separated fractions of all of the preceding. For example, a blood sample can be fractionated into serum or into fractions containing particular types of blood cells, such as red blood cells or white blood cells (leukocytes). If desired, a sample can be a combination of samples from an individual, such as a combination of a tissue and fluid sample. The term “sample” also includes materials containing homogenized solid material, such as from a stool sample, a tissue sample, or a tissue biopsy, for example. The term “sample” also includes materials derived from a tissue culture or a cell culture, including tissue resection and biopsy samples. Example methods for obtaining a sample include, e.g., phlebotomy, swab (e.g., buccal swab). Samples can also be collected, e.g., by micro dissection (e.g., laser capture micro dissection (LCM) or laser micro dissection (LMD)), bladder wash, smear (e.g., a PAP smear), or ductal lavage. A “sample” obtained or derived from an individual includes any such sample that has been processed in any suitable manner after being obtained from the individual. The methods of the invention as defined herein may begin with an obtained sample and thus do not necessarily incorporate the step of obtaining the sample from the patient. As used herein, the term “patient” includes human and non-human animals. The preferred patient for treatment is a human. 
“Patient,” “individual” and “subject” are used interchangeably herein.
According to all aspects of the invention the cancer may be ovarian, prostate, colon or lung cancer or melanoma. In certain embodiments the ovarian cancer is serous ovarian cancer. In specific embodiments the ovarian cancer is high grade serous ovarian cancer. In certain embodiments the lung cancer is non-small cell lung cancer and/or lung adenocarcinoma. The cancer may also be leukaemia, brain cancer, glioblastoma, head and neck cancer, liver cancer, stomach cancer, colorectal cancer, thyroid cancer, neuroendocrine cancer, gastrointestinal stromal tumors (GIST), gastric cancer, lymphoma, throat cancer, breast cancer, skin cancer, melanoma, multiple myeloma, sarcoma, cervical cancer, testicular cancer, bladder cancer, endocrine cancer, endometrial cancer, esophageal cancer, glioma (optionally lower grade glioma), lymphoma, neuroblastoma, osteosarcoma, pancreatic cancer, pituitary cancer, renal cancer (optionally renal clear cell cancer and/or renal papillary cancer), and the like. As used herein, colorectal cancer encompasses cancers that may involve cancer in tissues of both the rectum and other portions of the colon as well as cancers that may be individually classified as either colon cancer or rectal cancer. In certain embodiments the cancer is not prostate cancer and/or glioblastoma.
Optionally, according to all aspects, the method may comprise measuring the expression levels of at least around 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46 or each of the biomarkers listed in Table A and/or Table B. Combinations from Tables A and B are also envisaged. “Around” may mean plus or minus five. “Corresponding” may mean that the probe hybridizes to the gene/biomarker or can be used to detect expression of the gene/biomarker. Smaller gene signatures may be based around those markers having greater weight values in Tables A and B and thus, in some embodiments, sub-signatures are generated by taking a selection of the larger signatures in numerical order. Thus, for example, a 5 gene signature may be composed of the first 5 genes listed in Table A and/or Table B. It could also be composed of the 5 genes with the highest weight values from Tables A and B combined. In other embodiments, the gene signatures may comprise one of the markers with the highest weight values (e.g. selected from the top 2, 3, 4, 5, 6, 7, 8, 9, or 10 markers), either alone or combined with other markers. In other embodiments the methods may comprise measuring the expression levels of one or more up to all of the following biomarkers: GJB2, CDH11, GFPT2, COL10A1, ANGPTL2, THBS1, RAB31, THBS2, INHBA, MMP14, VCAN, PLAU, FAP, FN1.
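The derivation of a smaller sub-signature from the markers with the highest weight values can be sketched as follows. The gene symbols are drawn from the list above, but the weight values are hypothetical placeholders and do not reproduce Table A or Table B; ranking on the signed weight value (rather than, say, its absolute value) is an assumption of the sketch.

```python
from typing import Dict, List


def top_k_signature(weights: Dict[str, float], k: int) -> List[str]:
    """Return the k biomarkers with the highest weight values, highest first."""
    if k < 1 or k > len(weights):
        raise ValueError("k must be between 1 and the number of biomarkers")
    ranked = sorted(weights, key=lambda gene: weights[gene], reverse=True)
    return ranked[:k]


# Hypothetical weights for illustration only.
example_weights = {
    "GJB2": 0.12,
    "CDH11": 0.30,
    "GFPT2": 0.05,
    "COL10A1": 0.22,
    "ANGPTL2": 0.18,
    "THBS1": 0.09,
}

three_gene_signature = top_k_signature(example_weights, 3)
```

With these placeholder weights the three-gene sub-signature is CDH11, COL10A1 and ANGPTL2; the same function would be applied to the combined weights of Tables A and B to obtain, for example, a 5 gene signature.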
Optionally, according to all aspects, the methods may comprise measuring the expression levels of one or more up to all of the following biomarkers: TMEM200A, GJB2, MMP13, GFPT2, POSTN, BICC1, MRVI1, COL11A1, IGFL2, NTM, BGN, COL10A1, RAB31, ANGPTL2, PLAU, COL8A1, MIR1245, POLD2, NKD2, FZD1, COPZ2, ITGA5, VGLL3, INHBA, MMP14, THBS2, RUNX2, TIMP3, SFRP2, COL1A2, COL5A2, SERPINF1, KIF26B, ALPK2, CTSK, LOXL1 and FAP (optionally together with one or more up to all of the following biomarkers: CDH11, PMP22, LUM, COL3A1, VCAN, TNFAIP6, MMP2 and FN1); and/or one or more up to all of the following biomarkers: GJB2, GFPT2, COL10A1, ANGPTL2, THBS1, RAB31, THBS2, INHBA, MMP14, PLAU and FAP (optionally together with one or more up to all of the following biomarkers: CDH11, VCAN, COL5A1 and FN1).
Optionally, according to all aspects, the methods may comprise measuring the expression levels of one or more up to all of the following biomarkers: TMEM200A, GJB2, MMP13, GFPT2, POSTN, BICC1, CDH11, MRVI1, PMP22, COL11A1, IGFL2, LUM, NTM, BGN, COL3A1, COL10A1, RAB31, ANGPTL2, PLAU, COL8A1, MIR1245, POLD2, NKD2, FZD1, COPZ2, ITGA5, VGLL3, INHBA, MMP14, VCAN, THBS2, RUNX2, TIMP3, SFRP2, COL1A2, COL5A2, SERPINF1, KIF26B, TNFAIP6, ALPK2, CTSK, LOXL1 and FAP (optionally together with one or more up to all of the following biomarkers: MMP2 and FN1); and/or one or more up to all of the following biomarkers: GJB2, CDH11, GFPT2, COL10A1, ANGPTL2, THBS1, RAB31, THBS2, INHBA, MMP14, VCAN, PLAU, COL5A1 and FAP (optionally together with FN1).
In further embodiments the methods may comprise measuring the expression levels of one or more, up to all of the biomarkers in Table 13 with an LCI C-index of more than 0.5. In yet further embodiments the methods may comprise measuring the expression levels of one or more, up to all of the top 10 ranked biomarkers in Table 14 and/or Table 15. In specific embodiments the methods may comprise measuring the expression levels of the sets of 22, 19, 17, 13, 11, 9, 8, 7, 6 and 5 biomarkers listed below. Each of these signatures has been shown to give high levels of performance in identifying the relevant molecular subgroup of cancer:
Combinations of these signatures are also envisaged, for example to generate suitable 2, 3, 4, 10, 12, 14, 16, 18, 20 and 21 gene signatures. Thus, for example, a 10 gene signature may be formed by adding a single gene to the 9 gene signature. This gene could be selected from the additional genes included in another signature, for example in the 11 gene signature. Alternatively it could be derived from elsewhere and tested according to the methods known in the art and described herein.
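Purely as a sketch, forming a larger signature by adding a gene drawn from another signature could be expressed as shown below. The gene names are placeholders, since the actual 9 and 11 gene signatures are not reproduced in this example, and the helper name `extend_signature` is hypothetical.

```python
from typing import List


def extend_signature(base: List[str], donor: List[str], n_extra: int = 1) -> List[str]:
    """Append n_extra genes that occur in donor but not in base, in donor order."""
    extras = [gene for gene in donor if gene not in base]
    if len(extras) < n_extra:
        raise ValueError("donor signature does not contain enough additional genes")
    return base + extras[:n_extra]


# Placeholder gene names: stand-ins for the real 9 and 11 gene signatures.
signature_9 = [f"GENE_{i}" for i in range(1, 10)]
signature_11 = signature_9[:7] + ["GENE_X", "GENE_Y", "GENE_Z", "GENE_W"]

signature_10 = extend_signature(signature_9, signature_11)
print(signature_10)
```

The resulting candidate signature would then be tested according to the methods known in the art and described herein before use.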
The expression levels of the biomarkers in these sets may be measured using the probesets listed in Tables E, F and L as appropriate for each biomarker.
In particular embodiments the at least 1 biomarker selected from Table B is not COL5A1. In certain embodiments the at least 1 biomarker selected from Table A or Table B is not one or more up to all of ANGPTL2, CDH11, COL1A2, COL8A1, LOXL1, MMP14, POLD2 and/or TIMP3. Additionally or alternatively, in certain embodiments the at least 1 biomarker selected from Table A or Table B is not one or more up to all of CDH11, PMP22, LUM, COL3A1, VCAN, TNFAIP6, MMP2, FN1 and/or COL5A1. In further embodiments the at least 1 biomarker selected from Table A or Table B is not MMP2 and/or FN1. In specific embodiments the at least 1 biomarker selected from Table A or Table B does not consist of from 1 to 63 of the biomarkers shown in Table M. In further specific embodiments the EMT pathway inhibitor is ALM201 and the at least 1 biomarker selected from Table A or Table B does not consist of from 1 to 63 of the biomarkers shown in Table M.
On the basis of the information provided herein other biomarker signatures may be derived by the skilled person for use according to the invention. By using one or more of the biomarker signatures described herein (such as the 15 or 45 gene signature) the skilled person could classify a sample set into those positive and negative for the biomarker signature. The skilled person could then derive further signatures using methods described herein or known in the art (such as partial least squares paired with forward feature selection) that reproduce the classification ability of the biomarker signatures described herein. Alternatively, the skilled person could carry out the gene expression profiling and hierarchical clustering described herein and in WO2012/167278 to identify samples that fall within the EMT/Angio-Immune/MAPK pathway molecular subgroup of cancer identified by the present inventors. The skilled person could then use methods such as partial least squares paired with forward feature selection to derive further signatures that are able to detect the EMT/Angio-Immune/MAPK pathway molecular subgroup of cancer. The further signatures could be generated on an initial training dataset and then tested in a subsequent dataset for their ability to identify the EMT/Angio-Immune/MAPK pathway molecular subgroup of cancer or their classification ability.
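A minimal sketch of deriving a further signature on a training dataset and testing it on a subsequent dataset is given below. It is illustrative only: a simple nearest-centroid classifier is used in place of partial least squares so that the example is self-contained, and the expression values, labels and function names are hypothetical.

```python
from typing import Dict, List, Sequence

Sample = Dict[str, float]  # gene name -> expression value


def centroid_predict(train: Sequence[Sample], labels: Sequence[int],
                     genes: List[str], sample: Sample) -> int:
    """Return 1 (subgroup positive) or 0 (negative) by nearest class centroid."""
    distance = {}
    for cls in (0, 1):
        members = [s for s, y in zip(train, labels) if y == cls]
        total = 0.0
        for gene in genes:
            mean = sum(m[gene] for m in members) / len(members)
            total += (sample[gene] - mean) ** 2
        distance[cls] = total
    return 1 if distance[1] < distance[0] else 0


def accuracy(train: Sequence[Sample], labels: Sequence[int], genes: List[str],
             test: Sequence[Sample], test_labels: Sequence[int]) -> float:
    """Fraction of test samples whose predicted label matches the known label."""
    hits = sum(centroid_predict(train, labels, genes, s) == y
               for s, y in zip(test, test_labels))
    return hits / len(test)


def forward_select(train: Sequence[Sample], labels: Sequence[int],
                   candidates: List[str], max_genes: int) -> List[str]:
    """Greedy forward selection: add the gene giving the best training accuracy."""
    chosen: List[str] = []
    while len(chosen) < max_genes:
        remaining = [g for g in candidates if g not in chosen]
        if not remaining:
            break
        best = max(remaining,
                   key=lambda g: accuracy(train, labels, chosen + [g], train, labels))
        chosen.append(best)
    return chosen


# Hypothetical training data; labels stand for the classification given by an
# existing biomarker signature (1 = positive, 0 = negative).
train_set = [
    {"FAP": 8.0, "FN1": 5.0, "PLAU": 3.0},
    {"FAP": 7.5, "FN1": 2.0, "PLAU": 3.5},
    {"FAP": 2.0, "FN1": 5.5, "PLAU": 3.2},
    {"FAP": 2.5, "FN1": 1.5, "PLAU": 3.4},
]
train_labels = [1, 1, 0, 0]

# Hypothetical independent dataset used to test the derived signature.
test_set = [
    {"FAP": 7.0, "FN1": 3.0, "PLAU": 3.1},
    {"FAP": 3.0, "FN1": 4.0, "PLAU": 3.3},
]
test_labels = [1, 0]

derived = forward_select(train_set, train_labels, ["FN1", "PLAU", "FAP"], max_genes=1)
test_accuracy = accuracy(train_set, train_labels, derived, test_set, test_labels)
print(derived, test_accuracy)
```

In this toy data only one gene separates the two classes, so a single selection step suffices; with real expression data the stopping criterion and the classifier would be chosen by the skilled person.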
Methods for determining the expression levels of the biomarkers are described in greater detail herein. Typically, the methods may involve contacting a sample obtained from a subject with a detection agent, such as primers/probes/antibodies (as discussed in detail herein) specific for the biomarker and detecting biomarker expression products.
According to all aspects of the invention the expression level of the gene or genes may be measured by any suitable method. Genes may also be referred to, interchangeably, as biomarkers. In certain embodiments the expression level is determined at the level of protein, RNA or epigenetic modification. The epigenetic modification may be DNA methylation.
The expression level may be determined by immunohistochemistry. By “Immunohistochemistry” is meant the detection of proteins in cells of a tissue sample by using a binding reagent such as an antibody or aptamer that binds specifically to the proteins. Accordingly, in a further aspect, the present invention relates to an antibody or aptamer that binds specifically to a protein product of at least one of the biomarkers described herein.
Antibodies useful for therapeutic and detection purposes as required herein may be of monoclonal or polyclonal origin. Fragments and derivative antibodies may also be utilised, to include without limitation Fab fragments, scFv, single domain antibodies, nanoantibodies, heavy chain antibodies, aptamers, highly constrained bicyclic peptides (“bicycles”) etc. which retain specific binding function and these are included in the definition of “antibody”. Such antibodies are useful in the methods of the invention. Therapeutic antibodies may be conjugated to a drug to form an antibody drug conjugate (ADC). Many such ADC systems are known in the art. For detection purposes, the antibodies may be used to measure the level of a particular protein, or in some instances one or more specific isoforms of a protein. The skilled person is well able to identify epitopes that permit specific isoforms to be discriminated from one another.
Methods for generating specific antibodies are known to those skilled in the art. Antibodies may be of human or non-human origin (e.g. rodent, such as rat or mouse) and be humanized etc. according to known techniques (Jones et al., Nature (1986) May 29-June 4; 321(6069):522-5; Roguska et al., Protein Engineering, 1996, 9(10):895-904; and Studnicka et al., Humanizing Mouse Antibody Frameworks While Preserving 3-D Structure. Protein Engineering, 1994, Vol. 7, pg 805).
In certain embodiments the expression level is determined using an antibody or aptamer conjugated to a label. By label is meant a component that permits detection, directly or indirectly. For example, the label may be an enzyme, optionally a peroxidase, or a fluorophore.
Where the antibody is conjugated to an enzyme a chemical composition may be used such that the enzyme catalyses a chemical reaction to produce a detectable product. The products of reactions catalyzed by appropriate enzymes can be, without limitation, fluorescent, luminescent, or radioactive or they may absorb visible or ultraviolet light. Examples of detectors suitable for detecting such detectable labels include, without limitation, x-ray film, radioactivity counters, scintillation counters, spectrophotometers, colorimeters, fluorometers, luminometers, and densitometers. In certain embodiments a secondary antibody is used and the expression level is then determined using an unlabeled primary antibody that binds to the target protein and a secondary antibody conjugated to a label, wherein the secondary antibody binds to the primary antibody.
Additional techniques for determining expression level at the level of protein include, for example, Western blot, immunoprecipitation, immunocytochemistry, mass spectrometry, ELISA and others (see ImmunoAssay: A Practical Guide, edited by Brian Law, published by Taylor & Francis, Ltd., 2005 edition). To improve specificity and sensitivity of an assay method based on immunoreactivity, monoclonal antibodies are often used because of their specific epitope recognition. Polyclonal antibodies have also been successfully used in various immunoassays because of their increased affinity for the target as compared to monoclonal antibodies.
Measuring mRNA in a biological sample may be used as a surrogate for detection of the level of the corresponding protein in the biological sample. Thus, the expression level of any of the genes described herein can also be detected by detecting the appropriate RNA. RNA from the sample may be converted into cDNA and the amount of the appropriate cDNA measured using any suitable method, for example via hybridization of (fluorescently labelled) probes. The amount of cDNA from the sample may then be compared with a reference amount of the relevant cDNA. cDNA based measurements may employ second generation sequencing technologies such as Illumina and Ion Torrent sequencing. Direct RNA measurements are also possible, for example using third generation sequencing technologies such as SMRT sequencing (Pacific Biosciences), nanopore sequencing and SeqLL (Helicos) sequencing.
Accordingly, in specific embodiments the expression level is determined by microarray, northern blotting, RNA-seq (RNA sequencing), in situ RNA detection or nucleic acid amplification. Nucleic acid amplification includes PCR and all variants thereof such as real-time and end point methods and qPCR. Other nucleic acid amplification techniques are well known in the art, and include methods such as NASBA, 3SR and Transcription Mediated Amplification (TMA). Other suitable amplification methods include the ligase chain reaction (LCR), selective amplification of target polynucleotide sequences (U.S. Pat. No. 6,410,276), consensus sequence primed polymerase chain reaction (U.S. Pat. No. 4,437,975), arbitrarily primed polymerase chain reaction (WO 90/06995), invader technology, strand displacement technology, and nick displacement amplification (WO 2004/067726). This list is not intended to be exhaustive; any nucleic acid amplification technique may be used provided the appropriate nucleic acid product is specifically amplified. Design of suitable primers and/or probes is within the capability of one skilled in the art. Various primer design tools are freely available to assist in this process such as the NCBI Primer-BLAST tool. Primers and/or probes may be at least 15, 16, 17, 18, 19, 20, 21, 22, 23, 24 or 25 (or more) nucleotides in length. mRNA expression levels may be measured by reverse transcription quantitative polymerase chain reaction (RT-PCR followed by qPCR). RT-PCR is used to create a cDNA from the mRNA. The cDNA may be used in a qPCR assay to produce fluorescence as the DNA amplification process progresses. By comparison to a standard curve, qPCR can produce an absolute measurement such as number of copies of mRNA per cell. Northern blots, microarrays, Invader assays, and RT-PCR combined with capillary electrophoresis have all been used to measure expression levels of mRNA in a sample. See Gene Expression Profiling: Methods and Protocols, Richard A. Shimkets, editor, Humana Press, 2004.
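The comparison to a standard curve mentioned above may, for example, be sketched as follows. The sketch assumes a linear relationship between the threshold cycle (Ct) and the base-10 logarithm of the template copy number; the dilution series, the Ct values and the function names are hypothetical and serve only to show the arithmetic.

```python
import math
from typing import List, Tuple


def fit_standard_curve(copies: List[float], ct: List[float]) -> Tuple[float, float]:
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    x = [math.log10(c) for c in copies]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(ct) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, ct))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct_value: float, slope: float, intercept: float) -> float:
    """Invert the standard curve to estimate the copy number of an unknown."""
    return 10 ** ((ct_value - intercept) / slope)


# Hypothetical ten-fold dilution series of a standard of known copy number.
standard_copies = [1e2, 1e3, 1e4, 1e5, 1e6]
standard_ct = [33.0, 29.7, 26.4, 23.1, 19.8]

slope, intercept = fit_standard_curve(standard_copies, standard_ct)
unknown_copies = copies_from_ct(28.05, slope, intercept)
print(round(slope, 3), round(intercept, 3), round(unknown_copies))
```

Expressing the result per cell would additionally require the number of cells contributing to the reaction, which is not modelled in this sketch.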
RNA-seq uses next-generation sequencing to measure changes in gene expression. RNA may be converted into cDNA or directly sequenced. Next generation sequencing techniques include pyrosequencing, SOLiD sequencing, Ion Torrent semiconductor sequencing, Illumina dye sequencing, single-molecule real-time sequencing or DNA nanoball sequencing.
In situ RNA detection involves detecting RNA without extraction from tissues and cells. In situ RNA detection includes in situ hybridization (ISH) which uses a labeled (e.g. radio labelled, antigen labelled or fluorescence labelled) probe (complementary DNA or RNA strand) to localize a specific RNA sequence in a portion or section of tissue, or in the entire tissue (whole mount ISH), or in cells. A branched DNA assay can also be used for RNA in situ hybridization assays with single molecule sensitivity. This approach includes ViewRNA assays.
Thus, in a further aspect the present invention relates to a kit comprising one or more oligonucleotide probes specific for an RNA product of at least 1 biomarker from Table A or Table B.
RNA expression may be determined by hybridization of RNA to a set of probes. The probes may be arranged in an array. Microarray platforms include those manufactured by companies such as Affymetrix, Illumina and Agilent. Examples of microarray platforms manufactured by Affymetrix include the U133 Plus2 array, the Almac proprietary Xcel™ array and the Almac proprietary Cancer DSAs®. In specific embodiments a sample of target nucleic acids is first prepared from the initial nucleic acid sample being assayed, where preparation may include labeling of the target nucleic acids with a label, e.g., a member of a signal producing system. Following target nucleic acid sample preparation, the sample is contacted with the array under hybridization conditions, whereby complexes are formed between target nucleic acids that are complementary to probe sequences attached to the array surface. The presence of hybridized complexes is then detected, either qualitatively or quantitatively. Specific hybridization technology which may be practiced to generate the expression profiles employed in the subject methods includes the technology described in U.S. Pat. Nos. 5,143,854; 5,288,644; 5,324,633; 5,432,049; 5,470,710; 5,492,806; 5,503,980; 5,510,270; 5,525,464; 5,547,839; 5,580,732; 5,661,028; 5,800,992; the disclosures of which are herein incorporated by reference; as well as WO 95/21265; WO 96/31622; WO 97/10365; WO 97/27317; EP 373 203; and EP 785 280. In these methods, an array of “probe” nucleic acids that includes a probe for each of the biomarkers whose expression is being assayed is contacted with target nucleic acids as described above. Contact is carried out under hybridization conditions, e.g., stringent hybridization conditions as described above, and unbound nucleic acid is then removed. 
The resultant pattern of hybridized nucleic acids provides information regarding expression for each of the biomarkers that have been probed, where the expression information is in terms of whether or not the gene is expressed and, typically, at what level, where the expression data, i.e., expression profile, may be both qualitative and quantitative.
In certain embodiments, measuring the expression levels of the at least 1 biomarker selected from Table A or Table B comprises contacting the sample with a set of nucleic acid probes or primers that bind to the at least 1 biomarker and detecting binding of the set of nucleic acid probes or primers to the at least 1 biomarker(s) by microarray, northern blotting, or nucleic acid amplification.
The methods described herein may further comprise extracting total nucleic acid or RNA from the sample. Suitable methods are known in the art and include use of commercially available kits such as RNeasy and GeneJET RNA purification kit.
In specific embodiments, expression of the at least one gene may be determined using one or more probes described herein.
These probes may also be incorporated into the kits of the invention. The probe sequences may also be used in order to design primers for detection of expression, for example by RT-PCR. Such primers may also be included in the kits of the invention.
The invention also relates to a system or device for performing a method as described herein.
Thus, the present invention relates to a system or test kit for selecting a treatment for a subject having a cancer, comprising:
In certain embodiments:
In yet a further aspect, the present invention relates to a system or test kit for predicting the responsiveness of a subject with cancer to a therapeutic agent comprising:
In certain embodiments the subject is classified as
The invention also relates to a system or test kit for determining the clinical prognosis of a subject with cancer comprising:
In certain embodiments the subject is classified as having a poor prognosis if the sample is positive for the biomarker signature and/or having a good prognosis if the sample is negative for the biomarker signature.
The system or test kit may further comprise a display for the output from the processor.
By testing device is meant a combination of components that allows the expression level of a gene to be determined. The components may include any of those described above with respect to the methods for determining expression level at the level of protein, RNA or epigenetic modification. For example the components may be antibodies, primers, detection agents and so on. Components may also include one or more of the following: microscopes, microscope slides, x-ray film, radioactivity counters, scintillation counters, spectrophotometers, colorimeters, fluorometers, luminometers, and densitometers.
The invention also relates to a computer application or storage medium comprising a computer application as defined above.
In certain example embodiments, provided is a computer-implemented method, system, and a computer program product for selecting a treatment for a subject having a cancer and/or prediction of the responsiveness of a subject with cancer to a therapeutic agent and/or determining the clinical prognosis of a subject with cancer, in accordance with the methods described herein. For example, the computer program product may comprise a non-transitory computer-readable storage device having computer-readable program instructions embodied thereon that cause the computer to:
In certain example embodiments, the computer-implemented method, system, and computer program product may be embodied in a computer application, for example, that operates and executes on a computing machine and a module. When executed, the application may select whether to administer a treatment to a subject having a cancer and/or predict the responsiveness of a subject with cancer to a therapeutic agent and/or determine the clinical prognosis of a subject with cancer, in accordance with the example embodiments described herein.
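As a non-limiting sketch of how such an application might classify a sample, the example below computes a signature score and maps it to a signature status and a prognosis. The weighted-sum scoring rule, the bias and threshold parameters, the weights and the expression values are all assumptions made for illustration; only the mapping of a positive signature to a poor prognosis and a negative signature to a good prognosis is taken from the description above.

```python
from typing import Dict


def signature_score(expression: Dict[str, float], weights: Dict[str, float],
                    bias: float = 0.0) -> float:
    """Weighted sum of expression values over the genes of the signature."""
    missing = [gene for gene in weights if gene not in expression]
    if missing:
        raise KeyError(f"expression values missing for: {missing}")
    return bias + sum(weights[gene] * expression[gene] for gene in weights)


def signature_status(score: float, threshold: float) -> str:
    """A sample is called positive when its score exceeds the threshold."""
    return "positive" if score > threshold else "negative"


def prognosis(status: str) -> str:
    """Positive for the signature -> poor prognosis; negative -> good prognosis."""
    return "poor" if status == "positive" else "good"


# Hypothetical weights, expression values and threshold (illustration only).
weights = {"GJB2": 0.5, "FAP": 0.25, "FN1": -0.25}
sample_expression = {"GJB2": 6.0, "FAP": 8.0, "FN1": 4.0}

score = signature_score(sample_expression, weights)
status = signature_status(score, threshold=3.0)
print(score, status, prognosis(status))
```

The output of such a routine could be passed to the display of the system or test kit.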
As used herein, the computing machine may correspond to any computers, servers, embedded systems, or computing systems. The module may comprise one or more hardware or software elements configured to facilitate the computing machine in performing the various methods and processing functions presented herein. The computing machine may include various internal or attached components such as a processor, system bus, system memory, storage media, input/output interface, and a network interface for communicating with a network, for example.
The computing machine may be implemented as a conventional computer system, an embedded controller, a laptop, a server, a customized machine, any other hardware platform, such as a laboratory computer or device, for example, or any combination thereof. The computing machine may be a distributed system configured to function using multiple computing machines interconnected via a data network or bus system, for example.
The processor may be configured to execute code or instructions to perform the operations and functionality described herein, manage request flow and address mappings, and to perform calculations and generate commands. The processor may be configured to monitor and control the operation of the components in the computing machine. The processor may be a general purpose processor, a processor core, a multiprocessor, a reconfigurable processor, a microcontroller, a digital signal processor (“DSP”), an application specific integrated circuit (“ASIC”), a graphics processing unit (“GPU”), a field programmable gate array (“FPGA”), a programmable logic device (“PLD”), a controller, a state machine, gated logic, discrete hardware components, any other processing unit, or any combination or multiplicity thereof. The processor may be a single processing unit, multiple processing units, a single processing core, multiple processing cores, special purpose processing cores, co-processors, or any combination thereof. According to certain example embodiments, the processor, along with other components of the computing machine, may be a virtualized computing machine executing within one or more other computing machines.
The system memory may include non-volatile memories such as read-only memory (“ROM”), programmable read-only memory (“PROM”), erasable programmable read-only memory (“EPROM”), flash memory, or any other device capable of storing program instructions or data with or without applied power. The system memory may also include volatile memories such as random access memory (“RAM”), static random access memory (“SRAM”), dynamic random access memory (“DRAM”), and synchronous dynamic random access memory (“SDRAM”). Other types of RAM also may be used to implement the system memory. The system memory may be implemented using a single memory module or multiple memory modules. While the system memory may be part of the computing machine, one skilled in the art will recognize that the system memory may be separate from the computing machine without departing from the scope of the subject technology. It should also be appreciated that the system memory may include, or operate in conjunction with, a non-volatile storage device such as the storage media.
The storage media may include a hard disk, a floppy disk, a compact disc read only memory (“CD-ROM”), a digital versatile disc (“DVD”), a Blu-ray disc, a magnetic tape, a flash memory, other non-volatile memory device, a solid state drive (“SSD”), any magnetic storage device, any optical storage device, any electrical storage device, any semiconductor storage device, any physical-based storage device, any other data storage device, or any combination or multiplicity thereof. The storage media may store one or more operating systems, application programs and program modules such as the module, data, or any other information. The storage media may be part of, or connected to, the computing machine. The storage media may also be part of one or more other computing machines that are in communication with the computing machine, such as servers, database servers, cloud storage, network attached storage, and so forth.
The module may comprise one or more hardware or software elements configured to facilitate the computing machine in performing the various methods and processing functions presented herein. The module may include one or more sequences of instructions stored as software or firmware in association with the system memory, the storage media, or both. The storage media may therefore represent examples of machine or computer readable media on which instructions or code may be stored for execution by the processor. Machine or computer readable media may generally refer to any medium or media used to provide instructions to the processor. Such machine or computer readable media associated with the module may comprise a computer software product. It should be appreciated that a computer software product comprising the module may also be associated with one or more processes or methods for delivering the module to the computing machine via a network, any signal-bearing medium, or any other communication or delivery technology. The module may also comprise hardware circuits or information for configuring hardware circuits such as microcode or configuration information for an FPGA or other PLD.
The input/output (“I/O”) interface may be configured to couple to one or more external devices, to receive data from the one or more external devices, and to send data to the one or more external devices. Such external devices along with the various internal devices may also be known as peripheral devices. The I/O interface may include both electrical and physical connections for operably coupling the various peripheral devices to the computing machine or the processor. The I/O interface may be configured to communicate data, addresses, and control signals between the peripheral devices, the computing machine, or the processor. The I/O interface may be configured to implement any standard interface, such as small computer system interface (“SCSI”), serial-attached SCSI (“SAS”), fiber channel, peripheral component interconnect (“PCI”), PCI express (PCIe), serial bus, parallel bus, advanced technology attached (“ATA”), serial ATA (“SATA”), universal serial bus (“USB”), Thunderbolt, FireWire, various video buses, and the like. The I/O interface may be configured to implement only one interface or bus technology.
Alternatively, the I/O interface may be configured to implement multiple interfaces or bus technologies. The I/O interface may be configured as part of, all of, or to operate in conjunction with, the system bus. The I/O interface may include one or more buffers for buffering transmissions between one or more external devices, internal devices, the computing machine, or the processor.
The I/O interface may couple the computing machine to various input devices including mice, touch-screens, scanners, electronic digitizers, sensors, receivers, touchpads, trackballs, cameras, microphones, keyboards, any other pointing devices, or any combinations thereof. The I/O interface may couple the computing machine to various output devices including video displays, speakers, printers, projectors, tactile feedback devices, automation control, robotic components, actuators, motors, fans, solenoids, valves, pumps, transmitters, signal emitters, lights, and so forth.
The computing machine may operate in a networked environment using logical connections through the network interface to one or more other systems or computing machines across the network. The network may include wide area networks (WAN), local area networks (LAN), intranets, the Internet, wireless access networks, wired networks, mobile networks, telephone networks, optical networks, or combinations thereof. The network may be packet switched, circuit switched, of any topology, and may use any communication protocol. Communication links within the network may involve various digital or analog communication media such as fiber optic cables, free-space optics, waveguides, electrical conductors, wireless links, antennas, radio-frequency communications, and so forth.
The processor may be connected to the other elements of the computing machine or the various peripherals discussed herein through the system bus. It should be appreciated that the system bus may be within the processor, outside the processor, or both. According to some embodiments, any of the processor, the other elements of the computing machine, or the various peripherals discussed herein may be integrated into a single device such as a system on chip (“SOC”), system on package (“SOP”), or ASIC device.
Embodiments may comprise a computer program that embodies the functions described and illustrated herein, wherein the computer program is implemented in a computer system that comprises instructions stored in a machine-readable medium and a processor that executes the instructions. However, it should be apparent that there could be many different ways of implementing embodiments in computer programming, and the embodiments should not be construed as limited to any one set of computer program instructions. Further, a skilled programmer would be able to write such a computer program to implement one or more of the disclosed embodiments described herein. Therefore, disclosure of a particular set of program code instructions is not considered necessary for an adequate understanding of how to make and use embodiments. Further, those skilled in the art will appreciate that one or more aspects of embodiments described herein may be performed by hardware, software, or a combination thereof, as may be embodied in one or more computing systems. Moreover, any reference to an act being performed by a computer should not be construed as being performed by a single computer as more than one computer may perform the act.
The example embodiments described herein can be used with computer hardware and software that perform the methods and processing functions described previously. The systems, methods, and procedures described herein can be embodied in a programmable computer, computer-executable software, or digital circuitry. The software can be stored on computer-readable media. For example, computer-readable media can include a floppy disk, RAM, ROM, hard disk, removable media, flash memory, memory stick, optical media, magneto-optical media, CD-ROM, etc. Digital circuitry can include integrated circuits, gate arrays, building block logic, field programmable gate arrays (FPGA), etc.
Reagents, tools, and/or instructions for performing the methods described herein can be provided in a kit. In certain embodiments, there is provided a kit for use in a method for selecting a treatment for a subject having a cancer as described herein and/or for use in a method for predicting the responsiveness of a subject with cancer to a therapeutic agent as described herein and/or for use in a method of determining a clinical prognosis for a subject with cancer as described herein.
The kit may include reagents for collecting a tissue sample from a patient, such as by biopsy, and reagents for processing the tissue. Thus, the kit may include suitable fixatives, such as formalin and embedding reagents, such as paraffin. The kit can also include one or more reagents for performing an expression level analysis, such as reagents for performing nucleic acid amplification, including RT-PCR and qPCR, NGS (RNA-seq), northern blot, proteomic analysis, or immunohistochemistry to determine expression levels of biomarkers in a sample of a patient. For example, primers for performing RT-PCR, probes for performing northern blot analyses or bDNA assays, and/or antibodies or aptamers, as discussed herein, for performing proteomic analysis such as Western blot, immunohistochemistry and ELISA analyses can be included in such kits. Appropriate buffers for the assays can also be included. Detection reagents required for any of these assays can also be included. The kits may be array or PCR based kits for example and may include additional reagents, such as a polymerase and/or dNTPs for example. The kits featured herein can also include an instruction sheet describing how to perform the assays for measuring expression levels.
The kit may include one or more primer pairs and/or probes complementary to at least one gene selected from Table A or Table B. In certain embodiments, according to all aspects of the invention, the kits may include one or more probes or primers (primer pairs) designed to hybridize with the target sequences or full sequences listed in Table A or Table B and thus permit expression levels to be determined. The probes and probesets identified in Table A and Table B may be employed according to all aspects of the invention.
The kits may include primers/primer pairs/probes/probesets to form any of the gene signatures specified herein.
The kits may also include one or more primer pairs complementary to a reference gene.
Such a kit can also include primer pairs complementary to at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, or 45 of the genes listed in Table A and/or primer pairs complementary to at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 of the genes listed in Table B.
There is provided a kit for use in a method for selecting a treatment for a subject having a cancer as described herein and/or for use in a method for predicting the responsiveness of a subject with cancer to a therapeutic agent as described herein and/or for use in a method of determining a clinical prognosis for a subject with cancer as described herein comprising one or more primers and/or primer pairs for amplifying and/or which specifically hybridize with at least one gene, full sequence or target sequence selected from Table A or Table B. There is also provided a kit for use in a method for selecting a treatment for a subject having a cancer as described herein and/or for use in a method for predicting the responsiveness of a subject with cancer to a therapeutic agent as described herein and/or for use in a method of determining a clinical prognosis for a subject with cancer as described herein comprising one or more probes that specifically hybridize with at least one gene, full sequence or target sequence selected from Table A or Table B.
The probes and probesets also constitute separate aspects of the invention. By “probeset” is meant the collection of probes designed to target (by hybridization) a single gene.
The invention also relates to a kit for use in the methods described herein comprising one or more antibodies or aptamers as described above and which are useful in the methods of the invention.
Informational material included in the kits can be descriptive, instructional, marketing or other material that relates to the methods described herein and/or the use of the reagents for the methods described herein. For example, the informational material of the kit can contain contact information, e.g., a physical address, email address, website, or telephone number, where a user of the kit can obtain substantive information about performing a gene expression analysis and interpreting the results.
The kit may further comprise a computer application or storage medium as described above.
The example systems, methods, and acts described in the embodiments presented previously are illustrative, and, in alternative embodiments, certain acts can be performed in a different order, in parallel with one another, omitted entirely, and/or combined between different example embodiments, and/or certain additional acts can be performed, without departing from the scope and spirit of various embodiments. Accordingly, such alternative embodiments are included in the scope of the invention as described herein.
Although specific embodiments have been described above in detail, the description is merely for purposes of illustration. It should be appreciated, therefore, that many aspects described above are not intended as required or essential elements unless explicitly stated otherwise.
Modifications of, and equivalent components or acts corresponding to, the disclosed aspects of the example embodiments, in addition to those described above, can be made by a person of ordinary skill in the art, having the benefit of the present disclosure, without departing from the spirit and scope of embodiments defined in the following claims, the scope of which is to be accorded the broadest interpretation so as to encompass such modifications and equivalent structures.
A. Heat map showing unsupervised hierarchical clustering of gene expression data using the 1040 most variable genes in the Edinburgh 265 high grade serous ovarian carcinomas. Gene expression across all samples is represented horizontally. Functional processes corresponding to each gene cluster are labelled along the right of the figure. Angio (blue), Immune (green), and Angio_Immune (red) subgroups are labelled for each of the sample clusters, and colour coded along the top as described in the legend box. A gene expression signature to detect each of the subgroups was generated. B. Kaplan-Meier Progression-Free Survival analysis of subgroups as defined by unsupervised clustering analysis of Edinburgh 265 HGSOC Samples. Additionally Kaplan-Meier overall survival analysis of subgroups as defined by unsupervised clustering analysis of Edinburgh 265 HGSOC Samples. C. Kaplan-Meier to show the prognostic utility of the Angio_Immune subgroup in HGSOC (PFS HR 1.4 (1.092 to 1.880) p=0.0256 and OS HR 1.4 (1.05-1.87) p=0.0224). D. Molecular subgroups are dynamic in the context of chemotherapy. The effect of chemotherapy treatment on 48 matched pre-chemotherapy and post-chemotherapy samples and analysis of subgroup switching based on assessment of the 3 gene signature scores (22 Angio signature; 63 Immune signature and 45 Angio_immune signature) generated from the treatment naive Discovery dataset.
A. Generation of Cisplatin resistant OVCAR3 cell lines. 10-day colony formation assay assessing sensitivity of OVCAR3-WT and OVCAR3-CP cells to increasing concentrations of cisplatin. B. Cisplatin sensitive and resistant A2780 cell line models were scored with each of the 3 gene signatures and scores plotted in a bar graph, Angio_Immune (p=0.0057), Angio (p=0.3959) and Immune (p=0.0124). C. Cisplatin sensitive and resistant OVCAR3 cell line models were scored with each of the 3 gene signatures and scores plotted, Angio_Immune (p=0.0244), Angio (p=0.2478) and Immune (p=0.028). D. Western blot analysis showing increased MAPK signalling in the A2780 and OVCAR3 cisplatin resistant cells compared to cisplatin sensitive counterparts. E. Colony formation assay with cisplatin in 15 ovarian cell lines, plotting 45-gene signature scores based on median centred IC50 doses (AUC 0.7917 (0.6350-0.9483), p=0.0008) and plotting IC50 doses based on median centred signature scores (AUC 0.6838 (0.5184-0.8491), p=0.0377).
A. Semi-supervised clustering analysis was performed on the Discovery dataset using the 3 public gene lists. Genes separating the ovarian samples were selected for further analysis. These were combined into a compilation gene list, and semi-supervised analysis of the Discovery dataset was performed again. B. Venn diagrams illustrating the overlap of the ‘MEK ON’ population with the 3 gene signatures. This demonstrated 77% overlap with the Angio_immune subgroup. C. TCGA ovarian samples were scored with the 3 ovarian gene signatures. Correlation with the gene signatures and the pMAPK RPPA data was investigated using ROC analysis. Each of the 3 gene signature scores of TCGA samples was median centered and defined as being High and Low scores. A ROC curve was generated using the binary signature scores and the continuous pMAPK expression (TCGA) in 237 samples. A statistically significant result was found with the 45-gene signature (p=0.04786) but not with the 63 or 22 gene signatures (p=0.4337 and p=0.4109 respectively). D. (i) Colony formation assay with Trametinib in 16 ovarian cell lines, plotting 45-gene signature scores based on median centred IC50 doses (AUC 0.7234 (0.5778-0.8690), p=0.0090) and plotting IC50 doses based on median centred signature scores (AUC 0.7147 (0.5674-0.8620), p=0.0117). (ii) 881 cell lines from the Sanger center were scored with the 45 gene AngioImmune signature and correlated to IC50 response to Trametinib. AngioImmune gene signature scores were plotted based on median centred IC50 doses and IC50 doses were plotted based on median centred signature scores. (iii) 760 cell lines from the Sanger center were scored with the 45 gene AngioImmune signature and correlated to IC50 response to Selumetinib. AngioImmune gene signature scores were plotted based on median centred IC50 doses and IC50 doses were plotted based on median centred signature scores.
A. The 45-gene, 22-gene and 63-gene signature scores from the E-GEOD-55624 data whereby SW480 cells (KRAS G12D) were treated with a MEK inhibitor for 4 and 16 hours. The Angio_Immune signature score was significantly reduced post MEK inhibitor treatment at both 4 and 16 hours (p=0.0055 and p=0.0143 respectively). B. Differences in the 3 gene signatures between HCT116 (KRAS MT) and HKH2 cells (KRAS WT) using the E-MEXP-3557 dataset. The 45-gene signature scores were elevated in KRAS mutant cells. C. E-GEOD 12764: MCF10 breast cells transfected with empty vector (EV) or HRAS or MEK1 confirmed elevated 45-gene signature scores in the HRAS and MEK1 mutants (p=0.0004 and p<0.0001 respectively). D. Inhibition of MEK with Trametinib decreases the 45-gene Angio_Immune signature score in OVCAR3 cells (p=0.0011).
A. 10 day colony formation assay assessing sensitivity of OVCAR3-WT and OVCAR3-CP cells to increasing concentrations of Cisplatin and MEK inhibitor as single agents. B. 10 day colony formation assay assessing sensitivity of OVCAR3-WT and OVCAR3-CP cells to increasing concentrations of Trametinib (GSK1120212). Table shows IC50 values for OVCAR3-WT and OVCAR3-CP cells for Cisplatin and Cisplatin in combination with Trametinib.
A. Box and whisker plots depicting the expression of EMT related genes across the 3 HGSOC molecular subgroups. Expression of VIM, AXL, TWIST1, SNAIL and SLUG is enhanced in the Angio_Immune subgroup (p<0.0001). B. Box and whisker plot of 45-gene signature scores in MCF7 control and SNAIL overexpressing cells (E-GEOD-58252). The 45-gene signature is enhanced by SNAIL overexpression (p=0.0004).
A. 10-day colony formation assay assessing sensitivity of OVCAR3-WT and OVCAR3-CP cells to increasing concentrations of cisplatin (left panel). Western blot analysis showing increased MAPK signalling in the OVCAR4 cisplatin resistant cells compared to cisplatin sensitive counterparts (right panel). B. Western blot analysis showing activation of EMT signalling in OVCAR3 CP and OVCAR4 CP (cisplatin resistant) with increased protein expression of Vimentin, N-cadherin and SLUG and decreased protein expression of E-cadherin. β-actin was used as a loading control. C. Quantitative real-time PCR (qRT-PCR) expression of EMT markers (N-cadherin, SLUG, SNAIL, Vimentin, TWIST and TGF-β3) in cisplatin resistant OVCAR3 cells. Fold change plotted relative to wildtype counterparts. D. Quantitative real-time PCR (qRT-PCR) expression of EMT markers (N-cadherin, SLUG, SNAIL, Vimentin, TWIST and TGF-β3) in cisplatin resistant OVCAR4 cells. Fold change plotted relative to wildtype counterparts. E. Bar charts to show the fold change increase in migration of OVCAR3 and OVCAR4 cisplatin resistant cells compared to the wildtype ovarian cell lines.
A. Representative western blot showing levels of phosphorylated ERK, and SRC following treatment of TOV112D cells with 1 μM SRC inhibitor, Saracatinib for 3, 6, 12 and 24 hours. Total ERK and total SRC expression are also shown. Beta actin was used as a loading control. Representative western blot showing levels of phosphorylated SRC and ERK following treatment of TOV112D cells with 1 μM MEK inhibitor, Trametinib for 3, 6, 12 and 24 hours. Total SRC and total ERK expression are also shown. Beta-actin was used as a loading control. B. Box and whisker plots showing differences in the 45-gene signature scores between SRC inhibitor resistant and sensitive cells.
A. Heatmap representation of semi-supervised analysis of the MARISA dataset (GSE40967) using the Angio_Immune genes. Five individual clusters were identified, with Sample Cluster 3 (highlighted by the red box) defining the MEK driven subgroup. B. Kaplan-Meier to show the relapse-free survival of the five sample cluster groups. The MEK driven group represents poor prognosis in comparison to the other subgroups (p=0.037). C. Kaplan-Meier to show the relapse-free survival using the 45-gene signature scores from Marisa. The MEK ON group represents poor prognosis in comparison to the MEK OFF group (AUC 1.5949 (1.0951-2.3228), p=0.0063). D. Kaplan-Meier to show the disease-free survival using the 45-gene signature scores in the Jorissen dataset (GSE14333). The MEK ON group represents poor prognosis in comparison to the MEK OFF group (AUC 2.4543 (1.2049-4.999), p=0.0014).
A. Heatmap representation of semi-supervised analysis of the Okayama dataset (GSE31210) using the Angio_Immune genes. Five individual clusters were identified, with Sample Cluster 4 (highlighted by the red box) defining the MEK driven subgroup. B. Kaplan-Meier to show the relapse-free survival of the five sample cluster groups. The MEK driven group represents poor prognosis in comparison to the other subgroups (p=0.0004). C. Kaplan-Meier to show the progression-free survival using the 45-gene signature scores from Okayama. The SIG POS group represents poor prognosis in comparison to the SIG NEG group (AUC 3.045 (1.631-5.686), p=0.0005). D. Kaplan-Meier to show the overall survival using the 45-gene signature scores in the Okayama dataset. The SIG POS group represents poor prognosis in comparison to the SIG NEG group (AUC 2.872 (1.271-6.489), p=0.0312).
A. Kaplan-Meier to show the prognostic utility of the 15-gene Angio_Immune subgroup in HGSOC (PFS HR=1.3564 [1.0156-1.8117]; p=0.0279 and OS HR=1.3464 [0.9901-1.8308]; p=0.0441). B. Colony formation assay with cisplatin in 15 ovarian cell lines, plotting 15-gene signature scores based on median centred IC50 doses (AUC 0.6905 (0.5254-0.8556), p=0.0290) and plotting IC50 doses based on median centred signature scores (AUC 0.6897 (0.5326-0.8468), p=0.02932). C. Cisplatin sensitive and resistant OVCAR3 cell line models were scored with the 15-gene signature and scores plotted in a box and whisker plot (p=0.046).
A. Differences in the 15-gene signature between HCT116 (KRAS MT) and HKH2 cells (KRAS WT) using the E-MEXP-3557 dataset. The 15-gene signature scores were elevated in KRAS mutant cells (p=0.0443). B. E-GEOD 12764: MCF10 breast cells transfected with empty vector (EV) or HRAS or MEK1 confirmed elevated 15-gene signature scores in the HRAS and MEK1 mutants (p<0.0001). C. Inhibition of MEK with Trametinib decreases the 15-gene Angio_Immune signature score in OVCAR3 cells (p=0.0023). D. Colony formation assay with Trametinib in 15 ovarian cell lines, plotting 15-gene signature scores based on median centred IC50 doses (AUC 0.850 (0.7366-0.9636), p<0.0001) and plotting IC50 doses based on median centred signature scores (AUC 0.737 (0.5820-0.8974), p=0.006495).
Box and whisker plot of 15-gene signature scores in MCF7 control and SNAIL overexpressing cells (E-GEOD-58252). The 15-gene signature is enhanced by SNAIL overexpression (p=0.0015).
A and B. Box and whisker plots showing differences in the 15-gene signature scores between SRC inhibitor resistant and sensitive cells following treatment with Saracatinib. Median centered on signature score (AUC 0.7289 (0.5544-0.9035), p=0.01454) or median centered on IC50 of Saracatinib (AUC 0.7698 (0.6054-0.9343), p=0.004076).
A. Kaplan-Meier to show the disease-free survival using the 15-gene signature scores in the Jorissen dataset (GSE14333). The MEK ON group represents poor prognosis in comparison to the MEK OFF group (p=0.0328). B. Kaplan-Meier to show the relapse-free survival using the 15-gene signature scores from Marisa. The MEK ON group represents poor prognosis in comparison to the MEK OFF group (p=0.0161).
A. Kaplan-Meier to show the progression-free survival using the 15-gene signature scores from Okayama. The SIG POS group represents poor prognosis in comparison to the SIG NEG group (p=0.0024). B. Kaplan-Meier to show the overall survival using the 15-gene signature scores in the Okayama dataset. The SIG POS group represents poor prognosis in comparison to the SIG NEG group (p=0.0396).
A. 739 cell lines from ‘The Genomics of Drug Sensitivity in Cancer Project’ (http://www.cancerrxgene.org/) were scored with the 45-gene AngioImmune signature and correlated to IC50 response to Trametinib. AngioImmune gene signature scores were plotted based on median centred IC50 doses and IC50 doses were plotted based on median centred signature scores. B. 759 cell lines from the ‘The Genomics of Drug Sensitivity in Cancer Project’ were scored with the 45-gene AngioImmune signature and correlated to IC50 response to Selumetinib. AngioImmune gene signature scores were plotted based on median centred IC50 doses and IC50 doses were plotted based on median centred signature scores. C. 739 cell lines from ‘The Genomics of Drug Sensitivity in Cancer Project’ (http://www.cancerrxgene.org/) were scored with the 15-gene AngioImmune signature and correlated to IC50 response to Trametinib. AngioImmune gene signature scores were plotted based on median centred IC50 doses and IC50 doses were plotted based on median centred signature scores. D. 760 cell lines from the ‘The Genomics of Drug Sensitivity in Cancer Project’ were scored with the 15-gene AngioImmune signature and correlated to IC50 response to Selumetinib. AngioImmune gene signature scores were plotted based on median centred IC50 doses and IC50 doses were plotted based on median centred signature scores.
A. Scatter plot of 45-gene signature scores across two clinical groups, PSA responders and PSA non-responders. B. Kaplan-Meier to show patient survival using the 45-gene signature scores in response to taxane based chemotherapy in prostate cancer. The EMT positive group (blue) represents the good prognosis group who respond well to taxane in comparison to the EMT negative group (red). C. Table representing the breakdown of PSA responders and non-responders who are EMT positive and negative within the pilot cohort.
A. Scatter plot of 15-gene signature scores across two clinical groups, PSA responders and PSA non-responders. B. Kaplan-Meier to show patient survival using the 15-gene signature scores in response to taxane based chemotherapy in prostate cancer. The EMT positive group (blue) represents the good prognosis group who respond well to taxane in comparison to the EMT negative group (red). C. Table representing the breakdown of PSA responders and non-responders who are EMT positive and negative within the pilot cohort.
Kaplan-Meier to show prognostic relevance of the 15-gene signature scores in predicting biochemical recurrence in prostate cancer. The EMT positive group (15-gene signature high) (blue) represents the poor prognosis group who have poorer survival and greater chance of biochemical recurrence in comparison to the EMT negative group (15-gene signature low) (green).
A. Kaplan-Meier to show prognostic relevance of the 15-gene signature scores in predicting biochemical recurrence in prostate cancer. The EMT positive group (15-gene signature high) (green) represents the poor prognosis group who have poorer survival and greater chance of biochemical recurrence in comparison to the EMT negative group (15-gene signature low) (blue). B. Kaplan-Meier to show prognostic relevance of the 15-gene signature scores in predicting biochemical recurrence in prostate cancer. The EMT positive group (15-gene signature high) (green) represents the poor prognosis group who have poorer survival and greater chance of metastatic progression in comparison to the EMT negative group (15-gene signature low) (blue).
A. Kaplan-Meier to show the disease-free survival using the 15-gene signature scores in the TCGA RNA-seq dataset across multiple diseases (shown in B). The MEK/EMT ON group represents poor prognosis in comparison to the MEK/EMT OFF group. B. Table showing hazard ratios and statistical significance of EMT biomarker across individual diseases.
OVCAR3 and OVCAR4 HGSOC cell lines were continuously exposed to increasing concentrations of cisplatin over 6 months to generate cisplatin resistant OVCAR3CP and OVCAR4CP cells respectively. In-vivo matrigel plug assay to demonstrate the MVD in the OVCAR3 isogenic cell lines. H&E quantification of MVD of the OVCAR3 isogenic cell lines shows that the OVCAR3 CP cell lines co-cultured with ECFCs have a higher MVD (p-value: 0.0041) compared to the parental cell lines (p-value: 0.8712). The OVCAR3 CP cell line has a higher 15-gene signature score relative to the OVCAR3 WT cell line (p-value: 0.046).
Cytokine array shows that the platinum resistant OVCAR3 (A) and OVCAR4 (B) have higher expression of cytokines that are key regulators of angiogenesis.
(C) Western blot showing that VEGFa protein expression levels are higher in OVCAR3 and OVCAR4 cisplatin-resistant cells in comparison to the OVCAR3 and OVCAR4 cisplatin-naïve cells.
A. AngioImmune subgroup is characterised by expression of RTKs that are key regulators of the mesenchymal and proliferative phenotype in ovarian cancer compared to the other 2 subgroups.
B. Pre-chemotherapy samples versus post-chemotherapy samples. RTKs shown represent those which were statistically associated either by ROC analysis (AUC) or Student's t-test where indicated.
A. pRTK array shows that the OVCAR3 cisplatin-resistant cell line has higher basal expression of pRTK relative to the platinum-naïve OVCAR3 pair.
B. Further validation of the pRTK array by western blot shows basal upregulation of phospho-VEGFR2, VEGFR3, PDGFRα and phospho-AXL in the OVCAR3 cisplatin-resistant cell line relative to the OVCAR3 cisplatin-naïve pair.
A. 10-day colony formation assay of Cediranib in the OVCAR3 isogenic cell lines demonstrates sensitivity for the OVCAR3 cisplatin-resistant (IC50 1.194) relative to the OVCAR3 cisplatin-naïve cell line (IC50 4.994).
B. 10-day colony formation assay of Nintedanib in the OVCAR3 isogenic cell lines demonstrates sensitivity for the OVCAR3 cisplatin-resistant (IC50 3.777) relative to the OVCAR3 cisplatin-naïve cell line (IC50>10).
A CellTiter Glo assay was carried out to determine the IC50 for Cediranib (IC50 5.569 μM at 48 hour time point) and Nintedanib (IC50 9.097 μM at 48 hour time point).
B. Western blot showing that VEGFa protein expression levels are down-regulated in OVCAR3 and OVCAR4 cisplatin-resistant cells treated with an IC50 concentration of Cediranib and Nintedanib.
In-vivo matrigel plug assay to demonstrate the effect of bevacizumab on MVD in the OVCAR3 isogenic cell lines. IF quantification of MVD of the OVCAR3 isogenic cell lines shows that the OVCAR3 CP cell lines co-cultured with ECFCs have a higher MVD (p-value: 0.0024) compared to the parental cell lines (p-value: 0.84525).
The present invention will be further understood by reference to the following experimental examples.
MEK Activation is Associated with a Molecular Subgroup in High Grade Serous Ovarian Cancer
Epithelial ovarian cancer (EOC) ranks among the top ten diagnosed and top five deadliest cancers in most countries (Ferlay et al., 2010). Continental rates are highest in Europe (10.1 per 100,000) with 41,448 deaths from ovarian cancer in 2008, representing 5.5% of all female cancer deaths in Europe. The high death rate is because most patients (>60%) are diagnosed at an advanced stage of disease (Stage III and IV) (Vaugh et al., 2012). The most common type of EOC is high-grade serous ovarian cancer (HGSOC) which accounts for at least 70% of cases, the majority of which are stage III and IV disease (Bowtell, 2010). Currently, the standard treatment used in initial management is cytoreductive surgery and adjuvant chemotherapy with a platinum-based regimen. However, despite an initial complete clinical-response rate of 65%-80%, most stage III and IV ovarian carcinomas relapse with an overall 5-year survival rate of only 10%-30% and a median survival of 2 to 3 years (www.cancerresearchuk.org). Classic clinicopathological factors, such as age, stage, residual tumour after surgery, differentiation grade and histopathological features, are currently the most important prognostic markers, but it is not possible to select optimal chemotherapy on an individual patient basis using these factors. Over the past 20 years there has been very little progress in the treatment of HGSOC, with five-year survival figures remaining unchanged for stage III and IV disease (www.cancerresearchuk.org).
A number of studies have tried to characterise the mechanisms of acquired resistance in ovarian cancer. Analysis of 135 spatially and temporally separated samples from 14 patients with HGSOC who received platinum-based chemotherapy found that NF1 deletion showed a progressive increase in tumour allele fraction during chemotherapy (Schwarz et al., 2015). This suggested that subclonal tumour populations are present in pre-treatment biopsies in HGSOC and can undergo expansion during chemotherapy, causing clinical relapse (Schwarz et al., 2015). Additionally, alteration of the NF1 gene has been associated with innate cisplatin resistance in HGSOC, whereby 20% of primary tumours showed inactivation of the NF1 gene by mutation or gene breakage (Patch et al., 2015). Furthermore, mutation of the RAS-MAPK pathway has been associated with chemotherapy resistance in relapsed neuroblastomas (Eleveld et al., 2015). Additionally, in cell line models, the MAPK pathway has been implicated in cisplatin resistance in ovarian cancer (Benedetti et al., 2008) and in squamous cell carcinoma (Kong et al., 2015).
This study performed gene expression analysis of a cohort of 265 macrodissected ovarian cancer FFPE tissue samples sourced from the Edinburgh Ovarian Cancer Database. Ethical approval for Edinburgh dataset analysis was obtained from Lothian Local Research Ethics Committee 2 (Ref: 07/S1102/33).
This cohort of samples can be further described using the following inclusion criteria:
Three separate prostate cancer cohorts were sourced and used to assess the association of EMT with Prostate Cancer prognosis.
Total RNA was extracted from the macrodissected FFPE tumour samples using the Roche High Pure RNA Paraffin Kit (Roche Diagnostics GmbH, Mannheim, Germany) as described previously (Kennedy et al, 2011). Total RNA was amplified using the NuGEN WT-Ovation™ FFPE System (NuGEN Technologies Inc., San Carlos, Calif., USA). It was then hybridised to the Almac Ovarian Cancer DSA™ as described previously (Kennedy et al, 2011) or Prostate DSA™ for prostate samples (Tanney et al, 2008). Arrays were scanned using the Affymetrix Genechip® Scanner 7G (Affymetrix Inc., Santa Clara, Calif.).
Quality Control (QC) of profiled samples was carried out using the MAS5 pre-processing algorithm to assess technical aspects of the samples, i.e. average noise and background homogeneity, percentage of present call (array quality), signal quality, RNA quality and hybridization quality. Distributions and Median Absolute Deviation of corresponding parameters were analyzed and used to identify possible outliers. Sample pre-processing was carried out using RMA (Irizarry et al, 2003). The pre-processed data matrix was sorted by decreasing variance, decreasing intensity and increasing correlation to cDNA yield. Following filtering of probe sets (PS) correlated with cDNA yield (to remove any technical bias in the expression data), hierarchical clustering analysis was performed (Pearson correlation distance and Ward's linkage methods (Ward et al, 1963)). Subsets of the data matrix were tested for cluster stability using the GAP statistic (Tibshirani et al, 2001), which gives an indication of the within-cluster tightness and between-cluster separateness. The GAP statistic was applied to calculate the optimal number of sample clusters in each sub-matrix, while the stability of cluster composition was assessed using a partition comparison tool (Carriço et al, 2006; Pinto et al, 2008). The smallest number of PS generating the optimal sample cluster number was selected as the list of most variable PS to take forward for hierarchical cluster analysis.
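The Pearson correlation distance named above can be sketched as follows. This is a minimal illustration only, assuming the common definition of the distance as one minus the Pearson correlation coefficient; the function names are hypothetical, and the Ward linkage, GAP statistic and partition comparison steps are not reproduced here.

```python
import math


def pearson_distance(x, y):
    """Return 1 - Pearson r for two equal-length expression profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return 1.0 - cov / math.sqrt(var_x * var_y)


def distance_matrix(profiles):
    """Symmetric matrix of pairwise Pearson distances between samples."""
    k = len(profiles)
    dist = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            d = pearson_distance(profiles[i], profiles[j])
            dist[i][j] = dist[j][i] = d
    return dist
```

Under this definition, perfectly correlated profiles are at distance 0 and perfectly anti-correlated profiles at distance 2; the resulting matrix is the kind of input a hierarchical clustering routine would consume.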
To establish the functional significance of the gene clusters an enrichment analysis, based on the hypergeometric function (False Discovery Rate applied (Benjamini and Hochberg, 1995, J. R. Stat. Soc. 57:289-300)), was performed. Over-representation of biological processes and pathways was analysed for each gene group generated by the hierarchical clustering using Gene Ontology biological processes. Hypergeometric p-values were assessed for each enriched functional entity class. Functional entity classes with the highest p-values were selected as representative of the group and a general functional category representing these functional entities was assigned to the gene clusters based on significance of representation (i.e. p-value).
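A hypergeometric over-representation test with Benjamini-Hochberg adjustment can be sketched as below. The sketch assumes a one-sided upper-tail test and the standard step-up adjustment; the function and parameter names are illustrative and are not taken from the analysis software actually used.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): n genes in the cluster, K of N background genes carry the term."""
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

Each Gene Ontology term would be tested once per gene cluster, and the adjustment applied across all terms tested for that cluster.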
Genes that are variable and highly expressed across multiple disease indications were determined prior to model development. The disease indications that were included in this evaluation were: ovarian cancer; colon cancer; lung cancer and melanoma. Two data sets per disease indication were assessed with the exception of prostate cancer where only one dataset was evaluated. Data sets were pre-processed using RMA and summarised to Entrez Gene ID level using the median of probe sets for each Entrez Gene ID on the Ovarian Cancer DSA™. Within each data set, Entrez Gene IDs were ranked based on the average rank by variance and mean intensity across samples (high rank=high variance, high mean intensity). A single combined rank value per gene was calculated based on the average variance-intensity rank within each disease indication. Genes with no expression level were removed from further analysis. Scatterplots were generated to show the combined variance-intensity rank of the 19920 Entrez gene IDs in the disease indications evaluated with two datasets (
The genes that had common high expression and variance in ovarian, colon, lung, melanoma and prostate were used as a starting set for model development. The Edinburgh ovarian cancer sample cohort was used to train the signature under 5 fold cross validation (CV) with 10 repeats. Partial least squares (PLS) (de Jong, 1993) was paired with Forward Feature Selection (FFS) to generate signatures for the top 75% ranked list. Table 4 indicates the weightings and bias for each probeset incorporated within the 45-gene signature (A) and the 15-gene signature (B).
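The variance-intensity ranking described above can be illustrated with the following sketch. It assumes population variance, average ranks for ties, and that a gene must be present in every data set of an indication to receive a combined rank; none of these details is stated in the text, and all function names are hypothetical.

```python
from statistics import mean, pvariance


def ascending_ranks(values):
    """Rank 1 = smallest value; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for i in order[pos:end + 1]:
            ranks[i] = shared
        pos = end + 1
    return ranks


def variance_intensity_rank(expression):
    """expression: dict of gene ID -> expression values across samples of one data set."""
    genes = list(expression)
    var_rank = ascending_ranks([pvariance(expression[g]) for g in genes])
    mean_rank = ascending_ranks([mean(expression[g]) for g in genes])
    # High rank = high variance and high mean intensity.
    return {g: (v + m) / 2 for g, v, m in zip(genes, var_rank, mean_rank)}


def combined_rank(datasets):
    """Average each gene's variance-intensity rank over the data sets of one indication."""
    per_set = [variance_intensity_rank(d) for d in datasets]
    shared = set(per_set[0]).intersection(*per_set[1:])
    return {g: mean(r[g] for r in per_set) for g in shared}
```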
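The resampling scheme (5 fold cross validation with 10 repeats) can be sketched as follows. Only the partitioning is shown; the PLS model fitting and forward feature selection are not implemented. The random seed, the unstratified shuffling and the function name are assumptions made for illustration.

```python
import random


def repeated_kfold(n_samples, n_folds=5, n_repeats=10, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold cross validation."""
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        # Deal the shuffled indices into n_folds nearly equal folds.
        folds = [indices[f::n_folds] for f in range(n_folds)]
        for f, test_idx in enumerate(folds):
            train_idx = [i for g, fold in enumerate(folds) if g != f for i in fold]
            yield repeat, f, sorted(train_idx), sorted(test_idx)
```

With 265 samples this yields 50 train/test splits, and within each repeat every sample appears in exactly one test fold.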
The C-index performance was calculated using the progression free survival (PFS) time endpoint and signature scores generated within cross validation for each evaluated signature length. This data was then used to determine the signature length at which optimal performance is reached with respect to association between signature scores and PFS. The highest C-index values were compared for signatures of length less than 100 and greater than 10 features. The signature with the shortest length and highest C-index within this subset was selected as the final model for identifying the subgroup.
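The C-index can be illustrated with Harrell's concordance index for right-censored data. This sketch assumes that a higher signature score corresponds to earlier progression and counts tied scores as half-concordant; the implementation actually used is not described in the text, and the function name is hypothetical.

```python
def concordance_index(times, events, scores):
    """Harrell's C for right-censored data.

    times:  progression-free survival times
    events: 1 if progression was observed, 0 if censored
    scores: signature scores (higher assumed to mean earlier progression)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when sample i progressed before time j.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 indicates no association between score and PFS; values approaching 1 indicate that higher scores consistently precede earlier progression.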
A threshold was generated for classification of signature scores by using the value where the sum of sensitivity and specificity with respect to predicting the subtype in the training data is highest. This threshold was set at 0.5899 using the curve of sensitivity and specificity (
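Selecting the cut-off that maximises the sum of sensitivity and specificity can be sketched as below. The sketch assumes that candidate cut-offs lie midway between consecutive distinct scores and that a sample is called positive when its score exceeds the cut-off; it is not intended to reproduce the reported threshold value, and the function name is hypothetical.

```python
def best_threshold(scores, labels):
    """Return (threshold, sensitivity, specificity) maximising sensitivity + specificity.

    labels: 1 if the sample belongs to the subgroup, otherwise 0.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes are required")
    distinct = sorted(set(scores))
    if len(distinct) < 2:
        raise ValueError("at least two distinct scores are required")
    # Candidate cut-offs: midpoints between consecutive distinct scores.
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    best = None
    for t in candidates:
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s > t)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s <= t)
        sens, spec = tp / positives, tn / negatives
        if best is None or sens + spec > best[1] + best[2]:
            best = (t, sens, spec)
    return best
```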
Functional enrichment analysis of the selected model was performed using the Gene Ontology biological processes classification to gain an understanding of the underlying biology behind the selected signature. Table 3 presents the top 20 GO biological processes and GO terms from functional enrichment analysis of the signature, where the top 20 biological processes include:
Human epithelial ovarian cancer cell lines OVCAR3 and OVCAR4 were obtained from the American Type Culture Collection. Tumour cells were cultured in RPMI (Gibco™ Life technologies) supplemented with 20% foetal calf serum (FCS) and maintained in 5% CO2 at 37° C. Pharmaceutical grade cisplatin and bevacizumab were kindly provided by the Belfast City Hospital pharmacy department. Cediranib and Nintedanib were purchased through Selleckchem and re-suspended in DMSO to a stock concentration of 10 mM.
Cells were seeded at predetermined densities and treated with drug 24 hours later; the drug was replenished every 3-4 days. After 10 days, cells were washed with PBS, fixed in methanol, stained with crystal violet and colonies counted. The surviving fraction (SF) for a given dose was calculated and dose-response curves plotted and IC50 generated using GraphPad Prism™ 5. Receiver operator curves (ROC) were plotted by dichotomising the IC50 values based on the median of the IC50 and defining the higher IC50 values as resistant and the lower IC50 values as sensitive. The gene signatures associated with the cell lines were plotted based on sensitive and resistant cells. Additionally ROCs were plotted by dichotomising the signature scores based on the median of the scores and defining the higher signature score as signature positive and the lower signature scores as signature negative. The IC50s associated with the cell lines were plotted based on signature positive and signature negative cells.
The migration assay was performed using the xCELLigence RTCA DP system and carried out with CIM-plate 16 (ACEA Bioscience). Endothelial progenitor cell conditioned media, fresh endothelial media with growth factors (VEGF, IGF-1, bFGF, EGF) with 10% foetal bovine serum (FBS) and endothelial media with 10% foetal bovine serum (FBS) only, were the three chemo-attractant conditions used in the bottom chamber. 160 μl of the chemo-attractant was added to each bottom chamber of a CIM-plate 16. The CIM-Plate 16 is assembled by placing the top chamber onto the bottom chamber and snapping the two together. 30 μl of serum-free medium is placed in the top chamber to hydrate and pre-incubate the membrane for 2 hours in the CO2 incubator at 37° C. before obtaining a background measurement. The protocol is optimised for the two paired cancer cell lines: OVCAR3, OVCAR4 parental and OVCAR3, OVCAR4 platinum resistant cell lines. Platinum resistant cell lines were washed ×3 with PBS, to remove cisplatin, and fresh platinum free media was added to the cells for 24 hours prior to carrying out the experiment. Cells were then grown in serum free medium for 2 hours prior to seeding. Cells are lightly trypsinised, pelleted and re-suspended at 100 μl, containing 50,000 cells, in serum-free medium. Once the CIM-Plate 16 has been equilibrated, it is placed in the RTCA DP station and the background cell-index values are measured. The CIM-Plate 16 is then removed from the RTCA DP station and the cells are added to the top chamber. The CIM-Plate 16 is placed in the RTCA DP station and migration is monitored every 5 minutes for several hours. Each experimental condition was performed in triplicate. For quantification of the migration rate, the slope of the curve was used to determine the rate of change in cell index. The average and standard deviation slope values were then quantified relative to the controls.
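The median dichotomisation and ROC analysis described above can be sketched as follows. The sketch computes the area under the ROC curve as the probability that a member of the positive group has a higher value than a member of the negative group; assigning values equal to the median to the lower group is an assumption, and the function names are hypothetical.

```python
from statistics import median


def dichotomise(values):
    """True where a value lies above the median of all values."""
    cut = median(values)
    return [v > cut for v in values]


def roc_auc(values, is_positive):
    """AUC: probability that a positive sample outranks a negative one (ties count 0.5)."""
    pos = [v for v, p in zip(values, is_positive) if p]
    neg = [v for v, p in zip(values, is_positive) if not p]
    if not pos or not neg:
        raise ValueError("both groups are required")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

For the first analysis the call would be `roc_auc(signature_scores, dichotomise(ic50_values))`, with the higher IC50 group treated as resistant; for the second, `roc_auc(ic50_values, dichotomise(signature_scores))`, with the higher score group treated as signature positive.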
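The slope-based quantification can be illustrated with an ordinary least-squares fit of cell index against time. Fitting a single line over the whole monitored interval and dividing each slope by the mean control slope are assumptions made for this sketch; the instrument software may compute slopes differently, and the function names are hypothetical.

```python
from statistics import mean, stdev


def least_squares_slope(times, cell_index):
    """Slope of the least-squares line through (time, cell index) readings."""
    mean_t = mean(times)
    mean_c = mean(cell_index)
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (c - mean_c) for t, c in zip(times, cell_index))
    return sxy / sxx


def relative_migration(times, condition_runs, control_runs):
    """Mean and standard deviation of replicate slopes relative to the mean control slope."""
    control = mean(least_squares_slope(times, run) for run in control_runs)
    relative = [least_squares_slope(times, run) / control for run in condition_runs]
    return mean(relative), stdev(relative)
```

Here `condition_runs` and `control_runs` would each hold the triplicate cell-index traces recorded at the shared time points.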
The invasion assay was performed using the xCELLigence RTCA DP system and carried out with CIM-plate 16 (ACEA bioscience).
Normal cell growth medium (RPMI 1640, 1% L-Glut and 20% FCS) was the chemoattractant used in the bottom chamber. 160 μl of the chemoattractant was added to each bottom chamber of a CIM-plate 16. The CIM-Plate 16 is assembled by placing the top chamber onto the bottom chamber and snapping the two together. 20 μl Matrigel growth factor reduced (GFR) (phenol-red free) basement membrane matrix (Corning, ref: 356231) was diluted in 400 μl Opti-MEM (serum free), giving a final working concentration of GFR Matrigel of approximately 5%. 20 μl of the Matrigel/Opti-MEM master mix is placed in the top chamber to hydrate and pre-incubate the membrane for 2 hours in the CO2 incubator at 37° C. before obtaining a background measurement.
The protocol is optimized for the two paired cancer cell lines: OVCAR3, OVCAR4 parental and OVCAR3, OVCAR4 platinum resistant cell lines. Platinum resistant cell lines were grown for 24 hours in media containing 0.1 nM, 1 nM and 10 nM ALM201. On the experimental day, cells are washed ×1 with PBS. Cells are lightly trypsinized, pelleted and re-suspended at 100 μl, containing 50,000 cells, in optimem (serum-free) medium in the presence of 0.1 nM, 1 nM and 10 nM ALM201. Once the CIM-Plate 16 has been equilibrated, it is placed in the RTCA DP station and the background cell-index values are measured. The CIM-Plate 16 is then removed from the RTCA DP station and the cells are added to the top chamber. The CIM-Plate 16 is placed in the RTCA DP station and migration is monitored every 5 minutes for several hours.
Each experimental condition was performed in triplicate. For quantification of the migration rate, the slope of the curve was used to determine the rate of change in cell index. The average and standard deviation of the slope values were then quantified relative to the control condition.
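As an illustration of the slope-based quantification described for the migration and invasion assays, the following Python sketch fits a least-squares line to each cell-index trace and expresses the treated slopes relative to the mean control slope. The helper names and the example traces are hypothetical, and dividing each replicate slope by the mean control slope is an assumption about how "relative to the control" was computed; the instrument software may calculate this differently.

```python
from statistics import mean, stdev


def slope(times, cell_index):
    """Least-squares slope of cell index against time."""
    n = len(times)
    if n < 2 or n != len(cell_index):
        raise ValueError("need two or more paired observations")
    mean_t, mean_c = sum(times) / n, sum(cell_index) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (c - mean_c) for t, c in zip(times, cell_index))
    return sxy / sxx


def relative_slopes(treated_traces, control_traces):
    """Mean and standard deviation of treated slopes divided by the mean control slope."""
    control_mean = mean(slope(t, c) for t, c in control_traces)
    relative = [slope(t, c) / control_mean for t, c in treated_traces]
    return mean(relative), stdev(relative)


# Hypothetical triplicate traces: (time in hours, cell index).
hours = [0, 1, 2, 3]
control = [(hours, [0.0, 0.2, 0.4, 0.6])] * 3
treated = [
    (hours, [0.0, 0.10, 0.20, 0.30]),
    (hours, [0.0, 0.12, 0.24, 0.36]),
    (hours, [0.0, 0.08, 0.16, 0.24]),
]
rel_mean, rel_sd = relative_slopes(treated, control)
```

With these example traces the treated cells move at about half the control rate, so a value below 1 indicates slower migration or invasion than the control.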
The proliferation assay was performed using 6-well plates. The experiment was set up with two control mechanisms to ensure accuracy of results. Each cell line was seeded in duplicate and the experiment was performed in triplicate. For quantification of proliferation, cell numbers were counted manually using a Coulter counter on day 1, day 2 and day 3 in three different concentrations of ALM201 (0.1 nM, 1 nM and 10 nM). Media was changed on day 2 and day 3 with fresh media containing the 3 concentrations of ALM201.
On day 0, each cell line was lightly trypsinized, counted and seeded at 5×10^4 cells per well in the presence of ALM201 (0.1 nM, 1 nM and 10 nM concentration). 2 ml of cell suspension was added to each well in three 6 well plates (representing day 1, 2 and 3) and left to incubate in the CO2 incubator at 37° C. for 24 hours prior to counting cells for day 1, 48 hours prior to counting for day 2 and 72 hours prior to counting for day 3.
At each time point, media was aspirated from the wells and wells were washed with PBS×1. 500 μl 5% trypsin was added to each well and incubated for 3-5 mins. 1.5 ml media was added to each well to neutralise the trypsin. Cells were counted using the Coulter counter. To estimate significance, the unpaired, two-tailed Student's t-test was calculated using the t-test calculator available in GraphPad Prism 5.0 software.
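For readers unfamiliar with the test, the statistic behind an unpaired Student's t-test can be sketched in a few lines of Python. This is only an illustration under the equal-variance assumption; the study used GraphPad Prism, the cell counts below are hypothetical, and the two-tailed p-value (which requires the t distribution) is deliberately not computed here.

```python
from math import sqrt
from statistics import mean, variance


def student_t(group_a, group_b):
    """Return (t statistic, degrees of freedom) for an unpaired, pooled-variance t-test."""
    na, nb = len(group_a), len(group_b)
    if na < 2 or nb < 2:
        raise ValueError("each group needs at least two observations")
    df = na + nb - 2
    # Pooled sample variance, weighting each group by its degrees of freedom.
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled * (1 / na + 1 / nb))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Hypothetical day 3 cell counts: untreated wells vs wells treated with compound.
untreated = [100000, 110000, 120000]
treated = [70000, 80000, 90000]
t_stat, dof = student_t(untreated, treated)
```

The resulting statistic is compared against the t distribution with `dof` degrees of freedom to obtain the two-tailed p-value.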
Dasatinib (BMS354825), Saracatinib (AZD0530) and Trametinib (GSK1120212) were purchased from Selleck Chemicals, dissolved in DMSO to constitute a 10 mM stock solution, and stored at −20° C. Cisplatin was acquired from Belfast City Hospital Pharmacy department and diluted in PBS to produce a 10 μM stock solution. Cisplatin was stored at room temperature and protected from light.
OVCAR3 and OVCAR4 cells were trypsinised and relevant cell numbers were seeded into P90 plates. Cells were allowed to adhere overnight. The following day media was removed and replaced with media containing 25 nM cisplatin. The concentration of cisplatin was increased every 2 weeks, doubling the concentration at every increment. Batches of cells were frozen every two weeks upon increasing the concentration of cisplatin. Once cells were stably growing at 200 nM cisplatin, sensitivity to cisplatin was tested by clonogenic assay. Cells were continuously grown in 200 nM cisplatin.
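The escalation scheme above (start at 25 nM, double every 2 weeks, stop at 200 nM) implies three doublings. A small Python sketch makes the schedule explicit; the function name is hypothetical and counting weeks from the first day of cisplatin exposure is an assumption.

```python
def escalation_schedule(start_nm, target_nm, weeks_per_step=2):
    """List (week, dose in nM) pairs, doubling the dose at each step up to the target."""
    schedule = []
    week, dose = 0, start_nm
    while dose <= target_nm:
        schedule.append((week, dose))
        dose *= 2
        week += weeks_per_step
    return schedule


schedule = escalation_schedule(25, 200)
```

Under these assumptions the cells reach 200 nM cisplatin at week 6, after which they are maintained at that concentration.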
Cells were trypsinised and counted using the Countess™ Automated Cell Counter. 5,000 cells were seeded into each well of a 96 well plate. Cells were allowed to adhere overnight and were then treated with titrated concentrations of cisplatin, cediranib and nintedanib (10 μM to 0.005 μM concentration). Under sterile conditions in tissue culture, at the 24 hour time point, the drug-conditioned media was removed and replaced with 100 μl of fresh media. The 96 well plate was allowed to stand at room temperature for 20-30 mins. In the meantime the CellTiter-Glo Luminescent reagent was allowed to thaw after removal from the −20° C. freezer. Following the 20-30 minute incubation, 75 μl of the CellTiter-Glo Luminescent reagent was added per well; the plate was then shaken for 2 minutes and allowed to stand for 10 minutes. The analysis was performed using the Bioscience BioTek plate reader.
All animal experiments were performed in conformity with UK Home Office Regulations (PPL2729) and with authorization from Queen's University Belfast Animal Welfare and Ethical Review Body (AWERB). Eight week-old male Athymic nude mice (Harlan Laboratories) were used. ECFC were inoculated at a high density of 2.45×10^6 and co-cultured with GFP-labelled OVCAR3 WT or GFP-labelled OVCAR3-CP at a density of 1.4×10^6. The GFP-labelled OVCAR3 WT and GFP-labelled OVCAR3-CP were also seeded alone at a density of 1.4×10^6. Each condition was diluted in 10 μL of phenol red-free DMEM, re-suspended in 90 μL of growth factor-reduced Matrigel (Corning) and injected subcutaneously. After 8 days, mice were sacrificed under isoflurane anaesthesia by intraperitoneal (IP) administration of sodium pentobarbital at 200 mg/kg using 31 gauge needles, and implants were removed and fixed in 4% formaldehyde overnight. Fixed Matrigel implants were then embedded in paraffin and 10 μm sections were prepared for staining. Bevacizumab was administered IP at 5 mg/kg once weekly for 14 days. Treatment was commenced on day 3. Mice were sacrificed after a 14-day treatment period as previously described.
Protein lysate was obtained from OVCAR3 isogenic cell lines and analysed using the proteome profiler human angiogenesis array (R&D Systems, Abingdon, UK) in accordance with the manufacturer's guidelines. Briefly, samples were adjusted to a final volume of 1.5 ml with array buffer and mixed with a detection antibody cocktail for 1 hour. After a membrane blocking step, samples containing antibody cocktail were added to membranes and left to incubate on a rocking platform overnight at 4° C. After several washes, membranes were incubated with streptavidin-horseradish peroxidase secondary antibody and spots were detected using a UVP bioimaging system (Millipore). Densitometry was quantified using Image J software.
Conditioned media was collected from OVCAR3 and OVCAR4 isogenic cell lines and added to the cytokine array (Abcam) membranes. The experiment was carried out in accordance with the manufacturer guidelines. Comparison between samples was performed using Image J densitometry software for a semi-quantitative comparison.
IHC and immunofluorescence studies were conducted as previously described [22]. Conventional H&E staining was done and examined by light microscopy. Immunofluorescence was done using anti-mouse CD31 (1:00; Baca) and anti-rabbit α-smooth muscle actin (α-SMA; 1:100; Baca). For micro vessel counts, paraffin embedded tissues were sectioned and stained with anti-CD31 and anti-αSMA antibody. CD31 and αSMA stained vessels were then counted per high power field (200×) in three separate fields of three independent tumors from each group. Blood vessels associated with α-SMA-positive cells were considered mature. Sections were stained with α-SMA and with anti-CD31, which stains both mature and immature vessels.
Quantitative Real-Time Polymerase Chain Reaction (qRT-PCR)
Reverse transcription was performed using the First Strand cDNA synthesis kit (Roche). 500 ng of RNA was reverse transcribed according to manufacturer's instructions. Exon-spanning qPCR primers were designed using Roche Universal Probe Library Assay Design Centre and were used at a concentration of 0.5 μM. The following primer sequences were used:
To perform absolute quantification from qPCR, we used a standard curve method. The efficiency of each primer set was derived from the standard curve using the following equation:
E = 10^(−1/slope)
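As a worked illustration of the efficiency equation, the Python sketch below regresses Cp on log10 of the template concentration for a dilution series and applies E = 10^(−1/slope). The function names and the ten-fold dilution values are hypothetical; a slope near −3.32 corresponds to E of about 2, that is, a doubling of product per cycle.

```python
import math


def standard_curve_slope(concentrations, cp_values):
    """Least-squares slope of Cp against log10(concentration)."""
    logs = [math.log10(c) for c in concentrations]
    n = len(logs)
    mean_x, mean_y = sum(logs) / n, sum(cp_values) / n
    sxx = sum((x - mean_x) ** 2 for x in logs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(logs, cp_values))
    return sxy / sxx


def primer_efficiency(slope):
    """Amplification efficiency E = 10^(-1/slope) from the standard-curve slope."""
    return 10 ** (-1 / slope)


# Hypothetical ten-fold dilution series (relative concentration, measured Cp).
dilutions = [1.0, 0.1, 0.01, 0.001]
cps = [20.00, 23.32, 26.64, 29.96]

curve_slope = standard_curve_slope(dilutions, cps)
efficiency = primer_efficiency(curve_slope)
```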
The product of Reverse Transcription was diluted 1:10 in Nuclease Free Water (NFW). Each 10 μl PCR reaction consisted of 0.5 μl of 10 μM Forward primer, 0.5 μl of 10 μM Reverse primer, 5 μl of 2× LightCycler® 480 SYBR Green I Master mix (Roche), 1.5 μl NFW and 2.5 μl diluted Reverse Transcription product. These 10 μl reactions were pipetted into wells of a LightCycler® 480 multiwell 96 plate (Roche), and the plate was then sealed using clear adhesive film (Roche). The plate was placed into the LightCycler® 480 (Roche) and run with the following protocol: 95° C. for 10 mins; 45 cycles of 95° C. for 15 secs, 55° C. for 30 secs and 72° C. for 30 secs; finishing with a melt curve for confirmation of primer specificity. All qPCR data were analysed using the LightCycler® 480 software provided by Roche. For analysis, the mean Cp value from a technical duplicate was calculated and the relative amount of a gene was calculated by comparing the Cp value to an in-run standard curve. Each mean value was then normalised to the mean concentration of the housekeeping gene PUM1 within the corresponding sample, by dividing the concentration of the target gene by the concentration of the housekeeping gene. Relative expression refers to the gene expression levels which have been normalised to the housekeeping gene and made relative to the associated control samples. From these normalised values, the fold changes for each gene were calculated and the average of three individual fold changes was derived from three independent experiments. The unpaired, two-tailed Student's t-test available in GraphPad Prism 5.0 software was used to detect statistical significance.
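The normalisation steps described above can be sketched in Python as follows. This is an illustrative reading of the text, not the LightCycler® 480 software's algorithm: the standard curve is assumed to have the form Cp = slope × log10(concentration) + intercept, and all function names, curve parameters and Cp values are hypothetical.

```python
from statistics import mean


def concentration_from_cp(cp, slope, intercept):
    """Invert the standard curve Cp = slope * log10(conc) + intercept."""
    return 10 ** ((cp - intercept) / slope)


def normalised_expression(target_cps, housekeeping_cps, target_curve, housekeeping_curve):
    """Target concentration divided by housekeeping concentration, from duplicate Cp values."""
    target = concentration_from_cp(mean(target_cps), *target_curve)
    housekeeping = concentration_from_cp(mean(housekeeping_cps), *housekeeping_curve)
    return target / housekeeping


def mean_fold_change(treated, control):
    """Average of the per-experiment fold changes (treated / control)."""
    return mean(t / c for t, c in zip(treated, control))


# Hypothetical standard curve (slope, intercept) shared by both primer sets.
curve = (-3.32, 30.0)
expression = normalised_expression([23.36, 23.36], [26.68, 26.68], curve, curve)

# Hypothetical normalised values from three independent experiments.
fold_change = mean_fold_change([2.0, 3.0, 4.0], [1.0, 1.0, 2.0])
```

In the study the housekeeping gene is PUM1; here the second pair of Cp values stands in for it.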
30 μg of protein lysate was mixed with LDS loading dye (Invitrogen) and Reducing Agent (Invitrogen) and denatured for 10 minutes at 70° C. Samples were briefly centrifuged, loaded onto a Bolt 4-12% Bis-Tris gel and electrophoresed at 165 V for 1 hour 30 minutes using MOPS running buffer. SeeBlue Pre Stained protein ladder (Invitrogen) was used as a reference for protein size. After electrophoresis, proteins were transferred onto Immobilon-P PVDF membrane (Millipore) at 30 V for 2 hours using the XCell SureLock mini-cell transfer system (Invitrogen). To confirm proper transfer of proteins onto the membrane, the membrane was stained with Ponceau S solution (Sigma). Membranes were incubated in blocking solution (5% bovine serum albumin) for 1 hour at room temperature on a rocking platform in order to prevent non-specific binding of antibody to the membrane. Membranes were then incubated in primary antibody overnight (see appendix 3 for dilutions) at 4° C. The following day, the membranes were washed 3 times in TBS-T for 10 minutes and incubated in secondary antibody at a 1:5000 dilution for 1 hour 30 minutes at room temperature. Membranes were then washed 6 times for 5 minutes in TBS-T, and incubated for 5 minutes in Luminata Crescendo or Forte (Millipore) detection reagent. Analysis was performed using Alpha Innotech Imager FluorChem Software.
ERK (Cell Signalling 2496)—monoclonal mouse antibody for total p44/42 MAP Kinase (ERK1/2), used at a 1:1000 dilution in 5% milk.
pERK (Cell Signalling 4370)—polyclonal rabbit antibody for p44 and p42 MAP Kinase (ERK1/2), used at a 1:1000 dilution in 5% BSA.
MEK (Cell Signalling 4694)—mouse monoclonal antibody for total MEK1/2, used at a dilution of 1:1000 in 5% milk.
pMEK (Cell Signalling 9121)—rabbit polyclonal antibody for phospho-MEK1/2 at Ser217/221, used at a dilution of 1:1000 in 5% BSA.
SRC (Cell Signalling 2123)—rabbit polyclonal antibody for SRC, used at a dilution of 1:000 in 5% BSA.
pSRC (Cell Signalling 2101)—polyclonal rabbit antibody for phospho-SRC at tyrosine 416, used at a dilution of 1:1000 in 5% BSA.
N-cadherin (Cell Signalling)—rabbit monoclonal, used at a dilution of 1:1000 in 5% milk.
E-cadherin (Cell Signalling 24E10)—rabbit monoclonal antibody used at a dilution of 1:1000 in 5% milk.
Vimentin (Cell Signalling R28)—rabbit monoclonal antibody to detect Vimentin, used at a dilution of 1:1000 in 5% milk.
SLUG (Cell Signalling C10G7)—rabbit monoclonal antibody to detect SLUG EMT marker, used at a dilution of 1:1000 in 5% milk.
VEGFa (Abcam)—rabbit polyclonal antibody to detect VEGFa, used at a dilution of 1:1000 in 5% BSA.
VEGFR1 (Abcam)—rabbit polyclonal antibody, a dilution of 1:500 in 5% BSA.
Phospho-VEGFR2 (Cell Signalling)—rabbit polyclonal antibody, a dilution of 1:500 in 5% BSA.
VEGFR2 (Cell Signalling)—rabbit polyclonal antibody, a dilution of 1:500 in 5% BSA.
PDGFRα (Cell Signalling)—rabbit polyclonal antibody, a dilution of 1:500 in 5% BSA.
Phospho-PDGFRβ (Cell Signalling)—rabbit polyclonal antibody, a dilution of 1:500 in 5% BSA.
Phospho-AXL (Cell Signalling)—rabbit polyclonal antibody, a dilution of 1:500 in 5% BSA.
AXL (Cell Signalling)—rabbit polyclonal antibody, a dilution of 1:500 in 5% BSA.
B-actin (Sigma A2228)—mouse monoclonal antibody detecting the N-terminus of β-actin, used at a dilution of 1:5000 in 5% milk.
Secondary antibodies—anti-rabbit and anti-mouse (Cell Signalling) were used at a dilution of 1:5000 in 5% milk.
We have defined 3 molecular subgroups of High grade serous ovarian cancer (HGSOC), an Angiogenesis subgroup (HGS1), an Immune subgroup (HGS2) and an Angio_Immune subgroup (HGS3) (
The Angio_Immune subgroup is defined by the 45-gene signature. We hypothesised that the Angio_Immune group would be prognostic in the context of SoC treatment in ovarian cancer. We therefore investigated this in the treatment naive Discovery dataset. The 45-gene signature was associated with worse prognosis (PFS HR 1.6403 (1.2252-2.1960) p=0.0002 and OS HR 1.6562 (1.2169-2.2540) p=0.0004) and therefore predicted response to cisplatin based therapy (
We wanted to investigate the effect of platinum treatment on the previously identified molecular subgroups. To do this, we analysed 48 matched (from the same patient) pre-chemotherapy and post-chemotherapy samples by gene expression analysis on the ovarian DSA™. Each of the samples was then scored with each of the 3 gene signatures: the 22-gene Angio signature, the 63-gene Immune signature and the 45-gene Angio_Immune signature. This analysis allowed us to define which of the 3 molecular subgroups each sample fell into. It demonstrated that treatment with cisplatin based chemotherapy caused samples to move between subgroups; specifically, more of the post-chemotherapy samples were aligned with the Angio_Immune subgroup rather than the Immune and Angio subgroups. 40% of the pre-treatment patient samples were aligned with the Angio_Immune subgroup, which increased to 54% post-chemotherapy (
Cisplatin Resistant Cell Line Models have Elevated 45 and 15-Gene Signature Scores
We used a number of cisplatin-sensitive and -resistant cell lines to model the Angio_immune group in vitro including the A2780 and A2780CP and a further cisplatin-sensitive and -resistant cell line generated in-house using the HGSOC OVCAR3 cell line. As high-grade serous ovarian cancer accounts for 70% of ovarian cancers (Seidman et al. 2004) we used the OVCAR3 cell line as they have been confirmed as high-grade serous (Domcke et al. 2013). Although this cell line was generated from a patient following treatment with platinum based chemotherapy, our research has demonstrated that this cell line remains sensitive to cisplatin treatment. The cisplatin resistant OVCAR3 cells had an IC50 of 0.29 μM compared to the cisplatin sensitive cells which had an IC50 of 0.066 μM representing a 4.4-fold difference in IC50 doses (
Furthermore, both cisplatin resistant cells (A2780CP and OVCAR3PT) had significantly increased 45-gene signature scores (Angio_Immune) compared to their sensitive counterparts (p=0057 and p=0.0244 respectively) (
Since analysis of the clinical samples of the Discovery dataset demonstrated that the 45-gene signature could predict response to cisplatin based SoC treatment, we used a panel of 15 ovarian cell lines to investigate this further. These cell lines were analysed by DNA microarray analysis using the Ovarian Cancer DSA™ and signature scores were generated as previously described. The cell lines were also used to perform colony formation assays with pharmaceutical grade cisplatin. A ROC curve was generated. Cisplatin resistance (res) or sensitivity (sens) was defined based on the median of the IC50 values, and correlation with signature scores and AUC scores was determined. This demonstrated that in cell line models the 45-gene signature could predict resistance to cisplatin upfront, as shown by the increased 45-gene signature scores in resistant ovarian cell line panels (res) (AUC 0.7917 (0.6350-0.9483), p=0.0008) (
To further investigate whether the Angio_Immune group was driven by this MAPK signalling pathway, we performed in silico analysis of the gene expression data from the Discovery dataset. To do this we firstly identified 3 different gene lists representing MAPK activation from the literature (Dry et al., 2010, Loboda et al., 2010, Creighton et al., 2006). We used these gene lists separately to perform semi-supervised hierarchical clustering of the Discovery dataset which were then compiled to generate a refined gene list representing a MAPK driven patient population (‘MEK ON’, represented by the red box) (
Reverse Phase Proteomic Array (RPPA) data from The Cancer Genome Atlas (TCGA) dataset was utilised. The continuous Phospho-MAPK (pMAPK) scores (serine 217/221) and total MAPK scores were downloaded from TCGA (http://bioinformatics.mdanderson.org/main/TCPA). Phospho-MAPK scores were calculated as a ratio of total MAPK. Gene signature scores were then correlated with the RPPA data and the Angio_Immune gene signature was specifically found to correlate with pMAPK serine 217/221 expression using ROC analysis (
Furthermore the 45-gene signature could predict sensitivity to the MEK inhibitor Trametinib (Mekinist, GSK) as demonstrated by the increased 45-gene signature score in sensitive ovarian cell line panels (sens) (AUC 0.72 (0.5778-0.8690) p=0.009066) (
Assessment of 45 Gene Expression Signature with Drug Response in Sanger Cell Lines
Gene expression data (Affymetrix U219 chip) were downloaded from https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-3610/ and cell line sensitivity data for the MEK inhibitors Trametinib and Selumetinib were downloaded from http://www.cancerrxgene.org/. The cancer cell lines were scored with the 45 gene AngioImmune signature and correlations of signature score and IC50 response were determined as before: by plotting median centred IC50 doses against signature scores and, vice versa, by plotting median centred signature scores against IC50 doses. ROC analysis demonstrated that the 45 gene signature could predict response to both MEK inhibitors in these cell lines. For Trametinib there were 881 solid tumour cell lines which had both gene expression data and drug IC50 data, and for Selumetinib there were 760 solid tumour cell lines which had both gene expression data and drug IC50 data.
As mentioned previously, it has been found that 11% of HGSOC have KRAS amplification and 12% have BRAF amplification, highlighting the potential that drugs targeting the RAS/RAF/MEK/ERK pathway may have utility in HGSOC. The link between KRAS status and MEK was assessed using three publicly available datasets: E-GEOD-55624, which profiles KRAS mutant cancer cells treated with a MEK inhibitor; E-MEXP-3557, which transcriptionally profiles human KRAS mutant and wildtype cells; and E-GEOD-12764, which transcriptionally profiles MCF10 cells with overexpressed HRAS or MEK1. In the E-GEOD-55624 dataset, SW480 cells, which harbour a KRAS mutation (G12D), exhibited a reduction in the 45-gene Angio_Immune signature scores at both 4 and 16 hours following treatment with a MEK inhibitor (p=0.0055 and p=0.0143), in comparison to DMSO control (
a) MEK inhibition in cisplatin resistant OVCAR3 cells, re-sensitises cells to cisplatin
b) To further investigate the role of MEK signalling in driving resistance to cisplatin, OVCAR3-WT and OVCAR3-CP cells were treated with either cisplatin alone or a combination of cisplatin and the MEK inhibitor Trametinib. Following treatment with increasing concentrations of cisplatin, OVCAR3-CP cells formed more colonies than OVCAR3-WT cells. However, the addition of 0.5 μM Trametinib to increasing concentrations of cisplatin resulted in a decrease in colony formation of both OVCAR3-WT and OVCAR3-CP cells (
MAPK is known to phosphorylate SLUG and other key players of the SNAIL/SLUG transcription factors, to induce epithelial-mesenchymal transition (EMT) which is known to be a contributing mechanism to angiogenesis and of progressive disease (Virtakoivu et al., 2015). Gene expression profiling showed that a range of EMT associated genes had higher expression levels in the Angio-Immune subgroup as opposed to the Angio and Immune subgroups. This includes significantly increased expression of VIMENTIN, AXL, TWIST1, SNAIL and SLUG in the Angio_Immune subgroup (p=<0.001) compared to the Angio and Immune subgroups (
Above we showed that MAPK signalling may be driving the Angio_Immune subgroup. Interestingly, cisplatin resistant OVCAR3 and OVCAR4 cells had increased MAPK signalling compared to the cisplatin sensitive counterparts, as measured by phospho-MEK and phospho-ERK protein expression (
We used the EMT on cell line Kuramochi and the OVCAR3 and 4 cisplatin resistant cells to examine the effects of ALM201 on EMT markers and associated phenotypes. ALM201 treatment caused reduced MAPK signalling and EMT signalling in the Kuramochi cell line (
Further investigation suggests that both the MAPK and SRC signalling pathways signal as parallel pathways and may contribute to drug resistance. Treatment with 1 μM Saracatinib (SRCi) over 24 hours, reduces the protein expression of pSRC (left, top panel) whilst increasing the protein levels of pERK (left, third panel). In contrast to this, treatment with 1 μM Trametinib (MEKi) over 24 hours exhibits the opposite effect, enhancing pSRC protein levels (right, third panel) and decreasing pERK protein expression (right, top panel) (
As the Angio_Immune group is driven by MAPK signalling, we hypothesised that the EMT signature may also have prognostic utility in alternative disease indications, namely colon cancer (CRC) and non-small cell lung cancer (NSCLC), which have a high incidence of alterations in the MAPK pathway. We therefore investigated this in two publicly available colon datasets, both in the context of treatment (Marisa GSE40967 and Jorissen GSE14333), and one NSCLC dataset which incorporates an untreated population (Okayama GSE31210). The Marisa dataset, consisting of 566 Stage I-IV colon cancers, had the MEK defined subgroup present in sample cluster 3 (C3) following hierarchical clustering (
In relation to NSCLC, the Okayama dataset consisting of Stage I and II untreated NSCLC samples also had the MEK defined subgroup present in sample cluster 4 (C4) following hierarchical clustering (
The Angio_Immune subgroup may also be defined by the 15-gene signature. As with the 45-gene signature, we hypothesised that the Angio_Immune group would be prognostic in the context of SoC treatment in ovarian cancer. We therefore investigated this in the treatment naive Discovery dataset. The 15-gene signature was also associated with worse prognosis (PFS, HR=1.3564 [1.0156-1.8117]; p=0.0279 and OS, HR=1.3464 [0.9901-1.8308]; p=0.0441) and could predict response to cisplatin based therapy (
Since analysis of the clinical samples of the Discovery dataset demonstrated that the 15-gene signature could similarly predict response to cisplatin based standard of care treatment, we used a panel of 15 ovarian cell lines to investigate this further. These cell lines were analysed by DNA microarray analysis using the Ovarian Cancer DSA™ and signature scores were generated as previously described. The cell lines were also used to perform colony formation assays with pharmaceutical grade cisplatin. A ROC curve was generated. Cisplatin resistance (res) or sensitivity (sens) was defined based on the median of the IC50 values, and correlation with signature scores and AUC scores was determined. This demonstrated that in cell line models the 15-gene signature could predict resistance to cisplatin upfront, as shown by the increased 15-gene signature scores in resistant ovarian cell line panels (res) (AUC 0.6905 (0.5254-0.8556), p=0.2900) (
The 15-Gene Signature is Also Associated with the MAPK Pathway
The potential link between the MAPK pathway and the 15-gene signature was assessed using two previously mentioned publicly available datasets: E-MEXP-3557 and E-GEOD-12764. Data from the E-MEXP-3557 dataset showed that KRAS mutant (KRAS MT) HCT116 cells had an association with the Angio_Immune subgroup, with higher 15-gene signature scores compared to the wildtype HKH2 cells (KRAS WT) (p=0.0443) (
Utilising the Reverse Phase Proteomic Array (RPPA) data from The Cancer Genome Atlas (TCGA) dataset, phospho-MAPK scores were calculated as a ratio of total MAPK. Gene signature scores were then correlated with the RPPA data and, as previously mentioned, the Angio_Immune subgroup was specifically found to correlate with pMAPK serine 217/221 expression using ROC analysis (
Furthermore the 15-gene signature could predict sensitivity to the MEK inhibitor Trametinib (Mekinist, GSK) as demonstrated by the increased 15-gene signature scores in ovarian cell line panels (sens) (AUC 0.7234 (0.5778-0.8690) p=0.0090) (
As mentioned previously, MAPK is known to phosphorylate SLUG and other key players of the SNAIL/SLUG transcription factors, to induce epithelial-mesenchymal transition (EMT) which is known to be a contributing mechanism of progressive disease. Using the E-GEOD-58582 dataset, MCF7 breast cells overexpressing SNAIL show a positive association with the 15-gene signature (p=0.0015) (
Further investigation suggests that both the MAPK and SRC signalling pathways signal as parallel pathways and may contribute to drug resistance. Resistance (res) or sensitivity (sens) to SRC inhibitor, Saracatinib was defined based on the median of the IC50 values and correlation with 15-gene signature scores (Sign Pos vs Sign Neg) and AUC scores determined (AUC 0.7289, (0.5544-0.9035), p=0.01454) (
As the Angio_Immune group is driven by MAPK signalling, we hypothesised that the 15-gene EMT signature may also exhibit prognostic utility in alternative disease indications: colon cancer and non-small cell lung cancer (NSCLC). We therefore investigated this in the two publicly available colon datasets in relation to treatment (Marisa GSE40967 and Jorissen GSE14333) and one untreated NSCLC dataset (Okayama GSE31210). The Jorissen dataset, consisting of 260 Stage I-IV colon cancers, showed a poor prognostic subgroup detected by the 15-gene EMT signature (DFS, p=0.0328) (
With regard to NSCLC, in the Okayama dataset consisting of Stage I and II untreated NSCLC samples, the 15-gene signature was also associated with worse prognosis. The 15-gene EMT signature described as ‘SIG POS’ was associated with poor prognosis (PFS, p=0.0024) (
Furthermore, both the 45-gene and 15-gene signatures could predict sensitivity to the MEK inhibitors Trametinib (Mekinist, GSK) and Selumetinib in cell line data from ‘The Genomics of Drug Sensitivity in Cancer Project’. This is demonstrated by the increased 45-gene signature scores in a panel of cell lines treated with either Trametinib or Selumetinib (sens) (AUC 0.7277 (0.6945-0.7609) p<0.0001 and AUC 0.6598 [0.6213-0.6983]; p<0.0001 respectively) (
Additionally, this is also demonstrated by the increased 15-gene signature scores in a panel of cell lines treated with either Trametinib or Selumetinib (sens) (AUC 0.629 (0.593-0.664) p<0.0001 and AUC 0.619 [0.584-0.654]; p<0.0001 respectively) (
Microvessel density (MVD) is a measure of the angiogenic capacity of a tumor: as the angiogenic potential increases, so too does the MVD. An in-vivo Matrigel-plug assay was used to determine the MVD in the OVCAR3 isogenic cell lines in co-culture with human endothelial colony forming cells (ECFC). The OVCAR3 platinum-resistant cell lines have a higher MVD than the OVCAR3 platinum-naïve pair (p-value: 0.0041) (
To investigate whether any specific chemokines were driving the angiogenesis-like phenotype in the OVCAR3-PT and OVCAR4-PT cells, we performed a cytokine array. This demonstrated increased expression of a number of cytokines that are key regulators of tumour angiogenesis (HGF, VEGF, TIMP1&2, PIGF and Angiogenin) in the PT resistant cells relative to the treatment-naïve control (
As many of the RTKs were associated with the AngioImmune group and post-chemotherapy samples, we wanted to examine whether the OVCAR3-PT resistant cells would be sensitive to inhibitors of RTKs. We used 2 RTK inhibitors, Cediranib, which targets VEGFR1-3 and PDGFRα/β, and Nintedanib, which targets VEGFR1-3, PDGFRα/β and FGFR1-3, and performed 10-day colony formation assays. This demonstrated that Cediranib and Nintedanib have specificity for the cisplatin-resistant OVCAR3 cells relative to the OVCAR3 cisplatin-naïve cell line (fold change 0.2390869 and 0.377, respectively) (
Within a pilot of 56 biopsy samples with de novo metastatic disease, 50 samples passed all QC metrics and were utilised for data analysis, 24 of which were PSA-responders and 26 non PSA-responders. It was observed that within the proportion of PSA responders, there was a significant increase in the 45-gene signature scores (p=0.0083) (
It was also observed that within the proportion of PSA responders, there was a significant increase in the 15-gene signature scores (p=0.04) (
In a retrospective cohort of 322 radical prostatectomy specimens, the EMT signature exhibits a prognostic relevance in Prostate Cancer. The EMT assay predicts biochemical disease recurrence whereby high EMT patients demonstrate a significantly poor prognostic group in comparison to low EMT patients (HR=1.8095 [1.1499-2.8474]; p=0.0658) (
We have demonstrated that the MAPK pathway is a mechanism of innate and acquired resistance to SoC cisplatin-based therapy in HGSOC. The Angio_Immune group represents 44% of treatment naive HGSOCs and also represents a subgroup to which samples migrate after platinum-based chemotherapy treatment. At present all HGSOCs are treated the same, even though there are differences in prognosis on SoC platinum based therapy. The Angio_Immune group is a poor prognosis subgroup on SoC therapy and therefore represents an opportunity for treatment with more targeted and effective therapeutic agents. Importantly, the 45-gene and 15-gene EMT signatures which detect the Angio_Immune group can act as predictive assays for these targeted therapies and hence should predict response to therapeutics in any cancer disease setting. This is significant as these patients cannot be identified by a pathologist. The Angio_Immune group and the 45-gene and 15-gene signatures which detect the subgroup have a number of potential utilities.
The Angio_Immune group represents a poor prognosis subgroup (
We have demonstrated that the Angio_Immune group is driven by MAPK signalling using publicly available and internal datasets and methods (
Since the Angio_Immune group is associated with increased expression of EMT pathway related genes (
We have demonstrated that the MAPK and SRC pathways act in parallel and that activation of the MAPK pathway may predict resistance to inhibitors of the SRC pathway.
Our data suggest that after chemotherapy treatment, patient tumours move into the Angio_Immune group and therefore become more angiogenic-like. The Angio_Immune subgroup is largely driven by angiogenic-like processes (
Following a critical review of the various phase III anti-angiogenic clinical trials, there is a clear change in the biology of ovarian cancer following relapse and particularly on platinum resistance. There is a trend to a higher number of positive anti-angiogenic trials in platinum resistant ovarian cancer (see below). This confirms our hypothesis and suggests that the use of anti-angiogenic targeted agents may work better in second line trials after primary cisplatin treatment. The 45 and 15 EMT signatures should therefore predict response to these agents.
We have demonstrated that the EMT subgroup can predict response to taxane based chemotherapy.
There have been a number of phase II/III clinical trials focusing on targeted treatment with the aim of improving the overall survival of patients with high risk stage Ic and II-IV ovarian cancer. GOG 218 and ICON7 explored the role of bevacizumab (Avastin) in combination with upfront chemotherapy. Bevacizumab is a recombinant humanised monoclonal antibody that binds to vascular endothelial growth factor A (VEGF-A). ICON7 showed superior progression free survival (PFS) in the group of patients who received bevacizumab in combination with standard chemotherapy (20.3 months and 21.8 months, standard therapy versus standard therapy plus bevacizumab respectively (hazard ratio (HR) 0.81; 95% confidence interval (CI), 0.70 to 0.94; P=0.004)). Benefit was observed in patients with high risk disease (defined as FIGO III, ≥1.0 cm disease following debulking or FIGO stage IV).
GOG-218 similarly compared three arms: standard chemotherapy, bevacizumab with standard chemotherapy from cycle 2 to cycle 6, and bevacizumab plus standard chemotherapy from cycle 2 through to cycle 22. There was again superior PFS when comparing standard chemotherapy and standard treatment with the addition of bevacizumab (10.3, 11.2, and 14.1 months, control group, bevacizumab (cycle 2-6), and bevacizumab (cycle 2-22), respectively). The HR for PFS was 0.908; 95% CI, 0.795 to 1.040; P=0.16 for bevacizumab cycle 2-6 and 0.717; 95% CI, 0.625 to 0.824 (P<0.001) for bevacizumab plus chemotherapy from cycle 2-22.
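As a worked check on how a hazard ratio, its confidence interval and its p-value relate, the following minimal sketch recovers an approximate two-sided p-value from a reported HR and 95% CI. It assumes a Wald (normal) approximation on the log(HR) scale; the trials themselves may have used a different test (for example a log-rank test), so the result is only expected to agree approximately. The function name `p_from_hr_ci` is hypothetical.

```python
import math
from statistics import NormalDist


def p_from_hr_ci(hr, lower, upper, level=0.95):
    """Approximate two-sided Wald p-value from a hazard ratio and its CI.

    Assumption: the CI is symmetric on the log scale, i.e. it was built as
    log(HR) +/- z * SE, which is the usual construction for Cox models.
    """
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% CI
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)  # SE of log(HR)
    z = math.log(hr) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Values quoted above for GOG-218 (PFS).
p_short = p_from_hr_ci(0.908, 0.795, 1.040)  # bevacizumab cycle 2-6
p_long = p_from_hr_ci(0.717, 0.625, 0.824)   # bevacizumab cycle 2-22

print(f"cycle 2-6:  p ~ {p_short:.2f}")
print(f"cycle 2-22: p < 0.001 is {p_long < 0.001}")
```

Under this approximation the cycle 2-6 arm gives a p-value of roughly 0.16 and the cycle 2-22 arm a p-value below 0.001, in line with the figures reported for the trial.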
As previously discussed, the prognosis of ovarian cancer is directly correlated with platinum sensitivity and the timing of recurrence or relapse following completion of platinum-based chemotherapy. Therefore, anti-angiogenics were explored in recurrent/relapsed ovarian cancer. Two phase III trials, OCEANS and AURELIA, explored the impact of bevacizumab in relation to the timing of disease relapse following platinum-based chemotherapy.
OCEANS is a phase III trial exploring the efficacy of bevacizumab in combination with gemcitabine and carboplatin (GC) in platinum sensitive recurrent ovarian cancer. Patients were assigned to bevacizumab+GC or placebo+GC for a total of six to ten cycles. Median progression free survival was 12.4 months and 8.4 months respectively (HR 0.484; 95% CI: 0.388-0.605; p-value<0.0001). The role of bevacizumab in platinum resistant epithelial ovarian cancer was explored in the AURELIA trial. In this phase III clinical trial, patients were randomly assigned to single agent chemotherapy (investigator's choice: pegylated doxorubicin, paclitaxel or topotecan) with or without bevacizumab. The primary endpoint of progression free survival was met: 6.7 months in the chemotherapy and bevacizumab treated patients and 3.4 months in the chemotherapy alone arm (HR 0.42; 95% CI: 0.32-0.53; p-value<0.001). The objective response rate (ORR) in patients treated with bevacizumab and chemotherapy was 30.9%. The PFS benefit was accompanied by a numerically longer overall survival, 16.6 months in the chemotherapy and bevacizumab arm compared to 13.3 months in the chemotherapy alone arm (HR 0.89; 95% CI: 0.69-1.14; p-value<0.174), although this difference was not statistically significant. This trial led to the Food and Drug Administration (FDA) approval of bevacizumab in platinum resistant epithelial ovarian cancer in November 2014, with an added benefit in relation to progression free survival in patients treated with bevacizumab and paclitaxel (PFS 9.6 months).
ICON6 was an international phase III clinical trial testing the efficacy of cediranib, a potent oral inhibitor of VEGFR 1, 2 and 3 that has a direct effect in stopping the VEGF signal, in relapsed platinum sensitive epithelial ovarian cancer. Patients were randomised into three arms: chemotherapy plus placebo maintenance (reference arm), concurrent platinum chemotherapy and cediranib followed by maintenance cediranib or cediranib plus maintenance cediranib. There was a statistically significant progression free and overall survival benefit (PFS 9.4 months and 11.4 months respectively; HR 0.68; p-value=0.0022; overall survival 17.6 months and 20.3 months respectively; HR 0.70; p-value=0.0419).
Many cancers are driven by alterations in the MAPK pathway. For example, mutations in the RAS genes (KRAS, HRAS, and NRAS) are present in approximately 50% of all patients with colorectal cancer. This results in hyper-activation of RAS proteins and their corresponding downstream pathways such as the MAPK pathway, thereby stimulating the development and progression of malignancy. We have demonstrated that both the 45 and 15 gene signatures predict poor prognosis subgroups in colon, lung and prostate cancer (
This analysis evaluates whether the gene expression data for each of the signature genes can significantly detect the MEK subtype independently of the other genes. For each gene, an area under the receiver operating characteristic curve (AUC) and an ANOVA p-value were calculated in the internal training samples. Signature genes with an ANOVA p-value less than 0.05 are statistically significant at detecting the subtype, independent of the other signature genes. C-index values were also generated to determine whether the gene expression for each of the signature genes can significantly detect PFS (Progression Free Survival) independently of the other signature genes. The upper and lower confidence intervals (CI) were derived using bootstrapping with 1000 samplings. If the C-index lower CI is greater than 0.5 or the upper CI is less than 0.5, then the C-index indicates that the gene expression is significantly associated with the observed survival.
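The C-index criterion described above can be illustrated with a minimal sketch. It assumes Harrell's concordance index and a percentile bootstrap, neither of which is specified in the text, and it runs on synthetic data generated purely for illustration; the names `c_index` and `bootstrap_ci` are hypothetical.

```python
import math
import random


def c_index(times, events, scores):
    """Harrell's C: share of comparable pairs in which the subject with the
    higher score (here, gene expression) has the shorter observed time."""
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i had an event before time j.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    return concordant / comparable if comparable else float("nan")


def bootstrap_ci(times, events, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the C-index (assumed method)."""
    rng = random.Random(seed)
    n = len(times)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        c = c_index([times[k] for k in idx],
                    [events[k] for k in idx],
                    [scores[k] for k in idx])
        if not math.isnan(c):
            stats.append(c)
    stats.sort()
    lower = stats[int((alpha / 2) * len(stats))]
    upper = stats[int((1 - alpha / 2) * len(stats)) - 1]
    return lower, upper


# Synthetic example: one gene, 60 samples, higher expression -> shorter PFS.
rng = random.Random(1)
expr = [rng.gauss(0, 1) for _ in range(60)]
pfs = [rng.expovariate(math.exp(0.8 * x)) for x in expr]
event = [rng.random() < 0.7 for _ in expr]  # roughly 30% censored

point = c_index(pfs, event, expr)
lo, hi = bootstrap_ci(pfs, event, expr)
# Decision rule from the text: the CI must exclude 0.5 on either side.
significant = lo > 0.5 or hi < 0.5
print(f"C-index {point:.3f}, 95% CI [{lo:.3f}, {hi:.3f}], significant: {significant}")
```

A confidence interval lying entirely below 0.5 is treated as significant as well, since it indicates a consistent association in the opposite direction.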
The purpose of evaluating the core gene set of the signature is to determine a ranking for the genes based upon their impact on performance when removed from the signature.
This analysis involved 10,000 random samplings of 10 signature genes from the original 15 signature gene set. At each iteration, 10 randomly selected signature genes were removed and the performance of the remaining 5 genes was evaluated using the endpoint to determine the impact on HR (Hazard Ratio) performance when these 10 genes were removed in the following 2 datasets:
ICON7 was evaluated using the PFS endpoint and Tothill was evaluated using the OS (Overall Survival) endpoint. Within each of the 2 datasets, the signature genes were weighted based upon the change in HR performance (Delta HR) based upon their inclusion or exclusion. Genes ranked ‘1’ have the most negative impact on performance when removed and those ranked ‘15’ have the least impact on performance when removed.
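The leave-out ranking described above can be sketched as follows. This is an illustrative outline only: the `evaluate` callable stands in for the HR calculation on a real dataset, and the toy scorer, gene names and weights below are hypothetical placeholders rather than signature genes or trial results. In the toy scorer a higher value means better performance, so a negative delta means performance was lost.

```python
import random
from collections import defaultdict


def rank_core_genes(genes, evaluate, n_remove=10, n_iter=10_000, seed=0):
    """Rank genes by the mean change in performance when they are removed.

    evaluate(gene_list) -> float is supplied by the caller; in the analysis
    described in the text it would be the HR-based performance of the
    signature restricted to gene_list.
    """
    rng = random.Random(seed)
    baseline = evaluate(genes)
    total = defaultdict(float)
    count = defaultdict(int)
    for _ in range(n_iter):
        removed = rng.sample(genes, n_remove)
        removed_set = set(removed)
        remaining = [g for g in genes if g not in removed_set]
        delta = evaluate(remaining) - baseline  # negative = performance lost
        for g in removed:
            total[g] += delta
            count[g] += 1
    mean_delta = {g: total[g] / count[g] for g in genes if count[g]}
    # Rank 1 = most negative mean delta, i.e. largest loss when removed.
    ordered = sorted(mean_delta, key=mean_delta.get)
    ranks = {g: r for r, g in enumerate(ordered, start=1)}
    return ranks, mean_delta


# Toy stand-in: 15 placeholder genes with hidden, decreasing contributions.
genes = [f"G{i:02d}" for i in range(1, 16)]
toy_weight = {g: float((16 - i) ** 2) for i, g in enumerate(genes, start=1)}


def toy_evaluate(gene_list):
    """Hypothetical performance measure: sum of hidden gene contributions."""
    return sum(toy_weight[g] for g in gene_list)


ranks, mean_delta = rank_core_genes(genes, toy_evaluate)
for g in sorted(ranks, key=ranks.get)[:3]:
    print(g, ranks[g], round(mean_delta[g], 1))
```

Averaging the delta over every iteration in which a gene was removed is one possible weighting; the text does not state how Delta HR values were aggregated across iterations.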
The purpose of evaluating the minimum number of genes is to determine if significant performance can be achieved within smaller subsets of the original signature.
This analysis involved 10,000 random samplings of the 15 signature genes starting at 1 gene/feature, up to a maximum of 10 genes/features. For each randomly selected feature length, the signature was redeveloped using the PLS machine learning method under CV and model parameters derived. At each feature length, all randomly selected signatures were applied to calculate signature scores for the following 2 datasets:
Continuous signature scores were evaluated with outcome to determine the HR effect; ICON7 was evaluated with PFS and Tothill was evaluated with OS. The HR for all random signatures at each feature length was summarized and figures generated to visualize the performance over CV.
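The subset scan described above can be outlined in the same way. In this minimal sketch the `hazard_ratio_for` callable is a placeholder for the full step of redeveloping the signature with PLS under cross-validation and estimating the HR on a dataset; the toy function below is hypothetical and carries no real data. It also assumes that the 10,000 samplings are split evenly across feature lengths, which the text leaves open.

```python
import math
import random
from statistics import median, quantiles


def min_gene_scan(genes, hazard_ratio_for, max_len=10, n_iter=10_000, seed=0):
    """Summarise the HR of randomly drawn signatures at each feature length.

    hazard_ratio_for(subset) -> float is supplied by the caller.
    Returns {length: (median HR, lower quartile, upper quartile)}.
    """
    rng = random.Random(seed)
    per_length = max(2, n_iter // max_len)  # assumption: even split
    summary = {}
    for length in range(1, max_len + 1):
        hrs = [hazard_ratio_for(rng.sample(genes, length))
               for _ in range(per_length)]
        q1, _, q3 = quantiles(hrs, n=4)
        summary[length] = (median(hrs), q1, q3)
    return summary


# Toy stand-in: each placeholder gene adds a small hidden log-hazard effect.
genes = [f"G{i:02d}" for i in range(1, 16)]
toy_effect = {g: 0.01 * i for i, g in enumerate(genes, start=1)}


def toy_hr(subset):
    """Hypothetical HR for a subset; not derived from any dataset."""
    return math.exp(sum(toy_effect[g] for g in subset))


summary = min_gene_scan(genes, toy_hr)
for length, (med, q1, q3) in summary.items():
    print(f"{length:2d} genes: median HR {med:.2f} (IQR {q1:.2f}-{q3:.2f})")
```

Plotting the median and interquartile range against feature length gives the kind of figure the text refers to, showing the smallest subset size at which performance stabilises.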
The results for the subset analysis of the 15 gene signature in the internal training dataset are provided in this section.
The results for the core gene analysis of the 15 gene signature in the 2 datasets are provided in this section.
The results for the minimum gene analysis of the 15 gene signature in the 2 datasets are provided in this section.
This analysis evaluates whether the gene expression data for each of the signature genes can significantly detect the MEK subtype independently of the other genes. For each gene, an area under the receiver operating characteristic curve (AUC) and an ANOVA p-value were calculated in the internal training samples. Signature genes with an ANOVA p-value less than 0.05 are statistically significant at detecting the subtype, independent of the other signature genes. C-index values were also generated to determine whether the gene expression for each of the signature genes can significantly detect PFS (Progression Free Survival) independently of the other signature genes. The upper and lower confidence intervals (CI) were derived using bootstrapping with 1000 samplings. If the C-index lower CI is greater than 0.5 or the upper CI is less than 0.5, then the C-index indicates that the gene expression is significantly associated with the observed survival.
The purpose of evaluating the core gene set of the signature is to determine a ranking for the genes based upon their impact on performance when removed from the signature.
This analysis involved 10,000 random samplings of 10 signature genes from the original 45 signature gene set. At each iteration, 10 randomly selected signature genes were removed and the performance of the remaining 35 genes was evaluated using the endpoint to determine the impact on HR (Hazard Ratio) performance when these 10 genes were removed in the following 2 datasets:
ICON7 was evaluated using the PFS endpoint and Tothill was evaluated using the OS (Overall Survival) endpoint. Within each of the 2 datasets, the signature genes were weighted based upon the change in HR performance (Delta HR) based upon their inclusion or exclusion. Genes ranked ‘1’ have the most negative impact on performance when removed and those ranked ‘45’ have the least impact on performance when removed.
The purpose of evaluating the minimum number of genes is to determine if significant performance can be achieved within smaller subsets of the original signature.
This analysis involved 10,000 random samplings of the 45 signature genes starting at 1 gene/feature, up to a maximum of 20 genes/features. For each randomly selected feature length, the signature was redeveloped using the PLS machine learning method under CV and model parameters derived. At each feature length, all randomly selected signatures were applied to calculate signature scores for the following 2 datasets:
Continuous signature scores were evaluated with outcome to determine the HR effect; ICON7 was evaluated with PFS and Tothill was evaluated with OS. The HR for all random signatures at each feature length was summarized and figures generated to visualize the performance over CV.
The results for the subset analysis of the 45 gene signature in the internal training dataset are provided in this section.
The results for the core gene analysis of the 45 gene signature in the 2 datasets are provided in this section.
The results for the minimum gene analysis of the 45 gene signature in the 2 datasets are provided in this section.
Number | Date | Country | Kind |
---|---|---|---|
1604398.6 | Mar 2016 | GB | national |
1609944.2 | Jun 2016 | GB | national |
1621384.5 | Dec 2016 | GB | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/GB2017/050712 | 3/13/2017 | WO | 00 |