The present invention relates to materials and methods for predicting response to radiotherapy among cancer patients, particularly patients having muscle-invasive bladder cancer.
Muscle-invasive bladder cancer (MIBC) is a heterogeneous disease associated with marked variation in its behaviour and clinical outcomes. Despite surgical and oncological advances, 5-year overall survival has not significantly changed and remains at approximately 50% (Stein et al., 2001). With over 10 000 new cases in 2015, bladder cancer is the 7th most common cancer in the UK (CRUK, Bladder Cancer Statistics, 2016). Incidence increases with age and the majority of patients are aged 75 or over at diagnosis. As such, this is a patient population where predictive and prognostic biomarkers would be particularly advantageous to ensure prompt delivery of treatment with likely benefit, and minimise unnecessary toxicity from a treatment likely to fail.
At a clinical level, bladder cancer is divided by histological assessment into non-muscle invasive disease and muscle-invasive disease. Non-muscle invasive bladder cancer (NMIBC) is usually treated with local resection and intravesical agents to reduce the risk of recurrence. While risk of recurrence is high, the majority do not progress further and generally carry a good prognosis. Muscle-invasive bladder cancer (MIBC) however has an aggressive phenotype with a poor prognosis.
Definitive radical management of MIBC currently includes cystectomy with pelvic lymph node dissection, or bladder preservation with chemoradiation. Indeed, bladder preservation with combined modality treatment (CMT) is now increasingly recognised as an alternative to radical surgery. 5-year overall survival rates of 50-57% have been reported with CMT (Mak et al., 2014; Ploussard et al., 2014), which included a maximal transurethral resection (TURBT) and chemoradiation. Locoregional relapse rates at 2 years of 67% have been reported, with over half due to NMIBC disease only (13). In those with locoregional invasive relapse, salvage cystectomy may be performed and 5-year survival rates of 10-30% (Chang et al., 2017; Lee et al., 2006) are documented. However, this cohort of patients has been subjected to the toxicity of both radiation and surgery, with the delay to effective treatment potentially compromising overall outcome. The decision between surgery and radiotherapy is currently based upon patient factors and disease parameters. In current clinical practice, there are no validated biomarkers to guide the decision between the two modalities.
There is a real clinical need for further translational research in MIBC to identify predictive and prognostic biomarkers to guide treatment strategy for individual patients.
Molecular subtyping at a transcriptomic level refers to the classification of a disease based on gene expression profiles, where samples with similar gene expression features are clustered together into a subgroup. Several groups have explored molecular subtypes in MIBC and the number of subtypes reported ranges from 2 to 7 (Robertson et al., 2017; Cancer Genome Atlas Research, N, 2014; Dyrskjot et al., 2003; Blaveri et al., 2005; Sanchez-Carbayo et al., 2006; Lindgren, et al., 2010; Choi et al., 2014; Damrauer et al., 2014; Seiler et al., 2017). In particular, Robertson et al. (2017) reported a comprehensive analysis of 412 MIBC samples from the Cancer Genome Atlas (TCGA) characterised by mutation data, mRNA, long non-coding RNA and miRNA expression data. mRNA expression clustering was performed in this study (by non-negative matrix factorisation). This identified five expression subtypes (luminal papillary, luminal infiltrated, luminal, basal-squamous and neuronal) and a set of 46 genes whose expression characterised the different subtypes. The subtypes were significantly associated with overall survival following surgery, with the neuronal subtype having poor survival. Based on the genetic and expression markers identified as associated with the different subtypes, Robertson et al. speculated on potential chemo-/immune-therapies that may be particularly suitable for each subtype. However, these speculations were not validated and sensitivity to radiotherapy was not discussed.
Several groups have also explored gene signatures associated with radiosensitivity, albeit not in the context of bladder cancer. For example, the radiosensitivity index (RSI) was derived from work investigating the surviving fraction at 2 Gy (SF2) in 48 cell lines from the National Cancer Institute panel of 60 using microarray gene expression data (Eschrich et al., 2009). This work identified 10 genes whose expression can be combined in a linear equation to obtain a “radiosensitivity index”. The genes are AR, cJun, STAT1, PKC, RelA, cABL, SUMO1, CDK1, HDAC1 and IRF1. The approach was validated using 2 prospective pilot cohorts of patients with rectal and oesophageal cancer undergoing pre-operative chemoradiation (n=14 and 12 respectively). The authors reported that the model distinguished responders from non-responders with mean RSI values of 0.34 vs 0.48 (p=0.002) respectively. The approach was subsequently applied to other cancers such as breast cancer (Eschrich et al., 2012), glioblastoma multiforme (Ahmed et al., 2015) and pancreatic cancer (Strom et al., 2015). However, no successful application to bladder cancer has been reported.
Designing a panel of genes whose expression profiles could provide predictive or prognostic information for MIBC is particularly challenging because MIBC is a heterogeneous disease with no single specific mutation identified in the majority of cases. This is in contrast to e.g. melanoma where, on reviewing the TCGA dataset, the BRAF V600E mutation was detected in 206/240 (85.8%) patients (Cancer Genome Atlas, N, 2015). There is no such single mutation, or even group of documented specific mutations, that can achieve the same in MIBC.
While previously described predictive models of bladder cancer show promise, there remains an unmet need for further models able to predict treatment response and/or survival of bladder cancer patients following radiotherapy+/−chemotherapy. The present invention seeks to fulfil these needs and provides further related advantages.
The present inventors initially sought to validate the prognostic and predictive effects of previously disclosed cancer subtype classifiers (respectively developed for colorectal cancer and MIBC) in a cohort of patients having undergone radiotherapy+/−chemotherapy in the context of a bladder preservation strategy. However, no statistically significant differences in survival (overall or locoregional) were seen between the subtypes identified using these approaches. The inventors therefore carried out an analysis to a) investigate whether intrinsic subtypes could be identified in the radiotherapy treated cohort that differ in their survival post-radiotherapy and b) identify genes the expression of which, alone or as part of a gene expression signature, can be used to identify patients that differ in their survival post-radiotherapy. A signature comprising 71 genes was found to stratify patients into subtypes that are associated with different clinical outcomes, including at least locoregional relapse free survival and pathological complete response rates post-radiotherapy. When applied to an independent data set of bladder cancer patients, the signature was found to stratify patients in subtypes that are associated with different overall survival. Further reduced signatures were identified that are associated with clinical outcomes post radiotherapy+/−chemotherapy by investigating genes that drive the separation between groups of patients with good and poor prognosis following radiotherapy.
Accordingly, in a first aspect the present invention provides a method for predicting the treatment response of a human bladder cancer patient, the method comprising:
In embodiments, the at least 9, at least 10, at least 15, at least or at least 30 of the genes from Group 1 in Table 10 are selected from: KRT20, SFRP4, SNAI2, TWIST1, ZEB1, ZEB2, APLP1, C7, CD44, CDH2, CLDN3, CLDN4, CLDN7, COL17A1, COMP, CXCL11, DES, DSC3, FGFR3, FOXA1, GATA3, GNG4, GSDMC, KRT14, KRT5, KRT6A, L1CAM, MSI1, PDCD1LG2, PEG10, PGM5, PI3, PLEKHG4B, PPARG, RND2, SAA1, SGCD, SNAI1, SNX31, SOX2, TGM1, TP63, TUBB2B, UPK1A, UPK2, and CD274. In embodiments, the at least, at least 2, at least 3 or at least genes from Groups 2-4 in Table 10 are selected from: SUMO1, RelA, PKC, CDK1, HDAC1, AR, IRF1, cJun, cABL, STAT1, Trex1, STING, HIF1alpha, cGAS, AIMP3, KTM2D/MLL2, TXNIP, SLX4, BCLAF1, RAD50, RAD54L, RB1, NBN, NFEL2L2, PALB2, MRE11, PARP1, KAT5, E2F3, ERCC1, ERCC2, ERCC4, ERCC5, ERCC6, FANCB, FANCD2, FANCF, FANCG, KDM6A/UTX, ARID1A, ATM, ATR, BRCA1, BRCA2, BRIP1, and AREG.
In embodiments, the method comprises measuring the gene expression of:
In embodiments, the method comprises measuring the gene expression of:
In embodiments, the method comprises measuring the gene expression of: at least 10 genes, preferably at least 15 genes from Groups 2-4 in Table 10. In embodiments, the method comprises measuring the gene expression of: at least 35 genes, preferably at least 39 genes from Group 1 in Table 10.
In embodiments, the genes from Groups 2-4 include at least 1, at least 2, at least 3, at least 4, at least 5, at least 10 or all of the following genes: RelA, CDK1, HDAC1, Trex1, STING, RAD54L, RB1, MRE11, ERCC4, ERCC6, FANCD2, FANCF, FANCG, ATM, and ATR.
In embodiments, the genes from Group 1 include at least 9, at least 10, at least 15, at least 20, at least 30, at least 35 or all 39 of the following genes: KRT20, SFRP4, TWIST1, ZEB1, ZEB2, APLP1, C7, CD44, CDH2, CLDN3, CLDN4, CLDN7, COL17A1, COMP, DES, DSC3, FGFR3, FOXA1, GATA3, GNG4, GSDMC, KRT14, KRT5, KRT6A, L1CAM, MSI1, PGM5, PI3, PPARG, RND2, SAA1, SGCD, SNX31, TGM1, TP63, TUBB2B, UPK1A, UPK2, and CD274.
In embodiments, the genes from Group 1 include at least 9, at least 10, at least 15, at least 20 or all 29 of the following genes: TUBB2B, KRT14, KRT5, KRT20, UPK2, DES, SFRP4, SNX31, PI3, FOXA1, CLDN3, UPK1A, CLDN4, TWIST1, MSI1, CLDN7, ZEB2, KRT6A, FGFR3, COMP, PPARG, L1CAM, DSC3, SAA1, TP63, GNG4, TGM1, SGCD, and GATA3. In embodiments, the genes from Group 1 include at least 9, at least 10, at least 15, or all 20 of the following genes: TUBB2B, KRT14, KRT5, KRT20, UPK2, DES, SNX31, SFRP4, PI3, CLDN3, FOXA1, UPK1A, CLDN4, TWIST1, CLDN7, MSI1, FGFR3, KRT6A, ZEB2, and PPARG.
In embodiments, the genes from Group 1 include at least TUBB2B, KRT14, KRT5, KRT20, UPK2, DES, SNX31, SFRP4, and PI3. In embodiments, the genes from Group 1 further include one or more of: CLDN3, FOXA1, UPK1A, CLDN4, TWIST1, CLDN7, MSI1, FGFR3, KRT6A, ZEB2, and PPARG. In embodiments, the genes from Group 1 further include one or more of: TWIST1, ZEB1, ZEB2, APLP1, C7, CD44, CDH2, CLDN3, CLDN4, CLDN7, COL17A1, COMP, DSC3, FGFR3, FOXA1, GATA3, GNG4, GSDMC, KRT14, KRT5, KRT6A, L1CAM, MSI1, PGM5, PPARG, RND2, SAA1, SGCD, TGM1, TP63, UPK1A and CD274. In embodiments, the genes from Group 1 further include one or more of: TWIST1, ZEB1, ZEB2, APLP1, C7, CD44, CDH2, CLDN3, CLDN4, CLDN7, COL17A1, COMP, DSC3, FGFR3, FOXA1, GATA3, GNG4, GSDMC, KRT6A, L1CAM, MSI1, PGM5, PI3, PPARG, RND2, SAA1, SGCD, TGM1, TP63, UPK1A, UPK2, PDCD1LG2 and CD274.
The present inventors have demonstrated that a classifier with clinically useful predictive power could be built based on the gene expression profiles of 54 genes, 15 of which were selected from Groups 2-4 and 39 of which were selected from Group 1. The present inventors have further demonstrated that a classifier with clinically useful predictive power could be built based on the gene expression profiles of 32 genes, 3 of which were selected from Groups 2-4 and 29 of which were selected from Group 1.
In embodiments, the method comprises measuring RelA, CDK1, HDAC1, Trex1, STING, RAD54L, RB1, MRE11, ERCC4, ERCC6, FANCD2, FANCF, FANCG, ATM, and ATR (Groups 2-4) and KRT20, SFRP4, TWIST1, ZEB1, ZEB2, APLP1, C7, CD44, CDH2, CLDN3, CLDN4, CLDN7, COL17A1, COMP, DES, DSC3, FGFR3, FOXA1, GATA3, GNG4, GSDMC, KRT14, KRT5, KRT6A, L1CAM, MSI1, PGM5, PI3, PPARG, RND2, SAA1, SGCD, SNX31, TGM1, TP63, TUBB2B, UPK1A, UPK2, and CD274 (Group 1).
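As an illustration only, the composition of a candidate panel can be checked against the gene counts recited in these embodiments. The following Python sketch uses the two gene lists given immediately above (not the full content of Table 10, which is not reproduced here). The function name `check_panel` and its default thresholds (at least 35 Group 1 genes, at least 10 genes from Groups 2-4, not more than 100 genes in total) are assumptions chosen to mirror the embodiments described herein.

```python
# Gene lists copied from the 54-gene embodiment described above.
GROUPS_2_TO_4 = [
    "RelA", "CDK1", "HDAC1", "Trex1", "STING", "RAD54L", "RB1", "MRE11",
    "ERCC4", "ERCC6", "FANCD2", "FANCF", "FANCG", "ATM", "ATR",
]
GROUP_1 = [
    "KRT20", "SFRP4", "TWIST1", "ZEB1", "ZEB2", "APLP1", "C7", "CD44",
    "CDH2", "CLDN3", "CLDN4", "CLDN7", "COL17A1", "COMP", "DES", "DSC3",
    "FGFR3", "FOXA1", "GATA3", "GNG4", "GSDMC", "KRT14", "KRT5", "KRT6A",
    "L1CAM", "MSI1", "PGM5", "PI3", "PPARG", "RND2", "SAA1", "SGCD",
    "SNX31", "TGM1", "TP63", "TUBB2B", "UPK1A", "UPK2", "CD274",
]


def check_panel(measured, min_group_1=35, min_groups_2_to_4=10, max_total=100):
    """Hypothetical helper: count how many measured genes fall in each list.

    Membership is tested against the two lists above only, so genes that are
    in Table 10 but not in these lists are counted in the total alone.
    """
    measured = set(measured)
    n_group_1 = len(measured & set(GROUP_1))
    n_groups_2_to_4 = len(measured & set(GROUPS_2_TO_4))
    return {
        "group_1": n_group_1,
        "groups_2_to_4": n_groups_2_to_4,
        "total": len(measured),
        "ok": (
            n_group_1 >= min_group_1
            and n_groups_2_to_4 >= min_groups_2_to_4
            and len(measured) <= max_total
        ),
    }


full_panel = check_panel(GROUP_1 + GROUPS_2_TO_4)
small_panel = check_panel(GROUP_1[:5])
```

Applied to the complete lists, the helper counts 39 Group 1 genes and 15 genes from Groups 2-4, i.e. the 54 genes of this embodiment.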
In embodiments, the measured genes from Groups 2-4 comprise RAD54L, ATR, cGAS, ERCC1, ERCC6, PI3, RelA, MRE11, SUMO1, Trex1, and/or ATM.
The present inventors have found that the expression levels of each of these genes strongly differentiated patients classified in subtypes which have a good prognosis following (chemo)radiation (e.g. subtypes 4 and 5) from patients classified in subtypes which have a poor prognosis following (chemo)radiation (e.g. subtypes 1 and 3).
Throughout this disclosure, reference to a method for predicting the treatment response of a human bladder cancer patient also encompasses a method for predicting whether a human bladder cancer patient is likely to be sensitive to therapy (such as radiotherapy or chemoradiotherapy), or resistant to therapy (such as radiotherapy or chemoradiotherapy).
In embodiments, the measured genes from Groups 2-4 comprise RAD54L and/or ATM.
The present inventors have found that the expression levels of RAD54L and ATM both strongly differentiated patients classified in subtype 4 and/or patients in subtype 5 (which have a good prognosis following (chemo)radiation) from patients classified in subtype 1 (which have a poor prognosis following (chemo)radiation).
In embodiments, the measured genes from Groups 2-4 comprise Trex1, MRE11 and RAD54L.
In embodiments, the measured genes from Groups 2-4 further comprise one or more of RelA, CDK1, HDAC1, STING, RB1, ERCC4, ERCC6, FANCD2, FANCF, FANCG, ATM, and ATR.
In embodiments, the measured genes from Groups 2-4 further comprise one or more of RelA, CDK1, HDAC1, cGAS, AIMP3, STING, RB1, MRE11, ERCC4, ERCC6, FANCD2, FANCF, FANCG, ATM, ATR, TXNIP, SLX4, BCLAF1, RAD50, NBN, E2F3, ERCC1, ERCC5, FANCB, BRCA2 and BRIP1.
In embodiments, the measured genes from Group 1 comprise one or more of the following genes: KRT5, SFRP4, DES, PI3, CLDN3, CLDN7, KRT14, ZEB2, COMP, C7, CLDN4, SGCD, ZEB1, COL17A1, TGM1, DSC3, KRT6A, and TWIST1.
The present inventors have found that the expression levels of each of these Group 1 genes strongly differentiated patients classified in subtype 4 and/or patients classified in subtype 5 (which have a good prognosis following (chemo)radiation) from patients classified in subtype 1 (which have a poor prognosis following (chemo)radiation).
In embodiments, the measured genes from Group 1 comprise one or more of the following genes: C7, CD247, CD44, CLDN3, CLDN7, CLDN4, KRT6A, SAA1, SFRP4, TGM1, and TWIST1.
The present inventors have found that the expression levels of each of these genes strongly differentiated patients classified in subtypes 4-5 (which have a good prognosis following (chemo)radiation) from patients classified in subtypes 1-3 (which have a poor prognosis following (chemo)radiation). Without wishing to be bound by theory, the subset of genes (in particular Group 1 genes) that best separate subtypes 1-3 and 4-5 may not be identical to the set of genes that best separate subtypes 4-5 from subtype 1, for example because subtypes 1-3 may each contain samples that are biologically distinct for each subtype, which distinction may or may not associate with treatment response.
In embodiments, the method comprises measuring the gene expression of at least 20 genes, preferably at least 25 genes or at least 28 genes from Groups 2-4 in Table 10. In embodiments, the method comprises measuring the gene expression of at least 31 genes from Groups 2-4 in Table 10. In embodiments, the method comprises measuring the gene expression of at least 40 genes from Group 1 in Table 10.
In embodiments, the genes from Groups 2-4 include at least 1, at least 2, at least 3, at least 4, at least 5, at least 10, at least 15, at least 20, at least 25 or all of the following genes: RelA, CDK1, HDAC1, Trex1, cGAS, AIMP3, STING, RAD54L, RB1, MRE11, ERCC4, ERCC6, FANCD2, FANCF, FANCG, ATM, ATR, TXNIP, SLX4, BCLAF1, RAD50, NBN, E2F3, ERCC1, ERCC5, FANCB, BRCA2 and BRIP1.
In embodiments, the at least 31 genes from Groups 2-4 include all of the following genes: RelA, CDK1, SUMO1 and HDAC1 (Group 2); Trex1, cGAS, AIMP3 and STING (Group 3); and RAD54L, RB1, MRE11, ERCC4, ERCC6, FANCD2, FANCF, FANCG, ATM, ATR, TXNIP, SLX4, BCLAF1, RAD50, NBN, E2F3, ERCC1, ERCC5, FANCB, BRCA2, BRCA1, KTM2D/MLL2 and BRIP1 (Group 4).
In embodiments, the at least 40 genes from Group 1 include the following genes: KRT20, SFRP4, TWIST1, ZEB1, ZEB2, APLP1, C7, CD44, CDH2, CLDN3, CLDN4, CLDN7, COL17A1, COMP, DES, DSC3, FGFR3, FOXA1, GATA3, GNG4, GSDMC, KRT14, KRT5, KRT6A, L1CAM, MSI1, PGM5, PI3, PPARG, RND2, SAA1, SGCD, SNX31, TGM1, TP63, TUBB2B, UPK1A, UPK2, PDCD1LG2 and CD274.
In embodiments, the method comprises measuring the gene expression of KRT20, SFRP4, TWIST1, ZEB1, ZEB2, APLP1, C7, CD44, CDH2, CLDN3, CLDN4, CLDN7, COL17A1, COMP, DES, DSC3, FGFR3, FOXA1, GATA3, GNG4, GSDMC, KRT14, KRT5, KRT6A, L1CAM, MSI1, PGM5, PI3, PPARG, RND2, SAA1, SGCD, SNX31, TGM1, TP63, TUBB2B, UPK1A, UPK2, PDCD1LG2 and CD274 (Group 1) and RelA, CDK1, HDAC1, Trex1, cGAS, AIMP3, STING, RAD54L, RB1, MRE11, ERCC4, ERCC6, FANCD2, FANCF, FANCG, ATM, ATR, TXNIP, SLX4, BCLAF1, RAD50, NBN, E2F3, ERCC1, ERCC5, FANCB, BRCA2 and BRIP1 (Groups 2-4).
In embodiments, the method comprises measuring the gene expression of KRT20, SFRP4, TWIST1, ZEB1, ZEB2, APLP1, C7, CD44, CDH2, CLDN3, CLDN4, CLDN7, COL17A1, COMP, DES, DSC3, FGFR3, FOXA1, GATA3, GNG4, GSDMC, KRT14, KRT5, KRT6A, L1CAM, MSI1, PGM5, PI3, PPARG, RND2, SAA1, SGCD, SNX31, TGM1, TP63, TUBB2B, UPK1A, UPK2, PDCD1LG2 and CD274 (Group 1); RelA, CDK1, SUMO1 and HDAC1 (Group 2); Trex1, cGAS, AIMP3 and STING (Group 3); and RAD54L, RB1, MRE11, ERCC4, ERCC6, FANCD2, FANCF, FANCG, ATM, ATR, TXNIP, SLX4, BCLAF1, RAD50, NBN, E2F3, ERCC1, ERCC5, FANCB, BRCA2, BRCA1, KTM2D/MLL2 and BRIP1 (Group 4).
The present inventors have demonstrated that a classifier with clinically useful predictive power could be built based on the gene expression profiles of 68 genes, 28 of which were selected from Groups 2-4 and 40 of which were selected from Group 1.
The present inventors have further demonstrated that an optimal classification performance could be achieved using the gene expression profiles of 71 genes, 31 of which were selected from Groups 2-4 and 40 of which were selected from Group 1.
In embodiments, the genes measured from Groups 2-4 include one or more of the following genes: HDAC1, ERCC5, PKC (PRRT2), MRE11, BRCA2, SLX4, ERCC2, and ATM. Each of these genes was found to be likely differentially expressed between patients with or without locoregional recurrence, and with or without invasive locoregional recurrence.
In certain embodiments, the total number of genes the expression of which is measured is not more than 100.
In embodiments, measuring the genes comprises using a targeted assay that specifically measures the gene expression of each of the genes.
In embodiments, the patient is a patient who has not undergone any therapy for bladder cancer, optionally wherein the patient has not undergone radiotherapy and/or chemotherapy.
In embodiments, the patient is a patient who has had surgical resection of the bladder tumour, optionally combined with perioperative therapy. In embodiments, the patient has had a maximal transurethral resection of the bladder tumour (TURBT).
In embodiments, the perioperative therapy is neoadjuvant therapy.
In embodiments, making a prediction of the treatment response and/or prognosis of the patient comprises predicting the response of the patient to at least one course of radiotherapy treatment, preferably radical radiotherapy. In embodiments, the course of radiotherapy treatment comprised 32 doses (such as e.g. daily doses) of at least 64 Gy.
In embodiments, the sample is a sample taken from the tumour after all or part of the tumour has been removed, i.e. a resected tumour sample.
In embodiments, the sample is a fixed tumour tissue sample (such as e.g. a formalin-fixed paraffin-embedded (FFPE) tissue sample), or a frozen tumour tissue sample (such as e.g. a fresh frozen (FF) tissue sample).
In embodiments, the sample is a sample taken from the tumour at diagnosis (i.e. a diagnostic biopsy).
In accordance with any aspect, measuring the gene expression of a gene in Table 10 may comprise measuring the expression of the corresponding transcript with the RefSeq identifier provided in Table 2.
In accordance with any aspect, measuring the gene expression of a gene in Table 10 may comprise measuring using a nucleic acid microarray, a nucleic acid synthesis-based method (such as quantitative PCR (qPCR), RNA sequencing or digital PCR), or a NanoString nCounter assay. Preferably, measuring the gene expression of a gene in Table 10 comprises using a NanoString nCounter assay directed to one or more transcripts of the gene. The present inventors have found that the NanoString nCounter enables the reliable detection of panels of genes of the range of sizes (number of genes) used in the present disclosure, even when using relatively low amounts of sample (e.g. low amounts of extracted nucleic acids, low amounts of extracted RNA or mRNA) and/or nucleic acids extracted from FFPE tissue samples.
In embodiments, making a prediction of the treatment response and/or prognosis of the patient comprises predicting the response/prognosis of the patient following at least one treatment with one or more chemotherapeutic agents selected from the group consisting of: cisplatin, carboplatin, 5-fluorouracil, mitomycin C, gemcitabine, methotrexate, vinblastine, doxorubicin, paclitaxel, capecitabine, and etoposide.
In embodiments, the at least one treatment comprises neoadjuvant therapy with one or more chemotherapeutic agents selected from the group consisting of: cisplatin, gemcitabine, carboplatin, and etoposide.
In embodiments, the at least one treatment comprises chemotherapy with one or more chemotherapeutic agents selected from the group consisting of: 5-fluorouracil, mitomycin C, gemcitabine, and capecitabine. The chemotherapy may be concurrent with a course of radiotherapy treatment.
In embodiments, step b) making a prediction of the treatment response of the patient based on the sample gene expression profile comprises:
In embodiments, said first reference centroid comprises the low-risk centroid made up of the value, for each of the selected genes, for the subtype 4 or subtype 5 centroid in Table 11, Table 12, Table 13, Table 14, or Table 15 and said second reference centroid comprises the high-risk centroid made up of the value, for each of the selected genes, for the subtype 1, subtype 2 or subtype 3 centroid in Table 11, Table 12, Table 13, Table 14, or Table 15.
Optionally, said first reference centroid comprises the low-risk centroid made up of the value, for each of the selected genes, for the subtype 5 centroid in Table 11, Table 12, Table 13, Table 14, or Table 15 and said second reference centroid comprises the high-risk centroid made up of the value, for each of the selected genes, for the subtype 1 centroid in Table 11, Table 12, Table 13, Table 14, or Table 15.
In embodiments, step b) making a prediction of the treatment response of the patient based on the sample gene expression profile comprises:
Optionally, said first reference centroid comprises the low-risk centroid made up of the value, for each of the selected genes, for the subtype 5 centroid in Table 10, said second reference centroid comprises the moderate-risk centroid made up of the value, for each of the selected genes, for the subtype 3 centroid in Table 11, Table 12, Table 13, Table 14, or Table 15, and said third reference centroid comprises the high-risk centroid made up of the value, for each of the selected genes, for the subtype 1 centroid in Table 11, Table 12, Table 13, Table 14, or Table 15.
In embodiments, step b) making a prediction of the treatment response of the patient based on the sample gene expression profile comprises:
In certain cases, the reference centroids may have been pre-determined and may be obtained by, e.g., retrieval from a volatile or non-volatile computer memory or data store (including retrieval from a network or other remote store). The derivation of exemplary centroids is described in detail herein.
In embodiments, a sample gene expression profile being classified as belonging to a group defined by a poor prognosis (radioresistant) centroid indicates that the patient is at high risk of poor treatment response, at high risk of suffering recurrence of the tumour and/or at high risk of having a shorter than median survival time. In embodiments, a sample gene expression profile being classified as belonging to a group defined by a low risk (radiosensitive) centroid indicates that the patient is at low risk of poor treatment response, at low risk of suffering recurrence of the tumour and/or at low risk of having a shorter than median survival time.
In embodiments, the sample gene expression profile is compared with each reference centroid for closeness of fit using K-means clustering, model-based clustering, non-negative matrix factorisation, variants of factor analysis or principal component analysis.
In embodiments, comparing the sample gene expression profile, optionally after said normalising, with two or more reference centroids comprises computing the correlation coefficient, preferably the Pearson correlation coefficient, between the sample gene expression profile and the centroid. Preferably, classifying the sample gene expression profile as belonging to the risk group having the reference centroid to which it is most closely matched comprises classifying the sample gene expression profile as belonging to the risk group having the reference centroid with the highest correlation coefficient with the sample gene expression profile.
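By way of a minimal sketch only, the correlation-based comparison described above can be expressed in Python as follows. The function names `pearson` and `classify`, the four-gene subset and all numeric values are assumptions introduced for illustration; the values are placeholders and are not taken from the centroids of Tables 11 to 15.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("sequences must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def classify(profile, centroids):
    """Assign the profile to the centroid with the highest correlation.

    `profile` maps gene name to (optionally normalised) expression value;
    `centroids` maps a risk-group label to a gene-to-value mapping.
    """
    genes = sorted(profile)
    sample = [profile[g] for g in genes]
    scores = {
        label: pearson(sample, [centroid[g] for g in genes])
        for label, centroid in centroids.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Placeholder centroids (illustrative values only, NOT from Tables 11-15).
centroids = {
    "low_risk": {"KRT5": 2.0, "KRT20": -1.0, "UPK2": -1.5, "DES": 0.5},
    "high_risk": {"KRT5": -1.0, "KRT20": 1.5, "UPK2": 2.0, "DES": -0.5},
}
sample_profile = {"KRT5": 1.8, "KRT20": -0.7, "UPK2": -1.0, "DES": 0.2}

label, scores = classify(sample_profile, centroids)
```

With these placeholder values the sample profile correlates positively with the low-risk centroid and negatively with the high-risk centroid, so it is assigned to the low-risk group.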
In embodiments, step b) making a prediction of the treatment response of the patient based on the sample gene expression profile comprises:
In embodiments, the risk score is referenced to the median risk score of a sample cohort of bladder cancer patients, which median risk score serves as a threshold, and wherein:
In certain cases, the risk score is related to a reference or threshold level, for example wherein the median risk of a cohort of patients is set to an arbitrary threshold (e.g. zero) or is median centred and wherein:
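Purely as an illustrative sketch, referencing a risk score to the median of a cohort can be expressed in Python as follows. The function names, the cohort scores and, in particular, the direction of the comparison (scores above the threshold being treated as higher risk) are assumptions made for this example and are not taken from the present disclosure.

```python
from statistics import median


def median_centre(cohort_scores):
    """Return the cohort median and the scores shifted so the median is zero."""
    m = median(cohort_scores)
    return m, [s - m for s in cohort_scores]


def risk_group(centred_score, threshold=0.0):
    """Compare a median-centred score with the threshold.

    ASSUMPTION: a score above the threshold is labelled "high" risk and a
    score at or below it is labelled "low" risk.
    """
    return "high" if centred_score > threshold else "low"


# Hypothetical cohort risk scores, for illustration only.
cohort = [0.2, 0.5, 0.9, 1.4, 2.0]
cohort_median, centred = median_centre(cohort)
groups = [risk_group(s) for s in centred]
```

Median centring sets the threshold to zero, so the same comparison can be applied to a new patient's score after subtracting the cohort median.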
In embodiments, a patient determined to be at high or moderate risk of poor treatment response or poor prognosis, is selected for additional or alternative treatment, including aggressive treatment.
In embodiments, a patient determined to be at low risk of poor treatment response or low risk of poor prognosis, is selected for less aggressive ongoing treatment or for non-treatment, and/or wherein a patient determined to be at low risk of poor treatment response or low risk of poor prognosis, is selected for radiotherapy or chemoradiation therapy.
In embodiments, a patient determined to be at low risk of poor treatment response or low risk of poor prognosis, is selected for treatment with a bladder preservation strategy. For example, such a patient may be selected for surgical resection of the tumour accompanied by perioperative (chemo)radiation therapy.
In accordance with any aspect of the present invention, the method may further comprise selecting the patient for an appropriate treatment in view of the risk classification made by the method of the present invention. In particular, when the patient is found to be at high or moderate risk of poor treatment response by the method of the present invention, the patient may be selected for additional or alternative treatment, including aggressive treatment. Suitably, the aggressive treatment may include cystectomy. In certain cases, an aggressive treatment selection for a patient determined to be at high risk of poor treatment response may comprise the same chemotherapeutic agent or combination of agents that were administered to the patient perioperatively or in combination with radiotherapy, but administered more frequently and/or at a higher dose. In some cases, an aggressive treatment selection for a patient determined to be at high or moderate risk of poor treatment response may comprise a different chemotherapeutic agent or combination of agents than were administered to the patient perioperatively or in combination with radiotherapy. In some cases, an aggressive treatment selection for a patient determined to be at high or moderate risk of poor treatment response may comprise immunotherapy.
According to a second aspect, there is provided a computer-implemented method for predicting the treatment response or prognosis of a human bladder cancer patient, the method comprising:
The method of the present aspect may include any of the features of the method of the first aspect.
According to a third aspect, there is provided a computer-implemented method for predicting the treatment response or prognosis of a human bladder cancer patient, the method comprising:
The method of the present aspect may include any of the features of the method of the first aspect.
According to any aspect, obtaining expression data may comprise receiving expression data that has previously been acquired.
According to a fourth aspect, a method of treatment of bladder cancer in a human patient is provided, the method comprising:
According to a further aspect, there is provided a method of treatment of bladder cancer in a human patient, the method comprising:
According to a sixth aspect, there is provided a method of classifying a bladder cancer as belonging to one of a plurality of subtypes, wherein the plurality of subtypes comprises at least a neuronal subtype, the method comprising:
In embodiments, making a prediction of the subtype of the bladder cancer based on the sample gene expression profile comprises:
In some such embodiments, the bladder cancer is predicted to be a neuronal subtype if it is classified as belonging to the subtype having the neuronal subtype centroid. Optionally, the bladder cancer may be predicted to not be a neuronal subtype if it is classified as belonging to a subtype having one of the four additional subtype centroids.
In embodiments, the method further comprises selecting a patient from which the bladder cancer tumour sample has been obtained for treatment with a ‘neuroendocrine-type’ chemotherapy treatment if the bladder cancer is predicted to belong to a neuronal subtype. Without wishing to be bound by theory, a bladder cancer predicted to belong to a neuronal subtype is believed to show signs of neuroendocrine differentiation. For example, the prediction that a bladder cancer belongs to a neuronal subtype may be indicative of the presence of small cell carcinoma or large cell carcinoma. As such, the patient may be selected for treatment with a chemotherapy that is typically recommended and/or used for small or large cell carcinoma. For example, the patient may be selected for treatment with a combination of cisplatin and etoposide, a treatment with etoposide, a treatment with a combination of carboplatin and etoposide, or a treatment with a combination of ifosfamide and doxorubicin.
The method of the present aspect may include any of the features of the method of the first aspect.
According to a seventh aspect, there is provided a method of classifying a bladder cancer as belonging to one of a plurality of subtypes, wherein the plurality of subtypes comprises at least a luminal subtype and a neuronal subtype, the method comprising:
The method of the present aspect may include any of the features of the method of the first aspect.
In embodiments, making a prediction of the subtype of the bladder cancer based on the sample gene expression profile comprises:
In embodiments, the reference centroids further comprise three additional subtype centroids made up of the values, for each of the selected genes, for the subtype 1 centroid, the subtype 4 centroid and the subtype 5 centroid, respectively, in Table 11, Table 12, Table 13, Table 14 or Table 15. In some such embodiments, providing a prediction of the bladder cancer subtype comprises predicting that the bladder cancer is not a luminal, neuronal or luminal papillary bladder cancer subtype if the sample gene expression profile is classified as belonging to the subtype having the reference centroid made of the values for the subtype 1 centroid in Table 11, Table 12, Table 13, Table 14 or Table 15.
In some embodiments, providing a prediction of the bladder cancer subtype comprises predicting that the bladder cancer is not a neuronal subtype if the sample gene expression profile is classified as belonging to the subtype having the reference centroid made of the values for the subtype 2 centroid in Table 11, Table 12, Table 13, Table 14 or Table 15.
In some embodiments, providing a prediction of the bladder cancer subtype comprises predicting that the bladder cancer is not a luminal subtype if the sample gene expression profile is classified as belonging to the subtype having the reference centroid made of the values for the subtype 3 centroid in Table 11, Table 12, Table 13, Table 14 or Table 15.
In some embodiments, providing a prediction of the bladder cancer subtype comprises predicting that the bladder cancer is a basal squamous or luminal papillary subtype if the sample gene expression profile is classified as belonging to the subtype having the reference centroid made of the values for the subtype 4 centroid in Table 11, Table 12, Table 13, Table 14 or Table 15.
In some embodiments, providing a prediction of the bladder cancer subtype comprises predicting that the bladder cancer is a basal squamous or luminal papillary subtype if the sample gene expression profile is classified as belonging to the subtype having the reference centroid made of the values for the subtype 5 centroid in Table 11, Table 12, Table 13, Table 14 or Table 15.
In some embodiments, providing a prediction of the bladder cancer subtype comprises predicting that the bladder cancer is not a luminal or neuronal subtype if the sample gene expression profile is classified as belonging to the subtype having the reference centroid made of the values for the subtype 5 centroid in Table 11, Table 12, Table 13, Table 14 or Table 15.
Any of the embodiments of the two aspects above may be combined with features of any embodiment of the preceding aspects.
In accordance with any aspect of the present invention, the patient may be a human, particularly a human who has been diagnosed as having, or as being at risk of having, a bladder cancer, such as muscle invasive bladder cancer. In some cases, the patient has had chemotherapy for bladder cancer and/or has had surgical resection of a bladder tumour (in particular, transurethral resection of bladder tumour (TURB)). In some cases the patient may be a plurality of patients. In particular, the methods of the present invention may be for stratifying a group of patients (e.g. for a clinical trial) into subgroups that are more or less likely to benefit from radiotherapy (alone or in combination with chemotherapy), based on their gene expression profiles.
Embodiments of the present invention will now be described by way of example and not limitation with reference to the accompanying figures. However various further aspects and embodiments of the present invention will be apparent to those skilled in the art in view of the present disclosure.
The present invention includes the combination of the aspects and preferred features described except where such a combination is clearly impermissible or is stated to be expressly avoided. These and further aspects and embodiments of the invention are described in further detail below and with reference to the accompanying examples and figures.
In describing the present invention, the following terms will be employed, and are intended to be defined as indicated below.
A “test sample” as used herein may be a cell or tissue sample (e.g. a biopsy), a biological fluid, or an extract (e.g. a protein or DNA extract obtained from the subject). In particular, the sample may be a tumour sample, including a bladder tumour. The sample may be one which has been freshly obtained from the subject or may be one which has been processed and/or stored prior to making a determination (e.g. frozen, fixed or subjected to one or more purification, enrichment or extraction steps). In embodiments, the sample is a fixed tumour tissue sample (such as e.g. a formalin-fixed paraffin-embedded (FFPE) tissue sample), or a frozen tumour tissue sample (such as e.g. a fresh frozen (FF) tissue sample). The preferred sample type according to the present invention is an FFPE tissue sample, as this type of sample is widely available. Indeed, FFPE tissue samples are commonly obtained in clinical settings, for example for histopathological diagnosis. Reference to “cancer cells” herein may refer to cancer cells present in a cell or tissue sample, such as e.g. cells in a tumour tissue from a biopsy.
“and/or” where used herein is to be taken as specific disclosure of each of the two specified features or components with or without the other. For example “A and/or B” is to be taken as specific disclosure of each of (i) A, (ii) B and (iii) A and B, just as if each is set out individually herein.
Reference to determining the expression level refers to determination of the expression level of an expression product of the gene. Expression level may be determined at the nucleic acid level or the protein level. Within the context of the present invention, expression levels of genes of interest are preferably determined at the nucleic acid level, and in particular at the mRNA level.
The gene expression levels determined may be considered to provide an expression profile. By “expression profile” is meant a set of data relating to the level of expression of one or more of the relevant genes in an individual, in a form which allows comparison with comparable expression profiles (e.g. from individuals for whom the prognosis is already known), in order to assist in the determination of prognosis and in the selection of suitable treatment for the individual patient.
The determination of gene expression levels may involve determining the presence or amount of mRNA in a sample of cancer cells. Methods for doing this are well known to the skilled person. Gene expression levels may be determined in a sample of cancer cells using any conventional method, for example using nucleic acid microarrays or using nucleic acid synthesis (such as quantitative PCR). For example, gene expression levels may be determined using a NanoString nCounter Analysis system (see, e.g., U.S. Pat. No. 7,473,767).
Alternatively or additionally, the determination of gene expression levels may involve determining the protein levels expressed from the genes in a sample containing cancer cells obtained from an individual. Protein expression levels may be determined by any available means, including using immunological assays. For example, expression levels may be determined by immunohistochemistry (IHC), Western blotting, ELISA, immunoelectrophoresis, immunoprecipitation and immunostaining. Using any of these methods it is possible to determine the relative expression levels of the proteins expressed from the genes listed in Table 10.
Gene expression levels may be compared with the expression levels of the same genes in cancers from a group of patients whose survival time and/or treatment response is known. The patients to which the comparison is made may be referred to as the ‘control group’. Accordingly, the determined gene expression levels may be compared to the expression levels in a control group of individuals having cancer. The comparison may be made to expression levels determined in cancer cells of the control group. The comparison may be made to expression levels determined in samples of cancer cells from the control group. The cancer in the control group may be the same type of cancer as in the individual. For example, if the expression is being determined for an individual with bladder cancer, the expression levels may be compared to the expression levels in the cancer cells of patients also having bladder cancer.
Other factors may also be matched between the control group and the individual and cancer being tested. For example the stage of cancer may be the same, the subject and control group may be age-matched and/or gender matched.
Additionally the control group may have been treated with the same form of surgery and/or same radiotherapy treatment and/or same chemotherapeutic treatment. For example, if the subject has been or is being treated with gemcitabine and cisplatin, all of the patients in the control group(s) may have been treated with gemcitabine and cisplatin.
Accordingly, an individual may be stratified or grouped according to their similarity of gene expression with the group with good or poor prognosis.
In some embodiments, the present invention provides methods for classifying, prognosticating, or monitoring bladder cancer in subjects. In particular, data obtained from analysis of gene expression may be evaluated using one or more pattern recognition algorithms. Such analysis methods may be used to form a predictive model, which can be used to classify test data. For example, one convenient and particularly effective method of classification employs multivariate statistical analysis modelling, first to form a model (a “predictive mathematical model”) using data (“modelling data”) from samples of known subgroup (e.g., from subjects known to have a particular bladder cancer prognosis subgroup), and second to classify an unknown sample (e.g., “test sample”) according to subgroup.
Pattern recognition methods have been used widely to characterize many different types of problems ranging, for example, over linguistics, fingerprinting, chemistry and psychology. In the context of the methods described herein, pattern recognition is the use of multivariate statistics, both parametric and non-parametric, to analyse data, and hence to classify samples and to predict the value of some dependent variable based on a range of observed measurements. There are two main approaches. One set of methods is termed “unsupervised” and these simply reduce data complexity in a rational way and also produce display plots which can be interpreted by the human eye. However, this type of approach may not be suitable for developing a clinical assay that can be used to classify samples derived from subjects independent of the initial sample population used to train the prediction algorithm.
The other approach is termed “supervised” whereby a training set of samples with known class or outcome is used to produce a mathematical model which is then evaluated with independent validation data sets. Here, a “training set” of gene expression data is used to construct a statistical model that predicts correctly the “subgroup” of each sample. The model is then tested with independent data (referred to as a test or validation set) to determine the robustness of the computer-based model. These models are sometimes termed “expert systems,” but may be based on a range of different mathematical procedures such as support vector machines, decision trees, k-nearest neighbour and naïve Bayes. Supervised methods can use a data set with reduced dimensionality (for example, the first few principal components), but typically use unreduced data, with all dimensionality. In all cases the methods allow the quantitative description of the multivariate boundaries that characterize and separate each subtype in terms of its intrinsic gene expression profile. It is also possible to obtain confidence limits on any predictions, for example, a level of probability to be placed on the goodness of fit. The robustness of the predictive models can also be checked using cross-validation, by leaving out selected samples from the analysis.
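By way of illustration only, the cross-validation step mentioned above can be sketched as a leave-one-out loop around a 1-nearest-neighbour classifier. The choice of classifier, the Euclidean distance, the function names and the expression values are all assumptions made for this sketch and are not taken from the examples of the present disclosure.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two expression profiles of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def predict_1nn(train_profiles, train_labels, query):
    """Return the label of the training profile closest to the query profile."""
    best = min(range(len(train_profiles)),
               key=lambda i: euclidean(train_profiles[i], query))
    return train_labels[best]

def leave_one_out_accuracy(profiles, labels):
    """Hold out each sample in turn, train on the rest, and score the prediction."""
    correct = 0
    for i in range(len(profiles)):
        train_x = profiles[:i] + profiles[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if predict_1nn(train_x, train_y, profiles[i]) == labels[i]:
            correct += 1
    return correct / len(profiles)

# Hypothetical, made-up expression values for two genes in six samples.
profiles = [[0.1, 2.0], [0.2, 1.8], [0.0, 2.2],
            [2.1, 0.1], [1.9, 0.3], [2.2, 0.0]]
labels = ["A", "A", "A", "B", "B", "B"]
accuracy = leave_one_out_accuracy(profiles, labels)
```

With real data the held-out accuracy would typically be reported alongside performance on a fully independent validation set, since leaving out single samples does not guard against biases shared by the whole training cohort.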
After stratifying the training samples according to subtype, a centroid-based prediction algorithm may be used to construct centroids based on the expression profile of the gene set described in Table 10.
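A minimal sketch of such a centroid-based prediction is given below. It assumes that each centroid is the per-gene mean of the training profiles of one subtype and that a test profile is assigned to the nearest centroid by Euclidean distance; the distance measure, the subtype labels `subtype_x` and `subtype_y`, the function names and all numerical values are hypothetical and are not the values of Tables 11 to 15.

```python
import math
from collections import defaultdict

def build_centroids(profiles, subtypes):
    """Average the expression of each gene over the training samples of each subtype."""
    grouped = defaultdict(list)
    for profile, subtype in zip(profiles, subtypes):
        grouped[subtype].append(profile)
    return {
        subtype: [sum(column) / len(column) for column in zip(*members)]
        for subtype, members in grouped.items()
    }

def classify(profile, centroids):
    """Assign the profile to the subtype whose centroid is nearest (Euclidean distance)."""
    def distance(centroid):
        return math.sqrt(sum((x - c) ** 2 for x, c in zip(profile, centroid)))
    return min(centroids, key=lambda subtype: distance(centroids[subtype]))

# Gene order of each profile; the values below are made up for illustration.
genes = ["KRT5", "KRT20", "DES"]
training_profiles = [
    [2.0, -1.0, 0.0],
    [1.0, -2.0, 0.0],
    [-1.0, 2.0, 0.5],
    [-2.0, 1.0, -0.5],
]
training_subtypes = ["subtype_x", "subtype_x", "subtype_y", "subtype_y"]

centroids = build_centroids(training_profiles, training_subtypes)
predicted = classify([1.2, -0.8, 0.1], centroids)
```

A correlation-based similarity could be substituted for the Euclidean distance without changing the structure of the sketch.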
“Translation” of the descriptor coordinate axes can be useful. Examples of such translation include normalization and mean-centring. “Normalization” may be used to remove sample-to-sample variation. Some commonly used methods for calculating a normalization factor include: (i) global normalization that uses all genes on the microarray or NanoString codeset; (ii) housekeeping gene normalization that uses constantly expressed housekeeping/invariant genes; and (iii) internal control normalization that uses known amounts of exogenous control genes added during hybridization (Quackenbush, 2002). In one embodiment, the genes listed in Table 10 can be normalized to one or more control housekeeping genes. Exemplary housekeeping genes include AMMECR1L (NCBI Gene ID: 83607; NCBI RefSeq IDs: NM_001199140.2, NM_031445.2), DHX16 (NCBI Gene ID: 8449; NCBI RefSeq IDs: NM_001164239.1, NM_001363515.1, NM_003587.5), FCF1 (NCBI Gene ID: 51077; NCBI RefSeq IDs: NM_001318508.2, NM_015962.5), PPIA (NCBI Gene ID: 5478; NCBI RefSeq IDs: NM_001300981.2, NM_021130.5), PRPF38A (NCBI Gene ID: 84950; NCBI RefSeq IDs: NM_032864.4), RPL13A (NCBI Gene ID: 23521; NCBI RefSeq IDs: NM_001270491.1, NM_012423.4), TMUB2 (NCBI Gene ID: 79089; NCBI RefSeq IDs: NM_001076674.3, NM_001330235.2, NM_001353173.2, NM_001353174.2, NM_001353175.2, NM_001353176.2, NM_001353177.2, NM_001353178.2, NM_001353180.2, NM_001353181.2, NM_001353182.2, NM_001353183.2, NM_001353184.2, NM_001353185.2, NM_001353186.2, NM_001353187.2, NM_001353188.2, NM_001353189.2, NM_001353190.2, NM_001353191.2, NM_024107.3, NM_177441.4), ZNF143 (NCBI Gene ID: 7702; NCBI RefSeq IDs: NM_001282656.1, NM_001282657.1, NM_003442.6), ZNF384 (NCBI Gene ID: 171017; NCBI RefSeq IDs: NM_001039920.2, NM_001135734.2, NM_133476.5) and DNAJC14 (NCBI Gene ID: 85406; NCBI RefSeq IDs: NM_032364.5), the numbers in brackets following each gene name being the NCBI Gene ID number for that gene and the NCBI RefSeq IDs for known mRNA transcripts from that gene, as of 25 Mar. 2020; the nucleotide sequence for each gene (resp. transcript) as disclosed at that NCBI Gene ID number (resp. NCBI RefSeq ID number) on 25 Mar. 2020 is expressly incorporated herein by reference. It will be understood by one of skill in the art that the methods disclosed herein are not bound by normalization to any particular housekeeping genes, and that any suitable housekeeping gene(s) known in the art can be used. Many normalization approaches are possible, and they can often be applied at any of several points in the analysis. In one embodiment, microarray data are normalized using the LOWESS method, which is a global locally weighted scatterplot smoothing normalization function. In another embodiment, qPCR and NanoString nCounter analysis data are normalized to the geometric mean of a set of multiple housekeeping genes. Moreover, qPCR data can be analysed using the fold-change method.
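As a purely illustrative sketch of normalization to the geometric mean of a set of housekeeping genes, the following divides each target gene count by the geometric mean of the housekeeping counts of the same sample. The log2 transform of the resulting ratio, the function name and the count values are assumptions made for this sketch; only the gene symbols appear in the present disclosure.

```python
import math
from statistics import geometric_mean

def normalise_to_housekeeping(counts, housekeeping_genes):
    """Return log2(count / geometric mean of housekeeping counts) for each target gene."""
    reference = geometric_mean([counts[g] for g in housekeeping_genes])
    return {
        gene: math.log2(value / reference)
        for gene, value in counts.items()
        if gene not in housekeeping_genes
    }

# Hypothetical raw counts for one sample; PPIA and RPL13A act as housekeeping genes.
raw_counts = {"KRT5": 800.0, "KRT20": 50.0, "PPIA": 400.0, "RPL13A": 100.0}
normalised = normalise_to_housekeeping(raw_counts, ["PPIA", "RPL13A"])
```

The geometric mean is less sensitive than the arithmetic mean to a single highly expressed housekeeping gene, which is one reason it is commonly preferred for count data of this kind.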
“Mean-centring” may also be used to simplify interpretation for data visualisation and computation. Usually, for each descriptor, the average value of that descriptor for all samples is subtracted. In this way, the mean of a descriptor coincides with the origin, and all descriptors are “centred” at zero. In “unit variance scaling,” data can be scaled to equal variance. Usually, the value of each descriptor is scaled by 1/StDev, where StDev is the standard deviation for that descriptor for all samples. “Pareto scaling” is, in some sense, intermediate between mean-centring and unit variance scaling. In pareto scaling, the value of each descriptor is scaled by 1/sqrt(StDev), where StDev is the standard deviation for that descriptor for all samples. In this way, each descriptor has a variance numerically equal to its initial standard deviation. The pareto scaling may be performed, for example, on raw data or mean-centred data.
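The three operations described above can be sketched as follows for a single descriptor. The sample standard deviation is used here as StDev, which is an assumption, and the function names and data values are hypothetical.

```python
import math
from statistics import mean, stdev, variance

def mean_centre(values):
    """Subtract the descriptor mean so that the result is centred at zero."""
    m = mean(values)
    return [v - m for v in values]

def unit_variance_scale(values):
    """Scale by 1/StDev so that the result has unit variance."""
    s = stdev(values)
    return [v / s for v in values]

def pareto_scale(values):
    """Scale by 1/sqrt(StDev); the resulting variance equals the original StDev."""
    s = stdev(values)
    return [v / math.sqrt(s) for v in values]

# Made-up values of one descriptor (e.g. one gene) across eight samples.
descriptor = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
centred = mean_centre(descriptor)
uv_scaled = unit_variance_scale(descriptor)
pareto_scaled = pareto_scale(descriptor)
```

Comparing `variance(pareto_scaled)` with `stdev(descriptor)` confirms numerically the statement that a pareto-scaled descriptor has a variance equal to its initial standard deviation.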
“Logarithmic scaling” may be used to assist interpretation when data have a positive skew and/or when data span a large range, e.g., several orders of magnitude. Usually, for each descriptor, the value is replaced by the logarithm of that value. In “equal range scaling,” each descriptor is divided by the range of that descriptor for all samples. In this way, all descriptors have the same range, that is, 1. However, this method is sensitive to the presence of outlier points. In “autoscaling,” each data vector is mean-centred and unit variance scaled. This technique is very useful because each descriptor is then weighted equally, and large and small values are treated with equal emphasis. This can be important for genes expressed at very low, but still detectable, levels.
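For illustration, the logarithmic, equal range and autoscaling transformations may be sketched as below. The base-2 logarithm, the use of the sample standard deviation, the function names and the data values are assumptions of this sketch.

```python
import math
from statistics import mean, stdev

def log_scale(values):
    """Replace each (strictly positive) value by its base-2 logarithm."""
    return [math.log2(v) for v in values]

def equal_range_scale(values):
    """Divide by the range so that the scaled descriptor spans exactly 1."""
    span = max(values) - min(values)
    return [v / span for v in values]

def autoscale(values):
    """Mean-centre and then scale to unit variance."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

# Made-up counts spanning several orders of magnitude.
counts = [1.0, 8.0, 64.0, 4096.0]
logged = log_scale(counts)            # [0.0, 3.0, 6.0, 12.0]
ranged = equal_range_scale(counts)
auto = autoscale(counts)
```

In this example the single large value 4096 sets the range on its own, so after equal range scaling the three smaller values are compressed close to zero, which illustrates the sensitivity to outliers noted above.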
When comparing data from multiple analyses (e.g., comparing expression profiles for one or more test samples to the centroids constructed from samples collected and analysed in an independent study), it will be necessary to normalize data across these data sets. In one embodiment, Distance Weighted Discrimination (DWD) is used to combine these data sets together (Benito et al. (2004), incorporated by reference herein in its entirety). DWD is a multivariate analysis tool that is able to identify systematic biases present in separate data sets and then make a global adjustment to compensate for these biases; in essence, each separate data set is a multi-dimensional cloud of data points, and DWD takes two point clouds and shifts one such that it more optimally overlaps the other. Further methods for combining data sets include the “ComBat” method and others described in Lagani et al. 2016, the entire contents of which are expressly incorporated herein by reference. ComBat is a method specifically devised for removing batch effects in gene-expression data (Johnson W E, Li C, Rabinovic A. 2007, the entire contents of which are expressly incorporated herein by reference).
In some embodiments described herein, the prognostic performance of the gene expression signature and/or other clinical parameters is assessed utilizing a Cox Proportional Hazards Model Analysis, which is a regression method for survival data that provides an estimate of the hazard ratio and its confidence interval. The Cox model is a well-recognized statistical technique for exploring the relationship between the survival of a patient and particular variables. This statistical method permits estimation of the hazard (i.e., risk) of individuals given their prognostic variables (e.g., gene expression profile with or without additional clinical factors, as described herein). The “hazard ratio” is the ratio of the hazard (e.g. the risk of death at any given time point) for patients displaying particular prognostic variables to the hazard for a reference group of patients.
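In conventional textbook notation, which is used here for illustration only and does not appear elsewhere in the present disclosure, the Cox model expresses the hazard at time t for a patient with prognostic variables x_1, ..., x_p as a baseline hazard h_0(t) multiplied by an exponential term, and the hazard ratio associated with a binary variable x_j (all other variables being held fixed) is the exponential of its regression coefficient:

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\beta_1 x_1 + \dots + \beta_p x_p\right),
\qquad
\mathrm{HR}_j = \frac{h(t \mid x_j = 1)}{h(t \mid x_j = 0)} = \exp(\beta_j)
```

Under this formulation a hazard ratio above 1 indicates a higher risk of the event in patients displaying the variable, and a hazard ratio below 1 indicates a lower risk.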
In accordance with any aspect of the present invention, the genes that make up the gene expression profile may be selected from any 9 or more (such as all of the) genes selected from the genes listed in Table 10 below; the nucleotide sequence for each gene as disclosed at the NCBI Gene ID number indicated in Table 10, on 25 Mar. 2020 is expressly incorporated herein by reference. Particular subsets of the said genes are contemplated herein. For example, the genes shown in Table 10, column C71, column C68, column C54, column C32, column C20 or column C9 may provide a compact signature of genes whose expression is significantly associated with response to radiotherapy. A particularly preferred gene expression profile includes at least the 9 genes: TUBB2B, KRT14, KRT5, KRT20, UPK2, DES, SNX31, SFRP4, and PI3. A particularly preferred gene expression profile includes at least: CLDN3, CLDN4, TWIST1, and CLDN7. A particularly preferred gene expression profile includes at least: KRT14, KRT5, PI3, KRT6A, and DSC3. A particularly preferred gene expression profile includes at least: SFRP4 and DES. A particularly preferred gene expression profile includes at least: TUBB2B, SNX31, KRT20, and UPK2. A particularly preferred gene expression profile includes at least: KRT20, SNX31 and TUBB2. In some cases the gene expression of each of these genes is that of the corresponding transcript as listed in Table 10, for example as measured using a NanoString nCounter assay.
An individual grouped with the good prognosis group may be identified as having a cancer that is sensitive to radiotherapy, e.g. radical radiotherapy for bladder cancer. Such an individual may also be referred to as an individual that responds well to radiotherapy treatment. An individual grouped with the poor prognosis group may be identified as having a cancer that is resistant to radiotherapy treatment, including radical radiotherapy for bladder cancer. Radiotherapy may be administered alone or in combination with chemotherapy, such as e.g. platinum-based chemotherapy, gemcitabine, etoposide, mitomycin C, epirubicin, capecitabine, 5-fluorouracil, doxorubicin, or combinations thereof. Where radiotherapy is administered in combination with chemotherapy, it may be referred to as “chemoradiation therapy”. An individual grouped with the good (resp. poor) prognosis group may be identified as having a cancer that is sensitive (resp. resistant) to radiotherapy alone or in combination with chemotherapy.
Where the individual is grouped with the good prognosis group, the individual may be selected for treatment with suitable radiotherapy and/or chemoradiation therapy as described in further detail below. Where the individual is grouped with the poor prognosis group, the individual may be deselected for treatment with the aforementioned radiotherapy/chemoradiation therapy and may, for example, receive surgical treatment alone or surgery plus a chemotherapy or a novel or experimental therapy, including immunotherapy.
Whether a prognosis is considered good or poor may vary between cancers and stage of disease. In general terms a good prognosis is one where the overall survival (OS), locoregional relapse free survival (LR RFS), invasive locoregional relapse free survival (inv LR RFS), bladder cancer specific survival (BCCS) and/or progression-free survival (PFS) is longer than average for that stage and cancer type. A prognosis may be considered poor if PFS, LR RFS, inv LR RFS, BCCS and/or OS is shorter than average for that stage and type of cancer. The average may be the mean OS, LR RFS, inv LR RFS, BCCS or PFS. For example, a prognosis may be considered good if the PFS is >2 years (or >3 years), LR RFS >2 years (or >3 years), inv LR RFS >2 years (or >3 years), BCCS >4 years (or >5 or >6 years) and/or OS >4 years (or >5 or >6 years). Similarly PFS of <2 years, LR RFS <2 years, inv LR RFS <2 years, BCCS <4 years and/or OS <4 years may be considered poor. In particular, PFS >2 years, LR RFS >2 years, inv LR RFS >2 years, BCCS >4 years and/or OS >4 years may be considered good for advanced cancers.
As described in detail herein, the present inventors found that classification based on the gene expression model of the present invention was able to group patients into groups that show a good response to chemoradiation (good prognosis/sensitive, including subtypes 4 and 5), and groups that show a poor response to chemoradiation (poor prognosis/resistant, including at least subtype 1). Further, at least some of the patient groups showing a good response to chemoradiation could be associated with radiosensitivity based at least in part on the pattern of local vs. global relapse response to chemoradiation therapy. Indeed, patient groups showing a lower incidence of invasive locoregional disease recurrence may be assumed to be radiosensitive, as radiotherapy is a local therapy (whereas chemotherapy is administered systemically in the cohort under investigation). Patient groups showing a good local and global response to chemoradiation may be chemosensitive, radiosensitive or both. Such patient groups are likely to benefit from chemoradiation therapy regardless of whether chemotherapy, radiation therapy or both therapies are driving the favourable outcome. Conversely, patient groups showing a poor response to chemoradiation may be assumed to be radioresistant. The median overall survival for poor prognosis (resistant) patients was 1.373 years (95% CI 1.096-1.649 years). The median overall survival for good prognosis (sensitive) patients was 6.41 years (95% CI 0.00-13.41 years) in subtype 4 and was not reached for patients in subtype 5. The median progression free survival for poor prognosis (resistant) patients was 0.37 years (95% CI 0.33-0.41 years). The median progression free survival for good prognosis (sensitive) patients was 3.50 years (95% CI 0.00-7.19 years) in subtype 4 and 3.82 years for patients in subtype 5. The median locoregional relapse free survival for poor prognosis (resistant) patients was 0.47 years (95% CI 0.28-0.66 years).
The median locoregional relapse free survival for good prognosis (sensitive) patients was 3.50 years (95% CI 0.00-7.11 years) in subtype 4 and was not reached for patients in subtype 5. The median bladder cancer specific survival for poor prognosis (resistant) patients was 3.54 years (95% CI 0). The median bladder cancer specific survival for good prognosis (sensitive) patients was 6.41 years (95% CI 0.00-13.41 years) in subtype 4 (i.e. the same as the overall survival value as all relapses in this group were bladder cancer specific) and was not reached for patients in subtype 5.
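To illustrate how a median survival time may be “not reached”, the following sketch computes a median from censored follow-up data using the Kaplan-Meier product-limit estimate. It is assumed here, and not stated above, that medians of this kind are derived from such an estimate; the function name and the follow-up values are hypothetical and are not the cohort data.

```python
def kaplan_meier_median(times, events):
    """Return the first time at which the Kaplan-Meier survival estimate is <= 0.5.

    times:  follow-up in years for each patient.
    events: True if the event (e.g. death) was observed, False if censored.
    Returns None when the estimate never falls to 0.5 (median "not reached").
    """
    at_risk = len(times)
    survival = 1.0
    for t in sorted(set(times)):
        observed = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)
        if observed:
            survival *= 1.0 - observed / at_risk
            if survival <= 0.5:
                return t
        at_risk -= leaving  # both events and censored patients leave the risk set
    return None

# Made-up cohort in which half of the survival estimate is lost by 2 years.
median_a = kaplan_meier_median([0.5, 1.0, 1.5, 2.0, 3.0, 4.0],
                               [True, True, False, True, False, False])
# Made-up cohort with a single event: the estimate stays at 0.75.
median_b = kaplan_meier_median([1.0, 2.0, 3.0, 4.0],
                               [True, False, False, False])
```

In the second cohort the function returns `None`, which corresponds to reporting the median as not reached.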
In general terms, a “good prognosis” is one where survival (OS, LR RFS, inv LR RFS and/or PFS) and/or disease stage of an individual patient can be favourably compared to what is expected in a population of patients within a comparable disease setting. This might be defined as better than median survival (i.e. survival that exceeds that of 50% of patients in the population). Alternatively, this may be defined as a better than expected disease stage at a given time point, such as e.g. following therapy, where an expected disease stage may be the disease stage that is most common in the population of patients within a comparable disease setting. Similarly, a “poor prognosis” is one where survival (OS, LR RFS, inv LR RFS and/or PFS) of an individual patient is lower (or disease stage worse) than what is expected in a population of patients within a comparable disease setting. A good prognosis is preferably one where at least inv LR RFS of an individual patient can be favourably compared to what is expected in a population of patients within a comparable disease setting.
Cancer stages may be determined according to the TNM staging system. In particular, the notation “pT” (or “T”) refers to the size of the primary tumour, with T0 indicating that the tumour cannot be found, and T1 to T4 referring to increasing size and/or extent of the primary tumour. In the context of bladder cancer, T1 may refer to the tumour having spread to the connective tissue that separates the lining of the bladder from the muscles beneath, but not involving the bladder wall muscle. T2 may refer to the tumour having spread to the muscle of the bladder wall (with T2a referring to the superficial muscle/inner half of the muscle and T2b referring to the deep muscle/outer half of the muscle). T3 may refer to the tumour having spread to the perivesical tissue (with T3a referring to the cancer growth being visible in the perivesical tissue by microscope inspection, and T3b referring to a macroscopically visible growth into the perivesical tissue). T4 may refer to the tumour having spread to any of the abdominal wall, the pelvic wall, the prostate or seminal vesicle (if the patient is male), uterus or vagina (if the patient is female) (with T4a referring to spread to the prostate, seminal vesicle, uterus or vagina and T4b referring to the pelvic wall or abdominal wall). Additional stages Ta and Tis may be defined in the context of bladder cancer, Ta referring to the presence of noninvasive papillary carcinoma, Tis indicating the presence of carcinoma in situ. The notation “N” refers to the presence of cancer in regional lymph nodes, with N0 indicating that there is no cancer in nearby lymph nodes and N1 to N3 indicating increasing numbers/increasingly distant lymph nodes containing cancer.
In the context of bladder cancer, N1 may refer to the cancer having spread to a single regional lymph node in the pelvis, N2 may refer to the cancer having spread to 2 or more regional lymph nodes in the pelvis, and N3 may refer to the cancer having spread to the common iliac lymph nodes. The notation “M” refers to the presence of metastasis, with M0 indicating that the cancer has not spread to other locations in the body, and M1 indicating that the cancer has spread to other regions in the body. In the context of bladder cancer, M1a may refer to the cancer having spread only to lymph nodes outside of the pelvis, and M1b may refer to the cancer having spread to other parts of the body.
“Predicting the likelihood of survival of a bladder cancer patient” is intended to assess the risk that a patient will die as a result of the underlying bladder cancer.
“Predicting the response of a bladder cancer patient to a selected treatment” is intended to mean assessing the likelihood that a patient will experience a positive or negative outcome with a particular treatment.
As used herein, “indicative of a positive treatment outcome” refers to an increased likelihood that the patient will experience beneficial results from the selected treatment (e.g. reduction in tumour size, ‘good’ prognostic outcome, improvement in disease-related symptoms and/or quality of life). In particular, beneficial results from the selected treatment may include lack of locoregional recurrence after a given period of time following treatment, increased disease free survival time, increased overall survival, increased locoregional recurrence disease free survival, lack of invasive locoregional recurrence after a given period of time following treatment, increased invasive locoregional recurrence disease free survival, and/or complete pathological response following therapy. Overall survival may be defined as the time from the start of radiotherapy to the date of death. This may be measured with data censored at date last known alive in patients not deceased. Within the context of bladder cancer, locoregional recurrence may be defined as bladder and/or pelvic nodal relapse, including metastatic disease and non-muscle invasive bladder cancer. Invasive locoregional recurrence may be defined as bladder and/or pelvic nodal relapse including metastatic disease but excluding non-muscle invasive bladder cancer. Locoregional relapse-free survival may be defined as the time free of disease recurrence in the regional nodes and/or superficial or invasive disease in the bladder, measured from the start of radiotherapy. This may be measured with data censored at any preceding distant metastases (where a metastasis may be considered “preceding” if it is observed more than a given period of time, e.g. 30 days, before locoregional failure), death from non-bladder cause, and/or date last known alive.
Invasive locoregional relapse-free survival may be defined in a similar way as locoregional relapse-free survival but with data additionally censored at non-muscle invasive bladder cancer recurrence. A complete pathological response following therapy may be defined as the absence of a detectable tumour at the site of the primary tumour (i.e. pT0), for example based on a post-therapy biopsy. A post-therapy biopsy may be collected a few weeks/months after completion of the course of therapy, such as e.g. 2-5 months, preferably 3-4 months following completion of the course of therapy. Beneficial results from a selected treatment preferably include one or both of lack of invasive locoregional recurrence after a given period of time following treatment, and increased invasive locoregional recurrence disease free survival.
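The definition of overall survival given above, measured from the start of radiotherapy and censored at the date last known alive, can be sketched as follows. The function name, the conversion of days to years using 365.25 days per year and the example dates are assumptions made for illustration.

```python
from datetime import date

DAYS_PER_YEAR = 365.25  # assumed conversion factor

def overall_survival(radiotherapy_start, death_date, last_known_alive):
    """Return (time_in_years, event_observed) for one patient.

    If a date of death is recorded, the event is observed at that date.
    Otherwise the observation is censored at the date last known alive.
    """
    if death_date is not None:
        return ((death_date - radiotherapy_start).days / DAYS_PER_YEAR, True)
    return ((last_known_alive - radiotherapy_start).days / DAYS_PER_YEAR, False)

# Hypothetical patient who died one calendar year after starting radiotherapy.
deceased = overall_survival(date(2015, 1, 1), date(2016, 1, 1), date(2016, 1, 1))
# Hypothetical patient still alive at last follow-up (censored observation).
censored = overall_survival(date(2015, 1, 1), None, date(2018, 1, 1))
```

The locoregional endpoints could be handled in the same way by adding the further censoring events described above, such as a preceding distant metastasis or death from a non-bladder cause.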
“Indicative of a negative treatment outcome” is intended to mean an increased likelihood that the patient will not receive the aforementioned benefits of a positive treatment outcome.
As used herein, “bladder cancer” refers to any cancer of the bladder, including non-muscle invasive bladder cancer (NMIBC) and muscle invasive bladder cancer (MIBC). Preferably, the bladder cancer is muscle invasive bladder cancer. The present invention is particularly beneficial in the context of muscle invasive bladder cancer. Indeed, MIBC patients typically have poor prognosis, whereas NMIBC patients usually respond well to a combination of local resection and treatment with intravesical agents.
Radiotherapy is the use of ionising radiation to induce DNA damage and subsequent cell death. Radical radiotherapy refers to the use of high doses of radiation, typically delivered daily (or mostly daily, e.g. excluding weekend days).
Radiotherapy (alone or in combination with chemotherapy, as described below; the combination is commonly referred to as “chemoradiation”) is often used in bladder preservation strategies (i.e. as alternatives to cystectomy). Bladder preservation with radical combined modality treatment (CMT, such as e.g. a combination of chemotherapy and radical radiotherapy) is increasingly recognised as an alternative to radical surgery. 5-year overall survival rates of 50-57% have been reported with a CMT (see Ploussard, G., et al. (2014); Mak, R. H., et al. (2014)) including a maximal transurethral resection (TURBT) and chemoradiation. However, locoregional relapse is frequent, with rates of 67% reported at 2 years following CMT (James, N. D. et al. (2012)). In those with locoregional invasive relapse, salvage cystectomy may be performed and 5-year survival rates of 10-30% (Chang, S. S., et al. (2017); Lee, C. T., et al. (2006)) are documented. However, those patients eventually undergoing cystectomy have been subjected to both radiation and two rounds of surgery (TURBT and cystectomy), with the delay to effective treatment potentially compromising overall outcome. The present invention can advantageously be used to identify those patients that are likely to benefit from radiotherapy (with or without chemotherapy), and those that are not. The latter can for example be directed to other treatment modalities, including cystectomy, while a bladder preservation strategy in combination with (chemo)radiation can be attempted for the former.
Chemotherapy with cisplatin, gemcitabine, etoposide, mitomycin C (MMC), capecitabine, epirubicin, 5-fluorouracil (5FU) and/or doxorubicin is commonly used in the treatment of bladder cancer. Chemotherapy may be administered as a single dose, or on a more continuous basis, for example if the patient relapses after prior surgery and single dose chemotherapy. Further, chemotherapy may be administered locally or systemically. In the context of muscle invasive bladder cancer, chemotherapy is typically administered systemically, for example before surgery or radiotherapy, or in a palliative setting.
Platinum-based combination therapies (such as e.g. combinations of cisplatin, carboplatin or oxaliplatin with gemcitabine, etoposide, epirubicin, and/or 5-fluorouracil) may be used in the management of bladder cancer, particularly in the context of neoadjuvant therapy (neoadjuvant chemotherapy, NAC). Neoadjuvant platinum-based combination chemotherapy has been shown to confer a 5% survival advantage at 5 years (Vale, C. (2003); Advanced Bladder Cancer Meta-analysis, C. (2005); Grossman, H. B., et al. (2003)), and international guidelines recommend that it be considered for all patients with T2-4N0M0 disease (Witjes, J., et al. (2017); Chang, S. S., et al. (2017); Excellence, N.I.f.H.a.C. (2015)). While response rates to neoadjuvant chemotherapy (NAC) of up to 60% are reported with pathological complete response in 30-40% (Advanced Bladder Cancer Meta-analysis, C. (2005); Grossman, H. B., et al. (2003)), a subset of patients do not respond and may progress during treatment.
The gene expression signature of the present invention was derived in patients treated with radiotherapy in combination with concurrent chemotherapy (in particular, 5FU+MMC, capecitabine+MMC or gemcitabine). Some patients also received platinum-based combination neoadjuvant therapy (in particular, gemcitabine+cisplatin, gemcitabine+carboplatin, or carboplatin+etoposide). However, without wishing to be bound by any particular theory, the present inventors believe that the said gene expression signature will display comparable outcome predictive power (i.e. treatment response prediction) in patients treated with other chemotherapies (or with no concurrent chemotherapy). Indeed, the gene expression signature of the present invention is believed to be primarily associated with response to radiotherapy.
The following is presented by way of example and is not to be construed as a limitation to the scope of the claims.
Diagnostic FFPE samples were obtained for 53 patients who had completed radical daily radiotherapy+/−chemotherapy for MIBC. All samples in this study were obtained from diagnostic biopsies, i.e. prior to any treatment. Two patients were excluded because pathological review (H&E stained slides) revealed, respectively, carcinosarcoma with no transitional cell carcinoma (TCC) in one case and NMIBC in the other. Therefore, RNA was extracted from macrodissected samples for 51 patients (see below). Approval was obtained from institutional review boards according to local and national requirements.
The characteristics of the patients are shown in Table 1. In Table 1 below, cancer stages were determined according to the TNM staging system (UICC TNM Classification of Malignant Tumours, 7th Edition, 2009), as described above.
A significant proportion of patients had high risk disease, including 3 with para-aortic nodal involvement, which would be treated palliatively in many centres rather than with radical chemoradiation. 75% of patients received neoadjuvant chemotherapy and all but one patient had concurrent chemotherapy. Just over half this cohort were treated within a radiotherapy dose escalation trial, and hence received more than the standard 64Gy in 32 fractions. 26 patients had a post-radiotherapy biopsy result available (3-4 months after the end of radiotherapy). Seventy-seven percent (20/26) had a complete pathological response, i.e. pT0. Three patients (12%) had residual pT2 disease and the remaining 3 (12%) had CIS (carcinoma in situ). An additional 13 patients underwent cystoscopy alone, which showed no evidence of residual disease. The remaining 4 patients had imaging follow-up at 3-6 months with no evidence of local disease recurrence.
Clinical endpoints for the study were defined as follows:
According to these definitions, a total of n=17 (out of 43) patients had locoregional recurrence, and n=9 patients (out of 43) had invasive locoregional recurrence.
At a median follow-up period of 3.80 years, 22/43 (51.2%) patients had experienced disease relapse. The median progression-free survival was 3.80 years (95% CI 1.519-6.081). The patterns of disease recurrence at first relapse were:
A total of 12/43 (27.9%) developed distant metastatic disease at some point, and 17/43 (39.5%) had locoregional recurrence (LRR; of which 9/17 were muscle-invasive disease). The median LRR disease-free survival was 3.82 years (95% CI 2.44-5.21).
4 patients had salvage cystectomy (2 for pT3/4 disease and 2 patients with NMIBC only). A total of 17/43 (39.5%) patients have died and the median overall survival for the group is 6.41 years (95% CI 0.95-11.78). The median bladder cancer-specific survival has not yet been reached.
The 2-year LRR disease-free survival was 66.3%.
Sections were processed in batches of up to a maximum of 80 sections at a time. After xylene deparaffinisation, macrodissection was performed using a 16G needle and macrodissected tissue was collected into a labelled 1.5 ml RNA LoBind Eppendorf containing 200 μl 100% ethanol. Samples were then centrifuged at 13 000 rpm for 5 minutes, and the ethanol was then removed without disturbing the tissue pellet. Samples were placed (with lid open) in a thermoblock at 55° C. for approximately 5 minutes (or until dry). Samples were stored at −20° C.
For 30 of the samples macrodissected, areas with different tumour content or different histologies were macrodissected separately, resulting in a total of 46 samples from different tumour regions from 30 patient blocks. For 23 other samples, multiple regions of tumour (unless of differing histology) were macrodissected into one Eppendorf to increase the concentration of RNA extracted for Nanostring testing. A total of 25 regions were therefore macrodissected from 23 patient blocks.
Dual DNA and RNA extraction was performed using the Ambion Recoverall kit. Briefly, macrodissected tissue samples were thawed at room temperature. Digestion buffer and protease were added to each sample. Samples were incubated overnight in a thermoblock for 16 hours at 50° C. Samples were checked at 15 hours to ensure adequate digestion. If there was significant undigested tissue remaining, an additional 1-2 μl protease was added and the sample vortexed. Additional incubation time was given beyond 16 hours if required to ensure adequate digestion of tissue. Samples were then transferred to 80° C. for 15 minutes, before the addition of isolation additive and transfer to a filter cartridge in a new collection tube. Samples were centrifuged. The filter cartridge was transferred to a new collection tube and stored at 4° C. for DNA extraction later. The RNA extraction protocol was then completed on the filtrate as per the manufacturer's instructions. Following buffer washes and treatment with a DNase mix, RNA was eluted in a volume of 20 μl pre-warmed nuclease-free water and a double elution was performed, i.e. eluate was re-applied to the filter column. Samples were then kept on ice pending the DNA extractions, and all samples were quantified using Nanodrop. A total of 70 dual extractions were performed on samples from 51 patients. There was adequate RNA to proceed with Nanostring testing in 44/51 patients.
A total of 144 genes with potential relevance to bladder cancer were selected for analysis, including 10 control (housekeeping) genes (Table 1). Genes which had potential relevance for bladder cancer were selected for the following reasons:
Group 1: Genes used to classify samples according to TCGA MIBC subtypes in Robertson et al. (2017), which is incorporated herein by reference. Forty-six genes were selected in this category. These included DSC3, GSDMC, PI3, TGM1, TP63, APLP1, GNG4, MSI1, PEG10, PLEKHG4B, RND2, SOX2, TUBB2B, FGFR3, FOXA1, GATA3, KRT20, PPARG, SNX31, UPK1A, UPK2, CD274, CXCL11, IDO1, L1CAM, PDCD1LG2, SAA1, CDH2, CLDN3, CLDN4, CLDN7, SNAI1, TWIST1, ZEB1, ZEB2, C7, COMP, DES, PGM5, SFRP4, SGCD, CD44, COL17A1, KRT14, KRT5, and KRT6A.
For the nCounter assay, 48 samples of 100 ng of total RNA from 44 patients were hybridized with the custom designed code set of 144 genes and processed according to manufacturer's instruction. The final hybridisation was at 67° C. for 16 hours.
For normalisation of NanoString data, the nSolver Analysis Software (v3.0)(NanoString Technology) was used. Low quality samples flagged by the software were removed. 2 samples were removed at this point leaving data available for 46 of the 48 samples tested, from 43/44 patients.
Normalisation was performed using a standard approach, i.e. data was normalised using both control probes and housekeeping genes (as in Ragulan et al., 2019).
Positive spike-in RNA hybridization controls for each lane were summed to estimate the overall efficiency of hybridization and recovery for each lane. Background for each lane was determined from the negative control counts.
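By way of illustration only, the control-probe and housekeeping-gene normalisation outlined above might be sketched as follows. This is not the nSolver implementation: the function names, the order of operations (background subtraction, then positive-control scaling, then housekeeping scaling), the floor of 1 count and the use of a geometric mean for the housekeeping genes are all assumptions made for the example.

```python
from statistics import geometric_mean, mean

def positive_control_factors(pos_counts_by_lane):
    """Per-lane scaling factor: mean of all lane sums divided by the lane's own sum."""
    sums = {lane: sum(counts) for lane, counts in pos_counts_by_lane.items()}
    target = mean(list(sums.values()))
    return {lane: target / s for lane, s in sums.items()}

def normalise_lane(counts, neg_counts, pos_factor, hk_genes, hk_target):
    """Normalise one lane (hypothetical helper, not an nSolver function).

    counts: dict gene -> raw count; neg_counts: negative control counts;
    hk_genes: names of housekeeping genes; hk_target: desired housekeeping level.
    """
    background = mean(neg_counts)  # background estimated from negative controls
    # Subtract background (floored at 1 count, an assumption) and apply lane factor.
    corrected = {g: max(c - background, 1.0) * pos_factor for g, c in counts.items()}
    # Scale so that the housekeeping geometric mean matches the target level.
    hk_factor = hk_target / geometric_mean([corrected[g] for g in hk_genes])
    return {g: v * hk_factor for g, v in corrected.items()}
```

The housekeeping target would typically be chosen once for the whole run (for example the average housekeeping level across lanes) so that all lanes are brought to a common scale.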
Expression data was subjected to a process similar to that described in Sadanandam et al. (2013) to identify stable subtypes and subtype-specific biomarkers using the custom panel of genes described above. In particular, stable subtypes were identified using consensus clustering-based Non-negative matrix factorization (NMF). NMF reduces datasets containing large numbers of genes into a smaller matrix of metagenes and metasamples. Patterns of expression of the metasamples can then be used to robustly define subtypes and metagenes to identify subtype specific genes. An advantage of NMF is that, unlike with hierarchical clustering, an objective assessment of the number of groups present (k) can be inferred.
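For illustration, the factorisation step underlying NMF can be sketched with the standard multiplicative update rules, which approximate a non-negative genes-by-samples matrix V as the product of a metagene matrix W and a metasample matrix H. This is a minimal sketch and not the pipeline used in the study: the consensus clustering over repeated runs and the selection of k are omitted, and the function names, iteration count and random initialisation are assumptions.

```python
import random

def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(r) for r in zip(*A)]

def frob_error(V, W, H):
    """Squared Frobenius norm of V - WH."""
    WH = matmul(W, H)
    return sum((v - x) ** 2 for rv, rx in zip(V, WH) for v, x in zip(rv, rx))

def nmf(V, k, n_iter=200, seed=0, eps=1e-9):
    """Factorise V (genes x samples) into W (genes x k) and H (k x samples)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H)
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(rh, ra, rb)]
             for rh, ra, rb in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T)
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(rw, ra, rb)]
             for rw, ra, rb in zip(W, num, den)]
    return W, H

def assign_samples(H):
    """Assign each sample (column of H) to the metagene with the largest coefficient."""
    return [max(range(len(H)), key=lambda r: H[r][j]) for j in range(len(H[0]))]
```

In a consensus approach, a factorisation of this kind would be repeated from many random starts for each candidate k, and the stability of the resulting sample assignments compared across values of k.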
NMF clustering was followed by a two-step process to identify subtype-specific biomarkers:
Following preliminary work indicating that the CRCAssigner-38 subtypes may not be indicative of clinical outcome in the present cohort (see Reference example 1 below), the work above was restricted to genes in Groups 1-4.
The cohort was divided into radiotherapy responders and non-responders based upon the presence or absence firstly of locoregional recurrence, and then of invasive locoregional recurrence. A Shapiro-Wilk test on the log2-normalised data confirmed a non-normal distribution, and so Mann-Whitney tests were used to test for differentially expressed genes.
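As a non-limiting illustration of the per-gene comparison described above, a two-sided Mann-Whitney U test between responders and non-responders might be sketched as follows. The sketch uses the large-sample normal approximation without tie or continuity correction, which is an assumption made for brevity; a statistics package providing an exact test would normally be preferred for small groups. The function name is hypothetical.

```python
from math import erf, sqrt

def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, no tie correction).

    x, y: expression values of one gene in the two groups.
    Returns (U, approximate p-value).
    """
    # Pool the values, remembering which group each came from (0 = x, 1 = y).
    pooled = sorted((v, g) for g, vals in enumerate((x, y)) for v in vals)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of the 1-based ranks i+1 .. j
        rank_sum_x += midrank * sum(1 for t in range(i, j) if pooled[t][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    mu = n1 * n2 / 2
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma  # z <= 0 because u is the smaller U
    p = min(1.0, 1 + erf(z / sqrt(2)))  # equals 2 * Phi(z)
    return u, p
```

Applied gene by gene, this yields one p-value per gene, which can then be adjusted for multiple testing.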
To minimise the effects of multiple testing, a subset of 36 DNA damage repair and candidate radiosensitivity genes were tested and the Benjamini-Hochberg correction was applied using a false discovery rate of 0.05.
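The Benjamini-Hochberg step-up procedure referred to above can be sketched as follows, purely for illustration. The function name and the example p-values are hypothetical; in the present context the input would be the 36 per-gene p-values.

```python
def benjamini_hochberg(pvalues, fdr=0.05):
    """Return a list of booleans: True where the hypothesis is rejected at the given FDR."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k such that p_(k) <= (k / m) * fdr.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * fdr:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max (step-up rule).
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected

# Hypothetical p-values for four genes.
print(benjamini_hochberg([0.02, 0.03, 0.035, 0.9]))
```

Note that, because the procedure is step-up, a gene whose p-value exceeds its own rank threshold can still be declared significant if a higher-ranked gene meets its threshold, as for the first two values in the example.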
The method described in Ragulan et al. (2019) was used to assign samples to one of the five colorectal cancer subtypes described in Sadanandam, A. et al. (2013). A total of 36 out of 46 samples (78.3%) could be allocated to a subtype. 6/46 samples were deemed to be a mix of subtypes, and 4/46 samples were labelled as undetermined. The distribution of samples in the different subtypes is shown in Table 3.
There was no significant difference in T-stage (p=0.94), N-stage (p=0.95) or M-stage (p=0.37) between samples allocated to different primary subtypes (Fisher's exact test). However, there was a statistically significant difference in tumour content between the CRC subtypes allocated (p=0.0005, Kruskal-Wallis test). Post-hoc Dunn-Bonferroni tests showed significant differences between the stem-like subtype and goblet, TA and inflammatory subtypes (tumour contents, mean±standard deviation: enterocyte 74.5±14.62%, goblet-like 85.1±12.00%, inflammatory 80.0±15.81%, stem-like 57.0±14.54%, TA 87.0±6.71%).
Table 3 below also shows the pattern of relapse at a median follow-up of 3.80 years, depending on the CRCAssigner-38 subtype assigned. Where multiple samples have been tested from one patient, the patient has only been included once, under the most representative subtype (as determined by the subtype of the majority of samples). Patients returned as ‘undetermined’ or ‘mixed’ were labelled according to the primary subtype.
Table 4 below and
This analysis suggests that although the CRC subtypes in bladder cancer may have biological significance, as indicated by the finding that the subtypes differed at least in terms of tumour content, they do not appear to capture features that associate with radiosensitivity.
The TCGA classification system was not publicly available, so it was re-created from publicly available data on a subset of 234 TCGA subjects. Gene expression data and the subset of the TCGA subjects with the corresponding five subtypes were downloaded from the Broad Institute Firehose resource. A TCGA PAM centroid classifier for the five subtypes with 46 genes was developed. Samples from the present cohort were assigned to the TCGA subtypes based on the maximum Pearson correlation coefficient values after correlating each patient expression profile with the TCGA PAM centroid. 38/43 (82.6%) samples were assigned to a subtype. 4/43 samples were deemed to be a mix of subtypes and 2/43 were labelled undetermined. The primary subtype distribution is shown in Table 5 below. Of note, the 3 cases with small cell/neuroendocrine differentiation were assigned to the neuronal subtype.
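The assignment of a sample to the subtype whose centroid gives the maximum Pearson correlation, as described above, might be sketched as follows. The centroid values and subtype names in the usage lines are hypothetical, and the rules used to label a sample as “mixed” or “undetermined” are not reproduced here because they are not specified in this passage.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(va * vb)

def assign_subtype(profile, centroids):
    """Return (best subtype, all correlations) for one sample.

    profile: expression values for the classifier genes, in a fixed gene order.
    centroids: dict subtype name -> centroid values in the same gene order.
    """
    scores = {name: pearson(profile, c) for name, c in centroids.items()}
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical four-gene centroids for two subtypes.
centroids = {"luminal": [1.0, 2.0, 3.0, 4.0], "basal": [4.0, 3.0, 2.0, 1.0]}
best, scores = assign_subtype([2.0, 4.0, 5.0, 9.0], centroids)
```

In practice the profile and the centroids would each contain one value per classifier gene (46 genes for the classifier described above).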
Using primary subtype allocated, there was no significant difference in T-stage (p=0.9212), N-stage (p=0.7594), or M-stage (p=0.6414) between the allocated TCGA subtypes (Fisher's exact test). Tumour content differences between subtypes were observed (tumour contents, mean±standard deviation: basal squamous 68.75±18.11%, luminal infiltrated 58.57±17.72%, luminal papillary 76.6±12.87%, luminal 74.29±19.02%, neuronal 87.83±12.17%), although the only difference that came close to significance was the trend towards luminal infiltrated having lower tumour content (p=0.051, Kruskal-Wallis test).
Table 5 below also shows the patterns of relapse for the samples assigned to the different subtypes. For the two patients with more than one sample sent, the more prevalent subtype was selected for this analysis. This was not possible for one patient where 2 samples were tested with differing subtype allocations, and so the same sample as used for the CRCAssigner-38 analysis was selected for consistency (assigned to the neuronal subtype).
Formal statistical comparison was not performed due to small subcohort numbers. However, when dividing the cohort into those with basal (21/43 samples) or luminal (22/43) subtypes, there was no statistically significant difference in locoregional disease-free survival (p=0.826) or overall survival (p=0.549). In the basal subtypes, 11 patients had a relapse, 8 had LRR, 4 had invasive LRR, and 4 had M1 disease. In the luminal subtypes, 11 patients had a relapse, 9 had LRR, 5 had invasive LRR, and 8 had M1 disease.
Table 6 below and
The subtypes used in this example were developed in the context of bladder cancer and have already been shown to have biological significance (Robertson et al., 2017). However, the present analysis indicates that this classifier did not sufficiently capture features that associate with radiosensitivity.
In view of the lack of statistically significant prognostic or predictive effect of the TCGA or CRCAssigner subtypes in the present dataset (Reference examples 1 and 2 above), the present inventors set out to find whether a gene expression signature could be identified that would classify the patients into new subtypes that may be associated with radiosensitivity. A panel of genes as detailed above (Groups 1 to 4) was custom-designed, which contains (i) genes shown in Robertson et al. (2017) to classify MIBC patients into subtypes with different morphology, molecular phenotypes, and prognosis (Group 1), as well as (ii) a manually curated list of genes involved in DNA damage, radiosensitivity and bladder cancer (Groups 2-4).
Using the panel of 91 genes in Groups 1-4 in Table 2 (also in Table below) and NMF clustering, five subtypes were identified. As before, patients with multiple samples tested were included only once and the most prevalent subtype allocated was used. Table 7 below shows the results of this analysis, together with the patterns of relapse according to subtype allocated. The data shows that subtype 5, the largest subtype which included 30.2% of the cohort, did not contain any case of invasive LRR (i.e. 0 patients out of 13).
Kaplan-Meier analysis was performed and the results of this are shown in
Analysis of the post-radiotherapy assessment biopsies (see Table 9) revealed that across subtypes 4 and 5, there is a pathological complete response rate, defined as pT0, of 100% (11/11) compared to 60% (9/15) across groups 1-3 (p=0.0237).
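For illustration, the comparison of complete response rates reported above (11/11 across subtypes 4 and 5 versus 9/15 across subtypes 1-3) can be checked with a two-sided Fisher's exact test. The implementation below is a minimal sketch written for this example (the function name is hypothetical); it sums the hypergeometric probabilities of all tables no more likely than the observed one.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum over all tables at least as extreme (probability <= observed).
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

# Rows: subtypes 4-5 and subtypes 1-3; columns: pT0 and not pT0.
p = fisher_exact_two_sided(11, 0, 9, 6)
print(round(p, 4))  # 0.0237
```

The value obtained with these counts agrees with the p-value of 0.0237 stated above.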
All patients undergoing a post-radiotherapy cystoscopy but with no biopsy result (13/39) were documented to have no evidence of residual disease on cystoscopic appearances. These cases were added to the data in Table 9 and labelled as ‘pT0; no malignancy/atypia only’. Comparison of complete response rates between subtypes 1-3 and 4-5 remained statistically significant (p=0.0267, Fisher's exact test) even with this less reliable data source. Furthermore, of the 4 patients not undergoing cystoscopy, none had a documented locoregional recurrence at the time of data analysis.
Further analysis was performed by dividing the cohort into two groups: subtypes 1−3 and 4−5 or subtypes 1+3 and 2+4+5, based on the differences identified above (i.e. grouping subtypes that appeared to have poor vs good prognosis). This was performed in order to increase the number of samples in each group, with the hope that the increased number of samples would increase the power of the statistical comparison (despite the potential additional noise associated with the “lumping” of groups, which were known from the NMF to be biologically different).
Markers associated with the five subtypes (i.e. markers that contribute significantly to the classification) were identified using significance analysis of microarrays (SAM; Tusher, Tibshirani and Chu (2001)). Briefly, the SAM method computes a statistic di for each gene i, measuring the strength of the relationship between the gene expression and the response (in this case, classification label from the NMF clustering). In particular, the d statistic is a t statistic comparing expression of gene i in each class to the overall centroid (standardised by the within class standard deviation for each gene to give higher weight to genes whose expression is stable within each class), shrunken by an amount ΔSAM to obtain a more robust classifier (“de-noised” centroids). Repeated permutation of the data is used to estimate significance. The cutoff for significance is determined by the tuning parameter ΔSAM, chosen by the user depending on the false positive rate. As shown on
Subsequently, prediction analysis for microarrays (PAM, described in Narasimhan and Chu (2002)) was used to classify the samples using increasingly smaller panels of genes, and the misclassification error rate (MCR) was calculated for each of these.
PAM is a nearest shrunken centroids-based method that identifies subsets of genes that best characterise each class. The method computes a standardised centroid for each class (average gene expression for each gene in each class divided by the within-class standard deviation for that gene), which is then shrunk toward the overall centroid for all classes by an amount ΔPAM (also referred to as “threshold”). New samples are then classified using the shrunken centroids, by comparing the distance between the gene expression profile for the new sample and the shrunken centroids. The shrinkage makes the classifier more robust by reducing the effect of noisy genes, and does automatic gene selection. Indeed, if a gene is shrunk to zero for all classes, then it is eliminated from the prediction rule.
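The shrinkage and gene selection described above might be sketched as follows. This is a deliberately simplified illustration and not the PAM software: the class-size scaling, the positive offset added to the standard deviation and the class priors used by the published method are omitted, every gene is assumed to have non-zero within-class variance, and all function names are hypothetical.

```python
from math import sqrt

def soft_threshold(d, delta):
    """Shrink d toward zero by delta; values within delta of zero become 0."""
    mag = abs(d) - delta
    if mag <= 0:
        return 0.0
    return mag if d > 0 else -mag

def fit_shrunken_centroids(X, labels, delta):
    """X: list of samples (lists of gene values); labels: class of each sample."""
    classes = sorted(set(labels))
    n, g = len(X), len(X[0])
    overall = [sum(row[i] for row in X) / n for i in range(g)]
    means = {}
    for k in classes:
        rows = [r for r, l in zip(X, labels) if l == k]
        means[k] = [sum(r[i] for r in rows) / len(rows) for i in range(g)]
    # Pooled within-class standard deviation for each gene.
    s = [sqrt(sum((r[i] - means[l][i]) ** 2 for r, l in zip(X, labels))
              / (n - len(classes))) for i in range(g)]
    # Standardised class-vs-overall differences, shrunk by delta.
    shrunk = {k: [soft_threshold((means[k][i] - overall[i]) / s[i], delta)
                  for i in range(g)] for k in classes}
    # A gene shrunk to zero in every class drops out of the prediction rule.
    kept = [i for i in range(g) if any(shrunk[k][i] != 0.0 for k in classes)]
    return overall, s, shrunk, kept

def predict(x, overall, s, shrunk, kept):
    """Assign a new sample to the class with the nearest shrunken centroid."""
    def dist(k):
        return sum(((x[i] - overall[i]) / s[i] - shrunk[k][i]) ** 2 for i in kept)
    return min(shrunk, key=dist)
```

Increasing delta removes more genes from the rule, which is how progressively smaller gene panels can be obtained from a single training set.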
The training data gene expression was used to predict the five subtypes using PAM five-fold cross validation. Five delta values, 0.896, 1.15, 1.8, 2.09 and 2.7, were used to reduce the gene panel to sets of 68, 54, 32, 20 and 9 genes, with misclassification error rates of 0.19, 0.19, 0.27, 0.29 and 0.35, respectively. The centroids represent the average expression pattern of each gene set on the five subtypes.
The results of these analyses are shown on
The 71 genes classifier (c71, Tables 10 and 11 below) comprises 40 genes from Group 1, and 31 genes from Groups 2-4 (4 genes from Group 2, 4 genes from Group 3, 23 genes from Group 4). The group 1 genes are: KRT20, SFRP4, TWIST1, ZEB1, ZEB2, APLP1, C7, CD44, CDH2, CLDN3, CLDN4, CLDN7, COL17A1, COMP, DES, DSC3, FGFR3, FOXA1, GATA3, GNG4, GSDMC, KRT14, KRT5, KRT6A, L1CAM, MSI1, PGM5, PI3, PPARG, RND2, SAA1, SGCD, SNX31, TGM1, TP63, TUBB2B, UPK1A, UPK2, PDCD1LG2 and CD274. The Group 2 genes are RelA, CDK1, SUMO1 and HDAC1. The Group 3 genes are Trex1, cGAS, AIMP3 and STING. The Group 4 genes are RAD54L, RB1, MRE11, ERCC4, ERCC6, FANCD2, FANCF, FANCG, ATM, ATR, TXNIP, SLX4, BCLAF1, RAD50, NBN, E2F3, ERCC1, ERCC5, FANCB, BRCA2, BRCA1, KMT2D/MLL2 and BRIP1.
The 68 genes classifier (c68, Table 10) comprises 40 genes from Group 1, and 28 genes from Groups 2-4 (3 genes from Group 2, 4 genes from Group 3, 21 genes from Group 4). The group 1 genes are: KRT20, SFRP4, TWIST1, ZEB1, ZEB2, APLP1, C7, CD44, CDH2, CLDN3, CLDN4, CLDN7, COL17A1, COMP, DES, DSC3, FGFR3, FOXA1, GATA3, GNG4, GSDMC, KRT14, KRT5, KRT6A, L1CAM, MSI1, PGM5, PI3, PPARG, RND2, SAA1, SGCD, SNX31, TGM1, TP63, TUBB2B, UPK1A, UPK2, PDCD1LG2 and CD274. The Group 2 genes are RelA, CDK1 and HDAC1. The Group 3 genes are Trex1, cGAS, AIMP3 and STING. The Group 4 genes are RAD54L, RB1, MRE11, ERCC4, ERCC6, FANCD2, FANCF, FANCG, ATM, ATR, TXNIP, SLX4, BCLAF1, RAD50, NBN, E2F3, ERCC1, ERCC5, FANCB, BRCA2 and BRIP1.
The 54 genes classifier (c54, Table 10) comprises 39 genes from Group 1, and 15 genes from Groups 2-4 (3 genes from Group 2, 2 genes from Group 3, 10 genes from Group 4). The group 1 genes are: KRT20, SFRP4, TWIST1, ZEB1, ZEB2, APLP1, C7, CD44, CDH2, CLDN3, CLDN4, CLDN7, COL17A1, COMP, DES, DSC3, FGFR3, FOXA1, GATA3, GNG4, GSDMC, KRT14, KRT5, KRT6A, L1CAM, MSI1, PGM5, PI3, PPARG, RND2, SAA1, SGCD, SNX31, TGM1, TP63, TUBB2B, UPK1A, UPK2, and CD274.
The Group 2 genes are RelA, CDK1 and HDAC1.
The Group 3 genes are Trex1 and STING.
The Group 4 genes are RAD54L, RB1, MRE11, ERCC4, ERCC6, FANCD2, FANCF, FANCG, ATM, and ATR.
The 32 genes classifier (c32, Table 10) comprises 29 Group 1 genes and 3 Group 2-4 genes (1 from Group 3 and 2 from Group 4). The Group 1 genes are: TUBB2B, KRT14, KRT5, KRT20, UPK2, DES, SFRP4, SNX31, PI3, FOXA1, CLDN3, UPK1A, CLDN4, TWIST1, MSI1, CLDN7, ZEB2, KRT6A, FGFR3, COMP, PPARG, L1CAM, DSC3, SAA1, TP63, GNG4, TGM1, SGCD, and GATA3. The Group 3 gene is Trex1. The Group 4 genes are: MRE11 and RAD54L.
The 20 genes classifier (c20, Table 10) comprises 20 Group 1 genes. The genes are: TUBB2B, KRT14, KRT5, KRT20, UPK2, DES, SNX31, SFRP4, PI3, CLDN3, FOXA1, UPK1A, CLDN4, TWIST1, CLDN7, MSI1, FGFR3, KRT6A, ZEB2, and PPARG.
The 9 genes classifier (c9, Table 10) comprises 9 group 1 genes. The genes are: TUBB2B, KRT14, KRT5, KRT20, UPK2, DES, SNX31, SFRP4, and PI3.
The data on
The increase in misclassification rate when reducing the number of genes from 54 to 32 coincides with the loss of the Group 2 genes and many of the Group 4 genes. Together with the fact that classification using only the genes from Group 1 (Reference Example 2) did not result in a classification that significantly associated with the response to radiotherapy, this indicates that the inclusion of genes from Groups 2-4 (and in particular Groups 2 and 4) is helpful in distinguishing subtypes of MIBC that associate with response to radiotherapy. In particular, comparing the 32 genes classifier and the 54 genes classifier, it may be assumed that measuring the expression of at least 30 genes from Group 1 and at least 5 genes from Groups 2-4 would likely provide particularly useful predictive information.
PAM analysis computes univariate scores (centroids) that represent the importance of each gene to each class. Mathematically these are proportional to the loadings for each gene with a supervised principal component based on the class labels as a response variable. The centroids for the 5 subtypes in the C71 classifier are provided in Table 11. The centroids for the 5 subtypes in the C68 classifier are provided in Table 12. The centroids for the 5 subtypes in the C54 classifier are provided in Table 13. The centroids for the 5 subtypes in the C32 classifier are provided in Table 14. The centroids for the 5 subtypes in the C20 classifier are provided in Table 15. The centroids for the 5 subtypes in the C9 classifier are provided in Table 16.
The data in Table 11 above shows that the following genes are of particular importance to differentiate subtype 5 from the other subtypes: CLDN3, CLDN4, TWIST1, CLDN7, Trex1, MRE11, SAA1, GATA3, RND2, RelA, ATM, KRT14 (abs(score)>0.25) and KMT2D/MLL2, ATR, SFRP4, BCLAF1, ERCC4, ERCC6, MSI1, HDAC1, CD274, CD44, FANCG, UPK2, FANCF, PI3, APLP1, BRCA2, ERCC1, TXNIP, STING, TGM1, cGAS, KRT6A, SUMO1 (abs(score)>0.1). The data in Table 13 above indicates that amongst these, CLDN3, CLDN4, TWIST1, CLDN7, Trex1, MRE11, SAA1, GATA3, RND2, RelA and KRT14 are particularly important (abs(score)>0.1), and that each of CLDN3, CLDN4, TWIST1, CLDN7, Trex1, MRE11, SAA1, GATA3, RND2, RelA, HDAC1, CD274, CD44, FANCG, UPK2, FANCF, PI3, ERCC4, ERCC6, MSI1, ATM and KRT14 contributes to the classification in subtype 5 with classifier C54. The data in Table 14 above indicates that amongst these, CLDN3, CLDN4, TWIST1 and CLDN7 are particularly important (abs(score)>0.1), and that each of CLDN3, CLDN4, TWIST1, CLDN7, Trex1, MRE11, SAA1, GATA3 and KRT14 contributes to the classification in subtype 5 with classifier C32. The data in Table 15 above indicates that amongst these, CLDN3 is particularly important (abs(score)>0.1), and that each of CLDN3, CLDN4, TWIST1 and CLDN7 contributes to the classification in subtype 5 with classifier C20. Expression of these genes may therefore be used as predictive markers indicative of a likely positive response to radiotherapy (no invasive locoregional relapse).
In particular, gene sets comprising (i) at least CLDN3, (ii) at least CLDN3 and CLDN4, (iii) at least CLDN3, CLDN4 and TWIST1, (iv) at least CLDN3, CLDN4, TWIST1 and CLDN7, (v) at least CLDN3, CLDN4, TWIST1, CLDN7 and Trex1, (vi) at least CLDN3, CLDN4, TWIST1, CLDN7, Trex1 and MRE11, (vii) at least CLDN3, CLDN4, TWIST1, CLDN7, Trex1, MRE11 and SAA1, (viii) at least CLDN3, CLDN4, TWIST1, CLDN7, Trex1, MRE11, SAA1 and GATA3, (ix) at least CLDN3, CLDN4, TWIST1, CLDN7, Trex1, MRE11, SAA1, GATA3 and RND2, (x) at least CLDN3, CLDN4, TWIST1, CLDN7, Trex1, MRE11, SAA1, GATA3, RND2 and RelA, (xi) at least CLDN3, CLDN4, TWIST1, CLDN7, Trex1, MRE11, SAA1, GATA3, RND2, RelA and KRT14, (xii) at least CLDN3, CLDN4, TWIST1, CLDN7, Trex1, MRE11, SAA1, GATA3, RND2, RelA, KRT14, and HDAC1, (xiii) at least CLDN3, CLDN4, TWIST1, CLDN7, Trex1, MRE11, SAA1, GATA3, RND2, RelA, KRT14, HDAC1, and CD274, (xiv) at least CLDN3, CLDN4, TWIST1, CLDN7, Trex1, MRE11, SAA1, GATA3, RND2, RelA, KRT14, HDAC1, CD274, and CD44, (xv) at least CLDN3, CLDN4, TWIST1, CLDN7, Trex1, MRE11, SAA1, GATA3, RND2, RelA, KRT14, HDAC1, CD274, CD44, and FANCG, (xvi) at least CLDN3, CLDN4, TWIST1, CLDN7, Trex1, MRE11, SAA1, GATA3, RND2, RelA, KRT14, HDAC1, CD274, CD44, FANCG, and UPK2, (xvii) at least CLDN3, CLDN4, TWIST1, CLDN7, Trex1, MRE11, SAA1, GATA3, RND2, RelA, KRT14, HDAC1, CD274, CD44, FANCG, UPK2, and FANCF, (xviii) at least CLDN3, CLDN4, TWIST1, CLDN7, Trex1, MRE11, SAA1, GATA3, RND2, RelA, KRT14, HDAC1, CD274, CD44, FANCG, UPK2, FANCF, and PI3, (xix) at least CLDN3, CLDN4, TWIST1, CLDN7, Trex1, MRE11, SAA1, GATA3, RND2, RelA, KRT14, HDAC1, CD274, CD44, FANCG, UPK2, FANCF, PI3, and ERCC4, (xx) at least CLDN3, CLDN4, TWIST1, CLDN7, Trex1, MRE11, SAA1, GATA3, RND2, RelA, KRT14, HDAC1, CD274, CD44, FANCG, UPK2, FANCF, PI3, ERCC4, and ERCC6, or (xxi) at least CLDN3, CLDN4, TWIST1, CLDN7, Trex1, MRE11, SAA1, GATA3, RND2, RelA, KRT14, HDAC1, CD274, CD44, FANCG, UPK2, FANCF, PI3, ERCC4, ERCC6, MSI1, and ATM are explicitly envisaged (optionally in combination with the genes identified herein in gene sets suitable for use in identifying any of subtypes 1, 2, 3 and 4; see below).
The data in Table 11 above shows that the following genes are of particular importance to differentiate subtype 4 from the other subtypes: KRT14, KRT5, PI3, KRT6A, DSC3, TGM1, COL17A1, GSDMC, L1CAM, C7 (abs(score)>0.25) and COMP, KRT20, ATM, UPK2, TP63, SAA1, CD44, FGFR3 (abs(score)>0.1). The data in Table 13 above indicates that amongst these, KRT14, KRT5, PI3, KRT6A, DSC3, TGM1, COL17A1 and GSDMC are particularly important (abs(score)>0.1), and that each of KRT14, KRT5, PI3, KRT6A, DSC3, TGM1, COL17A1, GSDMC, L1CAM, TP63 and C7 contributes to the classification in subtype 4 with classifier C54. The data in Table 14 above indicates that amongst these, KRT14, KRT5, PI3 and KRT6A are particularly important (abs(score)>0.1), and that each of KRT14, KRT5, PI3, KRT6A and DSC3 contributes to the classification in subtype 4 with classifier C32. The data in Tables 15 and 16 above indicates that amongst these, KRT14, KRT5 and PI3 are particularly important (abs(score)>0.1), and that each of KRT14, KRT5, PI3 and KRT6A contributes to the classification in subtype 4 with classifiers C20. Expression of these genes may therefore be used as predictive markers indicative of a likely positive response to radiotherapy (such as e.g. no invasive locoregional relapse). In particular, gene sets comprising (i) at least KRT14, KRT5, PI3, (ii) at least KRT14, KRT5, PI3, KRT6A, (iii) at least KRT14, KRT5, PI3, KRT6A, DSC3, (iv) at least KRT14, KRT5, PI3, KRT6A, DSC3, TGM1, (v) at least KRT14, KRT5, PI3, KRT6A, DSC3, TGM1, COL17A1, (vi) at least KRT14, KRT5, PI3, KRT6A, DSC3, TGM1, COL17A1, GSDMC, (vii) at least KRT14, KRT5, PI3, KRT6A, DSC3, TGM1, COL17A1, GSDMC, L1CAM, (viii) at least KRT14, KRT5, PI3, KRT6A, DSC3, TGM1, COL17A1, GSDMC, L1CAM, C7, or (ix) at least KRT14, KRT5, PI3, KRT6A, DSC3, TGM1, COL17A1, GSDMC, L1CAM, C7, and TP63, are explicitly envisaged (optionally in combination with the genes identified herein in gene sets suitable for use in identifying any of subtypes 1, 2, 3 and 5).
The data in Table 11 above shows that the following genes are of particular importance to differentiate subtype 1 from the other subtypes: DES, SFRP4, ZEB2, COMP, SGCD, C7, ZEB1, PGM5, KRT14, ATM, RB1, STING, CLDN7, CLDN3, RAD54L (abs(score)>0.25) and E2F3, FANCD2, BRCA1, DSC3, MSI1, CDK1, BRIP1, ERCC5, CDH2, UPK1A, TXNIP, PDCD1LG2 (abs(score)>0.1). The data in Table 13 above indicates that amongst these DES, SFRP4, ZEB2, COMP, RAD54L, SGCD, C7, ZEB1 are particularly important (abs(score)>0.1), and that each of DES, SFRP4, ZEB2, COMP, RAD54L, SGCD, C7, ZEB1, CLDN3, PGM5, KRT14, ATM, CLDN7, RB1, and STING contributes to the classification in subtype 1 with classifier C54. The data in Table 14 above indicates that amongst these, DES, SFRP4, ZEB2 are particularly important (abs(score)>0.1), and that each of DES, SFRP4, ZEB2, COMP, RAD54L, SGCD contributes to the classification in subtype 1 with classifier C32. The data in Table 15 above indicates that amongst these, DES, SFRP4 are particularly important (abs(score)>0.1), and that each of DES, SFRP4, and ZEB2 contributes to the classification in subtype 1 with classifier C20. The data in Table 16 above indicates that amongst these, DES is particularly important (abs(score)>0.1), and that each of SFRP4 and DES contributes to the classification in subtype 1 with classifier C9. Expression of these genes may therefore be used as predictive markers indicative of a likely negative response to radiotherapy (such as e.g. invasive locoregional relapse). In particular, gene sets comprising
(i) at least DES, (ii) at least DES, SFRP4, (iii) at least DES, SFRP4, ZEB2, (iv) at least DES, SFRP4, ZEB2, COMP, (v) at least DES, SFRP4, ZEB2, COMP, RAD54L, (vi) at least DES, SFRP4, ZEB2, COMP, RAD54L, SGCD, (vii) at least DES, SFRP4, ZEB2, COMP, RAD54L, SGCD, C7, (viii) at least DES, SFRP4, ZEB2, COMP, RAD54L, SGCD, C7, ZEB1, (ix) at least DES, SFRP4, ZEB2, COMP, RAD54L, SGCD, C7, ZEB1, CLDN3, (x) at least DES, SFRP4, ZEB2, COMP, RAD54L, SGCD, C7, ZEB1, CLDN3, PGM5, (xi) at least DES, SFRP4, ZEB2, COMP, RAD54L, SGCD, C7, ZEB1, CLDN3, PGM5, KRT14, (xii) at least DES, SFRP4, ZEB2, COMP, RAD54L, SGCD, C7, ZEB1, CLDN3, PGM5, KRT14, ATM, (xiii) at least DES, SFRP4, ZEB2, COMP, RAD54L, SGCD, C7, ZEB1, CLDN3, PGM5, KRT14, ATM, CLDN7, (xiv) at least DES, SFRP4, ZEB2, COMP, RAD54L, SGCD, C7, ZEB1, CLDN3, PGM5, KRT14, ATM, CLDN7, RB1, or (xv) at least DES, SFRP4, ZEB2, COMP, RAD54L, SGCD, C7, ZEB1, CLDN3, PGM5, KRT14, ATM, CLDN7, RB1, and STING, are explicitly envisaged (optionally in combination with the genes identified herein in gene sets suitable for use in identifying any of subtypes 2, 3, 4 and 5).
The data in Table 11 above shows that the following genes are of particular importance to differentiate subtype 2 from the other subtypes: KRT20, SNX31, UPK2, FANCD2, MRE11, CDH2, CD44, GSDMC, RND2, APLP1, Trex1, GNG4, TGM1, SAA1, TWIST1, KRT5, KRT6A, L1CAM, PI3, TUBB2B (abs(score)>0.25) and E2F3, PGM5, FANCG, COMP, ERCC1, SLX4, PDCD1LG2, cGAS, COL17A1, CLDN3, CD274, RelA, DES, STING, FOXA1, ATR, PPARG, ERCC6, CDK1, UPK1A, AIMP3, FANCB, NBN, BRIP1 (abs(score)>0.1). The data in Table 13 above indicates that amongst these, KRT20, SNX31, TUBB2B, PI3, UPK2, L1CAM, KRT6A, KRT5, TWIST1, SAA1, TGM1, GNG4, Trex1, APLP1, RND2, GSDMC, CD44, and FANCD2 are particularly important (abs(score)>0.1), and that each of KRT20, SNX31, TUBB2B, PI3, UPK2, L1CAM, KRT6A, KRT5, TWIST1, SAA1, TGM1, GNG4, Trex1, APLP1, RND2, GSDMC, CD44, FANCD2, CDH2, MRE11, FOXA1, ATR, STING, PPARG, DES, ERCC6, CDK1, RelA, CD274, CLDN3, and COL17A1 contributes to the classification in subtype 2 with classifier C54. The data in Table 14 above indicates that amongst these, KRT20, SNX31, TUBB2B, PI3, are particularly important (abs(score)>0.1), and that each of KRT20, SNX31, TUBB2B, PI3, UPK2, L1CAM, KRT6A, KRT5, TWIST1, SAA1, and TGM1 contributes to the classification in subtype 2 with classifier C32. The data in Table 15 above indicates that amongst these, KRT20, SNX31, and TUBB2B are particularly important (abs(score)>0.1), and that each of KRT20, SNX31, TUBB2B, PI3, and UPK2 contributes to the classification in subtype 2 with classifier C20. The data in Table 16 above indicates that amongst these, KRT20 and SNX31 are particularly important (abs(score)>0.1), and that each of KRT20, SNX31 and TUBB2B contributes to the classification in subtype 2 with classifier C9. Expression of these genes may therefore be used as predictive markers indicative of a likely positive response to radiotherapy (such as e.g. no invasive locoregional relapse).
In particular, gene sets comprising (i) at least KRT20, SNX31, and TUBB2B, (ii) at least KRT20, SNX31, TUBB2B, PI3, (iii) at least KRT20, SNX31, TUBB2B, PI3, UPK2, (iv) at least KRT20, SNX31, TUBB2B, PI3, UPK2, L1CAM, (v) at least KRT20, SNX31, TUBB2B, PI3, UPK2, L1CAM, KRT6A, (vi) at least KRT20, SNX31, TUBB2B, PI3, UPK2, L1CAM, KRT6A, KRT5, (vii) at least KRT20, SNX31, TUBB2B, PI3, UPK2, L1CAM, KRT6A, KRT5, TWIST1, (viii) at least KRT20, SNX31, TUBB2B, PI3, UPK2, L1CAM, KRT6A, KRT5, TWIST1, SAA1, (ix) at least KRT20, SNX31, TUBB2B, PI3, UPK2, L1CAM, KRT6A, KRT5, TWIST1, SAA1, TGM1, (x) at least KRT20, SNX31, TUBB2B, PI3, UPK2, L1CAM, KRT6A, KRT5, TWIST1, SAA1, TGM1, GNG4, (xi) at least KRT20, SNX31, TUBB2B, PI3, UPK2, L1CAM, KRT6A, KRT5, TWIST1, SAA1, TGM1, GNG4, Trex1, APLP1, (xii) at least KRT20, SNX31, TUBB2B, PI3, UPK2, L1CAM, KRT6A, KRT5, TWIST1, SAA1, TGM1, GNG4, Trex1, APLP1, RND2, (xiii) at least KRT20, SNX31, TUBB2B, PI3, UPK2, L1CAM, KRT6A, KRT5, TWIST1, SAA1, TGM1, GNG4, Trex1, APLP1, RND2, GSDMC, (xiv) at least KRT20, SNX31, TUBB2B, PI3, UPK2, L1CAM, KRT6A, KRT5, TWIST1, SAA1, TGM1, GNG4, Trex1, APLP1, RND2, GSDMC, CD44, FANCD2, (xv) at least KRT20, SNX31, TUBB2B, PI3, UPK2, L1CAM, KRT6A, KRT5, TWIST1, SAA1, TGM1, GNG4, Trex1, APLP1, RND2, GSDMC, CD44, FANCD2, CDH2, (xvi) at least KRT20, SNX31, TUBB2B, PI3, UPK2, L1CAM, KRT6A, KRT5, TWIST1, SAA1, TGM1, GNG4, Trex1, APLP1, RND2, GSDMC, CD44, FANCD2, CDH2, MRE11, (xvii) at least KRT20, SNX31, TUBB2B, PI3, UPK2, L1CAM, KRT6A, KRT5, TWIST1, SAA1, TGM1, GNG4, Trex1, APLP1, RND2, GSDMC, CD44, FANCD2, CDH2, MRE11, FOXA1, (xviii) at least KRT20, SNX31, TUBB2B, PI3, UPK2, L1CAM, KRT6A, KRT5, TWIST1, SAA1, TGM1, GNG4, Trex1, APLP1, RND2, GSDMC, CD44, FANCD2, CDH2, MRE11, FOXA1, ATR, (xix) at least KRT20, SNX31, TUBB2B, PI3, UPK2, L1CAM, KRT6A, KRT5, TWIST1, SAA1, TGM1, GNG4, Trex1, APLP1, RND2, GSDMC, CD44, FANCD2, CDH2, MRE11, FOXA1, ATR, STING, (xx) at least KRT20, SNX31, TUBB2B, PI3, UPK2, L1CAM, KRT6A, KRT5, TWIST1, SAA1, TGM1, GNG4, Trex1, APLP1, RND2, GSDMC, CD44, FANCD2, CDH2, MRE11, FOXA1, ATR, STING, PPARG, (xxi) at least KRT20, SNX31, TUBB2B, PI3, UPK2, L1CAM, KRT6A, KRT5, TWIST1, SAA1, TGM1, GNG4, Trex1, APLP1, RND2, GSDMC, CD44, FANCD2, CDH2, MRE11, FOXA1, ATR, STING, PPARG, DES, (xxii) at least KRT20, SNX31, TUBB2B, PI3, UPK2, L1CAM, KRT6A, KRT5, TWIST1, SAA1, TGM1, GNG4, Trex1, APLP1, RND2, GSDMC, CD44, FANCD2, CDH2, MRE11, FOXA1, ATR, STING, PPARG, DES, ERCC6, (xxiii) at least KRT20, SNX31, TUBB2B, PI3, UPK2, L1CAM, KRT6A, KRT5, TWIST1, SAA1, TGM1, GNG4, Trex1, APLP1, RND2, GSDMC, CD44, FANCD2, CDH2, MRE11, FOXA1, ATR, STING, PPARG, DES, ERCC6, CDK1, (xxiv) at least KRT20, SNX31, TUBB2B, PI3, UPK2, L1CAM, KRT6A, KRT5, TWIST1, SAA1, TGM1, GNG4, Trex1, APLP1, RND2, GSDMC, CD44, FANCD2, CDH2, MRE11, FOXA1, ATR, STING, PPARG, DES, ERCC6, CDK1, RelA, (xxv) at least KRT20, SNX31, TUBB2B, PI3, UPK2, L1CAM, KRT6A, KRT5, TWIST1, SAA1, TGM1, GNG4, Trex1, APLP1, RND2, GSDMC, CD44, FANCD2, CDH2, MRE11, FOXA1, ATR, STING, PPARG, DES, ERCC6, CDK1, RelA, CD274, (xxvi) at least KRT20, SNX31, TUBB2B, PI3, UPK2, L1CAM, KRT6A, KRT5, TWIST1, SAA1, TGM1, GNG4, Trex1, APLP1, RND2, GSDMC, CD44, FANCD2, CDH2, MRE11, FOXA1, ATR, STING, PPARG, DES, ERCC6, CDK1, RelA, CD274, CLDN3, or (xxvii) at least KRT20, SNX31, TUBB2B, PI3, UPK2, L1CAM, KRT6A, KRT5, TWIST1, SAA1, TGM1, GNG4, Trex1, APLP1, RND2, GSDMC, CD44, FANCD2, CDH2, MRE11, FOXA1, ATR, STING, PPARG, DES, ERCC6, CDK1, RelA, CD274, CLDN3, and COL17A1, are explicitly envisaged (optionally in combination with the genes identified herein in gene sets suitable for use in identifying any of subtypes 1, 3, 4 and 5).
The data in Table 11 above shows that the following genes are of particular importance to differentiate subtype 3 from the other subtypes: TUBB2B, MSI1, GNG4, RB1, PI3, TP63, KRT14, PPARG, FGFR3, CLDN4, UPK1A, FOXA1, SNX31, KRT20, UPK2 (abs(score)>0.25) and BRIP1, E2F3, RAD54L, FANCB, AIMP3, GATA3, ZEB2, MRE11, GSDMC, TXNIP, CLDN7, RAD50, SFRP4, TWIST1, ZEB1, KRT6A, STING, RelA, CD44, KRT5 (abs(score)>0.1). The data in Table 13 above indicates that amongst these TUBB2B, MSI1, GNG4, PI3, TP63, KRT14, PPARG, FGFR3, CLDN4, UPK1A, FOXA1, SNX31, KRT20, UPK2 are particularly important (abs(score)>0.1), and that each of TUBB2B, MSI1, GNG4, RelA, CD44, KRT5, RB1, PI3, TP63, KRT14, PPARG, FGFR3, CLDN4, UPK1A, FOXA1, SNX31, KRT20, and UPK2 contributes to the classification in subtype 3 with classifier C54. The data in Table 14 above indicates that amongst these, TUBB2B, MSI1 are particularly important (abs(score)>0.1), and that each of TUBB2B, MSI1, GNG4, TP63, KRT14, PPARG, FGFR3, CLDN4, UPK1A, FOXA1, SNX31, KRT20, UPK2 contributes to the classification in subtype 3 with classifier C32. The data in Table 15 above indicates that amongst these, TUBB2B, FOXA1, SNX31, KRT20, UPK2 are particularly important (abs(score)>0.1), and that each of TUBB2B, MSI1, PPARG, FGFR3, CLDN4, UPK1A, FOXA1, SNX31, KRT20, UPK2 contributes to the classification in subtype 3 with classifier C20. The data in Table 16 above indicates that amongst these, TUBB2B, UPK2 are particularly important (abs(score)>0.1), and that each of TUBB2B, SNX31, KRT20, UPK2 contributes to the classification in subtype 3 with classifier C9. Expression of these genes may therefore be used as predictive markers indicative of a likely negative response to radiotherapy (such as e.g. invasive locoregional relapse). In particular, gene sets comprising
(i) at least TUBB2B and UPK2, (ii) at least TUBB2B, UPK2, KRT20, (iii) at least TUBB2B, UPK2, KRT20, SNX31, FOXA1, (iv) at least TUBB2B, UPK2, KRT20, SNX31, FOXA1, UPK1A, (v) at least TUBB2B, UPK2, KRT20, SNX31, FOXA1, UPK1A, CLDN4, (vi) at least TUBB2B, UPK2, KRT20, SNX31, FOXA1, UPK1A, CLDN4, MSI1, (vii) at least TUBB2B, UPK2, KRT20, SNX31, FOXA1, UPK1A, CLDN4, MSI1, FGFR3, (viii) at least TUBB2B, UPK2, KRT20, SNX31, FOXA1, UPK1A, CLDN4, MSI1, FGFR3, PPARG,
(ix) at least TUBB2B, UPK2, KRT20, SNX31, FOXA1, UPK1A, CLDN4, MSI1, FGFR3, PPARG, KRT14, (x) at least TUBB2B, UPK2, KRT20, SNX31, FOXA1, UPK1A, CLDN4, MSI1, FGFR3, PPARG, KRT14, TP63, (xi) at least TUBB2B, UPK2, KRT20, SNX31, FOXA1, UPK1A, CLDN4, MSI1, FGFR3, PPARG, KRT14, TP63, UPK2, (xii) at least TUBB2B, UPK2, KRT20, SNX31, FOXA1, UPK1A, CLDN4, MSI1, FGFR3, PPARG, KRT14, TP63, UPK2, PI3, (xiii) at least TUBB2B, UPK2, KRT20, SNX31, FOXA1, UPK1A, CLDN4, MSI1, FGFR3, PPARG, KRT14, TP63, UPK2, PI3, RB1, (xiv) at least TUBB2B, UPK2, KRT20, SNX31, FOXA1, UPK1A, CLDN4, MSI1, FGFR3, PPARG, KRT14, TP63, UPK2, PI3, RB1, KRT5, (xv) at least TUBB2B, UPK2, KRT20, SNX31, FOXA1, UPK1A, CLDN4, MSI1, FGFR3, PPARG, KRT14, TP63, UPK2, PI3, RB1, KRT5, CD44, or (xvi) at least TUBB2B, UPK2, KRT20, SNX31, FOXA1, UPK1A, CLDN4, MSI1, FGFR3, PPARG, KRT14, TP63, UPK2, PI3, RB1, KRT5, CD44, and RelA, are explicitly envisaged (optionally in combination with the genes identified herein in gene sets suitable for use in identifying any of subtypes 1, 2, 4 and 5).
The above data further indicates that the following genes may be particularly important to differentiate patients that have a poor prognosis following chemoradiation (e.g. patients in subtypes 1, 2 and/or 3) from patients that have a good prognosis following chemoradiation (e.g. patients in subtypes 4 and/or 5): ATM (overexpressed in subtypes 1-3, underexpressed in subtypes 4-5), ATR (overexpressed in subtype 2, underexpressed in subtype 5), C7 (overexpressed in subtype 1, underexpressed in subtype 4), CD274 (underexpressed in subtype 2, overexpressed in subtype 5), CD44 (underexpressed in subtypes 2-3, overexpressed in subtypes 4-5), cGAS (underexpressed in subtype 2, overexpressed in subtype 5), CLDN3 (underexpressed in subtypes 1-2, overexpressed in subtype 5), CLDN7 (underexpressed in subtypes 1-3, overexpressed in subtype 5), CLDN4 (underexpressed in subtypes 1, 3, overexpressed in subtype 5), ERCC1 (underexpressed in subtype 2, overexpressed in subtype 5), ERCC6 (overexpressed in subtype 2, underexpressed in subtype 5), KRT6A (underexpressed in subtypes 2-3, overexpressed in subtypes 4-5), MRE11 (underexpressed in subtypes 2-3, overexpressed in subtype 5), PI3 (underexpressed in subtypes 2-3, overexpressed in subtypes 4-5), RelA (underexpressed in subtypes 2-3, overexpressed in subtypes 4-5), SAA1 (underexpressed in subtypes 2 and 3 (to a lower extent), overexpressed in subtypes 4-5), SFRP4 (overexpressed in subtype 1, underexpressed in subtypes 3-5 (to a lower extent)), SUMO1 (moderately underexpressed in subtype 2, moderately overexpressed in subtype 5), TGM1 (underexpressed in subtypes 2 and 3 (to a lower extent), overexpressed in subtypes 4-5), Trex1 (underexpressed in subtypes 2 and 3 (to a lower extent), overexpressed in subtype 5), and TWIST1 (underexpressed in subtypes 2-3, overexpressed in subtype 5).
The following genes may be useful in differentiating patients in subtype 1 (that have a particularly poor prognosis following chemoradiation) from patients in subtypes 4 and/or 5 (that have a particularly good prognosis following chemoradiation): KRT5, SFRP4, DES, PI3, CLDN3, CLDN7, KRT14, ZEB2, COMP, C7, CLDN4, SGCD, ZEB1, COL17A1, TGM1, DSC3, KRT6A, and TWIST1 (Group 1), and RAD54L, ATM (Group 4).
The following genes may be useful in differentiating patients in subtype 1 (that have a particularly poor prognosis following chemoradiation) from patients in subtype 4 (that have a particularly good prognosis following chemoradiation): KRT5, SFRP4, DES, PI3, KRT14, C7, COMP, ZEB2, DSC3, KRT6A, SGCD, ZEB1, COL17A1, and TGM1 (Group 1), and RAD54L, ATM (Group 4). Indeed, the PAM centroid coordinates for subtypes 1 and 4 for each of these genes have a distance >0.4.
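The distance criterion mentioned above can be illustrated with a minimal sketch. The gene names and centroid values below are hypothetical placeholders, not values from Table 11, and the sketch assumes that "distance" means the absolute difference between the two per-subtype centroid coordinates of a gene.

```python
# Hypothetical per-gene centroid coordinates for two subtypes
# (illustrative placeholders only, not taken from Table 11).
centroids = {
    "GENE_A": {"subtype1": 0.45, "subtype4": -0.30},
    "GENE_B": {"subtype1": 0.05, "subtype4": 0.10},
    "GENE_C": {"subtype1": -0.25, "subtype4": 0.35},
}


def discriminating_genes(centroids, a, b, min_distance=0.4):
    """Return genes whose centroid coordinates for subtypes a and b
    differ by more than min_distance (assumed: absolute difference)."""
    return sorted(
        gene
        for gene, coords in centroids.items()
        if abs(coords[a] - coords[b]) > min_distance
    )


selected = discriminating_genes(centroids, "subtype1", "subtype4")
print(selected)
```

With these placeholder values, GENE_A and GENE_C pass the 0.4 threshold and GENE_B does not.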
The following genes may be useful in differentiating patients in subtype 1 (that have a particularly poor prognosis following chemoradiation) from patients in subtype 5 (that have a particularly good prognosis following chemoradiation, where the good prognosis is thought to be driven by radiosensitivity): SFRP4, DES, CLDN3, CLDN7, KRT14, ZEB2, COMP, C7, CLDN4, SGCD, ZEB1, and TWIST1 (Group 1), and RAD54L, ATM (Group 4). Indeed, the PAM centroid coordinates for subtypes 1 and 5 for each of these genes have a distance >0.4.
The following genes may be useful in differentiating patients in subtype 1 (that have a particularly poor prognosis following chemoradiation) from patients in subtypes 4 and 5 (that have a particularly good prognosis following chemoradiation): SFRP4, DES, CLDN3, CLDN7, KRT14, ZEB2, COMP, C7, CLDN4, SGCD, ZEB1, and TWIST1 (Group 1), and RAD54L, ATM (Group 4).
Table 17 below shows the clinicopathological features of each subtype identified in Example 3. No significant difference was noted between the 5 groups although there was a trend towards subtype 1 having a lower tumour content.
Subtype 1 overexpressed genes within the epithelial-mesenchymal transition (EMT) pathway such as SGCD, CDH2, SFRP4, ZEB2 and COMP. Subtype 1 also overexpressed extracellular matrix genes such as DES. CLDN3 and CLDN7 were underexpressed which would be in keeping with a claudin-low subtype. Interestingly, RAD54L, BRIP1 and CDK1 were also underexpressed; RAD54L and BRIP1 are involved in homologous recombination (repair of double stranded DNA breaks). There was a trend towards a lower tumour content compared to other subtypes which is something seen in TCGA luminal infiltrated cases (Robertson et al., 2017).
Subtype 2 overexpressed luminal markers such as KRT20, PPARG, UPK2. Of note, this subtype demonstrated higher levels of expression of AIMP3, FANCB and NBN compared to the other subtypes.
Subtype 3 displayed high expression of genes associated with the TCGA neuronal subtype such as TUBB2B and MSI1. RAD54L and FANCB expression also featured although at lower levels than that seen in subtype 1. Luminal markers were underexpressed, in keeping with this being a basal subtype.
Subtype 4 demonstrated high levels of keratins expressed by basal cells (KRT14, KRT5 and KRT6A). ATM was underexpressed although not to the same degree as that seen in subtype 5. Subtype 4 also demonstrated the highest levels of L1CAM, which was categorised as an immune marker in the TCGA report.
Subtype 5 showed moderate expression of EMT genes (CLDN3/4/7, TWIST1). Of all the subtypes, this group had the highest expression levels of Trex1 and MRE11. Of note, there was underexpression of ATM, ERCC6, ERCC4, BCLAF1 and ATR. Subtype 5 also had the highest expression of immune markers SAA1 and CD274.
Table 17 below compares the TCGA subtype allocations (see Reference Example 2) and the subtypes allocated as described in Examples 3 and 4. The data shows that luminal tumours were found in subtype 2 only and most of the neuronal tumours within subtype 3. Basal-squamous tumours tended to be in subtype 5.
Focussing on a subset of 36 genes associated with DNA damage repair or candidate radiosensitivity genes, Mann-Whitney tests were performed comparing gene expression between patients with or without any locoregional recurrence (LRR), and additionally between patients with or without invasive locoregional recurrence. Tables 18 and 19 below show the raw and adjusted p-values (p-values adjusted for multiple testing using Benjamini-Hochberg correction with FDR 0.05) obtained for LRR and invasive LRR, respectively, for the top 5 most differentially expressed genes.
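The multiple-testing adjustment can be sketched as follows. This is a minimal pure-Python version of the standard Benjamini-Hochberg step-up adjustment, not the code used in the study. The p-value vector is hypothetical, except that it contains 36 entries (matching the 36 genes tested) with a smallest raw value of 0.002.

```python
def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvals)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values: one small value among 36 tests.
raw = [0.002] + [0.5] * 35
adj = benjamini_hochberg(raw)
print(round(adj[0], 3))
```

Under this assumption, a smallest raw p-value of 0.002 among 36 tests adjusts to 0.002 × 36 = 0.072, which illustrates how a nominally significant gene can lose significance after correction.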
A positive log 2 fold change value indicates higher levels of expression in patients with locoregional relapse vs those without, and a negative value indicates lower expression in those with locoregional relapse. For example, a log 2 fold change of 1 indicates that the gene expression level is twice as high in patients who had a locoregional relapse compared to those with no locoregional relapse, and conversely, a log 2 fold change of −1 indicates that the gene expression level in those with locoregional relapse is half of that seen in patients with no locoregional relapse.
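The relationship between a log 2 fold change and the underlying expression ratio can be sketched as follows. The function names and expression values are illustrative assumptions, and the source does not state which summary statistic (for example mean or median) was used per group.

```python
import math


def log2_fold_change(expr_relapse, expr_no_relapse):
    """log2 of the ratio of summary expression levels (relapse / no relapse)."""
    return math.log2(expr_relapse / expr_no_relapse)


def expression_ratio(log2_fc):
    """Convert a log2 fold change back to a plain expression ratio."""
    return 2.0 ** log2_fc


# Hypothetical summary expression values for one gene.
doubled = log2_fold_change(200.0, 100.0)  # expression twice as high with relapse
halved = log2_fold_change(50.0, 100.0)    # expression half as high with relapse
print(doubled, halved, expression_ratio(-1.0))
```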
None of the genes reached statistical significance once adjustments were made for multiple testing. However, together with the data in Example 4, this data indicates that the expression of at least HDAC1 (Group 2), ATM (Group 4), ERCC5 (Group 2), MRE11 (Group 4) and BRCA2 (Group 4) may particularly contribute to differentiating patients with poor vs good prognosis following radiotherapy, especially in combination with genes that identify subtypes of MIBC from Group 1 as demonstrated in Examples 3 and 4.
The c71 classifier shown in Table 11 above was applied to the data from Robertson et al. (2017) (also referred to herein as “TCGA data”, or “TCGA cohort”). The gene expression of each sample was correlated to the c71 centroids and samples were assigned to the subtype with the maximum Pearson correlation coefficient. Table 16 below shows the results of this analysis.
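The assignment rule described above (maximum Pearson correlation with a subtype centroid) can be sketched as follows. The four-gene centroids and the sample vector are hypothetical placeholders rather than c71 values, and the sketch does not attempt to reproduce how "mixed" subtypes were called, since the source does not specify that rule.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def assign_subtype(sample, centroids):
    """Return the best-correlated subtype and all correlation values."""
    correlations = {name: pearson(sample, c) for name, c in centroids.items()}
    best = max(correlations, key=correlations.get)
    return best, correlations


# Hypothetical centroids over four genes (same gene order as the sample).
centroids = {
    "subtype1": [1.0, 0.5, -0.5, -1.0],
    "subtype2": [0.5, -1.0, 1.0, -0.5],
    "subtype4": [-1.0, -0.5, 0.5, 1.0],
}
sample = [0.9, 0.4, -0.6, -0.8]

best, correlations = assign_subtype(sample, centroids)
print(best)
```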
Comparing the TCGA and c71 subtypes allocation in this data, it can be seen that 54/84 (64.3%) of the TCGA basal squamous samples were allocated c71 subtype 4 and 18/84 (21.4%) were classified as subtype 3. 11/84 (13.1%) were deemed to have a mixed subtype and further exploration revealed samples to be either a mix of subtypes 3 and 4, or of subtypes 3 and 1.
Of the samples labelled as TCGA luminal, 9/15 (60%) samples fell into subtype 2, and 3/15 (20%) were assigned to subtype 1. The remaining 3/15 (20%) of cases were all deemed to have a mixture of subtypes 1 and 3. Interestingly, 47/81 (58.0%) TCGA luminal papillary were also labelled as subtype 2, suggesting that the present classifier is less sensitive to the proposed subdivision of luminal cases into luminal or luminal papillary.
29/45 (64.4%) of the TCGA luminal infiltrated were labelled as subtype 1. 8/9 (88.9%) of the neuronal samples were classified as subtype 3.
Interestingly, only 6/234 TCGA cases were allocated to subtype 5, and these were all luminal papillary cases. By contrast, in the radiotherapy cohort in Example 3, subtype 5 formed the largest subgroup accounting for 30.2%.
Survival analysis was performed for 233/234 samples of the TCGA, for which survival data was available.
As discussed in Examples 4 to 6 above, the following genes may be particularly relevant in identifying patients that are likely to respond to radiotherapy+/−chemotherapy and/or patients that are unlikely to respond to radiotherapy+/−chemotherapy:
A review of the literature discussing some of the above genes was performed and is summarised below.
ATM was underexpressed in subtype 4 and particularly subtype 5. ATM plays a key role in initiation of DNA damage repair pathways by interacting with the MRN complex which is composed of MRE11, NBN and RAD50. Decreased levels of expression might therefore be hypothesised to result in decreased activation of DNA repair pathways with subsequent radiosensitivity. Of note, ATR was also underexpressed in subtype 5 and plays a similar role to that of ATM in sensing DNA damage and initiating repair pathways.
Individuals with ataxia telangiectasia, a condition where the ATM gene is mutated, are very sensitive to ionising radiation. In MIBC, genomic aberrations in ATM have been shown to be associated with response to neoadjuvant chemotherapy (Plimack et al., 2015), and overall improved outcomes (Yap et al., 2014). In the context of chemoradiation for MIBC, Desai et al. (2016) reported that deleterious aberrations in DNA damage repair genes (including ATM) may be associated with a lower risk of recurrence. However, within the cohort of 48 patients in the study of Desai et al., only 5 had ATM aberrations and so it is not possible to comment on the significance of ATM aberrations in chemoradiation. Interestingly, no association between ATM and outcomes has been noted at an immunohistochemical level (Choudhury et al., 2010; Laurberg et al., 2012). However, a predictive value for radiotherapy response associated with ATM mRNA levels was not known before the present study.
Conversely, subtype 1 had the highest levels of ATM expression and the highest incidence of invasive LRR (3/5; 60%) of the 5 subtypes. The higher level of invasive local recurrence seen in subtype 1 supports the hypothesis that ATM overexpression and underexpression are associated with radioresistance and radiosensitivity, respectively.
These results are also supported by the data from differential gene expression analysis where ATM was noted to be expressed at twice the level in those with invasive locoregional relapse compared to those without (raw p value=0.002), although this was no longer statistically significant after adjustment for multiple testing (p=0.072) (most likely due primarily to the high number of genes tested).
MRE11 was most highly expressed in subtype 5. Expression of this gene at the protein level was previously shown to be a potential predictive biomarker of radiotherapy response in MIBC by Choudhury et al. (2010). They reported that higher MRE11 levels at an immunohistochemical level were associated with better cause-specific survival following radiotherapy but not cystectomy. These results were validated in an independent cohort (Laurberg et al., 2012) although more recent work from 2 groups (Desai et al., 2016; Walker et al., 2019) found no such association.
An association between high MRE11 protein expression and improved outcomes following radiotherapy is arguably counterintuitive. Indeed, one might hypothesise that with increased expression of MRE11, there would be increased detection of DNA damage and repair, resulting in radioresistance. In looking for an explanation for this ‘unexpected’ finding, Martin et al. (2014) explored whether MRE11 mRNA and protein levels were correlated. They reported a lack of correlation, suggesting therefore that MRE11 is subject to post-transcriptional regulation. Of note, Martin et al. (2014) reported that mRNA and protein levels of NBS1 and RAD50 did appear to be positively correlated. Interestingly, in this work, no association was found between MRE11 protein expression and survival, in contrast to the findings of Choudhury et al. (2010).
In the present work, higher expression levels of MRE11 were seen in subtype 5, members of which had lower incidence of invasive locoregional recurrence and better overall survival.
ERCC 1/2/4-6 are involved in nucleotide excision repair, which is the primary pathway by which adducts such as those from cisplatin or mitomycin C are repaired. ERCC2 has been of great interest as work from several groups has suggested it may have a role as a biomarker of response to neoadjuvant chemotherapy. However, somewhat surprisingly, ERCC2 did not contribute to the c71 gene panel despite being expressed at statistically significantly lower levels in patients with locoregional relapse (p=0.002) before multiple testing correction. There was also a trend towards lower expression levels of ERCC2 in patients with invasive locoregional relapse (p=0.06 before multiple testing correction). Its omission from the c71 panel might in part be due to the fact that the fold change observed was small at −0.46 (patients with locoregional relapse vs patients without locoregional relapse) and −0.45 (patients with invasive locoregional relapse vs patients without invasive locoregional relapse) respectively. Further, ERCC2 is thought to primarily play a role in the removal of platinum adducts rather than radiation-induced DNA damage repair.
Given the proposed role of ERCC2 mutations with regards to chemosensitivity, it seems plausible that ERCC2 status influences the effects of concomitant mitomycin-C (used as a radiosensitiser in bladder radiotherapy): patients with reduced ERCC2 function may gain more radiosensitising effect from concomitant chemotherapy, while those with ‘normal’ or increased ERCC2 effects may be better served by radiotherapy alone, or alternative radiosensitisers such as carbogen and nicotinamide.
Desai et al. (2016) explored aberrations of DNA damage repair genes in a cohort of MIBC patients treated with chemoradiation and reported that ERCC2 mutations were associated with lower 2-year distant metastatic recurrence, and that only 1 of 6 patients with an ERCC2 missense aberration had a local recurrence.
In contrast to ERCC2, however, ERCC4 and ERCC6 did form part of the c71 gene panel and were both underexpressed in subtype 5. A role of ERCC4 or ERCC6 expression in the context of bladder cancer or radiotherapy has not been previously reported. Given the role of the ERCC gene family in nucleotide excision repair, it is not unreasonable to suggest that expression levels of ERCC4 and ERCC6 may influence the effect of chemotherapy given neoadjuvantly and/or concurrently. The majority of patients in the RT cohort received neoadjuvant chemotherapy (NAC) and concurrent chemotherapy as they were treated following results from key trials (James et al., 2012; Advanced Bladder Cancer Meta-analysis, C., 2005) demonstrating survival benefit for NAC and concurrent chemotherapy. The potential interaction between ERCC gene expression and chemotherapy is not something that can be investigated in the existing cohort as the subset of those not receiving chemotherapy is small. Indeed, it would likely be difficult to prospectively accrue numbers for patients treated with radiotherapy alone as the standard of care is to use chemotherapy neoadjuvantly and concurrently if proceeding with a bladder-sparing strategy (James et al., 2012; Advanced Bladder Cancer Meta-analysis, C., 2005; Chang et al., 2017; Witjes et al., 2017; NCCN, 2018; Yin et al., 2016). Patients treated with radiotherapy alone are likely to have significant comorbidities precluding the use of chemotherapy or surgery, and these would likely be confounding factors in data analysis. Nevertheless, the present data indicates that patients that underexpress ERCC5 and/or ERCC6 are likely to benefit from a bladder preservation strategy using chemoradiation, which is the current standard of care for bladder preservation strategy. In other words, such patients could usefully be directed to a bladder preservation strategy implemented according to the current standard of care.
As previously mentioned, subtype 1 had the highest incidence of invasive locoregional recurrence. This subtype had the lowest levels of expression of RAD54L and BRIP1, which are components of the homologous recombination pathway, responsible for the repair of double-stranded DNA breaks (DSB).
One might therefore expect that underexpression of these genes would result in reduced DSB repair and subsequent radiosensitivity. In view of the previously discussed evidence that ATM and RB1 may contribute to radioresistance in subtype 1, it seems that the type of DNA repair that is particularly active in these cells may play a role in moderating radiosensitivity. DSBs are predominantly repaired by non-homologous end joining (NHEJ) in the G1 phase of the cell cycle, whereas homologous recombination (HR) is predominant in the G2-M phases (Branzei, D. & Foiani, M, 2008). Greater levels of RB1 might therefore cause cells to arrest at the G1/S checkpoint, which is a less radiosensitive phase and one where the predominant repair mechanism is NHEJ. In such a state, RAD54L and BRIP1 would not contribute significantly to DNA repair and their underexpression would therefore not impact the efficiency of DNA repair.
In this work, we identified a classifier and subtypes associated with invasive locoregional relapse following radical radiotherapy+/−chemotherapy. No such association was seen with the TCGA subtypes, when used on the same data set.
In particular, tumours classified in subtypes 4 and 5 as disclosed herein were associated with improved outcomes following radiotherapy+/−chemotherapy. Of the 20 patients assigned to c71 subtype 4 or 5, only one had an invasive locoregional recurrence (5%), compared to 8/23 (34.8%) of patients allocated to subtypes 1-3 (p=0.0243). Of note, patients in subtypes 4 and 5 had similar median follow-up periods to those in subtypes 1-3 (3.40 vs 3.80 years; p=0.591).
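The comparison of invasive locoregional recurrence between the two groups (1/20 vs 8/23) can be checked with a small sketch. The source does not name the statistical test used; the code below assumes a two-sided Fisher's exact test on the 2×2 table and implements it with the standard library only.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables (with the same
    margins) that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(k):
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))


# Rows: subtypes 4/5 and subtypes 1-3; columns: invasive recurrence yes / no.
p_value = fisher_exact_two_sided(1, 19, 8, 15)
print(round(p_value, 4))
```

Under that assumption the computed value is approximately 0.024, in line with the reported p=0.0243.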
Furthermore, subtypes 4 and 5 together have a statistically significant higher pathological complete response rate after radiotherapy compared to subtypes 1-3 (100% vs 60%; p=0.0237).
This suggests that the expression signature identified herein could function as a predictive biomarker with patients found to have subtype 4 or 5 tumours being counselled towards bladder preservation strategies.
Tumours with basal features have previously been reported as associated with poorer outcomes. In the present analysis of the TCGA data which included those primarily treated with radical cystectomy, the 2-year overall survival of those with basal squamous tumours was <50%. Repeat analysis of the same dataset using the classifier disclosed herein showed that subtype 4 had a similarly poor 2-year survival rate. This is consistent with the broad overlap between the TCGA basal-squamous and subtype 4 groups; 64% of TCGA basal-squamous tumours were allocated to subtype 4.
However, in the present radiotherapy-treated cohort, subtype 4 was associated with a 2-year overall survival of 85%. This result suggests that patients with subtype 4 tumours derive greater benefit from radiotherapy+/−chemotherapy over that from surgery alone (as is the case in the TCGA cohort). A potential association of basal-like subtypes with improved outcomes following bladder preservation strategies over surgery has not been previously reported.
As the majority of the radiotherapy cohort received NAC and concurrent chemotherapy, it could be suggested that the improvement in overall survival observed in subtype 4 and TCGA basal squamous patients in the radiotherapy cohort is primarily due to the use of chemotherapy, as opposed to radiotherapy over surgery. Seiler et al. reported in their NAC cohort that tumours assigned to GSC basal subtype appeared to derive the most benefit from NAC with 3-year OS rate of 77.8% compared to 49.2% in a non-NAC cohort. This is similar to the subtype 4 observed 3-year overall survival rates of 85% in the radiotherapy+/−chemotherapy cohort, and 32% in the TCGA surgical cohort where the majority did not receive NAC.
Given that MIBC is recognised to have high metastatic relapse rates, and that metastatic relapse is what limits a patient's survival, it seems possible that the benefit seen in the basal-squamous/subtype 4 groups is primarily due to the combination of chemotherapy with radiotherapy, as opposed to being driven by the use of radiotherapy over surgery.
Nevertheless, the present data indicate that patients in subtype 4 are likely to benefit from a bladder preservation strategy using chemoradiation, which is the current standard of care for bladder preservation. In other words, such patients could usefully be directed to a bladder preservation strategy implemented according to the current standard of care.
By contrast, the effect in subtype 5 was primarily seen at the level of locoregional relapse, which reflects the local action of radiotherapy. This suggests that the improved prognosis observed in these patients is likely driven by the use of radiotherapy. It further indicates that patients in subtype 5 are likely to benefit from a bladder preservation strategy implemented according to the current standard of care (chemoradiation), and potentially also from a bladder preservation strategy implemented using radiotherapy alone (e.g. where surgery and chemotherapy are preferably avoided for other reasons).
All references cited herein are incorporated herein by reference in their entirety and for all purposes to the same extent as if each individual publication or patent or patent application was specifically and individually indicated to be incorporated by reference in its entirety.
The specific embodiments described herein are offered by way of example, not by way of limitation. Any sub-titles herein are included for convenience only, and are not to be construed as limiting the disclosure in any way.
Number | Date | Country | Kind |
---|---|---|---|
2011213.2 | Jul 2020 | GB | national |

Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2021/070274 | 7/20/2021 | WO |