Embodiments of this invention relate to improved methods for therapy, diagnosis, prognosis, and predicting response to treatment of certain types of cancer, and to methods for screening for and developing novel targets, biomarkers and therapeutics for treating certain types of cancer. Embodiments of the invention have particular application in methods for therapy, diagnosis, prognosis and predicting response to treatment of clear cell carcinoma of the ovary, endometrioid carcinoma, and uterine carcinoma, and to methods of screening for and developing novel therapeutics for treating clear cell carcinoma of the ovary, endometrioid carcinoma, and uterine carcinoma.
In North America, ovarian cancer is the leading cause of death due to gynaecological malignancies and is the fifth leading cause of cancer death in Canadian women. Ovarian cancers can be divided into subtypes based on their tumour cell types. Clear cell carcinomas (CCC) of the ovary are one of the ovarian cancer subtypes and represent approximately 12% of all malignant ovarian tumours. Though they are intrinsically resistant to traditional platinum and taxane therapies, these cancers are still treated similarly to other ovarian cancers. Patients with CCC are therefore exposed to treatment which is ineffective, toxic, and expensive and there are currently no alternative anti-cancer agents effective for this disease. Thus, due to the limited success of traditional chemotherapy, there is an urgent need for more effective treatments which are specific to the CCC subtype of ovarian cancer.
Epithelial ovarian cancer is the fifth leading cause of cancer death and second most common gynaecological malignancy in Canada. There are several subtypes of epithelial ovarian cancer. High grade serous cancers are the most common and account for approximately 70% of all cases. CCCs are the second most common subtype (12% of cases) and the second leading cause of ovarian cancer associated deaths. Whereas high grade serous cancers are the subject of The Cancer Genome Atlas Project, CCCs are relatively understudied.
Despite evidence that ovarian carcinoma subtypes are essentially different diseases3,4, it is current practice to treat them all with platinum/taxane chemotherapy. CCCs, however, respond extremely poorly to this treatment5-7, with response rates of 15% compared to 80% for high grade serous carcinomas4. CCCs have a low mitotic rate4,8, are genetically stable, are diploid or tetraploid, and develop from well-established precursor lesions. They do not exhibit the complex karyotypes or chromosomal instability associated with high grade serous cancers8,9, which may contribute to their chemoresistance. CCCs are often diagnosed at an early stage, with 80% of cases presenting with stage I or II carcinoma10,11; however, survival rates for stage I/II CCC are significantly lower (60%) than those for other ovarian cancer subtypes presenting with stage I/II disease7,12. There are currently no effective anti-cancer agents for CCCs.
CCCs are defined based on histopathological findings as tumours composed predominantly of clear cells and hobnail cells13. While CCC express hepatocyte nuclear factor-1beta, they rarely express biomarkers commonly associated with high grade serous or other ovarian cancers4 and the distinctive CCC immunophenotype can be used as an aid in diagnostically challenging cases14. The most commonly mutated gene in CCC is PIK3CA (present in 14%-50% of cases)15-19. By contrast, BRCA1, BRCA2, and TP53 mutations are commonly found in high grade serous cancers but are typically absent in CCCs19,20. Though there is an association between both CCCs and low-grade endometrioid carcinomas with endometriosis21, the mechanism of this transformation was previously unknown for CCCs. In addition, CCCs can arise from adenofibromas22,23. CCCs are aggressive cancers untreatable with current chemotherapy, are poorly understood, and remain relatively understudied. In addition, they are genomically stable8,9.
Next generation sequencing technology is based on massively parallel single molecule sequencing to cost-effectively produce millions of short sequence reads. This technology can fully interrogate genomes or transcriptomes at a single base resolution for single nucleotide variants, splice variants, genome rearrangements, copy number changes, inversions, and insertions and deletions24. In the case of paired-end sequencing, next generation sequencing technology generates millions of randomly fragmented, short sequenced reads that flank longer unsequenced regions. Data is generated using a four-color DNA “sequencing-by-synthesis” technology followed by fluorescence detection. After completion of the first read, templates are regenerated in situ to enable a second read from the opposite end of the fragments, producing end-sequence pairs. It is possible to use this technology for whole genome analysis; however, this is much more costly than RNA-seq (whole transcriptome analysis), which sequences cDNAs generated from total mRNA. Resulting paired-end reads are aligned to a reference sequence (e.g. NCBI build 36.1, hg18), which produces relevant data on each read, such as location within the transcriptome, quality of read, number of mismatches, and paired-end flags. Single nucleotide variants (SNVs) are predicted based on discrepancies between the reference genome and the aligned mapped reads. Fusion transcripts and other rearrangements are recognized by identifying all mate-pairs that do not align canonically in pairs to the human genome.
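The SNV prediction step described above can be illustrated with a minimal sketch. It assumes the reads have already been aligned and are represented as gapless (start position, sequence) pairs; the function name `call_snvs` and the thresholds `min_depth` and `min_fraction` are hypothetical choices made for illustration and are not taken from the methods described herein.

```python
from collections import Counter, defaultdict


def call_snvs(reference, aligned_reads, min_depth=3, min_fraction=0.2):
    """Report positions where aligned reads disagree with the reference.

    reference: reference sequence as a string (0-based coordinates).
    aligned_reads: iterable of (start, sequence) pairs, gapless alignments.
    min_depth, min_fraction: illustrative thresholds (assumptions).
    """
    # Build a per-position tally of the bases observed in the reads.
    pileup = defaultdict(Counter)
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference):
                pileup[pos][base] += 1

    snvs = []
    for pos in sorted(pileup):
        counts = pileup[pos]
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too few reads to support a call
        ref_base = reference[pos]
        for base, n in counts.most_common():
            # A candidate SNV is a non-reference base seen often enough.
            if base != ref_base and n / depth >= min_fraction:
                snvs.append((pos, ref_base, base, n, depth))
    return snvs


# Toy data (hypothetical): three of four reads carry T at position 4.
reference = "ACGTACGTAC"
reads = [(0, "ACGTAC"), (2, "GTTCGT"), (3, "TTCGTA"), (4, "TCGTAC")]
print(call_snvs(reference, reads))  # [(4, 'A', 'T', 3, 4)]
```

A production pipeline would also weigh base and mapping quality and handle insertions and deletions, which this sketch deliberately omits.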
Chromosomal DNA is wound around proteins called histones to form a complex structure called chromatin. The basic unit of chromatin is the nucleosome which is composed of DNA wrapped around eight histone proteins. Nucleosomes are connected by linker DNA, similar to beads on a string. Further coiling or condensation of chromatin creates a higher order structure known as heterochromatin. DNA organized into heterochromatin is inaccessible to transcriptional machinery. Chromatin remodelling, either through covalent modification of histones or through the mobilization of nucleosomes, is required before DNA can be accessed for transcriptional initiation.
The SWI/SNF protein complex uses ATP hydrolysis to mobilize nucleosomes which modulates accessibility to transcription machinery. The SWI/SNF protein complex is typically associated with transcriptional activation or repression and functions at the promoter. This complex is present in all eukaryotes and is essential for many cellular processes including development, differentiation, proliferation, DNA repair, and tumour suppression26. The complex is comprised of one of two ATPases, BRM (Brahma) or BRG1 (Brahma-Related Gene 1)27,28, along with conserved core subunits and variable accessory proteins termed BAFs (BRM- or BRG1-associated factors) (
BRG1 containing SWI/SNF complexes contain either BAF250 or BAF180, while BRM complexes contain only BAF250. There are two BAF250 proteins which are encoded by paralogous genes. BAF250a (also referred to as p270) is encoded by the ARID1A gene and BAF250b is encoded by the ARID1B gene. These proteins are mutually exclusive within BRG1 or BRM containing SWI/SNF complexes29.
Co-immunoprecipitation studies indicate that BAF250a and BAF250b interact with BRG1 and BRM through their C-terminal domains30 and the interaction between BAF250a and BRG1 has been shown to be required for transactivation of the MMTV (mouse mammary tumour virus) promoter31. This steroid hormone responsive promoter is often used as part of a model system to study transcriptional activation from SWI/SNF-mediated chromatin remodelling. Specifically, BAF250a has been shown to stimulate glucocorticoid receptor-mediated transactivation; this requires the presence of the BAF250a C-terminus which can directly interact with the glucocorticoid receptor in vitro32.
There remains an unmet need in the oncology field for new treatment modalities that specifically target the molecular defects driving the pathogenesis of CCC, endometrioid carcinoma (EC), and uterine carcinoma. There is a need for novel prognostic, diagnostic and predictive (response to treatment) markers for CCC, EC, and uterine carcinoma. There is a need for novel therapeutic targets for treatment of CCC, EC, and uterine carcinoma, methods for identifying such novel therapeutic targets, and therapeutic agents for treating these cancers.
The following embodiments and aspects thereof are described and illustrated in conjunction with systems, tools and methods which are meant to be exemplary and illustrative, not limiting in scope. In various embodiments, one or more of the above-described problems have been reduced or eliminated, while other embodiments are directed to other improvements.
Embodiments of the invention provide novel biomarkers and therapeutic targets for treatment of certain types of cancer, including CCC, EC, and uterine carcinoma. Mutations in genes encoding proteins that form part of the SWI/SNF chromatin remodelling protein complex, including ARID1A, or loss of expression of such proteins, including BAF250a, can be used to evaluate the likelihood that endometriosis will progress or transform to cancer, to provide a prognosis for a patient with cancer, to assess whether conventional treatment is likely to be effective against a cancer, and/or in a synthetic lethal screen to identify novel targets and therapeutics for the treatment of cancer.
Mutations in ARID1A or other genes encoding proteins that are components of the SWI/SNF complex can be assessed by assaying for the presence of such mutations in a sample of tissue obtained from a site of endometriosis or a carcinoma of a subject. Techniques that may be used to confirm the presence of mutations in ARID1A include Sanger sequencing of the tissue sample or next generation sequencing of the tissue sample, PCR-based methods including Amplification Refractory Mutation System (ARMS)-based PCR, or TaqMan™ assays, or hybridization-based methods including fluorescence in-situ hybridization (FISH), or any other suitable detection technique.
Loss of expression of proteins that are components of the SWI/SNF complex, including BAF250a, can be assessed by obtaining a sample of tissue from a site of endometriosis or a carcinoma of a subject and assaying the sample for expression of that protein, for example using immunohistochemistry.
In some embodiments, cells having mutations in ARID1A or other genes encoding proteins that are components of the SWI/SNF complex can be used in a synthetic lethal screen to identify new targets for the treatment of CCC, EC and uterine carcinoma. In some embodiments, targets identified by such screens can be used to screen for novel therapeutics useful in the treatment of CCC, EC and uterine carcinoma.
In addition to the exemplary aspects and embodiments described above, further aspects and embodiments will become apparent by reference to the drawings and by study of the following detailed descriptions.
Exemplary embodiments are illustrated in referenced figures of the drawings. It is intended that the embodiments and figures disclosed herein are to be considered illustrative rather than restrictive.
Throughout the following description specific details are set forth in order to provide a more thorough understanding to persons skilled in the art. However, well known elements may not have been shown or described in detail to avoid unnecessarily obscuring the disclosure. Accordingly, the description and drawings are to be regarded in an illustrative, rather than a restrictive, sense.
For further clarity, database identifiers for the ARID1A gene, RNA and protein are as follows: Entrez Gene: 8289; UniProtKB/Swiss-Prot: ARI1A_HUMAN, O14497; RefSeq DNA sequence: NC_000001.10, NT_004610.19; RefSeq mRNAs for ARID1A gene (2 alternative transcripts): NM_006015.4, NM_139135.2. The wild-type sequence for ARID1A (NM_006015.4) is set forth in SEQ ID NO.:1.
The inventors have now discovered that mutations in genes encoding proteins that are components of the SWI/SNF complex are useful as biomarkers or targets to assist in the diagnosis, prognosis and treatment of, and development of therapeutic agents for, certain types of cancer including clear cell carcinoma (CCC) of the ovary, endometrioid carcinoma (EC), and uterine carcinoma. The inventors have demonstrated that such mutations are relatively common in endometrial carcinomas but relatively infrequent in other types of cancer. The mechanism of progression of cancer involving these mutations appears to be distinct from other known mechanisms of cancer development. See also Wiegand et al., N. Engl. J. Med. 2010, 363:1532-1543, and the Supplementary Appendix thereto, both of which are hereby incorporated by reference herein.
Ovarian CCC and EC are thought to arise from endometriosis. The presence of nonsense mutations, significant missense mutations, or genetic rearrangements in genes encoding proteins that are important to the proper functioning of the SWI/SNF complex in endometriosis may indicate a risk of malignant progression or transformation of endometriosis to these cancers or other types of ovarian cancers, a poor prognosis for a patient having a form of cancer with such mutations, or a likelihood that standard chemotherapeutic agents such as platinum or taxane therapeutics will be ineffective in treating a form of cancer with such mutations. A lack of expression of proteins that are important to the proper functioning of the SWI/SNF complex in endometriosis may indicate a risk of malignant progression or transformation of endometriosis to these cancers or other types of ovarian cancers. A lack of expression of proteins that are important to the proper functioning of the SWI/SNF complex in a carcinoma may indicate a poor prognosis for a patient with the carcinoma, and/or a likelihood that standard chemotherapeutic agents such as platinum or taxane therapeutics will be ineffective in treating that carcinoma.
As used herein, the term “significant mutation” when used with reference to a gene means a mutation in the DNA sequence of the gene that produces a mutated protein that is not able to fully perform the typical function of that protein. The term “significant mutation” when used with reference to a protein means a mutation in the DNA sequence encoding that protein that produces a mutated protein product that is not able to fully perform the typical function of that protein, and includes all mutations equivalent thereto by reason of the degeneracy of the genetic code. A significant mutation could include a truncation mutation, a nonsense mutation, a significant missense mutation, and/or a genetic rearrangement.
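By way of illustration only, the nonsense and missense categories named in this definition can be distinguished for a single-codon substitution using the standard genetic code. The sketch below is a simplification: the function name `classify_codon_change` and its return labels are hypothetical, and whether a given missense change is "significant" in the sense defined above depends on functional evidence that sequence comparison alone does not provide.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(ref_codon, alt_codon):
    """Classify a single-codon substitution (hypothetical helper).

    Returns 'synonymous', 'nonsense', 'stop-loss', or 'missense'.
    """
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"   # encoded residue unchanged
    if alt_aa == "*":
        return "nonsense"     # premature stop codon introduced
    if ref_aa == "*":
        return "stop-loss"    # original stop codon removed
    return "missense"         # one amino acid replaced by another


print(classify_codon_change("CAG", "TAG"))  # nonsense
print(classify_codon_change("CTG", "CTA"))  # synonymous
print(classify_codon_change("GAA", "GTA"))  # missense
```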
As used herein, the term “poor prognosis” means a significant prospect that a patient with cancer will suffer a negative outcome, e.g. morbidity or death, as a result of the cancer.
Embodiments of the invention provide novel targets and molecular defects associated with the development and pathogenesis of CCC of the ovary, EC and uterine carcinoma. These targets and defects are distinct from those characteristic of other types of ovarian cancer and will enable the development of new therapies effective for treatment of CCC of the ovary, EC and uterine carcinoma.
Embodiments of the invention provide novel biomarkers useful for the prognosis of CCC of the ovary, EC and uterine carcinoma. Embodiments of the invention provide novel biomarkers to enable prediction of the risk of malignant progression (or transformation) of endometriotic lesions (endometriosis) to these cancers or other types of ovarian cancer.
Embodiments of the invention provide novel biomarkers useful for predicting response to treatment (chemotherapy, radiation, targeted drug therapy and the like) of patients with CCC of the ovary, EC and uterine carcinoma.
In one aspect of the invention, mutations in one or more of the genes/proteins comprising the SWI/SNF chromatin remodelling complex are markers that are useful as therapeutic targets, or to enable the development of therapeutic targets for treatment of CCC of the ovary, EC and uterine carcinoma.
In another aspect of the invention, mutations in one or more of the genes/proteins comprising the SWI/SNF chromatin remodelling complex are novel biomarkers useful for the prognosis of CCC of the ovary, EC and uterine carcinoma and for prediction of the risk of malignant progression (or transformation) of endometriotic lesions (endometriosis).
In another aspect of the invention, mutations in one or more of the genes/proteins comprising the SWI/SNF chromatin remodelling complex are novel biomarkers that are useful for predicting response to treatment (chemotherapy, radiation, targeted drug therapy and the like) of patients with CCC of the ovary, EC and uterine carcinoma.
In another aspect of the invention, one or more mutations in the gene ARID1A (encoding protein BAF250a (also referred to as p270)), a component of the SWI/SNF chromatin remodelling complex, are markers that are useful as therapeutic targets, or to enable the development of therapeutic targets for treatment of CCC of the ovary, EC and uterine carcinoma.
In another aspect of the invention, one or more mutations in the gene ARID1A (encoding protein BAF250a (also referred to as p270)), a component of the SWI/SNF chromatin remodelling complex, are novel biomarkers useful for the prognosis of CCC of the ovary, EC and uterine carcinoma and for prediction of the risk of malignant progression (or transformation) of endometriotic lesions (endometriosis).
In another aspect of the invention, one or more mutations in the gene ARID1A (encoding protein BAF250a (also referred to as p270)), a component of the SWI/SNF chromatin remodelling complex, are novel biomarkers that are useful for predicting response to treatment (chemotherapy, radiation, targeted drug therapy and the like) of patients with CCC of the ovary, EC and uterine carcinoma.
In an aspect of the invention, one or more of the mutations in SEQ ID NO.:2 through SEQ ID NO.:122 (shown in
In another aspect of the invention, one or more of the mutations in SEQ ID NO.:2 through SEQ ID NO.:122 (shown in
In an aspect of the invention, one or more of the mutations in SEQ ID NO.:2 through SEQ ID NO.:122 (shown in
In an aspect of the invention, one or more mutations (shown in
In another aspect of the invention, one or more mutations (shown in
In another aspect of the invention, one or more mutations (shown in
In some embodiments, the presence of mutations in one or more genes that encode components of the SWI/SNF complex that disrupt the function or expression of the corresponding protein products in a sample of tissue obtained from a pre-cancerous lesion of a subject indicates a risk of malignant progression or transformation of the lesion to cancer. The presence of mutations in such genes can be determined by any suitable method, such as, for example, Sanger sequencing of the tissue sample or next generation sequencing of the tissue sample, PCR-based methods including ARMS-based PCR, fluorescence in situ hybridization (FISH), or other suitable detection technique. In some embodiments, the one or more genes are ARID1B, ARID2, SMARCA2, SMARCC1, SMARCD1, SMARCD2, SMARCD3, SMARCE1, ACTL6A, ACTL6B, or SMARCB1.
In some embodiments, the absence of expression of one or more proteins that are components of the SWI/SNF complex in a sample of tissue obtained from a pre-cancerous lesion of a subject indicates a risk of malignant progression or transformation of the pre-cancerous lesion to cancer. The expression level of the proteins in the tissue sample may be determined in any suitable manner, including for example immunohistochemistry. In some embodiments, the proteins are BAF250b, BAF200, BRM, BAF155, BAF60a, BAF60b, BAF60c, BAF57, BAF53a, BAF53b, or BAF47.
In some embodiments, the presence of mutations in ARID1A, a gene encoding the protein BAF250a, that disrupt the function or expression of BAF250a in a sample of tissue obtained from an endometriotic lesion of a subject indicates a risk of malignant progression or transformation of the endometriotic lesion to cancers such as CCC, EC or uterine cancer. The presence of mutations in ARID1A in the tissue sample can be determined by any suitable method, such as, for example, Sanger sequencing of the tissue sample or next generation sequencing of the tissue sample, PCR-based methods including ARMS-based PCR, FISH, or other suitable detection technique.
In some embodiments, the mutations in ARID1A that indicate a risk of malignant progression or transformation of the endometriotic lesion to cancers such as CCC, EC or uterine cancer include the mutations set forth in SEQ ID NO.:2 through SEQ ID NO.:122 (shown in
In some embodiments, the absence of expression of BAF250a in a sample of tissue obtained from an endometriotic lesion of a subject indicates a risk of malignant progression or transformation of the endometriotic lesion to cancers such as CCC, EC or uterine cancer. The expression level of BAF250a in the tissue sample may be determined in any suitable manner, including for example immunohistochemistry.
In some embodiments, the mutations in BAF250a that indicate a risk of malignant progression or transformation of the endometriotic lesion to cancers such as CCC, EC or uterine cancer include the mutations set forth in
In some embodiments, the presence of mutations in SMARCA4, PBRM1, or SMARCC2 that disrupt the function or expression of BRG1, BAF180, or BAF170, respectively, in a sample of tissue obtained from an endometriotic lesion of a subject indicates a risk of malignant progression or transformation of the endometriotic lesion to cancers such as CCC, EC or uterine cancer. The presence of mutations in these genes in the tissue sample can be determined by any suitable method, such as, for example, Sanger sequencing of the tissue sample or next generation sequencing of the tissue sample, PCR-based methods including ARMS-based PCR, FISH, or other suitable detection technique.
In some embodiments, the absence of expression of BRG1, BAF180, or BAF170 in a sample of tissue obtained from an endometriotic lesion of a subject indicates a risk of malignant progression or transformation of the endometriotic lesion to cancers such as CCC, EC or uterine cancer. The expression level of BRG1, BAF180 or BAF170 in the tissue sample may be determined in any suitable manner, including for example immunohistochemistry.
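For convenience, the gene and protein names used in the embodiments above can be collected in a simple lookup table, for example to decide which protein to stain for once a mutated gene has been identified. The following is an illustrative sketch only. The pairings for ARID1A, SMARCA4, PBRM1 and SMARCC2 are stated above; the remaining pairings follow standard nomenclature and the parallel order of the gene and protein lists above. The names `SWI_SNF_GENE_TO_PROTEIN` and `proteins_for_genes` are hypothetical.

```python
# Gene symbol -> SWI/SNF protein product, using the names in this description.
SWI_SNF_GENE_TO_PROTEIN = {
    "ARID1A": "BAF250a",
    "ARID1B": "BAF250b",
    "ARID2": "BAF200",
    "SMARCA4": "BRG1",
    "SMARCA2": "BRM",
    "PBRM1": "BAF180",
    "SMARCC2": "BAF170",
    "SMARCC1": "BAF155",
    "SMARCD1": "BAF60a",
    "SMARCD2": "BAF60b",
    "SMARCD3": "BAF60c",
    "SMARCE1": "BAF57",
    "ACTL6A": "BAF53a",
    "ACTL6B": "BAF53b",
    "SMARCB1": "BAF47",  # standard symbol for the gene encoding BAF47
}


def proteins_for_genes(mutated_genes):
    """Map each recognised gene to its protein; other genes are skipped."""
    return {
        gene: SWI_SNF_GENE_TO_PROTEIN[gene]
        for gene in mutated_genes
        if gene in SWI_SNF_GENE_TO_PROTEIN
    }


# TP53 is not a SWI/SNF component, so it is omitted from the result.
print(proteins_for_genes(["ARID1A", "PBRM1", "TP53"]))
# {'ARID1A': 'BAF250a', 'PBRM1': 'BAF180'}
```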
In some embodiments, the presence of mutations in one or more genes that encode components of the SWI/SNF complex that disrupt the function or expression of the corresponding protein products in a sample of tissue obtained from a cancerous lesion of a subject indicates a poor prognosis for the subject. The presence of mutations in such genes can be determined by any suitable method, such as, for example, Sanger sequencing of the tissue sample or next generation sequencing of the tissue sample, PCR-based methods including ARMS-based PCR, or TaqMan™ assays, or hybridization-based methods including FISH, or any other suitable detection technique. In some embodiments, the one or more genes are ARID1B, ARID2, SMARCA2, SMARCC1, SMARCD1, SMARCD2, SMARCD3, SMARCE1, ACTL6A, ACTL6B, or SMARCB1.
In some embodiments, the absence of expression of one or more proteins that are components of the SWI/SNF complex in a sample of tissue obtained from a cancerous lesion of a subject indicates a poor prognosis for the subject. The expression level of the proteins in the tissue sample may be determined in any suitable manner, including for example immunohistochemistry. In some embodiments, the proteins are BAF250b, BAF200, BRM, BAF155, BAF60a, BAF60b, BAF60c, BAF57, BAF53a, BAF53b, or BAF47.
In some embodiments, the presence of mutations in ARID1A, a gene encoding the protein BAF250a, that disrupt the function or expression of BAF250a in a sample of tissue obtained from a CCC, EC or uterine cancer of a subject indicates a poor prognosis for the subject. The presence of mutations in ARID1A in the tissue sample can be determined by any suitable method, such as, for example, Sanger sequencing of the tissue sample or next generation sequencing of the tissue sample, PCR-based methods including ARMS-based PCR, or TaqMan™ assays, or hybridization-based methods including FISH, or any other suitable detection technique.
Those skilled in the art will recognize that a number of methods or techniques for identifying products such as ARMS-PCR products may be used in order to detect the presence of mutations in ARID1A or other genes encoding proteins that are components of the SWI/SNF complex. For example, embodiments include, but are not limited to, techniques such as primer extension, classical microarrays or line probes. Methods of PCR product endpoint detection including, but not limited to, fluorescence, chemiluminescence, colourimetric techniques or measurement of redox potential may also be used with the embodiments described herein for detecting gene mutations.
In some embodiments, the mutations in ARID1A that indicate a poor prognosis include the mutations in SEQ ID NO.:2 through SEQ ID NO.:122, set forth in
In some embodiments, the absence of expression of BAF250a in a sample of tissue obtained from a CCC, EC or uterine cancer of a subject indicates a poor prognosis. The expression level of BAF250a in the tissue sample may be determined in any suitable manner, including for example immunohistochemistry.
In some embodiments, the mutations in BAF250a that indicate a poor prognosis include the mutations set forth in
In some embodiments, the presence of mutations in SMARCA4, PBRM1, or SMARCC2 that disrupt the function or expression of BRG1, BAF180, or BAF170, respectively, in a sample of tissue obtained from a CCC, EC or uterine cancer of a subject indicates a poor prognosis. The presence of mutations in these genes in the tissue sample can be determined by any suitable method, such as, for example, Sanger sequencing of the tissue sample or next generation sequencing of the tissue sample, PCR-based methods including ARMS-based PCR, FISH, or other suitable detection technique.
In some embodiments, the absence of expression of BRG1, BAF180, or BAF170 in a sample of tissue obtained from a CCC, EC or uterine cancer of a subject indicates a poor prognosis. The expression level of BRG1, BAF180 or BAF170 in the tissue sample may be determined in any suitable manner, including for example immunohistochemistry.
In some embodiments, the presence of mutations in one or more genes that encode components of the SWI/SNF complex that disrupt the function or expression of the corresponding protein products in a sample of tissue obtained from a cancerous lesion of a subject indicates a low likelihood that treatment of the subject with standard chemotherapeutic agents such as platinum and taxane therapies will be successful. The presence of mutations in such genes can be determined by any suitable method, such as, for example, Sanger sequencing of the tissue sample or next generation sequencing of the tissue sample. In some embodiments, the one or more genes are ARID1B, ARID2, SMARCA2, SMARCC1, SMARCD1, SMARCD2, SMARCD3, SMARCE1, ACTL6A, ACTL6B, or SMARCB1.
In some embodiments, the absence of expression of one or more proteins that are components of the SWI/SNF complex in a sample of tissue obtained from a cancerous lesion of a subject indicates a low likelihood that treatment of the subject with standard chemotherapeutic agents such as platinum and taxane therapies will be successful. The expression level of the proteins in the tissue sample may be determined in any suitable manner, including for example immunohistochemistry. In some embodiments, the proteins are BAF250b, BAF200, BRM, BAF155, BAF60a, BAF60b, BAF60c, BAF57, BAF53a, BAF53b, or BAF47.
In some embodiments, the presence of mutations in ARID1A, a gene encoding the protein BAF250a, that disrupt the function or expression of BAF250a in a sample of tissue obtained from a CCC, EC, or uterine cancer of a subject indicates a low likelihood that treatment of the subject with standard chemotherapeutic agents such as platinum and taxane therapies will be successful. The presence of mutations in ARID1A in the tissue sample can be determined by any suitable method, such as, for example, Sanger sequencing of the tissue sample or next generation sequencing of the tissue sample, PCR-based methods including ARMS-based PCR, or TaqMan™ assays, or hybridization-based methods including FISH, or any other suitable detection technique.
In some embodiments, the mutations in ARID1A that indicate a low likelihood that treatment of the subject with standard chemotherapeutic agents such as platinum and taxane therapies is likely to be successful include the mutations in SEQ ID NO.:2 through SEQ ID NO.:122 set forth in
In some embodiments, the absence of expression of BAF250a in a sample of tissue obtained from a CCC, EC, or uterine cancer of a subject indicates a low likelihood that treatment of the subject with standard chemotherapeutic agents such as platinum and taxane therapies will be successful. The expression level of BAF250a in the tissue sample may be determined in any suitable manner, including for example immunohistochemistry.
In some embodiments, the mutations in BAF250a that indicate a low likelihood that treatment of the subject with standard chemotherapeutic agents such as platinum and taxane therapies is likely to be successful include the mutations set forth in
In some embodiments, the presence of mutations in SMARCA4, PBRM1, or SMARCC2 that disrupt the function or expression of BRG1, BAF180, or BAF170, respectively, in a sample of tissue obtained from a CCC, EC, or uterine cancer of a subject indicates a low likelihood that treatment of the subject with standard chemotherapeutic agents such as platinum and taxane therapies will be successful. The presence of mutations in these genes in the tissue sample can be determined by any suitable method, such as, for example, Sanger sequencing of the tissue sample or next generation sequencing of the tissue sample, PCR-based methods including ARMS-based PCR, or TaqMan™ assays, or hybridization-based methods including FISH, or any other suitable detection technique.
In some embodiments, the absence of expression of BRG1, BAF180, or BAF170 in a sample of tissue obtained from a CCC, EC, or uterine cancer of a subject indicates a low likelihood that treatment of the subject with standard chemotherapeutic agents such as platinum and taxane therapies will be successful. The expression level of BRG1, BAF180, or BAF170 in the tissue sample may be determined in any suitable manner, including for example immunohistochemistry.
In some embodiments, loss of expression or function of BAF250a is a biomarker for malignancy derived from endometrial epithelium. In some embodiments, ARID1A mutation or BAF250a loss is a targetable feature of a cancer. In some embodiments, the cancer is CCC, EC, or uterine cancer.
In some embodiments, mutations in one or more of the genes that encode proteins that are components of the SWI/SNF complex that disrupt the function of the corresponding protein in the SWI/SNF complex may be used in a screen to identify therapeutic targets for treatment of CCC, EC, and/or uterine carcinoma. In some embodiments, mutations in one or more proteins that are components of the SWI/SNF complex that disrupt the function of that protein in the SWI/SNF complex may be used in a screen to identify therapeutic targets for the treatment of CCC, EC, and/or uterine carcinoma.
The screen used to identify the therapeutic targets may be a synthetic lethal screen. Any suitable cell line that does not express one or more of the SWI/SNF component proteins, expresses one or more of the SWI/SNF component proteins at levels that are too low to maintain proper functioning of the SWI/SNF complex, or a mutant form of one or more of the SWI/SNF component proteins that does not allow proper functioning of the SWI/SNF complex to be maintained, may be used.
In some embodiments, the screen may be conducted using 867CL, 867CL-ARID1A-ΔL2007, and 867CL-ARID1A-WT cells. In some embodiments, the screen may be conducted using an isogenic knockout of ARID1A in HCT116 cells.
In some embodiments, the synthetic lethal screen may use the Hannon/Elledge lenti-shRNA human library. In some embodiments, the synthetic lethal screen may use the Dharmacon siGenome pool.
In some embodiments, at least one mutation used in the synthetic lethal screen is in the ARID1A gene. In some embodiments, the at least one mutation in the ARID1A gene is one of the mutations in SEQ ID NO.:2 through SEQ ID NO.:122. In some embodiments, the at least one mutation in the ARID1A gene is ARID1A-ΔL2007. In some embodiments, the at least one mutation in the ARID1A gene encodes a mutant form of the BAF250a protein. In some embodiments, the mutant form of the BAF250a protein is one of the mutations set forth in
In some embodiments, at least one mutation used in the synthetic lethal screen is in one of the SMARCA4, PBRM1, or SMARCC2 genes. In some embodiments, the at least one mutation in the SMARCA4, PBRM1, or SMARCC2 genes is one of the mutations set forth in
In some embodiments, at least one mutation used in the synthetic lethal screen is in one of the ARID1B, ARID2, SMARCA2, SMARCC1, SMARCD1, SMARCD2, SMARCD3, SMARCE1, ACTL6A, ACTL6B, or SMARCB1 genes. In some embodiments, at least one mutation is in one of the BAF250b, BAF200, BRM, BAF155, BAF60a, BAF60b, BAF60c, BAF57, BAF53a, BAF53b, or BAF47 proteins.
In some embodiments, therapeutic agents are developed to inhibit the activity of one or more targets identified by the synthetic lethal screen. In some embodiments, such therapeutic agents are used to treat cancers such as CCC, EC, or uterine cancer. In some embodiments, treatment involves administering a therapeutically effective amount of the therapeutic agent to the subject in need. Potential therapeutic agents that may be screened against the one or more targets include known drugs, small molecules, natural compounds, chemical libraries, and siRNA.
In some embodiments, reagents for assaying for the presence of a mutation in a gene encoding a protein that forms part of the SWI/SNF complex, including ARID1A, or for assaying for expression of a protein that forms part of the SWI/SNF complex, including BAF250a, may be provided in the form of a kit.
Embodiments of the invention are further illustrated with reference to the following examples, which are intended to be illustrative and not limiting.
Because CCC are genomically stable8,9, it was expected that they would have a constricted mutational landscape and recurrent mutations which would be evident from the analysis of a small number of cases.24
The inventors decoded the transcriptomes of 17 ovarian clear cell cancers using RNA-seq. Gene fusions and small interstitial deletions and insertions were detected by methods described in recent publications25,33,34 and SNVs were detected using SNVmix, a Bayesian mixture based algorithm recently published35. The vast majority of SNVs were expected to be rare germline variants as opposed to somatic mutations. Therefore the inventors used the same approach that resulted in identification of the FOXL2 mutation in granulosa cell tumours1 to identify genes recurrently mutated in CCCs, but not in unrelated cancer types. The inventors identified mutations in the ARID1A gene in six of seventeen CCCs: three cases had nonsense mutations, a fourth case had a 6018-6020delGCT (2007ΔL) 3 base pair deletion mutation, a fifth case had both a somatic missense mutation (T5953C(S1985P)) and a single nucleotide insertion in exon 20 (5541insG), and a sixth case had a genomic deletion spanning intron one resulting in loss of the region 3′ to exon 1 in ARID1A and fusion to the neighbouring gene (ZDHHC18); this was validated by fluorescent in situ hybridization (FISH) (
All ARID1A point mutations were validated by Sanger sequencing, and in all cases where germline DNA was available, mutations were determined to be somatic. Loss of heterozygosity (LOH) was detected in CCC01 which had the 6018-6020delGCT mutation.
The ARID1A gene was analysed in an additional case of CCC arising in an endometriotic cyst (CCC23) using Sanger sequencing, as this case was not included in the RNA-seq experiments. This resulted in identification of a truncating mutation (G6139T (E2047*)). This case also exhibited LOH through loss of one copy of chromosome 1. Thus, somatic mutations in the ARID1A gene were found in seven of eighteen clear cell cancers studied. By comparison, no ARID1A mutations were seen in the transcriptomes of 50 triple negative breast cancers, 6 endometrioid, or 6 high grade serous cancers (p=0.00003). A truncating ARID1A mutation was found in one of the two mucinous carcinomas of the ovary studied.
With reference to
The foregoing results provide strong genetic evidence that ARID1A, a gene implicated as a tumour suppressor through functional studies, is frequently disrupted in CCCs.
The inventors have detected SNVs in other SWI/SNF genes including a missense mutation in SMARCA4 (encodes for BRG1) and a missense mutation in PBRM1 (encodes BAF180) in ARID1A-mutation-negative CCCs (
With reference to
In addition to variants in ARID1A, CTNNB1 (C110G (S37C), NM_001904.3, SEQ ID NO.:125) somatic mutations were detected in CCC02 and CCC03 and validated by PCR amplification and Sanger sequencing in both tumor and germline DNA from these cases. Additionally, two variants were predicted based on RNA sequencing data in the TOV21G cell line in PIK3CA (C3139T (H1047Y), NM_006218.2, SEQ ID NO.:123) and KRAS (G37T (G13C), NM_004985.3, SEQ ID NO.:124), which were validated by PCR amplification and Sanger sequencing. Though variants in BRAF were observed in the RNA sequencing data, none of these passed validation by Sanger sequencing in tumor DNA.
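The relationship between the nucleotide positions and the amino acid positions quoted for these variants can be checked with simple codon arithmetic. The following is a minimal sketch, assuming that the nucleotide positions are numbered from the first base of the start codon of the coding sequence (an assumption; the numbering convention is not stated here). The function name codon_number is hypothetical and used only for illustration.

```python
def codon_number(cds_position: int) -> int:
    """Return the 1-based codon index that contains a 1-based coding-sequence position."""
    if cds_position < 1:
        raise ValueError("coding-sequence positions are 1-based")
    return (cds_position - 1) // 3 + 1


# Variants quoted in the text: (gene, nucleotide change, position, reported codon).
REPORTED_VARIANTS = [
    ("CTNNB1", "C110G", 110, 37),    # S37C
    ("PIK3CA", "C3139T", 3139, 1047),  # H1047Y
    ("KRAS", "G37T", 37, 13),        # G13C
]

# Under the stated numbering assumption, each position falls in the reported codon.
for gene, change, position, codon in REPORTED_VARIANTS:
    assert codon_number(position) == codon, (gene, change)
```

Under the same assumption, the ARID1A substitution T5953C falls in codon 1985 and G6139T falls in codon 2047, consistent with the S1985P and E2047* annotations used elsewhere in this description.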
To demonstrate that ARID1A mutations are associated with loss of expression, the inventors used a mouse monoclonal antibody (Abgent, Inc.) targeting the central region of the BAF250a protein. The antibody stained all normal nuclei strongly. Of the 18 clear cell cancer samples analysed by RNA-seq in Example 1.0, eight showed loss of BAF250a expression. Of these eight cases, five had ARID1A mutations (
To demonstrate that loss of BAF250a is a subtype-specific finding in ovarian cancer, the inventors stained 300 tumours from their ovarian tumour bank. All non tumour nuclei were strongly positive for BAF250a whereas 11 of 27 CCC cases (40%) showed complete loss of BAF250a in all tumour cells. By comparison, 17 of 180 (10%) high grade serous cancers (p<0.0001) showed BAF250a loss.
To determine whether ARID1A mutations and loss of BAF250a expression are early events in ovarian carcinogenesis, the inventors studied tumour and adjoining endometriosis from case CCC23, which has a truncating mutation in exon 20 and LOH accompanied by complete loss of BAF250a expression (
As part of the inventors' tumour banking procedures, they have developed a xenograft sub-renal capsule technique to generate ovarian cancer models in NOD/SCID mice with a greater than 90% rate of successful engraftment to date36. Transplantable xenografts have been established from five clear cell cancers including case VOA867 (CCC14) which has a truncating mutation (C1680A/G, Y560X) accompanied by complete absence of BAF250a protein (
The inventors have also effectively knocked down expression of BAF250a through expression of ARID1A-shRNAmir-GFP in HCT116 cells (
Based on the results from sequencing the whole transcriptomes of 18 CCCs and a CCC cell line discussed above in Example 1.0, the inventors sequenced ARID1A in an additional 210 ovarian carcinomas and a second ovarian CCC cell line. In 2 CCCs, the inventors sequenced DNA from microdissected contiguous atypical endometriotic epithelium to determine whether ARID1A mutations were present. The inventors measured BAF250a expression by means of immunohistochemical analysis in an additional 455 ovarian carcinomas.
Eighteen ovarian CCC from the OvCaRe (Ovarian Cancer Research) frozen tumor bank and one CCC cell line (TOV21G) were selected for whole-transcriptome paired-end RNA sequencing. Patients provided written informed consent for research using these tumor samples before undergoing surgery, including acknowledgement that a loss of confidentiality could occur through the use of samples for research. Separate approval from the hospital's institutional review board was obtained to permit the use of these samples for RNA-sequencing experiments.
To evaluate the frequency of ARID1A mutations in CCC and other ovarian cancer subtypes, the inventors used Illumina based targeted exon resequencing to interrogate the DNA sequence of a mutation validation cohort of 101 CCC (in addition to the 19 cases for RNA seq, described above (the “discovery cohort”)), 33 EC, 76 HGS carcinomas and the CCC derived cell line ES2. 10 CCC came from Johns Hopkins University (JHU), 29 from the Université de Montreal (UdeM) and 42 from the Australian Ovarian Cancer Study (AOCS); all other cancers were obtained from the OvCaRe frozen tumor bank. For 70 cases with predicted mutations germline DNA was available. All patients had consented to have their tumors and germline DNA used for research including genomic studies. From the cohort of 119 CCCs (both discovery cohort and mutation validation cohort) and 33 ECs (mutation validation cohort), 86 CCCs and all 33 ECs were examined to determine if endometriosis was present at the time of surgery. These results are shown in
DNA and RNA were extracted using standard methodologies. In cases for which insufficient DNA was available for ARID1A resequencing, whole genome amplification (WGA) was used to extend the DNA template; however, all mutations were confirmed using non-WGA treated DNA.
All tumor samples were independently reviewed by a gynecologic pathologist before mutational analysis. In cases in which the review diagnosis differed from the source diagnosis, the samples were further reviewed by another gynecologic pathologist, who acted as an arbiter. Both review pathologists were unaware of the results of genomic studies.
Whole transcriptome sequencing was performed as previously described1,35. Double stranded cDNA was synthesized from polyadenylated RNA, and the resulting cDNA was sheared. The 190-210 bp DNA fraction was isolated and PCR amplified to generate the sequencing library, as per the Illumina Genome Analyzer paired end library protocol (Illumina Inc., Hayward, Calif.). The resulting libraries were sequenced on an Illumina GAii. Short read sequences obtained from the Illumina GAii were mapped to the reference human genome (NCBI build 36.1, hg18) plus a database of known exon junctions 2 using MAQ 3 in paired end mode.
Single nucleotide variants were predicted using a Bayesian mixture model, SNVmix1,35. Only bases with >Q20 base quality were considered to minimize errors. SNVs were cross-referenced against dbSNP version 129 and published genomes in order to eliminate any previously described germline variants1.
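The two filters described above (a base quality threshold and removal of previously described germline variants) can be sketched as follows. This is an illustrative sketch only and is not the SNVmix implementation; the function names, the tuple representation of a variant, and the example data are hypothetical.

```python
from collections import Counter


def count_alleles(bases, qualities, min_quality=20):
    """Count observed bases at one position, keeping only bases with quality above min_quality."""
    return Counter(b for b, q in zip(bases, qualities) if q > min_quality)


def drop_known_germline(candidates, known_variants):
    """Remove candidate variants already present in a set of known germline variants.

    Each variant is represented as a (chromosome, position, alternate base) tuple;
    known_variants stands in for dbSNP and published genomes.
    """
    return [v for v in candidates if v not in known_variants]


# Hypothetical example: four bases observed at one position, two of them below threshold.
counts = count_alleles("AATA", [30, 10, 25, 20])
novel = drop_known_germline(
    [("chr1", 100, "T"), ("chr1", 200, "G")],
    {("chr1", 200, "G")},
)
```

In this sketch the allele counts that survive the quality filter would be the input to a variant caller, and the germline cross-reference is applied to the resulting candidate list.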
Gene fusions were predicted using deFuse. deFuse predicts gene fusions by searching paired end RNA-sequencing data for reads that harbor fusion boundaries. Spanning reads harbor a fusion boundary in the unsequenced region in the middle of the read, whereas split reads harbor a fusion boundary in the sequence of one end. deFuse searches for spanning reads with read ends that align to different genes. Approximate fusion boundaries implied by spanning reads are then resolved to nucleotide level using dynamic programming based alignment of candidate split reads.
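The distinction between spanning reads and split reads drawn above can be illustrated with a toy classifier. This is a simplified sketch under stated assumptions and does not reproduce the deFuse algorithm, which additionally clusters spanning reads and resolves boundaries by dynamic programming based alignment. Here each read end is assumed to be summarised as the list of gene names to which its aligned segments map; the function name and this representation are hypothetical.

```python
def classify_read_pair(end1_genes, end2_genes):
    """Classify a paired-end read from the genes its aligned segments map to.

    "split":      one end aligns to two different genes, so the fusion boundary
                  lies within the sequence of that end.
    "spanning":   each end aligns to a single gene but the two ends differ, so the
                  boundary lies in the unsequenced middle of the fragment.
    "concordant": both ends align to the same single gene.
    """
    for genes in (end1_genes, end2_genes):
        if len(set(genes)) > 1:
            return "split"
    all_genes = set(end1_genes) | set(end2_genes)
    return "spanning" if len(all_genes) > 1 else "concordant"


# Hypothetical read pairs supporting a fusion between ARID1A and ZDHHC18.
examples = [
    classify_read_pair(["ARID1A"], ["ZDHHC18"]),
    classify_read_pair(["ARID1A", "ZDHHC18"], ["ZDHHC18"]),
    classify_read_pair(["ARID1A"], ["ARID1A"]),
]
```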
The Affymetrix SNP 6.0 arrays were normalized using CRMAv237 using the default settings for performing allelic-crosstalk calibration, probe sequence effects normalization, probe-level summarization, and PCR fragment length normalization. Log ratios were then computed by normalizing against a reference generated using a normal dataset of 270 HapMap samples obtained from Affymetrix. Segmentation was performed using an 11-state hidden Markov model. This approach simultaneously detects and discriminates somatic and germline DNA copy number changes in cancer genomes. The hidden Markov model performs segmentation of the log ratio intensity data and predicts discrete copy number status for each resulting segment from the set of five somatic states (homozygous deletion, hemizygous deletion, gain, amplification, and high-level amplification), five analogous germline states, and neutral copy number. The boundaries of the segments provide candidate breakpoints in the genome as a result of copy number alteration events.
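The state space of the hidden Markov model and the log ratio computation described above can be sketched as follows. This is an illustrative sketch only: the use of a base-2 logarithm, the per-probe reference values, and all names are assumptions, and the model fitting itself is not shown.

```python
import math
from itertools import product

# Five alteration types, each modelled once as somatic and once as germline,
# plus a single neutral state, giving the 11 states described in the text.
ALTERATIONS = [
    "homozygous deletion",
    "hemizygous deletion",
    "gain",
    "amplification",
    "high-level amplification",
]
STATES = [f"{origin} {alteration}"
          for origin, alteration in product(("somatic", "germline"), ALTERATIONS)]
STATES.append("neutral")


def log2_ratios(sample_signal, reference_signal):
    """Return per-probe log2(sample / reference); both inputs must be positive."""
    return [math.log2(s / r) for s, r in zip(sample_signal, reference_signal)]
```

For example, a probe with twice the reference signal yields a log ratio of 1, and a probe with half the reference signal yields a log ratio of -1.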
In all cases with Affymetrix SNP 6.0 data, only CCC04 contained a breakpoint in ARID1A. The segment (chr1:26898389-27000523) is a homozygous deletion that breaks the gene near the 5′ end and truncates it. The published CNV map from 450 HapMap individuals38 was studied to see whether any regions overlapping ARID1A were reported and none were found. Based on this, it is predicted that this is a somatic change.
Genomic DNA for the cases described under Patients and Samples above was subjected to Illumina based targeted exon resequencing. Briefly, all ARID1A exons were PCR amplified and individual amplicons were indexed, pooled, and sequenced. Individual indexes enabled the deconvolution of reads deriving from individual samples concurrently sequenced from the same library. Validation by Sanger sequencing was performed for all potential truncating or missense mutations with a Grantham index for amino acid change of greater than X, present above a 10% mutant allele frequency cut-off. Insufficient usable data was obtained from exon 1; this was sequenced by Sanger sequencing in all cases using four overlapping amplicons.
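The deconvolution of pooled reads by sample index mentioned above can be sketched as a simple lookup. This is a minimal illustration, assuming that each read is already paired with its index sequence; the index sequences, sample labels, and function name are hypothetical, and no tolerance for sequencing errors in the index is included.

```python
from collections import defaultdict


def demultiplex(indexed_reads, index_to_sample):
    """Group reads by sample using the index sequence attached to each read.

    indexed_reads: iterable of (index_sequence, read_sequence) pairs.
    index_to_sample: mapping from index sequence to sample name.
    Reads whose index is not recognised are discarded.
    """
    by_sample = defaultdict(list)
    for index, read in indexed_reads:
        sample = index_to_sample.get(index)
        if sample is not None:
            by_sample[sample].append(read)
    return dict(by_sample)


# Hypothetical pooled library containing two indexed samples and one unrecognised index.
pooled = [("ACGT", "AAA"), ("TTGA", "CCC"), ("ACGT", "GGG"), ("NNNN", "TTT")]
per_sample = demultiplex(pooled, {"ACGT": "sample_A", "TTGA": "sample_B"})
```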
Automated primer design was performed using Primer339 and custom scripting. Primers were designed to span annotated exons of ARID1A (UCSC build hg18) with an average PCR product size of 2067 bp. Primers were synthesized by Integrated DNA Technologies at a 25 nmol scale with standard desalting (IDT Coralville, Iowa) and tested in PCR using control human genomic DNA. Primer pairs that failed to generate a product of the expected size were redesigned. The sequences for the primers are provided in
Sequence reads from the ARID1A targeted exon resequencing experiment were aligned to the genomic regions targeted by the PCR primers using MAQ version 0.7.1. Each exon was assessed for coverage by enumerating all uniquely aligning reads to the targeted space. SNVs were determined by computing the allelic counts for each genomic position within the complete targeted space. All positions exhibiting an allelic ratio of at least 10% variant were considered for validation by Sanger sequencing. Insertions and deletions were predicted using the Maq indelpe program using 10% allelic ratio criteria for selection for experimental follow up. In addition, to determine a confidence measure for each SNV prediction, the inventors applied a one-tailed Binomial exact test to each position covered as described in Shah et al.1 using all aligned reads to compute the expected distribution. Benjamini-Hochberg40 correction for multiple comparisons was applied to the resultant Binomial-test p-values to yield q-values for each position.
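The three computations described above (the 10% allelic ratio cut-off, the one-tailed Binomial exact test, and the Benjamini-Hochberg correction) can be sketched as follows. This is a minimal sketch under stated assumptions and is not the code used by the inventors: the background error rate passed to the Binomial test is a hypothetical parameter standing in for the expected distribution estimated from all aligned reads, and all function names are illustrative.

```python
from math import exp, lgamma, log


def passes_allelic_ratio(variant_reads, total_reads, threshold=0.10):
    """True if the variant allele fraction at a position is at least the threshold."""
    return total_reads > 0 and variant_reads / total_reads >= threshold


def binomial_upper_tail(k, n, p):
    """One-tailed exact test: P(X >= k) for X ~ Binomial(n, p), with 0 < p < 1.

    Terms are evaluated in log space so that deep coverage does not overflow.
    """
    def log_pmf(i):
        return (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                + i * log(p) + (n - i) * log(1 - p))
    return min(1.0, sum(exp(log_pmf(i)) for i in range(k, n + 1)))


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg q-values in the same order as the input p-values."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues
```

In use, each covered position would contribute one p-value from binomial_upper_tail(variant_reads, total_reads, background_rate), and the full list of p-values would be passed to benjamini_hochberg to obtain a q-value per position.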
The Illumina based targeted exon sequencing of ARID1A did not provide coverage of exon 1. To obtain sequence information for exon 1, four overlapping PCR primer sets were designed, with priming sites for M13 forward and M13 reverse added to their 5′ ends to allow direct Sanger sequencing of amplicons. For the PCR, after denaturation at 94° C. for 1 min, DNA was amplified over 35 cycles (94° C. 30 sec, 58-60° C. 30 sec, 72° C. 30 sec) using an MJ Research Tetrad (Ramsey, Minn.). Final extension was at 72° C. for 5 min. PCR products were purified using ExoSAP-IT® (USB® Products Affymetrix, Inc., Cleveland, Ohio) and sequenced using an ABI BigDye terminator v3.1 cycle sequencing kit (Applied Biosystems, Foster City, Calif.) and an ABI Prism 3130xl Genetic Analyzer (Applied Biosystems, Foster City, Calif.). All capillary traces were either visually inspected to confirm the presence of each variant in tumor traces and its absence from germline traces, or analyzed using Mutation Surveyor.
Based on the exon resequencing data, any truncating or radical missense mutations (those resulting in a change to the charge or polarity of the amino acid) that occurred at an allele frequency of greater than 10% were further validated in tumor DNA, and in most cases germline DNA, using Sanger sequencing. Regions of ARID1A containing putative mutations were PCR amplified from genomic DNA using primers with priming sites for M13 forward and M13 reverse added to their 5′ ends to allow direct Sanger sequencing of amplicons. In cases where the matched germline DNA of the patient was from FFPE material, short (<250 nt) amplicons were designed to validate the SNVs.
Unless otherwise stated, amplicons were produced from genomic DNA from both the tumor and matched germline DNA from the same patient. For the PCR, after denaturation at 94° C. for 1 min, DNA was amplified over 35 cycles (94° C. 30 sec, 60-65° C. 30 sec, 72° C. 30 sec) using an MJ Research (Ramsey, Minn.) Tetrad. Final extension was at 72° C. for 5 min. PCR products were purified using a MinElute PCR purification kit (QIAGEN, Valencia, Calif.) and sequenced using an ABI BigDye terminator v3.1 cycle sequencing kit (Applied Biosystems, Foster City, Calif.) and an ABI Prism 3130xl Genetic Analyzer (Applied Biosystems, Foster City, Calif.). All capillary traces were either visually inspected to confirm the presence of each variant in tumor traces and its absence from germline traces, or analyzed using Mutation Surveyor. Results from this analysis along with immunohistochemistry are summarized in
Immunohistochemical (IHC) staining for BAF250a was performed in all cases with the exception of the 42 CCC from the AOCS and 4 samples from JHU. Additional IHC staining for hepatocyte nuclear factor (HNF)-1β, and estrogen receptor (ER) was performed on whole sections for two cases with associated atypical endometriosis as previously described14. ER is typically positive in endometriosis and negative in CCC, while HNF-1β is typically negative in endometriosis and positive in CCC14.
Immunohistochemical analysis was performed on 4 μm thick paraffin sections on the semi-automated Ventana Discovery® XT instrument (Ventana Medical Systems, Tucson, Ariz.). ARID1A and HNF-1β were stained using the Ventana ChromoMap™ DAB kit. Antigen retrieval was standard CC1 with a two hour primary incubation. ARID1A mouse clone 3H2 (Abgent, San Diego, Calif.) was applied at 1:25 followed by a 16 minute secondary incubation of pre-diluted UltraMap™ Mouse HRP (Ventana). HNF-1β goat polyclonal (Santa Cruz Biotechnology, Santa Cruz, Calif.) was applied at 1:200 dilution followed by a 32 minute incubation of unconjugated rabbit anti-goat secondary at 1:500 (Jackson ImmunoResearch Labs Inc., West Grove, Pa.). Afterwards the tertiary antibody was incubated for 16 minutes with the pre-diluted Ventana UltraMap™ Rabbit HRP. ER immunostaining was done using the Ventana DABMap™ kit with standard CC1. The rabbit clone SP1 (Thermo Scientific, Fremont, Calif.) was incubated at 1:25 for 60 minutes with heat followed by a 32 minute secondary incubation with the pre-diluted Ventana Universal Secondary. Histologic images were obtained with the use of a ScanScope XT digital scanning system (Aperio Technologies Inc., Vista, Calif.).
A total of 455 additional ovarian-carcinoma samples—including 132 ovarian clear-cell carcinomas, 125 endometrioid carcinomas, and 198 high-grade serous carcinomas—from a previously described tissue microarray4 were used for an immunohistochemical validation cohort and were analyzed for BAF250a expression. All normal gynecologic tissues showed moderate or intense nuclear immunoreactivity for BAF250a. Tumors were scored positive for BAF250a if tumor cells showed definite nuclear staining and negative if tumor nuclei had no immunoreactivity but endothelial and other nontumor cells from the same samples showed immunoreactivity. Cases in which neither normal cells in the stroma nor tumor cells were immunoreactive were considered to be the result of technical failure. Additional immunohistochemical staining for hepatocyte nuclear factor 1β (HNF-1β) and estrogen receptor was performed on whole sections for two tumors with contiguous atypical endometriosis, as previously described.14
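The scoring rule described above can be written as a small decision function. This is an illustrative sketch; the function name and the boolean representation of staining are hypothetical.

```python
def score_baf250a(tumor_nuclei_stained: bool, nontumor_nuclei_stained: bool) -> str:
    """Apply the BAF250a scoring rule to one sample.

    positive:          tumor cells show definite nuclear staining.
    negative:          tumor nuclei are unstained but endothelial or other
                       nontumor cells in the same sample are stained.
    technical failure: neither tumor cells nor normal cells are stained.
    """
    if tumor_nuclei_stained:
        return "positive"
    if nontumor_nuclei_stained:
        return "negative"
    return "technical failure"
```

The nontumor cells act as an internal positive control, which is why a sample with no staining at all is treated as uninterpretable rather than as BAF250a loss.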
In two cases with identified ARID1A mutations, atypical (adjacent) and distant endometriosis sections were identified by a gynecological pathologist. Laser capture microdissection was used to isolate endometriotic epithelium. DNA extracted from these cells was analyzed by sequencing for the mutations seen in each case. For microdissection, formalin-fixed paraffin embedded (FFPE) sections (5 μm) were cut on a Tissue-Tek® Cryo3® cryostat (Sakura Finetek, Dublin, Ohio) onto clean uncharged slides. FFPE sections were deparaffinized and rehydrated, stained with Arcturus® HistoGene® Staining Solution (Molecular Devices, Inc., Sunnyvale, Calif.), then dehydrated in alcohol and xylene. All reagents were prepared with nuclease-free water and all steps were performed using nuclease-free techniques.
Atypical or distant endometriotic cells were microdissected from prepared FFPE sections using the Veritas™ Laser Capture Microdissection System (Arcturus Bioscience, Inc., Mountain View, Calif.) according to the manufacturer's standard protocols. LCM caps with captured cells were placed directly in 15 μL of lysis buffer with 10 μL of Proteinase K, and DNA was isolated using the QIAamp® DNA Micro kit (QIAGEN, Hilden, Germany). DNA was subsequently quantified on a NanoDrop spectrophotometer (NanoDrop Technologies, Wilmington, Del.). PCR was performed, followed by gel extraction of PCR products using the QIAquick Gel Extraction Kit (QIAGEN). PCR products were cloned using the Topo® TA Cloning® Kit following the manufacturer's instructions (Invitrogen Corp., Carlsbad, Calif.). Inserts from individual clones were PCR amplified and Sanger sequenced to determine mutation frequency.
Tissue samples from CCC13 and CCC23 were assayed for deletion of ARID1A using fluorescent in-situ hybridization (FISH). Six micrometer-thick sections were pre-treated as described previously.42 Three-color FISH assays were performed using BACs specific to the regions flanking ARID1A (RP11-35M8 (chr1:26,609,021-26,767,926) and RP11-285H13 (chr1:27,033,759-27,216,771)) and fosmids specific to the ARID1A locus (G248P86703G10 (chr1:26,976,949-27,017,636), G248P89619A2 (chr1:26,954,143-26,991,761), and G248P88415D8 (chr1:26,914,023-26,954,284)). BAC and fosmid probes were obtained from British Columbia Genome Sciences Centre, and were directly labeled with Spectrum Red, Spectrum Blue, or Spectrum Green using a Nick Translation Kit (Abbott Molecular Laboratories, Abbott Park, Ill.). Analysis was done on a Zeiss Axioplan epifluorescent microscope. Images were captured using Metasystems Isis FISH imaging software (MetaSystems Group, Inc., Belmont, Mass.). Loss of heterozygosity was confirmed in CCC23 and the results were inconclusive for CCC13.
For gene expression analysis, the RNA-sequencing reads initially were mapped to the genome (NCBI36/hg18) using MAQ (0.7.1). The inventors used the Sequence Alignment/Map (SAMtools 0.1.7) for downstream processing. Up to five mismatches were allowed. Raw expression values (read counts) were obtained by summing the number of reads that mapped to human genes based on the Ensembl database (Release 51). The initial gene expression values were normalized using a quantile normalization procedure using the aroma.light (1.16.0) package in R (2.11.1).
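The quantile normalization step described above can be illustrated with a short sketch. This is a generic textbook form of quantile normalization written in Python for illustration; it is not the aroma.light implementation, whose handling of ties and missing values may differ, and the function name and example counts are hypothetical.

```python
def quantile_normalize(samples):
    """Quantile-normalize read counts so that every sample shares one distribution.

    samples: list of equal-length lists, one list of per-gene values per sample.
    Each value is replaced by the mean, across samples, of the values holding the
    same rank. Ties within a sample are broken by gene order (a simplification).
    """
    n_genes = len(samples[0])
    if any(len(s) != n_genes for s in samples):
        raise ValueError("all samples must cover the same genes")
    sorted_samples = [sorted(s) for s in samples]
    rank_means = [sum(s[r] for s in sorted_samples) / len(samples)
                  for r in range(n_genes)]
    normalized = []
    for s in samples:
        order = sorted(range(n_genes), key=lambda i: s[i])
        out = [0.0] * n_genes
        for rank, gene_index in enumerate(order):
            out[gene_index] = rank_means[rank]
        normalized.append(out)
    return normalized


# Hypothetical read counts for three genes in two samples.
normalized_counts = quantile_normalize([[5, 2, 3], [4, 1, 6]])
```

After normalization, both samples contain the same set of values, assigned to genes according to each gene's rank within its own sample.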
Of the 19 RNAseq samples, 3 had somatic truncating mutations (C4201T (Q1401*), C5164T (R1722*), and C1680A (Y560*), where asterisks denote a stop codon), 2 had somatic indels (insertion-deletion: 6018-6020delGCT and 5541insG), 1 had a somatic missense mutation (T5953C (S1985P), found in the same sample as the 5541insG mutation), and 1 had a gene rearrangement involving ARID1A and the neighbouring gene ZDHHC18 encoding the zinc-finger DHHC domain-containing protein 18 (
Since mutations in PIK3CA (the phosphoinositide-3-kinase, catalytic, alpha polypeptide gene), CTNNB1 (the catenin beta-1 gene), KRAS (the v-Ki-ras2 Kirsten rat sarcoma viral oncogene homologue gene), and TP53 (the tumor protein p53 gene) are recurrent in ovarian clear-cell carcinoma,15 the inventors also analyzed the RNA-sequencing data and performed a polymerase-chain-reaction assay for the presence of variants in these genes (
ARID1A mutation frequency in CCC and other ovarian cancer subtypes was established through Illumina-based targeted exon resequencing of a larger cohort of samples. The total frequency of CCC with significant ARID1A mutations is 55/119, or 46%. Only two were somatic missense mutations; the remainder were truncating mutations that were evenly distributed across the coding sequence (
The inventors analyzed germ-line DNA from 55 samples (47 ovarian clear-cell carcinomas and 8 endometrioid carcinomas) in the discovery and mutation-validation cohorts for the presence of 65 truncating mutations (53 found in ovarian clear-cell carcinomas and 12 found in endometrioid carcinomas). In all 55, the mutations were found to be somatic. On this basis, the inventors made the assumption that 12 subsequent truncating mutations (10 in ovarian clear-cell carcinoma and 2 in endometrioid carcinoma) would be somatic (i.e., predicted to be somatic without germ-line DNA testing) (
The presence of ARID1A mutation shows a strong association (Fisher Exact p<0.0001) with endometriosis associated ovarian cancer subtypes (CCC or EC) (
ARID1A was further evaluated by IHC staining for BAF250a in 73 CCC, 33 EC and 76 HGS cancers for which formalin-fixed, paraffin-embedded sections were available in the discovery cohort and the mutation-validation cohort. These results are summarized in
In another analysis, the correlation between ARID1A mutations and BAF250a expression was evaluated by means of immunohistochemical staining for BAF250a in 182 tumors for which formalin-fixed, paraffin embedded sections were available in the discovery cohort and the mutation-validation cohort described above: 73 ovarian clear-cell carcinomas, 33 endometrioid carcinomas, and 76 high-grade serous carcinomas. The presence of mutations was significantly associated with BAF250a loss in endometriosis-associated cancers (P<0.001 by Fisher's exact test). A total of 27 of 37 samples (73%) and 5 of 10 samples (50%) of ovarian clear-cell carcinoma and endometrioid carcinoma, respectively, with an ARID1A mutation showed a loss of BAF250a expression, as compared with 4 of 36 samples (11%) and 2 of 23 samples (9%), respectively, without an ARID1A mutation (
The immunohistochemical validation cohort was also assessed for BAF250a expression (
With reference to
Two patients with ovarian clear-cell carcinomas (samples CCC13 and CCC23) carrying ARID1A mutations had contiguous atypical endometriosis.
Case CCC23 had an ARID1A truncating mutation (G6139T (E2047*)) in exon 20 and had BAF250a loss in both cancer and contiguous atypical endometriotic epithelium (
With reference to
The second case, CCC13, data shown in
With reference to
Sanger sequencing was carried out on CCC13. The two somatic mutations (5541insG and T5953C(S1985P)) were sequenced from a single PCR fragment. PCR products were cloned and then resequenced. In total, sequences from 45 clones were analyzed. The inventors found 15/45 (33%) wildtype sequence, 9/45 (20%) sequences with the T5953C (S1985P) mutation, 9/45 (20%) sequences with the 5541insG mutation, and 12/45 (27%) sequences with both mutations in a single Sanger sequence trace. This reveals the complex relationship between the mutations which occur both in trans (on independent alleles) and also in cis (on the same allele) (see
Mutations including truncating and somatic missense mutations, and one ARID1A rearrangement, were seen in 56/119 (47%) CCCs and 10/33 (30%) ECs (66/152 or 43% in total), but in only 1/76 (1%) high-grade serous ovarian carcinomas. All truncating mutations for which germline DNA was available were somatic and fifteen cases had two somatic mutations. Loss of BAF250a protein correlated strongly with truncating mutations. In two CCCs the ARID1A mutations and loss of BAF250a expression were evident in the tumor and contiguous atypical endometriosis, but not in distant endometriotic lesions or normal tissue.
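The association between ARID1A mutation and BAF250a loss can be examined with Fisher's exact test on a 2x2 table. The following is a minimal sketch using the immunohistochemistry counts reported above for the discovery and mutation-validation cohorts (27 of 37 CCC and 5 of 10 EC with a mutation showing loss, versus 4 of 36 CCC and 2 of 23 EC without a mutation). Pooling the two endometriosis-associated subtypes into one table is an assumption about how the test was set up, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    return sum(p for p in (prob(x) for x in range(lowest, highest + 1))
               if p <= p_observed * (1 + 1e-9))


# Rows: mutation present / absent. Columns: BAF250a lost / retained.
loss_with_mutation = 27 + 5                   # CCC + EC
retained_with_mutation = (37 - 27) + (10 - 5)
loss_without_mutation = 4 + 2
retained_without_mutation = (36 - 4) + (23 - 2)

p_value = fisher_exact_two_sided(
    loss_with_mutation, retained_with_mutation,
    loss_without_mutation, retained_without_mutation,
)
```

With these pooled counts the resulting p-value is below 0.001, in line with the significance level reported for the association.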
Results for the 50 genes with the greatest differential expression with respect to cells having an ARID1A mutation are shown in
Overall, 46% of CCC and 30% of EC had somatic truncating or missense mutations in ARID1A as opposed to none in 76 specimens of HGS carcinoma analyzed. Loss of ARID1A expression was also subtype specific with loss of nuclear BAF250a seen in 39% of CCC and EC but only 1% of HGS carcinomas.
There are a number of lines of evidence supporting a significant biological role for somatic ARID1A mutations. Firstly, the mutations identified are almost exclusively truncating mutations, expected to encode non-functional protein. They are present at a high frequency in endometriosis associated ovarian carcinomas but not HGS carcinoma, two distinct tumor types, strongly suggesting that they are highly relevant in the former, and not random events. By comparing clear cell carcinomas to their adjacent atypical endometriotic lesions, the inventors have demonstrated that the same mutations are present in the putative precursor lesions as the tumors. In contrast, the distant endometriotic lesions are mutation negative.
In the case shown in
Four additional mutations were identified when the RNAseq cases were analyzed by amplicon exon resequencing; these mutations were likely not seen in RNAseq data due to transcripts being rapidly targeted for nonsense mediated decay (NMD)43, indicating that RNAseq, although a useful discovery tool, has imperfect sensitivity for detecting nonsense and other truncating mutations.
In CCC and EC, loss of expression was seen in 67% of mutation positive cases and only 16% of mutation negative cases. It is possible that the mutation negative CCC and EC with loss of BAF250a expression may have lost ARID1A expression through other mechanisms such as chromosomal rearrangements, epigenetic silencing, expression of transcriptional repressors or post-translational mechanisms. The presence of BAF250a immunoreactivity in a minority of cases with protein truncating mutations may indicate that haploinsufficiency (which is embryonic lethal in mice) is pathogenic. Alternatively it may be due to second hit events that do not impact protein expression levels, a dominant negative function of some mutations, or detection of truncated but dysfunctional protein in the IHC assay. The latter is possible in some cases as the antibody used targets the middle of the protein (between exons 14-16).
Though there is long standing evidence that endometriosis is a major risk factor for CCC and EC, the molecular mechanism of this transformation is unknown44,45. Mutations in the PTEN gene have been described in 20% of endometriotic cysts. In a mouse model, Cre-mediated expression of oncogenic K-ras was found to induce endometriosis, while a second hit in the tumor suppressor Pten caused progression to endometrioid carcinoma; however, K-ras mutations are not seen in human endometriosis or endometriosis associated ovarian cancers.
Gaining an understanding of initiating events for CCC and EC subtypes could lead to the development of new therapeutic approaches and enable the creation of identification tools for endometriotic lesions that are at risk for neoplastic transformation. Mutations in ARID1A and loss of BAF250a expression were preferentially seen in CCC and EC, cancers that do not feature the genomic chaos, near ubiquitous TP53 mutations, and frequent BRCA abnormalities of HGS carcinomas. If HGS carcinomas are characterized by gross structural abnormalities in chromosomes, it is possible that defects in genes that alter the use of chromatin, along with previously described WNT and PI3 kinase pathway mutations will define CCC and EC. If such a model is correct, other abnormalities impacting the ARID1A locus or dysregulation of other chromatin remodeling genes will be found in the ARID1A mutation negative CCC and EC. This is supported by the clinical similarities between ovarian clear-cell carcinomas positive for and those negative for an ARID1A mutation.
The mechanism by which somatic mutations in ARID1A enable the progression of the benign condition of endometriosis to carcinoma has yet to be elucidated; however, the foregoing findings strongly suggest a fundamental role for ARID1A mutation in the genesis of both CCC and EC. The loss of ARID1A in endometriotic epithelium appears to be of importance in malignant transformation in this tissue type.
These data implicate ARID1A as a tumor suppressor gene frequently disrupted in CCC and EC. As ARID1A mutation and loss of BAF250a can be seen in the pre-neoplastic lesions, this is an early event and likely critical in the transformation of endometriosis into cancer.
To determine whether BAF250a loss is common in other malignancies, immunohistochemistry (IHC) screening for BAF250a expression was performed on tissue microarrays (TMAs) in more than 3000 cancers, including carcinomas of breast, lung, thyroid, endometrium, kidney, stomach, oral cavity, cervix, pancreas, colon, and rectum, as well as endometrial stromal sarcomas, gastrointestinal stromal tumours (GIST), sex cord-stromal tumours and four major types of lymphoma (diffuse large B-cell lymphoma [DLBCL], primary mediastinal B-cell lymphoma [PMBCL], mantle cell lymphoma [MCL], and follicular lymphoma). The inventors have demonstrated that BAF250a loss is frequent in endometrial carcinomas, but infrequent in other types of malignancies, with loss observed in 29% of Grade 1 or 2, and 39% of Grade 3 endometrioid carcinomas of the endometrium, 18% of high grade serous, and 26% of clear cell carcinomas. Since endometrial cancers showed BAF250a loss, the inventors stained whole tissue sections for BAF250a expression in 9 cases of atypical hyperplasia and 10 cases of atypical endometriosis. Of the 9 cases of complex atypical endometrial hyperplasia, all showed BAF250a expression; however, of the 10 cases of atypical endometriosis (the putative precursor lesion for clear cell and endometrioid ovarian carcinoma), one case showed loss of staining for BAF250a in the atypical areas with retention of staining in areas of non-atypical endometriosis. This was the sole case that recurred as an endometrioid carcinoma, indicating that BAF250a loss may be an early event in carcinogenesis. Since BAF250a loss is seen in endometrial carcinomas at a rate similar to that seen in ovarian carcinomas of clear cell and endometrioid type and is uncommon in other malignancies, loss of BAF250a is a particular feature of carcinomas arising from endometrial glandular epithelium.
Cases from the archives of Vancouver General Hospital, St. Paul's Hospital, and the British Columbia Cancer Agency were used to construct tissue microarrays (TMA) from duplicate 0.6 mm cores, as described previously46. The follicular lymphoma TMA was constructed using duplicate 1.0 mm cores. For the studies of atypical hyperplasia of the endometrium, hysterectomy cases where there was no co-existent carcinoma were used and full sections were immunostained. Immunostaining on the cases of atypical endometriosis was also performed on full sections. All prospectively collected patient samples were collected with informed patient consent under a research ethics board (REB)-approved protocol, and analysis of archived samples was covered by pre-existing REB approvals.
Immunohistochemical (IHC) staining for BAF250a was performed on all cases included in this study. IHC was performed on 4 μm thick paraffin sections of tissue microarrays or whole tissue sections on the semi-automated Ventana Discovery® XT instrument (Ventana Medical Systems, Tucson, Ariz.) using the Ventana ChromoMap DAB kit. Antigen retrieval was standard CC1 with a two hour primary incubation. BAF250a mouse clone 3H2 (Abgent, San Diego, Calif.) was applied at 1:50 followed by a 16-minute secondary incubation of pre-diluted UltraMap™ Mouse HRP (Ventana). Histologic images were obtained with the use of a ScanScope XT digital scanning system (Aperio Technologies Inc., Vista, Calif.).
The scoring for BAF250a was performed as previously described47. Non-neoplastic cells, including endothelial cells, fibroblasts, and lymphocytes, normally show BAF250a nuclear staining and served as positive internal controls. Positively scored tissue cores were ones that contained any positive tumour cell nuclear staining, regardless of intensity. Negatively scored tissue cores were ones that showed completely absent tumour cell nuclear staining together with positive non-neoplastic cell nuclear staining. Tissue cores lacking tumour cells were not scored. Cases in which neither normal cells in the stroma nor tumour cells were immunoreactive were considered to be the result of technical failure. Each case on a tissue microarray was represented as duplicate cores; one positive core in a duplicate was sufficient to count the case as positive.
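The scoring rules above can be written out as a short decision procedure. The following Python sketch is illustrative only: the `Core` record, the `Score` labels and the function names are hypothetical, and the way a case is resolved when no core is positive (negative takes precedence over technical failure) is an assumption that the text does not state.

```python
from enum import Enum
from typing import Iterable, NamedTuple


class Core(NamedTuple):
    tumour_present: bool    # core contains tumour cells
    tumour_positive: bool   # any tumour nuclear staining, regardless of intensity
    stroma_positive: bool   # non-neoplastic nuclei (internal control) are stained


class Score(Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"
    FAILED = "technical failure"
    NOT_SCORED = "not scored"


def score_core(core: Core) -> Score:
    """Score a single tissue core following the rules in the text."""
    if not core.tumour_present:
        return Score.NOT_SCORED          # cores lacking tumour cells are not scored
    if core.tumour_positive:
        return Score.POSITIVE            # any positive tumour nucleus counts
    if core.stroma_positive:
        return Score.NEGATIVE            # tumour negative, internal control positive
    return Score.FAILED                  # neither tumour nor stroma immunoreactive


def score_case(cores: Iterable[Core]) -> Score:
    """Combine duplicate cores; one positive core makes the case positive."""
    scores = [score_core(c) for c in cores]
    # Assumption: after POSITIVE, precedence is NEGATIVE, then FAILED.
    for outcome in (Score.POSITIVE, Score.NEGATIVE, Score.FAILED):
        if outcome in scores:
            return outcome
    return Score.NOT_SCORED
```

For example, a duplicate in which one core is tumour-negative with stained stroma and the other shows tumour staining is scored as a positive case.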
Overall, loss of BAF250a expression measured by IHC was not a common event in nongynaecological malignancies (
Nine cases of complex atypical hyperplasia of the endometrium were stained for BAF250a, and all nine showed the same pattern of staining as adjacent normal endometrium (i.e. moderate to intense nuclear positivity). Of the ten cases of atypical endometriosis, all but one showed retention of BAF250a (i.e. normal staining pattern). A single case showed loss of staining in the cytologically atypical areas with retention of staining in non-atypical endometriosis (
BAF250a, the protein encoded by ARID1A (the AT-rich interactive domain 1A gene), is one of the accessory subunits of the SWI/SNF chromatin remodeling complex believed to confer specificity in the regulation of gene expression27,28. The SWI/SNF complex consists of multiple components, with the core catalytic subunit utilizing ATP to mobilize nucleosomes, thus providing transcriptional control of genes by altering the accessibility of the promoter regions to the transcriptional machinery. The SWI/SNF complex, ubiquitous in eukaryotes, is important for the regulation of diverse cellular processes, from development, differentiation and proliferation to DNA repair and tumour suppression26.
The results of this Example establish that loss of BAF250a is characteristic of a wide range of tumours arising from eutopic as well as ectopic endometrium, but is uncommon in other tumour types studied. The carcinomas of the endometrium, particularly those of higher grade, show the most frequent loss of BAF250a. In the carcinomas of the endometrium that showed BAF250a loss, the mutational status of the ARID1A gene is not known. However in the clear cell and endometrioid carcinomas of the ovary, mutation of ARID1A correlates well, although not perfectly, with BAF250a expression. Therefore, the inventors hypothesize that in carcinomas of the endometrium with BAF250a loss, most will harbor mutations in the ARID1A gene. In cases that do not show BAF250a loss, it is possible that other components of the SWI/SNF chromatin remodeling complex will show loss of function. Additionally, since the deletion of ARID1A on one allele results in embryonic lethality in mice, it is possible that mutations in ARID1A resulting in partial loss of BAF250a expression could have a biologic effect in tumours and the effect of ARID1A may be underestimated by screening for total BAF250a loss by IHC48. The measurement of partial loss would require a nuanced approach to scoring or the use of multiplexed immunofluorescence.
In this study, the inventors did not identify BAF250a loss in any of the nine cases of atypical endometrial hyperplasia. One of the ten cases of atypical endometriosis had loss of BAF250a expression. This patient returned two years later with an endometrioid carcinoma at the location of the atypical endometriosis. This finding could be interpreted in two ways: either BAF250a loss, and thus ARID1A mutation, is a late event in the progression of precursor lesions to cancer, or the particular lesion studied was already fully malignant, although not recognized as such on morphological grounds. Either way, this case, along with the frequency of BAF250a loss in frank carcinomas and the rarity (or absence) of loss in normal tissue and precursor lesions, suggests that loss of BAF250a expression is a feature highly indicative of malignancy.
Approximately 30 genes including all 15 SWI/SNF genes will be analyzed for mutations in 150 clear cell carcinomas and 250 other ovarian cancers, using targeted next generation sequencing. When available, precursor lesions will be analyzed to assess if SWI/SNF mutations are early events in oncogenesis. It is predicted that tumours with SWI/SNF mutations will not contain mutations affecting pathways known to drive type I ovarian cancers, so samples will also be analysed for mutations in selected genes associated with these pathways. The 400 cases analysed by targeted resequencing along with an additional 1500 ovarian cases (that have clinical outcome data) will be immunohistochemically analysed to identify cases with loss of BAF250a expression and determine whether this correlates with ARID1A mutation status.
As described above, the inventors have demonstrated that approximately 39% of CCCs harbour mutations in the ARID1A gene. An additional two cases had mutations in other SWI/SNF complex genes. This observation will be expanded to determine the frequency of mutations in ARID1A and the other 15 genes coding for SWI/SNF complex proteins in a large cohort (~400 cases) of ovarian carcinomas, including all pathological subtypes of this disease,26,49 to determine how frequently this complex is perturbed in ovarian cancer.
It is predicted that alterations in the SWI/SNF complex represent a mechanism of oncogenesis of fundamental significance, distinct from previously identified molecular pathways in ovarian carcinoma. This prediction will be tested by assessing the mutational status of several genes that are known to be involved in ovarian carcinomas. It is anticipated that chromosomally stable type I ovarian cancers will be able to be sub-categorized into two groups: (i) cancers with mutations in known oncogenic pathways and (ii) cancers with mutations affecting chromatin remodelling. Immunohistochemistry will be used to assess BAF250a expression in the 400 sequenced cases along with 1500 additional ovarian cases.
DNA from 400 frozen ovarian tumour samples representing all subtypes will be used for targeted resequencing. All cases will have an accompanying source of germline DNA. Approximately 150 of these samples will be CCCs and the remaining 250 will be comprised of other ovarian cancer subtypes (50 endometrioid, 150 high grade serous, 25 low grade serous and serous borderline, and 25 mucinous and mucinous borderline). All 250 tumours representing non-CCC subtypes plus 35 CCCs will be obtained from the OvCaRe Tissue Bank (http://www.ovcare.ca/research/platforms.php) located in the Department of Pathology at the Vancouver General Hospital. The remaining 115 CCCs will be obtained from outside sources, such as 42 CCCs from the Australian Ovarian Cancer Study, 30 CCCs from the Institut du cancer de Montreal, 33 CCCs from Mt. Sinai School of Medicine, New York, 10 CCCs from Johns Hopkins University, and 9 CCC cell lines from Dr. Michael Anglesio. With 150 CCC cases, the rate of mutations in CCC will be determined with a margin of error of 8% or less (95% confidence level).
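The stated margin of error can be checked with a short calculation. The sketch below assumes the usual normal (Wald) approximation for a binomial proportion and the worst-case proportion p = 0.5; the text does not say which method was used, so both are assumptions, and the function name is illustrative.

```python
import math


def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Half-width of an approximate 95% confidence interval for a proportion.

    Assumes the normal (Wald) approximation; p = 0.5 gives the widest interval.
    """
    return z * math.sqrt(p * (1.0 - p) / n)


# With 150 CCC cases the worst-case half-width is about 0.08 (8 percentage points).
print(f"{margin_of_error(150):.3f}")
```

Under these assumptions, 150 cases give a half-width of roughly 8 percentage points, in line with the figure quoted above.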
For immunohistochemical analysis of BAF250a protein expression, in addition to the 400 samples described above, another 1500 ovarian cancer samples assembled into tissue microarrays will be examined. These tissue microarrays include approximately 250 CCCs with the remaining cases representing other ovarian cancer subtypes, and have been described previously.4,50 In addition, 50 putative CCC precursor lesions, i.e. endometriosis and atypical endometriosis, will be analysed. Lesions from tumours used for targeted sequencing, described above, will be prioritized and the remaining cases will be from the Vancouver General Hospital Pathology Archives.
The 15 SWI/SNF genes along with genes known to be mutated in ovarian cancer including TP53, KRAS, BRAF, PTEN, PIK3CA, CTNNB1, BRCA1, and BRCA2 will be sequenced. In total these include 406 exons and intron-exon boundary sequences covering 120 kb. To accomplish this, genomic DNA libraries will be enriched for target genes, which will be analysed by next generation sequencing. Alternative approaches are less attractive: high throughput Sanger sequencing is expensive and insensitive to mutations found in less than 15% of alleles due to stromal contamination or intra-tumoural heterogeneity, sequencing of the polyA+ transcriptome would not detect mutations resulting in nonsense mediated mRNA decay, and whole exome sequencing would be too costly.
The inventors have extracted DNA from over 300 of the samples, and the other extractions will be performed using the Qiagen MagAttract™ kit on a Qiagen M48 robot. Quantification of DNA will be performed using the Quant-iT dsDNA HS assay kit and Qubit™ fluorometer (Invitrogen) prior to plate-based library construction. Libraries of sheared genomic fragments will be constructed in 96 well plates using a Covaris E210 sonication platform and Biomek™ FX liquid handler. Library construction begins with 1 μg of DNA which is automatically 1) sheared to an average size of ˜200 bp, 2) transferred to 96 well plates, 3) end-polished, 4) poly-A tailed, 5) ligated to barcoded adapters, and 6) PCR-amplified with oligonucleotides specific for sequences required for clonal cluster generation. Once constructed, libraries will be pooled (up to 94 samples in a single run) and enriched by solid or liquid phase capture probes.
There are competing approaches for target enrichment, including custom Agilent and Nimblegen solid and solution phase capture platforms; however, to date, these platforms have not been validated for multiplexed sample capture, and the 400 samples would have to be examined as individual capture experiments, which would be cost prohibitive. Thus, a solid phase microfluidic capture platform developed by febit for the SOLiD™ 3.5 sequencing platform (febit biomed gmbh and Applied Biosystems, respectively) will be used. The febit HybSelect™ microarray-based capture method selectively captures fragments of sequence from complex genomic libraries through hybridization of DNA samples to specific oligonucleotides generated by light-activated in-situ synthesis on microfluidic chips (Geniom™ Biochip)51. Each Geniom™ Biochip contains 8 individually addressable arrays, each composed of >15,000 capture probes segmented into features of variable number and size. The number of features, density, and probe length are customizable, up to a maximum of 800 kb per array. Twelve barcoded SOLiD™ sequencing libraries will be pooled for each array (96 libraries per Geniom™ Biochip) and subjected to sequence capture, washing and elution on a Geniom™ RT device. The sequence capture steps will be performed by febit's Genomics services unit.
The enriched samples will be assessed and quantified using a DNA 1000 series II assay (Agilent) and Quant-iT dsDNA HS assay kit and Qubit™ fluorometer (Invitrogen), respectively. Sets of libraries will be further pooled (up to 96 samples per slide) and subjected to bulk emulsion PCR (emPCR), enrichment, and sequencing on the SOLiD™ 3.5 platform. Each bulk emPCR will be subjected to a work flow analysis (WFA) run on the SOLiD™ platform to ensure that the noise to signal ratio is within specification. Once approved, the emPCR will be used for large scale bead deposition targeting ~500 million reads per slide, 1 billion reads per run.
Data Analysis:
Image processing to colour calls will be performed on instrument and resulting files will be aligned to the reference human genome (NCBI build 36.1, hg18) using Bioscope™ v1.01 (Applied Biosystems). Variants in the resulting alignments will be detected using the diBayes package (Applied Biosystems). The probability of the existence of a heterozygote or a non-reference homozygote will be evaluated using prior probabilities of the SNP being a “miscolourcall”, “position error” or “probe error”. In addition, data will be analysed independently of the diBayes approach by aligning all reads in colourspace using the Mosaik aligner (http://bioinformatics.bc.edu/marthlab/Mosaik). This algorithm has several advantages over competing methods: it uses a banded Smith-Waterman approach for alignment that is more likely to detect insertions and deletions, it takes full advantage of the colourspace reads, and may be less prone to misalignment. Moreover, Mosaik seamlessly converts back to base-space and thus allows us to leverage the cancer-specific framework the inventors have developed for SNV detection called SNVMix 56 used in the discovery of the FOXL2 mutation in granulosa cell tumours of the ovary1 and the analysis of genome-wide mutational evolution in a lobular breast cancer.25 After alignment, we will predict SNVs and cross reference all non-synonymous protein coding predictions against a database of known SNPs to enrich the results for somatic variants.25 All remaining non-synonymous SNVs and protein coding insertions and deletions will henceforth be referred to as somatic mutation candidates (SMCs). The SMCs will be validated by targeted ultra-deep amplicon sequencing in tumour and normal DNA on Illumina GAIIx machines25. This approach is expected to yield allelic frequency information and is sensitive enough to confirm SMCs, even those present in a small minority of cells. 
Reads will be aligned to the human reference genome using Maq 0.7.1 and variants will be assessed using a binomial exact test followed by correction for multiple comparisons using the Benjamini-Hochberg method. All positions where the variant is statistically significantly present in the tumour but not the normal will be considered validated somatic mutations.
Once the sequencing has been completed, the data will be used to identify and quantify all mutations. Validation of potential mutations will be performed by Illumina sequencing of PCR amplicons from tumour derived DNA.25 Matched normal DNA will be assessed for the presence of all validated mutations to determine somatic versus germline status. It is estimated that there will be five potential mutations per case in the genes sequenced (thus 2000 mutations in 400 cases). The inventors have working primer sets for the known cancer genes and estimate the need to develop an additional 200 primer sets to validate mutations in SWI/SNF genes. Amplicons for all mutations will be placed into two pools, each of which will be used to create a library that will be run on a single lane of the Illumina GAIIx analyzer. The amplicons from normal and tumour DNA will be pooled into separate libraries to eliminate the need for barcoding. If identical changes are seen in multiple cases, these will be validated by Sanger sequencing. In cases where ARID1A mutations are found, LOH at the second allele will be assessed using FISH.
If the HybSelect method does not work as outlined above, an alternative sequencing strategy will be used: either Illumina-based sequencing of selected amplicons or Sanger sequencing. If Sanger-based sequencing is used, the number of cases analysed will be decreased to 100 due to increased costs associated with this approach.
As described above, the inventors have demonstrated that the mutation status of ARID1A correlates with BAF250a expression. The above experiments were conducted using a mouse monoclonal antibody directed against a 111 amino acid region (amino acids 1216-1326) C-terminal to the ARID domain of BAF250a (clone 3H2, Abgent Inc.). As this antibody targets the central region of the protein, there may be positive staining even when nonsense mutations within the C-terminus give rise to a truncated form of the protein. As several of the mutations identified by the inventors fall within the C-terminus (
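The validation statistics described above can be sketched as follows. This is a minimal illustration, not the inventors' pipeline: the text does not state the null hypothesis of the binomial test, so the sketch assumes a fixed background sequencing error rate (`error_rate`, a hypothetical parameter), and it interprets "present in the tumour but not the normal" as significant in the tumour and not significant in the normal after Benjamini-Hochberg adjustment.

```python
import math


def binom_tail(k: int, n: int, p: float) -> float:
    """One-sided exact binomial test: P(X >= k) for X ~ Binomial(n, p), 0 < p < 1."""
    if k <= 0:
        return 1.0
    total = 0.0
    for i in range(k, n + 1):
        # Work in log space so that deep coverage does not overflow.
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                   + i * math.log(p) + (n - i) * math.log1p(-p))
        total += math.exp(log_pmf)
    return min(total, 1.0)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):            # walk from the largest p-value down
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def somatic_calls(sites, error_rate=0.01, alpha=0.05):
    """sites: list of (tumour_alt, tumour_depth, normal_alt, normal_depth) read counts.

    Assumption: a site is called somatic when the variant is significant in the
    tumour and not significant in the matched normal after adjustment.
    """
    tumour_q = benjamini_hochberg([binom_tail(ta, td, error_rate) for ta, td, _, _ in sites])
    normal_q = benjamini_hochberg([binom_tail(na, nd, error_rate) for _, _, na, nd in sites])
    return [tq < alpha and nq >= alpha for tq, nq in zip(tumour_q, normal_q)]
```

With hypothetical counts, a site with 50 of 1000 variant reads in the tumour and 2 of 1000 in the normal is called somatic, whereas a site with similar variant fractions in both samples is treated as germline.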
Expression of BAF250b (encoded by ARID1B) will also be assessed. Since SWI/SNF complexes cannot contain both BAF250a and BAF250b, it is predicted that depletion of BAF250a may correlate with increased BAF250b.
Based on the RNA-seq data described above, it appears that ARID1B expression levels are not affected by mutations in ARID1A and in fact are not variable when compared across all cancer types. However, in order to ensure that BAF250b protein expression is not increased due to BAF250a deficiencies, all BAF250a-negative cases will be immunostained for expression of BAF250b. Since SWI/SNF complexes cannot contain both BAF250a and BAF250b, it may be that the absence of BAF250a corresponds to an enrichment of BAF250b containing complexes. This would have functional consequences as BAF250a depletion has been shown to specifically inhibit cell cycle arrest, while BAF250b depletion has no effect on cell cycle arrest52. In addition, BRM, BRG1, and BAF47 immunohistochemistry will be done on all tissue microarrays.
Cases with unexplained loss of BAF250a, BRM, BRG1, or BAF47 expression will be re-examined for promoter hypermethylation, which has been described for BRM and BRG1,26 using primers designed through access to known tools such as http://www.urogene.org/methprimer/index.html or published primers. Immunostaining of all cases will be performed at the Genetic Pathology Evaluation Centre8,14,50.
With about 150 CCC cases, the rate of ARID1A mutations can be determined to within ±10%. Analysis of 400 ovarian cancer tumours will allow detection of differences in mutation rates between pathological or molecularly defined subtypes of 15% (80% power level). Mutation frequency in SWI/SNF genes will be compared between cancer subtypes using Fisher's exact test. It will be determined whether CCCs with ARID1A mutations or loss of expression have a distinct clinical phenotype by correlation with patient outcomes and tumour stage. Log rank test and Kaplan Meier plots will be used to assess differences in survival characteristics4,50. Associations with clinical and biomarker data will be assessed with chi-square tests and contingency tables.
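The subtype comparison by Fisher's exact test can be illustrated with a small sketch. The function below is a plain implementation of the two-sided test for a 2x2 table (mutated versus not mutated, in two subtypes), using the common convention of summing the probabilities of all tables no more likely than the observed one; the function name is illustrative and the table used in the usage note is a textbook example, not study data.

```python
import math


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows could be two cancer subtypes; columns mutated / not mutated.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def pmf(x: int) -> float:
        # Hypergeometric probability of x mutated cases in the first row.
        return math.comb(row1, x) * math.comb(row2, col1 - x) / math.comb(n, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = pmf(a)
    # Sum tables at least as extreme (no more probable) than the observed one;
    # the small tolerance guards against floating-point ties.
    p = sum(pmf(x) for x in range(lo, hi + 1) if pmf(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)
```

For the textbook table [[3, 1], [1, 3]] this returns 34/70, i.e. about 0.486.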
In all cases where mutations within SWI/SNF genes are found, putative precursor lesions (when present) will be analyzed by immunohistochemistry for BAF250a expression; FISH for chromosomal based LOH; and laser capture microdissection (LCM) followed by Sanger sequencing of cloned PCR products to assess ARID1A mutation status. This approach has already been used on case CCC23 discussed above.
A determination will be made as to how ARID1A (wildtype, loss, and mutant) affects cell growth and survival in clear cell carcinoma cells and xenograft mouse models. The effect of ARID1A mutations on protein-protein interactions will be determined using co-immunoprecipitation experiments followed by mass spectrometry. To determine if ARID1A mutations affect recruitment to BAF250a targets, chromatin immunoprecipitation combined with next generation sequencing (ChIP-seq) will be used. Genome-wide nuclease accessibility assays will be used to validate SWI/SNF-chromatin interactions identified in chromatin immunoprecipitation experiments (
The inventors have developed a transplantable xenograft from VOA867 (CCC 14), a CCC with a heterozygous ARID1A truncating somatic mutation (C1680A (Y560*)) in exon 3 resulting in complete loss of BAF250a expression. An ARID1A-null cell line (867CL), established from the VOA867 (CCC 14) xenograft, will be used to create isogenic derivatives for all functional studies. Site-directed mutagenesis of the full length ARID1A cDNA (pCMV6-XL4 plasmid, OriGene Technologies) will be conducted to generate ARID1A constructs corresponding to mutations identified through RNA-seq. Specifically, 867CL isogenic lines will be created with 1) vector only as a control (867CL-vector), 2) the 6018-6020delGCT (ΔL2007) 3 bp deletion found in VOA120 (867CL-ARID1A-ΔL2007), and 3) wildtype ARID1A (867CL-ARID1A-WT). To prevent disruption of BRG1 binding resulting from a BAF250a C-terminal GFP fusion, a vector with GFP expressed through an IRES (internal ribosome entry site) will be used, and BAF250a antibodies will be used to validate expression. These ARID1A mutant and wild-type constructs will be packaged into the pLVX-Puro lentiviral expression vector, which will be used to infect 867CL cells. Transduced cells will be selected using puromycin and/or flow sorting for GFP. Stable clones will be derived by limiting dilution to select clones with ARID1A expression that is comparable to that of TOV-21G (a CCC derived cell line that endogenously expresses wildtype ARID1A). These cells (867CL, 867CL-vector, 867CL-ARID1A-ΔL2007, 867CL-ARID1A-WT) will be subjected to RNA-seq and differentially expressed genes will be mapped to pathways using Ingenuity Pathway Analysis software. These data will also be used to validate ChIP-seq results.
The three isogenic and parent 867CL cells will be analyzed in vitro for growth and cell cycle activity. MTT (3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyl tetrazolium bromide) assays will be used to evaluate proliferation status as a function of mitochondrial activity.53 As a second measurement of cell survival, the cell colony formation assay will be used, which assesses cell cycle arrest or cell death leading to reduced colony formation.54 As depletion of ARID1A plays a role in cell cycle repression,52 cell cycle activity will be assessed through analysis of DNA synthesis as measured by [3H] thymidine incorporation into DNA.55 To further elucidate the biological function of ARID1A in vivo, the parent 867CL cells will be transplanted along with the three derived isogenic cells into NOD/SCID mice using the xenograft sub-renal capsule technique and growth properties of the tumour xenografts will be compared. If 867CL-ARID1A-WT xenografts have a longer tumour doubling time compared to ARID1A-null (867CL, 867CL-vector) or ARID1A-mutant (867CL-ARID1A-ΔL2007), this would further support that ARID1A acts as a tumour suppressor in CCCs.
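The tumour doubling time used for the xenograft comparison can be estimated from two volume measurements. The sketch below assumes simple exponential growth between the two time points, which the text does not specify; the function name and the units are illustrative.

```python
import math


def doubling_time(volume_start: float, volume_end: float, elapsed_days: float) -> float:
    """Doubling time in days, assuming exponential growth between two measurements."""
    if volume_start <= 0 or volume_end <= volume_start:
        raise ValueError("volumes must be positive and increasing")
    return elapsed_days * math.log(2) / math.log(volume_end / volume_start)


# A xenograft that grows four-fold in 10 days has doubled twice: 5 days per doubling.
print(doubling_time(100.0, 400.0, 10.0))
```

Under this assumption, a longer doubling time for 867CL-ARID1A-WT xenografts than for the null or mutant lines would correspond to slower growth.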
If isogenic cell lines cannot successfully be created from 867CL cells, other ARID1A-null cells will be selected to serve as potential alternatives: 1) any of the nine CCC cell lines sequenced in the Examples above with loss of ARID1A expression or 2) IOSE (immortalized ovarian surface epithelium) or HCT116 cells stably expressing lentiviral ARID1A shRNA. Preliminary data demonstrate efficient ARID1A shRNA-mediated knock-down of BAF250a expression in HCT 116 cells (
The inventors predict that 867CL and 867CL-vector cells will produce identical results in the cell cycle and growth assays described above. If this is the case, it will be concluded that the vector has no effect and only 867CL cells will be used for the remaining experiments. Immunoprecipitation (IP) of SWI/SNF complexes is required for both assessment of protein composition (in MS experiments) and chromatin binding (in ChIP-seq experiments). IP experiments will be done from nuclear extracts in null (867CL), mutant (867CL-ARID1A-ΔL2007) and wildtype ARID1A (867CL-ARID1A-WT) cell lines using three SWI/SNF antibodies targeting: 1) one core component of the complex (i.e. BAF155, BAF170, or BAF47)49; 2) BAF250b; and 3) BAF180. In addition, the inventors will IP SWI/SNF complexes using BAF250a antibodies from 867CL-ARID1A-ΔL2007, 867CL-ARID1A-WT, and TOV-21G cells (
With reference to
Antibodies for IP56 are available from Santa Cruz (BAF170, sc-10757; BAF47, sc-16189; BAF250a, sc-32761) and Bethyl Laboratories (BAF180, A301-590A; BAF155, A301-019A; BAF250b, A301-047A) and these will be tested to select antibodies that produce the cleanest results. As SWI/SNF complexes must contain one of BAF250a, BAF250b, or BAF180, ARID1A loss or mutations may manifest as dramatically reduced levels of wildtype BAF250a complexes and an increase in BAF250b or BAF180 containing complexes. A second consequence of ARID1A mutations may be alteration of the protein combinations within SWI/SNF complexes. A third consequence of these mutations may be changes in chromatin targets for SWI/SNF complexes which would affect gene regulation. These will all be investigated using the combination of MS and ChIP-seq experiments described below.
The inventors will use the multiple reaction monitoring (MRM) MS analysis technique to quantitate signature peptides for 15 known components of SWI/SNF complexes (
Comparison of 867CL to 867CL-ARID1A-WT or TOV-21G cells will identify changes associated with altered overall SWI/SNF complex composition and altered BAF250b and BAF180 complex composition associated with ARID1A loss in CCCs. It is predicted that the SWI/SNF composition of 867CL-ARID1A-ΔL2007 compared to 867CL-ARID1A-WT and TOV-21G will identify proteins gained or lost due to BAF250a interactions that are dependent on contacts to Leu2007 or tertiary structures affected by the Leu2007 residue. This will be verified by IP of SWI/SNF complexes using the BAF250a antibody in 867CL-ARID1A-ΔL2007, 867CL-ARID1A-WT, and TOV-21G cells.
IP of SWI/SNF complexes from nuclear extracts using antibodies to SWI/SNF core proteins and analysis by MS/MS has succeeded in identifying all of the core proteins to be monitored57,58; thus the more sensitive MRM technique should also be successful. Technical replicates for MRM analysis vary by less than 5%, so it is anticipated that small (10-20%) changes in the relative levels of individual SWI/SNF proteins in the overall pool of SWI/SNF components will be detectable. The experiments will not be able to differentiate between SWI/SNF complexes with different compositions, but should detect major adjustments in SWI/SNF complex composition due to the loss of BAF250a. If the data identify compelling changes, experiments to characterize individual SWI/SNF complexes in the BAF250a mutant lines would be performed. Using biochemical size fractionation chromatography and the MRM assay, the molar stoichiometry of individual SWI/SNF complexes and their components would be determined.
Experiments will be conducted to determine if mutations in ARID1A lead to distinctive SWI/SNF-chromatin interactions. The effect of ARID1A mutations on BAF250a mediated transactivation will be assayed using a luciferase reporter construct. ChIP-seq and nuclease protection assays59 will assess how wildtype and mutant BAF250a proteins differentially interact with chromatin.
Effect of ARID1A Mutations on Transactivation:
The XG46TL plasmid, which contains multiple glucocorticoid receptor response elements upstream of a luciferase reporter, will be obtained and transiently transfected into the four cell lines (867CL, 867CL-ARID1A-WT, 867CL-ARID1A-ΔL2007, TOV-21G). Cells will be treated with dexamethasone to stimulate the glucocorticoid receptor, which acts in concert with the SWI/SNF complex to activate transcription; this can be assessed through quantitation of luciferase as previously described.60 Using this reporter system, effects of ARID1A mutations on transactivation can be directly assessed.
Effect of ARID1A Mutations on BAF250a Interaction with DNA:
The impact of ARID1A mutations on SWI/SNF complex binding to chromatin will be assessed using ChIP-seq to identify promoters interacting with SWI/SNF complexes in the four cell lines described above. IP's will be done as described above, in duplicate. The tools required for ChIP-seq and associated analysis have been previously described.61 The coverage chosen (˜5 Gbp per library) will achieve the redundancy necessary to find high confidence peaks while maintaining budget constraints.
Cell lines will be treated with formaldehyde to cross-link DNA and associated proteins. Cleared cell lysates will be sonicated to shear the chromatin, then incubated with the selected SWI/SNF antibody followed by overnight Protein A/G Sepharose precipitation. Chromatin IPs will be washed, eluted, used to create an Illumina sequencing library, and sequenced in one lane of an Illumina flow cell. Paired reads will be aligned to the reference human genome with Exonerate (http://www.ebi.ac.uk/˜guy/exonerate) or Maq.62 Regions of clustered sequence tags (peaks) corresponding to chromatin will be defined using FindPeaks software.63 Sequences not present in both biological replicates or found to be in common with the ARID1A-wildtype (867CL-ARID1A-WT, TOV21G), ARID1A-mutant (867CL-ARID1A-ΔL2007, and ARID1A-null (867CL) cells will be removed from analysis. Data will be analysed with MEME64 to detect any over-represented motifs and with TRANSFAC to find known transcription factor binding sites. Finally, genes and highly conserved intergenic sites will be identified proximal to peaks. It is expected to see on the order of 1000 peaks at false discovery rate=0.05. These areas will be prioritized based on where they are located (i.e. promoter regions upstream of target genes), the relevance of genes that may be transcriptionally regulated by these regions, and by the data obtained from the targeted sequencing and MS experiments.
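The peak filtering rule described above (keep peaks seen in both biological replicates, then discard peaks shared by all three genotype groups) can be sketched with set operations. This is a simplification: real peaks are genomic intervals that would need overlap-based matching, whereas the sketch assumes each peak has already been reduced to a comparable region identifier. The function names and the region labels in the test are hypothetical.

```python
def reproducible_peaks(replicate_a, replicate_b):
    """Keep only peaks called in both biological replicates."""
    return set(replicate_a) & set(replicate_b)


def remove_shared_peaks(peaks_by_group):
    """Drop peaks found in common among every genotype group.

    peaks_by_group maps a group label (e.g. wildtype, mutant, null) to its
    reproducible peaks; the result keeps only peaks that are not common to
    all groups.
    """
    shared = set.intersection(*(set(p) for p in peaks_by_group.values()))
    return {group: set(p) - shared for group, p in peaks_by_group.items()}
```

Peaks that survive both steps are those reproducibly detected in a genotype group and absent from at least one other group, which is the subset expected to reflect ARID1A-dependent binding.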
This approach will allow identification of high confidence DNA-protein interactions in the primary dataset and eliminate signals due to sporadic or non-specific DNA-protein binding. Interactions of interest will be validated with orthogonal techniques, including testing interactions of BAF250a with selected promoters placed upstream of a luciferase reporter gene. To determine whether findings from the ChIP-seq experiments are supported by expression changes for the implicated genes, triplicate libraries from the 867CL, 867CL-vector, 867CL-ARID1A-WT, and 867CL-ARID1A-ΔL2007 cell lines will be analysed by RNA-seq for differential gene expression using the edgeR Bioconductor statistical package.65 Briefly, edgeR models read count data for a particular gene according to a negative binomial distribution. Because the negative binomial distribution acts as an overdispersed Poisson model, the differential gene expression analysis is able to account for both technical and biological variation. All genes showing differential expression and concomitant differential ChIP-seq peak detection in their promoter regions will be selected as candidate genes affected by ARID1A mutation.
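The way a negative binomial model separates technical from biological variation can be made explicit through its mean-variance relationship. The notation below is introduced here for illustration only and does not appear in the specification: $Y_{gi}$ is the read count for gene $g$ in library $i$, $\mu_{gi}$ its expected value, and $\phi_g$ the gene-wise dispersion.

```latex
\[
  Y_{gi} \sim \mathrm{NB}(\mu_{gi}, \phi_g),
  \qquad
  \operatorname{Var}(Y_{gi}) = \mu_{gi} + \phi_g \, \mu_{gi}^{2}
\]
```

The first term is the Poisson variance attributable to sampling (technical variation); the second term is the extra variance attributable to differences between replicates (biological variation). Setting $\phi_g = 0$ recovers the ordinary Poisson model.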
Effects of ARID1A Mutations on In Vivo Nucleosome Remodelling:
Nucleosome-free DNA is sensitive to digestion by low concentrations of nuclease, and ARID1A mutations may be reflected as changes in nuclease sensitivity. Nuclease sensitivity at 20 ARID1A targets identified through ChIP-seq will be assessed, focusing on genes that are known drug targets or cancer genes and for which the ChIP-seq data correspond to changes in gene expression through RNA-seq. Briefly, nuclei from CCC cell lines will be treated with low concentrations of micrococcal nuclease or DNase I, causing only DNA from nucleosome-free regions to be degraded. The remaining protected DNA will be sequenced using primers specific for each target.
In the event that no changes in SWI/SNF composition or DNA binding are identified in the presence of ARID1A mutations, it will additionally be assessed whether these mutations result in alteration of histone ubiquitination, as it was recently demonstrated that BAF250b (the gene product of ARID1B) is an E3 ubiquitin ligase for histone H2B at lysine (K)120.66
An siRNA library will be used to identify genes that are necessary for survival of cells expressing mutant ARID1A. Any identified genes would be potential targets for the development of therapeutics for clear cell cancers with ARID1A mutations. The siRNA library will be screened in xenograft mouse models of ARID1A mutant clear cell carcinomas.
An established approach to identifying therapeutic targets in cancer is to search for “synthetic lethality”, also known as conditional genetics. The prototype example of synthetic lethality is PARP inhibition in the context of BRCA1 or BRCA2 deficiency.67,68 To define therapeutic targets that would be uniquely effective in tumours bearing ARID1A mutations, a synthetic lethal (viability) screen will be conducted using established siRNA/high content screening methodology. A fully integrated siRNA screening facility equipped with robotics, fluid handling and an INCELL 1100 high content imager will be used. The inventors will use a published siRNA/high content multiparameter screening method69 to measure seven phenotypic parameters relevant to cell viability, proliferation, cell cycle, and associated checkpoints.
The siRNA libraries screened will be the Hannon/Elledge lenti-shRNA human library (approx 66,000 constructs) and the Dharmacon siGenome pools, representing approximately 22,000 gene loci. Both libraries have been internally formatted for 96 well and 384 well screens. In preference, the siRNA library pools will be used for screening at 25 nM. If for any reason siRNA transfection proves difficult, the shRNA library will be used. The 867CL, 867CL-ARID1A-ΔL2007, 867CL-ARID1A-WT cells will be used. If screening using these cell lines proves intractable, an isogenic knockout of ARID1A in HCT116 cells will be used as a second choice (
Cell lines will be compared pairwise, in 384 well plates. Each transfection plate will contain controls for transfection efficiency, transfection toxicity and siRNA effectiveness, and phenotypic baseline measurements. In the primary screens, all 22,000 siRNA pools/66,000 shRNAs (representing the full human gene complement thus far established) will be used. The screen will be performed in 384 well plates on the three isogenic cell lines, in triplicate. Control plates (all wells transfected with the same non-targeting siRNA) will be used to correct for well position effects in a linear mixed effects model. Cells will be transfected with 25 nM of siRNA pools or transduced with lentiviral particles at an MOI of 3, as appropriate. The effects of each siRNA pool or shRNA on cell viability, cell shape and transduction efficiency will be measured 4 days post transfection. Transduction efficiency will be evaluated using control wells from each screen plate, containing PLK1 siRNA. All conditions will be assessed in triplicate to allow adequate assessment of variability. After image segmentation and quantification as described, the data will be analysed with a linear mixed effects model70 to handle known screening artefacts such as well plate edge effects, reagent dispenser pipette tip effects, etc. Multiple comparisons adjustments will be performed using the Benjamini-Hochberg approach for p-values, and empirical Bayes shrinkage for effect estimates where appropriate.71,72 To measure the degree of synthetic interaction, an interaction index (scaled ratio of wt phenotype size to mutant phenotype size, for a given siRNA) will be calculated from linear model adjusted values. The top 5% of candidate shRNA targets, based on ranked synthetic effect magnitude and ranked p-value, will be triaged for follow-up validation. Following primary screening and selection of initial hits, these will be rescreened individually (pool deconvolution) for maximum discrimination.
Re-validated siRNAs will also be assayed in conjunction with qRT-PCR (quantitative reverse transcriptase PCR) for the target transcript to determine whether the phenotype segregates with the degree of transcript knockdown. siRNAs surviving these filters will be grouped by GO-terms and structural class, for further follow up.
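The multiple comparisons adjustment named above for the primary screen can be illustrated with a short Python sketch of the standard Benjamini-Hochberg step-up procedure. It is a minimal, dependency-free illustration rather than the analysis pipeline of the cited methods; the function name `benjamini_hochberg` and the example p-values are hypothetical.

```python
from typing import List


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for four siRNA pools.
example = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

An siRNA pool would then be called a hit at a chosen false discovery rate when its adjusted p-value falls at or below that threshold.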
While a number of exemplary aspects and embodiments have been discussed above, those of skill in the art will recognize certain modifications, permutations, additions and sub-combinations thereof. It is therefore intended that the following appended claims and claims hereafter introduced are not limited by the preferred embodiments set forth in the disclosure and the examples, but are to be given the broadest interpretation consistent with the specification as a whole.
This application claims the benefit of U.S. provisional patent application No. 61/326,859 filed 22 Apr. 2010 and U.S. provisional patent application No. 61/368,596, filed 28 Jul. 2010, both entitled NOVEL MARKERS AND THERAPEUTIC TARGETS FOR CLEAR CELL CARCINOMA OF THE OVARY, each of which is expressly incorporated by reference herein.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/IB11/51763 | 4/22/2011 | WO | 00 | 11/7/2012 |
Number | Date | Country |
---|---|---|
61326859 | Apr 2010 | US |
61368596 | Jul 2010 | US |