The technology described herein relates to methods of diagnosing and treating cancer.
Microsatellite instability (MSI) is a molecular tumor phenotype that is indicative of genomic hypermutability, usually reflecting inactivation of the mismatch repair (MMR) system. MSI is marked by spontaneous gains or losses of nucleotides from repetitive DNA tracts, resulting in new alleles of differing length that serve as the basis for its clinical diagnosis. Although classically associated with colorectal and endometrial tumors, MSI has now been recognized in most cancer types with varying prevalence and is accompanied by a generally increased rate of mutations genome-wide. Molecular diagnosis of MSI can be predictive of a patient's response to an anti-cancer therapy. However, there is an unmet need for MSI screening panels tailored to specific cancer types, and the informative microsatellite marker sequences vary between different types of cancers.
To better inform molecular diagnosis of MSI, over 9,000 tumor-normal exome pairs and over 900 whole genome sequence pairs were examined from 33 different cancer types and cataloged for genome-wide microsatellite instability events. Using a statistical framework, microsatellite mutations that were predictive of MSI within and across cancer types were identified. The diagnostic accuracy of different subsets of maximally informative markers was estimated computationally using a dedicated validation set. Based on these analyses, twenty-five cancer types exhibited hypermutated states consistent with MSI. Recurrently mutated microsatellites associated with MSI were identifiable in 15 cancer types, but were largely specific to individual cancer types. Cancer-specific microsatellite panels of one to seven loci were needed to attain 95% diagnostic sensitivity and specificity for 11 cancer types, and in eight of the cancer types, 100% sensitivity and specificity were achieved. In contrast, breast cancer required 800 loci to achieve comparable performance, and it was not possible to identify recurrent microsatellite mutations supporting reliable MSI diagnosis in ovarian tumors. Thus, as described further herein below, most microsatellites informative for MSI are specific to particular cancer types, requiring the use of tissue-specific loci for optimal diagnosis. Accordingly, limited numbers of markers are needed to provide accurate MSI diagnosis in most tumor types, but it is challenging to diagnose breast and ovarian cancers using pre-defined microsatellite locus panels.
Features associated with informative microsatellites were cataloged, and sets of microsatellites are ranked herein according to the degree of predictive value of each microsatellite for each tumor type. These ranked sets underpin the diagnostic, prognostic and therapeutic methods described herein below. For each set, the number of mutated members of that set which are determinative of MSI-H is influenced by where the mutated members of the set are in the rank ordering, with higher ranked microsatellite markers carrying greater weight (i.e., lower p-value). Thus, the number of microsatellite mutations necessary for a sensitive and specific determination of MSI-H for a given cancer type can vary depending upon the ranking of the mutated markers in the set, with fewer mutations needed to assign the cancer to the MSI-H category when the mutations detected are in higher ranked microsatellites (i.e., lower p-value).
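As a non-limiting illustration of how rank-dependent weighting could operate, the following minimal sketch scores a tumor by the share of total marker weight carried by its mutated markers. The weighting function (negative log10 of the p-value), the function name `weighted_msi_score`, and the marker identifiers and p-values are assumptions made for illustration only; they are not taken from the Tables herein.

```python
import math


def weighted_msi_score(p_values, mutated):
    """Return the share of total marker weight carried by mutated markers.

    p_values: dict mapping marker id -> p-value used for ranking (hypothetical input).
    mutated:  set of marker ids observed as mutated in the tumor.
    Weight = -log10(p) is an illustrative assumption, so lower p-values weigh more.
    """
    for marker, p in p_values.items():
        if not 0.0 < p <= 1.0:
            raise ValueError(f"p-value for {marker} must be in (0, 1]")
    weights = {m: -math.log10(p) for m, p in p_values.items()}
    total = sum(weights.values())
    if total == 0:
        raise ValueError("all marker weights are zero")
    return sum(w for m, w in weights.items() if m in mutated) / total


# Hypothetical three-marker panel, listed from highest to lowest rank.
panel = {"ms_a": 1e-8, "ms_b": 1e-4, "ms_c": 1e-2}
high = weighted_msi_score(panel, {"ms_a"})          # one top-ranked mutation
low = weighted_msi_score(panel, {"ms_b", "ms_c"})   # two lower-ranked mutations
print(high > low)
```

In this sketch a single mutation in the top-ranked marker yields a higher score than two mutations in lower-ranked markers, which mirrors the statement that fewer mutations are needed when they fall in higher ranked microsatellites.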
Accordingly, in one aspect, described herein is a method of predicting whether a subject's cancer will respond to checkpoint inhibitor immunotherapy, the method comprising: (a) receiving microsatellite instability data for a defined set of microsatellite repeat marker sequences in cells of the subject's cancer; and (b) processing the microsatellite instability data to output a categorical measure of microsatellite instability high (MSI-H) or microsatellite stable (MSS); wherein when the subject's cancer is determined to exhibit MSI-H, it is predicted that the cancer will respond to checkpoint inhibitor immunotherapy, and when the subject's cancer is determined to exhibit MSS, it is predicted that the cancer is less likely to respond to checkpoint inhibitor immunotherapy.
In another aspect, described herein is a method of predicting whether a subject's cancer will respond to checkpoint inhibitor immunotherapy, the method comprising: determining microsatellite instability status for a defined set of microsatellite repeat marker sequences in cells of the subject's cancer, wherein determining the microsatellite instability status of the subject's cancer comprises: (i) assaying loci in the defined set of microsatellite repeat marker sequences in cells of the subject's cancer for mutation, thereby generating microsatellite instability data; and (ii) processing the microsatellite instability data to output a categorical measure of microsatellite instability high (MSI-H) status or microsatellite stable (MSS) status for the subject's cancer; wherein when the subject's cancer is determined to exhibit MSI-H, it is predicted that the cancer will respond to checkpoint inhibitor immunotherapy, and when the subject's cancer is determined to exhibit MSS, it is predicted that the cancer is less likely to respond to checkpoint inhibitor immunotherapy.
In another aspect, described herein is a method of determining microsatellite instability (MSI) in a subject's cancer, the method comprising: assaying mutation status for a set of microsatellite repeat marker sequences in cells of the subject's cancer, wherein when the set of microsatellites exhibits mutation at or above a threshold number of microsatellites in the set, the subject's cancer is determined to exhibit high microsatellite instability (MSI-H), and when the set of microsatellites exhibits mutation below the threshold number of microsatellites in the set, the subject's cancer is determined to exhibit microsatellite stability (MSS); wherein the threshold for the set is determined using the percentage of mutated microsatellite markers in the set to calculate the area under the receiver operating characteristic (AUROC), and wherein the smallest number of markers in the set necessary to reach an AUROC value selected in the range of 0.6 to 0.99, inclusive, is the threshold.
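One possible reading of the AUROC-based selection described above can be sketched as follows. The sketch assumes a labeled reference cohort, markers already ordered by rank, and a per-sample score equal to the fraction of the top k markers that are mutated; the function names `auroc` and `smallest_panel` and the toy data are hypothetical.

```python
def auroc(scores, labels):
    """AUROC via the Mann-Whitney statistic; tied pairs count as one half."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one MSI-H and one MSS sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def smallest_panel(calls, labels, target=0.9):
    """Return the smallest number of top-ranked markers reaching the target AUROC.

    calls[i][j] is 1 if ranked marker j is mutated in sample i, else 0.
    labels[i] is True for a known MSI-H sample and False for a known MSS sample.
    Returns None if no panel size reaches the target.
    """
    if not 0.6 <= target <= 0.99:
        raise ValueError("target AUROC must lie in the range 0.6 to 0.99")
    for k in range(1, len(calls[0]) + 1):
        scores = [sum(row[:k]) / k for row in calls]  # fraction mutated in top k
        if auroc(scores, labels) >= target:
            return k
    return None


# Toy cohort: two MSI-H and two MSS samples over three ranked markers.
calls = [[1, 1, 1], [0, 1, 1], [0, 0, 0], [1, 0, 0]]
labels = [True, True, False, False]
print(smallest_panel(calls, labels, target=0.9))
```

The AUROC is computed directly from pairwise comparisons so that the sketch has no external dependencies; a library routine could be substituted in practice.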
In another aspect, described herein is a method of treating cancer in a subject in need thereof, the method comprising determining microsatellite instability status for cells of the subject's cancer by a method described herein, and, when the microsatellite instability status is determined to be MSI-H, administering a checkpoint inhibitor, or, when the microsatellite instability status is determined to be MSS, administering a non-checkpoint inhibitor cancer therapeutic.
In another aspect, described herein is a method of treating cancer in a subject in need thereof, wherein the subject's cancer has been determined to have a microsatellite instability status of MSI-H as determined by a method as described herein, the method comprising administering an effective amount of a checkpoint inhibitor to the subject.
In another aspect, described herein is a method of predicting whether a subject's cancer will respond to checkpoint inhibitor immunotherapy, the method comprising: assaying mutation status for a set of microsatellite repeat marker sequences in cells of the subject's cancer, wherein when mutations are present in at least a threshold number of markers in the set, the set provides at least 95% sensitivity and at least 95% specificity for predicting high microsatellite instability (MSI-H) for the subject's tumor.
In another aspect, described herein is a method of determining microsatellite instability (MSI) in a subject's cancer, the method comprising: assaying mutation status for a set of microsatellite repeat marker sequences in the cells of the subject's cancer, wherein when mutations are present in at least a threshold number of markers in the set, the set provides at least 95% sensitivity and at least 95% specificity for predicting high microsatellite instability (MSI-H) for the subject's tumor.
In another aspect, described herein is a method of treating cancer in a subject in need thereof, the method comprising: assaying mutation status for a set of microsatellite repeat marker sequences in cells of the subject's cancer, wherein when mutations are present in at least a threshold number of markers in the set, the set provides at least 95% sensitivity and at least 95% specificity for predicting high microsatellite instability (MSI-H) for the subject's tumor.
In another aspect, described herein is a use of a checkpoint inhibitor for the treatment of cancer, comprising: (a) receiving microsatellite instability data for a defined set of microsatellite repeat marker sequences in cells of the subject's cancer; and (b) processing the microsatellite instability data to output a categorical measure of microsatellite instability high (MSI-H) or microsatellite stable (MSS); wherein the checkpoint inhibitor is administered when the subject's cancer is determined to exhibit MSI-H.
In another aspect, described herein is a use of a checkpoint inhibitor for the treatment of cancer, comprising: determining microsatellite instability status for a defined set of microsatellite repeat marker sequences in cells of the subject's cancer, wherein determining the microsatellite instability status of the subject's cancer comprises: (i) assaying loci in the defined set of microsatellite repeat marker sequences in cells of the subject's cancer for mutation, thereby generating microsatellite instability data; and (ii) processing the microsatellite instability data to output a categorical measure of microsatellite instability high (MSI-H) status or microsatellite stable (MSS) status for the subject's cancer; wherein the checkpoint inhibitor is administered when the subject's cancer is determined to exhibit MSI-H.
In another aspect, described herein is a diagnostic kit for determining microsatellite instability in a cancer, the kit comprising reagents that permit the detection of microsatellite mutation in a set of microsatellite repeat marker sequences in the cancer.
In another aspect, described herein is an array for detecting microsatellite instability in a cancer, the array comprising nucleic acids that permit the detection of microsatellite mutation in a set of microsatellite repeat marker sequences in the cancer, wherein the nucleic acids are linked to a solid support.
In one embodiment of any of the aspects, the set of microsatellite repeat markers comprises a plurality of up to 800 of the microsatellites set out herein or in Appendix A of U.S. Provisional Application No. 63/044,029 filed Jun. 25, 2020, the contents of which are incorporated herein by reference in their entirety.
In some embodiments of any of the aspects, when the cancer is colon adenocarcinoma (COAD), the set of microsatellite repeat marker sequences comprises a plurality up to 500 of the microsatellites set out in Table 1. In some embodiments of any of the aspects, when the cancer is esophageal carcinoma (ESCA), the set of microsatellite repeat marker sequences comprises a plurality up to 503 of the microsatellites set out in Table 2. In some embodiments of any of the aspects, when the cancer is glioblastoma multiforme (GBM), the set of microsatellite repeat marker sequences comprises a plurality up to 500 of the microsatellites set out in Table 3. In some embodiments of any of the aspects, when the cancer is lung adenocarcinoma (LUAD), the set of microsatellite repeat marker sequences comprises a plurality up to 500 of the microsatellites set out in Table 4. In some embodiments of any of the aspects, when the cancer is lung squamous cell carcinoma (LUSC), the set of microsatellite repeat marker sequences comprises a plurality up to 500 of the microsatellites set out in Table 5. In some embodiments of any of the aspects, when the cancer is rectum adenocarcinoma (READ), the set of microsatellite repeat marker sequences comprises a plurality up to 501 of the microsatellites set out in Table 6. In some embodiments of any of the aspects, when the cancer is stomach adenocarcinoma (STAD), the set of microsatellite repeat marker sequences comprises a plurality up to 500 of the microsatellites set out in Table 7. In some embodiments of any of the aspects, when the cancer is brain lower grade glioma (LGG), the set of microsatellite repeat marker sequences comprises a plurality up to 266 of the microsatellites set out in Table 8. In some embodiments of any of the aspects, when the cancer is prostate adenocarcinoma (PRAD), the set of microsatellite repeat marker sequences comprises a plurality up to 500 of the microsatellites set out in Table 9. 
In some embodiments of any of the aspects, when the cancer is cervical squamous cell carcinoma (CESC), the set of microsatellite repeat marker sequences comprises a plurality up to 500 of the microsatellites set out in Table 10. In some embodiments of any of the aspects, when the cancer is lymphoid neoplasm diffuse large B-cell lymphoma (DLBC), the set of microsatellite repeat marker sequences comprises a plurality up to 212 of the microsatellites set out in Table 11. In some embodiments of any of the aspects, when the cancer is uterine corpus endometrial carcinoma (UCEC), the set of microsatellite repeat marker sequences comprises a plurality up to 500 of the microsatellites set out in Table 12. In some embodiments of any of the aspects, when the cancer is kidney renal clear cell carcinoma (KIRC), the set of microsatellite repeat marker sequences comprises a plurality up to 508 of the microsatellites set out in Table 13. In some embodiments of any of the aspects, when the cancer is breast invasive carcinoma (BRCA), the set of microsatellite repeat marker sequences comprises a plurality up to 500 of the microsatellites set out in Table 14A. In some embodiments of any of the aspects, when the cancer is breast invasive carcinoma (BRCA), the set of microsatellite repeat marker sequences comprises a plurality up to 800 of the microsatellites set out in Tables 14A-14B. In some embodiments of any of the aspects, when the cancer is uterine corpus endometrial carcinoma (UCEC), colon adenocarcinoma (COAD), rectum adenocarcinoma (READ), or stomach adenocarcinoma (STAD), the set of microsatellite repeat marker sequences comprises a plurality up to 37 of the microsatellites set out in Table 21.
In another embodiment of any of the aspects, a cancer that exhibits greater than or equal to the threshold number of mutations for that tumor type in the set is determined to be MSI-H, and a cancer that exhibits fewer than the threshold number of mutations for that tumor type in the set is determined to be microsatellite stable or not MSI-H.
In some embodiments of any of the aspects, when the cancer is colon adenocarcinoma (COAD), the microsatellite repeat marker sequences comprise a plurality of microsatellite repeat markers in Table 1 and the threshold number is 50%. In some embodiments of any of the aspects, when the cancer is esophageal carcinoma (ESCA), the microsatellite repeat marker sequences comprise a plurality of microsatellite repeat markers in Table 2 and the threshold number is 50%. In some embodiments of any of the aspects, when the cancer is glioblastoma multiforme (GBM), the microsatellite repeat marker sequences comprise a plurality of microsatellite repeat markers in Table 3 and the threshold number is 50%. In some embodiments of any of the aspects, when the cancer is lung adenocarcinoma (LUAD), the microsatellite repeat marker sequences comprise a plurality of microsatellite repeat markers in Table 4 and the threshold number is 50%. In some embodiments of any of the aspects, when the cancer is lung squamous cell carcinoma (LUSC), the microsatellite repeat marker sequences comprise a plurality of microsatellite repeat markers in Table 5 and the threshold number is 50%. In some embodiments of any of the aspects, when the cancer is rectum adenocarcinoma (READ), the microsatellite repeat marker sequences comprise a plurality of microsatellite repeat markers in Table 6 and the threshold number is 50%. In some embodiments of any of the aspects, when the cancer is stomach adenocarcinoma (STAD), the microsatellite repeat marker sequences comprise a plurality of microsatellite repeat markers in Table 7 and the threshold number is 25%. In some embodiments of any of the aspects, when the cancer is brain lower grade glioma (LGG), the microsatellite repeat marker sequences comprise a plurality of microsatellite repeat markers in Table 8 and the threshold number is 12.5%. 
In some embodiments of any of the aspects, when the cancer is prostate adenocarcinoma (PRAD), the microsatellite repeat marker sequences comprise a plurality of microsatellite repeat markers in Table 9 and the threshold number is 16.7%. In some embodiments of any of the aspects, when the cancer is cervical squamous cell carcinoma (CESC), the microsatellite repeat marker sequences comprise a plurality of microsatellite repeat markers in Table 10 and the threshold number is 45.8%. In some embodiments of any of the aspects, when the cancer is lymphoid neoplasm diffuse large B-cell lymphoma (DLBC), the microsatellite repeat marker sequences comprise a plurality of microsatellite repeat markers in Table 11 and the threshold number is 25%. In some embodiments of any of the aspects, when the cancer is uterine corpus endometrial carcinoma (UCEC), the microsatellite repeat marker sequences comprise a plurality of microsatellite repeat markers in Table 12 and the threshold number is 12.9%. In some embodiments of any of the aspects, when the cancer is kidney renal clear cell carcinoma (KIRC), the microsatellite repeat marker sequences comprise a plurality of microsatellite repeat markers in Table 13 and the threshold number is 7.5%. In some embodiments of any of the aspects, when the cancer is breast invasive carcinoma (BRCA), the microsatellite repeat marker sequences comprise a plurality of microsatellite repeat markers in Table 14A and the threshold number is 3.6%. In some embodiments of any of the aspects, when the cancer is uterine corpus endometrial carcinoma (UCEC), colon adenocarcinoma (COAD), rectum adenocarcinoma (READ), or stomach adenocarcinoma (STAD), the microsatellite repeat marker sequences comprise a plurality of microsatellite repeat markers in Table 21 and the threshold number is at least 12.9% (e.g., 12.9%, 25%, 50%).
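The per-cancer-type thresholds recited above can be applied with a simple lookup, as in the following minimal sketch. The percentages are those stated above for Tables 1-14A; the function name `classify_msi` and the convention of passing counts of mutated and assayed markers are assumptions made for illustration.

```python
# Threshold percentages as recited above for the cancer-specific Tables.
THRESHOLD_PERCENT = {
    "COAD": 50.0, "ESCA": 50.0, "GBM": 50.0, "LUAD": 50.0, "LUSC": 50.0,
    "READ": 50.0, "STAD": 25.0, "LGG": 12.5, "PRAD": 16.7, "CESC": 45.8,
    "DLBC": 25.0, "UCEC": 12.9, "KIRC": 7.5, "BRCA": 3.6,
}


def classify_msi(cancer_type, n_mutated, n_assayed):
    """Return "MSI-H" if the percentage of mutated markers meets the threshold, else "MSS"."""
    if n_assayed <= 0:
        raise ValueError("at least one marker must be assayed")
    if not 0 <= n_mutated <= n_assayed:
        raise ValueError("n_mutated must be between 0 and n_assayed")
    percent = 100.0 * n_mutated / n_assayed
    return "MSI-H" if percent >= THRESHOLD_PERCENT[cancer_type] else "MSS"


print(classify_msi("COAD", 5, 10))  # 50% of markers mutated, threshold 50%
print(classify_msi("STAD", 2, 10))  # 20% of markers mutated, threshold 25%
```

A cancer type absent from the lookup raises a `KeyError` in this sketch, which makes an unsupported tumor type explicit rather than silently applying a default threshold.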
In another embodiment of any of the aspects, the quantitative model is selected from a continuous measure, regression or weighted scoring of markers, a combinatorial or decision-tree model, and a machine learning model. In another embodiment of any of the aspects, the quantitative model comprises a random-forest model or a deep neural network machine learning model.
In some embodiments of any of the aspects, information from individual markers can be combined to produce a final measure, non-limiting examples of which include a classification, score, probability, or prediction of MSI (e.g., stable, unstable, intermediate, indeterminate, or a continuous measure or probability of instability). Methods for analyzing information from individual markers to produce such a measure can include but are not limited to: additive scoring systems with or without use of weighting of individual markers (see e.g., Mehta et al. J Clin Epidemiol. 2016 November, 79:22-28; Austin et al. Stat Med. 2016, 35(22):4056-4072, published correction appears in Stat Med. 2018 Apr. 15, 37(8):1405; Avila et al. BMC Res Notes 8, 612 (2015)); traditional statistical methods such as linear, logistic, or Lasso regression (see e.g., Nick and Campbell (2007) Logistic Regression, in: Ambrosius W. T. (eds) Topics in Biostatistics, Methods in Molecular Biology™, vol 404, Humana Press; Eberly (2007) Multiple Linear Regression, in: Ambrosius W. T. (eds) Topics in Biostatistics, Methods in Molecular Biology™, vol 404, Humana Press; Tibshirani, Journal of the Royal Statistical Society, Series B (Methodological), vol. 58, no. 1, 1996, pp. 267-288); inverse probability-weighted estimation (see e.g., Curtis et al. Medical Care, vol. 45, no. 10, 2007, pp. S103-S107); Bayesian methods (see e.g., Eberly 2007, supra); classifiers such as support vector machines (see e.g., Cortes and Vapnik, Mach Learn 20, 273-297 (1995)); principal component analysis (see e.g., Lever et al. Nat Methods 14, 641-642 (2017)); hierarchical clustering (see e.g., Rokach et al. Data mining and knowledge discovery handbook, Springer US, 2005, 321-352); decision trees (see e.g., Safavian and Landgrebe, IEEE Transactions on Systems, Man, and Cybernetics, vol. 21, no. 3, pp. 660-674, May-June 1991); random forest models (see e.g., Breiman, Random Forests. Machine Learning 45, 5-32 (2001)); gradient boosting methods (see e.g., Hastie et al. (2009), “10. Boosting and Additive Trees,” The Elements of Statistical Learning (2nd ed.), ISBN 978-0-387-84857-0, New York: Springer, pp. 337-384; Kobayashi and Yoshida, Environ Res. 2021 May, 196:110363); or machine learning approaches such as deep neural networks (see e.g., LeCun et al. Nature 521, 436-444 (2015)). In some embodiments of any of the aspects, missing data for individual markers can be handled intrinsically by models capable of utilizing missing data or can be accounted for using supplementary approaches to account for missing data such as imputation (see e.g., Eberly 2007, supra). The contents of each of the abovementioned references are incorporated herein by reference in their entireties.
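By way of a non-limiting sketch, one simple way to tolerate missing marker data is to compute the fraction of mutated markers over evaluable markers only and to report an indeterminate result when too few markers are evaluable. This is an illustrative alternative to imputation rather than a method recited above; the function name `fraction_mutated` and the `min_evaluable` cutoff are assumptions.

```python
def fraction_mutated(calls, min_evaluable=3):
    """Fraction of evaluable markers that are mutated, or None if indeterminate.

    calls: dict mapping marker id -> True (mutated), False (stable), or None (no call).
    min_evaluable: hypothetical minimum number of markers with a call.
    """
    evaluable = [c for c in calls.values() if c is not None]
    if len(evaluable) < min_evaluable:
        return None  # too few markers with a call: report as indeterminate
    return sum(evaluable) / len(evaluable)


# Marker "b" has no call and is excluded from the denominator.
print(fraction_mutated({"a": True, "b": None, "c": False, "d": True}))
# Only one evaluable marker, so the result is indeterminate.
print(fraction_mutated({"a": True, "b": None}))
```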
In another embodiment of any of the aspects, the quantitative model evaluates or incorporates consideration of one or more test characteristics selected from the group consisting of sensitivity, accuracy, correlation, probability, specificity, false-positive rate, false negative rate, positive predictive value, negative predictive value and area under the receiver-operator characteristic (AUROC). In another embodiment of any of the aspects, the continuous measure comprises the proportion of unmutated or stable loci to mutated or unstable loci detected in the set.
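Several of the test characteristics listed above follow directly from the counts of true and false calls against a reference MSI determination. The following minimal sketch computes them from a confusion matrix; the function name `panel_characteristics` and the example counts are hypothetical.

```python
def panel_characteristics(tp, fp, tn, fn):
    """Compute test characteristics from confusion-matrix counts.

    tp/fn: MSI-H tumors called MSI-H / MSS; tn/fp: MSS tumors called MSS / MSI-H.
    A ratio with a zero denominator is returned as NaN.
    """
    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "false_positive_rate": ratio(fp, fp + tn),
        "false_negative_rate": ratio(fn, fn + tp),
        "positive_predictive_value": ratio(tp, tp + fp),
        "negative_predictive_value": ratio(tn, tn + fn),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }


# Hypothetical validation cohort of 100 tumors, 20 of them MSI-H by reference.
stats = panel_characteristics(tp=19, fp=2, tn=78, fn=1)
print(stats["sensitivity"], stats["specificity"])
```

With these illustrative counts the panel would meet a 95% sensitivity and 95% specificity criterion, since 19 of 20 MSI-H tumors and 78 of 80 MSS tumors are called correctly.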
In another embodiment of any of the aspects, thresholds for the one or more test characteristics indicative of MSI-H or MSS are defined within parameters of the known test characteristics for a given clinical application. In another embodiment of any of the aspects, the processing of MSI-H or MSS is determined by a threshold value for the set of microsatellite marker sequences. In another embodiment of any of the aspects, the threshold for the set of microsatellite marker sequences is determined using the percentage of mutated microsatellite markers in the set to calculate the area under the receiver operating characteristic (AUROC). In another embodiment of any of the aspects, the smallest number of markers in the set necessary to reach a selected AUROC value is the threshold. In another embodiment of any of the aspects, the selected AUROC value is a value between 0.6 and 0.99, inclusive. In another embodiment of any of the aspects, the selected AUROC value is 0.9 or more.
In another embodiment of any of the aspects, the set of microsatellite repeat marker sequences comprises a plurality up to 200 of the sequences set out in the respective Table for the subject's cancer type. In another embodiment of any of the aspects, the set of microsatellite repeat marker sequences comprises a plurality up to 300 of the sequences set out in the respective Table for the subject's cancer type. In another embodiment of any of the aspects, the set of microsatellite repeat marker sequences comprises a plurality up to the first 100 of the sequences set out in the respective Table for the subject's cancer type. In another embodiment of any of the aspects, the set of microsatellite repeat marker sequences comprises a plurality up to the first 200 of the sequences set out in the respective Table for the subject's cancer type. In another embodiment of any of the aspects, the set of microsatellite repeat marker sequences comprises a plurality up to the first 300 of the sequences set out in the respective Table for the subject's cancer type.
In another embodiment of any of the aspects, the checkpoint inhibitor immunotherapy comprises a checkpoint inhibitor antibody. In another embodiment of any of the aspects, the checkpoint inhibitor immunotherapy is an inhibitor of a checkpoint molecule selected from the group consisting of: PD-1 or PD-L1, CTLA4, Adenosine A2A receptor (A2AR), CD276, CD39, CD73, B7 family immune checkpoint molecules, V-set domain-containing T-cell activation inhibitor 1 (B7H4), B and T Lymphocyte Attenuator (BTLA), Indoleamine 2,3-dioxygenase (IDO), Killer-cell Immunoglobulin-like Receptor (KIR), Lymphocyte Activation Gene-3 (LAG3), nicotinamide adenine dinucleotide phosphate NADPH oxidase isoform 2 (NOX2), T-cell Immunoglobulin domain and Mucin domain 3 (TIM-3), T cell immunoreceptor with Ig and ITIM domains (TIGIT), V-domain Ig suppressor of T cell activation (VISTA), and Sialic acid-binding immunoglobulin-type lectin 7 (SIGLEC7). In another embodiment of any of the aspects, the checkpoint inhibitor is selected from the group consisting of pembrolizumab (Keytruda®), nivolumab (Opdivo®), cemiplimab (Libtayo®), spartalizumab, camrelizumab, sintilimab, tislelizumab, toripalimab, dostarlimab, INCMGA00012, AMP-224, AMP-514, atezolizumab (Tecentriq®), avelumab (Bavencio®), durvalumab (Imfinzi®), KN035, CK-301, AUNP12, CA-170, BMS-986189, and ipilimumab (Yervoy®).
In another embodiment of any of the aspects, the non-checkpoint inhibitor cancer therapy comprises one or more of angiostatin K1-3, DL-α-Difluoromethyl-ornithine, endostatin, fumagillin, genistein, minocycline, staurosporine, and (±)-thalidomide; a DNA intercalator/cross-linker, such as Bleomycin, Carboplatin, Carmustine, Chlorambucil, Cyclophosphamide, cis-Diammineplatinum(II) dichloride (Cisplatin), Melphalan, Mitoxantrone, and Oxaliplatin; a DNA synthesis inhibitor, such as (±)-Amethopterin (Methotrexate), 3-Amino-1,2,4-benzotriazine 1,4-dioxide, Aminopterin, Cytosine β-D-arabinofuranoside, 5-Fluoro-5′-deoxyuridine, 5-Fluorouracil, Ganciclovir, Hydroxyurea, and Mitomycin C; a DNA-RNA transcription regulator, such as Actinomycin D, Daunorubicin, Doxorubicin, Homoharringtonine, and Idarubicin; an enzyme inhibitor, such as S(+)-Camptothecin, Curcumin, (−)-Deguelin, 5,6-Dichlorobenzimidazole 1-β-D-ribofuranoside, Etoposide, Formestane, Fostriecin, Hispidin, 2-Imino-1-imidazolidineacetic acid (Cyclocreatine), Mevinolin, Trichostatin A, Tyrphostin AG 34, and Tyrphostin AG 879; a gene regulator, such as 5-Aza-2′-deoxycytidine, 5-Azacytidine, Cholecalciferol (Vitamin D3), 4-Hydroxytamoxifen, Melatonin, Mifepristone, Raloxifene, all trans-Retinal (Vitamin A aldehyde), Retinoic acid, all trans (Vitamin A acid), 9-cis-Retinoic Acid, 13-cis-Retinoic acid, Retinol (Vitamin A), Tamoxifen, and Troglitazone; a microtubule inhibitor, such as Colchicine, Dolastatin 15, Nocodazole, Paclitaxel, Podophyllotoxin, Rhizoxin, Vinblastine, Vincristine, Vindesine, and Vinorelbine (Navelbine); a neoantigen; and an unclassified antitumor agent, such as 17-(Allylamino)-17-demethoxygeldanamycin, 4-Amino-1,8-naphthalimide, Apigenin, Brefeldin A, Cimetidine, Dichloromethylene-diphosphonic acid, Leuprolide (Leuprorelin), Luteinizing Hormone-Releasing Hormone, Pifithrin-α, Rapamycin, Sex hormone-binding globulin, Thapsigargin, and Urinary trypsin inhibitor fragment (Bikunin), Vemurafenib (Zelboraf®), imatinib mesylate (Gleevec®), erlotinib (Tarceva®), gefitinib (Iressa®), Vismodegib (Erivedge™), 90Y-ibritumomab tiuxetan, regorafenib (Stivarga®), sunitinib (Sutent®), Denosumab (Xgeva®), sorafenib (Nexavar®), pazopanib (Votrient®), axitinib (Inlyta®), dasatinib (Sprycel®), nilotinib (Tasigna®), bosutinib (Bosulif®), ofatumumab (Arzerra®), obinutuzumab (Gazyva™), ibrutinib (Imbruvica™), idelalisib (Zydelig®), crizotinib (Xalkori®), erlotinib (Tarceva®), afatinib dimaleate (Gilotrif®), ceritinib (LDK378/Zykadia), ibritumomab tiuxetan (Zevalin®), brentuximab vedotin (Adcetris®), bortezomib (Velcade®), siltuximab (Sylvant™), trametinib (Mekinist®), dabrafenib (Tafinlar®), a targeted therapy such as toremifene (Fareston®), fulvestrant (Faslodex®), anastrozole (Arimidex®), exemestane (Aromasin®), letrozole (Femara®), ziv-aflibercept (Zaltrap®), Alitretinoin (Panretin®), temsirolimus (Torisel®), Tretinoin (Vesanoid®), denileukin diftitox (Ontak®), vorinostat (Zolinza®), romidepsin (Istodax®), bexarotene (Targretin®), pralatrexate (Folotyn®), lenalidomide (Revlimid®), belinostat (Beleodaq™), lenalidomide (Revlimid®), pomalidomide (Pomalyst®), Cabazitaxel (Jevtana®), enzalutamide (Xtandi®), abiraterone acetate (Zytiga®), radium 223 chloride (Xofigo®), or everolimus (Afinitor®), an epigenetic targeted drug such as HDAC inhibitors, azacitidine (Vidaza®), decitabine (Dacogen®), vorinostat (Zolinza®), romidepsin (Istodax®), ruxolitinib (Jakafi®), kinase inhibitors, DNA methyltransferase inhibitors, histone demethylase inhibitors, or histone methylation inhibitors, and any derivative thereof.
In another embodiment of any of the aspects, assaying loci in the defined set of microsatellite repeat marker sequences in cells of the subject's cancer for mutation comprises polynucleotide sequencing, measurement of total mutational burden, immunohistochemical analysis, and/or polymerase chain reaction. In another embodiment of any of the aspects, polynucleotide sequencing comprises whole genome sequencing, whole exome sequencing, or targeted gene capture sequencing.
In another embodiment of any of the aspects, microsatellite mutations correspond to those identified in reference human genome GRCh37/hg19 translated to a different build of the human reference genome.
In another embodiment of any of the aspects, the kit comprises an array of reagents that permit the detection of microsatellite mutation on a solid support. In another embodiment of any of the aspects, the reagents comprise PCR primers that permit amplification of members of the set of microsatellite repeat marker sequences.
The methods and compositions described herein relate, in part, to the discovery of microsatellite repeat marker sequences that predict whether a subject will respond to checkpoint inhibitor immunotherapy for each cancer type. The methods comprise assaying mutation status for a set of microsatellite repeat marker sequences in cells of the subject's cancer, wherein when the set of microsatellites exhibits mutation at or above a threshold number of microsatellites in the set, the subject's cancer is determined to exhibit high microsatellite instability (MSI-H). The microsatellite repeat marker sequences for each cancer type that are most predictive of a subject's response to checkpoint inhibitor immunotherapy are provided below, e.g., in Tables 1-15. In Tables 1-15, chromosome is abbreviated as “ch.” The chromosome sequences of GRCh37/hg19 (also referred to herein as hg37) can be found at the accession numbers indicated in Table 18 below.
In some embodiments of any of the aspects, a set of microsatellite repeat marker sequences are selected from the respective Table for the subject's cancer type (e.g., at least one of Tables 1-15 or Table 21). In some embodiments of any of the aspects, a set of microsatellite repeat marker sequences comprises a plurality up to 800 of the microsatellites set out in the respective Table for the subject's cancer type (e.g., at least one of Tables 1-15 or Table 21). In some embodiments of any of the aspects, a set of microsatellite repeat marker sequences comprises a plurality of at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 21, at least 22, at least 23, at least 24, at least 25, at least 26, at least 27, at least 28, at least 29, at least 30, at least 31, at least 32, at least 33, at least 34, at least 35, at least 36, at least 37, at least 38, at least 39, at least 40, at least 41, at least 42, at least 43, at least 44, at least 45, at least 46, at least 47, at least 48, at least 49, at least 50, at least 51, at least 52, at least 53, at least 54, at least 55, at least 56, at least 57, at least 58, at least 59, at least 60, at least 61, at least 62, at least 63, at least 64, at least 65, at least 66, at least 67, at least 68, at least 69, at least 70, at least 71, at least 72, at least 73, at least 74, at least 75, at least 76, at least 77, at least 78, at least 79, at least 80, at least 81, at least 82, at least 83, at least 84, at least 85, at least 86, at least 87, at least 88, at least 89, at least 90, at least 91, at least 92, at least 93, at least 94, at least 95, at least 96, at least 97, at least 98, at least 99, at least 100, at least 101, at least 102, at least 103, at least 104, at least 105, at least 106, at least 107, at least 108, at least 109, 
at least 110, at least 111, at least 112, at least 113, at least 114, at least 115, at least 116, at least 117, at least 118, at least 119, at least 120, at least 121, at least 122, at least 123, at least 124, at least 125, at least 126, at least 127, at least 128, at least 129, at least 130, at least 131, at least 132, at least 133, at least 134, at least 135, at least 136, at least 137, at least 138, at least 139, at least 140, at least 141, at least 142, at least 143, at least 144, at least 145, at least 146, at least 147, at least 148, at least 149, at least 150, at least 151, at least 152, at least 153, at least 154, at least 155, at least 156, at least 157, at least 158, at least 159, at least 160, at least 161, at least 162, at least 163, at least 164, at least 165, at least 166, at least 167, at least 168, at least 169, at least 170, at least 171, at least 172, at least 173, at least 174, at least 175, at least 176, at least 177, at least 178, at least 179, at least 180, at least 181, at least 182, at least 183, at least 184, at least 185, at least 186, at least 187, at least 188, at least 189, at least 190, at least 191, at least 192, at least 193, at least 194, at least 195, at least 196, at least 197, at least 198, at least 199, at least 200, at least 201, at least 202, at least 203, at least 204, at least 205, at least 206, at least 207, at least 208, at least 209, at least 210, at least 211, at least 212, at least 213, at least 214, at least 215, at least 216, at least 217, at least 218, at least 219, at least 220, at least 221, at least 222, at least 223, at least 224, at least 225, at least 226, at least 227, at least 228, at least 229, at least 230, at least 231, at least 232, at least 233, at least 234, at least 235, at least 236, at least 237, at least 238, at least 239, at least 240, at least 241, at least 242, at least 243, at least 244, at least 245, at least 246, at least 247, at least 248, at least 249, at least 250, at least 251, at least 
252, at least 253, at least 254, at least 255, at least 256, at least 257, at least 258, at least 259, at least 260, at least 261, at least 262, at least 263, at least 264, at least 265, at least 266, at least 267, at least 268, at least 269, at least 270, at least 271, at least 272, at least 273, at least 274, at least 275, at least 276, at least 277, at least 278, at least 279, at least 280, at least 281, at least 282, at least 283, at least 284, at least 285, at least 286, at least 287, at least 288, at least 289, at least 290, at least 291, at least 292, at least 293, at least 294, at least 295, at least 296, at least 297, at least 298, at least 299, at least 300, at least 301, at least 302, at least 303, at least 304, at least 305, at least 306, at least 307, at least 308, at least 309, at least 310, at least 311, at least 312, at least 313, at least 314, at least 315, at least 316, at least 317, at least 318, at least 319, at least 320, at least 321, at least 322, at least 323, at least 324, at least 325, at least 326, at least 327, at least 328, at least 329, at least 330, at least 331, at least 332, at least 333, at least 334, at least 335, at least 336, at least 337, at least 338, at least 339, at least 340, at least 341, at least 342, at least 343, at least 344, at least 345, at least 346, at least 347, at least 348, at least 349, at least 350, at least 351, at least 352, at least 353, at least 354, at least 355, at least 356, at least 357, at least 358, at least 359, at least 360, at least 361, at least 362, at least 363, at least 364, at least 365, at least 366, at least 367, at least 368, at least 369, at least 370, at least 371, at least 372, at least 373, at least 374, at least 375, at least 376, at least 377, at least 378, at least 379, at least 380, at least 381, at least 382, at least 383, at least 384, at least 385, at least 386, at least 387, at least 388, at least 389, at least 390, at least 391, at least 392, at least 393, at least 394, at 
least 395, at least 396, at least 397, at least 398, at least 399, at least 400, at least 401, at least 402, at least 403, at least 404, at least 405, at least 406, at least 407, at least 408, at least 409, at least 410, at least 411, at least 412, at least 413, at least 414, at least 415, at least 416, at least 417, at least 418, at least 419, at least 420, at least 421, at least 422, at least 423, at least 424, at least 425, at least 426, at least 427, at least 428, at least 429, at least 430, at least 431, at least 432, at least 433, at least 434, at least 435, at least 436, at least 437, at least 438, at least 439, at least 440, at least 441, at least 442, at least 443, at least 444, at least 445, at least 446, at least 447, at least 448, at least 449, at least 450, at least 451, at least 452, at least 453, at least 454, at least 455, at least 456, at least 457, at least 458, at least 459, at least 460, at least 461, at least 462, at least 463, at least 464, at least 465, at least 466, at least 467, at least 468, at least 469, at least 470, at least 471, at least 472, at least 473, at least 474, at least 475, at least 476, at least 477, at least 478, at least 479, at least 480, at least 481, at least 482, at least 483, at least 484, at least 485, at least 486, at least 487, at least 488, at least 489, at least 490, at least 491, at least 492, at least 493, at least 494, at least 495, at least 496, at least 497, at least 498, at least 499, at least 500, at least 501, at least 502, at least 503, at least 504, at least 505, at least 506, at least 507, at least 508, at least 509, at least 510, at least 515, at least 520, at least 525, at least 530, at least 535, at least 540, at least 545, at least 550, at least 555, at least 560, at least 565, at least 570, at least 575, at least 580, at least 585, at least 590, at least 595, at least 600, at least 605, at least 610, at least 615, at least 620, at least 625, at least 630, at least 635, at least 640, at least
645, at least 650, at least 655, at least 660, at least 665, at least 670, at least 675, at least 680, at least 685, at least 690, at least 695, at least 700, at least 705, at least 710, at least 715, at least 720, at least 725, at least 730, at least 735, at least 740, at least 745, at least 750, at least 755, at least 760, at least 765, at least 770, at least 775, at least 780, at least 785, at least 790, at least 795, or at least 800 or more of the microsatellites set out in the respective Table for the subject's cancer type (e.g., at least one of Tables 1-15 or Table 21).
In some embodiments of any of the aspects, a set of microsatellite repeat marker sequences comprises a plurality of at most 1, at most 2, at most 3, at most 4, at most 5, at most 6, at most 7, at most 8, at most 9, at most 10, at most 11, at most 12, at most 13, at most 14, at most 15, at most 16, at most 17, at most 18, at most 19, at most 20, at most 21, at most 22, at most 23, at most 24, at most 25, at most 26, at most 27, at most 28, at most 29, at most 30, at most 31, at most 32, at most 33, at most 34, at most 35, at most 36, at most 37, at most 38, at most 39, at most 40, at most 41, at most 42, at most 43, at most 44, at most 45, at most 46, at most 47, at most 48, at most 49, at most 50, at most 51, at most 52, at most 53, at most 54, at most 55, at most 56, at most 57, at most 58, at most 59, at most 60, at most 61, at most 62, at most 63, at most 64, at most 65, at most 66, at most 67, at most 68, at most 69, at most 70, at most 71, at most 72, at most 73, at most 74, at most 75, at most 76, at most 77, at most 78, at most 79, at most 80, at most 81, at most 82, at most 83, at most 84, at most 85, at most 86, at most 87, at most 88, at most 89, at most 90, at most 91, at most 92, at most 93, at most 94, at most 95, at most 96, at most 97, at most 98, at most 99, at most 100, at most 101, at most 102, at most 103, at most 104, at most 105, at most 106, at most 107, at most 108, at most 109, at most 110, at most 111, at most 112, at most 113, at most 114, at most 115, at most 116, at most 117, at most 118, at most 119, at most 120, at most 121, at most 122, at most 123, at most 124, at most 125, at most 126, at most 127, at most 128, at most 129, at most 130, at most 131, at most 132, at most 133, at most 134, at most 135, at most 136, at most 137, at most 138, at most 139, at most 140, at most 141, at most 142, at most 143, at most 144, at most 145, at most 146, at most 147, at most 148, at most 149, at most 150, at most 151, at most 152, at most 153, at 
most 154, at most 155, at most 156, at most 157, at most 158, at most 159, at most 160, at most 161, at most 162, at most 163, at most 164, at most 165, at most 166, at most 167, at most 168, at most 169, at most 170, at most 171, at most 172, at most 173, at most 174, at most 175, at most 176, at most 177, at most 178, at most 179, at most 180, at most 181, at most 182, at most 183, at most 184, at most 185, at most 186, at most 187, at most 188, at most 189, at most 190, at most 191, at most 192, at most 193, at most 194, at most 195, at most 196, at most 197, at most 198, at most 199, at most 200, at most 201, at most 202, at most 203, at most 204, at most 205, at most 206, at most 207, at most 208, at most 209, at most 210, at most 211, at most 212, at most 213, at most 214, at most 215, at most 216, at most 217, at most 218, at most 219, at most 220, at most 221, at most 222, at most 223, at most 224, at most 225, at most 226, at most 227, at most 228, at most 229, at most 230, at most 231, at most 232, at most 233, at most 234, at most 235, at most 236, at most 237, at most 238, at most 239, at most 240, at most 241, at most 242, at most 243, at most 244, at most 245, at most 246, at most 247, at most 248, at most 249, at most 250, at most 251, at most 252, at most 253, at most 254, at most 255, at most 256, at most 257, at most 258, at most 259, at most 260, at most 261, at most 262, at most 263, at most 264, at most 265, at most 266, at most 267, at most 268, at most 269, at most 270, at most 271, at most 272, at most 273, at most 274, at most 275, at most 276, at most 277, at most 278, at most 279, at most 280, at most 281, at most 282, at most 283, at most 284, at most 285, at most 286, at most 287, at most 288, at most 289, at most 290, at most 291, at most 292, at most 293, at most 294, at most 295, at most 296, at most 297, at most 298, at most 299, at most 300, at most 301, at most 302, at most 303, at most 304, at most 305, at most 306, at most 307, 
at most 308, at most 309, at most 310, at most 311, at most 312, at most 313, at most 314, at most 315, at most 316, at most 317, at most 318, at most 319, at most 320, at most 321, at most 322, at most 323, at most 324, at most 325, at most 326, at most 327, at most 328, at most 329, at most 330, at most 331, at most 332, at most 333, at most 334, at most 335, at most 336, at most 337, at most 338, at most 339, at most 340, at most 341, at most 342, at most 343, at most 344, at most 345, at most 346, at most 347, at most 348, at most 349, at most 350, at most 351, at most 352, at most 353, at most 354, at most 355, at most 356, at most 357, at most 358, at most 359, at most 360, at most 361, at most 362, at most 363, at most 364, at most 365, at most 366, at most 367, at most 368, at most 369, at most 370, at most 371, at most 372, at most 373, at most 374, at most 375, at most 376, at most 377, at most 378, at most 379, at most 380, at most 381, at most 382, at most 383, at most 384, at most 385, at most 386, at most 387, at most 388, at most 389, at most 390, at most 391, at most 392, at most 393, at most 394, at most 395, at most 396, at most 397, at most 398, at most 399, at most 400, at most 401, at most 402, at most 403, at most 404, at most 405, at most 406, at most 407, at most 408, at most 409, at most 410, at most 411, at most 412, at most 413, at most 414, at most 415, at most 416, at most 417, at most 418, at most 419, at most 420, at most 421, at most 422, at most 423, at most 424, at most 425, at most 426, at most 427, at most 428, at most 429, at most 430, at most 431, at most 432, at most 433, at most 434, at most 435, at most 436, at most 437, at most 438, at most 439, at most 440, at most 441, at most 442, at most 443, at most 444, at most 445, at most 446, at most 447, at most 448, at most 449, at most 450, at most 451, at most 452, at most 453, at most 454, at most 455, at most 456, at most 457, at most 458, at most 459, at most 460, at most 
461, at most 462, at most 463, at most 464, at most 465, at most 466, at most 467, at most 468, at most 469, at most 470, at most 471, at most 472, at most 473, at most 474, at most 475, at most 476, at most 477, at most 478, at most 479, at most 480, at most 481, at most 482, at most 483, at most 484, at most 485, at most 486, at most 487, at most 488, at most 489, at most 490, at most 491, at most 492, at most 493, at most 494, at most 495, at most 496, at most 497, at most 498, at most 499, at most 500, at most 501, at most 502, at most 503, at most 504, at most 505, at most 506, at most 507, at most 508, at most 509, at most 510, at most 515, at most 520, at most 525, at most 530, at most 535, at most 540, at most 545, at most 550, at most 555, at most 560, at most 565, at most 570, at most 575, at most 580, at most 585, at most 590, at most 595, at most 600, at most 605, at most 610, at most 615, at most 620, at most 625, at most 630, at most 635, at most 640, at most 645, at most 650, at most 655, at most 660, at most 665, at most 670, at most 675, at most 680, at most 685, at most 690, at most 695, at most 700, at most 705, at most 710, at most 715, at most 720, at most 725, at most 730, at most 735, at most 740, at most 745, at most 750, at most 755, at most 760, at most 765, at most 770, at most 775, at most 780, at most 785, at most 790, at most 795, or at most 800 of the microsatellites set out in the respective Table for the subject's cancer type (e.g., at least one of Tables 1-15 or Table 21).
The microsatellite repeat marker sequences in Tables 1-15 are listed in rank order; i.e., they are ordered by significance, from the highest significance (i.e., lowest p-value) in the first data row, to the lowest significance (i.e., highest p-value) in the last data row. In some embodiments of any of the aspects, the set of microsatellite repeat marker sequences comprises a plurality up to the first 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, or 800 or more of the sequences set out in the respective Table for the subject's cancer type (e.g., at least one of Tables 1-15).
In some embodiments of any of the aspects, the set of microsatellite repeat marker sequences comprises a plurality comprising sequences, each with a p-value of no more than 1E-48, no more than 1E-47, no more than 1E-46, no more than 1E-45, no more than 1E-44, no more than 1E-43, no more than 1E-42, no more than 1E-41, no more than 1E-40, no more than 1E-39, no more than 1E-38, no more than 1E-37, no more than 1E-36, no more than 1E-35, no more than 1E-34, no more than 1E-33, no more than 1E-32, no more than 1E-31, no more than 1E-30, no more than 1E-29, no more than 1E-28, no more than 1E-27, no more than 1E-26, no more than 1E-25, no more than 1E-24, no more than 1E-23, no more than 1E-22, no more than 1E-21, no more than 1E-20, no more than 1E-19, no more than 1E-18, no more than 1E-17, no more than 1E-16, no more than 1E-15, no more than 1E-14, no more than 1E-13, no more than 1E-12, no more than 1E-11, no more than 1E-10, no more than 1E-9, no more than 1E-8, no more than 1E-7, no more than 1E-6, no more than 1E-5, no more than 1E-4, no more than 1E-3, no more than 0.01, no more than 0.02, no more than 0.03, no more than 0.04, or no more than 0.05, as set out in the respective Table for the subject's cancer type (e.g., at least one of Tables 1-15).
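By way of non-limiting illustration only, the selection of a marker set from a rank-ordered Table, either by taking the first N rows or by applying a p-value ceiling, can be sketched as follows. The `Marker` class, the `select_markers` function, the locus names, and the p-values shown are hypothetical placeholders introduced for this sketch and do not reproduce any entry of Tables 1-15.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Marker:
    locus: str       # hypothetical locus identifier, not taken from any Table
    p_value: float   # significance of association with MSI


def select_markers(ranked: List[Marker],
                   top_n: Optional[int] = None,
                   max_p: Optional[float] = None) -> List[Marker]:
    """Pick markers from a table already sorted by ascending p-value.

    top_n keeps only the first N rows; max_p keeps rows whose p-value
    is no more than the given ceiling. Either filter may be omitted.
    """
    selected = ranked if top_n is None else ranked[:top_n]
    if max_p is not None:
        selected = [m for m in selected if m.p_value <= max_p]
    return list(selected)


# Illustrative, made-up table in rank order (lowest p-value first).
table = [
    Marker("locus_A", 1e-40),
    Marker("locus_B", 1e-12),
    Marker("locus_C", 1e-4),
    Marker("locus_D", 0.03),
]

first_two = select_markers(table, top_n=2)      # first two rows by rank
stringent = select_markers(table, max_p=1e-20)  # p-value of no more than 1E-20
```

Because the input is assumed to be in rank order, the two filters can be combined, e.g., the first 100 rows restricted to those with a p-value of no more than 1E-10.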
Unless otherwise defined herein, scientific and technical terms used in connection with the present application shall have the meanings that are commonly understood by those of ordinary skill in the art to which this disclosure belongs. It should be understood that this invention is not limited to the particular methodology, protocols, and reagents, etc., described herein and as such can vary. The terminology used herein is for the purpose of describing particular embodiments only, and is not intended to limit the scope of the present invention, which is defined solely by the claims. Definitions of common terms in cell biology, immunology and molecular biology can be found in The Merck Manual of Diagnosis and Therapy, 20th Edition, published by Merck Sharp & Dohme Corp., 2018 (ISBN 0911910190, 978-0911910421); Robert S. Porter et al. (eds.), The Encyclopedia of Molecular Cell Biology and Molecular Medicine, published by Blackwell Science Ltd., 1999-2012 (ISBN 9783527600908); and Robert A. Meyers (ed.), Molecular Biology and Biotechnology: a Comprehensive Desk Reference, published by VCH Publishers, Inc., 1995 (ISBN 1-56081-569-8); Immunology by Werner Luttmann, published by Elsevier, 2006; Janeway's Immunobiology, Kenneth Murphy, Allan Mowat, Casey Weaver (eds.), W. W. Norton & Company, 2016 (ISBN 0815345054, 978-0815345053); Lewin's Genes XI, published by Jones & Bartlett Publishers, 2014 (ISBN-1449659055); Michael Richard Green and Joseph Sambrook, Molecular Cloning: A Laboratory Manual, 4th ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., USA (2012) (ISBN 1936113414); Davis et al., Basic Methods in Molecular Biology, Elsevier Science Publishing, Inc., New York, USA (2012) (ISBN 044460149X); Laboratory Methods in Enzymology: DNA, Jon Lorsch (ed.) Elsevier, 2013 (ISBN 0124199542); Current Protocols in Molecular Biology (CPMB), Frederick M. 
Ausubel (ed.), John Wiley and Sons, 2014 (ISBN 047150338X, 9780471503385); Current Protocols in Protein Science (CPPS), John E. Coligan (ed.), John Wiley and Sons, Inc., 2005; and Current Protocols in Immunology (CPI), John E. Coligan, Ada M. Kruisbeek, David H. Margulies, Ethan M. Shevach, Warren Strober (eds.), John Wiley and Sons, Inc., 2003 (ISBN 0471142735, 9780471142737), the contents of which are all incorporated by reference herein in their entireties.
For convenience, the meanings of some terms and phrases used in the specification, examples, and appended claims are provided below. Unless stated otherwise, or implicit from context, the following terms and phrases include the meanings provided below. The definitions are provided to aid in describing particular embodiments, and are not intended to limit the claimed invention, because the scope of the invention is limited only by the claims. Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. If there is an apparent discrepancy between the usage of a term in the art and its definition provided herein, the definition provided within the specification shall prevail.
An “agent” can be any chemical, entity or moiety, including without limitation synthetic and naturally-occurring proteinaceous and non-proteinaceous entities. In some embodiments, an agent is a nucleic acid, nucleic acid analog, protein, antibody, peptide, aptamer, oligomer of nucleic acids, amino acids, or carbohydrates including without limitation a protein, oligonucleotide, ribozyme, DNAzyme, glycoprotein, siRNA, lipoprotein, and/or a modification or combination thereof. In certain embodiments, agents are small molecule chemical moieties. For example, chemical moieties include unsubstituted or substituted alkyl, aromatic, or heterocyclyl moieties including macrolides, leptomycins and related natural products or analogues thereof. Compounds can be known to have a desired activity and/or property, or can be selected from a library of diverse compounds.
An agent can be a molecule from one or more chemical classes, e.g., organic molecules, which may include organometallic molecules, inorganic molecules, genetic sequences, etc. Agents may also be fusion proteins from one or more proteins, chimeric proteins (for example domain switching or homologous recombination of functionally significant regions of related or different molecules), synthetic proteins or other protein variations including substitutions, deletions, insertions and other variants.
The term “therapeutically effective amount” refers to an amount of a therapeutic as described herein, that is effective to treat a disease or disorder as the terms “treat” or “treatment” are defined herein. Amounts will vary depending on the specific disease or disorder, its state of progression, age, weight and gender of a subject, among other variables. Thus, it is not possible to specify an exact “effective amount”. However, for any given case, an appropriate “effective amount” can be determined by one of ordinary skill in the art using only routine experimentation.
As used herein, the term “checkpoint inhibitor immunotherapy” refers to any agent, small molecule, antibody, or the like that can reduce or inhibit the level or activity of an immune checkpoint molecule. Inhibition of an immune checkpoint molecule can promote an immune response, e.g., against cancer or a tumor which otherwise evades such response. Immune checkpoint molecules can include but are not limited to PD-1 or PD-L1, CTLA4, Adenosine A2A receptor (A2AR), CD276, CD39, CD73, B7 family immune checkpoint molecules, V-set domain-containing T-cell activation inhibitor 1 (B7H4), B and T Lymphocyte Attenuator (BTLA), Indoleamine 2,3-dioxygenase (IDO), Killer-cell Immunoglobulin-like Receptor (KIR), Lymphocyte Activation Gene-3 (LAG3), nicotinamide adenine dinucleotide phosphate (NADPH) oxidase isoform 2 (NOX2), T-cell Immunoglobulin domain and Mucin domain 3 (TIM-3), T cell immunoreceptor with Ig and ITIM domains (TIGIT), V-domain Ig suppressor of T cell activation (VISTA), Sialic acid-binding immunoglobulin-type lectin 7 (SIGLEC7), and those described, e.g., in Pardoll et al., Nature Reviews Cancer 12, 252-264 (2012), which is incorporated herein by reference in its entirety. Non-limiting examples of checkpoint inhibitors include pembrolizumab (Keytruda®), nivolumab (Opdivo®), cemiplimab (Libtayo®), spartalizumab, camrelizumab, sintilimab, tislelizumab, toripalimab, dostarlimab, INCMGA00012, AMP-224, AMP-514, atezolizumab (Tecentriq®), avelumab (Bavencio®), durvalumab (Imfinzi®), KN035, CK-301, AUNP12, CA-170, BMS-986189, and ipilimumab (Yervoy®).
As used herein, the term “non-checkpoint inhibitor cancer therapy” refers to any therapy or agent useful in treating cancer other than checkpoint inhibitor immunotherapy. One of skill in the art can readily identify a non-checkpoint inhibitor chemotherapeutic agent (e.g., see Physicians' Cancer Chemotherapy Drug Manual 2014, Edward Chu, Vincent T. DeVita Jr., Jones & Bartlett Learning; Principles of Cancer Therapy, Chapter 85 in Harrison's Principles of Internal Medicine, 18th edition; Therapeutic Targeting of Cancer Cells: Era of Molecularly Targeted Agents and Cancer Pharmacology, Chs. 28-29 in Abeloff's Clinical Oncology, 2013 Elsevier; Baltzer L, Berkery R (eds): Oncology Pocket Guide to Chemotherapy, 2nd ed. St. Louis, Mosby-Year Book, 1995; Fischer D S (ed): The Cancer Chemotherapy Handbook, 4th ed. St. Louis, Mosby-Year Book, 2003).
As used herein, the term “cancer” refers to a hyperproliferation of cells that exhibit a loss of normal cellular control that results in unregulated growth, lack of differentiation, local tissue invasion, and metastasis. Non-limiting examples of cancer types include human sarcomas and carcinomas, e.g., fibrosarcoma, myxosarcoma, liposarcoma, chondrosarcoma, osteogenic sarcoma, chordoma, angiosarcoma, endotheliosarcoma, lymphangiosarcoma, lymphangioendotheliosarcoma, synovioma, mesothelioma, Ewing's tumor, leiomyosarcoma, rhabdomyosarcoma, colon carcinoma, colorectal cancer, pancreatic cancer, breast cancer, ovarian cancer, prostate cancer, squamous cell carcinoma, basal cell carcinoma, adenocarcinoma, sweat gland carcinoma, sebaceous gland carcinoma, papillary carcinoma, papillary adenocarcinomas, cystadenocarcinoma, medullary carcinoma, bronchogenic carcinoma, renal cell carcinoma, hepatoma, bile duct carcinoma, liver cancer, choriocarcinoma, seminoma, embryonal carcinoma, Wilms' tumor, cervical cancer, bone cancer, brain tumor, testicular cancer, lung carcinoma, small cell lung carcinoma, bladder carcinoma, epithelial carcinoma, glioma, astrocytoma, medulloblastoma, craniopharyngioma, ependymoma, pinealoma, hemangioblastoma, acoustic neuroma, oligodendroglioma, meningioma, melanoma, neuroblastoma, retinoblastoma; leukemias, e.g., acute lymphocytic leukemia and acute myelocytic leukemia (myeloblastic, promyelocytic, myelomonocytic, monocytic and erythroleukemia); chronic leukemia (chronic myelocytic (granulocytic) leukemia and chronic lymphocytic leukemia); and polycythemia vera, lymphoma (Hodgkin's disease and non-Hodgkin's disease), multiple myeloma, Waldenstrom's macroglobulinemia, and heavy chain disease.
In some embodiments of any of the aspects, the cancer type is selected from the group consisting of: BLCA, Bladder Urothelial Carcinoma; BRCA, Breast invasive carcinoma; CESC, Cervical squamous cell carcinoma and endocervical adenocarcinoma; CHOL, Cholangiocarcinoma; COAD, Colon adenocarcinoma; DLBC, Lymphoid Neoplasm Diffuse Large B-cell Lymphoma; ESCA, Esophageal carcinoma; GBM, Glioblastoma multiforme; HNSC, Head and Neck squamous cell carcinoma; KICH, Kidney Chromophobe; KIRC, Kidney renal clear cell carcinoma; KIRP, Kidney renal papillary cell carcinoma; LAML, Acute Myeloid Leukemia; LGG, Brain Lower Grade Glioma; LIHC, Liver hepatocellular carcinoma; LUAD, Lung adenocarcinoma; LUSC, Lung squamous cell carcinoma; MESO, Mesothelioma; OV, Ovarian serous cystadenocarcinoma; PAAD, Pancreatic adenocarcinoma; PCPG, Pheochromocytoma and Paraganglioma; PRAD, Prostate adenocarcinoma; READ, Rectum adenocarcinoma; SARC, Sarcoma; SKCM, Skin Cutaneous Melanoma; STAD, Stomach adenocarcinoma; TGCT, Testicular Germ Cell Tumors; THCA, Thyroid carcinoma; THYM, Thymoma; UCEC, Uterine Corpus Endometrial Carcinoma; UCS, Uterine Carcinosarcoma; and UVM, Uveal Melanoma.
As used herein the term “microsatellite instability” or “MSI” refers to a molecular tumor phenotype that is indicative of genomic hypermutability, marked by spontaneous gains or losses of nucleotides from repetitive DNA tracts, resulting in new alleles of differing length.
As used herein, the term “microsatellite repeat sequence” refers to a repetitive nucleotide sequence of about 1-6 base pairs or more in length. The repeat sequences can vary in number of repeats, generally ranging from about 5 to about 60 repeats. The methods provided herein are based, in part, on microsatellite mutations relative to microsatellite repeat sequences in the reference human genome, GRCh37/hg19. A microsatellite repeat sequence can be identified, for example, using tools such as the microsatellite identification tool (MISA), found on the world wide web at <webblast.ipk-gatersleben.de/misa/>.
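As a non-limiting illustration of the definition above, the following minimal sketch scans a DNA string for tandem repeats having a repeat unit of 1-6 bases and at least 5 copies. It is not MISA and does not reproduce MISA's parameters or output; the function name `find_microsatellites`, its default thresholds, and the example sequence are assumptions made for this sketch only.

```python
import re
from typing import List, Tuple


def find_microsatellites(seq: str,
                         min_unit: int = 1,
                         max_unit: int = 6,
                         min_repeats: int = 5) -> List[Tuple[int, str, int]]:
    """Return (0-based start, repeat unit, copy number) for each tandem repeat.

    The lazy quantifier makes the regex try the shortest repeat unit first,
    so a run such as ACACACACAC is reported with unit AC rather than ACAC.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


# Made-up example: an A(8) mononucleotide run and a CA(6) dinucleotide run.
example = "GGT" + "A" * 8 + "CGT" + "CA" * 6 + "TTG"
hits = find_microsatellites(example)
```

In practice, coordinates would be reported relative to a reference assembly such as GRCh37/hg19 rather than to an isolated string; the sketch only shows the repeat-unit and copy-number logic.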
As used herein, the terms “treat,” “treatment,” “treating,” or “amelioration” refer to therapeutic treatments, wherein the object is to reverse, alleviate, ameliorate, inhibit, slow down or stop the progression or severity of a condition associated with a disease or disorder. The term “treating” includes reducing or alleviating at least one adverse effect or symptom of a disease or disorder. Treatment is generally “effective” if one or more symptoms or clinical markers are reduced. Alternatively, treatment is “effective” if the progression of a disease is reduced or halted. That is, “treatment” includes not just the improvement of symptoms or markers, but also a cessation of, or at least slowing of, progress or worsening of symptoms that would be expected in absence of treatment. Beneficial or desired clinical results include, but are not limited to, alleviation of one or more symptom(s), diminishment of extent of disease, stabilized (i.e., not worsening) state of disease, delay or slowing of disease progression, amelioration or palliation of the disease state, and remission (whether partial or total). The term “treatment” of a disease also includes providing relief from the symptoms or side-effects of the disease (including palliative treatment).
The terms “decrease”, “reduce”, “reduction”, or “inhibit” are all used herein to mean a decrease by a statistically significant amount. In some embodiments, “reduce,” “reduction”, “decrease” or “inhibit” means a decrease by at least 10% as compared to a reference level (e.g. the absence of a given treatment) and can include, for example, a decrease by at least about 10%, at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or more. As used herein, “reduction” or “inhibition” does not encompass complete inhibition or reduction as compared to a reference level. “Complete inhibition” is a 100% inhibition as compared to a reference level.
The terms “increased”, “increase”, “enhance”, or “activate” are all used herein to mean an increase by a statistically significant amount. In some embodiments, the terms “increased”, “increase”, “enhance”, or “activate” can mean an increase of at least 10% as compared to a reference level, for example an increase of at least about 20%, or at least about 30%, or at least about 40%, or at least about 50%, or at least about 60%, or at least about 70%, or at least about 80%, or at least about 90% or up to and including a 100% increase or any increase between 10-100% as compared to a reference level, or at least about a 2-fold, or at least about a 3-fold, or at least about a 4-fold, or at least about a 5-fold or at least about a 10-fold increase, or any increase between 2-fold and 10-fold or greater as compared to a reference level.
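The arithmetic behind the percentage and fold thresholds above can be illustrated with the following minimal sketch. The function names and the example values are assumptions made for illustration; the sketch checks only the size of a change relative to a reference level and does not itself test statistical significance.

```python
def percent_decrease(value: float, reference: float) -> float:
    """Decrease relative to the reference level, expressed in percent."""
    if reference <= 0:
        raise ValueError("reference level must be positive")
    return (reference - value) / reference * 100.0


def is_reduction(value: float, reference: float,
                 min_percent: float = 10.0) -> bool:
    """True for a decrease of at least min_percent but less than 100%.

    A 100% decrease is treated as complete inhibition and is excluded.
    """
    pct = percent_decrease(value, reference)
    return min_percent <= pct < 100.0


def fold_change(value: float, reference: float) -> float:
    """Ratio of the measured level to the reference level."""
    if reference <= 0:
        raise ValueError("reference level must be positive")
    return value / reference


# Made-up levels measured against a reference level of 100 units.
partial = is_reduction(80.0, 100.0)     # 20% decrease
too_small = is_reduction(95.0, 100.0)   # 5% decrease, below the 10% floor
complete = is_reduction(0.0, 100.0)     # 100% decrease, complete inhibition
tripled = fold_change(300.0, 100.0)     # 3-fold increase
```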
As used herein, a “subject” is a human or a non-human animal. Usually the non-human animal is a vertebrate such as a primate, rodent, domestic animal or game animal. Primates include chimpanzees, cynomolgus monkeys, spider monkeys, and macaques, e.g., Rhesus. Rodents include mice, rats, woodchucks, ferrets, rabbits and hamsters. Domestic and game animals include cows, horses, pigs, deer, bison, buffalo, feline species, e.g., domestic cat, canine species, e.g., dog, fox, wolf, avian species, e.g., chicken, emu, ostrich, and fish, e.g., trout, catfish and salmon. In some embodiments, the subject is a mammal, e.g., a primate, e.g., a human. The terms, “individual,” “patient” and “subject” are used interchangeably herein.
Preferably, the subject is a mammal. The mammal can be a human, non-human primate, mouse, rat, dog, cat, horse, or cow, but is not limited to these examples. Mammals other than humans can be advantageously used as subjects that represent animal models of diseases including diseases and disorders involving inappropriate immunosuppression. A subject can be male or female.
A subject can be one who has been previously diagnosed with or identified as suffering from or having a condition in need of treatment or one or more complications related to such a condition, and optionally, has already undergone treatment for the condition or the one or more complications related to the condition. Alternatively, a subject can also be one who has not been previously diagnosed as having the condition or one or more complications related to the condition. For example, a subject can be one who exhibits one or more risk factors for the condition or one or more complications related to the condition or a subject who does not exhibit risk factors.
As used herein, a “subject in need” of treatment for a particular condition can be a subject having that condition, diagnosed as having that condition, or at risk of developing that condition. In one embodiment, a subject in need either has the condition or has been diagnosed as having the condition.
As used herein, a “reference level” refers to the level of a given, e.g., biomarker or parameter useful as a gauge for an experimental or diagnostic measurement. In one embodiment, a reference level is the level of such marker or parameter in a normal, otherwise unaffected cell population or tissue (e.g., a biological sample obtained from a healthy subject, or a level of the marker or parameter from a sample obtained from the subject at a prior time point, e.g., a biological sample obtained from a patient prior to being diagnosed with a disease or disorder, or a biological sample from an individual that has not been contacted with a therapeutic composition). It is contemplated that one can also use a biomarker or parameter from an individual whose cancer does respond to checkpoint inhibitor immunotherapy as a reference. Microsatellites as described herein are compared to those in reference human genome build GRCh37/hg19.
The term “statistically significant” or “significantly” refers to statistical significance and generally means a two standard deviation (2SD) or greater difference.
As used herein the term “comprising” or “comprises” is used in reference to compositions, methods, and respective component(s) thereof, that are essential to the method or composition, yet open to the inclusion of unspecified elements, whether essential or not.
The singular terms “a,” “an,” and “the” include plural referents unless context clearly indicates otherwise. Similarly, the word “or” is intended to include “and” unless the context clearly indicates otherwise. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of this disclosure, suitable methods and materials are described below. The abbreviation, “e.g.” is derived from the Latin exempli gratia, and is used herein to indicate a non-limiting example. Thus, the abbreviation “e.g.” is synonymous with the term “for example.”
Groupings of alternative elements or embodiments of the invention disclosed herein are not to be construed as limitations. Each group member can be referred to and claimed individually or in any combination with other members of the group or other elements found herein. One or more members of a group can be included in, or deleted from, a group for reasons of convenience and/or patentability. When any such inclusion or deletion occurs, the specification is herein deemed to contain the group as modified thus fulfilling the written description of all Markush groups used in the appended claims.
Other terms are defined herein within the description of the various aspects of the invention.
All patents and other publications, including literature references, issued patents, published patent applications, and co-pending patent applications, cited throughout this application are expressly incorporated herein by reference for the purpose of describing and disclosing, for example, the methodologies described in such publications that might be used in connection with the technology described herein. These publications are provided solely for their disclosure prior to the filing date of the present application. Nothing in this regard should be construed as an admission that the inventors are not entitled to antedate such disclosure by virtue of prior invention or for any other reason. All statements as to the date or representation as to the contents of these documents are based on the information available to the applicants and do not constitute any admission as to the correctness of the dates or contents of these documents.
The description of embodiments of the disclosure is not intended to be exhaustive or to limit the disclosure to the precise form disclosed. While specific embodiments of, and examples for, the disclosure are described herein for illustrative purposes, various equivalent modifications are possible within the scope of the disclosure, as those skilled in the relevant art will recognize. For example, while method steps or functions are presented in a given order, alternative embodiments may perform functions in a different order, or functions may be performed substantially concurrently. The teachings of the disclosure provided herein can be applied to other procedures or methods as appropriate. The various embodiments described herein can be combined to provide further embodiments. Aspects of the disclosure can be modified, if necessary, to employ the compositions, functions and concepts of the above references and application to provide yet further embodiments of the disclosure. These and other changes can be made to the disclosure in light of the detailed description. All such modifications are intended to be included within the scope of the appended claims.
Specific elements of any of the foregoing embodiments can be combined or substituted for elements in other embodiments. Furthermore, while advantages associated with certain embodiments of the disclosure have been described in the context of these embodiments, other embodiments may also exhibit such advantages, and not all embodiments need necessarily exhibit such advantages to fall within the scope of the disclosure.
The technology described herein is further illustrated by the following examples which in no way should be construed as being further limiting.
Some embodiments of the technology described herein can be defined according to any of the following numbered paragraphs:
49. A method of treating cancer in a subject in need thereof, wherein the subject's cancer has been determined to have a microsatellite instability status of MSI-H as determined by the method of any one of paragraphs 1-42, the method comprising administering an effective amount of a checkpoint inhibitor to the subject.
Microsatellite instability (MSI) predicts oncological response to checkpoint blockade immunotherapies. Although microsatellite mutation is pathognomonic for the condition, loci have unequal diagnostic value for predicting MSI within and across cancer types.
METHODS SUMMARY: To better inform molecular diagnosis of MSI, 9,438 tumor-normal exome pairs and 901 whole genome sequence pairs from 33 different cancer types were examined, and genome-wide microsatellite instability events were cataloged. Using a statistical framework, microsatellite mutations were identified that were predictive of MSI within and across cancer types. The diagnostic accuracy of different subsets of maximally informative markers was estimated computationally using a dedicated validation set.
RESULTS SUMMARY: Twenty-five cancer types exhibited hypermutated states consistent with MSI. Recurrently mutated microsatellites associated with MSI were identifiable in 15 cancer types, but were largely specific to individual cancer types. Cancer-specific microsatellite panels of one to seven loci were needed to attain 95% diagnostic sensitivity and specificity for 11 cancer types, and in eight of the cancer types, 100% sensitivity and specificity were achieved. Breast cancer required 800 loci to achieve comparable performance, and recurrent microsatellite mutations supporting reliable MSI diagnosis were not identified in ovarian tumors. Features associated with informative microsatellites were cataloged.
CONCLUSION SUMMARY: Most microsatellites informative for MSI are specific to particular cancer types, requiring the use of tissue-specific loci for optimal diagnosis. Limited numbers of markers are needed to provide accurate MSI diagnosis in most tumor types, but it is challenging to diagnose breast and ovarian cancers using pre-defined microsatellite locus panels.
Microsatellite instability (MSI) is a molecular tumor phenotype that is indicative of genomic hypermutability, usually reflecting inactivation of the mismatch repair (MMR) system. MSI is marked by spontaneous gains or losses of nucleotides from repetitive DNA tracts, resulting in new alleles of differing length that serve as the basis for its clinical diagnosis. Although classically associated with colorectal and endometrial tumors, MSI has now been recognized in most cancer types with varying prevalence and is accompanied by a generally increased rate of mutations genome-wide. Testing for MSI has subsequently emerged as a pan-cancer biomarker of therapeutic response to PDL-1 and PD-1 immune checkpoint inhibitors, where the MSI positive (microsatellite high, or MSI-H) phenotype is believed to serve as an indicator of mutation-associated neoantigens that permit a more robust T lymphocyte response than for MSI negative (microsatellite stable, or MSS) cases.
Molecular diagnosis of MSI in clinical practice is most commonly achieved using multiplexed PCR of defined microsatellite loci, followed by capillary electrophoresis to qualitatively detect new alleles (MSI-PCR). Quantitative next-generation sequencing (NGS) methods have been developed to identify MSI by assessing overall microsatellite mutation frequency at repetitive loci that are either directly targeted or incidentally captured by targeted gene enrichment oncology panels. Nevertheless, both conventional and NGS approaches interrogate markedly limited subsets of the millions of microsatellite markers available in the human genome: only five loci are included in standard MSI-PCR, whereas dozens to hundreds of sites are examined by typical NGS approaches.
Studies have revealed tissue-specific signatures of microsatellite mutation, such that alterations in specific loci can occur with disparate frequencies in tumors from different tissues. Consequently, microsatellites that are diagnostic for MSI potentially have unequal prognostic value within and across tumor types, such that loci useful in one cancer type may not yield accurate diagnoses in others. Supporting this hypothesis, standard MSI-PCR markers were developed for use in colon cancers and can exhibit poor performance in other malignancies.
The choice of markers is therefore critical to maximize sensitivity and specificity of molecular MSI diagnosis, regardless of whether testing is performed by conventional or NGS methods. Nevertheless, little effort to date has focused on using systematic, genome-scale analysis to identify optimal microsatellite loci for diagnosing MSI. Here, tumor-normal pairs of exome and whole genome data from 33 different cancer types were systematically evaluated in order to ascertain the most informative microsatellites for predicting MSI.
Genomic microsatellite loci were identified as previously described, with some modifications; see e.g., Hause et al. Nat Med. 2016, 22:1342-50, the contents of which are incorporated herein by reference in their entirety. Briefly, microsatellites were defined in the human genome (GRCh37/hg19; see e.g., Table 18 for sequence references) as repeating subunits of 1-5 bp in length and comprising ≥5 repeats using MISA. Adjacent microsatellites within ten base pairs of each other were termed ‘complex’ (c*) single loci if comprised of tracts with different repeating subunit lengths or ‘compound’ (c) single loci for those having the same repeat length. This analysis defined 19,035,602 loci, of which the 18,882,838 present on autosomes and chromosome X were retained. Repeat features were annotated using ANNOVAR (24 Feb. 2014 release).
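The repeat definition above (subunits of 1-5 bp, at least five tandem copies) can be illustrated with a minimal sketch. This is not MISA and does not reproduce its output; the function name `find_microsatellites` and the restriction to perfect repeats are assumptions made for illustration only, and the merging of adjacent tracts into compound or complex loci is omitted.

```python
import re

MIN_REPEATS = 5   # at least five tandem copies, per the definition above
MAX_UNIT = 5      # repeating subunits of 1-5 bp

def find_microsatellites(seq):
    """Return (start, end, unit, n_repeats) for perfect tandem repeats.

    Coordinates are 0-based with an exclusive end. Hypothetical helper,
    not the MISA implementation.
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(1, MAX_UNIT + 1):
        # One copy of the unit followed by at least MIN_REPEATS - 1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, MIN_REPEATS - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip units that are themselves built from a shorter unit
            # (e.g. "AA" inside a poly-A tract), so each tract is reported once.
            redundant = any(
                unit == unit[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            )
            if redundant:
                continue
            hits.append((m.start(), m.end(), unit, (m.end() - m.start()) // unit_len))
    return sorted(hits)

example = "GGC" + "A" * 8 + "TTG" + "CA" * 6 + "G"
print(find_microsatellites(example))
```

In this toy sequence the sketch reports one mononucleotide tract and one dinucleotide tract; tracts with fewer than five copies are ignored.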
Sequence alignments of tumors and patient-matched normal specimens from exome and whole genome sequencing projects were obtained from TCGA Research Network (see e.g., cancergenome.nih.gov/). Alignments were standardized prior to analysis by converting alignments to FASTQ files using PICARD v1.98, re-aligning to GRCh37/hg19 using BWA-MEM v0.7.12, and indexing with SamTools v1.1.
For each tumor and normal specimen, the number of sequence reads supporting different tract lengths at each locus was quantified using mSINGS (Git Commit ID a7e9ea9).
To identify instability events, multinomial distributions of allele lengths for tumor at each locus were compared to the joint multinomial distribution of allele lengths across tumor and normal at the site by:
$$(X_{\text{wildtype\_allele}},\ X_{\text{alternative\_allele\_1}},\ \ldots,\ X_{\text{alternative\_allele\_}i}) \sim \mathrm{Mult}(n, p)$$
where n refers to the number of reads at a site and p refers to the proportion supporting each alternative allelic length. “Unstable” microsatellites (those evidencing somatic mutations) were defined as those with nominally significant differences (p<0.05) by likelihood ratio (G) tests without continuity correction. Rates for calling false positive instability at this heuristic threshold were estimated as <3% at all sites having ≥2 reads in tumor and ≥1 read in its paired normal by simulating and comparing two distributions of 1,000 normal sites with median observed multinomial distributions of allele lengths by:
$$\mathrm{Mult}(n = \text{read\_depth},\ p = (0.9_{\text{wildtype\_allele}},\ 0.1_{\text{alternative\_alleles}}))$$
To confidently define “stable” sites in tumor (those lacking somatic mutations), sites were simulated from empirically observed multinomial read distributions for highly unstable sites in cases having >30 read coverage in both tumor and normal by:
$$\mathrm{Mult}(n = 1000,\ p = (0.4_{\text{wildtype\_allele}},\ 0.6_{\text{alternative\_alleles}}))$$
Down-sampling analyses estimated that ≥18 reads per site yielded 95% power for identifying an unstable locus, providing strong evidence to conclude that a site was “stable” if no difference in allelic distribution was observed at that coverage. Sites with 5-18 reads in both tumor and paired normal but no indication of instability (p>0.05), and those covered by fewer than five reads in either tumor or normal samples were marked as “missing data”.
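The calling logic described above can be sketched as follows. This is an illustrative reading, not the pipeline used in the analysis: the G test is implemented here as a test of homogeneity on the 2 x k table of tumor and normal read counts per allele length, the coverage cutoffs are applied as "fewer than five reads is missing" and "at least 18 reads in both samples is stable", and the function names (`chi2_sf`, `g_test`, `call_site`) are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail chi-square probability (closed forms for integer df)."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        return math.exp(-y) * sum(y ** j / math.factorial(j) for j in range(df // 2))
    tail = math.erfc(math.sqrt(y))
    tail += math.exp(-y) * sum(y ** (j + 0.5) / math.gamma(j + 1.5) for j in range(df // 2))
    return tail

def g_test(tumor, normal):
    """Likelihood ratio (G) test, no continuity correction.

    tumor / normal: dict mapping allele length -> supporting read count.
    Expected counts come from the pooled (joint) allele proportions.
    """
    lengths = [a for a in sorted(set(tumor) | set(normal))
               if tumor.get(a, 0) + normal.get(a, 0) > 0]
    rows = [[sample.get(a, 0) for a in lengths] for sample in (tumor, normal)]
    col = [rows[0][j] + rows[1][j] for j in range(len(lengths))]
    total = sum(col)
    g = 0.0
    for row in rows:
        n_row = sum(row)
        for obs, c in zip(row, col):
            if obs > 0:
                g += 2.0 * obs * math.log(obs * total / (n_row * c))
    df = len(lengths) - 1
    return g, (chi2_sf(g, df) if df > 0 else 1.0)

def call_site(tumor, normal, alpha=0.05, min_reads=5, stable_reads=18):
    """Label a locus 'unstable', 'stable', or 'missing' (assumed cutoffs)."""
    t_n, n_n = sum(tumor.values()), sum(normal.values())
    if t_n < min_reads or n_n < min_reads:
        return "missing"
    _, p = g_test(tumor, normal)
    if p < alpha:
        return "unstable"
    if t_n >= stable_reads and n_n >= stable_reads:
        return "stable"
    return "missing"

print(call_site({20: 12, 19: 10, 18: 8}, {20: 29, 19: 1}))
```

A site with adequate coverage in both samples and no detectable shift in allele lengths is called stable, whereas the same pattern at lower coverage is treated as missing data because the test lacks power there.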
As a quality control measure, samples having ≥75% missing data were excluded, leaving 9,438 tumor-normal exome pairs and 901 whole genome pairs for subsequent analysis. Similarly, individual loci were excluded for which ≥75% of specimens evidenced missing data.
The overall frequency of microsatellite mutations was quantified for each tumor as the fraction of unstable sites over total callable sites, and given the skewed nature of the data, log10 transformation was performed. For each cancer type, a Gaussian mixture model was then fit to these values using MCLUST v5.4.5 with one or two mixture components with equal variance. If the two-component model could be validly applied to a cancer type, individual tumors were classified as MSS (lower mode), MSI-H (higher mode), or indeterminate (uncertainty value >0.1). If distributions were instead consistent with a single component, all tumors of that cancer type were classified as MSS.
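The classification step can be sketched with a small expectation-maximization fit of a two-component, equal-variance Gaussian mixture on log10-transformed mutation fractions. This is a simplified stand-in for MCLUST, not a reimplementation: the quartile initialization, the fixed iteration count, and the function names are assumptions, the choice between one and two components is omitted, and fractions are assumed to be strictly positive. Uncertainty is taken as one minus the largest posterior probability.

```python
import math

def posteriors(x, w, mu, var):
    """Posterior probability of each of the two components for value x."""
    logs = [math.log(max(w[k], 1e-300)) - (x - mu[k]) ** 2 / (2 * var) for k in range(2)]
    top = max(logs)
    e = [math.exp(v - top) for v in logs]
    return e[0] / (e[0] + e[1]), e[1] / (e[0] + e[1])

def fit_two_component(values, n_iter=200):
    """EM for a 1-D mixture of two Gaussians sharing one variance."""
    xs = sorted(values)
    n = len(xs)
    mean = sum(xs) / n
    mu = [xs[n // 4], xs[(3 * n) // 4]]   # crude quartile start (assumption)
    var = max(sum((x - mean) ** 2 for x in xs) / n, 1e-6)
    w = [0.5, 0.5]
    for _ in range(n_iter):
        resp = [posteriors(x, w, mu, var) for x in values]        # E step
        nk = [sum(r[k] for r in resp) for k in range(2)]          # M step
        mu = [sum(r[k] * x for r, x in zip(resp, values)) / max(nk[k], 1e-12)
              for k in range(2)]
        var = max(sum(r[k] * (x - mu[k]) ** 2
                      for r, x in zip(resp, values) for k in range(2)) / n, 1e-6)
        w = [nk[k] / n for k in range(2)]
    return w, mu, var

def classify(fractions, max_uncertainty=0.1):
    """Label each tumor MSS, MSI-H, or indeterminate from its unstable fraction."""
    logs = [math.log10(f) for f in fractions]   # assumes every fraction > 0
    w, mu, var = fit_two_component(logs)
    high = 0 if mu[0] > mu[1] else 1            # component with the higher mode
    labels = []
    for x in logs:
        post = posteriors(x, w, mu, var)
        if 1 - max(post) > max_uncertainty:
            labels.append("indeterminate")
        else:
            labels.append("MSI-H" if post[high] > 0.5 else "MSS")
    return labels

fractions = [0.001, 0.0012, 0.0015, 0.002, 0.0009, 0.0011, 0.0018, 0.0013, 0.15, 0.2, 0.25]
print(classify(fractions))
```

Tumors whose posterior probability is split between the two modes are left unclassified rather than forced into either group.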
After excluding “indeterminate” MSI classifications, 80% of tumors in each cancer type were randomly assigned into training sets and 20% into validation sets. Using the training sets for each tissue type, microsatellites were identified that were most frequently mutated in MSI-H relative to MSS tumors by Fisher's exact test. The proportions of stable and unstable sites between MSS and MSI-H tumors were compared (excluding missing data), allowing loci to be rank-ordered by p value. For each tissue type, subsets of the top n most informative loci (ranging from 1 to 2,000 markers) were then selected and the percentage of unstable loci for each sample (excluding missing data from both numerator and denominator) was calculated. These values were used to calculate the area under the receiver operating characteristic (AUROC) using pROC v1.15.3. The optimal percentage on the receiver operating characteristic curve was identified by Youden's J statistic. This value was subsequently used as the threshold to assign MSS or MSI-H classification to each sample based on the fraction of mutated markers identified from the simulated panel and to determine sensitivity and specificity for each tissue type.
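The locus-ranking and thresholding steps can be sketched as below. Several details are assumptions rather than statements about the analysis: the Fisher's exact test is computed one-sided (enrichment of instability in MSI-H), candidate thresholds are taken at observed score values rather than at the midpoints pROC uses, and all function names are hypothetical.

```python
from math import comb

def fisher_greater(a, b, c, d):
    """One-sided Fisher's exact p value for the 2x2 table [[a, b], [c, d]].

    a = unstable in MSI-H, b = stable in MSI-H,
    c = unstable in MSS,   d = stable in MSS (missing data excluded).
    """
    r1, c1, n = a + b, a + c, a + b + c + d
    tail = sum(comb(r1, k) * comb(n - r1, c1 - k) for k in range(a, min(r1, c1) + 1))
    return tail / comb(n, c1)

def fraction_unstable(calls):
    """Fraction of unstable loci in a panel; None entries are missing data."""
    called = [c for c in calls if c is not None]
    if not called:
        return None
    return sum(c == "unstable" for c in called) / len(called)

def youden_threshold(scores, labels):
    """Return (threshold, sensitivity, specificity) maximizing J = sens + spec - 1.

    labels are True for MSI-H; a sample is called MSI-H when score >= threshold.
    """
    pos = [s for s, is_msi in zip(scores, labels) if is_msi]
    neg = [s for s, is_msi in zip(scores, labels) if not is_msi]
    best = None
    for t in sorted(set(scores)):
        sens = sum(s >= t for s in pos) / len(pos)
        spec = sum(s < t for s in neg) / len(neg)
        if best is None or sens + spec > best[1] + best[2]:
            best = (t, sens, spec)
    return best

scores = [0.0, 0.1, 0.2, 0.6, 0.8, 1.0]
labels = [False, False, False, True, True, True]
print(youden_threshold(scores, labels))
```

Loci would be ranked by ascending p value within each tissue type, and the threshold chosen on the training data would then be applied unchanged to the validation samples.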
To account for the possibility of more complex trans-genomic interactions between sites not captured by an additive model of instability, machine learning approaches were explored including random forest and boosted trees. Iterative reduction of marker features was performed using both Fisher exact test p values and Shapley feature importance values in the training dataset. However, these approaches did not meaningfully outperform the simpler, additive model.
The top 2,000 loci most strongly associated with MSI-H status for each tissue type were examined for cross-performance across tissue types by hierarchical clustering of normalized p values. Normalization was required to account for uneven sample sizes, and was accomplished by log10 transformation of raw p values, followed by scaling on a per-tissue basis from a range of zero (least significant) to one (most significant). A heatmap was generated using superheat v0.1.0, and pairwise and cophenetic distances between tissue types were subsequently calculated and compared using the base-R dist, cophenetic, and cor functions.
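The per-tissue normalization can be sketched as a negative log10 transform followed by min-max scaling. The function name is hypothetical, and p values are assumed to be strictly positive (a p value of exactly zero would need a floor before the transform).

```python
import math

def normalize_p_values(p_values):
    """Scale one tissue's raw p values to [0, 1].

    0 corresponds to the least significant locus and 1 to the most significant.
    """
    scores = [-math.log10(p) for p in p_values]
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]   # no spread: nothing to rank
    return [(s - lo) / (hi - lo) for s in scores]

print(normalize_p_values([1e-2, 1e-4, 1e-6]))
```

Scaling within each tissue type keeps a cancer type with many samples, and therefore very small p values, from dominating the clustering.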
Enrichment of particular locus features (annotated genic context of the repeat, repeat class, and number of repeat subunits in deciles) was examined among individual microsatellite associations with MSI-H status using linear regression. Intergenic annotation, pentanucleotide repeats, and a length of 5-11 repeats, respectively, were arbitrarily selected as reference levels for these analyses. Whole genome data were used for this analysis, as they more comprehensively represented feature annotations across coding and noncoding regions than exome data.
To ascertain the relative degree of genomic instability within and across cancer types, the overall frequency of microsatellite mutations was assessed within each of 33 cancer types using paired tumor-normal exome sequencing data. For each cancer type, Gaussian mixture modeling was used to circumscribe subpopulations of tumors having high burdens of microsatellite mutation, corresponding to MSI-H cases (see e.g.,
Twenty-five of 32 cancer types evidenced one or more hypermutated tumors consistent with an MSI-H phenotype, ranging in incidence from 0.2% to 40%, similar to other reports; see e.g., Hause et al. Nat Med. 2016, 22:1342-50; Cortes-Ciriano et al. Nat Commun. 2017, 8:15180; Bonneville et al. JCO Precision Oncology. 2017, 1-15; the contents of each of which are incorporated herein by reference in their entireties. The total fraction of mutated microsatellites in MSI-H tumors varied considerably across cancer types, with the greatest microsatellite mutation burdens occurring in stomach, colon, and endometrial tumors (see e.g.,
Parallel analyses were performed using whole genome data (see e.g.,
Next, microsatellite mutations most predictive of MSI were identified by cataloging events occurring significantly in MSI-H relative to MSS tumors of each cancer type. This analysis was restricted to cancers for which locus performance could be evaluated in both training and validation sets, permitting examination of 15 cancer types by exome data alone (see e.g., Supplemental Table 3 in Appendix A of U.S. Provisional Application No. 63/044,029 filed Jun. 25, 2020, the contents of which are incorporated herein by reference in their entirety).
Hierarchical clustering (see e.g.,
Predictive Value of Microsatellites for MSI Diagnosis within and Across Cancer Types
Subsets of one to 2,000 of the most highly informative loci identified per cancer type were used to computationally evaluate their performance for diagnosing MSI within their tumor type of origin, using an independent set of tumor samples for validation (see e.g.,
The number of markers needed to achieve MSI diagnosis was initially determined with at least 95% sensitivity and specificity (see e.g.,
The desired balance of sensitivity and specificity can vary by clinical application, and relates to the number of markers examined. Therefore, the predictive capacity of various numbers of markers was additionally determined as measured by the AUROC and the number of markers required to achieve an area under the curve (AUC) of 0.9 or greater (see e.g., Table 17). Although the number of markers required for most cancers by this metric remained similar, decreases for breast (from 800 to 50), kidney renal clear cell (65 to 55), endometrial (20 to 2), and stomach (2 to 1) were observed.
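As an illustration of this metric, the AUROC can be computed directly as the probability that a randomly chosen MSI-H sample scores higher than a randomly chosen MSS sample, with ties counted as one half. The sketch below is not pROC; the function names and the example AUC values are hypothetical.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for s, is_msi in zip(scores, labels) if is_msi]
    neg = [s for s, is_msi in zip(scores, labels) if not is_msi]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def min_markers_for_auc(auc_by_panel_size, target=0.9):
    """Smallest panel size whose AUC meets the target, or None if none does."""
    for size in sorted(auc_by_panel_size):
        if auc_by_panel_size[size] >= target:
            return size
    return None

print(auroc([0.1, 0.4, 0.35, 0.8], [False, False, True, True]))
print(min_markers_for_auc({1: 0.70, 2: 0.92, 5: 0.99}))   # made-up AUC values
```

Because the AUC summarizes performance over all thresholds, a panel can reach an AUC of 0.9 with fewer markers than are needed to reach 95% sensitivity and specificity at a single cutoff.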
A small number of loci were informative for MSI across multiple cancer types. Therefore, informative microsatellites were examined in endometrial, colon, rectal, and stomach cancers, which collectively showed closely related mutational patterns, in order to determine whether a common marker panel could be used to diagnose MSI in those tumors. After Bonferroni correction, 37 shared microsatellites were independently associated with MSI-H status in each of those four cancer types (see e.g., Table 21). The 37-marker panel demonstrated favorable performance characteristics for the four specific cancer types (0.98 AUC, sensitivity 94.3%, specificity 97.7%), but did not outperform respective tissue-type specific marker panels (see e.g., Table 17) and functioned poorly when applied to other cancers. For example, AUC was 0.45 when the panel was applied to lung squamous cell carcinoma and lung adenocarcinoma, compared with AUC 0.95 for a similarly sized panel specific to those cancer types.
The sequence composition and genic feature annotations that were enriched in microsatellites having globally high informativity were examined for MSI-H tumors (see e.g.,
Herein, genomic analyses were used to prioritize microsatellite markers that are most informative for diagnosing MSI by molecular methods. Whereas other work has cataloged loci that are frequently mutated in MSI-H tumors from the three or four cancer types where the phenotype occurs most often (see e.g., Hause et al., 2016, supra; Cortes-Ciriano et al., 2017, supra), herein microsatellite mutation occurrence was more broadly examined across cancer types and the performance of variously sized marker subsets was also evaluated for classifying MSI-H tumors in clinical practice.
Cancer types exhibit distinct patterns of microsatellite mutation overall, and herein it was found that microsatellites that are informative for diagnosing MSI also vary across cancer types (see e.g.,
Accordingly, in determining markers with the highest diagnostic utility (see e.g.,
The performance of variously sized subsets of the most informative loci per cancer type was evaluated in data sets withheld from those used to identify informative loci (see e.g.,
Although MSI status is currently an approved diagnostic marker to indicate eligibility for PDL-1 and PD-1 inhibitor treatments, alternative methods for testing tumor susceptibility to those immunotherapies are now available. The most widely used of these alternative approaches is tumor mutation burden (TMB), a biomarker based on estimating the total substitution and indel lesions present in a cancer genome. Nevertheless, TMB determination requires sequencing large gene panels and is of contested clinical utility. Because immunotherapy response is particularly associated with insertion-deletion mutational load, MSI is considered a more reliable positive predictor of treatment outcomes even though it is unable to identify all cancers for which a favorable response can be achieved. Although MSI and TMB determinations frequently overlap, they provide distinct information. Given these considerations, MSI and TMB can be considered complementary, and dedicated testing for MSI can continue to provide utility as an inexpensive, primary screening method for immunotherapy response.
The tissue-specific diagnostic performance of loci identified herein could not be directly compared to microsatellite loci included in clinical MSI assays due to both the unknown sensitivity of NGS relative to MSI-PCR for detecting microsatellite mutations and inadequate read depths for those markers in both exome and whole genome data. It is noteworthy that the most informative markers identified by these analyses do not overlap with those utilized in standard clinical assays for MSI.
Nonstandard abbreviations include the following: MSI, microsatellite instability; MMR, mismatch repair; MSI-H, microsatellite instability high; MSS, microsatellite stable; NGS, next-generation DNA sequencing; TMB, tumor mutation burden; AUC, area under the curve; AUROC, area under the receiver operating characteristic. Human gene abbreviations include at least the following: PD-1, programmed cell death 1; PDL-1, programmed cell death 1 ligand 1. All gene abbreviations herein (e.g., in Tables 1-15) are used as known in the art.
References include the following, the contents of each of which are incorporated herein by reference in their entireties:
For full versions of the Supplemental Tables 1-3 described in EXAMPLE 1 above, please see APPENDIX A of U.S. Provisional Application No. 63/044,029 filed Jun. 25, 2020, the contents of which are incorporated herein by reference in their entirety.
Please see APPENDIX B of U.S. Provisional Application No. 63/044,029 filed Jun. 25, 2020, the contents of which are incorporated herein by reference in their entirety, for a table of microsatellite sequences most significantly associated with MSI-H classification for each cancer type. The location of each microsatellite is listed according to hg19 coordinates; and the p value for significance is indicated for each specified cancer type. Blank cells indicate non-significant values. See also Tables 1-15 herein.
This application claims benefit under 35 U.S.C. § 119(e) of U.S. Provisional Application No. 63/044,029 filed Jun. 25, 2020, the contents of which are incorporated herein by reference in their entirety.
This invention was made with government support under Grant No. CA222344, awarded by the National Institutes of Health. The government has certain rights in the invention.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2021/038672 | 6/23/2021 | WO |
Number | Date | Country | |
---|---|---|---|
63044029 | Jun 2020 | US |