This document relates to methods and materials involved in assessing and/or treating a mammal having a cancer. For example, methods and materials provided herein can be used to determine the corrected tumor mutation burden (cTMB) of one or more cells (e.g., one or more cancer cells) from a mammal having cancer, thereby identifying the cancer as being likely to respond to a particular cancer treatment (e.g., a cancer immunotherapy). This document also provides methods and materials for treating a mammal identified as having a cancer likely to respond to a particular cancer treatment.
A high tumor mutation burden (TMB) has been associated with benefit from immune checkpoint blockade (ICB) across tumor types (Yarchoan et al., The New England J. Med. 377:2500-2501 (2017); and Samstein et al., Nature genetics, doi:10.1038/s41588-018-0312-8 (2019)). Despite the value of TMB in predicting response and survival to ICB, there are tumors with a high TMB that do not respond and conversely there are tumors with low TMB that benefit from immunotherapy. Moreover, tissue-based TMB estimates may be challenging in low tumor purity samples and in tumors with a higher intra-tumoral heterogeneity. These limitations are reflected in the current NCCN guidelines, where the use of TMB as a predictive biomarker is limited by lack of calibration and harmonization across multiple next-generation sequencing platforms. Furthermore, response to immunotherapy is orchestrated by immune-related pathways, with the antigen presentation machinery playing a major role as mutation-associated neo-antigens (MANAs) are presented on MHC-I molecules to CD8+ T cells and trigger an anti-tumor immune response that translates to clinical benefit. Genetic variation in the antigen presenting machinery, both at a germline as well as a somatic level may therefore modulate an effective anti-tumor immune response (Gettinger et al., Cancer discovery 7:1420-1435 (2017); and Chowell et al., Science 359:582-587 (2018)).
This document provides methods and materials for assessing and/or treating a mammal having a cancer. For example, methods and materials provided herein can be used to determine the cTMB of one or more cells (e.g., one or more cancer cells) from a mammal having cancer, thereby identifying the cancer as being likely to respond to a particular cancer treatment (e.g., a cancer immunotherapy). This document also provides methods and materials for treating a mammal identified as having a cancer likely to respond to a particular cancer treatment.
As demonstrated herein, TMB can be corrected for tumor purity to obtain a cTMB which can be used to more accurately predict a patient outcome for immune checkpoint blockade. Furthermore, cTMB can be combined with genomic alterations in receptor tyrosine kinase (RTK) genes, genome-wide mutational signatures, and HLA class I genetic variation to capture the multifaceted nature of the tumor-immune system crosstalk to more accurately predict a patient outcome for immune checkpoint blockade. For example, this document demonstrates that an analysis of whole exome sequence data from 3,788 TCGA tumor samples found a significant correlation between TMB and tumor purity, suggesting that samples with low tumor purity are likely to have inaccurate TMB estimates. Whole exome sequencing using tumor samples from a cohort of 104 non-small cell lung cancer patients treated with immune checkpoint blockade identified improved markers of response, which were validated in a second independent cohort of immunotherapy treated lung cancer patients.
Having the ability to more accurately predict whether a patient is likely to respond to a particular cancer treatment (e.g., a cancer immunotherapy) can allow clinicians to provide an individualized approach in selected cancer treatments, thereby improving disease-free survival and/or overall survival and/or minimizing subjecting patients to ineffective treatments. In addition, insights into new mechanisms of resistance to immune checkpoint blockade described herein can lay the groundwork for the identification of molecular markers of response to a particular cancer treatment.
In general, one aspect of this document features methods for treating mammals having cancer where the methods can include, or consist essentially of, identifying a sample from a mammal as having a mutation in an ARID1A nucleic acid sequence; and administering a cancer immunotherapy to the mammal under conditions where the number of cancer cells present within the mammal is reduced. The sample can include at least one cancer cell. The sample can be a tissue sample. The mammal can be a human. The cancer immunotherapy can be alemtuzumab, atezolizumab, avelumab, ipilimumab, ofatumumab, nivolumab, pembrolizumab, rituximab, or durvalumab. The mammal also can be administered an additional cancer treatment. The additional cancer treatment can be surgery, radiation therapy, administration of a chemotherapy, administration of a hormone therapy, administration of a targeted therapy, or administration of a cytotoxic therapy. The cancer can be a lung cancer (e.g., a non-small cell lung cancer, a lung squamous cell carcinoma, or a lung adenocarcinoma).
In another aspect, this document features methods for treating mammals having cancer where the methods can include, or consist essentially of, identifying a sample from the mammal as having a molecular smoking signature; and administering a cancer immunotherapy to the mammal under conditions wherein the number of cancer cells present within the mammal is reduced. The sample can include at least one cancer cell. The sample can be a tissue sample. The mammal can be a human. The cancer immunotherapy can be alemtuzumab, atezolizumab, avelumab, ipilimumab, ofatumumab, nivolumab, pembrolizumab, rituximab, or durvalumab. The mammal also can be administered an additional cancer treatment. The additional cancer treatment can be surgery, radiation therapy, administration of a chemotherapy, administration of a hormone therapy, administration of a targeted therapy, or administration of a cytotoxic therapy. The cancer can be a lung cancer (e.g., a non-small cell lung cancer, a lung squamous cell carcinoma, or a lung adenocarcinoma).
In another aspect, this document features methods for treating mammals having cancer where the methods can include, or consist essentially of, administering a cancer immunotherapy to a mammal identified as having at least one cancer cell having a mutation in an ARID1A nucleic acid sequence. The mammal can be a human. The cancer immunotherapy can be alemtuzumab, atezolizumab, avelumab, ipilimumab, ofatumumab, nivolumab, pembrolizumab, rituximab, or durvalumab. The mammal also can be administered an additional cancer treatment. The additional cancer treatment can be surgery, radiation therapy, administration of a chemotherapy, administration of a hormone therapy, administration of a targeted therapy, or administration of a cytotoxic therapy. The cancer can be a lung cancer (e.g., a non-small cell lung cancer, a lung squamous cell carcinoma, or a lung adenocarcinoma).
In another aspect, this document features methods for treating mammals having cancer where the methods can include, or consist essentially of, administering a cancer immunotherapy to a mammal identified as having at least one cancer cell with a molecular smoking signature. The mammal can be a human. The cancer immunotherapy can be alemtuzumab, atezolizumab, avelumab, ipilimumab, ofatumumab, nivolumab, pembrolizumab, rituximab, or durvalumab. The mammal also can be administered an additional cancer treatment. The additional cancer treatment can be surgery, radiation therapy, administration of a chemotherapy, administration of a hormone therapy, administration of a targeted therapy, or administration of a cytotoxic therapy. The cancer can be a lung cancer (e.g., a non-small cell lung cancer, a lung squamous cell carcinoma, or a lung adenocarcinoma).
In another aspect, this document features methods for treating mammals having cancer where the methods can include, or consist essentially of, identifying a sample from the mammal as having an activating mutation in EGFR nucleic acid, an activating mutation in ERBB2 nucleic acid, an activating mutation in MET nucleic acid, an activating mutation in FGFR1 nucleic acid, or an activating mutation in IGF1R nucleic acid; and administering a cancer treatment to the mammal under conditions where the number of cancer cells present within the mammal is reduced, and where the cancer treatment is not a cancer immunotherapy. The sample can include at least one cancer cell. The sample can be a tissue sample. The mammal can be a human. The cancer treatment can be surgery, radiation therapy, administration of a chemotherapy, administration of a hormone therapy, administration of a targeted therapy, or administration of a cytotoxic therapy. The cancer can be a lung cancer (e.g., a non-small cell lung cancer, a lung squamous cell carcinoma, or a lung adenocarcinoma).
In another aspect, this document features methods for treating mammals having cancer where the methods can include, or consist essentially of, identifying a sample from the mammal as having germline homozygosity or a loss of at least one HLA class I locus; and administering a cancer treatment to the mammal under conditions where the number of cancer cells present within the mammal is reduced, and where the cancer treatment is not a cancer immunotherapy. The sample can include at least one cancer cell. The sample can be a tissue sample. The mammal can be a human. The cancer treatment can be surgery, radiation therapy, administration of a chemotherapy, administration of a hormone therapy, administration of a targeted therapy, or administration of a cytotoxic therapy. The cancer can be a lung cancer (e.g., a non-small cell lung cancer, a lung squamous cell carcinoma, or a lung adenocarcinoma).
In another aspect, this document features methods for treating mammals having cancer where the methods can include, or consist essentially of, identifying a sample from the mammal as having a mutation in a KEAP1 nucleic acid sequence; and administering a cancer treatment to the mammal, where the cancer treatment is not a cancer immunotherapy. The sample can include at least one cancer cell. The sample can be a tissue sample. The mammal can be a human. The cancer treatment can be surgery, radiation therapy, administration of a chemotherapy, administration of a hormone therapy, administration of a targeted therapy, or administration of a cytotoxic therapy. The cancer can be a lung cancer (e.g., a non-small cell lung cancer, a lung squamous cell carcinoma, or a lung adenocarcinoma).
In another aspect, this document features methods for treating mammals having cancer where the methods can include, or consist essentially of, administering a cancer treatment to a mammal identified as having at least one cancer cell having an activating mutation in EGFR nucleic acid, an activating mutation in ERBB2 nucleic acid, an activating mutation in MET nucleic acid, an activating mutation in FGFR1 nucleic acid, or an activating mutation in IGF1R nucleic acid, where the cancer treatment is not a cancer immunotherapy. The mammal can be a human. The cancer treatment can be surgery, radiation therapy, administration of a chemotherapy, administration of a hormone therapy, administration of a targeted therapy, or administration of a cytotoxic therapy. The cancer can be a lung cancer (e.g., a non-small cell lung cancer, a lung squamous cell carcinoma, or a lung adenocarcinoma).
In another aspect, this document features methods for treating mammals having cancer where the methods can include, or consist essentially of, administering a cancer treatment to a mammal identified as having germline homozygosity or a loss of at least one HLA class I locus, where the cancer treatment is not a cancer immunotherapy. The mammal can be a human. The cancer treatment can be surgery, radiation therapy, administration of a chemotherapy, administration of a hormone therapy, administration of a targeted therapy, or administration of a cytotoxic therapy. The cancer can be a lung cancer (e.g., a non-small cell lung cancer, a lung squamous cell carcinoma, or a lung adenocarcinoma).
In another aspect, this document features methods for treating mammals having cancer where the methods can include, or consist essentially of, administering a cancer treatment to a mammal identified as having a mutation in a KEAP1 nucleic acid sequence, where the cancer treatment is not a cancer immunotherapy. The mammal can be a human. The cancer treatment can be surgery, radiation therapy, administration of a chemotherapy, administration of a hormone therapy, administration of a targeted therapy, or administration of a cytotoxic therapy. The cancer can be a lung cancer (e.g., a non-small cell lung cancer, a lung squamous cell carcinoma, or a lung adenocarcinoma).
In another aspect, this document features methods for identifying a mammal as having a cancer that is likely to respond to an immunotherapy. The methods can include, or consist essentially of, determining a cTMB of the cancer, determining a mutational signature of the cancer, and identifying the cancer as not being likely to respond to an immunotherapy when the mutational signature of the cancer includes i) an activating mutation in a nucleic acid encoding a receptor tyrosine kinase (RTK) polypeptide; and ii) germline homozygosity or a loss of at least one HLA class I locus. The nucleic acid encoding the RTK polypeptide can be an EGFR, ERBB2, MET, FGFR1, or IGF1R nucleic acid. Determining the cTMB of the cancer can include determining an observed TMB (obsTMB) of a sample including at least one cancer cell from the cancer, determining a tumor purity (α) of the sample, and adjusting the observed TMB based on the tumor purity using a correction factor (r) as set forth in Table 4 to determine the cTMB. The cTMB can be determined using the equation cTMB=r(α)*obsTMB. The cancer can be a lung cancer (e.g., a non-small cell lung cancer, a lung squamous cell carcinoma, or a lung adenocarcinoma).
In another aspect, this document features methods for identifying a mammal as having a cancer that is likely to respond to an immunotherapy. The methods can include, or consist essentially of, determining a cTMB of the cancer, determining a mutational signature of the cancer, and identifying the cancer as being likely to respond to the immunotherapy when the mutational signature of the cancer includes i) a mutation in an ARID1A nucleic acid sequence or a molecular smoking signature; and ii) germline heterozygosity at at least one HLA class I locus. The molecular smoking signature can include cytosine (C) to adenine (A) transversions (C>A transversions). Determining the cTMB of the cancer can include determining an observed TMB (obsTMB) of a sample including at least one cancer cell from the cancer, determining a tumor purity (α) of the sample, and adjusting the observed TMB based on the tumor purity using a correction factor (r) as set forth in Table 4 to determine the cTMB. The cTMB can be determined using the equation cTMB=r(α)*obsTMB. The cancer can be a lung cancer (e.g., a non-small cell lung cancer, a lung squamous cell carcinoma, or a lung adenocarcinoma).
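The two identification aspects above can be read as a pair of rules over the mutational signature. The following Python sketch encodes them under stated assumptions: the class name, field names, and the "undetermined" outcome for tumors that match both rules or neither rule are hypothetical and are not taken from this document. The cTMB, which both aspects also determine, is left out of the sketch because the text above does not state how it is combined with the mutational signature.

```python
from dataclasses import dataclass


@dataclass
class TumorProfile:
    """Hypothetical container for the features named in the two aspects."""
    rtk_activating_mutation: bool  # EGFR, ERBB2, MET, FGFR1, or IGF1R
    hla_homozygous_or_lost: bool   # germline homozygosity or loss, >= 1 HLA class I locus
    arid1a_mutation: bool
    smoking_signature: bool
    hla_heterozygous: bool         # germline heterozygosity, >= 1 HLA class I locus


def predict_response(profile):
    """Return 'likely', 'not likely', or 'undetermined' (an assumed fallback)."""
    unfavorable = (profile.rtk_activating_mutation
                   and profile.hla_homozygous_or_lost)
    favorable = ((profile.arid1a_mutation or profile.smoking_signature)
                 and profile.hla_heterozygous)
    if favorable and not unfavorable:
        return "likely"
    if unfavorable and not favorable:
        return "not likely"
    # Both rules or neither rule matched; the text gives no tie-break.
    return "undetermined"
```

Returning a third outcome rather than forcing a call keeps the sketch from asserting a precedence between the two rules that the text does not specify.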
In another aspect, this document features methods for determining a cTMB. The methods can include, or consist essentially of, determining an obsTMB of a sample including at least one cancer cell; determining a tumor purity (α) of the sample; and adjusting the observed TMB based on the tumor purity using a correction factor (r) as set forth in Table 4 to determine the cTMB. The cTMB can be determined using the equation cTMB=r(α)*obsTMB.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention pertains. Although methods and materials similar or equivalent to those described herein can be used to practice the invention, suitable methods and materials are described below. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting.
The details of one or more embodiments of the invention are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the invention will be apparent from the description and drawings, and from the claims.
This document provides methods and materials for assessing and/or treating a mammal having a cancer. For example, this document provides methods and materials for identifying a mammal having a cancer as being likely to be responsive to a particular cancer treatment (e.g., by detecting a cTMB of one or more cells such as cancer cells from the mammal), and, optionally, treating the mammal. In some cases, the methods and materials described herein can be used to predict response to a particular cancer treatment (e.g., a cancer immunotherapy). For example, a sample obtained from a mammal (e.g., a human) having a cancer can be assessed to determine if the mammal is likely to be responsive to a particular cancer treatment (e.g., a cancer immunotherapy) based, at least in part, on the cTMB of the sample and/or on a multivariable model including the cTMB, the presence of one or more mutations in one or more nucleic acid sequences encoding a RTK polypeptide, the ability to present one or more antigens (e.g., HLA germline variation), and/or the presence of a smoking-related mutational signature in the sample.
In some cases, the methods and materials described herein can be used to treat a mammal having a cancer. For example, a mammal having a cancer identified as being likely to be responsive to a particular cancer treatment based, at least in part, on the cTMB of the sample from the mammal, can be treated with that particular cancer treatment as described herein. In some cases, a mammal having a cancer identified as being likely to be responsive to a cancer immunotherapy based, at least in part, on the cTMB of the sample from the mammal, can be treated with a cancer immunotherapy as described herein. In some cases, the methods and materials described herein can be used to improve progression-free survival. In some cases, the methods and materials described herein can be used to improve disease-free (e.g., relapse-free) survival. In some cases, the methods and materials described herein can be used to improve overall survival.
When treating a mammal having a cancer as described herein, the treatment can be effective to treat the cancer (e.g., to reduce one or more symptoms of the cancer). In some cases, the number of cancer cells present within a mammal can be reduced using the materials and methods described herein. In some cases, the size (e.g., volume) of one or more tumors present within a mammal can be reduced using the materials and methods described herein. In some cases, the size (e.g., volume) of one or more tumors present within a mammal does not increase.
When treating a mammal having a cancer as described herein, the treatment can be effective to treat the cancer (e.g., to reduce one or more symptoms of the cancer) with reduced or eliminated complications associated with that treatment. For example, when the treatment is a cancer immunotherapy, the cancer immunotherapy can be administered to a mammal having cancer, and identified as being likely to be responsive to a cancer immunotherapy (e.g., by detecting a cTMB of one or more cells such as cancer cells from the mammal), with reduced or eliminated toxicity from the cancer immunotherapy. For example, when the treatment is a cancer immunotherapy, the cancer immunotherapy can be administered to a mammal having cancer, and identified as being likely to be responsive to a cancer immunotherapy (e.g., by detecting a cTMB of one or more cancer cells from the mammal), with reduced or eliminated infection from the cancer immunotherapy.
Any type of mammal having a cancer can be assessed and/or treated as described herein. Examples of mammals that can be assessed and/or treated as described herein include, without limitation, primates (e.g., humans and monkeys), dogs, cats, horses, cows, pigs, sheep, rabbits, mice, and rats. In some cases, a human having a cancer can be assessed to determine if the human is likely to be responsive to a particular cancer treatment based, at least in part, on the cTMB of the sample and, optionally, can be treated with that particular cancer treatment as described herein.
A mammal having any type of cancer (e.g., a cancer including one or more cancer cells) can be assessed and/or treated as described herein. In some cases, a cancer can include one or more tumors (e.g., one or more solid tumors). In some cases, a cancer can be a blood cancer. Examples of cancers that can be assessed and/or treated as described herein include, without limitation, lung cancers (e.g., non-small cell lung cancers such as lung squamous cell carcinoma and lung adenocarcinoma), breast cancers (e.g., breast carcinomas such as breast invasive carcinoma), prostate cancers, ovarian cancers, gastric cancers (e.g., gastroesophageal cancers), endometrial cancers, bladder cancers (e.g., bladder carcinomas such as bladder urothelial carcinoma), colon cancers (e.g., colon adenocarcinomas), brain cancers (e.g., glioblastomas), head and neck cancers (e.g., head and neck squamous cell carcinomas), kidney cancers (e.g., kidney clear cell carcinomas), and skin cancers (e.g., melanomas such as skin cutaneous melanoma).
In some cases, a mammal can be identified as having a cancer. Any appropriate method can be used to identify a mammal as having a cancer. For example, imaging techniques and biopsy techniques can be used to identify mammals (e.g., humans) as having cancer.
A mammal having a cancer can be assessed as described herein to determine whether or not it is likely to respond to a particular cancer treatment (e.g., a cancer immunotherapy). For example, a sample (e.g., a sample including one or more cancer cells) obtained from the mammal can be assessed for the cTMB as described herein, and the cTMB of one or more cancer cells from that mammal can be used to determine whether or not that mammal is likely to respond to a particular cancer treatment.
Any appropriate sample from a mammal (e.g., a human) having a cancer can be assessed as described herein. In some cases, a sample can be a biological sample. For example, a sample can be a tumor sample. In some cases, a tumor sample can contain at least a portion of a tumor. In some cases, a sample can contain one or more cancer cells. Examples of samples that can be assessed as described herein include, without limitation, tissue samples (e.g., colon tissue samples, rectum tissue samples, and skin tissue samples), stool samples, cellular samples (e.g., buccal samples), and fluid samples (e.g., blood, serum, plasma, urine, and saliva). A sample can be a fresh sample or a fixed sample. In some cases, a sample can be an embedded (e.g., paraffin embedded or OCT embedded) sample. In some cases, a sample can be processed (e.g., processed to isolate and/or extract one or more biological molecules such as nucleic acids and polypeptides).
In some cases, a cTMB of one or more cells (e.g., one or more cancer cells) from a mammal can be used to identify that mammal as being likely to be responsive to a cancer immunotherapy. As used herein a cTMB is a TMB that is adjusted for tumor purity. In some cases, a cTMB can include an increased number of mutations (e.g., as compared to a TMB that has not been corrected as described herein and/or as compared to a sample having low tumor purity). For example, a higher cTMB score can be used to identify that mammal as being likely to be responsive to a cancer immunotherapy. In some cases, a higher cTMB score can be a score that is within the top 20-30% of cTMB scores in a given cohort. For example, mammals having a cTMB score that is within the top 20-30% of cTMB scores in a given cohort can be identified as likely to be responsive to a cancer immunotherapy.
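As an illustration of the cohort-relative cutoff described above, the following Python sketch flags the samples whose cTMB score falls within the top fraction of a cohort. The function name, the default fraction of 0.25 (one value inside the 20-30% range), and the handling of tied scores are assumptions made for this example.

```python
def high_ctmb_flags(cohort_scores, top_fraction=0.25):
    """Flag scores that fall within the top `top_fraction` of the cohort."""
    if not cohort_scores:
        raise ValueError("cohort_scores must not be empty")
    if not 0.0 < top_fraction <= 1.0:
        raise ValueError("top_fraction must be in (0, 1]")
    ranked = sorted(cohort_scores, reverse=True)
    # Number of samples counted as "high"; always at least one.
    n_high = max(1, round(top_fraction * len(ranked)))
    cutoff = ranked[n_high - 1]
    # Scores tied with the cutoff are all flagged, so more than
    # n_high samples can be flagged when ties occur.
    return [score >= cutoff for score in cohort_scores]
```

Because the cutoff is derived from the cohort itself, the same cTMB score can be flagged in one cohort and not in another.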
Any appropriate method can be used to obtain a cTMB. For example, a TMB (e.g., an observed TMB (obsTMB)) of a sample (e.g., a sample obtained from a mammal) can be adjusted, based at least in part on the tumor purity of the sample, to obtain a cTMB. A TMB can be determined using any appropriate method. For example, whole exome sequencing and targeted next-generation sequencing can be used to determine a TMB. As used herein, “tumor purity” refers to the percentage of cells in a sample (e.g., a sample obtained from a mammal) that are cancer cells. The tumor purity of a sample can be obtained using any appropriate method. For example, whole exome sequencing, and/or targeted next-generation sequencing can be used to determine the tumor purity of a sample. In some cases, a cTMB can be corrected for tumor purity using correction factors for particular tumor purity values. Correction factors for particular tumor purity values can be as described in Table 4. For example, a cTMB can be determined using the equation
cTMB=r(α)*obsTMB
where r is the correction factor and α is the tumor purity. In some cases, a cTMB can be corrected for tumor purity as described in Example 1.
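A minimal Python sketch of this correction follows. It applies cTMB = r(α) × obsTMB using a lookup of correction factors keyed by tumor purity. The values in EXAMPLE_FACTORS are placeholders chosen for illustration only and are not the values of Table 4; the function names and the nearest-purity lookup are likewise assumptions.

```python
from bisect import bisect_left

# Placeholder (purity, correction factor) pairs, sorted by purity.
# These numbers are illustrative only; they are NOT the values of Table 4.
EXAMPLE_FACTORS = [(0.2, 2.0), (0.4, 1.5), (0.6, 1.2), (0.8, 1.1), (1.0, 1.0)]


def correction_factor(purity, factors=EXAMPLE_FACTORS):
    """Return r(alpha) for the tabulated purity closest to `purity`."""
    if not 0.0 < purity <= 1.0:
        raise ValueError("tumor purity must be in (0, 1]")
    purities = [p for p, _ in factors]
    i = bisect_left(purities, purity)
    # Candidates are the tabulated neighbors on either side of `purity`.
    candidates = factors[max(i - 1, 0):i + 1]
    return min(candidates, key=lambda pr: abs(pr[0] - purity))[1]


def corrected_tmb(obs_tmb, purity, factors=EXAMPLE_FACTORS):
    """Compute cTMB = r(alpha) * obsTMB."""
    return correction_factor(purity, factors) * obs_tmb
```

To use the sketch with real data, the placeholder list would be replaced by the purity and correction factor pairs from Table 4, passed through the `factors` argument.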
A cTMB can include any number of mutations. In some cases, the number of mutations found in a cell can be referred to as the mutational load of the cell. In some cases, a mutational signature can include from about 1 mutation to about several thousands of mutations. For example, a cTMB can include from about 5 mutations to about 100 mutations. In some cases, a cTMB can include at least about 20 mutations.
A cTMB can include any appropriate mutational signature (e.g., can include any mutations found in a cell, such as a cancer cell, from a mammal). As used herein a “mutational signature” is a characteristic combination of mutations. A mutational signature can include any appropriate types of mutations. In some cases, a mutation can be a somatic mutation. In some cases, a mutation can be an activating mutation. In some cases, a mutation can be a loss of function mutation (e.g., an inactivating mutation). Examples of types of mutations that can be included in a mutational signature can include, without limitation, substitutions such as transversions (e.g., point mutations such as C>A transversions), insertions (e.g., in-frame insertions or frameshift insertions), deletions (e.g., gene deletions such as in-frame deletions or frameshift deletions and/or chromosomal deletions), insertion/deletions (indels; e.g., in-frame indels or frameshift indels), and truncating mutations. A mutation that can be included in a mutational signature can be at any appropriate location within the genome of a cell (e.g., a cancer cell). In some cases, a mutation included in a mutational signature can be in a coding sequence (e.g., a nucleotide sequence that encodes a polypeptide). In some cases, a mutation included in a mutational signature can be in a non-coding sequence. In some cases, a mutation included in a mutational signature can be in a splice site. In some cases, a mutation included in a mutational signature can be in a regulatory region (e.g., a nucleotide sequence that controls expression of a polypeptide such as a promoter sequence or an enhancer sequence). When a mutation that can be included in a mutational signature is in a coding sequence (or a regulatory region that controls expression of that coding sequence), the mutation can be in any appropriate coding sequence.
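As one illustration of tallying a substitution type named above, the following Python sketch computes the fraction of single-base substitutions that are C>A transversions. It counts G>T changes as C>A because a C>A change on one strand appears as G>T on the complementary strand. The function names and the input format (a list of reference and alternate base pairs) are assumptions, and no threshold for calling a smoking signature is applied because the text does not define one.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def normalize(ref, alt):
    """Express a substitution with a pyrimidine (C or T) reference base."""
    if ref in ("A", "G"):
        return COMPLEMENT[ref], COMPLEMENT[alt]
    return ref, alt


def c_to_a_fraction(substitutions):
    """Fraction of single-base substitutions that are C>A (equivalently G>T)."""
    normalized = [normalize(ref, alt) for ref, alt in substitutions]
    if not normalized:
        return 0.0
    hits = sum(1 for pair in normalized if pair == ("C", "A"))
    return hits / len(normalized)
```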
In some cases, a mutation that can be included in a mutational signature can be in a coding sequence (or a regulatory region that controls expression of that coding sequence) that encodes a RTK polypeptide. In some cases, a mutation that can be included in a mutational signature can be in a coding sequence (or a regulatory region that controls expression of that coding sequence) that encodes a polypeptide involved in DNA damage repair (DDR). In some cases, a mutation that can be included in a mutational signature can be in a coding sequence (or a regulatory region that controls expression of that coding sequence) that encodes a polypeptide involved in the WNT-β-catenin pathway. In some cases, a mutation that can be included in a mutational signature can be in a coding sequence (or a regulatory region that controls expression of that coding sequence) that encodes a polypeptide involved in an immune-related pathway (e.g., the IFNγ pathway). In some cases, a mutation that can be included in a mutational signature can be in a coding sequence (or a regulatory region that can control expression of that coding sequence) that encodes a polypeptide involved in the PI3K-AKT-mTOR pathway. Examples of nucleic acids (coding sequences or regulatory regions that control expression of those coding sequences) that can include one or more mutations in a mutational signature can include, without limitation, EGFR, ERBB2, MET, FGFR1, IGF1R, ARID1A, KEAP1, JAK1, JAK2, KRAS, STK11, PTEN, MDM2, and MDM4 nucleic acids. In some cases, a mutation that can be included in a mutational signature and can be used to identify that mammal as being likely to be responsive to a cancer immunotherapy can be as described in Example 1. In some cases, a mutation that can be included in a mutational signature and can be used to identify that mammal as being likely to be responsive to a cancer immunotherapy can be as described in one or more examples, Tables and/or Figures herein.
Any appropriate method can be used to detect one or more mutations in the genome of a cell (e.g., a cancer cell). In some cases, one or more mutations can be detected in the genome of a cell using sequencing techniques (e.g., PCR-based sequencing such as Next-Generation PCR-based sequencing and Sanger sequencing), DNA hybridization techniques, and/or restriction enzyme digestion methods.
In some cases, the presence or absence of one or more mutations in one or more nucleic acid sequences encoding a RTK polypeptide (e.g., a RTK nucleic acid) can be used to identify that mammal as being likely to be responsive to a cancer immunotherapy. For example, detecting one or more mutations in one or more nucleic acid sequences encoding a RTK polypeptide in the genome of one or more cells (e.g., one or more cancer cells) from a mammal can be used to identify that mammal as being likely to be responsive to a cancer immunotherapy. A mutation included in a nucleic acid sequence encoding a RTK polypeptide can be a somatic mutation or a germline mutation. A mutation in a nucleic acid sequence encoding a RTK polypeptide can be an activating mutation or a loss of function mutation (e.g., an inactivating mutation). Examples of types of mutations that can be present in a nucleic acid sequence encoding a RTK polypeptide can include, without limitation, substitutions such as transversions (e.g., C>A transversions), insertions (e.g., in-frame insertions or frameshift insertions), deletions (e.g., in-frame deletions or frameshift deletions), insertion/deletions (indels; e.g., in-frame indels or frameshift indels), amplifications, and truncating mutations. Examples of nucleic acid sequences that can encode a RTK polypeptide can include, without limitation, EGFR, ERBB2, MET, FGFR1, and IGF1R nucleic acids.
For example, one or more point mutations in EGFR nucleic acid (e.g., point mutations in EGFR exon 21 such as L858R), one or more point mutations in ERBB2 nucleic acid (e.g., point mutations in ERBB2 exon 19 such as E770_A771insAYVM), one or more point mutations in MET nucleic acid, one or more point mutations in FGFR1 nucleic acid, and/or one or more point mutations in IGF1R nucleic acid; an amplification of FGFR1 nucleic acid and/or an amplification of IGF1R nucleic acid; both one or more point mutations in and an amplification of EGFR nucleic acid, both one or more point mutations in and an amplification of ERBB2 nucleic acid, and/or both one or more point mutations in and an amplification of MET nucleic acid; an in-frame deletion in EGFR nucleic acid (e.g., in-frame deletions in EGFR exon 19 such as 745KELREA>T, E746_A750del, and L747_T751del); an in-frame insertion in EGFR nucleic acid (e.g., in-frame insertions in EGFR exon 20 such as N771_H773dup); and/or an in-frame insertion in ERBB2 nucleic acid (e.g., in-frame insertions in ERBB2 exon 20 such as 776G>VC) can be used to identify that mammal as being likely to be responsive to a cancer immunotherapy. In some cases, a mutation in a nucleic acid sequence encoding a RTK polypeptide that can be used to identify that mammal as being likely to be responsive to a cancer immunotherapy can be as described in Example 1. In some cases, a mutation in a nucleic acid sequence encoding a RTK polypeptide that can be used to identify that mammal as being likely to be responsive to a cancer immunotherapy can be as described in Tables 3, 5, 6 and/or 7.
Any appropriate method can be used to detect one or more mutations in the genome of a cell (e.g., a cancer cell). In some cases, one or more mutations can be detected in the genome of a cell using sequencing techniques (e.g., PCR-based sequencing such as Next-Generation PCR-based sequencing and Sanger sequencing), DNA hybridization techniques, and/or restriction enzyme digestion methods.
In some cases, the ability of one or more cells (e.g., one or more cancer cells) from a mammal to present one or more antigens (e.g., one or more tumor antigens such as MANAs) can be used to identify that mammal as being likely to be responsive to a cancer immunotherapy. For example, detecting one or more mutations that can reduce the antigen presentation potential of one or more cells (e.g., one or more cancer cells) from a mammal can be used to identify that mammal as being likely to be responsive to a cancer immunotherapy. As used herein, a mutation that can reduce antigen presentation potential is a mutation in the genome of a cell (e.g., a cancer cell) that reduces the ability of that cell to present one or more antigens on its surface (e.g., as compared to a cell that does not have that particular mutation in its genome). In some cases, one or more mutations in nucleic acid encoding an antigen presenting polypeptide (e.g., MHC class I polypeptides) can reduce the ability of that cell to present one or more antigens on its surface. Any appropriate genomic event can reduce the antigen presentation potential of a cell (e.g., a cancer cell). Examples of genomic events that can reduce the antigen presentation potential of a cell (e.g., a cancer cell) can include, without limitation, a loss of homozygosity of an HLA locus. For example, a cancer cell whose genome has a homozygous loss of at least one HLA class I locus (e.g., a homozygous loss of HLA-B) can have a reduced antigen presentation potential. In some cases, a genomic event that can reduce the antigen presentation potential of a cell (e.g., a cancer cell) can be as described in Example 1. In some cases, a genomic event that can reduce the antigen presentation potential of a cell (e.g., a cancer cell) can be as described in Table 11.
Any appropriate method can be used to determine the ability of one or more cells (e.g., one or more cancer cells) from a mammal to present one or more antigens. In some cases, immunohistochemistry techniques, whole exome sequencing, targeted next generation sequencing, or expression analyses can be used to determine the ability of one or more cells from a mammal to present one or more antigens.
In some cases, the presence of a smoking-related mutational signature in one or more cells (e.g., one or more cancer cells) from a mammal can be used to identify that mammal as being likely to be responsive to a cancer immunotherapy. As used herein, a smoking-related mutational signature includes one or more (e.g., one, two, three, four, five, six, or more) mutations that are C>A transversions in the genome of a cell (e.g., a cancer cell) from a mammal. A smoking-related mutational signature can include one or more C>A transversions in any appropriate nucleic acid sequence within the genome of a cell. In some cases, a C>A transversion can be in a coding sequence (or a regulatory region that can control expression of that coding sequence). In some cases, a C>A transversion can be in a non-coding sequence. In some cases, a smoking-related mutational signature can be as described in Example 1.
Any appropriate method can be used to determine the presence or absence of a smoking-related mutational signature in one or more cells (e.g., one or more cancer cells) from a mammal. In some cases, the presence or absence of a C>A transversion can be detected using sequencing techniques (e.g., PCR-based sequencing such as Next-Generation PCR-based sequencing and Sanger sequencing), DNA hybridization techniques, and/or restriction enzyme digestion methods.
In some cases, a cTMB (and, optionally, the presence of one or more mutations in one or more nucleic acid sequences encoding a RTK polypeptide, the ability to present one or more antigens, and/or the presence of a smoking-related mutational signature) in one or more cells (e.g., one or more cancer cells) from a mammal can be used to determine whether or not that mammal is likely to respond to a particular cancer treatment (e.g., a cancer immunotherapy). For example, a cTMB including the presence of one or more particular mutations in one or more particular nucleic acid sequences encoding a RTK polypeptide, the ability to present one or more antigens, and/or the presence of a smoking-related mutational signature in one or more cells (e.g., one or more cancer cells) from a mammal can be used to determine whether or not that mammal is likely to respond to a cancer immunotherapy.
When a cTMB (and, optionally, the presence of one or more mutations in one or more nucleic acid sequences encoding a RTK polypeptide, the ability to present one or more antigens, and/or the presence of a smoking-related mutational signature) in one or more cells (e.g., one or more cancer cells) from a mammal can be used to determine that a cancer is likely to respond to a cancer immunotherapy, the cTMB can include any appropriate one or more mutations. For example, a cTMB and, optionally, the presence of one or more mutations in one or more nucleic acid sequences encoding a RTK polypeptide, the ability to present one or more antigens, and/or the presence of a smoking-related mutational signature can be used to determine that a cancer is likely to respond to a cancer immunotherapy. In some cases, a cTMB that can be used as described herein to determine that a cancer is likely to respond to a cancer immunotherapy can be a cTMB that includes one or more mutations in a nucleic acid that can encode ARID1A, one or more inactivating mutations in nucleic acid that can encode KEAP1, and/or one or more C>A transversions (e.g., a smoking-related mutational signature).
When a cTMB (and, optionally, the presence of one or more mutations in one or more nucleic acid sequences encoding a RTK polypeptide, the ability to present one or more antigens, and/or the presence of a smoking-related mutational signature) in one or more cells (e.g., one or more cancer cells) from a mammal can be used to determine that a cancer is not likely to respond to a cancer immunotherapy, the cTMB can include any appropriate one or more mutations. For example, a cTMB and, optionally, the presence of one or more mutations in one or more nucleic acid sequences encoding a RTK polypeptide, the ability to present one or more antigens, and/or the presence of a smoking-related mutational signature can be used to determine that a cancer is not likely to respond to a cancer immunotherapy. In some cases, a cTMB that can be used as described herein to determine that a cancer is not likely to respond to a cancer immunotherapy can be a cTMB that includes one or more activating mutations in nucleic acid that can encode EGFR, one or more activating mutations in nucleic acid that can encode ERBB2, one or more activating mutations in nucleic acid that can encode MET, one or more activating mutations in nucleic acid that can encode FGFR1, one or more activating mutations in nucleic acid that can encode IGF1R, one or more activating mutations in nucleic acid that can encode MDM2/MDM4, and/or a homozygous loss of at least one HLA class I locus. 
For example, a cTMB having a mutational signature that includes one or more activating point mutations in nucleic acid encoding EGFR, one or more activating point mutations in nucleic acid encoding ERBB2, amplification of nucleic acid encoding MET, amplification of nucleic acid encoding FGFR1, amplification of nucleic acid encoding IGF1R, one or more activating point mutations in nucleic acid encoding MDM2/MDM4, and homozygous loss of at least one HLA class I locus can be used to determine that a cancer is not likely to respond to a cancer immunotherapy.
A mammal (e.g., a human) having a cancer can be administered, or instructed to self-administer, any one or more (e.g., 1, 2, 3, 4, 5, 6, or more) cancer treatments. A cancer treatment can include any appropriate cancer treatment. In some cases, a cancer treatment can include surgery. In some cases, a cancer treatment can include radiation therapy. In some cases, a cancer treatment can include administration of a pharmacotherapy such as a chemotherapy, hormone therapy, targeted therapy, and/or cytotoxic therapy. Examples of cancer treatments include, without limitation, administration of one or more receptor tyrosine kinase inhibitors (e.g., erlotinib), administration of one or more PD1/PD-L1 inhibitors (e.g., nivolumab, pembrolizumab, atezolizumab, avelumab, and durvalumab), administration of one or more immunotherapies (e.g., alemtuzumab, ipilimumab, nivolumab, ofatumumab, and rituximab), administration of one or more platinum compounds (e.g., a cisplatin or carboplatin), administration of one or more taxanes (e.g., paclitaxel, docetaxel, or an albumin bound paclitaxel such as nab-paclitaxel), administration of altretamine, administration of capecitabine, administration of cyclophosphamide, administration of etoposide (vp-16), administration of gemcitabine, administration of ifosfamide, administration of irinotecan (cpt-11), administration of liposomal doxorubicin, administration of melphalan, administration of pemetrexed, administration of topotecan, administration of vinorelbine, administration of one or more luteinizing-hormone-releasing hormone (LHRH) agonists (such as goserelin and leuprolide), administration of one or more anti-estrogen therapies (such as tamoxifen), administration of one or more aromatase inhibitors (such as letrozole, anastrozole, and exemestane), administration of one or more angiogenesis inhibitors (such as bevacizumab), administration of one or more poly(ADP)-ribose polymerase (PARP) inhibitors (such as olaparib, rucaparib, and niraparib), administration of external beam radiation therapy, administration of brachytherapy, administration of radioactive phosphorus, and administration of any combinations thereof.
In cases where a mammal (e.g., a human) is identified as having a cancer that is likely to be responsive to a cancer immunotherapy based, at least in part, on the cTMB of the sample from the mammal, the mammal can be treated with one or more (e.g., 1, 2, 3, 4, 5, 6, or more) cancer immunotherapies. In some cases, a cancer immunotherapy can be a cellular immunotherapy (e.g., a dendritic cell therapy or a chimeric antigen receptor (CAR)-T cell therapy). In some cases, a cancer immunotherapy can be an antibody therapy (e.g., a monoclonal antibody therapy). In some cases, a cancer immunotherapy can be a cytokine therapy (e.g., interferon therapy or interleukin therapy). In some cases, a cancer immunotherapy can activate one or more cell death mechanisms (e.g., antibody-dependent cell-mediated cytotoxicity (ADCC) or the complement system). In some cases, a cancer immunotherapy can target one or more (e.g., 1, 2, 3, 4, 5, 6, or more) immune checkpoint molecules. An immune checkpoint molecule can be an inhibitory checkpoint molecule. Examples of immune checkpoint molecules that can be targeted by a cancer immunotherapy can include, without limitation, cytotoxic T-lymphocyte-associated protein 4 (CTLA4, also known as cluster of differentiation 152 (CD152)), programmed cell death protein 1 (PD-1, also known as cluster of differentiation 279 (CD279)), and programmed death-ligand 1 (PD-L1, also known as cluster of differentiation 274 (CD274) and B7 homolog 1 (B7-H1)). Examples of cancer immunotherapies that can be administered to a mammal identified as having a cancer that is likely to be responsive to a cancer immunotherapy based, at least in part, on the cTMB of a sample from the mammal can include, without limitation, alemtuzumab, atezolizumab, avelumab, ipilimumab, ofatumumab, nivolumab, pembrolizumab, rituximab, and durvalumab.
In cases where a mammal (e.g., a human) is identified as having a cancer that is likely to be responsive to a cancer immunotherapy based, at least in part, on the cTMB of the sample from the mammal, the mammal can be treated with a cancer immunotherapy and also can be administered any one or more (e.g., 1, 2, 3, 4, 5, 6, or more) additional cancer treatments (e.g., one or more cancer treatments that are not cancer immunotherapies). A cancer treatment can include any appropriate cancer treatment. In some cases, a cancer treatment can include surgery. In some cases, a cancer treatment can include radiation therapy. In some cases, a cancer treatment can include administration of a pharmacotherapy such as a chemotherapy, hormone therapy, targeted therapy, and/or cytotoxic therapy. Examples of chemotherapeutic agents that can be administered to a mammal having a cancer can include, without limitation, pemetrexed, platinum-based compounds, taxanes, and combinations thereof.
In cases where a mammal having cancer is treated with one or more (e.g., 1, 2, 3, 4, 5, 6, or more) cancer immunotherapies and is treated with one or more (e.g., 1, 2, 3, 4, 5, 6, or more) additional cancer treatments (e.g., one or more cancer treatments that are not cancer immunotherapies), the one or more cancer immunotherapies and the one or more additional cancer treatments can be administered at the same time or independently. For example, one or more cancer immunotherapies can be administered first, and the one or more additional cancer treatments (e.g., one or more cancer treatments that are not cancer immunotherapies) administered second, or vice versa.
The invention will be further described in the following examples, which do not limit the scope of the invention described in the claims.
This Example describes an integrated approach where an improved measure for TMB, corrected for tumor purity, is combined with genomic alterations in RTK genes, genome-wide mutational signatures, and HLA class I genetic variation to capture the multifaceted nature of the tumor-immune system crosstalk and more accurately predict outcome for immune checkpoint blockade.
Results
TMB is an emerging predictive biomarker of response to immune checkpoint blockade; however, its broad implementation in clinical decision making has been hindered by complexities with establishing a robust predictive power. Low tumor purity, mainly due to sampling, may greatly affect TMB assessments, resulting in falsely low TMB in low tumor cellularity samples, especially for tumors with a higher fraction of subclonal mutations. Furthermore, the estimation of tumor purity itself may be challenging as pathologic assessments are frequently imprecise and have limited reproducibility (Viray et al., Archives of pathology & laboratory medicine 137:1545-1549 (2013)). To determine tumor purity for cohorts 1 and 2, both a mutant allele frequency based and a copy-number based approach were employed. To determine the tumor purity needed to accurately determine TMB in the setting of different clonal composition backgrounds, simulation analyses were performed. These analyses established the tumor purity required for reliable TMB assessments and showed that TMB also depends on intratumoral clonal heterogeneity (
To substantiate these findings, tumor whole exome sequencing data from 3,788 TCGA samples from 7 tumor types (bladder carcinoma, breast carcinoma, colon adenocarcinoma, head and neck squamous cell carcinoma, kidney clear cell carcinoma, NSCLC and melanoma) were analyzed and a correlation between TMB and tumor purity was found, with a lower number of alterations observed in samples with low tumor purity (
To overcome this limitation of TMB measurements, an approach was developed to estimate corrected TMB values (cTMB) for each tumor based on tumor purity. First, 20,000 tumors were simulated with various levels of intra-tumoral heterogeneity, TMB, and depth of coverage using a reference set from TCGA. In silico dilutions of these simulated tumors were then used to model the observed TMB resulting from characterization of each simulated tumor sample at various levels of tumor purity. For each simulated tumor a correction factor was generated for different purity tiers (
The approach was further refined by interrogating mutational signatures, as smoking-related C>A transversions have been identified in NSCLC patients with clinical benefit from ICB (Miao et al., Nature genetics 50:1271-1281 (2018); and Forde et al., The New England journal of medicine, 378:1976-1986 (2018)). The number of mutations needed to accurately estimate the contribution of the C>A rich molecular smoking signature was evaluated. In silico dilution experiments of whole exome mutational profiles of 985 TCGA NSCLC tumors were performed and it was found that a minimum of 20 non-synonymous mutations would be required to predict the presence of a dominant smoking signature (
Genomic alterations in driver genes that were selectively associated with responding or non-responding tumors after accounting for the mutation load of a given tumor were identified. Such an adjustment is crucial given the higher probability of passenger mutations in driver genes in tumors with a high tumor mutation burden. A significant enrichment in activating mutations in receptor tyrosine kinase (RTK) genes was found in patients who did not derive durable clinical benefit from immune checkpoint blockade (Mann-Whitney p<0.001, FDR-adjusted p=0.002,
Recurrent alterations in ARID1A were found in patients with durable clinical benefit (Mann-Whitney p=0.005, FDR-adjusted p=0.024), with a trend towards statistical significance after correction for TMB (p=0.062,
A pathway-focused approach was followed in order to identify enrichment or mutual exclusivity of genomic alterations in oncogenic processes or signaling pathways. DNA damage repair (DDR) genes and the WNT-β-catenin pathway were considered. One responding TMB-high tumor was identified with biallelic inactivation of MLH1, but an overall enrichment was not identified in deleterious somatic DDR gene mutations in responding tumors (
A strong correlation was found between TMB and predicted MANA load (R=0.98, p<0.001). As only a small fraction of predicted MANAs are immunogenic, neoantigens that have predicted MHC affinities ≤50 nM and for which the corresponding wild-type peptide does not bind MHC class I (affinity >1000 nM) were focused on as these “fit” neoantigens are most likely to be identified as non-self by the immune system and potentiate an anti-tumor immune response. A higher number of fit MANAs was found in responding vs. non-responding tumors (Mann-Whitney p=0.01, FDR-adjusted p=0.05;
Antigen presentation deficiency may lead to immune escape through both HLA class I germline homozygosity and somatic loss of heterozygosity (LOH). In the cohort, 22 cases were homozygous for at least one HLA class I locus in their germline, and somatic HLA LOH occurred in 27 tumors (
Given the importance of specific individual features identified, cTMB, molecular smoking signature, RTK activating mutations, and HLA genetic variation were combined in a multi-parameter predictor of outcome (
Individual biomarkers of response to immunotherapy, such as PD-L1 expression and TMB, have shown modest predictive utility across a plethora of studies. These analyses showed that the complexities of the predictive value of TMB may be in part attributed to tumor purity, and a new approach was developed to generate corrected TMB values that more accurately predicted outcome for ICB. These findings are of particular importance for metastatic NSCLC where the majority of tumor samples are obtained by bronchoscopy or core needle biopsies and are therefore subject to tumor purity limitations. While targeted next-generation sequencing may alleviate the tumor purity effect given the higher coverage compared to whole exome sequencing, these findings suggest that TMB values should only be interpreted after taking into consideration the tumor purity of the sample analyzed.
This study found a significant enrichment in activating RTK genomic alterations in non-responding tumors which identified patients with an inferior outcome from immune checkpoint blockade in three independent NSCLC cohorts. This study also found that activating genomic alterations in RTK genes including EGFR, HER2, MET, FGFR1 and IGF1R can be linked to primary resistance to immune checkpoint blockade independent of mutation burden.
Key molecular features identified in this study were combined into a predictive classifier for NSCLC patients treated with ICB. Previous attempts to combine biomarkers have focused on a limited number of features such as TMB and chromosomal imbalance (Roh et al., Science translational medicine 9:3560 (2017)), TMB and immune cell gene expression profiles (Cristescu et al., Science 362:3593 (2018)) or HLA variation and TMB (Chowell et al., Science 359:582-587 (2018); and McGranahan et al., Cell 171:1259-1271 (2017)). The multivariable model described herein incorporates an improved measure of TMB through correction of tumor purity, RTK mutations, molecular smoking signature and HLA genetic variation, highlighting the need for development of integrative platforms that capture the complexities of the cancer-immune system crosstalk.
Methods
Cohort Characteristics
Matched tumor-normal exome sequencing data were obtained from 3,788 patients in TCGA (cancergenome.nih.gov), as outlined in the TCGA publication guidelines (cancergenome.nih.gov/publications/publicationguidelines), focusing on tumors that would be relevant for immunotherapy. Cohort 1 consisted of 104 NSCLC patients treated with immune checkpoint blockade at Johns Hopkins Sidney Kimmel Cancer Center and the Nederlands Kanker Instituut. Of these, 15 cases were not included in the final analyses because of tumor purity <10% or absence of matched normal samples. The studies were approved by the Institutional Review Board (IRB) and patients provided written informed consent for sample acquisition for research purposes. Clinical characteristics for all patients are summarized in Table 1. Exome data from a published cohort of NSCLC patients treated with PD1 blockade (cohort 2) were obtained and analyzed to validate key findings from cohort 1 as described elsewhere (see, e.g., Rizvi et al., Science, 348:124-128 (2015); and Wood et al., Science translational medicine 10:7939 (2018)). A publicly available cohort of 240 NSCLC patients treated with ICB was obtained through cBioPortal for Cancer Genomics (MSK, JCO 2018; available online at cbioportal.org/study?id=nsclc_pd1_msk_2018) and used to validate the association of RTK mutations with outcome (cohort 3). A publicly available cohort of 1,661 tumors analyzed by targeted next-generation sequencing was obtained through cBioPortal for Cancer Genomics (MSKCC, Nat Genet 51(2):202-206 (2019)) to validate the correlation between TMB and tumor purity in the setting of higher sequencing depth.
Treatment and Assessment of Clinical Response
Eighty patients were treated with anti-PD1 therapy, 7 patients received combination anti-PD1 and anti-CTLA4 therapy and 2 patients were treated with chemotherapy and anti-PD1 therapy. Response was defined as durable clinical benefit if a complete response, partial response, or stable disease was achieved with a duration >6 months. Responding and non-responding tumors therefore refer to durable clinical benefit and non-durable clinical benefit, respectively. Progression-free survival (PFS) and overall survival (OS) were defined as the time elapsed between the date of treatment initiation and the date of disease progression or death from disease, or the date of death, respectively. Ultimately, overall survival was used to determine long-term outcome for cohort 1. Overall survival was not available for cohorts 2 and 3; therefore, progression-free survival was used. Response assessments and outcome are shown in detail in Table 1.
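The response definition above can be expressed as a short rule. The following is a minimal sketch only; the function name, the response labels, and the use of months as the duration unit are illustrative assumptions and are not taken from the study's analysis code.

```python
# Best-response categories that can count toward durable clinical benefit,
# per the definition in the text (labels are illustrative assumptions).
BENEFIT_RESPONSES = {"complete response", "partial response", "stable disease"}


def has_durable_clinical_benefit(best_response: str, duration_months: float) -> bool:
    """Return True when the response qualifies and lasted longer than 6 months."""
    qualifies = best_response.strip().lower() in BENEFIT_RESPONSES
    return qualifies and duration_months > 6


# Example: stable disease lasting 9 months is classified as a responding tumor.
responding = has_durable_clinical_benefit("stable disease", 9)
```

Under this sketch, a response lasting exactly 6 months is treated as non-durable, following the strict ">6 months" wording.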
Sample Preparation and Whole Exome Sequencing
Whole exome sequencing was performed on pre-immunotherapy tumor and matched normal samples, with the exception of 3 cases for which tumor from the time of resistance to therapy was analyzed (Table 1). Tumor samples underwent pathological review for confirmation of lung cancer diagnosis and assessment of tumor cellularity; histology, anatomic location of the lesion analyzed and pathologic tumor purity are shown in Table 1. Slides from each FFPE block were macrodissected to remove contaminating normal tissue. Matched normal samples were provided as peripheral blood. DNA was extracted from patients' tumors and matched peripheral blood using the Qiagen DNA FFPE and Qiagen DNA blood mini kit, respectively (Qiagen, CA). Fragmented genomic DNA from tumor and normal samples was used for Illumina TruSeq library construction (Illumina, San Diego, Calif.) and exonic regions were captured in solution using the Agilent SureSelect v.4 kit (Agilent, Santa Clara, Calif.) according to the manufacturers' instructions as described elsewhere (see, e.g., Anagnostou et al., Cancer discovery 7:264-276 (2017)). Paired-end sequencing, resulting in 100 bases from each end of the fragments for the exome libraries, was performed using Illumina HiSeq 2000/2500 instrumentation (Illumina, San Diego, Calif.). The mean depth of total and distinct coverage for the pre-treatment tumors were 231× and 144×, respectively, allowing identification of sequence alterations and copy number changes in >20,000 genes (Tables 2, 3 and 6).
Primary Processing of Exome Data and Identification of Putative Somatic Mutations
Somatic mutations were identified using VariantDx custom software for identifying mutations in matched tumor and normal samples as described elsewhere (see, e.g., Jones et al., Science translational medicine 7, 283ra253 (2015)). Prior to mutation calling, primary processing of sequence data for both tumor and normal samples was performed using Illumina CASAVA software (version 1.8), including masking of adapter sequences. Sequence reads were aligned against the human reference genome (version hg19) using ELAND with additional realignment of select regions using the Needleman-Wunsch method as described elsewhere (see, e.g., Needleman et al., J Mol Biol 48:443-453 (1970)). Candidate somatic mutations, consisting of point mutations, insertions, and deletions, were then identified using VariantDx across the whole exome. VariantDx examines sequence alignments of tumor samples against a matched normal while applying filters to exclude alignment and sequencing artifacts. In brief, an alignment filter was applied to exclude quality failed reads, unpaired reads, and poorly mapped reads in the tumor. A base quality filter was applied to limit inclusion of bases to those with reported Phred quality score >30 for the tumor and >20 for the normal. A mutation in the pre or post treatment tumor samples was identified as a candidate somatic mutation only when (1) distinct paired reads contained the mutation in the tumor; (2) the fraction of distinct paired reads containing a particular mutation in the tumor was at least 10% of the total distinct read pairs; (3) the mismatched base was not present in >1% of the reads in the matched normal sample and was not present in a custom database of common germline variants derived from dbSNP; and (4) the position was covered in both the tumor and normal. Mutations arising from misplaced genome alignments, including paralogous sequences, were identified and excluded by searching the reference genome. 
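The four candidate-mutation criteria listed above can be sketched as a single predicate. This is an illustration only and is not the VariantDx implementation; the record fields, the function name, and the choice to treat "distinct paired reads contained the mutation" as at least one supporting pair are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class CandidateSite:
    """Per-site read counts (hypothetical field names for illustration)."""
    tumor_distinct_alt_pairs: int    # distinct read pairs carrying the mutation
    tumor_distinct_total_pairs: int  # all distinct read pairs at the position
    normal_alt_reads: int            # reads with the mismatched base in normal
    normal_total_reads: int          # all reads at the position in normal
    in_germline_db: bool             # present in the common germline variant set


def is_candidate_somatic(site: CandidateSite,
                         min_tumor_fraction: float = 0.10,
                         max_normal_fraction: float = 0.01) -> bool:
    # (4) the position must be covered in both tumor and normal
    if site.tumor_distinct_total_pairs == 0 or site.normal_total_reads == 0:
        return False
    # (1) distinct paired reads contain the mutation (assumed: at least one)
    if site.tumor_distinct_alt_pairs < 1:
        return False
    # (2) at least 10% of distinct read pairs in the tumor carry the mutation
    tumor_fraction = site.tumor_distinct_alt_pairs / site.tumor_distinct_total_pairs
    if tumor_fraction < min_tumor_fraction:
        return False
    # (3) not in >1% of matched-normal reads and not a common germline variant
    normal_fraction = site.normal_alt_reads / site.normal_total_reads
    if normal_fraction > max_normal_fraction or site.in_germline_db:
        return False
    return True
```

Alignment-quality and base-quality filtering are assumed to have been applied before the counts are tallied, so they are not repeated in the predicate.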
Candidate somatic mutations were further filtered based on gene annotation to identify those occurring in protein coding regions. Functional consequences were predicted using snpEff and a custom database of CCDS, RefSeq and Ensembl annotations using the latest transcript versions available on hg19 from UCSC (genome.ucsc.edu/). Predictions were ordered to prefer transcripts with canonical start and stop codons and CCDS or RefSeq transcripts over Ensembl when available. Finally, mutations were filtered to exclude intronic and silent changes, while retaining mutations resulting in missense mutations, nonsense mutations, in-frame and frameshift insertions and deletions, or splice site alterations. Somatic mutations were annotated against the set of mutations in the COSMIC (v84) database, and the number of samples with an identical amino acid change was reported. Mutations were characterized as hotspots when the same amino acid change was reported in at least 10 tumor samples in the COSMIC v84 database. Missense mutations were evaluated for their potential as cancer drivers by CHASMplus (Tokheim et al., bioRxiv dx.doi.org/10.1101/010876 (2018)). For the differential enrichment analysis between patients with durable and non-durable clinical benefit, only genomic alterations with known cancer initiating/promoting functional consequences independent of observed frequency and hotspots for oncogenes and truncating/loss-of-function mutations for tumor suppressor genes were considered.
For the TCGA cohort, WES-derived somatic mutation calls from the TCGA PanCancer Atlas MC3 project were retrieved from the NCI Genomic Data Commons (gdc.cancer.gov/about-data/publications/mc3-2017). The MC3 mutation call set is the result of application of a uniform analysis pipeline, including a standardized set of six mutation callers and an array of automated filters, to the entire TCGA exome data. Mutation calls in cohort 2 were obtained from re-analysis of the original calls and consequence prediction was performed using CRAVAT (Masica et al., Cancer Res 77, e35-e38 (2017)). TMB scores for the cohort of 1,661 tumors were retrieved from the original publication and refer to the total number of somatic mutations identified normalized to the exonic coverage of the targeted panel used in megabases (Samstein et al., Nature genetics, 51(2):202-206 (2019)).
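The panel-based TMB normalization described above amounts to a mutations-per-megabase calculation. A minimal sketch follows; the function and parameter names are illustrative, and the panel size in the example is a made-up value rather than that of any particular assay.

```python
def tmb_per_megabase(n_somatic_mutations: int, covered_exonic_bases: int) -> float:
    """Somatic mutation count normalized to exonic coverage in megabases."""
    if covered_exonic_bases <= 0:
        raise ValueError("covered_exonic_bases must be positive")
    return n_somatic_mutations * 1_000_000 / covered_exonic_bases


# Example with a hypothetical 1.2 Mb panel and 12 somatic mutations.
example_tmb = tmb_per_megabase(12, 1_200_000)
```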
Neoantigen Prediction and Feature Characterization
To assess the immunogenicity of somatic mutations, exome data combined with each individual patient's MHC class I haplotype were applied in a neoantigen prediction platform that evaluates binding of somatic peptides to class I MHC, antigen processing, self-similarity and gene expression. Detected somatic mutations, consisting of nonsynonymous single base substitutions, insertions and deletions, were evaluated for putative neoantigens using the ImmunoSelect-R pipeline (Personal Genome Diagnostics, Baltimore, Md.) as described elsewhere (see, e.g., Anagnostou et al., Cancer discovery 7:264-276 (2017)). For single base substitutions, ImmunoSelect-R performs a comprehensive assessment of paired somatic and wild type peptides 8-11 amino acids in length at every position surrounding a somatic mutation. In the case of frameshifts, all peptides 8-11 amino acids encompassing the new protein sequence resulting from the frameshift alteration were considered.
To accurately infer a patient's germline HLA 4-digit allele genotype, whole-exome-sequencing data from paired tumor/normal samples were first aligned to a reference allele set, and genotyping was then formulated as an integer linear programming optimization procedure to generate a final genotype by OptiType v1.0.44. The HLA genotype served as input to netMHCpan to predict the MHC class I binding potential of each somatic and wild-type peptide (IC50 nM), with each peptide classified as a strong binder (SB), weak binder (WB) or non-binder (NB) as described elsewhere (see, e.g., Nielsen et al., Genome Med 8:33 (2016); Lundegaard et al., Nucleic Acids Res 36:W509-512 (2008); and Lundegaard et al., Bioinformatics 24:1397-1398 (2008)). Peptides were further evaluated for antigen processing (netCTLpan) and were classified as cytotoxic T lymphocyte epitopes (E) or non-epitopes (NA). Paired somatic and wild-type peptides were assessed for self-similarity based on MHC class I binding affinity. Neoantigen candidates meeting an IC50 affinity <5000 nM were subsequently ranked based on MHC binding and T-cell epitope classifications. A single MANA per mutation was selected based on MHC affinity, and neoantigen candidates with an MHC affinity <500 nM were further selected to estimate the neoantigen tumor burden and used for downstream analyses. Tumor-associated expression levels derived from TCGA were used to generate a final ranking of candidate immunogenic peptides. MANAs were further characterized based on their immunogenic potential by selecting neopeptides with high MHC affinity for which the wild type counterpart was predicted not to bind MHC class I molecules (fit MANA: MHC affinity for mutant peptide <50 nM and for wild type peptide >1000 nM). For MANAs stemming from frameshift mutations, the length of the resulting protein until a stop codon was reached was considered, as a longer novel amino acid sequence would have the potential to generate more immunogenic neoantigens. 
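The affinity thresholds described above can be illustrated with a small classification sketch. This is not the ImmunoSelect-R pipeline; the function names, the tuple layout, and the assumption that the single MANA per mutation is the peptide with the lowest predicted mutant IC50 are illustrative choices, and epitope classification and expression ranking are omitted.

```python
def classify_peptide(mutant_ic50_nm: float, wildtype_ic50_nm: float) -> str:
    """Tier a mutant peptide by predicted MHC class I affinity (IC50, nM)."""
    if mutant_ic50_nm >= 5000:
        return "excluded"   # fails the <5000 nM candidate threshold
    if mutant_ic50_nm >= 500:
        return "candidate"  # ranked, but not counted toward MANA burden
    if mutant_ic50_nm < 50 and wildtype_ic50_nm > 1000:
        return "fit MANA"   # strong mutant binder, non-binding wild type
    return "MANA"           # <500 nM, counted toward neoantigen burden


def best_peptide_per_mutation(peptides):
    """Keep one peptide per mutation: the lowest mutant IC50 (assumed rule).

    `peptides` is an iterable of (mutation_id, mutant_ic50_nm, wildtype_ic50_nm).
    """
    best = {}
    for mutation_id, mutant_ic50, wildtype_ic50 in peptides:
        if mutation_id not in best or mutant_ic50 < best[mutation_id][0]:
            best[mutation_id] = (mutant_ic50, wildtype_ic50)
    return best


def count_manas(peptides):
    """Return (MANA burden, fit MANA count) over one peptide per mutation."""
    tiers = [classify_peptide(m, w) for m, w in best_peptide_per_mutation(peptides).values()]
    fit = tiers.count("fit MANA")
    return tiers.count("MANA") + fit, fit
```

In this sketch, fit MANAs are a subset of the overall MANA burden, consistent with the text's description of fit MANAs as a further characterization of selected MANAs.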
Sequences more prone to undergo nonsense mediated decay were subsequently filtered out as described elsewhere (see, e.g., Balasubramanian et al., Nature communications 8:382 (2017)). During this process, aberrant transcripts are typically removed at the mRNA level, so the corresponding peptides would not be expected to be produced even when they are bioinformatically predicted. The percentage of frameshift mutations undergoing nonsense mediated decay is shown in
Mutational Signatures
Mutational signatures were extracted based on the fraction of coding point mutations in each of 96 trinucleotide contexts, and the contribution of each signature to each tumor sample was estimated using the deconstructSigs R package as described elsewhere (see, e.g., Viray et al., Archives of pathology & laboratory medicine 137:1545-1549 (2013); and Anagnostou et al., Cancer discovery 7:264-276 (2017)). To evaluate the impact of the total number of observed single base substitutions on detection of a smoking signature within a tumor sample, in-silico dilution experiments were performed utilizing somatic mutation data from 985 NSCLC samples from the TCGA PanCancer Atlas MC3 project. A total of 76 tumors (64 LUAD and 12 LUSC, with average patient pack years of 43.8 and 32.8, respectively) with mutational loads >250 (requiring a minimum 10% MAF and at least 4 variant supporting reads per mutation) and a detected smoking signature with >75% contribution were diluted in silico by subsampling to lower mutation counts ranging from 5 to 100. For each round of subsampling, tumor mutations were re-evaluated for a smoking signature using the deconstructSigs package. Reductions in the smoking signature and the overall percentage deviation from the original smoking signature percent contribution were then assessed in each sample.
Copy Number Analyses, Tumor Purity and Ploidy Assessment
The somatic copy number profile and the extent of aneuploidy in each tumor were estimated using whole exome sequencing data as follows. First, the relative copy number profile of each tumor sample was determined by evaluating the number of reads mapping to exonic and intronic regions (bins) of the genome while correcting them for confounding factors such as region size, GC content, and sequence complexity. The corrected density profile in each tumor sample was then compared to a reference generated by processing a panel of normal samples in a similar manner to define log copy ratio values, which reflect the relative copy number profile of each genomic region. Next, circular binary segmentation (CBS) was applied to bin-level copy ratio values to reduce the inherent noise associated with stochastic read count variation and to enable accurate assessment of copy number breakpoints, i.e., boundaries between genomic segments with distinct somatic copy number. Finally, a genome-wide analysis of segmental copy ratio values combined with minor allele frequency of heterozygous SNPs overlapping the segments, implemented as an in-house pipeline, yielded an estimate of tumor purity and ploidy. In brief, the model exhaustively evaluated all plausible combinations of tumor purity and ploidy and returned the optimal combination of the two parameters using a maximum likelihood approach. The performance of this platform was compared against FACETS on a collection of 97 NSCLC tumors and the two methods provided similar estimates of tumor purity (r=0.94, p-value <2.2e-16) and ploidy (r=0.66, p-value=1.489e-13). The estimated purity and ploidy of the tumor sample were subsequently used to determine the allele specific copy number of each genomic segment by selecting the combination of total and minor copy number that best approximated the segment's log copy ratio and average minor allele frequency as described elsewhere (see, e.g., Anagnostou et al., Cancer discovery 7:264-276 (2017)).
Focal amplifications and homozygous deletions were determined as segments of the genome with length ≤3 Mbp and total copy number greater than or equal to three times the ploidy of the genome (amplification), or total copy number of zero (deletion). To increase the specificity of this approach, a set of blacklisted regions was created from a panel of 96 healthy control samples. For each healthy sample, a weighted mean and weighted standard deviation were calculated from segment means obtained from the circular binary segmentation algorithm on copy ratio values, weighted by the number of bins supporting each segment. Genomic intervals in each healthy sample with a segment mean greater than 3 standard deviations away from the mean were added to the blacklist. Focal alterations where >50% of the segment overlapped a blacklisted region in at least 2 healthy control samples were dropped. In addition, segments supported by fewer than 5 bins, as well as segments from GC-rich and GC-poor regions of the genome where more than 50% of bins supporting a segment had a GC-content of less than 35% or greater than 70%, were excluded.
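The size and copy number criteria for focal events can be written as a short Python sketch. The function name and return labels are hypothetical, and the blacklist, bin-count and GC-content filters described above are deliberately left out.

```python
from typing import Optional

MAX_FOCAL_LENGTH_BP = 3_000_000  # segments must be <= 3 Mbp


def classify_focal_segment(length_bp: int, total_cn: int,
                           ploidy: float) -> Optional[str]:
    """Return 'amplification', 'homozygous_deletion', or None.

    Only the length and total copy number rules from the text are applied;
    the healthy-control blacklist and GC filters are not modeled here.
    """
    if length_bp > MAX_FOCAL_LENGTH_BP:
        return None
    if total_cn == 0:
        return "homozygous_deletion"
    if total_cn >= 3 * ploidy:
        return "amplification"
    return None
```

Under these rules, a 1 Mbp segment with total copy number 7 in a tumor of ploidy 2.1 would be called an amplification, whereas the same copy number on a 5 Mbp segment would not be called focal.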
Several measures of tumor aneuploidy were calculated, including the fraction of the genome with loss of heterozygosity (LOH: complete loss of the minor allele) and allelic imbalance (AI: inequality of major and minor allele copy number). In each tumor sample, the modal copy number was determined as the most prevalent total copy number value across the genome. The fraction of the genome with total copy number (CN) different from this modal value was calculated and referred to as Non-modal CN Fraction. This measure of aneuploidy is equal to zero for a euploid genome, and increases as the tumor genome accumulates copy number aberrations. Finally, the fraction of the genome at each observed total copy number value was determined, and the concept of entropy from information theory was applied to quantify the amount of uncertainty in the assignment of total copy number for each genomic segment. Genome CN Entropy is at its minimum when the entire genome is at a single total copy number, and reaches its maximum when all the observed total copy number levels represent equal fractions of the genome, e.g., 25% of the genome at each of n=1, 2, 3, and 4.
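A minimal Python sketch of the two copy number summary measures follows. Segments are represented as hypothetical (length, total copy number) pairs, and the use of a base-2 logarithm for the entropy is an assumption, since the text does not specify the base.

```python
import math
from collections import defaultdict


def cn_fractions(segments):
    """Fraction of the genome at each total copy number.

    segments: iterable of (length_bp, total_cn) pairs.
    """
    segments = list(segments)
    genome_length = sum(length for length, _ in segments)
    fractions = defaultdict(float)
    for length, total_cn in segments:
        fractions[total_cn] += length / genome_length
    return dict(fractions)


def non_modal_cn_fraction(segments) -> float:
    """Fraction of the genome whose total CN differs from the modal CN."""
    return 1.0 - max(cn_fractions(segments).values())


def genome_cn_entropy(segments) -> float:
    """Shannon entropy (base 2, an assumption) of the CN distribution."""
    return -sum(p * math.log2(p)
                for p in cn_fractions(segments).values() if p > 0)
```

With a genome split evenly across copy numbers 1, 2, 3 and 4, this sketch gives a Non-modal CN Fraction of 0.75 and an entropy of 2 bits, the maximum for four observed levels.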
For a subset of cases (n=14 in cohort 1 and n=10 in cohort 2) where the pipeline could not determine the purity and ploidy due to low tumor purity, technical noise, or copy-number heterogeneity, a mutation-based measure of tumor purity based on the median of mutant allele fractions was used to derive an approximate measure of tumor purity. Tumor purity estimates from copy number analysis above were combined with these mutation-based estimates to define the “Adjusted Tumor Purity” measure.
Evaluation of Tumor Purity in TCGA Samples
Consensus tumor purity estimates from four independent methods were obtained for TCGA samples as described elsewhere (see, e.g., Aran et al., Nature communications 6:8971 (2015)). The analysis was restricted to 3,788 TCGA samples from 7 tumor types (BLCA, BRCA, COAD, HNSC, KIRC, LUAD, LUSC, and SKCM) that had both MC3 mutation calls and a consensus tumor purity estimate. For each cancer type, the Pearson correlation between the total number of mutations called in each sample and tumor purity was computed (
Mutation Clonality Assessment
Mutant allele frequency, ploidy and purity were incorporated to estimate the mutation cellular fraction, that is, the fraction of cancer cells that harbor a specific mutation. SCHISM was applied to determine the mutation cellular fraction based on the observed variant allele frequency, estimated copy number, and sample purity by following an approach similar to that described elsewhere (see, e.g., Anagnostou et al., Cancer discovery 7:264-276 (2017)). Briefly, the expected mutant allele frequency (Vexp) of a mutation with mutation cellular fraction (CF) present in m copies (mutation multiplicity), at a locus with total copy number (nT) in the tumor sample and total copy number (nN) in the matched normal sample, with purity (α) can be calculated as
$$V_{exp} = \frac{\alpha \cdot m \cdot CF}{\alpha \cdot n_T + (1-\alpha) \cdot n_N}$$
where m indicates multiplicity, i.e., the number of mutant copies present in the cancer cells. A confidence interval for Vexp can be derived based on the observed distinct mutant counts and distinct coverage assuming a binomial process. Substitution of this interval in the above equation resulted in a confidence interval for the product of the two unknown variables m and CF. Finally, the following set of rules was applied to determine the mutation cellular fraction: (1) For clonal mutations (CF=1), the product m*CF only assumes integer values; therefore, if the confidence interval includes an integer value, that value is equal to the multiplicity of the mutation and the mutation is clonal (CF=1). (2) For mutations where the upper bound of the confidence interval for m*CF is below 1, multiplicity is assumed to be 1. If the point estimate for CF is within a tolerance threshold (0.25) of 1.0, the mutation is assumed to be clonal and CF is substituted by 1.0. Otherwise, the mutation is deemed subclonal. (3) For mutations where the confidence interval for m*CF does not encompass an integer number and the entire interval exceeds 1.0, it is plausible to assume a multiplicity greater than 1. In this case, the multiplicity is set to the smallest integer value such that the confidence interval for CF falls within the expected interval of [0, 1]. This procedure results in a point estimate for CF. Similar to (2), if the point estimate is within a tolerance threshold (0.25) of 1.0, the mutation is assumed to be clonal and CF is substituted by 1.0; otherwise, the mutation is considered subclonal.
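The three rules can be sketched in Python as follows. This is an illustration rather than the SCHISM implementation: it assumes the relation Vexp = α·m·CF / (α·nT + (1−α)·nN), uses a Wilson score interval as one possible binomial confidence interval, and takes the smallest integer in the interval when rule (1) applies. All function names are hypothetical.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (assumed CI method)."""
    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials
                         + z * z / (4 * trials * trials)) / denom
    return center - half, center + half


def cellular_fraction(mut_reads: int, coverage: int, purity: float,
                      n_tumor: int, n_normal: int = 2, tol: float = 0.25):
    """Return (multiplicity, CF, label) following rules (1) to (3)."""
    # Solve the allele-frequency relation for the product m*CF.
    scale = (purity * n_tumor + (1 - purity) * n_normal) / purity
    low, high = (v * scale for v in wilson_interval(mut_reads, coverage))
    point = mut_reads / coverage * scale

    # Rule 1: the interval for m*CF contains a positive integer.
    first_integer = max(1, math.ceil(low))
    if first_integer <= high:
        return first_integer, 1.0, "clonal"

    # Rule 2 (interval below 1) or rule 3 (interval between two integers).
    multiplicity = 1 if high < 1 else math.ceil(high)
    cf = point / multiplicity
    if cf >= 1 - tol:
        return multiplicity, 1.0, "clonal"
    return multiplicity, cf, "subclonal"
```

For instance, 50 mutant reads out of 200 at 50% purity and diploid copy number give a point estimate of m*CF equal to 1, so the mutation is called clonal with multiplicity 1.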
Limitations of TMB Assessment
The impact of tumor purity and intratumoral heterogeneity on the accuracy of TMB estimates was evaluated in a simulation experiment (
$$c \sim \Gamma(\beta\mu_C,\ \beta)$$
where μC is the mean distinct coverage of the sample, and was set to 200. The rate parameter β determined the variance of base-level coverage in the sample, and was set to 0.013 based on evaluation of the coverage distribution in 100 tumor samples. Distinct mutant read counts (m) were generated by assuming a draw from a binomial distribution with probability of success set to the expected mutation allele frequency (Vexp) given the purity of the tumor sample (α) and cellular fraction of the mutation (CF), assuming absence of somatic copy number alterations at the mutation loci as follows:
$$V_{exp} = \frac{\alpha \cdot CF}{2}, \qquad m \sim \mathrm{Binomial}(c,\ V_{exp})$$
Mutations with simulated distinct coverage c≥10, distinct mutant read count m≥3, and observed allele frequency $\hat{\nu}$≥10% were determined to be present, and were tallied up to derive the observed TMB (obsTMB). The observed TMB was calculated in each replicate, and the median was reported (
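One replicate of this simulation can be sketched in Python with the standard library alone. The sketch assumes a diploid locus with one mutant copy, so that Vexp = α·CF/2, and rounds the gamma-distributed coverage to an integer; the function name and its parameters are hypothetical.

```python
import random


def simulate_observed_tmb(cell_fractions, purity, mean_cov=200.0,
                          rate=0.013, rng=None):
    """Count simulated mutations that pass the detection thresholds.

    cell_fractions: cellular fraction (CF) of each true mutation.
    """
    rng = rng or random.Random(0)
    shape = rate * mean_cov  # Gamma(shape=beta*mu_C, rate=beta) has mean mu_C
    observed = 0
    for cf in cell_fractions:
        # random.gammavariate takes shape and SCALE, so scale = 1 / rate.
        coverage = int(round(rng.gammavariate(shape, 1.0 / rate)))
        v_exp = purity * cf / 2.0  # assumed diploid locus, multiplicity 1
        # Binomial draw written as a sum of Bernoulli trials.
        mutant = sum(rng.random() < v_exp for _ in range(coverage))
        if coverage >= 10 and mutant >= 3 and mutant / coverage >= 0.10:
            observed += 1
    return observed
```

Running the function over many replicates at a fixed purity and taking the median of the returned counts mirrors the reporting step described above.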
Correction of TMB for Tumor Purity
Corrected TMB (cTMB) values were generated based on observed TMB and tumor purity as follows. Given the findings that low tumor purity can limit the detection of subclonal mutations and skew the estimates of clonal composition, the level of intra-tumor heterogeneity in a set of TCGA NSCLC cancers with high tumor purity was first established. Purity, ploidy, and allele specific copy number profiles of the tumor samples based on analysis of SNP6 copy number array data were obtained from Synapse (synapse.org/#!Synapse:syn1710464.2). A set of 31 NSCLC samples with tumor purity of at least 80% and tumor ploidy in the range of [1.5, 5.0] was selected, for which highly confident mutation calls (MC3 set) were available and the somatic copy number profile had been determined. The cellular fraction of mutations in each tumor was estimated as described above, and the fraction of clonal mutations was determined. This analysis revealed a low level of intra-tumor heterogeneity in untreated lung tumors, as a clonal mutation fraction of 70% or above was observed in all but two of the 31 tumors analyzed. Given the small number of lung tumors where the clonal composition could be accurately determined, an additional group of samples was identified to supplement the original set. A total of 704 highly pure (purity >=80%) tumors with available mutation and copy number data from the TCGA project in tumor types other than NSCLC were identified and characterized in terms of clonal composition. An estimate was derived for the clonal composition of each tumor, defined as the frequency of observed mutations in CF bins of width 0.05 spanning the [0,1] interval, and these estimates were used as a basis to model mutation CF values in the simulation experiment. This set was further filtered to ensure that its level of intra-tumor heterogeneity matched that of NSCLC tumors by requiring a clonal mutation fraction of 70% or above.
The clonal composition from this combined reference set of NSCLC (n=29) and other (n=577) tumors with high clonal fraction (>=70%) was used to model mutation CF in the following simulation experiment.
A total of 20,000 in silico tumor samples were subsequently simulated, where the true TMB of each tumor was determined by sampling from the distribution of TMB in TCGA NSCLC samples. The mean sample sequence depth of coverage (C) was set to follow a normal distribution with μ=150 and σ=10. The clonal composition of each tumor was specified by randomly sampling from the reference set. The cancer cell fraction of mutations in each tumor was determined by sampling from a multinomial distribution with p parameters set to match the tumor's clonal composition.
Next, following the approach outlined above, the observed TMB (obsTMB) was determined at tumor purity values ranging from 10-100% for each tumor sample. At each level of tumor purity and for each tumor sample, the ratio of true to observed TMB was determined. The median of this ratio across the simulated tumors was considered as a multiplicative correction factor used to transform the observed TMB to a value referred to as corrected TMB (cTMB) that more closely approximates the true TMB. The median and 95% confidence interval of the correction factor (r) calculated at different levels of tumor purity (α) from the simulation experiment are reported (Table 4).
cTMB=r(α)*obsTMB
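The correction step can be illustrated with a short Python sketch. The purity-to-factor values below are hypothetical placeholders and are not the values of Table 4; looking up the nearest tabulated purity level and skipping simulated tumors with an observed TMB of zero are also assumptions made for the example.

```python
from statistics import median

# HYPOTHETICAL placeholder factors r(alpha); NOT the values reported in Table 4.
EXAMPLE_FACTORS = {0.2: 2.0, 0.4: 1.4, 0.6: 1.15, 0.8: 1.05, 1.0: 1.0}


def correction_factor(true_tmbs, observed_tmbs) -> float:
    """Median ratio of true to observed TMB across simulated tumors."""
    ratios = [t / o for t, o in zip(true_tmbs, observed_tmbs) if o > 0]
    return median(ratios)


def corrected_tmb(obs_tmb: float, purity: float,
                  factors=EXAMPLE_FACTORS) -> float:
    """Apply cTMB = r(alpha) * obsTMB using the nearest tabulated purity."""
    nearest = min(factors, key=lambda level: abs(level - purity))
    return factors[nearest] * obs_tmb
```

With these placeholder factors, an observed TMB of 50 at 20% purity would be corrected to 100, while the same observed TMB at 100% purity would be left unchanged.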
This approach was applied to the tumor samples in cohort 1 to estimate the corrected TMB and its 95% confidence interval (
HLA Genetic Variation
OptiType v1.2 was used to determine HLA class I haplotypes as described elsewhere (see, e.g., Szolek et al., Bioinformatics 30:3310-3316 (2014)). The highly polymorphic nature of the HLA loci limits the accuracy of sequencing read alignment and somatic mutation detection by conventional methods. Therefore, a separate bioinformatic analysis using POLYSOLVER was applied to detect and annotate the somatic mutations in class I HLA genes. HLA class I haplotypes derived from application of OptiType v1.2 to TCGA RNA-seq samples were retrieved from Genomic Data Commons (gdc.cancer.gov/about-data/publications/panimmune). To assess the possibility of loss of germline alleles in tumor, allele specific copy number profiles of the tumor samples from analysis of SNP6 copy number array data were obtained from Synapse (synapse.org/#!Synapse:syn1710464.2). Loss of heterozygosity of each HLA gene was determined by considering the minor allele copy number of the overlapping genomic region (minor CN=0 indicated complete loss of the minor allele). Individual HLA-I alleles were classified into discrete supertypes based upon similar peptide anchor-binding specificities as described elsewhere (see, e.g., Sidney et al., BMC immunology 9:1 (2008)).
Evaluation of Somatic HLA Loss
Given the essential role of MHC class I molecules in presentation of neo-antigens and initiation of a cascade of events that leads to anti-tumor immune response, their maintenance or loss in tumor was determined by applying LOHHLA using default program settings as described elsewhere (see, e.g., McGranahan et al., Cell 171:1259-1271 e1211 (2017)). LOHHLA determines the allele specific copy number of each HLA locus by realignment of NGS reads to patient-specific HLA reference sequences, and correction of the resulting coverage profile for tumor purity and ploidy. At each HLA locus heterozygous in germline, loss of heterozygosity was declared if the copy number for one of the two alleles was below 0.5, and there was a statistically significant difference between the log copy ratio of the two alleles (PVal_unique <0.01). The unique number of class I HLA alleles in tumor was calculated by subtracting the number of germline heterozygous alleles with somatic LOH from the total number of unique alleles in germline.
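The LOH call and the allele count can be expressed as a small Python sketch operating on values such as those reported by LOHHLA. The function names and the dictionary layout are hypothetical; the sketch does not run or parse LOHHLA itself.

```python
def has_hla_loh(cn_allele1: float, cn_allele2: float, p_value: float,
                cn_cutoff: float = 0.5, p_cutoff: float = 0.01) -> bool:
    """LOH if one allele's copy number is < 0.5 and the difference is significant."""
    return min(cn_allele1, cn_allele2) < cn_cutoff and p_value < p_cutoff


def unique_alleles_in_tumor(germline: dict, loh_calls: dict) -> int:
    """Unique class I alleles remaining in the tumor.

    germline: locus -> (allele1, allele2); loh_calls: locus -> bool.
    Only loci that are heterozygous in germline can lose an allele.
    """
    unique_germline = {allele for pair in germline.values() for allele in pair}
    lost = sum(1 for locus, (a1, a2) in germline.items()
               if a1 != a2 and loh_calls.get(locus, False))
    return len(unique_germline) - lost
```

A patient with five unique germline class I alleles and LOH at one heterozygous locus would therefore be counted as having four unique alleles in the tumor.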
TCR Sequencing
TCR clones were evaluated in tumor tissue by next generation sequencing. DNA from tumor samples was isolated by using the Qiagen DNA FFPE kit (Qiagen, CA). TCR-β CDR3 regions were amplified using the survey ImmunoSeq assay in a multiplex PCR method using 45 forward primers specific to TCR Vβ gene segments and 13 reverse primers specific to TCR Jβ gene segments (Adaptive Biotechnologies) as described elsewhere (see, e.g., Carlson et al., Nature communications 4:2680 (2013)). Productive TCR sequences were further analyzed. For each sample, a clonality metric was estimated in order to quantitate the extent of mono- or oligo-clonal expansion by measuring the shape of the clone frequency distribution as described elsewhere (see, e.g., Gao et al., Cell 167:397-404 e399 (2016)). Clonality values range from 0 to 1, where values approaching 1 indicate a nearly monoclonal population (Table 13).
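The text does not spell out the clonality formula. One commonly used definition, assumed here purely for illustration, is one minus the normalized Shannon entropy of the productive clone frequencies, which yields values between 0 and 1. The function name is hypothetical.

```python
import math


def tcr_clonality(clone_counts) -> float:
    """1 - normalized Shannon entropy of clone frequencies (assumed definition)."""
    counts = [c for c in clone_counts if c > 0]
    if not counts:
        raise ValueError("at least one productive clone is required")
    if len(counts) == 1:
        return 1.0  # a single clone is fully monoclonal
    total = sum(counts)
    entropy = -sum((c / total) * math.log(c / total) for c in counts)
    return 1.0 - entropy / math.log(len(counts))
```

Under this definition, a repertoire of equally abundant clones scores 0, and a repertoire dominated by one clone scores close to 1.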
Immunohistochemistry and Interpretation of CD8 Staining
Immunolabeling for CD8 detection was performed on formalin-fixed, paraffin embedded sections on a Ventana Discovery Ultra autostainer (Roche Diagnostics). Briefly, following deparaffinization and rehydration, epitope retrieval was performed using Ventana Ultra CC1 buffer (Roche Diagnostics) at 96° C. for 64 minutes. Sections were subsequently incubated with the primary mouse anti-human CD8 antibody (1:100 dilution, clone m7103, Dako) at 36° C. for 60 minutes, followed by incubation with an anti-mouse HQ detection system (Roche Diagnostics) and application of the Chromomap DAB IHC detection kit (Roche Diagnostics). A minimum of 100 tumor cells were evaluated per specimen. CD8-positive lymphocyte density was evaluated per 20× high power field.
Statistical Analyses
Differences between responding and non-responding tumors were evaluated using chi-square or Fisher's exact test for categorical variables and the Mann-Whitney test for continuous variables. The Pearson correlation coefficient (R) was used to assess correlations between continuous variables. P values were corrected using the Benjamini-Hochberg procedure and the associated false discovery rate (FDR) values were calculated. Tumors were classified based on their non-synonymous sequence alteration load into high and low mutators, using the second tertile as a cut-off point. The median point estimate and 95% CI for PFS and OS were estimated by the Kaplan-Meier method and survival curves were compared by using the nonparametric log rank test. Univariate Cox proportional hazards regression analysis was used to determine the impact of individual parameters on overall survival. A multivariable Cox proportional hazards model was employed using corrected TMB, RTK mutations, smoking mutational signature and number of HLA germline alleles. A risk score reflecting the relative hazard was calculated as the exponential of the sum of the product of mean-centered covariate values and their corresponding coefficient estimates for each case. The second tertile of the risk score was used to classify patients into high risk (top 33.3%) and low risk (bottom 66.6%) groups. All p values were based on two-sided testing and differences were considered significant at p<0.05. Statistical analyses were done using the SPSS software program (version 25.0.0 for Windows, IBM, Armonk, N.Y.) and R (version 3.2 and higher, http://www.R-project.org/).
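The risk score calculation can be sketched in Python as follows. The coefficient values are supplied by the caller and are not taken from the fitted model; the covariate names, function names, and the use of `statistics.quantiles` with its default interpolation for the second tertile are assumptions made for illustration.

```python
import math
from statistics import quantiles


def risk_scores(rows, coefficients):
    """exp(sum(coef * (value - cohort mean))) for each case.

    rows: list of dicts mapping covariate name -> value.
    coefficients: dict mapping covariate name -> Cox coefficient estimate.
    """
    means = {name: sum(row[name] for row in rows) / len(rows)
             for name in coefficients}
    return [math.exp(sum(coef * (row[name] - means[name])
                         for name, coef in coefficients.items()))
            for row in rows]


def classify_by_second_tertile(scores):
    """Label scores above the second tertile 'high' and the rest 'low'."""
    cutoff = quantiles(scores, n=3)[1]  # second of the two tertile cut points
    return ["high" if s > cutoff else "low" for s in scores]
```

A case whose covariates all equal the cohort means receives a risk score of exactly 1, and cases with scores above the second tertile fall into the high risk group.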
It is to be understood that while the invention has been described in conjunction with the detailed description thereof, the foregoing description is intended to illustrate and not limit the scope of the invention, which is defined by the scope of the appended claims. Other aspects, advantages, and modifications are within the scope of the following claims.
This application claims benefits of priority to U.S. Provisional Application No. 62/824,807 filed Mar. 27, 2019, the entire contents of which are incorporated herein by reference.
This invention was made with government support under CA180950, CA006973, and CA121113 awarded by the National Institutes of Health. The government has certain rights in the invention.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2020/025551 | 3/27/2020 | WO | 00 |
Number | Date | Country | |
---|---|---|---|
62824807 | Mar 2019 | US |