While immune checkpoint therapies (ICT) have improved outcomes for some cancer patients, most patients do not respond to ICT, and identifying the factors underlying ICT resistance remains challenging. Previous whole-exome sequencing (WES) and transcriptome sequencing of tumors identified multiple factors associated with favorable ICT outcome, including expression of PD-L1 [12], high tumor mutational burden [13], and the presence of tumor-infiltrating CD8+ T cells [14]. Markers indicative of unfavorable response include defects in IFNγ pathways or antigen presentation [15,16]. However, previous efforts to discover biomarkers for patients who will respond to ICT mainly focused on CD8+ T cells [17].
The present invention addresses the aforementioned need by determining if other types of immune cells and their subclusters are associated with ICT outcomes. While prior studies represented a first step in identifying biomarkers, studies using single-cell RNA sequencing (scRNA-seq) have the potential to greatly improve the identification of factors underlying ICT outcomes. The inventors identified previously unrecognized immune cell subpopulations that could play an important role in determining ICT responsiveness. The analysis of multiple additional gene expression datasets of more melanoma samples identified and validated an ICT outcome signature (“ImmuneCells.Sig”) enriched with the genes characteristic of the immune cell subsets detected in the scRNA-seq study. ImmuneCells.Sig predicted the ICT outcomes of melanoma patients more accurately than the 12 previously reported ICT response signatures. The validated ImmuneCells.Sig provided an improved predictor of ICT response and could contribute to the decision making for immunotherapy, particularly anti-PD-1 therapy.
The present disclosure thereby provides a novel gene expression signature (ImmuneCells.Sig) that predicted the ICT (immune checkpoint therapy) outcomes of melanoma patients with significantly more accuracy than all previously reported ICT response signatures. The validated ImmuneCells.Sig provided one of the most accurate predictors to date of ICT response and could contribute immensely to clinical decision making for immunotherapy. The gene expression signature may be provided in a chip or detection kit for determining ICT responsiveness of a tumor.
The present invention provides methods and kits for determining if a subject would be susceptible to immune checkpoint therapy, the method comprising detecting one or more genes associated with the gene expression signature as described in the Examples section. In some embodiments, the one or more genes is detected by RNA sequencing (RNA-seq). In some embodiments, the one or more genes is detected by single-cell RNA sequencing (scRNA-seq).
In one aspect, the present invention provides a method of determining susceptibility and response to immune checkpoint therapy in a subject in need thereof, the method comprising detecting one or more genes associated with an immune cell gene expression signature (ImmuneCells.Sig) of Table 1, wherein the detecting of one or more of the genes detects resistance to the immune checkpoint therapy.
In another aspect, the invention provides a method of treating a subject with cancer, the method comprising: a) determining if the subject has a cancer which is susceptible and responsive to a checkpoint inhibitor by determining expression profile of one or more genes associated with an immune cell gene expression signature (ImmuneCells.Sig), and b) treating the subject with the checkpoint inhibitor in an amount effective to treat the cancer.
In a further aspect, the disclosure provides a gene chip comprising an expression signature (ImmuneCells.Sig) useful for determining the response to immune checkpoint therapy, the gene chip comprising probes useful to detect the level of 10 or more biomarkers listed in Table 1.
In yet another aspect, the disclosure provides a method for processing a test sample to determine a likelihood that a cancer is responsive to anti-PD-1 immunotherapy in a patient, comprising: (a) receiving information indicative of an expression level of a plurality of biomarkers in a tumor sample extracted from the patient; (b) providing the plurality of biomarker levels as input to a classifier configured to predict likelihood that a patient is reactive in response to anti-PD-1 immunotherapy in a computer to classify the test sample, wherein the classifier was trained with a plurality of training samples comprising pre-therapy tumor expression data of known PD-1 therapy responding patients and pre-therapy tumor expression data of known non-responder patients, and wherein the sensitivity and specificity of the classifier is sufficient to identify the likelihood that the patient is responsive to anti-PD-1 immunotherapy; and (c) receiving, from the classifier, an output report that identifies said classification as indicative of the likelihood that the patient is responsive to anti-PD-1 immunotherapy.
In yet another aspect, the invention provides a kit for detecting the likelihood of a subject with cancer to be responsive to checkpoint therapy, the kit comprising a panel of 10 biomarkers from Table 2 attached to a solid surface and instructions for use.
In a further aspect, the invention provides a system for processing a test sample to determine a likelihood that a patient with cancer is responsive to anti-PD-1 immunotherapy, comprising: (a) a computer capable of receiving input data of the expression of a plurality of biomarker levels, (b) a classifier configured to predict the likelihood that a patient will respond to anti-PD-1 immunotherapy to classify the test sample, and (c) an output report from the classifier that identifies said classification as indicative of the likelihood that the patient is responsive to anti-PD-1 immunotherapy.
The foregoing and other aspects and advantages of the invention will appear from the following description. In the description, reference is made to the accompanying drawings which form a part hereof, and in which there are shown, by way of illustration, preferred embodiments of the invention. Such embodiments do not necessarily represent the full scope of the invention, however, and reference is made therefore to the claims and herein for interpreting the scope of the invention.
Despite progress in the development of immune checkpoint therapies (ICT), identifying factors underlying ICT resistance is still challenging. Most cancer patients do not respond to ICT and the availability of predictive biomarkers is limited. Here, we analyzed a single-cell RNA sequencing (scRNA-seq) dataset of tumor-infiltrating immune cells from 48 melanoma samples of patients subjected to ICT and discovered a subset of macrophages (cluster 12) overexpressing TREM2 and a subset of γδ T cells (cluster 21) that were both overrepresented in the non-responding tumors. In addition, the percentage of a B cell subset (cluster 22) was significantly lower in the non-responders. The presence of the immune cell subtypes was corroborated in other scRNA-seq datasets including that of another cancer type. The analysis of multiple gene expression datasets of the melanoma samples identified and validated an ICT outcome signature, ImmuneCells.Sig, enriched with the genes characteristic of the above immune cell subsets. ImmuneCells.Sig predicted the ICT outcomes of melanoma patients with significantly more accuracy than all previously reported ICT response signatures. The validated ImmuneCells.Sig provided one of the most accurate predictors to date of ICT response and could contribute immensely to clinical decision making for immunotherapy.
The present invention provides a novel gene signature associated with immune checkpoint therapy (ICT), named ImmuneCells.Sig, which is predictive of ICT outcomes of cancer patients, e.g., melanoma patients, and which is significantly more accurate than all previously reported ICT response signatures. ImmuneCells.Sig can be used as an accurate predictor of ICT response and may be used to determine if a patient will be susceptible and respond to ICT treatment.
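The comparison of immune cell subsets between responding and non-responding tumors described above can be sketched as a per-sample cluster-fraction calculation. The following is a minimal illustration only, assuming each cell has already been assigned a sample identifier and a cluster number; the function names, sample identifiers, and toy cell counts are hypothetical and are not data from the study. The sketch reports group means and does not perform a significance test.

```python
from collections import Counter, defaultdict
from statistics import mean


def cluster_fractions(cells):
    """cells: iterable of (sample_id, cluster_id) pairs, one pair per cell.

    Returns {sample_id: {cluster_id: fraction of that sample's cells}}.
    """
    counts = defaultdict(Counter)
    for sample_id, cluster_id in cells:
        counts[sample_id][cluster_id] += 1
    fractions = {}
    for sample_id, counter in counts.items():
        total = sum(counter.values())
        fractions[sample_id] = {c: n / total for c, n in counter.items()}
    return fractions


def mean_fraction_by_group(fractions, groups, cluster_id):
    """groups: {sample_id: group label}. Returns {group: mean cluster fraction}."""
    by_group = defaultdict(list)
    for sample_id, frac in fractions.items():
        # A sample with no cells in the cluster contributes a fraction of 0.
        by_group[groups[sample_id]].append(frac.get(cluster_id, 0.0))
    return {group: mean(values) for group, values in by_group.items()}


# Hypothetical toy data: three samples, cells labeled with cluster numbers.
cells = [
    ("S1", 12), ("S1", 12), ("S1", 3), ("S1", 3),
    ("S2", 12), ("S2", 3), ("S2", 3), ("S2", 3),
    ("S3", 3), ("S3", 3), ("S3", 22), ("S3", 22),
]
groups = {"S1": "non_responder", "S2": "non_responder", "S3": "responder"}

fractions = cluster_fractions(cells)
summary = mean_fraction_by_group(fractions, groups, cluster_id=12)
print(summary)
```

In this toy input, cluster 12 makes up a larger share of cells in the non-responder samples than in the responder sample, which mirrors the direction of the overrepresentation described above.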
The methods and compositions of the current disclosure pertain to signatures used to determine if a patient will be susceptible and respond to ICT treatment.
As used herein, “immune checkpoints” refers to proteins or peptides that regulate the activity of an immune response. For example, some immune checkpoints interfere with the ability of the immune system to mount an effective response. By way of example but not by way of limitation, immune checkpoints include the PD-1:PD-L1/PD-L2 axis.
As used herein, “immune checkpoint therapy” (“ICT”) refers to an intervention that is targeted to interfere with the normal function of “immune checkpoints.” In some embodiments, ICT comprises a treatment that interferes with the function of PD-1 or its ligands PD-L1 and PD-L2. In some embodiments, the ICT comprises a monoclonal antibody targeted to PD-1. In some embodiments, the monoclonal ICT therapy is selected from the group consisting of pembrolizumab, nivolumab, cemiplimab, atezolizumab, dostarlimab, durvalumab, and avelumab.
Checkpoint inhibitors that comprise anti-PD-1 antibodies or anti-PD-L1 antibodies or fragments thereof are known to those skilled in the art, and include, but are not limited to, cemiplimab, nivolumab, pembrolizumab, MEDI0680 (AMP-514), spartalizumab, camrelizumab, sintilimab, toripalimab, dostarlimab, and AMP-224. Checkpoint inhibitors that comprise anti-PD-L1 antibodies known to those skilled in the art include, but are not limited to, atezolizumab, avelumab, durvalumab, and KN035. The antibody may comprise a monoclonal antibody (mAb), chimeric antibody, antibody fragment, single chain, or other antibody variant construct, as known to those skilled in the art. PD-1 inhibitors may include, but are not limited to, for example, PD-1 and PD-L1 antibodies or fragments thereof, including nivolumab, an anti-PD-1 antibody, available from Bristol-Myers Squibb Co and described in U.S. Pat. Nos. 7,595,048, 8,728,474, 9,073,994, 9,067,999, 8,008,449 and 8,779,105; pembrolizumab, an anti-PD-1 antibody, available from Merck and Co and described in U.S. Pat. Nos. 8,952,136, 8,354,509, 8,900,587 and EP2170959; atezolizumab, an anti-PD-L1 antibody, available from Genentech, Inc. (Roche) and described in U.S. Pat. No. 8,217,149; avelumab (Bavencio, Pfizer, formulation described in PCT Publ. WO2017097407), durvalumab (Imfinzi, Medimmune/AstraZeneca, WO2011066389), cemiplimab (Libtayo, Regeneron Pharmaceuticals Inc., Sanofi, see, e.g., U.S. Pat. Nos. 9,938,345 and 9,987,500), spartalizumab (PDR001, Novartis), camrelizumab (AiRuiKa, Hengrui Medicine Co.), sintilimab (Tyvyt, Innovent Biologics/Eli Lilly), KN035 (Envafolimab, Tracon Pharmaceuticals, see, e.g., WO2017020801A1); tislelizumab, available from BeiGene and described in U.S. Pat. No. 8,735,553; among others and the like.
Other PD-1 and PD-L1 antibodies that are in development may also be used in the practice of the present invention, including, for example, PD-1 inhibitors including toripalimab (JS-001, Shanghai Junshi Biosciences), dostarlimab (GlaxoSmithKline), INCMGA00012 (Incyte, MacroGenics), AMP-224 (AstraZeneca/MedImmune and GlaxoSmithKline), AMP-514 (AstraZeneca), and PD-L1 inhibitors including AUNP12 (Aurigene and Laboratoires), CA-170 (Aurigene/Curis), and BMS-986189 (Bristol-Myers Squibb), among others (the reference citations regarding the antibodies noted above are incorporated by reference in their entireties with respect to the antibodies, their structure and sequences). Fragments of PD-1 or PD-L1 antibodies include those fragments of the antibodies that retain their function in binding PD-1 or PD-L1 as known in the art, for example, as described in AU2008266951 and Nigam et al. “Development of high affinity engineered antibody fragments targeting PD-L1 for immunoPET,” J Nucl Med May 1, 2018 vol. 59 no. supplement 1 1101, the contents of which are incorporated by reference in their entireties.
As used herein, “cancer” refers to many diseases, e.g., cell proliferative diseases, wherein an organism's cells grow uncontrollably and may spread to other locations in the organism. By way of example but not by way of limitation, cancer may refer to breast cancer, lung cancer, prostate cancer, skin cancer, colon cancer, leukemia, or lymphoma. In some embodiments, cancer refers to melanoma. In some embodiments, cancer refers to basal cell carcinoma (BCC). Specifically, the cancers may be cancers in which checkpoint inhibitors are used for treatment, including anti-PD-1 therapies.
In a first aspect of the current disclosure, methods of determining susceptibility and response to immune checkpoint therapy in a subject in need thereof are provided. As used herein, “susceptibility” refers to the expectation that a patient will respond positively to the indicated therapy. As used herein, “response” refers to a condition where therapeutic targets, for example, tumor burden, that have been defined a priori have been significantly modified by treatment. Such modification includes a reduction of tumor burden, inhibition or reduction of tumor growth, and the like. In some embodiments, the method comprises detecting one or more genes associated with an immune cell gene expression signature (ImmuneCells.Sig) of Table 1, wherein the detecting of one or more of the genes detects resistance to the immune checkpoint therapy. In some embodiments of the method, the subject has melanoma. In some embodiments, the subject has basal cell carcinoma (BCC). In some embodiments, the method comprises treating the subject with immune checkpoint therapy if the one or more genes is not detected. In some embodiments, the one or more genes are associated with macrophages that overexpress TREM2 or a subset of γδ T cells. In some embodiments, the one or more genes comprise 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, or 108 of the biomarkers listed in Table 1.
In a second aspect of the current disclosure, methods of treating a subject with cancer are provided. In some embodiments, the method comprises: a) determining if the subject has a cancer which is susceptible and responsive to a checkpoint inhibitor by determining the expression profile of one or more genes associated with an immune cell gene expression signature (ImmuneCells.Sig), and b) treating the subject with the checkpoint inhibitor in an amount effective to treat the cancer. In some embodiments, the cancer is melanoma. In some embodiments, the checkpoint inhibitor is a PD-1 or PD-L1 inhibitor. In some embodiments, the one or more genes comprise 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, or 108 of the biomarkers listed in Table 1.
In a third aspect of the current disclosure, a gene chip is provided. In some embodiments, the gene chip comprises an expression signature (ImmuneCells.Sig) useful for determining the response to immune checkpoint therapy, the gene chip comprising probes useful to detect the level of 10 or more biomarkers listed in Table 1. In some embodiments, the gene chip comprises probes useful to detect the level of 20, 30, 40, 50, 60, 70, 80, 90, 100, or 108 of the biomarkers listed in Table 1. In some embodiments, the chip comprises 108 biomarkers listed in Table 1.
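As a rough illustration of the "10 or more biomarkers" condition above, the following sketch checks how many signature biomarkers are covered by the probes on a chip design. It is a sketch under stated assumptions: the gene identifiers are placeholders (the actual entries of Table 1 are not reproduced here), and the function names are hypothetical.

```python
def panel_coverage(chip_probes, signature_genes):
    """Return the sorted signature genes that have at least one probe on the chip."""
    return sorted(set(chip_probes) & set(signature_genes))


def meets_minimum(chip_probes, signature_genes, minimum=10):
    """True if the chip covers at least `minimum` signature biomarkers."""
    return len(panel_coverage(chip_probes, signature_genes)) >= minimum


# Placeholder identifiers standing in for the 108 biomarkers of Table 1.
signature_genes = [f"GENE_{i:03d}" for i in range(1, 109)]

# Hypothetical chip design: probes for 12 signature genes plus one control probe.
chip_probes = signature_genes[:12] + ["CONTROL_PROBE_1"]

covered = panel_coverage(chip_probes, signature_genes)
print(len(covered), meets_minimum(chip_probes, signature_genes, minimum=10))
```

The same check can be repeated with `minimum` set to 20, 30, and so on up to 108 to test a design against the larger panels recited above.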
In a fourth aspect of the current disclosure, methods for processing a test sample to determine a likelihood that a cancer is responsive to anti-PD-1 immunotherapy in a patient are provided. In some embodiments, the methods comprise (a) receiving information indicative of an expression level of a plurality of biomarkers in a tumor sample extracted from the patient; (b) providing the plurality of biomarker levels as input to a classifier configured to predict likelihood that a patient is reactive in response to checkpoint therapy, preferably anti-PD-1 immunotherapy, in a computer to classify the test sample, wherein the classifier was trained with a plurality of training samples comprising pre-therapy tumor expression data of known PD-1 therapy responding patients and pre-therapy tumor expression data of known non-responder patients, and wherein the sensitivity and specificity of the classifier is sufficient to identify the likelihood that the patient is responsive to anti-PD-1 immunotherapy; (c) receiving, from the classifier, an output report that identifies said classification as indicative of the likelihood that the patient is responsive to anti-PD-1 immunotherapy. In some embodiments, the method for processing a test sample further comprises: determining, based on the output, that the patient is likely responsive to anti-PD-1 immunotherapy; and administering anti-PD-1 immunotherapy to the patient based on the determination that the patient is likely to respond to anti-PD-1 immunotherapy.
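Steps (a) through (c) above can be sketched in code as follows. This is a minimal illustration and not the classifier of the disclosure: it assumes a simple nearest-centroid rule, a three-gene input in which only TREM2 is named in this description and the other two identifiers are placeholders, and invented expression values. The class name, function name, and report fields are likewise hypothetical.

```python
from math import dist
from statistics import mean

# TREM2 appears in the description; GENE_B and GENE_C are placeholder names.
GENES = ["TREM2", "GENE_B", "GENE_C"]


class NearestCentroidClassifier:
    """Assigns a sample to the class whose mean training profile is closest."""

    def fit(self, samples, labels):
        self.centroids = {}
        for label in set(labels):
            rows = [s for s, lab in zip(samples, labels) if lab == label]
            # Column-wise mean gives the class centroid.
            self.centroids[label] = [mean(col) for col in zip(*rows)]
        return self

    def predict(self, sample):
        return min(self.centroids, key=lambda lab: dist(sample, self.centroids[lab]))


def classify_sample(expression, classifier, genes=GENES):
    """(a) receive expression levels, (b) classify, (c) return an output report."""
    vector = [expression[g] for g in genes]  # raises KeyError if a gene is missing
    label = classifier.predict(vector)
    return {
        "classification": label,
        "likely_responsive": label == "responder",
        "genes_used": list(genes),
    }


# Invented pre-therapy training profiles (arbitrary units), for illustration only.
training_samples = [
    [1.0, 5.0, 4.0], [1.5, 4.5, 4.2],   # known responders
    [6.0, 1.0, 1.2], [5.5, 1.4, 0.8],   # known non-responders
]
training_labels = ["responder", "responder", "non_responder", "non_responder"]
classifier = NearestCentroidClassifier().fit(training_samples, training_labels)

report = classify_sample({"TREM2": 5.8, "GENE_B": 1.1, "GENE_C": 1.0}, classifier)
print(report)
```

Any of the supervised learning approaches contemplated in this disclosure could stand in for the nearest-centroid rule; the point of the sketch is the flow from expression input to classifier to output report.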
As used herein, “biomarker” refers to a biological molecule found in blood, other body fluids, or tissues that is a sign of a normal or abnormal process, or of a condition or disease. A biomarker may be used to see how well the body responds to a treatment for a disease or condition. A biomarker is also called a molecular marker or a signature molecule.
In some embodiments, the classifier has an accuracy of at least 85%. In some embodiments, the method comprises: detecting the expression level of the plurality of biomarkers by sequencing the nucleic acid molecules from the sample to yield data comprising one or more levels of gene expression in the sample. In some embodiments, the method comprises RNA sequencing (RNA-seq) analysis. As used herein, “sequencing” refers to the sequencing of nucleic acids. Sequencing of nucleic acids may be accomplished using, by way of example but not by way of limitation, Sanger sequencing or next-generation sequencing. In some embodiments, the plurality of biomarkers comprises 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, or 108 of the biomarkers listed in Table 1. In some embodiments, the plurality of biomarkers consists of the 10 biomarkers in Table 2. In some embodiments, the patient's tumor is of a type selected from the group consisting of melanoma and basal cell carcinoma (BCC). In some embodiments of the methods, step (b) comprises identifying a copy number variation or a variant in the nucleotide data. In some embodiments of the method, said known samples comprise a cancer tissue sample from melanoma or basal cell carcinoma (BCC). In some embodiments, said plurality of training samples further comprises a normal tissue sample. In some embodiments of the method, said sensitivity is at least 70%. In some embodiments, said classifier generates said classification at a specificity of at least about 90%, alternatively at least 95%. In some embodiments, said sample of melanoma tissue was from a patient that was sensitive to checkpoint inhibitor therapy, preferably anti-PD-1 therapy, and wherein said classifier classifies said sample as likely to be responsive to the checkpoint inhibitor therapy.
In some embodiments, said sample of melanoma tissue was from a patient treated with anti-PD-1 therapy that was not responsive to checkpoint therapy, and wherein said classifier classifies said sample of melanoma tissue as not likely to be responsive to checkpoint therapy. In some embodiments, the method further comprises providing a treatment to said subject.
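The accuracy, sensitivity, and specificity figures recited above can be computed from a classifier's predictions on samples with known outcomes. The sketch below is illustrative only: the function names and toy labels are hypothetical, "responder" is assumed to be the positive class, and the default thresholds are the 85% accuracy, 70% sensitivity, and 90% specificity values mentioned in the embodiments.

```python
def performance(y_true, y_pred, positive="responder"):
    """Compute sensitivity, specificity, and accuracy for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one sample of each class")
    return {
        "sensitivity": tp / (tp + fn),   # responders correctly identified
        "specificity": tn / (tn + fp),   # non-responders correctly identified
        "accuracy": (tp + tn) / len(pairs),
    }


def meets_thresholds(metrics, accuracy=0.85, sensitivity=0.70, specificity=0.90):
    """Check the metrics against the minimum values recited in the embodiments."""
    return (
        metrics["accuracy"] >= accuracy
        and metrics["sensitivity"] >= sensitivity
        and metrics["specificity"] >= specificity
    )


# Hypothetical validation set: 10 responders and 10 non-responders.
y_true = ["responder"] * 10 + ["non_responder"] * 10
# Hypothetical predictions: one responder is misclassified.
y_pred = ["responder"] * 9 + ["non_responder"] * 11

metrics = performance(y_true, y_pred)
print(metrics, meets_thresholds(metrics))
```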
In a fifth aspect of the current disclosure, a kit for detecting the likelihood of a subject with cancer to be responsive to checkpoint therapy is provided. In some embodiments, the kit comprises a panel of 10 biomarkers from Table 2 attached to a solid surface and instructions for use.
In a sixth aspect of the current disclosure, systems for processing a test sample to determine a likelihood that a patient with cancer is responsive to anti-PD-1 immunotherapy are provided. In some embodiments, the system comprises: (a) a computer capable of receiving input data of the expression of a plurality of biomarker levels, (b) a classifier configured to predict the likelihood that a patient will respond to anti-PD-1 immunotherapy to classify the test sample, and (c) an output report from the classifier that identifies said classification as indicative of the likelihood that the patient is responsive to anti-PD-1 immunotherapy.
The present disclosure provides systems that are programmed to implement methods of the disclosure.
Generally, machine learning algorithms are used to construct models that accurately assign class labels to examples based on the input features that describe the example. In some cases it may be advantageous to employ machine learning and/or deep learning approaches for the methods described herein. Further, machine learning can be understood as the ability of a learning machine to perform accurately on new, unseen examples/tasks after having experienced a learning data set. Machine learning may include the following concepts and methods. Supervised learning concepts may include AODE; Artificial neural network, such as Backpropagation, Autoencoders, Hopfield networks, Boltzmann machines, Restricted Boltzmann Machines, and Spiking neural networks; Bayesian statistics, such as Bayesian network and Bayesian knowledge base; Case-based reasoning; Gaussian process regression; Gene expression programming; Group method of data handling (GMDH); Inductive logic programming; Instance-based learning; Lazy learning; Learning Automata; Learning Vector Quantization; Logistic Model Tree; Minimum message length (decision trees, decision graphs, etc.), such as Nearest Neighbor Algorithm and Analogical modeling; Probably approximately correct (PAC) learning; Ripple down rules, a knowledge acquisition methodology; Symbolic machine learning algorithms; Support vector machines; Random Forests; Ensembles of classifiers, such as Bootstrap aggregating (bagging) and Boosting (meta-algorithm); Ordinal classification; Information fuzzy networks (IFN); Conditional Random Field; ANOVA; Linear classifiers, such as Fisher's linear discriminant, Linear regression, Logistic regression, Multinomial logistic regression, Naive Bayes classifier, Perceptron, Support vector machines; Quadratic classifiers; k-nearest neighbor; Boosting; Decision trees, such as C4.5, Random forests, ID3, CART, SLIQ, SPRINT; Bayesian networks, such as Naive Bayes; and Hidden Markov models.
Unsupervised learning concepts may include: Expectation-maximization algorithm; Vector Quantization; Generative topographic map; Information bottleneck method; Artificial neural network, such as Self-organizing map; Association rule learning, such as Apriori algorithm, Eclat algorithm, and FP-growth algorithm; Hierarchical clustering, such as Single-linkage clustering and Conceptual clustering; Cluster analysis, such as K-means algorithm, Fuzzy clustering, DBSCAN, and OPTICS algorithm; and Outlier Detection, such as Local Outlier Factor. Semi-supervised learning concepts may include: Generative models; Low-density separation; Graph-based methods; and Co-training. Reinforcement learning concepts may include: Temporal difference learning; Q-learning; Learning Automata; and SARSA. Deep learning concepts may include: Deep belief networks; Deep Boltzmann machines; Deep Convolutional neural networks; Deep Recurrent neural networks; and Hierarchical temporal memory.
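As one concrete instance of the unsupervised methods listed above, the following is a minimal sketch of the K-means algorithm applied to two-dimensional points. It is illustrative only and is not asserted to be the clustering method used in the analyses described herein; the function name, the fixed initial centroids, and the toy points are assumptions made for the example.

```python
from math import dist
from statistics import mean


def k_means(points, initial_centroids, iterations=10):
    """Plain K-means: alternate nearest-centroid assignment and centroid update."""
    centroids = [list(c) for c in initial_centroids]
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            idx = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[idx].append(p)
        # Update step: move each centroid to the mean of its assigned points;
        # an empty cluster keeps its previous centroid.
        new_centroids = [
            [mean(col) for col in zip(*cluster)] if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    labels = [
        min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        for p in points
    ]
    return centroids, labels


# Hypothetical toy points forming two well-separated groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2), (5.0, 5.0), (5.2, 4.8), (4.8, 5.2)]
centroids, labels = k_means(points, initial_centroids=[(0.0, 0.0), (6.0, 6.0)])
print(labels)
```

Fixed initial centroids are used so the example is deterministic; practical implementations typically use randomized or heuristic initialization.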
The computer system 200 depicted in
The storage unit 215 can store files, such as output reports, and/or communications with the data about samples, or any aspect of data associated with the present disclosure.
The computer server 202 can communicate with one or more remote computer systems through the network 230. The one or more remote computer systems may be, for example, personal computers, laptops, tablets, telephones, smartphones, or personal digital assistants.
In some applications the computer system 200 includes a single server 202. In other situations, the system includes multiple servers in communication with one another through an intranet, extranet and/or the internet.
The server 202 can be adapted to store measurement data or a database as provided herein, patient information from the subject, such as, for example, medical history, family history, demographic data and/or other clinical or personal information of potential relevance to a particular application. Such information can be stored on the storage unit 215 or the server 202 and such data can be transmitted through a network.
Methods as described herein can be implemented by way of machine (or computer processor) executable code (or software) stored on an electronic storage location of the server 202, such as, for example, on the memory 210, or electronic storage unit 215. During use, the code can be executed by the processor 205. In some cases, the code can be retrieved from the storage unit 215 and stored on the memory 210 for ready access by the processor 205. In some situations, the electronic storage unit 215 can be precluded, and machine-executable instructions are stored on memory 210. Alternatively, the code can be executed on a second computer system 240.
Aspects of the systems and methods provided herein, such as the server 202, can be embodied in programming. Various aspects of the technology may be thought of as “products” or “articles of manufacture” typically in the form of machine (or processor) executable code and/or associated data that is carried on or embodied in a type of machine readable medium. Machine-executable code can be stored on an electronic storage unit, such as memory (e.g., read-only memory, random-access memory, flash memory) or a hard disk. “Storage” type media can include any or all of the tangible memory of the computers, processors or the like, or associated modules thereof, such as various semiconductor memories, tape drives, disk drives and the like, which may provide non-transitory storage at any time for the software programming. All or portions of the software may at times be communicated through the Internet or various other telecommunication networks. Such communications, for example, may enable loading of the software from one computer or processor into another, for example, from a management server or host computer into the computer platform of an application server. Thus, another type of media that may bear the software elements includes optical, electrical, and electromagnetic waves, such as used across physical interfaces between local devices, through wired and optical landline networks and over various air-links. The physical elements that carry such waves, such as wired or wireless links, optical links, or the like, also may be considered as media bearing the software. As used herein, unless restricted to non-transitory, tangible “storage” media, terms such as computer or machine “readable medium” can refer to any medium that participates in providing instructions to a processor for execution.
The computer systems described herein may comprise computer-executable code for performing any of the algorithms or algorithms-based methods described herein. In some applications the algorithms described herein will make use of a memory unit that is comprised of at least one database.
Data relating to the present disclosure can be transmitted over a network or connections for reception and/or review by a receiver. The receiver can be but is not limited to the subject to whom the report pertains; or to a caregiver thereof, e.g., a health care provider, manager, other health care professional, or other caretaker; a person or entity that performed and/or ordered the analysis. The receiver can also be a local or remote system for storing such reports (e.g. servers or other systems of a “cloud computing” architecture). In one embodiment, a computer-readable medium includes a medium suitable for transmission of a result of an analysis of a biological sample using the methods described herein.
Hence, a machine readable medium, such as computer-executable code, may take many forms, including but not limited to, a tangible storage medium, a carrier wave medium or physical transmission medium. Non-volatile storage media include, for example, optical or magnetic disks, such as any of the storage devices in any computer(s) or the like, such as may be used to implement the databases, etc. shown in the drawings. Volatile storage media include dynamic memory, such as main memory of such a computer platform. Tangible transmission media include coaxial cables, copper wire, and fiber optics, including the wires that comprise a bus within a computer system. Carrier-wave transmission media may take the form of electric or electromagnetic signals, or acoustic or light waves such as those generated during radio frequency (RF) and infrared (IR) data communications. Common forms of computer-readable media therefore include, for example: a floppy disk, a flexible disk, hard disk, magnetic tape, any other magnetic medium, a CD-ROM, DVD or DVD-ROM, any other optical medium, punch cards, paper tape, any other physical storage medium with patterns of holes, a RAM, a ROM, a PROM, an EPROM, a FLASH-EPROM, any other memory chip or cartridge, a carrier wave transporting data or instructions, cables or links transporting such a carrier wave, or any other medium from which a computer may read programming code and/or data. Many of these forms of computer readable media may be involved in carrying one or more sequences of one or more instructions to a processor for execution.
For purposes of the present invention, “treating” or “treatment” describes the management and care of a subject for the purpose of combating the disease, condition, or disorder. Treating includes the administration of a checkpoint inhibitor therapy when it is determined that the subject would be provided a benefit by the administration of the treatment to prevent the onset of the symptoms or complications, alleviating the symptoms or complications, or eliminating the disease, condition, or disorder.
The term “treating” can be characterized by one or more of the following: (a) reducing, slowing, or inhibiting the growth of cancer, including reducing, slowing, or inhibiting the growth of cancer cells; (b) preventing the further growth of tumors; (c) reducing or preventing the metastasis of cancer within a patient; and (d) reducing or ameliorating at least one symptom of the cancer. In some embodiments, the optimum effective amounts can be readily determined by one of ordinary skill in the art using routine experimentation.
As used herein, the terms “effective amount” and “therapeutically effective amount” refer to the quantity of active therapeutic agent or agents sufficient to yield a desired therapeutic response without undue adverse side effects such as toxicity, irritation, or allergic response. The specific “effective amount” will, obviously, vary with such factors as the particular condition being treated, the physical condition of the subject, the type of animal being treated, the duration of the treatment, the nature of concurrent therapy (if any), and the specific formulations employed and the structure of the compounds or its derivatives.
As used herein, the terms “administering” and “administration” refer to any method of providing a pharmaceutical preparation to a subject. Such methods are well known to those skilled in the art and include, but are not limited to, oral administration, transdermal administration, administration by inhalation, nasal administration, topical administration, intravaginal administration, intraaural administration, rectal administration, sublingual administration, buccal administration, and parenteral administration, including by injection, such as intravenous administration, intra-arterial administration, intramuscular administration, intradermal administration, intrathecal administration and subcutaneous administration. Administration can be continuous or intermittent. In various aspects, a preparation can be administered therapeutically; that is, administered to treat an existing disease or condition. In a preferred embodiment, the administration is intravenous administration.
The terms “nucleic acid” and “nucleic acid molecule,” as used herein, refer to a compound comprising a nucleobase and an acidic moiety, e.g., a nucleoside, a nucleotide, or a polymer of nucleotides. Nucleic acids generally refer to polymers comprising nucleotides or nucleotide analogs joined together through backbone linkages such as but not limited to phosphodiester bonds. Nucleic acids include deoxyribonucleic acids (DNA) and ribonucleic acids (RNA) such as messenger RNA (mRNA), transfer RNA (tRNA), etc. Typically, polymeric nucleic acids, e.g., nucleic acid molecules comprising three or more nucleotides are linear molecules, in which adjacent nucleotides are linked to each other via a phosphodiester linkage. In some embodiments, “nucleic acid” refers to individual nucleic acid residues (e.g. nucleotides and/or nucleosides). In some embodiments, “nucleic acid” refers to an oligonucleotide chain comprising three or more individual nucleotide residues. As used herein, the terms “oligonucleotide” and “polynucleotide” can be used interchangeably to refer to a polymer of nucleotides (e.g., a string of at least three nucleotides). In some embodiments, “nucleic acid” encompasses RNA as well as single and/or double-stranded DNA. Nucleic acids may be naturally occurring, for example, in the context of a genome, a transcript, an mRNA, tRNA, rRNA, siRNA, snRNA, a plasmid, cosmid, chromosome, chromatid, or other naturally occurring nucleic acid molecule. On the other hand, a nucleic acid molecule may be a non-naturally occurring molecule, e.g., a recombinant DNA or RNA, an artificial chromosome, an engineered genome, or fragment thereof, or a synthetic DNA, RNA, DNA/RNA hybrid, or include non-naturally occurring nucleotides or nucleosides. Furthermore, the terms “nucleic acid,” “DNA,” “RNA,” and/or similar terms include nucleic acid analogs, i.e. analogs having other than a phosphodiester backbone. 
Nucleic acids can be purified from natural sources, produced using recombinant expression systems and optionally purified, chemically synthesized, etc. Where appropriate, e.g., in the case of chemically synthesized molecules, nucleic acids can comprise nucleoside analogs such as analogs having chemically modified bases or sugars, and backbone modifications. A nucleic acid sequence is presented in the 5′ to 3′ direction unless otherwise indicated. In some embodiments, a nucleic acid is or comprises natural nucleosides (e.g. adenosine, thymidine, guanosine, cytidine, uridine, deoxyadenosine, deoxythymidine, deoxyguanosine, and deoxycytidine); nucleoside analogs (e.g., 2-aminoadenosine, 2-thiothymidine, inosine, pyrrolo-pyrimidine, 3-methyl adenosine, 5-methylcytidine, 2-aminoadenosine, C5-bromouridine, C5-fluorouridine, C5-iodouridine, C5-propynyl-uridine, C5-propynyl-cytidine, C5-methylcytidine, 2-aminoadenosine, 7-deazaadenosine, 7-deazaguanosine, 8-oxoadenosine, 8-oxoguanosine, O(6)-methylguanine, and 2-thiocytidine); chemically modified bases; biologically modified bases (e.g., methylated bases); intercalated bases; modified sugars (e.g., 2′-fluororibose, ribose, 2′-deoxyribose, arabinose, and hexose); and/or modified phosphate groups (e.g., phosphorothioates and 5′-N-phosphoramidite linkages).
The present invention has been described in terms of one or more preferred embodiments, and it should be appreciated that many equivalents, alternatives, variations, and modifications, aside from those expressly stated, are possible and within the scope of the invention.
It should be apparent to those skilled in the art that many additional modifications besides those already described are possible without departing from the inventive concepts. In interpreting this disclosure, all terms should be interpreted in the broadest possible manner consistent with the context. Variations of the term “comprising” should be interpreted as referring to elements, components, or steps in a non-exclusive manner, so the referenced elements, components, or steps may be combined with other elements, components, or steps that are not expressly referenced. Embodiments referenced as “comprising” certain elements are also contemplated as “consisting essentially of” and “consisting of” those elements. The terms “consisting essentially of” and “consisting of” should be interpreted in line with the MPEP and relevant Federal Circuit interpretation. The transitional phrase “consisting essentially of” limits the scope of a claim to the specified materials or steps “and those that do not materially affect the basic and novel characteristic(s)” of the claimed invention. “Consisting of” is a closed term that excludes any element, step or ingredient not specified in the claim. For example, with regard to sequences, “consisting of” refers to the sequence listed in the SEQ ID NO. and does not refer to larger sequences that may contain the SEQ ID as a portion thereof.
As used in this specification and the claims, the singular forms “a,” “an,” and “the” include plural forms unless the context clearly dictates otherwise. For example, the term “a substituent” should be interpreted to mean “one or more substituents,” unless the context clearly dictates otherwise.
As used herein, “about”, “approximately,” “substantially,” and “significantly” will be understood by persons of ordinary skill in the art and will vary to some extent on the context in which they are used. If there are uses of the term which are not clear to persons of ordinary skill in the art given the context in which it is used, “about” and “approximately” will mean up to plus or minus 10% of the particular term and “substantially” and “significantly” will mean more than plus or minus 10% of the particular term.
As used herein, the terms “include” and “including” have the same meaning as the terms “comprise” and “comprising.” The terms “comprise” and “comprising” should be interpreted as being “open” transitional terms that permit the inclusion of additional components further to those components recited in the claims. The terms “consist” and “consisting of” should be interpreted as being “closed” transitional terms that do not permit the inclusion of additional components other than the components recited in the claims. The term “consisting essentially of” should be interpreted to be partially closed and allowing the inclusion only of additional components that do not fundamentally alter the nature of the claimed subject matter.
The phrase “such as” should be interpreted as “for example, including.” Moreover, the use of any and all exemplary language, including but not limited to “such as”, is intended merely to better illuminate the invention and does not pose a limitation on the scope of the invention unless otherwise claimed.
Furthermore, in those instances where a convention analogous to “at least one of A, B and C, etc.” is used, in general such a construction is intended in the sense that one having ordinary skill in the art would understand the convention (e.g., “a system having at least one of A, B and C” would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together). It will be further understood by those within the art that virtually any disjunctive word and/or phrase presenting two or more alternative terms, whether in the description or figures, should be understood to contemplate the possibilities of including one of the terms, either of the terms, or both terms. For example, the phrase “A or B” will be understood to include the possibilities of “A” or “B” or “A and B.”
All language such as “up to,” “at least,” “greater than,” “less than,” and the like includes the number recited and refers to ranges which can subsequently be broken down into ranges and subranges. A range includes each individual member. Thus, for example, a group having 1-3 members refers to groups having 1, 2, or 3 members. Similarly, a group having 6 members refers to groups having 1, 2, 3, 4, 5, or 6 members, and so forth.
The modal verb “may” refers to the preferred use or selection of one or more options or choices among the several described embodiments or features contained within the same. Where no options or choices are disclosed regarding a particular embodiment or feature contained in the same, the modal verb “may” refers to an affirmative act regarding how to make or use an aspect of a described embodiment or feature contained in the same, or a definitive decision to use a specific skill regarding a described embodiment or feature contained in the same. In this latter context, the modal verb “may” has the same meaning and connotation as the auxiliary verb “can.”
The invention will be more fully understood upon consideration of the following non-limiting examples.
The following Examples are offered for illustrative purposes only, and are not intended to limit the scope of the present invention in any way. Indeed, various modifications of the invention in addition to those shown and described herein will become apparent to those skilled in the art from the foregoing description and the following examples and fall within the scope of the appended claims.
While immune checkpoint therapies (ICT) have improved outcomes for some cancer patients, most patients do not respond to ICT. Previous whole-exome sequencing (WES) and transcriptome sequencing of tumors identified multiple factors that are associated with favorable ICT outcome, including expression of PD-L1 (ref. 1), high tumor mutational burden2, and the presence of tumor-infiltrating CD8+ T cells3. Markers indicative of unfavorable response include defects in IFNγ pathways or antigen presentation4,5. While these studies represented a first step in identifying biomarkers, studies using single-cell RNA sequencing (scRNA-seq) have the potential to greatly improve the identification of factors underlying ICT outcomes. For example, one scRNA-seq study of 48 tumor biopsies of responding and non-responding tumors after ICT treatment has the potential to be insightful given the number of patients and high-quality data6.
To determine if some types of immune cells and their subclusters are associated with ICT outcomes, we analyze the scRNA-seq datasets from multiple studies6-8 and identify the immune cell subpopulations that could play an important role in determining ICT responsiveness. The analysis of several additional bulk RNA-seq datasets of melanoma9-12 identifies and validates an ICT outcome signature, ImmuneCells.Sig, enriched with the genes characteristic of the immune cell subsets detected in the scRNA-seq studies. It predicts the ICT outcomes of melanoma patients more accurately than the previously reported ICT response signatures.
Specifically, we find that a subset of macrophages (cluster 12) and a subset of gammadelta (γδ) T cells (cluster 21) are highly enriched in the ICT non-responding tumors. On the other hand, the percentage of a subset of B cells (cluster 22) is significantly smaller in the ICT non-responders compared to the responders. The validated ImmuneCells.Sig ICT outcome signature is enriched with the genes characteristic of the above three immune cell subsets. It can predict the ICT outcomes of melanoma patients more accurately than the previous outstanding signatures, thereby supporting the role of these specific types of immune cells in affecting the ICT outcomes. These findings substantially extend our understanding of the factors associated with ICT responsiveness. Our results may warrant further investigation in the cancer immunotherapy setting.
Association of Immune Cell Populations with ICT Outcome
We utilized the Seurat package13,14 to perform fine clustering of the original 16,291 single cells based on raw data from a previous melanoma study6. The melanoma patient response categories were defined by RECIST (Response evaluation criteria in solid tumors) as: complete response (CR) and partial response (PR) for responders, or stable disease (SD) and progressive disease (PD) for non-responders15. Progression-free survival was also considered in distinguishing the responders from non-responders. To relate molecular and cellular variables with responses of individual lesions to therapy, the previous study classified each of the 48 tumor samples based on radiologic assessments into progression/non-responder (NR; n=31, including SD/PD samples) or regression/responder (R; n=17, including CR/PR samples)6. The gene expression data of single cells from tumors with different ICT outcomes, i.e., regression/responder (Responder—‘R’; n.patients=17; n.cells=5564) and progression/non-responder (Non-Responder—‘NR’; n.patients=31; n.cells=10,727), were aligned and projected in two-dimensional space through uniform manifold approximation and projection (UMAP)16 to allow the identification of ICT-outcome-associated immune cell populations. This analysis generated 23 cell clusters across all samples (
We utilized gene expression patterns of canonical markers to classify the 23 clusters into 10 major immune cell populations (
We tested the 23 immune cell clusters for their percentage differences between the non-responders and responders at the patient level (
To account for clinical differences, we divided the melanoma samples into subgroups according to three factors: (1) ICT outcomes, (2) sample collection time (before or after ICT), and (3) treatment schemes (
TREM2hi Macrophages May Contribute to ICT Resistance
Of the macrophage populations in clusters 6, 12, and 23 (
Significantly Enriched Pathways in TREM2hi Macrophages
To identify if functional heterogeneity of these macrophage subsets could be associated with ICT outcomes, we performed ‘Reactome pathways’ analysis for macrophages based on cluster-specific genes detected by Seurat (
Validation of the TREM2hi Macrophage Signature
Since TREM2hi macrophages correlated with ICT resistance, we determined if tumors enriched in TREM2hi macrophages were associated with poor ICT outcomes. Based on the overexpressed genes of this macrophage subset, we developed a 40-gene set to characterize TREM2hi macrophages, which included the genes highly correlated with TREM2 expression (those for the complement system or M2 polarization), and other overexpressed genes (
Association of γδ T- and B-Cell Subsets with ICT Outcome
We also identified two clusters of γδ T cells (927 cells total; clusters 8 and 21,
We also identified a correlation between the presence of B cells and ICT response. All four B-cell clusters (13, 14, 17, and 22) were less abundant in the ICT non-responders, which suggests that tumor-associated B cells, in general, are associated with favorable ICT response. Most notably, the percentage of cluster 22 B cells (named as B_c22) was 9.3-fold lower in NR versus R (
Validation in the Other scRNA-Seq Datasets of ICT Patients
To validate the results we found based on the initial scRNA-seq data, we downloaded and re-analyzed another scRNA-seq dataset of melanoma with corresponding immunotherapy efficacy data7. This dataset did not have γδ T-cell data available. Interestingly, the deeper clustering of the macrophages and B cells sequenced by this study showed the existence of similar macrophage and B-cell subpopulations that resemble our identified TREM2hi macrophages and B_c22 B cells (
We also analyzed a single-cell RNA-seq dataset of basal cell carcinoma (BCC) patients before and after anti-PD-1 therapy8. We found that the results of our study can be generalized to BCC treated with ICT. Although this BCC scRNA-seq dataset did not sequence the γδ T cells, the results for macrophages and B cells in this BCC dataset are similar to our findings for the melanoma dataset. First, we did general clustering analyses and identified the overall macrophages and B cells populations (
The Development of an ICT Outcome Signature
Because the TREM2hi Mφ, Tgd_c21 and B_c22 populations exhibited the greatest quantitative differences between ICT non-responders and responders, we hypothesized that the expression of the feature genes of these populations may predict ICT outcome. To explore this hypothesis, we developed an ICT responsiveness signature based on the scRNA-seq dataset and a bulk gene expression dataset, GSE78220 (ref. 9), using the cancerclass R package23. This signature had significantly high prognostic value for ICT outcomes in the discovery dataset. Specifically, for the GSE78220 dataset (ref. 9; N=28, NR vs R: 13 vs 15), the signature had an AUC (area under the curve) of 0.98 (95% confidence interval [CI], 0.96-1), sensitivity of 93% (95% CI, 72-100%), and specificity of 85% (95% CI, 59-97%;
To validate the above ICT response signature, ImmuneCells.Sig, we analyzed three independent gene expression datasets of melanoma patients to test its predictive performance (refs. 10-12). For the first two datasets (GSE91061 and PRJEB23709)10,11, the pretreatment melanoma samples were selected for validation. Neither of these datasets was used to develop ImmuneCells.Sig. For the GSE91061 dataset (N=51, NR vs R: 25 vs 26), ImmuneCells.Sig performed well in differentiating NR from R tumors with an AUC of 0.96 (95% CI, 0.94-0.99), sensitivity of 88% (95% CI, 72-97%), and specificity of 92% (95% CI, 78-99%;
For further validation, we downloaded and analyzed the third dataset that includes the gene expression profile of a big cohort of melanoma patients who were treated by the anti-PD-1 immunotherapy, from which a large number of pretreatment melanoma samples from 103 patients with distinct response to ICT (46 responders vs 57 non-responders) had been subjected to RNA-seq12. Applied to this large dataset that was named as MGSP (melanoma genome sequencing project), the predictive value of ImmuneCells.Sig was still high. Specifically, it differentiated progressors from responders with an AUC of 0.88 (95% CI, 0.84-0.91), sensitivity of 79% (95% CI, 68-87%), and specificity of 79% (95% CI, 67-88%;
Among the four bulk RNA-seq datasets, only the PRJEB23709 dataset had pre-ICT biopsies for melanoma patients treated with either anti-PD-1 (41 patients: 19 non-responders vs 22 responders) or the combination of anti-PD-1 and anti-CTLA-4 drugs (32 patients: 8 non-responders vs 24 responders). We split the PRJEB23709 dataset into PRJEB23709_Pre_anti-PD-1 and PRJEB23709_Pre_Combo according to the treatment scheme (anti-PD-1 or combination of anti-PD-1 and anti-CTLA-4) and tested the performance of ImmuneCells.Sig in each subset. ImmuneCells.Sig accurately distinguished responders from non-responders in both the Pre_anti-PD-1 and Pre_Combo subgroups. For the PRJEB23709_Pre_anti-PD-1 subset, the performance of ImmuneCells.Sig was as follows: AUC=0.88 (95% CI, 0.83-0.94), sensitivity=86% (95% CI, 68-96%), and specificity=79% (95% CI, 58-92%;
Using the R package cancerclass, we can calculate a z-score for each pre-therapy biopsy from the expression values of the ImmuneCells.Sig genes to predict which patients are more likely to respond to anti-PD-1 or to anti-PD-1 plus anti-CTLA-4 combo therapy. For example, in the model built from the PRJEB23709_Pre_anti-PD-1 dataset, a threshold z-score of 0.19 yielded a sensitivity of 91% for responders. In the model built from the PRJEB23709_Pre_Combo dataset, a threshold z-score of 0.1 yielded a sensitivity of 91% for responders. Therefore, when a pre-therapy melanoma sample is tested, the corresponding patient may not respond to either anti-PD-1 treatment or anti-PD-1 plus anti-CTLA-4 combo treatment if the z-score is <0.1, may respond to the more toxic combo treatment if the z-score is within the range of [0.1, 0.19], and may respond to the less toxic anti-PD-1 treatment alone if the z-score is >0.19. Prediction of the outcomes of different therapy regimens is thus possible based on the application of ImmuneCells.Sig.
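The three-way reading of the z-score described above can be sketched as a small decision rule. This is only an illustration under stated assumptions: the function and enum names are hypothetical, the z-score is assumed to have been computed elsewhere (for example with cancerclass), and, as in the text, a single z-score is compared against both thresholds even though they come from two separately built models.

```python
from enum import Enum


class Suggestion(Enum):
    """Hypothetical labels for the three z-score ranges described in the text."""
    UNLIKELY_TO_RESPOND = "may not respond to anti-PD-1 alone or to the combination"
    COMBINATION = "may respond to anti-PD-1 plus anti-CTLA-4"
    ANTI_PD1_ALONE = "may respond to anti-PD-1 alone"


# Thresholds quoted in the text for the PRJEB23709 pre-therapy subsets.
COMBO_THRESHOLD = 0.10      # model built from PRJEB23709_Pre_Combo
ANTI_PD1_THRESHOLD = 0.19   # model built from PRJEB23709_Pre_anti-PD-1


def suggest_regimen(z_score: float) -> Suggestion:
    """Map an ImmuneCells.Sig z-score onto the three ranges given in the text.

    Assumption: z < 0.1, 0.1 <= z <= 0.19, and z > 0.19 are the only cases,
    with both interval end points belonging to the combination range.
    """
    if z_score < COMBO_THRESHOLD:
        return Suggestion.UNLIKELY_TO_RESPOND
    if z_score <= ANTI_PD1_THRESHOLD:
        return Suggestion.COMBINATION
    return Suggestion.ANTI_PD1_ALONE
```

For instance, `suggest_regimen(0.15)` falls in the closed interval [0.1, 0.19] and returns `Suggestion.COMBINATION`. Such a rule would only support, not replace, clinical judgment.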
To further evaluate the predictive performance of the ImmuneCells.Sig signature, we compared the ImmuneCells.Sig with the other 12 ICT response signatures reported previously (
A large-scale single-cell RNA-seq study of tumor samples of melanoma patients treated by ICT6 was re-analyzed to dissect individual cell populations that may correlate with response. Three immune cell clusters had drastically different percentages in ICT responders vs non-responders. The TREM2hi macrophages and Tgd_c21 T cells were markedly higher in the non-responders and could contribute to ICT resistance; in contrast, the B_c22 B cells were higher in the responders and could contribute to ICT anti-tumor response. TREM2hi macrophages, the most enriched immune cell subcluster in the non-responders, displayed a distinct gene expression pattern, with overexpression of key genes of the complement system. Expression of complement effectors and receptors has been associated with cancer progression and poor prognosis33,34. Among all the complement elements that may have the pro-cancer activities, C1q chains, C3-derived fragments, and C5a are likely the most important modulators of tumor progression35,36. In a clear-cell renal cell carcinoma (ccRCC) model, mice deficient in C1q, C4, or C3 displayed decreased tumor growth, whereas tumors infiltrated with high densities of C1q-producing macrophages exhibited an immunosuppressed microenvironment37. The classical complement pathway is a key inflammatory mechanism that is activated by cooperation between tumor cells and tumor-associated macrophages, favoring cancer progression37. Our findings extend this premise; TREM2hi macrophages, which overexpress major elements of the complement system and activation of the complement cascade, are enriched in ICT non-responders and could be the major macrophage subset that contributes to ICT resistance.
Although the role of complement system is not completely understood, other studies described different mechanisms by which complement activation in the tumor microenvironment can enhance tumor growth, such as altering the immune profile of tumor-infiltrating leukocytes, increasing cancer cell proliferation, and suppressing CD8+ TIL function38. More recently, complement effectors such as C1q, C3a, C5a, and others have been associated with inhibition of anti-tumor T-cell responses through the recruitment and/or activation of immunosuppressive cell subpopulations such as MDSCs (myeloid-derived suppressor cells), Tregs, or M2 tumor-associated macrophages (TAMs)39. The rationale of inhibiting the complement system for therapeutic combinations to enhance the anti-tumor efficacy of anti-PD-1/PD-L1 checkpoint inhibitors has been proposed based on the supporting evidence that complement blocks many of the effector routes associated with the cancer-immunity cycle39. Our study results were in line with these findings and suggest that the TREM2hi macrophage population which has an activated complement system could be another source or consequence of complement activation contributing to the blockade of cancer-immunity cycle.
Many M2 polarization genes, some of which are known to be tumor-promoting, were also overexpressed in TREM2hi macrophages. For example, CD276 (B7-H3) plays a role in down-regulating T-cells involved in tumor immunity40,41. High CD276 expression is associated with increased tumor size, lymphovascular invasion, poorly differentiated tumors, and shorter overall patient survival42,43. CD276 expression is also associated with tumor-infiltrating FOXP3+ regulatory T cells which inhibit effector T cells44,45 and is important for immune evasion and tumorigenesis in prostate cancer46. CD276 also inhibits NK cell lysis of tumor cells47. The overexpression of CD276 in TREM2hi macrophages likely has implications for promoting ICT resistance. PD-L2, an important immune co-inhibitory molecule48, was also overexpressed in the TREM2hi macrophages. Increased expression of PD-L2 in tumor-associated macrophages contributes to suppressing anti-tumor immunity in mice treated with anti-PD-L1 monoclonal antibody49. Thus, the high PD-L2 expression in TREM2hi macrophages could facilitate ICT resistance and tumor progression. Some single-cell studies reported that M1 and M2 signatures are positively correlated in myeloid populations50,51. We checked the expression of M1 markers from these studies in the TREM2hi macrophages (
A γδ T-cell subset, Tgd_c21, was present at much higher levels in the non-responders. Despite their role in anti-tumor cytotoxicity, γδ T cells could also promote cancer progression by inhibiting anti-tumor responses and enhancing cancer angiogenesis. Consequently, γδ T cells have a dual effect and are considered both friends and foes of cancer53. The enrichment of the Tgd_c21 cells in the ICT non-responders suggests an association with ICT resistance. The top Tgd_c21 marker genes are oncogenic by nature, including RRM2 (ref. 54), BIRC5 (Survivin; ref. 55), SPC24 (refs. 56, 57), UBE2C (refs. 58, 59), and CDCA5 (ref. 60). Pathway analysis revealed a significant reduction in ligand-receptor binding capacity, IFNα and IFNβ signaling, IFN-γ response, and immunoregulatory interactions of Tgd_c21 cells, suggesting that Tgd_c21 cells may be a type of ‘exhausted’ γδ T cell with impaired anti-tumor immune functions. A previous study showed that the positive outcome of PD-1 blockade in treating leukemia may be because it induces significant upregulation of the potent pro-inflammatory and anti-tumor cytokine IFN-γ in certain types of γδ T cells61. Complementing that study, we showed that the failure of immunotherapy in treating melanoma may be associated with some types of γδ T cells (e.g., Tgd_c21). The pathway analysis showed that this subset of γδ T cells, Tgd_c21, had lower activity of the anti-tumor IFN-γ pathway in the non-responders than in the responders subjected to the immunotherapy (
All B-cell clusters were less abundant in the ICT non-responders. Apart from their role in antibody production, B cells are also an important source of cytokines and chemokines that contribute to anti-tumor immune responses62. Therefore, the decreased B-cell percentages in non-responders could contribute to ICT resistance and/or progression of ICT-resistant tumors. We compared the present B-cell subpopulation signature (B_c22, based on a cutoff P value of 0.05) with another B-cell signature recently published in the context of ICT by Helmink et al.63 and found several genes shared by both signatures, including TCL1A, ITIH5, LAX1, KCNA3, CD79A, AREG, GBP1, ATP8A, and IGLL5. Both signatures characterized B-cell populations that were significantly enriched in the ICT responders versus non-responders. However, the B cells associated with these two signatures were different, because our B_c22 (single-cell cluster 22) signature was developed from the scRNA-seq data of melanoma samples and its corresponding B cells were a subset of B cells that were more highly enriched in the ICT responders than in the non-responders. We also identified three other B-cell subpopulations corresponding to clusters 13, 14, and 17 (
For comparison with ImmuneCells.Sig, we used the gene signature representing the three component cell clusters (TREM2hi macrophages, Tgd_c21 γδ T cells, and B_c22 B cells) identified from the scRNA-seq data (
The decreased percentage of B cells and increased percentage of macrophages/monocytes in ICT non-responding patients had been reported previously6. However, the important subsets of these immune cell populations had not been resolved as they are in this study. Moreover, we identified an ICT outcome gene expression signature, ImmuneCells.Sig, that is enriched for the characteristic genes of the TREM2hi macrophage, Tgd_c21, and B_c22 subpopulations. The ImmuneCells.Sig signature outperformed the other previously reported signatures in predicting the outcome of immune checkpoint therapies across all four independent datasets9-12. Our characterization of these immune cell populations provides opportunities to improve the efficacy of cancer immunotherapy and to better understand the mechanisms of ICT resistance.
Study Design
Single-cell RNA-sequencing data (accession number GEO: GSE120575) of melanoma samples from the initial publication6 were downloaded and re-analyzed for this manuscript. For validation purposes, two other scRNA-seq datasets7,8 of melanoma and BCC were also downloaded, which are accessible through GEO accession numbers GSE115978 and GSE123813. For the development of the ICT outcome signature, we analyzed the transcriptome-level gene expression dataset (GSE78220) of an immune checkpoint therapy (ICT) study9. For the validation of the identified ICT outcome signature, ImmuneCells.Sig, we analyzed three large public gene expression datasets of immunotherapy10-12 (accession numbers, respectively: GSE91061, ENA project PRJEB23709, dbGaP phs000452.v3.p1). The first dataset10 (GSE91061) consisted of pretreatment melanoma samples from 51 patients (25 non-responders and 26 responders). For the second dataset11 (PRJEB23709), the RNA-seq data of the 73 pretreatment tumors were analyzed. Among these 73 samples, 41 are from the melanoma patients subjected to anti-PD-1 therapy and consist of 19 non-responders and 22 responders; 32 are from the melanoma patients subjected to combined anti-PD-1 and anti-CTLA-4 therapy and consist of 8 non-responders and 24 responders. The third dataset (phs000452.v3.p1) is from a large melanoma genome sequencing project (MGSP)12 from which the whole-transcriptome sequencing (RNA-seq) data from 103 pretreatment tumor tissue samples from 103 patients with distinct ICT outcomes (47 responders and 56 non-responders) were available and used for validation in this study.
Single-Cell RNA Sequencing Data Analysis
The data from a previous scRNA-seq study of melanoma checkpoint immunotherapy6 were analyzed. Specifically, we utilized the Seurat v3.0 R package13,14 to perform the fine clustering of the 16,291 single cells. The gene expression data from single cells of both conditions, i.e., regression/responder (R group: n.patients=17; n.cells=5564) and progression/non-responder (NR group: n.patients=31; n.cells=10,727), were aligned and projected in a 2-dimensional space through uniform manifold approximation and projection (UMAP)16 to allow identification of ICT-outcome-associated immune cell populations. Highly variable genes, i.e., genes with relatively high average expression and variability, were detected with Seurat13,14. These genes were used for downstream clustering analysis. Principal component analysis (PCA) was used for dimensionality reduction, and the number of significant principal components was calculated using the built-in JackStraw function. t-distributed stochastic neighbor embedding (t-SNE) and UMAP were used for data visualization in two dimensions.
The built-in FindMarkers function in the Seurat package was used to identify differentially expressed genes. From the results of the Seurat package, genes with adjusted P values<0.05 were considered differentially expressed. Adjusted P values were calculated based on Bonferroni correction using all features in the dataset, following the Seurat manual [https://satijalab.org/seurat/v3.0/de_vignette.html]. Genes retrieved from the Seurat analysis were displayed in heatmaps using scaled gene expression calculated with the Seurat built-in function. Fold change plots were created in R with the ggplot2 package. For the two scRNA-seq datasets7,8 of melanoma and BCC that were used for validation, i.e., the GSE115978 and GSE123813 datasets, the pre-processed gene expression data were downloaded, processed, and analyzed in the same way as for the discovery scRNA-seq dataset, GSE120575.
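The Bonferroni adjustment and the adjusted P value<0.05 cutoff described above can be illustrated with a short sketch. It is not the Seurat implementation; the gene names, raw P values, and feature count below are hypothetical, and the adjustment is assumed to be the standard one (raw P value multiplied by the number of features tested, capped at 1).

```python
def bonferroni_adjust(p_values, n_features):
    """Return Bonferroni-adjusted P values: raw P times the number of features, capped at 1."""
    return {gene: min(1.0, p * n_features) for gene, p in p_values.items()}


def differentially_expressed(p_values, n_features, alpha=0.05):
    """Return the genes whose adjusted P value is below alpha, sorted by name."""
    adjusted = bonferroni_adjust(p_values, n_features)
    return sorted(gene for gene, p_adj in adjusted.items() if p_adj < alpha)


# Hypothetical raw P values for three placeholder genes in a 20,000-feature dataset.
raw_p = {"GENE_A": 1e-8, "GENE_B": 1e-5, "GENE_C": 0.01}
selected = differentially_expressed(raw_p, n_features=20_000)
```

In this toy input only GENE_A passes the cutoff: GENE_B has a small raw P value, but its adjusted value is about 0.2 once the 20,000 features are taken into account.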
RNA-Seq Data and ICT Responsiveness Signature Analysis
The bulk RNA-seq datasets9-11 were processed in the following steps. The downloaded FASTQ files containing the RNA-seq reads were aligned to the hg19 human genome using Bowtie-TopHat (version 2.0.4)65,66. Gene-level read counts were obtained using the htseq-count Python script from HTSeq v0.11.1 [https://htseq.readthedocs.io/en/release_0.11.1/] in the union mode. We further utilized iDEP v0.9267 [http://bioinformatics.sdstate.edu/idep/] to transform the read count data using the regularized log (rlog) transformation method originally implemented in the DESeq2 v1.28.1 package68 [https://bioconductor.org/packages/release/bioc/html/DESeq2.html], as it effectively reduces mean-dependent variance. The transformed data were used for the downstream analysis and are available as detailed in the Data availability statement.
Because three single-cell clusters (TREM2hi macrophages, Tgd_c21, and B_c22) exhibited large quantitative changes between the ICT responders and non-responders, we hypothesized that the tumor expression of the feature genes of these specific immune cell populations may be useful to predict the ICT outcome. To test this hypothesis, we developed an ICT responsiveness signature based on the scRNA-seq dataset and a bulk gene expression dataset, GSE782209, using the cancerclass R package23. To validate this ICT response signature, ImmuneCells.Sig, we analyzed three independent gene expression datasets of melanoma patients10-12 (GSE91061, PRJEB23709, and MGSP datasets) and corroborated the high predictive value of ImmuneCells.Sig. We also compared ImmuneCells.Sig with the 12 other ICT response signatures reported previously (Table 1)9,24-32 across the above four gene expression datasets of melanoma patients. The corresponding R code is available as detailed in the Code availability statement.
Pathway Analyses
Pathway analyses were conducted using several software tools, including IPA software (IPA release June 2020, QIAGEN Inc., [https://www.qiagenbioinformatics.com/products/ingenuitypathway-analysis]), Gene Set Variation Analysis69 (GSVA v1.36.2, [https://bioconductor.org/packages/release/bioc/html/GSVA.html]), and Gene Set Enrichment Analysis22 (GSEA v4.0.0, [https://www.gsea-msigdb.org/gsea/index.jsp]). GSEA was performed on pre-ranked differentially expressed genes using the GseaPreranked option. One thousand permutations were used to calculate significance. A gene set was considered significantly enriched in one of the two groups when the raw P value was <0.05 and the false discovery rate (FDR) was <0.25 for the corresponding gene set. In addition, we utilized an R package called Fast Gene Set Enrichment Analysis (fgsea v1.15.1, [https://github.com/ctlab/fgsea]). The package implements a special algorithm that calculates the empirical enrichment score null distributions simultaneously for all gene set sizes, which allows execution up to several hundred times faster than the original Broad implementation of GSEA. Reactome pathway analyses were performed using Protein ANalysis THrough Evolutionary Relationships (PANTHER v15.0, [http://pantherdb.org/]) with lists of significantly enriched genes in the corresponding clusters as detected by Seurat. The associated settings were: Analysis type: PANTHER Overrepresentation Test, release 20190711; Annotation Version and Release Date: Gene Ontology database, released 2019-07-03 [http://geneontology.org/].
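For illustration only, the running-sum enrichment score that underlies a pre-ranked gene set enrichment analysis, together with the significance rule stated above (raw P value<0.05 and FDR<0.25), may be sketched in Python as follows. This is a minimal sketch of the generally known weighted running-sum statistic and is not the GSEA or fgsea source code; the permutation step that yields the P value and FDR is omitted, and the function names, gene labels `G1` to `G6`, and ranking scores are hypothetical.

```python
def enrichment_score(ranked, gene_set, weight=1.0):
    """Weighted running-sum enrichment score for a pre-ranked gene list.

    ranked is a list of (gene, score) pairs sorted by descending score.
    The running sum rises at genes in gene_set (in proportion to
    |score| ** weight) and falls at all other genes; the score returned
    is the largest deviation of the running sum from zero.
    """
    hits = [abs(s) ** weight if g in gene_set else 0.0 for g, s in ranked]
    total_hit = sum(hits)
    n_miss = sum(1 for g, _ in ranked if g not in gene_set)
    if total_hit == 0 or n_miss == 0:
        raise ValueError("gene set must overlap, but not cover, the ranked list")
    running, best = 0.0, 0.0
    for hit, (gene, _) in zip(hits, ranked):
        if gene in gene_set:
            running += hit / total_hit
        else:
            running -= 1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


def is_significant(raw_p, fdr, p_cutoff=0.05, fdr_cutoff=0.25):
    """Apply the enrichment thresholds stated in the text."""
    return raw_p < p_cutoff and fdr < fdr_cutoff


# Hypothetical ranked list: positive scores at the top, negative at the bottom.
ranked = [("G1", 3.0), ("G2", 2.0), ("G3", 1.0),
          ("G4", -1.0), ("G5", -2.0), ("G6", -3.0)]
es_top = enrichment_score(ranked, {"G1", "G2"})
es_bottom = enrichment_score(ranked, {"G5", "G6"})
print(es_top, es_bottom)
```

A gene set concentrated at the top of the ranking yields a positive score, and one concentrated at the bottom yields a negative score, which is how enrichment in one of the two groups is distinguished.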
Statistical Analysis
The performance of ImmuneCells.Sig as a classifier for ICT outcome was evaluated using receiver operating characteristic (ROC) curves, calculation of the area under the ROC curve (AUC), and estimates of sensitivity and specificity, as implemented in the cancerclass v1.32.0 R package23. This classification protocol starts with a feature selection step and continues with nearest-centroid classification. The binomial confidence intervals for sensitivity and specificity were calculated by the Wilson procedure implemented in the cancerclass R package23. Fisher's exact test was used for categorical variables. All confidence intervals are reported as two-sided binomial 95% confidence intervals. Statistical analysis was performed with R software, version 3.5.3 (R Project for Statistical Computing).
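As a non-limiting illustration of two of the quantities named above, the following Python sketch computes an AUC from classifier scores by pairwise comparison and a two-sided 95% Wilson confidence interval for a binomial proportion such as sensitivity or specificity. It is a sketch under stated assumptions and does not reproduce the cancerclass implementation; the function names, the scores, the labels, and the count of 8 correct calls out of 10 are hypothetical.

```python
from math import sqrt


def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison.

    labels use 1 for responders and 0 for non-responders. The AUC equals
    the fraction of (responder, non-responder) pairs in which the
    responder has the higher score, counting ties as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def wilson_interval(successes, total, z=1.959964):
    """Two-sided Wilson interval for a proportion (z=1.959964 gives 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1.0 + z * z / total
    center = (p + z * z / (2.0 * total)) / denom
    half = z * sqrt(p * (1.0 - p) / total + z * z / (4.0 * total * total)) / denom
    return center - half, center + half


# Hypothetical classifier scores for two responders and two non-responders.
example_auc = auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0])
# Hypothetical sensitivity: 8 of 10 responders correctly classified.
ci_low, ci_high = wilson_interval(8, 10)
print(example_auc, round(ci_low, 3), round(ci_high, 3))
```

Unlike the simple normal-approximation interval, the Wilson interval always stays within 0 and 1, which matters for the small responder and non-responder counts typical of such cohorts.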
This application claims the benefit of priority under 35 U.S.C. § 119(e) of U.S. Provisional Application No. 63/066,079, filed Aug. 14, 2020, the contents of which are incorporated herein by reference in their entirety.
This invention was made with government support under R01 CA134682 awarded by the National Institutes of Health. The Government has certain rights in the invention.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2021/045191 | 8/9/2021 | WO | |
Number | Date | Country |
---|---|---|
63066079 | Aug 2020 | US |