Throughout this disclosure, reference is made to technical and patent literature. In some aspects, the literature is referenced by an Arabic numeral, the complete bibliographic citation for which is found immediately preceding the claims. These disclosures are provided to describe the state of the art to which this disclosure pertains and are incorporated herein by reference.
The genomes of cancer cells harbor somatic mutations imprinted by the activities of different mutational processes1,2. Most single-base substitutions and small insertions and deletions (indels) are independently scattered across the genomic landscape; however, a subset of substitutions and indels tend to cluster together3,4. This clustering has been attributed to a combination of heterogeneous mutation rates across the genomic landscape, biophysical characteristics of exogenous carcinogens, dysregulation of endogenous processes, and the occurrence of larger events associated with genome instability, amongst others4-16. Prior analyses of clustered mutations have focused on single-base substitutions and revealed several classes of clustered events, including doublet- and multi-base substitutions1,9,16-19, diffuse hypermutation termed omikli15, and longer events termed kataegis12,14,16,20. The majority of kataegic events were found to be strand-coordinated, which is usually defined as sharing the same strand and reference allele2,16. Previous studies have also revealed 9 clustered mutational signatures4 of different mutational processes as well as clustered driver substitutions due to APOBEC3-associated mutagenesis15 or due to carcinogen-triggered POLH mutagenesis4. To the best of Applicant's knowledge, neither an analysis of clustered indels nor a comprehensive exploration of clustered driver mutations has previously been performed.
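The working definition of strand coordination given above (mutations sharing the same strand and reference allele) can be illustrated with a minimal sketch. The function name and the input layout, a list of (reference allele, alternate allele) pairs all reported on the reference strand, are assumptions made for illustration and are not taken from the cited studies.

```python
def is_strand_coordinated(mutations):
    """Return True if every substitution in a cluster shares one reference allele.

    `mutations` is a hypothetical list of (ref, alt) tuples, all reported on
    the reference strand, so an identical ref allele also implies that the
    mutated bases lie on the same strand.
    """
    # Assumption: a single mutation cannot be "coordinated" with anything.
    if len(mutations) < 2:
        return False
    reference_alleles = {ref.upper() for ref, _alt in mutations}
    return len(reference_alleles) == 1
```

Under this reading, a cluster of C>T and C>G substitutions counts as strand-coordinated, whereas a cluster mixing C>T and G>A does not.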
Doublet-base substitutions have been extensively examined, revealing multiple endogenous and exogenous mutational processes that can cause these events, including failure of DNA repair pathways and exposure to environmental mutagens1,2,16,17. In contrast, multi-base substitutions have not been comprehensively explored, presumably due to their small numbers in most cancer genomes. Moreover, only a handful of reported processes have been associated with omikli and kataegic events, with the majority of these processes attributed to the AID/APOBEC3 family of deaminases4,5,12,14-16,21-24. For example, in B-cell lymphomas, clustered tracks of C>T and C>G mutations at WRCY motifs are the result of direct replication over AID lesions21. Alternatively, AID-induced lesions can be processed by the mismatch repair pathway that recruits the error-prone DNA polymerase η, resulting in non-canonical AID mutations21. In addition to AID, the APOBEC3 enzymes, which are typically responsible for anti-viral responses and for limiting the mobility of mobile elements25-31, are a substantial contributor to clustered mutational events2,4,12,14-16,24,32. Specifically, the APOBEC3 enzymes require single-stranded DNA as a substrate, and their activity on such DNA gives rise to omikli and kataegis14,15,24,32. Omikli were found enriched in early replicating regions and more prevalent in microsatellite stable tumors, indicating a role of mismatch repair in exposing short single-stranded DNA regions while processing mismatched bases during replication15. Further, the differential activity of mismatch repair towards gene-rich regions results in an increased mutational burden of omikli mutations within cancer driver genes15. Kataegis is less prevalent than omikli as it likely depends on longer tracks of single-stranded DNA12-14. Such tracks are typically available during repair of double-strand breaks, and the majority of kataegis has been observed within 10 kb of detected breakpoints11.
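The reported proximity of kataegis to breakpoints (within 10 kb) amounts to a nearest-neighbor distance computation. The following is a minimal sketch under stated assumptions: the function names, the use of a single chromosome with a sorted list of breakpoint positions, and the inclusive 10,000 bp cutoff are illustrative choices, not the procedure of the cited studies.

```python
from bisect import bisect_left


def distance_to_nearest_breakpoint(position, breakpoints):
    """Distance in bp from `position` to the closest breakpoint.

    `breakpoints` is assumed to be a sorted list of positions on the same
    chromosome as `position`. Returns None if the list is empty.
    """
    if not breakpoints:
        return None
    i = bisect_left(breakpoints, position)
    candidates = []
    if i < len(breakpoints):
        candidates.append(breakpoints[i] - position)      # nearest at or to the right
    if i > 0:
        candidates.append(position - breakpoints[i - 1])  # nearest to the left
    return min(candidates)


def fraction_near_breakpoints(event_positions, breakpoints, max_distance=10_000):
    """Fraction of events lying within `max_distance` bp of any breakpoint."""
    if not event_positions:
        return 0.0
    near = 0
    for pos in event_positions:
        d = distance_to_nearest_breakpoint(pos, breakpoints)
        if d is not None and d <= max_distance:
            near += 1
    return near / len(event_positions)
```

A real analysis would need to handle multiple chromosomes and decide how to represent an event that spans several mutations; this sketch treats each event as a single coordinate.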
Amplification of known cancer genes due to double-strand breaks and complex rearrangements is known to drive tumorigenesis in many cancer types33. Recent studies have elucidated high copy number states of circular extrachromosomal DNA (ecDNA), which often harbor known cancer genes and have been found in most human cancers33-36. The circular nature of ecDNAs and their rapid replication patterns mimic double-stranded DNA viral pathogens, indicating a potential substrate for APOBEC3 mutagenesis, which may ultimately contribute to the subclonal diversification of tumors harboring ecDNA through accelerated diversification of the extrachromosomal oncoproteins.
Described herein is a comprehensive examination of clustered substitutions and clustered indels across 2,583 cancer genomes spanning 30 different tumor types. The results elucidate a multitude of mutational processes giving rise to clustered mutations, including clustered driver mutations that are associated with differential gene expression and changes in overall survival, and reveal recurrent APOBEC3 mutagenesis, termed kyklonas, fueling the evolution of ecDNA.
Applications of these discoveries are further provided herein. In one aspect, a method of inhibiting the growth of a cancer cell or treating a cancer in a subject in need thereof is disclosed, wherein the subject has a clustered mutation in one or more of TP53, EGFR, KIT, KMT2C, ELF3, APC and ARID1A or lacks a clustered mutation in a BRAF gene in a sample isolated from the subject. The method comprises, consists of, or consists essentially of administering an aggressive therapy to the subject, thereby inhibiting the growth of the cancer cell or treating the cancer in the subject. If the subject does not have a clustered mutation in one or more of TP53, EGFR, KIT, KMT2C, ELF3, APC and ARID1A or has a clustered mutation in a BRAF gene, a less aggressive therapy can be administered to the subject.
In yet another aspect, a method for selecting a cancer patient for an aggressive therapy is disclosed. The method comprises, consists of, or consists essentially of assaying for and/or detecting at least one clustered mutation in a gene selected from TP53, EGFR, KIT, KMT2C, ELF3, APC and ARID1A and/or no clustered mutation in a BRAF gene in a sample isolated from the cancer patient, wherein the cancer patient is selected for the therapy if one or more clustered mutations are found in TP53, EGFR, KIT, KMT2C, ELF3, APC and/or ARID1A and/or no BRAF gene clustered mutation is detected in the sample isolated from the cancer patient.
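The selection criterion just described can be written as a simple predicate. The sketch below is illustrative only and rests on two assumptions that are not stated in the disclosure: that the "and/or" is read as an inclusive "or", and that the assay result is available as a set of gene symbols in which a clustered mutation was detected. The function and constant names are hypothetical.

```python
# Genes in which a clustered mutation counts toward selection (from the text).
RISK_GENES = frozenset({"TP53", "EGFR", "KIT", "KMT2C", "ELF3", "APC", "ARID1A"})


def select_for_aggressive_therapy(genes_with_clustered_mutations):
    """Return True if the patient meets the selection criterion.

    Assumed reading: the patient is selected if a clustered mutation is
    detected in at least one listed gene, OR if no clustered mutation is
    detected in BRAF (inclusive "or").
    """
    genes = {g.upper() for g in genes_with_clustered_mutations}
    has_risk_gene_cluster = bool(genes & RISK_GENES)
    lacks_braf_cluster = "BRAF" not in genes
    return has_risk_gene_cluster or lacks_braf_cluster
```

Note that under this inclusive reading a sample with no clustered mutations at all satisfies the criterion, because it lacks a BRAF clustered mutation; a stricter "and" reading would change that outcome.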
Cancer patients determined to have mutations associated with better predicted outcomes, such as longer overall survival, can in one aspect receive therapy, but a less aggressive therapy can be selected for the initial or subsequent therapy.
In yet another aspect, a method for identifying whether a cancer patient is likely to experience a relatively longer or shorter overall survival is disclosed. The method comprises, consists of, or consists essentially of assaying for and/or detecting at least one clustered mutation in a gene selected from TP53, EGFR, KIT, KMT2C, ELF3, APC and ARID1A or a BRAF gene in a sample isolated from the patient, wherein the patient is likely to experience longer overall survival if a clustered mutation is detected in BRAF or a clustered mutation is not detected in a gene selected from TP53, EGFR, KIT, KMT2C, ELF3, APC and ARID1A, and the patient is likely to experience shorter overall survival if a clustered mutation is detected in a gene selected from TP53, EGFR, KIT, KMT2C, ELF3, APC and ARID1A, or is not detected in the BRAF gene.
It is to be understood that the present disclosure is not limited to particular aspects described, as such may, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular aspects only, and is not intended to be limiting, since the scope of the present disclosure will be limited only by the appended claims.
Unless defined otherwise, all technical and scientific terms used herein have the same meanings as commonly understood by one of ordinary skill in the art to which this technology belongs. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present technology, the preferred methods, devices and materials are now described. All technical and patent publications cited herein are incorporated herein by reference in their entirety. Nothing herein is to be construed as an admission that the present technology is not entitled to antedate such disclosure by virtue of prior invention.
The practice of the present technology will employ, unless otherwise indicated, conventional techniques of tissue culture, immunology, molecular biology, microbiology, cell biology, and recombinant DNA, which are within the skill of the art. See, e.g., Sambrook and Russell eds. (2001) Molecular Cloning: A Laboratory Manual, 3rd edition; the series Ausubel et al. eds. (2007) Current Protocols in Molecular Biology; the series Methods in Enzymology (Academic Press, Inc., N.Y.); MacPherson et al. (1991) PCR 1: A Practical Approach (IRL Press at Oxford University Press); MacPherson et al. (1995) PCR 2: A Practical Approach; Harlow and Lane eds. (1999) Antibodies, A Laboratory Manual; Freshney (2005) Culture of Animal Cells: A Manual of Basic Technique, 5th edition; Gait ed. (1984) Oligonucleotide Synthesis; U.S. Pat. No. 4,683,195; Hames and Higgins eds. (1984) Nucleic Acid Hybridization; Anderson (1999) Nucleic Acid Hybridization; Hames and Higgins eds. (1984) Transcription and Translation; Immobilized Cells and Enzymes (IRL Press (1986)); Perbal (1984) A Practical Guide to Molecular Cloning; Miller and Calos eds. (1987) Gene Transfer Vectors for Mammalian Cells (Cold Spring Harbor Laboratory); Makrides ed. (2003) Gene Transfer and Expression in Mammalian Cells; Mayer and Walker eds. (1987) Immunochemical Methods in Cell and Molecular Biology (Academic Press, London); and Herzenberg et al. eds (1996) Weir's Handbook of Experimental Immunology.
All numerical designations, e.g., pH, temperature, time, concentration, and molecular weight, including ranges, are approximations which are varied (+) or (−) by increments of 1.0 or 0.1, as appropriate, or alternatively by a variation of +/−15%, or alternatively 10%, or alternatively 5%, or alternatively 2%. It is to be understood, although not always explicitly stated, that all numerical designations are preceded by the term “about”. It also is to be understood, although not always explicitly stated, that the reagents described herein are merely exemplary and that equivalents of such are known in the art.
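As an illustration of the percentage-based reading of “about” set out above, the following minimal sketch computes the interval implied by a stated value and a chosen tolerance. The function name and the default of 15% are assumptions made for illustration; the alternative tolerances named in the text (10%, 5%, 2%) can be passed explicitly.

```python
def about_range(value, tolerance_percent=15.0):
    """Return (low, high) bounds for a value read as "about" that value.

    The bounds are value -/+ tolerance_percent of its magnitude.
    """
    delta = abs(value) * tolerance_percent / 100.0
    return (value - delta, value + delta)
```

For example, `about_range(100.0)` yields `(85.0, 115.0)`, and `about_range(200.0, 5.0)` yields `(190.0, 210.0)`.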
It is to be inferred without explicit recitation and unless otherwise intended, that when the present technology relates to a polypeptide, protein, polynucleotide or antibody, an equivalent or a biologically equivalent of such is intended within the scope of the present technology.
As used in the specification and claims, the singular forms “a”, “an”, and “the” include plural references unless the context clearly dictates otherwise. For example, the term “a cell” includes a plurality of cells, including mixtures thereof.
As used herein, the term “comprising” is intended to mean that the compositions and methods include the recited elements, but do not exclude others. “Consisting essentially of” when used to define compositions and methods, shall mean excluding other elements of any essential significance to the combination for the intended use. For example, a composition consisting essentially of the elements as defined herein would not exclude trace contaminants from the isolation and purification method and pharmaceutically acceptable carriers, such as phosphate buffered saline, preservatives and the like. “Consisting of” shall mean excluding more than trace elements of other ingredients and substantial method steps for administering the compositions disclosed herein. Aspects defined by each of these transition terms are within the scope of the present disclosure.
As used herein, the term “animal” refers to living multi-cellular vertebrate organisms, a category that includes, for example, mammals and birds. The term “mammal” includes both human and non-human mammals.
In one aspect, the term “equivalent” or “biological equivalent” of an antibody means the ability of the antibody to selectively bind its epitope protein or fragment thereof as measured by ELISA or other suitable methods. Biologically equivalent antibodies include, but are not limited to, those antibodies, peptides, antibody fragments, antibody variants, antibody derivatives and antibody mimetics that bind to the same epitope as the reference antibody.
In one aspect, the term “equivalent” or “chemical equivalent” of a chemical means the ability of the chemical to selectively interact with its target protein, DNA, RNA or fragment thereof as measured by the inactivation of the target protein, incorporation of the chemical into the DNA or RNA or other suitable methods. Chemical equivalents include, but are not limited to, those agents with the same or similar biological activity and include, without limitation, a pharmaceutically acceptable salt or mixtures thereof that interact with and/or inactivate the same target protein, DNA, or RNA as the reference chemical.
The term “allele,” which is used interchangeably herein with “allelic variant,” refers to alternative forms of a gene or portions thereof. Alleles occupy the same locus or position on homologous chromosomes. When a subject has two identical alleles of a gene, the subject is said to be homozygous for the gene or allele. When a subject has two different alleles of a gene, the subject is said to be heterozygous for the gene. Alleles of a specific gene can differ from each other in a single nucleotide, or several nucleotides, and can include substitutions, deletions and insertions of nucleotides. An allele of a gene can also be a form of a gene containing a mutation.
The term “genetic marker” refers to an allelic variant of a polymorphic region of a gene of interest and/or the expression level of a gene of interest.
The term “polymorphism” refers to the coexistence of more than one form of a gene or portion thereof. A portion of a gene of which there are at least two different forms, i.e., two different nucleotide sequences, is referred to as a “polymorphic region of a gene.” A polymorphic region can be a single nucleotide, the identity of which differs in different alleles.
The term “genotype” refers to the specific allelic composition of an entire cell or a certain gene and in some aspects a specific polymorphism associated with that gene, whereas the term “phenotype” refers to the detectable outward manifestations of a specific genotype.
The term “isolated” as used herein refers to molecules or biological or cellular materials being substantially free from other materials. In one aspect, the term “isolated” refers to nucleic acid, such as DNA or RNA, or protein or polypeptide, or cell or cellular organelle, or tissue or organ, separated from other DNAs or RNAs, or proteins or polypeptides, or cells or cellular organelles, or tissues or organs, respectively, that are present in the natural source. The term “isolated” also refers to a nucleic acid or peptide that is substantially free of cellular material, viral material, or culture medium when produced by recombinant DNA techniques, or chemical precursors or other chemicals when chemically synthesized. Moreover, an “isolated nucleic acid” is meant to include nucleic acid fragments which are not naturally occurring as fragments and would not be found in the natural state. The term “isolated” is also used herein to refer to polypeptides which are isolated from other cellular proteins and is meant to encompass both purified and recombinant polypeptides. The term “isolated” is also used herein to refer to cells or tissues that are isolated from other cells or tissues and is meant to encompass both cultured and engineered cells or tissues.
As used herein, “treating” or “treatment” of a disease in a subject refers to (1) preventing the symptoms or disease from occurring in a subject that is predisposed or does not yet display symptoms of the disease; (2) inhibiting the disease or arresting its development; or (3) ameliorating or causing regression of the disease or the symptoms of the disease. As understood in the art, “treatment” is an approach for obtaining beneficial or desired results, including clinical results. For the purposes of this technology, beneficial or desired results can include, but are not limited to, one or more of: alleviation or amelioration of one or more symptoms, diminishment of extent of a condition (including a disease), stabilized (i.e., not worsening) state of a condition (including disease), delay or slowing of progression of a condition (including disease), amelioration or palliation of the state of a condition (including disease), and remission (whether partial or total), whether detectable or undetectable. In one aspect, treatment excludes prophylaxis.
As used herein, “aggressive therapy” or “aggressive chemotherapy” may refer to any one or a combination of cancer therapies, including but not limited to any form of chemical drug therapy meant to destroy rapidly growing/proliferating cancer cells within the body. “Aggressive chemotherapy” refers to any therapy that may extend beyond the first line of treatment or the standard therapeutic regimen for any particular cancer or tumor. “Aggressive chemotherapy” may include, but is not limited to, adoptive cell therapy, immune checkpoint blockade (including blockade of PD-1, PD-L1, and CTLA-4), pretargeted radioimmunotherapy, oncolytic viral therapy, or cancer vaccines.
When the disease is cancer, the following clinical endpoints are non-limiting examples of treatment: (1) elimination of a cancer in a subject or in a tissue/organ of the subject or in a cancer locus; (2) reduction in tumor burden (such as number of cancer cells, number of cancer foci, number of cancer cells in a focus, size of a solid cancer, concentration of a liquid cancer in the body fluid, and/or amount of cancer in the body); (3) stabilizing or delay or slowing or inhibition of cancer growth and/or development, including but not limited to, cancer cell growth and/or division, size growth of a solid tumor or a cancer locus, cancer progression, and/or metastasis (such as time to form a new metastasis, number of total metastases, size of a metastasis, as well as variety of the tissues/organs to house metastatic cells); (4) less risk of having a cancer growth and/or development; (5) inducing an immune response of the patient to the cancer, such as a higher number of tumor-infiltrating immune cells, a higher number of activated immune cells, a higher number of cancer cells expressing an immunotherapy target, or a higher level of expression of an immunotherapy target in a cancer cell; (6) higher probability of survival and/or increased duration of survival, such as increased overall survival (OS, which may be shown as 1-year, 2-year, 5-year, 10-year, or 20-year survival rate), increased progression free survival (PFS), increased disease free survival (DFS), increased time to tumor recurrence (TTR) and increased time to tumor progression (TTP). In some embodiments, the subject after treatment experiences one or more endpoints selected from tumor response, reduction in tumor size, reduction in tumor burden, increase in overall survival, increase in progression free survival, inhibiting metastasis, improvement of quality of life, minimization of drug-related toxicity, and avoidance of side-effects (e.g., decreased treatment emergent adverse events).
In some embodiments, improvement of quality of life includes resolution or improvement of cancer-specific symptoms, such as but not limited to fatigue, pain, nausea/vomiting, lack of appetite, and constipation; improvement or maintenance of psychological well-being (e.g., degree of irritability, depression, memory loss, tension, and anxiety); improvement or maintenance of social well-being (e.g., decreased requirement for assistance with eating, dressing, or using the restroom; improvement or maintenance of ability to perform normal leisure activities, hobbies, or social activities; improvement or maintenance of relationships with family). In some embodiments, improved patient quality of life is measured qualitatively through patient narratives or quantitatively using validated quality of life tools known to those skilled in the art, or a combination thereof. Additional non-limiting examples of endpoints include reduced hospital admissions, reduced drug use to treat side effects, longer periods off-treatment, and earlier return to work or caring responsibilities. In one aspect, prevention or prophylaxis is excluded from treatment.
As used herein, immune cells are cells of the immune system, including but not limited to lymphocytes (such as T-cells, B-cells, natural killer (NK) cells, and natural killer T (NKT) cells) and myeloid-derived cells (such as granulocytes (basophils, eosinophils, neutrophils, mast cells), monocytes, macrophages, and dendritic cells (DC)). T cells are divided into two broad categories: CD8+ T cells or CD4+ T cells, based on which protein is present on the cell's surface. CD8+ T cells also are called cytotoxic T cells or cytotoxic lymphocytes (CTLs). The four major CD4+ T-cell subsets are TH1, TH2, TH17, and Treg, with “TH” referring to “T helper cell.” T cells may also refer to gamma delta T cells. Dendritic cells (DC) are an important type of antigen-presenting cell (APC), and they also can develop from monocytes. In some embodiments, the immune cells refer to a killer cell, including but not limited to: a cytotoxic T cell, a gamma delta T cell, a NK cell and a NK-T cell. In one embodiment, the immune cell is a CD45+ cell.
The terms “subject,” “host,” “individual,” and “patient” are used interchangeably herein to refer to animals, typically mammalian animals. Any suitable mammal can be treated by a method described herein. Non-limiting examples of mammals include humans, non-human primates (e.g., apes, gibbons, chimpanzees, orangutans, monkeys, macaques, and the like), domestic animals (e.g., dogs and cats), farm animals (e.g., horses, cows, goats, sheep, pigs) and experimental animals (e.g., mouse, rat, rabbit, guinea pig). In some embodiments, a mammal is a human. A mammal can be any age or at any stage of development (e.g., an adult, teen, child, infant, or a mammal in utero). A mammal can be male or female. In some embodiments, a subject is a human. In some embodiments, a subject has or is diagnosed as having or is suspected of having a cancer. The subject can be male or female.
In certain embodiments, the terms “disease,” “disorder,” and “condition” are used interchangeably herein, referring to a cancer, a status of being diagnosed with a cancer, or a status of being suspected of having a cancer. “Cancer”, which is also referred to herein as “tumor”, is known medically as an uncontrolled division of abnormal cells in a part of the body, benign or malignant. In one embodiment, cancer refers to a malignant neoplasm, a broad group of diseases involving unregulated cell division and growth, and invasion to nearby parts of the body. Non-limiting examples of cancers include carcinomas, sarcomas, leukemia and lymphoma, e.g., colon cancer, colorectal cancer, rectal cancer, gastric cancer, esophageal cancer, head and neck cancer, breast cancer, brain cancer, lung cancer, stomach cancer, liver cancer, gall bladder cancer, or pancreatic cancer. In one embodiment, the term “cancer” refers to a solid tumor, which is an abnormal mass of tissue that usually does not contain cysts or liquid areas, including but not limited to, sarcomas, carcinomas, and certain lymphomas (such as Non-Hodgkin's lymphoma). In another embodiment, the term “cancer” refers to a liquid cancer, which is a cancer presenting in body fluids (such as the blood and bone marrow), for example, leukemias (cancers of the blood) and certain lymphomas.
Additionally or alternatively, a cancer may refer to a local cancer (which is an invasive malignant cancer confined entirely to the organ or tissue where the cancer began), a metastatic cancer (referring to a cancer that spreads from its site of origin to another part of the body), a non-metastatic cancer, a primary cancer (a term used to describe an initial cancer a subject experiences), a secondary cancer (referring to a metastasis from a primary cancer or a second cancer unrelated to the original cancer), an advanced cancer, an unresectable cancer, or a recurrent cancer. As used herein, an advanced cancer refers to a cancer that has progressed after receiving one or more of: the first line therapy, the second line therapy, or the third line therapy.
The term “chemotherapy” encompasses cancer therapies that employ chemical or biological agents, e.g., a small molecule drug or a large molecule, such as antibodies, immunotherapies, RNAi and gene therapies, or other therapies, such as radiation therapies. Non-limiting examples of chemotherapies are provided below. It should be understood, although not always explicitly stated, that when a particular therapy is noted, the scope of the disclosure includes equivalents unless excluded.
The term “contacting” means direct or indirect binding or interaction between two or more entities. A particular example of direct interaction is binding. A particular example of an indirect interaction is where one entity acts upon an intermediary molecule, which in turn acts upon the second referenced entity. Contacting as used herein includes in solution, in solid phase, in vitro, ex vivo, in a cell and in vivo. Contacting in vivo can be referred to as administering, or administration.
As used herein, the terms “administration” and “administering” are used to mean introducing an agent into a subject. Routes of administration include, but are not limited to, oral (such as a tablet, capsule or suspension), topical, transdermal, intranasal, vaginal, rectal, subcutaneous, intravenous, intraarterial, intramuscular, intraosseous, intraperitoneal, intraocular, subconjunctival, sub-Tenon's, intravitreal, retrobulbar, intracameral, intratumoral, epidural and intrathecal.
An “immunotherapy agent” means a type of cancer treatment which uses a patient's own immune system to fight cancer, including but not limited to a physical intervention, a chemical substance, a biological molecule or particle, a cell, a tissue or organ, or any combinations thereof, enhancing or activating or initiating a patient's immune response against cancer. Non-limiting examples of immunotherapy agents include antibodies, immune regulators, checkpoint inhibitors, an antisense oligonucleotide (ASO), an RNA interference (RNAi), a Clustered Regularly Interspaced Short Palindromic Repeat (CRISPR) system, a viral vector, an anti-cancer cell therapy (e.g., transplanting an anti-cancer immune cell optionally amplified and/or activated in vivo, or administering an immune cell expressing a chimeric antigen receptor (CAR)), a CAR therapy, and cancer vaccines. As used herein, unless otherwise specified, an immunotherapy agent is not an inhibitor of thymidylate biosynthesis, or an anthracycline or other topoisomerase II inhibitor. As used herein, immune checkpoint refers to a regulator and/or modulator of the immune system (such as an immune response, an anti-tumor immune response, a nascent anti-tumor immune response, an anti-tumor immune cell response, an anti-tumor T cell response, and/or an antigen recognition of T cell receptor in the process of immune response). Immune checkpoint interactions activate either inhibitory or activating immune signaling pathways. Thus a checkpoint may provide one of two signals: a stimulatory immune checkpoint stimulates an immune response, and an inhibitory immune checkpoint inhibits an immune response. In some embodiments, the immune checkpoint is crucial for self-tolerance, which prevents the immune system from attacking cells indiscriminately. However, some cancers can protect themselves from attack by stimulating immune checkpoint targets.
In some embodiments, the immune checkpoints are present on T cells, antigen-presenting cells (APCs) and/or tumor cells.
One target of an immunotherapy agent is a tumor-specific antigen, where the immunotherapy directs or enhances the immune system to recognize and attack tumor cells. Non-limiting examples of such an agent include a cancer vaccine presenting a tumor-specific antigen to the patient's immune system, a monoclonal antibody or an antibody-drug conjugate specifically binding to a tumor-specific antigen, a bispecific antibody specifically binding to a tumor-specific antigen and an immune cell (such as a T-cell engager or a NK-cell engager), an immune cell (such as a killer cell) specifically binding to a tumor-specific antigen (such as a CAR-T cell, a CAR-NK cell, and a CAR-NKT cell), a polynucleotide (or a vector comprising the same) transfecting/transducing an immune cell to express a tumor-specific antibody or an antigen binding fragment thereof (such as a CAR), or a polynucleotide (or a vector comprising the same) transfecting/transducing a cancer cell to express an antigen or a marker which can be recognized by an immune cell.
Another exemplified target is an inhibitory immune checkpoint which suppresses the nascent anti-tumor immune response, such as A2AR, B7-H3, B7-H4, BTLA, CTLA-4, CTLA-4/B7-1/B7-2, IDO, KIR, LAG3, NOX2, PD-1, PD-L1, TIM-3, VISTA, SIGLEC7 (Sialic acid-binding immunoglobulin-type lectin 7, also designated as CD328) and SIGLEC9 (Sialic acid-binding immunoglobulin-type lectin 9, also designated as CD329). Non-limiting examples of such an agent include an antagonist or inhibitor of an inhibitory immune checkpoint, an agent reducing the expression and/or activity of an inhibitory immune checkpoint (such as via an antisense oligonucleotide (ASO), an RNA interference (RNAi), or a Clustered Regularly Interspaced Short Palindromic Repeat (CRISPR) system), an antibody or an antibody-drug conjugate or a ligand specifically binding to and reducing (or inhibiting) the activity of an inhibitory immune checkpoint, an immune cell with a reduced (or inhibited) inhibitory immune checkpoint (and optionally specifically binding to a tumor-specific antigen, such as a CAR-T cell, a CAR-NK cell, and a CAR-NKT cell), and a polynucleotide (or a vector comprising the same) transfecting/transducing an immune cell or a cancer cell to reduce or inhibit an inhibitory immune checkpoint thereof. Reducing expression or activity of such an inhibitory immune checkpoint enhances the immune response of a patient to a cancer.
A further possible immunotherapy target is a stimulatory checkpoint molecule (including but not limited to 4-1BB, CD27, CD28, CD40, CD122, CD137, OX40, GITR and ICOS), wherein the immunotherapy agent activates or enhances the anti-tumor immune response. Non-limiting examples of such an agent include an agonist of a stimulatory checkpoint, an agent increasing the expression and/or activity of a stimulating immune checkpoint, an antibody or an antibody-drug conjugate or a ligand specifically binding to and activating or enhancing the activity of a stimulating immune checkpoint, an immune cell with increased expression and/or activity of a stimulating immune checkpoint (and optionally specifically binding to a tumor-specific antigen, such as a CAR-T cell, a CAR-NK cell, and a CAR-NKT cell), and a polynucleotide (or a vector comprising the same) transfecting/transducing an immune cell or a cancer cell to express a stimulating immune checkpoint thereof.
Additional or alternative targets may be utilized by an immunotherapy agent, such as an immune regulating agent, including but not limited to, an agent activating an immune cell, an agent recruiting an immune cell to a cancer or a cancer cell, or an agent increasing immune cell infiltration into a solid tumor and/or a cancer locus. A non-limiting example of such an agent is an immune regulator or a variant, a mutant, a fragment, or an equivalent thereof.
In some embodiments, an immunotherapy agent utilizes one or more targets, such as a bispecific T cell engager, a bispecific NK cell engager, or a CAR cell therapy. In some embodiments, the immunotherapy agent targets one or more immune regulatory or effector cells.
As used herein, the term “antibody” collectively refers to immunoglobulins or immunoglobulin-like molecules including by way of example and without limitation, IgA, IgD, IgE, IgG and IgM, combinations thereof, and similar molecules produced during an immune response in any vertebrate, for example, in mammals such as humans, goats, rabbits, rat, canine, donkey, mice, camelids (such as dromedaries, llamas, and alpacas), as well as non-mammalian species, such as shark immunoglobulins. Unless specifically noted otherwise, the term “antibody” includes intact immunoglobulins and “antibody fragments” or “antigen binding fragments” that specifically bind to a molecule of interest (or a group of highly similar molecules of interest) to the substantial exclusion of binding to other molecules (for example, antibodies and antibody fragments that have a binding constant for the molecule of interest that is at least 10³ M⁻¹ greater, at least 10⁴ M⁻¹ greater or at least 10⁵ M⁻¹ greater than a binding constant for other molecules in a biological sample). The term “antibody” also includes genetically engineered forms such as chimeric antibodies (for example, murine or humanized non-primate antibodies), heteroconjugate antibodies (such as, bispecific antibodies). See also, Pierce Catalog and Handbook, 1994-1995 (Pierce Chemical Co., Rockford, Ill.); Owen et al., Kuby Immunology, 7th Ed., W.H. Freeman & Co., 2013; Murphy, Janeway's Immunobiology, 8th Ed., Garland Science, 2014; Male et al., Immunology (Roitt), 8th Ed., Saunders, 2012; Parham, The Immune System, 4th Ed., Garland Science, 2014. The term “antibody” includes any protein or peptide containing molecule that comprises at least a portion of an immunoglobulin molecule, such as the whole antibody and any antigen binding fragment or a single chain thereof.
The terms “antibody,” “antibodies” and “immunoglobulin” also include immunoglobulins of any isotype, fragments of antibodies which retain specific binding to antigen, including, but not limited to, Fab, Fab′, F(ab)2, Fv, scFv, dsFv, Fd fragments, dAb, VH, VL, VhH, and V-NAR domains; minibodies, diabodies, triabodies, tetrabodies and kappa bodies; multispecific antibody fragments formed from antibody fragments and one or more isolated. Examples of such include, but are not limited to a complementarity determining region (CDR) of a heavy or light chain or a ligand binding portion thereof, a heavy chain or light chain variable region, a heavy chain or light chain constant region, a framework (FR) region, or any portion thereof, at least one portion of a binding protein, chimeric antibodies, humanized antibodies, single-chain antibodies, and fusion proteins comprising an antigen-binding portion of an antibody and a non-antibody protein. The variable regions of the heavy and light chains of the immunoglobulin molecule contain a binding domain that interacts with an antigen. The constant regions of the antibodies (Abs) may mediate the binding of the immunoglobulin to host tissues. The antibodies can be polyclonal, monoclonal, multispecific (e.g., bispecific antibodies), and antibody fragments, so long as they exhibit the desired biological activity.
As used herein, the term “monoclonal antibody” refers to an antibody produced by a single clone of B-lymphocytes or by a cell into which the light and heavy chain genes of a single antibody have been transfected. Monoclonal antibodies are produced by methods known to those of skill in the art, for instance by making hybrid antibody-forming cells from a fusion of myeloma cells with immune spleen cells. Monoclonal antibodies include humanized monoclonal antibodies.
In some embodiments, the antibody is a bispecific immune cell engager, referring to a bispecific monoclonal antibody that is capable of recognizing and specifically binding to a tumor antigen (such as CD19, EpCAM, MCSP, HER2, EGFR or CS-1) and an immune cell, and directing an immune cell to cancer cells, thereby treating a cancer. Non-limiting examples of such antibody include bispecific T cell engager, bispecific cytotoxic T lymphocytes (CTL) engager, and bispecific NK cell engager. In one embodiment, the engager is a fusion protein consisting of two single-chain variable fragments (scFvs) of different antibodies. Additionally or alternatively, the immune cell is a killer cell, including but not limited to: a cytotoxic T cell, a gamma delta T cell, a NK cell and a NK-T cell.
As used herein, the term “antigen binding domain” refers to any protein or polypeptide domain that can specifically bind to an antigen target.
The term “chimeric antigen receptor” (CAR), as used herein, refers to a fused protein comprising an extracellular domain capable of binding to an antigen, a transmembrane domain derived from a polypeptide different from a polypeptide from which the extracellular domain is derived, and at least one intracellular domain. The “chimeric antigen receptor (CAR)” is sometimes called a “chimeric receptor”, a “T-body”, or a “chimeric immune receptor (CIR).” The “extracellular domain capable of binding to an antigen” means any oligopeptide or polypeptide that can bind to a certain antigen. The “intracellular domain” or “intracellular signaling domain” means any oligopeptide or polypeptide known to function as a domain that transmits a signal to cause activation or inhibition of a biological process in a cell. In certain embodiments, the intracellular domain may comprise, alternatively consist essentially of, or yet further comprise one or more costimulatory signaling domains in addition to the primary signaling domain. The “transmembrane domain” means any oligopeptide or polypeptide known to span the cell membrane and that can function to link the extracellular and signaling domains. A chimeric antigen receptor may optionally comprise a “hinge domain” which serves as a linker between the extracellular and transmembrane domains.
As used herein, a CAR therapy may refer to administering an immune cell expressing a CAR to a subject as well as contacting an immune cell with a vector expressing a CAR (such as in vivo).
As used herein, the term “NK cell,” also known as natural killer cell, refers to a type of lymphocyte that originates in the bone marrow and plays a critical role in the innate immune system. NK cells provide rapid immune responses against virus-infected cells, tumor cells or other stressed cells, even in the absence of antibodies and major histocompatibility complex on the cell surfaces. NK cells for use in a cell therapy and/or a CAR therapy may either be isolated or obtained from a commercially available source. Non-limiting examples of commercial NK cell lines include lines NK-92 (ATCC® CRL-2407™) and NK-92 MI (ATCC® CRL-2408™). Further examples include but are not limited to NK lines HANK1, KHYG-1, NKL, NK-YS, NOI-90, and YT. Non-limiting exemplary sources for such commercially available cell lines include the American Type Culture Collection, or ATCC, (http://www.atcc.org/) and the German Collection of Microorganisms and Cell Cultures (https://www.dsmz.de/).
As used herein, the term “T cell” refers to a type of lymphocyte that matures in the thymus. T cells play an important role in cell-mediated immunity and are distinguished from other lymphocytes, such as B cells, by the presence of a T-cell receptor on the cell surface. T-cells for use in a cell therapy and/or a CAR therapy may either be isolated or obtained from a commercially available source. “T cell” includes all types of immune cells expressing CD3 including T-helper cells (CD4+ cells), cytotoxic T-cells (CD8+ cells), natural killer T-cells, T-regulatory cells (Treg) and gamma-delta T cells. A “cytotoxic cell” includes CD8+ T cells, natural-killer (NK) cells, and neutrophils, which cells are capable of mediating cytotoxicity responses. Non-limiting examples of commercially available T-cell lines include lines BCL2 (AAA) Jurkat (ATCC® CRL-2902™), BCL2 (S70A) Jurkat (ATCC® CRL-2900™), BCL2 (S87A) Jurkat (ATCC® CRL-2901™), BCL2 Jurkat (ATCC® CRL-2899™), Neo Jurkat (ATCC® CRL-2898™), TALL-104 cytotoxic human T cell line (ATCC #CRL-11386). Further examples include but are not limited to mature T-cell lines, e.g., Deglis, EBT-8, HPB-MLp-W, HUT 78, HUT 102, Karpas 384, Ki 225, My-La, Se-Ax, SKW-3, SMZ-1 and T34; and immature T-cell lines, e.g., ALL-SIL, Be13, CCRF-CEM, CML-T1, DND-41, DU.528, EU-9, HD-Mar, HPB-ALL, H-SB2, HT-1, JK-T1, Jurkat, Karpas 45, KE-37, KOPT-K1, K-T1, L-KAW, Loucy, MAT, MOLT-1, MOLT 3, MOLT-4, MOLT 13, MOLT-16, MT-1, MT-ALL, P12/Ichikawa, Peer, PER0117, PER-255, PF-382, PFI-285, RPMI-8402, ST-4, SUP-T1 to T14, TALL-1, TALL-101, TALL-103/2, TALL-104, TALL-105, TALL-106, TALL-107, TALL-197, TK-6, TLBR-1, -2, -3, and -4, CCRF-HSB-2 (CCL-120.1), J.RT3-T3.5 (ATCC TIB-153), J45.01 (ATCC CRL-1990), J.CaM1.6 (ATCC CRL-2063), RS4;11 (ATCC CRL-1873), CCRF-CEM (ATCC CRM-CCL-119); and cutaneous T-cell lymphoma lines, e.g., HuT78 (ATCC CRM-TIB-161), MJ[G11] (ATCC CRL-8294), HuT102 (ATCC TIB-162).
Null leukemia cell lines, including but not limited to REH, NALL-1, KM-3, L92-221, are another commercially available source of immune cells for use in a CAR therapy, as are cell lines derived from other leukemias and lymphomas, such as K562 erythroleukemia, THP-1 monocytic leukemia, U937 lymphoma, HEL erythroleukemia, HL60 leukemia, HMC-1 leukemia, KG-1 leukemia, U266 myeloma. Non-limiting exemplary sources for such commercially available cell lines include the American Type Culture Collection, or ATCC, (http://www.atcc.org/) and the German Collection of Microorganisms and Cell Cultures (https://www.dsmz.de/).
As used herein, a “tumor-specific antigen” refers to an antigenic substance produced in tumor cells, capable of triggering an immune response in a subject. In some embodiments, such a tumor-specific antigen is not expressed on or in a cell in the subject that is not a cancer cell. In some embodiments, such a tumor-specific antigen may still be expressed in or on some non-cancer cells. For example, a tumor-specific antigen may not be expressed on the cell surface of a non-cancer cell in the subject. In one embodiment, the tumor-specific antigen may be expressed in or on a non-cancer cell of the subject, but at a much lower level compared to a cancer cell. In another embodiment, the tumor-specific antigen may be expressed in or on a non-cancer cell of the subject which is not adjacent to a cancer or a cancer cell. Non-limiting examples of a tumor-specific antigen include: Alphafetoprotein (AFP), Beta-2-microglobulin (B2M), Beta-human chorionic gonadotropin (Beta-hCG), Bladder Tumor Antigen (BTA), C-kit/CD117, CA15-3/CA27.29, CA19-9, CA-125, CA 27.29, Calcitonin, Carcinoembryonic antigen (CEA), Chromogranin A (CgA), Cytokeratin fragment 21-1, Des-gamma-carboxy prothrombin (DCP), Estrogen receptor (ER)/progesterone receptor (PR), Epithelial tumor antigen (ETA), Fibrin/fibrinogen, Gastrin, HE4, overexpressed HER2/neu, 5-HIAA, Lactate dehydrogenase, Melanoma-associated antigen (MAGE), MUC-1, Neuron-specific enolase (NSE), Nuclear matrix protein 22, Programmed death ligand 1 (PD-L1), Prostate-specific antigen (PSA), Prostatic Acid Phosphatase (PAP), Soluble mesothelin-related peptides (SMRP), Somatostatin receptor, Tyrosinase, Thyroglobulin, abnormal products of ras, p53, alpha folate receptor, 5T4, αvβ6 integrin, BCMA, B7-H3, B7-H6, CAIX, CD16, CD19, CD20, CD22, CD25, CD30, CD33, CD44, CD44v6, CD44v7/8, CD70, CD79a, CD79b, CD123, CD138, CD171, CEA, CSPG4, EGFR, EGFR family including ErbB2 (HER2), EGFRvIII, EGP2, EGP40, EPCAM, EphA2, EpCAM, FAP, fetal AchR, FRα, GD2, GD3, Glypican-3 (GPC3), HLA-A1+MAGE1, HLA-A2+MAGE1, HLA-A3+MAGE1, HLA-A1+NY-ESO-1, HLA-A2+NY-ESO-1, HLA-A3+NY-ESO-1, IL-11Rα, IL-13Rα2, Lambda, Lewis-Y, Kappa, Mesothelin, Muc1, Muc16, NCAM, NKG2D Ligands, NY-ESO-1, PRAME, PSCA, PSMA, ROR1, SSX, Survivin, TAG72, TEMs, VEGFR2, and WT-1.
“An effective amount” or “therapeutically effective amount” intends to indicate the amount of a compound or agent administered or delivered to the patient which is most likely to result in the desired response to treatment. The amount is empirically determined by the patient's clinical parameters including, but not limited to, the stage of disease, age, gender, histology, and likelihood of tumor recurrence.
A “patient” as used herein intends an animal patient, a mammal patient or yet further a human patient. For the purpose of illustration only, a mammal includes but is not limited to a simian, a murine, a bovine, an equine, a porcine or an ovine subject. The patient can be a female or male.
The term “clinical outcome”, “clinical parameter”, “clinical response”, or “clinical endpoint” refers to any clinical observation or measurement relating to a patient's reaction to a therapy. Non-limiting examples of clinical outcomes include tumor response (TR), overall survival (OS), progression free survival (PFS), disease free survival, time to tumor recurrence (TTR), time to tumor progression (TTP), relative risk (RR), objective response rate (RR or ORR), toxicity or side effect.
The term “suitable for a therapy” or “suitably treated with a therapy” shall mean that the patient is likely to exhibit one or more desirable clinical outcomes as compared to patients having the same disease and receiving the same therapy but possessing a different characteristic that is under consideration for the purpose of the comparison. In one aspect, the characteristic under consideration is a genetic polymorphism or a somatic mutation. In another aspect, the characteristic under consideration is expression level of a gene or a polypeptide. In one aspect, a more desirable clinical outcome is relatively higher likelihood of or relatively better tumor response such as tumor load reduction. In another aspect, a more desirable clinical outcome is relatively longer overall survival. In yet another aspect, a more desirable clinical outcome is relatively longer progression free survival or time to tumor progression. In yet another aspect, a more desirable clinical outcome is relatively longer disease free survival. In yet a further aspect, a more desirable clinical outcome is relative reduction or delay in tumor recurrence. In another aspect, a more desirable clinical outcome is relatively decreased metastasis. In another aspect, a more desirable clinical outcome is relatively lower relative risk. In yet another aspect, a more desirable clinical outcome is relatively reduced toxicity or side effects. In some embodiments, more than one clinical outcome is considered simultaneously. In one such aspect, a patient possessing a characteristic, such as a genotype of a genetic polymorphism, can exhibit more than one more desirable clinical outcome as compared to patients having the same disease and receiving the same therapy but not possessing the characteristic. As defined herein, the patient is considered suitable for the therapy.
In another such aspect, a patient possessing a characteristic can exhibit one or more desirable clinical outcomes but simultaneously exhibit one or more less desirable clinical outcomes. The clinical outcomes will then be considered collectively, and a decision as to whether the patient is suitable for the therapy will be made accordingly, taking into account the patient's specific situation and the relevance of the clinical outcomes. In some embodiments, progression free survival or overall survival is weighted more heavily than tumor response in a collective decision making.
Response criteria can be based on the RECIST criteria (Therasse and Arbuck et al., 2000, New Guidelines to Evaluate Response to Treatment in Solid Tumors, J Natl Cancer Inst, 92:205-16). A “complete response” (CR) to a therapy refers to the clinical status of a patient with evaluable but non-measurable disease, whose tumor and all evidence of disease have disappeared following administration of the therapy. In this context, a “partial response” (PR) refers to a response that is anything less than a complete response. “Stable disease” (SD) indicates that the patient is stable following the therapy. “Progressive disease” (PD) indicates that the tumor has grown (i.e. become larger) or spread (i.e. metastasized to another tissue or organ) or the overall cancer has gotten worse following the therapy. For example, tumor growth of more than 20 percent since the start of therapy typically indicates progressive disease. “Non-response” (NR) to a therapy refers to status of a patient whose tumor or evidence of disease has remained constant or has progressed.
“Overall Survival” (OS) refers to the length of time of a cancer patient remaining alive following a cancer therapy.
“Progression free survival” (PFS) or “Time to Tumor Progression” (TTP) refers to the length of time following a therapy, during which the tumor in a cancer patient does not grow. Progression-free survival includes the amount of time a patient has experienced a complete response, partial response or stable disease.
“Disease free survival” refers to the length of time following a therapy, during which a cancer patient survives with no signs of the cancer or tumor.
“Time to Tumor Recurrence (TTR)” refers to the length of time, following a cancer therapy such as surgical resection or chemotherapy, until the tumor has reappeared (come back). The tumor may come back to the same place as the original (primary) tumor or to another place in the body.
“Relative Risk” (RR), in statistics and mathematical epidemiology, refers to the risk of an event (or of developing a disease) relative to exposure. Relative risk is a ratio of the probability of the event occurring in the exposed group versus a non-exposed group.
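The ratio described above can be illustrated with a minimal Python sketch. The function name `relative_risk` and the example counts are hypothetical and are not taken from this disclosure; the sketch simply divides the event probability in the exposed group by the event probability in the non-exposed group.

```python
def relative_risk(exposed_events, exposed_total,
                  unexposed_events, unexposed_total):
    """Ratio of event probability in an exposed group to that in a
    non-exposed group. All names here are illustrative assumptions."""
    if exposed_total <= 0 or unexposed_total <= 0:
        raise ValueError("group sizes must be positive")
    risk_exposed = exposed_events / exposed_total
    risk_unexposed = unexposed_events / unexposed_total
    if risk_unexposed == 0:
        raise ZeroDivisionError("no events in the non-exposed group; "
                                "relative risk is undefined")
    return risk_exposed / risk_unexposed


if __name__ == "__main__":
    # Hypothetical counts: 4 of 8 exposed and 2 of 8 non-exposed
    # subjects experience the event.
    print(relative_risk(4, 8, 2, 8))
```

A value above 1 indicates a higher event probability in the exposed group, and a value below 1 indicates a lower one.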
“Objective response rate” refers to the proportion of responders (patients with either a partial response (PR) or a complete response (CR)) compared to nonresponders (patients with either SD or PD). Response duration can be measured from the time of initial response until documented tumor progression.
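As a hedged illustration of the definition above, the following Python sketch computes an objective response rate from a list of per-patient best-response codes. It assumes, as one common convention that this disclosure does not mandate, that the denominator is all patients assessed as CR, PR, SD or PD; the function name and the input data are hypothetical.

```python
from collections import Counter

RESPONDER_CODES = {"CR", "PR"}      # complete or partial response
NONRESPONDER_CODES = {"SD", "PD"}   # stable or progressive disease


def objective_response_rate(best_responses):
    """Fraction of assessed patients whose best response is CR or PR.

    Assumption: the denominator is every patient coded CR, PR, SD or PD.
    """
    counts = Counter(best_responses)
    unknown = set(counts) - RESPONDER_CODES - NONRESPONDER_CODES
    if unknown:
        raise ValueError(f"unrecognized response codes: {sorted(unknown)}")
    responders = sum(counts[code] for code in RESPONDER_CODES)
    total = responders + sum(counts[code] for code in NONRESPONDER_CODES)
    if total == 0:
        raise ValueError("no assessed patients")
    return responders / total


if __name__ == "__main__":
    # Hypothetical cohort of four patients.
    print(objective_response_rate(["CR", "PR", "SD", "PD"]))
```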
The term “identify” or “identifying” is to associate or affiliate a patient closely to a group or population of patients who likely experience the same or a similar clinical response to a therapy.
The term “selecting” a patient for a therapy refers to making an indication that the selected patient is suitable for the therapy. Such an indication can be made in writing by, for instance, a handwritten prescription or a computerized report making the corresponding prescription or recommendation.
A “normal cell corresponding to the tumor tissue type” refers to a normal cell from the same tissue type as the tumor tissue. A non-limiting example is a normal lung cell from a patient having a lung tumor, or a normal colon cell from a patient having a colon tumor.
The term “amplification” or “amplify” as used herein means one or more methods known in the art for copying a target nucleic acid, thereby increasing the number of copies of a selected nucleic acid sequence. Amplification can be exponential or linear. A target nucleic acid can be either DNA or RNA. The sequences amplified in this manner form an “amplicon.” While the exemplary methods described hereinafter relate to amplification using the polymerase chain reaction (“PCR”), numerous other methods are known in the art for amplification of nucleic acids (e.g., isothermal methods, rolling circle methods, etc.). The skilled artisan will understand that these other methods can be used either in place of, or together with, PCR methods.
The term “complement” as used herein means the complementary sequence to a nucleic acid according to standard Watson/Crick base pairing rules. A complement sequence can also be a sequence of RNA complementary to the DNA sequence or its complement sequence, and can also be a cDNA. The term “substantially complementary” as used herein means that two sequences hybridize under stringent hybridization conditions. The skilled artisan will understand that substantially complementary sequences need not hybridize along their entire length. In particular, substantially complementary sequences comprise a contiguous sequence of bases that do not hybridize to a target or marker sequence, positioned 3′ or 5′ to a contiguous sequence of bases that hybridize under stringent hybridization conditions to a target or marker sequence.
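The Watson/Crick pairing rules referred to above (A with T, or U in RNA, and G with C) can be sketched in Python as follows. The function names are hypothetical, and the sketch handles only the unambiguous bases A, C, G and T; it does not model hybridization stringency or partial complementarity.

```python
# Translation table for standard Watson/Crick DNA base pairing.
_DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")
_VALID_BASES = set("ACGTacgt")


def complement(dna):
    """Base-by-base complement of a DNA sequence, in the same order."""
    invalid = set(dna) - _VALID_BASES
    if invalid:
        raise ValueError(f"unsupported bases: {sorted(invalid)}")
    return dna.translate(_DNA_COMPLEMENT)


def reverse_complement(dna):
    """Complementary strand written 5' to 3'."""
    return complement(dna)[::-1]


def rna_complement(dna):
    """RNA sequence complementary to a DNA sequence (U replaces T)."""
    return complement(dna).replace("T", "U").replace("t", "u")


if __name__ == "__main__":
    print(complement("ATGC"))
    print(reverse_complement("ATGC"))
    print(rna_complement("ATGC"))
```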
As used herein, the term “hybridize” or “specifically hybridize” refers to a process where two complementary nucleic acid strands anneal to each other under appropriately stringent conditions. Hybridizations are typically conducted with probe-length nucleic acid molecules. Nucleic acid hybridization techniques are well known in the art. Those skilled in the art understand how to estimate and adjust the stringency of hybridization conditions such that sequences having at least a desired level of complementarity will stably hybridize, while those having lower complementarity will not. For examples of hybridization conditions and parameters, see, e.g., Sambrook, et al., 1989, Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Press, Plainview, N. Y.; Ausubel, F. M. et al. 1994, Current Protocols in Molecular Biology. John Wiley & Sons, Secaucus, N.J.
“Primer” as used herein refers to an oligonucleotide that is capable of acting as a point of initiation of synthesis when placed under conditions in which primer extension is initiated (e.g., primer extension associated with an application such as PCR). The primer is complementary to a target nucleotide sequence and it hybridizes to a substantially complementary sequence in the target and leads to addition of nucleotides to the 3′-end of the primer in the presence of a DNA or RNA polymerase. The 3′-nucleotide of the primer should generally be complementary to the target sequence at a corresponding nucleotide position for optimal expression and amplification. An oligonucleotide “primer” can occur naturally, as in a purified restriction digest, or can be produced synthetically. The term “primer” as used herein includes all forms of primers that can be synthesized, including peptide nucleic acid primers, locked nucleic acid primers, phosphorothioate modified primers, labeled primers, and the like.
Primers are typically between about 5 and about 100 nucleotides in length, such as between about 15 and about 60 nucleotides in length, such as between about 20 and about 50 nucleotides in length, such as between about 25 and about 40 nucleotides in length. In some embodiments, primers can be at least 8, at least 12, at least 16, at least 20, at least 25, at least 30, at least 35, at least 40, at least 45, at least 50, at least 55, at least 60 nucleotides in length. An optimal length for a particular primer application can be readily determined in the manner described in H. Erlich, PCR Technology. Principles and Application for DNA Amplification (1989).
“Probe” as used herein refers to nucleic acid that interacts with a target nucleic acid via hybridization. A probe can be fully complementary to a target nucleic acid sequence or partially complementary. The level of complementarity will depend on many factors based, in general, on the function of the probe. A probe or probes can be used, for example to detect the presence or absence of a mutation in a nucleic acid sequence by virtue of the sequence characteristics of the target. Probes can be labeled or unlabeled, or modified in any of a number of ways well known in the art. A probe can specifically hybridize to a target nucleic acid.
Probes can be DNA, RNA or an RNA/DNA hybrid. Probes can be oligonucleotides, artificial chromosomes, fragmented artificial chromosome, genomic nucleic acid, fragmented genomic nucleic acid, RNA, recombinant nucleic acid, fragmented recombinant nucleic acid, peptide nucleic acid (PNA), locked nucleic acid, oligomer of cyclic heterocycles, or conjugates of nucleic acid. Probes can comprise modified nucleobases, modified sugar moieties, and modified internucleotide linkages. A probe can be fully complementary to a target nucleic acid sequence or partially complementary. A probe can be used to detect the presence or absence of a target nucleic acid. Probes are typically at least about 10, 15, 21, 25, 30, 35, 40, 50, 60, 75, 100 nucleotides or more in length.
“Detecting” as used herein refers to determining the presence of a nucleic acid of interest in a sample or the presence of a protein of interest in a sample. Detection does not require the method to provide 100% sensitivity and/or 100% specificity.
“Detectable label” as used herein refers to a molecule or a compound or a group of molecules or a group of compounds used to identify a nucleic acid or protein of interest. In some cases, the detectable label can be detected directly. In other cases, the detectable label can be a part of a binding pair, which can then be subsequently detected. Signals from the detectable label can be detected by various means and will depend on the nature of the detectable label. Detectable labels can be isotopes, fluorescent moieties, colored substances, and the like. Examples of means to detect detectable label include but are not limited to spectroscopic, photochemical, biochemical, immunochemical, electromagnetic, radiochemical, or chemical means, such as fluorescence, chemifluorescence, or chemiluminescence, or any other appropriate means.
“TaqMan® PCR detection system” as used herein refers to a method for real time PCR. In this method, a TaqMan® probe which hybridizes to the nucleic acid segment amplified is included in the PCR reaction mix. The TaqMan® probe comprises a donor and a quencher fluorophore on either end of the probe and in close enough proximity to each other so that the fluorescence of the donor is taken up by the quencher. However, when the probe hybridizes to the amplified segment, the 5′-exonuclease activity of the Taq polymerase cleaves the probe thereby allowing the donor fluorophore to emit fluorescence which can be detected.
As used herein, the term “sample” or “test sample” refers to any liquid or solid material containing nucleic acids. In suitable embodiments, a test sample is obtained from a biological source (i.e., a “biological sample”), such as cells in culture or a tissue sample from an animal, preferably, a human. In an exemplary embodiment, the sample is a tumor or liquid biopsy sample.
“Target nucleic acid” as used herein refers to segments of a chromosome, a complete gene with or without intergenic sequence, segments or portions of a gene with or without intergenic sequence, or sequence of nucleic acids to which probes or primers are designed. Target nucleic acids can include wild type sequences, nucleic acid sequences containing mutations, deletions or duplications, tandem repeat regions, a gene of interest, a region of a gene of interest or any upstream or downstream region thereof. Target nucleic acids can represent alternative sequences or alleles of a particular gene. Target nucleic acids can be derived from genomic DNA, cDNA, or RNA. As used herein, target nucleic acid can be native DNA or a PCR-amplified product.
As used herein the term “stringency” is used in reference to the conditions of temperature, ionic strength, and the presence of other compounds, under which nucleic acid hybridizations are conducted. With high stringency conditions, nucleic acid base pairing will occur only between nucleic acids that have sufficiently long segments with a high frequency of complementary base sequences. Exemplary hybridization conditions are as follows. High stringency generally refers to conditions that permit hybridization of only those nucleic acid sequences that form stable hybrids in 0.018 M NaCl at 65° C. High stringency conditions can be provided, for example, by hybridization in 50% formamide, 5×Denhardt's solution, 5×SSC (saline sodium citrate), 0.2% SDS (sodium dodecyl sulfate) at 42° C., followed by washing in 0.1×SSC, and 0.1% SDS at 65° C. Moderate stringency refers to conditions equivalent to hybridization in 50% formamide, 5×Denhardt's solution, 5×SSC, 0.2% SDS at 42° C., followed by washing in 0.2×SSC, 0.2% SDS, at 65° C. Low stringency refers to conditions equivalent to hybridization in 10% formamide, 5×Denhardt's solution, 6×SSC, 0.2% SDS, followed by washing in 1×SSC, 0.2% SDS, at 50° C.
As used herein the term “substantially identical” refers to a polypeptide or nucleic acid exhibiting at least 50%, 75%, 85%, 90%, 95%, or even 99% identity to a reference amino acid or nucleic acid sequence over the region of comparison. For polypeptides, the length of comparison sequences will generally be at least 20, 30, 40, or 50 amino acids or more, or the full length of the polypeptide. For nucleic acids, the length of comparison sequences will generally be at least 10, 15, 20, 25, 30, 40, 50, 75, or 100 nucleotides or more, or the full length of the nucleic acid.
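To make the percent-identity thresholds above concrete, the following Python sketch compares two sequences over a region of comparison. It assumes the two sequences have already been aligned and trimmed to the same length; producing that alignment is outside the scope of the sketch. The function names and example sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions that match between two pre-aligned,
    equal-length sequences (the region of comparison)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("region of comparison is empty")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)


def is_substantially_identical(seq_a, seq_b, threshold_percent):
    """True if identity over the region meets the chosen threshold,
    for example 50, 75, 85, 90, 95 or 99 percent."""
    return percent_identity(seq_a, seq_b) >= threshold_percent


if __name__ == "__main__":
    # Hypothetical 10-nucleotide region differing at one position.
    reference = "ACGTACGTAC"
    candidate = "ACGTACGTAA"
    print(percent_identity(reference, candidate))
    print(is_substantially_identical(reference, candidate, 90))
```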
“TP53 gene” or “tumor protein P53 gene” is a gene that provides instructions for making the tumor suppressor protein p53. The protein p53 plays a role in regulating cell division by preventing cells from growing or proliferating too fast. P53 attaches directly to DNA when DNA damage is detected, where p53 determines whether the DNA will be repaired or whether the cell will undergo apoptosis. If the DNA can be repaired, p53 activates DNA repair genes to fix the damage. P53 is crucial in preventing the development of tumors. Mutations in the TP53 gene are universal across cancer types. TP53 mutations are correlated to the onset of various cancers, including but not limited to, breast cancer, bladder cancer, cholangiocarcinoma, lung cancer, melanoma and ovarian cancer.
“EGFR gene” or “epidermal growth factor receptor gene” is a gene that encodes the EGFR protein. EGFR is a transmembrane glycoprotein with protein kinase activity. Mutations in the EGFR gene have been correlated with many types of cancer, including but not limited to non-small cell lung cancer, glioblastoma, and basal-like breast cancers. Tyrosine kinase inhibitors have shown efficacy in EGFR-amplified tumors. Thus, TK inhibitors can be an aggressive therapy for the cancers having less favorable prognosis with EGFR as the marker for treatment.
“BRAF gene” or “B-Raf proto-oncogene” is a gene that encodes a RAF serine/threonine protein kinase. BRAF plays a role in regulating cell division, differentiation and secretion. Mutations in BRAF are often correlated with cancer-causing mutations in melanoma and other forms of cancer as well.
The “KIT” gene (also known as c-Kit) encodes a receptor tyrosine kinase. As disclosed by the National Library of Medicine (https://www.ncbi.nlm.nih.gov/gene/3815, last accessed on Dec. 12, 2022), the gene was initially identified as a homolog of the feline sarcoma viral oncogene v-kit and is often referred to as proto-oncogene c-Kit. The canonical form of this glycosylated transmembrane protein has an N-terminal extracellular region with five immunoglobulin-like domains, a transmembrane region, and an intracellular tyrosine kinase domain at the C-terminus. Upon activation by its cytokine ligand, stem cell factor (SCF), this protein phosphorylates multiple intracellular proteins that play a role in the proliferation, differentiation, migration and apoptosis of many cell types and thereby plays an important role in hematopoiesis, stem cell maintenance, gametogenesis, melanogenesis, and in mast cell development, migration and function. This protein can be a membrane-bound or soluble protein. Mutations in this gene are associated with gastrointestinal stromal tumors, mast cell disease, acute myelogenous leukemia, and piebaldism. Multiple transcript variants encoding different isoforms have been found for this gene. See also, GeneCards (https://www.genecards.org/cgi-bin/carddisp.pl?gene=KIT, last accessed on Dec. 12, 2022).
The “KMT2C” gene is a member of the myeloid/lymphoid or mixed-lineage leukemia (MLL) family and encodes a nuclear protein with an AT hook DNA-binding domain, a DHHC-type zinc finger, six PHD-type zinc fingers, a SET domain, a post-SET domain and a RING-type zinc finger. This protein is a member of the ASC-2/NCOA6 complex (ASCOM), which possesses histone methylation activity and is involved in transcriptional coactivation. Sequence information for the gene and the encoded protein is found at GeneCards (https://www.genecards.org/cgi-bin/carddisp.pl?gene=KMT2C, last accessed on Dec. 12, 2022).
The “ELF3” gene enables DNA-binding transcription activator activity, RNA polymerase II-specific and sequence-specific double-stranded DNA binding activity. Involved in inflammatory response; negative regulation of transcription, DNA-templated; and positive regulation of transcription by RNA polymerase II. Located in Golgi apparatus; cytosol; and nucleoplasm. Sequence information for the gene and its encoded protein can be found at GeneCards (https://www.genecards.org/cgi-bin/carddisp.pl?gene=ELF3, last accessed on Dec. 12, 2022).
The “APC” gene encodes a tumor suppressor protein that acts as an antagonist of the Wnt signaling pathway. It is also involved in other processes including cell migration and adhesion, transcriptional activation, and apoptosis. Defects in this gene cause familial adenomatous polyposis (FAP), an autosomal dominant pre-malignant disease that usually progresses to malignancy. Mutations in the APC gene have been found to occur in most colorectal cancers, where disease-associated mutations tend to be clustered in a small region designated the mutation cluster region (MCR) and result in a truncated protein product. Sequence information for the gene and the encoded protein is found at GeneCards (https://www.genecards.org/cgi-bin/carddisp.pl?gene=APC, last accessed on Dec. 12, 2022).
The “ARID1A” gene encodes a member of the SWI/SNF family, whose members have helicase and ATPase activities and are thought to regulate transcription of certain genes by altering the chromatin structure around those genes. The encoded protein is part of the large ATP-dependent chromatin remodeling complex SNF/SWI, which is required for transcriptional activation of genes normally repressed by chromatin. It possesses at least two conserved domains that could be important for its function. Two transcript variants encoding different isoforms have been found for this gene. Sequence information for the gene and the encoded proteins can be found at GeneCards (https://www.genecards.org/cgi-bin/carddisp.pl?gene=ARID1A, last accessed on Dec. 12, 2022).
A “composition” typically intends a combination of the active agent, e.g., compound or composition, and a naturally-occurring or non-naturally-occurring carrier, inert (for example, a detectable agent or label) or active, such as an adjuvant, diluent, binder, stabilizer, buffers, salts, lipophilic solvents, preservative or the like, and includes pharmaceutically acceptable carriers. Carriers also include pharmaceutical excipients and additives such as proteins, peptides, amino acids, lipids, and carbohydrates (e.g., sugars, including monosaccharides, di-, tri-, tetra-oligosaccharides, and oligosaccharides; derivatized sugars such as alditols, aldonic acids, esterified sugars and the like; and polysaccharides or sugar polymers), which can be present singly or in combination, comprising alone or in combination 1-99.99% by weight or volume. Exemplary protein excipients include serum albumin such as human serum albumin (HSA), recombinant human albumin (rHA), gelatin, casein, and the like. Representative amino acid/antibody components, which can also function in a buffering capacity, include alanine, arginine, glycine, betaine, histidine, glutamic acid, aspartic acid, cysteine, lysine, leucine, isoleucine, valine, methionine, phenylalanine, aspartame, and the like. Carbohydrate excipients are also intended within the scope of this technology, examples of which include but are not limited to monosaccharides such as fructose, maltose, galactose, glucose, D-mannose, sorbose, and the like; disaccharides, such as lactose, sucrose, trehalose, cellobiose, and the like; polysaccharides, such as raffinose, melezitose, maltodextrins, dextrans, starches, and the like; and alditols, such as mannitol, xylitol, maltitol, lactitol, sorbitol (glucitol) and myoinositol.
As used herein, the terms “nucleic acid sequence” and “polynucleotide” are used interchangeably to refer to a polymeric form of nucleotides of any length, either ribonucleotides or deoxyribonucleotides. Thus, this term includes, but is not limited to, single-, double-, or multi-stranded DNA or RNA, genomic DNA, cDNA, DNA-RNA hybrids, or a polymer comprising purine and pyrimidine bases or other natural, chemically or biochemically modified, non-natural, or derivatized nucleotide bases.
The term “encode” as it is applied to nucleic acid sequences refers to a polynucleotide which is said to “encode” a polypeptide if, in its native state or when manipulated by methods well known to those skilled in the art, can be transcribed and/or translated to produce the mRNA for the polypeptide and/or a fragment thereof. The antisense strand is the complement of such a nucleic acid, and the encoding sequence can be deduced therefrom.
As used herein, the term “vector” refers to a nucleic acid construct designed for transfer between different hosts, including but not limited to a plasmid, a virus, a cosmid, a phage, a BAC, a YAC, etc. In some embodiments, plasmid vectors may be prepared from commercially available vectors. In other embodiments, viral vectors may be produced from baculoviruses, retroviruses, adenoviruses, AAVs, etc. according to techniques known in the art. In one embodiment, the viral vector is a lentiviral vector. It is to be understood that the vectors contain the necessary regulatory elements for replication or expression of the inserted polynucleotide, including for example promoters or enhancer elements.
The term “promoter” as used herein refers to any sequence that regulates the expression of a coding sequence, such as a gene. Promoters may be constitutive, inducible, repressible, or tissue-specific, for example. A “promoter” is a control sequence that is a region of a polynucleotide sequence at which initiation and rate of transcription are controlled. It may contain genetic elements at which regulatory proteins and molecules may bind such as RNA polymerase and other transcription factors.
As used herein, the term “isolated cell” generally refers to a cell that is substantially separated from other cells of a tissue. “Immune cells” includes, e.g., white blood cells (leukocytes) which are derived from hematopoietic stem cells (HSC) produced in the bone marrow, lymphocytes (T cells, B cells, natural killer (NK) cells) and myeloid-derived cells (neutrophil, eosinophil, basophil, monocyte, macrophage, dendritic cells). “T cell” includes all types of immune cells expressing CD3 including T-helper cells (CD4+ cells), cytotoxic T-cells (CD8+ cells), natural killer T-cells, T-regulatory cells (Treg) and gamma-delta T cells. A “cytotoxic cell” includes CD8+ T cells, natural-killer (NK) cells, and neutrophils, which cells are capable of mediating cytotoxicity responses.
The term “transduce” or “transduction” as it is applied to the production of chimeric antigen receptor cells refers to the process whereby a foreign nucleotide sequence is introduced into a cell. In some embodiments, this transduction is done via a vector.
As used herein, the term “autologous,” in reference to cells refers to cells that are isolated and infused back into the same subject (recipient or host). “Allogeneic” refers to non-autologous cells.
An “effective amount” or “efficacious amount” refers to the amount of an agent, or combined amounts of two or more agents, that, when administered for the treatment of a mammal or other subject, is sufficient to effect such treatment for the disease. The “effective amount” will vary depending on the agent(s), the disease and its severity and the age, weight, etc., of the subject to be treated.
A “solid tumor” is an abnormal mass of tissue that usually does not contain cysts or liquid areas. Solid tumors can be benign or malignant. Different types of solid tumors are named for the type of cells that form them. Examples of solid tumors include sarcomas, carcinomas, and lymphomas.
As used herein, the term “label” intends a directly or indirectly detectable compound or composition that is conjugated directly or indirectly to the composition to be detected, e.g., N-terminal histidine tags (N-His), magnetically active isotopes, e.g., 115Sn, 117Sn and 119Sn, non-radioactive isotopes such as 13C and 15N, or a polynucleotide or protein such as an antibody, so as to generate a “labeled” composition. The term also includes sequences conjugated to the polynucleotide that will provide a signal upon expression of the inserted sequences, such as green fluorescent protein (GFP) and the like. The label may be detectable by itself (e.g., radioisotope labels or fluorescent labels) or, in the case of an enzymatic label, may catalyze chemical alteration of a substrate compound or composition which is detectable. The labels can be suitable for small scale detection or more suitable for high-throughput screening. As such, suitable labels include, but are not limited to, magnetically active isotopes, non-radioactive isotopes, radioisotopes, fluorochromes, chemiluminescent compounds, dyes, and proteins, including enzymes. The label may be simply detected or it may be quantified. A response that is simply detected generally comprises a response whose existence merely is confirmed, whereas a response that is quantified generally comprises a response having a quantifiable (e.g., numerically reportable) value such as an intensity, polarization, and/or other property. In luminescence or fluorescence assays, the detectable response may be generated directly using a luminophore or fluorophore associated with an assay component actually involved in binding, or indirectly using a luminophore or fluorophore associated with another (e.g., reporter or indicator) component. Examples of luminescent labels that produce signals include, but are not limited to, bioluminescence and chemiluminescence.
Detectable luminescence response generally comprises a change in, or an occurrence of a luminescence signal. Suitable methods and luminophores for luminescently labeling assay components are known in the art and described for example in Haugland, Richard P. (1996) Handbook of Fluorescent Probes and Research Chemicals (6th ed). Examples of luminescent probes include, but are not limited to, aequorin and luciferases.
Examples of suitable fluorescent labels include, but are not limited to, fluorescein, rhodamine, tetramethylrhodamine, eosin, erythrosin, coumarin, methyl-coumarins, pyrene, Malachite green, stilbene, Lucifer Yellow, Cascade Blue™, and Texas Red. Other suitable optical dyes are described in Haugland, Richard P. (1996) Handbook of Fluorescent Probes and Research Chemicals (6th ed.).
In another aspect, the fluorescent label is functionalized to facilitate covalent attachment to a cellular component present in or on the surface of the cell or tissue such as a cell surface marker. Suitable functional groups, include, but are not limited to, isothiocyanate groups, amino groups, haloacetyl groups, maleimides, succinimidyl esters, and sulfonyl halides, all of which may be used to attach the fluorescent label to a second molecule. The choice of the functional group of the fluorescent label will depend on the site of attachment to either a linker, the agent, the marker, or the second labeling agent.
As used herein, the term “immunoconjugate” comprises an antibody or an antibody derivative associated with or linked to a second agent, such as a cytotoxic agent, a detectable agent, a radioactive agent, a targeting agent, a human antibody, a humanized antibody, a chimeric antibody, a synthetic antibody, a semisynthetic antibody, or a multispecific antibody.
“Immune response” broadly refers to the antigen-specific responses of lymphocytes to foreign substances. The terms “immunogen” and “immunogenic” refer to molecules with the capacity to elicit an immune response. All immunogens are antigens, however, not all antigens are immunogenic. An immune response disclosed herein can be humoral (via antibody activity) or cell-mediated (via T cell activation). The response may occur in vivo or in vitro. The skilled artisan will understand that a variety of macromolecules, including proteins, nucleic acids, fatty acids, lipids, lipopolysaccharides and polysaccharides have the potential to be immunogenic. The skilled artisan will further understand that nucleic acids encoding a molecule capable of eliciting an immune response necessarily encode an immunogen. The artisan will further understand that immunogens are not limited to full-length molecules, but may include partial molecules.
A host cell can be a eukaryotic or a prokaryotic cell. “Eukaryotic cells” comprise all of the life kingdoms except monera. They can be easily distinguished through a membrane-bound nucleus. Animals, plants, fungi, and protists are eukaryotes or organisms whose cells are organized into complex structures by internal membranes and a cytoskeleton. The most characteristic membrane-bound structure is the nucleus. Unless specifically recited, the term “host” includes a eukaryotic host, including, for example, yeast, higher plant, insect and mammalian cells. Non-limiting examples of eukaryotic cells or hosts include simian, bovine, porcine, murine, rat, avian, reptilian and human.
“Prokaryotic cells” usually lack a nucleus or any other membrane-bound organelles and are divided into two domains, bacteria and archaea. In addition to chromosomal DNA, these cells can also contain genetic information in a circular loop called an episome. Bacterial cells are very small, roughly the size of an animal mitochondrion (about 1-2 μm in diameter and 10 μm long). Prokaryotic cells feature three major shapes: rod shaped, spherical, and spiral. Instead of going through elaborate replication processes like eukaryotes, bacterial cells divide by binary fission. Examples include but are not limited to Bacillus bacteria, E. coli bacterium, and Salmonella bacterium.
As used herein, the term “detectable marker” refers to at least one marker capable of directly or indirectly producing a detectable signal. A non-exhaustive list of this marker includes enzymes which produce a detectable signal, for example by colorimetry, fluorescence, luminescence, such as horseradish peroxidase, alkaline phosphatase, β-galactosidase, glucose-6-phosphate dehydrogenase, chromophores such as fluorescent, luminescent dyes, groups with electron density detected by electron microscopy or by their electrical property such as conductivity, amperometry, voltammetry, impedance, detectable groups, for example whose molecules are of sufficient size to induce detectable modifications in their physical and/or chemical properties, such detection may be accomplished by optical methods such as diffraction, surface plasmon resonance, surface variation, the contact angle change or physical methods such as atomic force spectroscopy, tunnel effect, or radioactive molecules such as 32P, 35S or 125I.
As used herein, the term “purification label” refers to at least one marker useful for purification or identification. A non-exhaustive list of this marker includes His, lacZ, GST, maltose-binding protein, NusA, BCCP, c-myc, CaM, FLAG, GFP, YFP, cherry, thioredoxin, poly (NANP), V5, Snap, HA, chitin-binding protein, Softag 1, Softag 3, Strep, or S-protein. Suitable direct or indirect fluorescence markers comprise FLAG, GFP, YFP, RFP, dTomato, cherry, Cy3, Cy 5, Cy 5.5, Cy 7, DNP, AMCA, Biotin, Digoxigenin, Tamra, Texas Red, rhodamine, Alexa fluors, FITC, TRITC or any other fluorescent dye or hapten.
As used herein, the term “expression” refers to the process by which polynucleotides are transcribed into mRNA and/or the process by which the transcribed mRNA is subsequently being translated into peptides, polypeptides, or proteins. If the polynucleotide is derived from genomic DNA, expression may include splicing of the mRNA in a eukaryotic cell. The expression level of a gene may be determined by measuring the amount of mRNA or protein in a cell or tissue sample. In one aspect, the expression level of a gene from one sample may be directly compared to the expression level of that gene from a control or reference sample. In another aspect, the expression level of a gene from one sample may be directly compared to the expression level of that gene from the same sample following administration of a compound.
As used herein, “homology” or “identical”, percent “identity” or “similarity”, when used in the context of two or more nucleic acids or polypeptide sequences, refers to two or more sequences or subsequences that are the same or have a specified percentage of nucleotides or amino acid residues that are the same, e.g., at least 60% identity, preferably at least 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or higher identity over a specified region (e.g., nucleotide sequence encoding an antibody described herein or amino acid sequence of an antibody described herein). Homology can be determined by comparing a position in each sequence that may be aligned for purposes of comparison. When a position in the compared sequence is occupied by the same base or amino acid, then the molecules are homologous at that position. A degree of homology between sequences is a function of the number of matching or homologous positions shared by the sequences. The alignment and the percent homology or sequence identity can be determined using software programs known in the art, for example those described in Current Protocols in Molecular Biology (Ausubel et al., eds. 1987) Supplement 30, section 7.7.18, Table 7.7.1. Preferably, default parameters are used for alignment. A preferred alignment program is BLAST, using default parameters. In particular, preferred programs are BLASTN and BLASTP, using the following default parameters: Genetic code=standard; filter=none; strand=both; cutoff=60; expect=10; Matrix=BLOSUM62; Descriptions=50 sequences; sort by=HIGH SCORE; Databases=non-redundant, GenBank+EMBL+DDBJ+PDB+GenBank CDS translations+SwissProtein+SPupdate+PIR. Details of these programs can be found at the following Internet address: ncbi.nlm.nih.gov/cgi-bin/BLAST. The terms “homology” or “identical”, percent “identity” or “similarity” also refer to, or can be applied to, the complement of a test sequence. 
The terms also include sequences that have deletions and/or additions, as well as those that have substitutions. As described herein, the preferred algorithms can account for gaps and the like. Preferably, identity exists over a region that is at least about 25 amino acids or nucleotides in length, or more preferably over a region that is at least 50-100 amino acids or nucleotides in length. An “unrelated” or “non-homologous” sequence shares less than 40% identity, or alternatively less than 25% identity, with one of the sequences disclosed herein.
“Administration” can be effected in one dose, continuously or intermittently throughout the course of treatment. Methods of determining the most effective means and dosage of administration are known to those of skill in the art and will vary with the composition used for therapy, the purpose of the therapy, the target cell being treated, and the subject being treated. Single or multiple administrations can be carried out with the dose level and pattern being selected by the treating physician. Suitable dosage formulations and methods of administering the agents are known in the art. Route of administration can also be determined and method of determining the most effective route of administration are known to those of skill in the art and will vary with the composition used for treatment, the purpose of the treatment, the health condition or disease stage of the subject being treated, and target cell or tissue. Non-limiting examples of route of administration include oral administration, nasal administration, infusion, injection, and topical application. As understood by those of skill in the art, the therapies can be co-administered with other therapies, such as immuno-oncology or chemotherapy. The therapies can be administered simultaneously or concurrently.
The phrase “first line” or “second line” or “third line” refers to the order of treatment received by a patient. First line therapy regimens are treatments given first, whereas second or third line therapies are given after the first line therapy or after the second line therapy, respectively. The National Cancer Institute defines first line therapy as “the first treatment for a disease or condition. In patients with cancer, primary treatment can be surgery, chemotherapy, radiation therapy, or a combination of these therapies.” First line therapy is also referred to by those skilled in the art as “primary therapy” and “primary treatment.” See National Cancer Institute website at www.cancer.gov, last visited on May 1, 2008. Typically, a patient is given a subsequent chemotherapy regimen because the patient did not show a positive clinical or sub-clinical response to the first line therapy or the first line therapy has stopped.
In one aspect, the term “equivalent” or “biological equivalent” of an antibody means the ability of the antibody to selectively bind its epitope protein or fragment thereof as measured by ELISA or other suitable methods. Biologically equivalent antibodies include, but are not limited to, those antibodies, peptides, antibody fragments, antibody variants, antibody derivatives and antibody mimetics that bind to the same epitope as the reference antibody.
It is to be inferred without explicit recitation and unless otherwise intended, that when the present disclosure relates to a polypeptide, protein, polynucleotide or antibody, an equivalent or a biologically equivalent of such is intended within the scope of this disclosure. As used herein, the term “biological equivalent thereof” is intended to be synonymous with “equivalent thereof” and, when referring to a reference protein, antibody, polypeptide or nucleic acid, intends those having minimal homology while still maintaining desired structure or functionality. Unless specifically recited herein, it is contemplated that any polynucleotide, polypeptide or protein mentioned herein also includes equivalents thereof. For example, an equivalent intends at least about 70% homology or identity, or alternatively at least 80%, or alternatively at least about 85%, or alternatively at least about 90%, or alternatively at least about 95%, or alternatively 98% homology or identity, and exhibits substantially equivalent biological activity to the reference protein, polypeptide or nucleic acid. Alternatively, when referring to polynucleotides, an equivalent thereof is a polynucleotide that hybridizes under stringent conditions to the reference polynucleotide or its complement.
A polynucleotide or polynucleotide region (or a polypeptide or polypeptide region) having a certain percentage (for example, 80%, 85%, 90%, or 95%) of “sequence identity” to another sequence means that, when aligned, that percentage of bases (or amino acids) are the same in comparing the two sequences. The alignment and the percent homology or sequence identity can be determined using software programs known in the art, for example those described in Current Protocols in Molecular Biology (Ausubel et al., eds. 1987) Supplement 30, section 7.7.18, Table 7.7.1. Preferably, default parameters are used for alignment. A preferred alignment program is BLAST, using default parameters. In particular, preferred programs are BLASTN and BLASTP, using the following default parameters: Genetic code=standard; filter=none; strand=both; cutoff=60; expect=10; Matrix=BLOSUM62; Descriptions=50 sequences; sort by=HIGH SCORE; Databases=non-redundant, GenBank+EMBL+DDBJ+PDB+GenBank CDS translations+SwissProtein+SPupdate+PIR. Details of these programs can be found at the following Internet address: ncbi.nlm.nih.gov/cgi-bin/BLAST.
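By way of non-limiting illustration, the percent “sequence identity” calculation described above can be sketched in Python as follows. The sketch assumes that the two sequences have already been aligned by a separate program and are supplied as equal-length strings in which “-” marks a gap. The function name percent_identity, and the choice to count a residue opposite a gap as a mismatch, are assumptions made for this illustration only; alignment programs may report identity over a different denominator.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of compared alignment columns holding the same base or amino acid.

    Assumption: inputs are pre-aligned, equal-length strings with '-' for gaps.
    Columns where both sequences have a gap are skipped; a residue opposite
    a gap is counted as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # nothing to compare in this column
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable positions in the alignment")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Nine of ten aligned bases are the same in this toy example.
    print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
```

Under these assumptions, a result at or above a recited threshold (for example, 80%, 85%, 90%, or 95%) would correspond to the stated percentage of sequence identity over the compared region.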
“Hybridization” refers to a reaction in which one or more polynucleotides react to form a complex that is stabilized via hydrogen bonding between the bases of the nucleotide residues. The hydrogen bonding may occur by Watson-Crick base pairing, Hoogsteen binding, or in any other sequence-specific manner. The complex may comprise two strands forming a duplex structure, three or more strands forming a multi-stranded complex, a single self-hybridizing strand, or any combination of these. A hybridization reaction may constitute a step in a more extensive process, such as the initiation of a PCR reaction, or the enzymatic cleavage of a polynucleotide by a ribozyme.
Examples of stringent hybridization conditions include: incubation temperatures of about 25° C. to about 37° C.; hybridization buffer concentrations of about 6×SSC to about 10×SSC; formamide concentrations of about 0% to about 25%; and wash solutions from about 4×SSC to about 8×SSC. Examples of moderate hybridization conditions include: incubation temperatures of about 40° C. to about 50° C.; buffer concentrations of about 9×SSC to about 2×SSC; formamide concentrations of about 30% to about 50%; and wash solutions of about 5×SSC to about 2×SSC. Examples of high stringency conditions include: incubation temperatures of about 55° C. to about 68° C.; buffer concentrations of about 1×SSC to about 0.1×SSC; formamide concentrations of about 55% to about 75%; and wash solutions of about 1×SSC, 0.1×SSC, or deionized water. In general, hybridization incubation times are from 5 minutes to 24 hours, with 1, 2, or more washing steps, and wash incubation times are about 1, 2, or 15 minutes. SSC is 0.15 M NaCl and 15 mM citrate buffer. It is understood that equivalents of SSC using other buffer systems can be employed.
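As a non-limiting worked example of the SSC concentrations recited above, the following Python sketch converts an “n×SSC” value into molar concentrations, using only the statement that 1×SSC is 0.15 M NaCl and 15 mM citrate buffer. The function and constant names are hypothetical and chosen for illustration.

```python
from typing import Tuple

# Composition of 1x SSC as stated in the text.
SSC_1X_NACL_M = 0.15      # mol/L NaCl
SSC_1X_CITRATE_M = 0.015  # mol/L citrate (15 mM)


def ssc_molarity(fold: float) -> Tuple[float, float]:
    """Return (NaCl, citrate) molar concentrations for an n-fold SSC buffer."""
    if fold < 0:
        raise ValueError("fold concentration cannot be negative")
    return fold * SSC_1X_NACL_M, fold * SSC_1X_CITRATE_M


if __name__ == "__main__":
    for fold in (10, 6, 2, 1, 0.1):
        nacl, citrate = ssc_molarity(fold)
        print(f"{fold}x SSC: {nacl:.3f} M NaCl, {citrate:.4f} M citrate")
```

For instance, 6×SSC corresponds to approximately 0.9 M NaCl and 0.09 M citrate, and 0.1×SSC to approximately 0.015 M NaCl and 1.5 mM citrate.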
A “normal cell corresponding to the tumor tissue type” refers to a normal cell from a same tissue type as the tumor tissue. A non-limiting example is a normal lung cell from a patient having lung tumor, or a normal colon cell from a patient having colon tumor.
The term “isolated” as used herein refers to molecules or biologicals or cellular materials being substantially free from other materials. In one aspect, the term “isolated” refers to nucleic acid, such as DNA or RNA, or protein or polypeptide (e.g., an antibody or derivative thereof), or cell or cellular organelle, or tissue or organ, separated from other DNAs or RNAs, or proteins or polypeptides, or cells or cellular organelles, or tissues or organs, respectively, that are present in the natural source. The term “isolated” also refers to a nucleic acid or peptide that is substantially free of cellular material, viral material, or culture medium when produced by recombinant DNA techniques, or chemical precursors or other chemicals when chemically synthesized. Moreover, an “isolated nucleic acid” is meant to include nucleic acid fragments which are not naturally occurring as fragments and would not be found in the natural state. The term “isolated” is also used herein to refer to polypeptides which are isolated from other cellular proteins and is meant to encompass both purified and recombinant polypeptides. The term “isolated” is also used herein to refer to cells or tissues that are isolated from other cells or tissues and is meant to encompass both cultured and engineered cells or tissues.
As used herein, the term “monoclonal antibody” refers to an antibody produced by a single clone of B-lymphocytes or by a cell into which the light and heavy chain genes of a single antibody have been transfected. Monoclonal antibodies are produced by methods known to those of skill in the art, for instance by making hybrid antibody-forming cells from a fusion of myeloma cells with immune spleen cells. Monoclonal antibodies include humanized monoclonal antibodies.
The terms “protein”, “peptide” and “polypeptide” are used interchangeably and in their broadest sense to refer to a compound of two or more subunit amino acids, amino acid analogs or peptidomimetics. The subunits may be linked by peptide bonds. In another aspect, the subunits may be linked by other bonds, e.g., ester, ether, etc. A protein or peptide must contain at least two amino acids and no limitation is placed on the maximum number of amino acids which may comprise a protein's or peptide's sequence. As used herein the term “amino acid” refers to either natural and/or unnatural or synthetic amino acids, including glycine and both the D and L optical isomers, amino acid analogs and peptidomimetics.
The terms “polynucleotide” and “oligonucleotide” are used interchangeably and refer to a polymeric form of nucleotides of any length, either deoxyribonucleotides or ribonucleotides or analogs thereof. Polynucleotides can have any three-dimensional structure and may perform any function, known or unknown. The following are non-limiting examples of polynucleotides: a gene or gene fragment (for example, a probe, primer, EST or SAGE tag), exons, introns, messenger RNA (mRNA), transfer RNA, ribosomal RNA, RNAi, ribozymes, cDNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probes and primers. A polynucleotide can comprise modified nucleotides, such as methylated nucleotides and nucleotide analogs. If present, modifications to the nucleotide structure can be imparted before or after assembly of the polynucleotide. The sequence of nucleotides can be interrupted by non-nucleotide components. A polynucleotide can be further modified after polymerization, such as by conjugation with a labeling component. The term also refers to both double- and single-stranded molecules. Unless otherwise specified or required, any aspect of this technology that is a polynucleotide encompasses both the double-stranded form and each of two complementary single-stranded forms known or predicted to make up the double-stranded form.
As used herein, the term “purified” does not require absolute purity; rather, it is intended as a relative term. Thus, for example, a purified nucleic acid, peptide, protein, biological complexes or other active compound is one that is isolated in whole or in part from proteins or other contaminants. Generally, substantially purified peptides, proteins, biological complexes, or other active compounds for use within the disclosure comprise more than 80% of all macromolecular species present in a preparation prior to admixture or formulation of the peptide, protein, biological complex or other active compound with a pharmaceutical carrier, excipient, buffer, absorption enhancing agent, stabilizer, preservative, adjuvant or other co-ingredient in a complete pharmaceutical formulation for therapeutic administration. More typically, the peptide, protein, biological complex or other active compound is purified to represent greater than 90%, often greater than 95% of all macromolecular species present in a purified preparation prior to admixture with other formulation ingredients. In other cases, the purified preparation may be essentially homogeneous, wherein other macromolecular species are not detectable by conventional techniques.
In one aspect, disclosed is a method of inhibiting the growth of a cancer cell or treating a cancer in a subject in need thereof, wherein the subject has a clustered mutation in one or more of, or alternatively two or more of, or alternatively three or more of, or alternatively four or more of, or alternatively five or more of, or alternatively six or more of, or alternatively all seven of TP53, EGFR, KIT, KMT2C, ELF3, APC and ARID1A gene(s), and/or lacks a clustered mutation in a BRAF gene, in a sample isolated from the subject. The method comprises, consists of, or consists essentially of administering an aggressive therapy to the subject, thereby inhibiting the growth of the cancer cell or treating the cancer in the subject.
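Purely as a non-limiting illustration, the selection criterion recited above can be expressed as a short Python sketch. The sketch assumes that the set of genes carrying a clustered mutation in the sample has already been determined by a separate analysis, which is not shown. The names evaluate_sample and min_genes are hypothetical, and the sketch reads the criterion inclusively: a sample qualifies if it has clustered mutations in at least min_genes of the seven listed genes, or if it lacks a clustered mutation in BRAF.

```python
from typing import Iterable, Set, Tuple

# The seven genes recited in the text.
CLUSTERED_GENES = frozenset(
    {"TP53", "EGFR", "KIT", "KMT2C", "ELF3", "APC", "ARID1A"}
)


def evaluate_sample(
    genes_with_clustered_mutation: Iterable[str], min_genes: int = 1
) -> Tuple[Set[str], bool, bool]:
    """Return (listed genes hit, lacks BRAF clustered mutation, qualifies).

    Assumption: the input is a precomputed collection of gene symbols in
    which a clustered mutation was called for the sample.
    """
    if not 1 <= min_genes <= len(CLUSTERED_GENES):
        raise ValueError("min_genes must be between 1 and 7")

    observed = {gene.upper() for gene in genes_with_clustered_mutation}
    hits = observed & CLUSTERED_GENES
    lacks_braf = "BRAF" not in observed

    # Inclusive reading of the criterion (an assumption of this sketch).
    qualifies = len(hits) >= min_genes or lacks_braf
    return hits, lacks_braf, qualifies


if __name__ == "__main__":
    print(evaluate_sample({"TP53", "BRAF"}))
```

Because the BRAF condition is an alternative in this reading, a sample with no clustered mutation in BRAF qualifies regardless of the other seven genes; an implementation intended for a narrower reading would replace the final `or` accordingly.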
The cancer cell can be an animal or a mammalian cell. Non-limiting examples of mammalian cells include human cells, non-human primate cells (e.g., apes, gibbons, chimpanzees, orangutans, monkeys, macaques, and the like), domestic animal cells (e.g., dogs and cats), farm animal cells (e.g., horses, cows, goats, sheep, pigs) and experimental animal cells (e.g., mouse, rat, rabbit, guinea pig). In some embodiments, the cell is a human cell. A mammal can be any age or at any stage of development (e.g., an adult, teen, child, infant, or a mammal in utero). A mammal can be male or female. In some embodiments, a subject is a human. In some embodiments, a subject has, is diagnosed as having, or is suspected of having a cancer.
The subject can be any animal, typically a mammal. Any suitable mammal can be treated by a method described herein. Non-limiting examples of mammals include humans, non-human primates (e.g., apes, gibbons, chimpanzees, orangutans, monkeys, macaques, and the like), domestic animals (e.g., dogs and cats), farm animals (e.g., horses, cows, goats, sheep, pigs) and experimental animals (e.g., mouse, rat, rabbit, guinea pig). In some embodiments, a mammal is a human. A mammal can be any age or at any stage of development (e.g., an adult, teen, child, infant, or a mammal in utero). A mammal can be male or female. In some embodiments, a subject is a human. In some embodiments, a subject has, is diagnosed as having, or is suspected of having a cancer.
In further aspects, the cancer cell or cancer is selected from a carcinoma, a sarcoma or a blood cancer. In yet further aspects, the cancer cell or cancer is selected from cancers of the circulatory system, for example, heart (sarcoma [angiosarcoma, fibrosarcoma, rhabdomyosarcoma, liposarcoma], myxoma, rhabdomyoma, fibroma, lipoma and teratoma), mediastinum and pleura, and other intrathoracic organs, vascular tumors and tumor-associated vascular tissue; respiratory tract, for example, nasal cavity and middle ear, accessory sinuses, larynx, trachea, bronchus and lung such as small cell lung cancer (SCLC), non-small cell lung cancer (NSCLC), bronchogenic carcinoma (squamous cell, undifferentiated small cell, undifferentiated large cell, adenocarcinoma), alveolar (bronchiolar) carcinoma, bronchial adenoma, sarcoma, lymphoma, chondromatous hamartoma, mesothelioma; gastrointestinal system, for example, esophagus (squamous cell carcinoma, adenocarcinoma, leiomyosarcoma, lymphoma), colon cancer, colorectal cancer, rectal cancer, stomach (carcinoma, lymphoma, leiomyosarcoma), gastric, pancreas (ductal adenocarcinoma, insulinoma, glucagonoma, gastrinoma, carcinoid tumors, vipoma), small bowel (adenocarcinoma, lymphoma, carcinoid tumors, Kaposi's sarcoma, leiomyoma, hemangioma, lipoma, neurofibroma, fibroma), large bowel (adenocarcinoma, tubular adenoma, villous adenoma, hamartoma, leiomyoma); gastrointestinal stromal tumors and neuroendocrine tumors arising at any site; genitourinary tract, for example, kidney (adenocarcinoma, Wilms' tumor [nephroblastoma], lymphoma, leukemia), bladder and/or urethra (squamous cell carcinoma, transitional cell carcinoma, adenocarcinoma), prostate (adenocarcinoma, sarcoma), testis (seminoma, teratoma, embryonal carcinoma, teratocarcinoma, choriocarcinoma, sarcoma, interstitial cell carcinoma, fibroma, fibroadenoma, adenomatoid tumors, lipoma); liver, for example, hepatoma (hepatocellular carcinoma), cholangiocarcinoma, hepatoblastoma, angiosarcoma, hepatocellular adenoma, hemangioma, pancreatic endocrine tumors (such as pheochromocytoma, insulinoma, vasoactive intestinal peptide tumor, islet cell tumor and glucagonoma); bone, for example, osteogenic sarcoma (osteosarcoma), fibrosarcoma, malignant fibrous histiocytoma, chondrosarcoma, Ewing's sarcoma, malignant lymphoma (reticulum cell sarcoma), multiple myeloma, malignant giant cell tumor, chordoma, osteochondroma (osteocartilaginous exostoses), benign chondroma, chondroblastoma, chondromyxofibroma, osteoid osteoma and giant cell tumors; nervous system, for example, neoplasms of the central nervous system (CNS), primary CNS lymphoma, skull cancer (osteoma, hemangioma, granuloma, xanthoma, osteitis deformans), meninges (meningioma, meningiosarcoma, gliomatosis), brain cancer (astrocytoma, medulloblastoma, glioma, ependymoma, germinoma [pinealoma], glioblastoma multiforme, oligodendroglioma, schwannoma, retinoblastoma, congenital tumors), spinal cord (neurofibroma, meningioma, glioma, sarcoma); reproductive system, for example, gynecological, uterus (endometrial carcinoma), cervix (cervical carcinoma, pre-tumor cervical dysplasia), ovaries (ovarian carcinoma [serous cystadenocarcinoma, mucinous cystadenocarcinoma, unclassified carcinoma], granulosa-thecal cell tumors, Sertoli-Leydig cell tumors, dysgerminoma, malignant teratoma), vulva (squamous cell carcinoma, intraepithelial carcinoma, adenocarcinoma, fibrosarcoma, melanoma), vagina (clear cell carcinoma, squamous cell carcinoma, botryoid sarcoma [embryonal rhabdomyosarcoma]), fallopian tubes (carcinoma) and other sites associated with female genital organs; placenta, penis, prostate, testis, and other sites associated with male genital organs; hematologic system, for example, blood (myeloid leukemia [acute and chronic], acute lymphoblastic leukemia, chronic lymphocytic leukemia, myeloproliferative diseases, multiple myeloma, myelodysplastic syndrome), Hodgkin's disease, non-Hodgkin's lymphoma [malignant lymphoma]; oral cavity, for example, lip, tongue, gum, floor of mouth, palate, and other parts of mouth, parotid gland, and other parts of the salivary glands, tonsil, oropharynx, nasopharynx, pyriform sinus, hypopharynx, and other sites in the lip, oral cavity and pharynx; skin, for example, malignant melanoma, cutaneous melanoma, basal cell carcinoma, squamous cell carcinoma, Kaposi's sarcoma, moles, dysplastic nevi, lipoma, angioma, dermatofibroma, and keloids; and other tissues comprising connective and soft tissue, retroperitoneum and peritoneum, eye, intraocular melanoma, and adnexa, breast, head and/or neck, anal region, thyroid, parathyroid, adrenal gland and other endocrine glands and related structures, secondary and unspecified malignant neoplasm of lymph nodes, secondary malignant neoplasm of respiratory and digestive systems and secondary malignant neoplasm of other sites.
Further, the cancer may be a primary cancer or a metastatic cancer. Moreover, the sample may be a cancer cell isolated from a tumor, a peripheral blood sample or a liquid biopsy. In a further aspect, the clustered mutation or lack of the clustered mutation in the BRAF gene is linked to a specific cancer type, whether primary or metastatic, see, for example
The aggressive therapy can be selected from adoptive cell therapy, immune checkpoint blockades including PD1, PD-L1, and CTLA4, pretargeted radioimmunotherapy, oncolytic viral therapy, or cancer vaccines. It also can include TK inhibitors or combination chemotherapy (i.e., two or more agents administered in combination). The particular therapy will depend on the patient, the cancer and the cluster status of the subject.
In a yet further aspect, the aggressive therapy comprises one or more selected from monoclonal antibodies, optionally selected from monospecific antibodies, bispecific antibodies, multispecific antibodies and a bispecific immune cell engager, antibody-drug conjugates, CAR therapies optionally selected from a CAR NK therapy, a CAR T therapy, a CAR cytotoxic T therapy, and a CAR gamma-delta T therapy, cell therapies, inhibitors or antagonists of an inhibitory immune checkpoint, activators or agonists of a stimulatory immune checkpoint optionally selected from an activating ligand, immune regulators, cancer vaccines, and a vector delivering each thereof to a subject optionally in an oncolytic virus therapy.
In another aspect, the aggressive therapy comprises a checkpoint inhibitor. Non-limiting examples of such include GS4224, AMP-224, CA-327, CA-170, BMS-1001, BMS-1166, peptide-57, M7824, MGD013, CX-072, UNP-12, NP-12, or a combination of two or more thereof.
Additional checkpoint inhibitors comprise one or more selected from an anti-PD-1 agent, an anti-PD-L1 agent, an anti-CTLA-4 agent, an anti-LAG-3 agent, an anti-TIM-3 agent, an anti-TIGIT agent, an anti-VISTA agent, an anti-B7-H3 agent, an anti-BTLA agent, an anti-ICOS agent, an anti-GITR agent, an anti-4-1BB agent, an anti-OX40 agent, an anti-CD27 agent, an anti-CD28 agent, an anti-CD40 agent, and an anti-Siglec-15 agent. In a further aspect, the checkpoint inhibitor comprises an anti-PD-1 agent or an anti-PD-L1 agent. In one aspect, the anti-PD-1 agent comprises an anti-PD-1 antibody or an antigen binding fragment thereof. In a further aspect, the anti-PD-1 antibody comprises nivolumab, pembrolizumab, cemiplimab, spartalizumab, camrelizumab, sintilimab, tislelizumab, toripalimab, AMF 514, or a combination of two or more thereof. In another aspect, the anti-PD-L1 agent comprises an anti-PD-L1 antibody or an antigen binding fragment thereof. In a further aspect, the anti-PD-L1 antibody comprises avelumab, durvalumab, atezolizumab, envafolimab, or a combination of two or more thereof.
In one aspect, the checkpoint inhibitor comprises an anti-CTLA-4 agent. In another aspect, the anti-CTLA-4 agent comprises an anti-CTLA-4 antibody or an antigen binding fragment thereof. In a yet further aspect, the anti-CTLA-4 antibody comprises ipilimumab, tremelimumab, zalifrelimab, or AGEN1181, or a combination thereof.
In one aspect, the therapy further comprises surgical resection of the cancer, tumor or cancer cells. The therapy can be a first-line, second-line, third-line, fourth-line, or fifth-line therapy.
In yet another aspect, a method for selecting a cancer patient for an aggressive therapy is disclosed. The method comprises, consists of, or consists essentially of assaying for and/or detecting at least one clustered mutation in one or more, or alternatively two or more of, or alternatively three or more of, or alternatively four or more of, or alternatively five or more of, or alternatively six or more of, or alternatively all seven of TP53, EGFR, KIT, KMT2C, ELF3, APC and ARID1A gene(s), or the lack of a clustered mutation in a BRAF gene, in a sample isolated from the cancer patient, wherein the cancer patient is selected for the therapy if the clustered mutation is detected in the sample or if a clustered mutation in the BRAF gene is not detected. Non-limiting examples of aggressive therapies are disclosed herein and incorporated herein by reference.
For example, the aggressive therapy can be selected from adoptive cell therapy, immune checkpoint blockades including PD1, PD-L1, and CTLA4, pretargeted radioimmunotherapy, oncolytic viral therapy, or cancer vaccines. It also can include TK inhibitors or combination chemotherapy (i.e., two or more agents administered in combination). The particular therapy will depend on the patient, the cancer and the cluster status of the subject.
In a yet further aspect, the aggressive therapy comprises one or more selected from monoclonal antibodies, optionally selected from monospecific antibodies, bispecific antibodies, multispecific antibodies and a bispecific immune cell engager, antibody-drug conjugates, CAR therapies optionally selected from a CAR NK therapy, a CAR T therapy, a CAR cytotoxic T therapy, and a CAR gamma-delta T therapy, cell therapies, inhibitors or antagonists of an inhibitory immune checkpoint, activators or agonists of a stimulatory immune checkpoint optionally selected from an activating ligand, immune regulators, cancer vaccines, and a vector delivering each thereof to a subject optionally in an oncolytic virus therapy.
In another aspect, the aggressive therapy comprises a checkpoint inhibitor. Non-limiting examples of such include GS4224, AMP-224, CA-327, CA-170, BMS-1001, BMS-1166, peptide-57, M7824, MGD013, CX-072, UNP-12, NP-12, or a combination of two or more thereof.
Additional checkpoint inhibitors comprise one or more selected from an anti-PD-1 agent, an anti-PD-L1 agent, an anti-CTLA-4 agent, an anti-LAG-3 agent, an anti-TIM-3 agent, an anti-TIGIT agent, an anti-VISTA agent, an anti-B7-H3 agent, an anti-BTLA agent, an anti-ICOS agent, an anti-GITR agent, an anti-4-1BB agent, an anti-OX40 agent, an anti-CD27 agent, an anti-CD28 agent, an anti-CD40 agent, and an anti-Siglec-15 agent. In a further aspect, the checkpoint inhibitor comprises an anti-PD-1 agent or an anti-PD-L1 agent. In one aspect, the anti-PD-1 agent comprises an anti-PD-1 antibody or an antigen binding fragment thereof. In a further aspect, the anti-PD-1 antibody comprises nivolumab, pembrolizumab, cemiplimab, spartalizumab, camrelizumab, sintilimab, tislelizumab, toripalimab, AMF 514, or a combination of two or more thereof. In another aspect, the anti-PD-L1 agent comprises an anti-PD-L1 antibody or an antigen binding fragment thereof. In a further aspect, the anti-PD-L1 antibody comprises avelumab, durvalumab, atezolizumab, envafolimab, or a combination of two or more thereof. In one aspect, the checkpoint inhibitor comprises an anti-CTLA-4 agent. In another aspect, the anti-CTLA-4 agent comprises an anti-CTLA-4 antibody or an antigen binding fragment thereof. In a yet further aspect, the anti-CTLA-4 antibody comprises ipilimumab, tremelimumab, zalifrelimab, or AGEN1181, or a combination thereof.
In yet another aspect, a method for identifying whether a cancer patient is likely to experience a relatively longer or shorter overall survival is disclosed. The method comprises, consists of, or consists essentially of assaying for and/or detecting a clustered mutation in at least one or more, or alternatively two or more of, or alternatively three or more of, or alternatively four or more of, or alternatively five or more of, or alternatively six or more of, or alternatively all seven of TP53, EGFR, KIT, KMT2C, ELF3, APC and ARID1A gene(s), or a clustered mutation or lack thereof in a BRAF gene, in a sample isolated from the patient, wherein the patient is likely to experience longer overall survival if the clustered mutation is detected in BRAF and the patient is likely to experience shorter overall survival if the clustered mutation is detected in at least one or more, or alternatively two or more of, or alternatively three or more of, or alternatively four or more of, or alternatively five or more of, or alternatively six or more of, or alternatively all seven of TP53, EGFR, KIT, KMT2C, ELF3, APC and ARID1A gene(s).
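For the purpose of illustration only, the survival rule stated above can be sketched in Python. This is a minimal sketch, not a clinical tool: the function name `survival_outlook`, the output labels, and the handling of a sample with clustered mutations in both BRAF and one of the seven genes (reported here as "indeterminate") are assumptions that the disclosure does not specify.

```python
# Hypothetical sketch of the stated rule; names and labels are illustrative.
SHORTER_SURVIVAL_GENES = frozenset(
    {"TP53", "EGFR", "KIT", "KMT2C", "ELF3", "APC", "ARID1A"}
)
LONGER_SURVIVAL_GENE = "BRAF"


def survival_outlook(clustered_genes):
    """Map the genes carrying a clustered mutation to an outlook label."""
    genes = {gene.upper() for gene in clustered_genes}
    shorter = bool(genes & SHORTER_SURVIVAL_GENES)
    longer = LONGER_SURVIVAL_GENE in genes
    if shorter and longer:
        # Assumption: the text does not rank the two signals against each other.
        return "indeterminate"
    if shorter:
        return "shorter"
    if longer:
        return "longer"
    # No clustered mutation in any of the eight genes: the rule makes no call.
    return "no call"


print(survival_outlook({"TP53", "APC"}))  # shorter
print(survival_outlook({"BRAF"}))         # longer
```

The input is assumed to be the set of gene symbols in which a clustered mutation was detected in the patient sample by any of the assays described herein.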
The cancer can be primary or metastatic. In some aspects, the clustered mutation is specifically linked to a primary or metastatic cancer, see, e.g.,
Any suitable method for identifying the genotype in the patient sample can be used and the disclosures described herein are not to be limited to these methods. For the purpose of illustration only, the genotype is determined by a method comprising, or alternatively consisting essentially of, or yet further consisting of, sequencing, hybridization, nucleic acid amplification, including polymerase chain reaction (PCR), real-time PCR, reverse transcriptase PCR (RT-PCR), nested PCR, ligase chain reaction, or PCR-RFLP, or microarray. These methods as well as equivalents or alternatives thereto are described herein.
Information obtained using the diagnostic assays described herein is useful for determining if a subject is likely, more likely, or less likely to respond to cancer treatment of a given type. Based on the prognostic information, a doctor can recommend a therapeutic protocol useful for reducing the malignant mass or tumor in the patient or treating cancer in the individual.
In addition, knowledge of the identity of a particular allele in an individual (the gene profile) allows customization of therapy for a particular disease to the individual's genetic profile, the goal of “pharmacogenomics”. For example, an individual's genetic profile can enable a doctor: 1) to more effectively prescribe a drug that will address the molecular basis of the disease or condition; 2) to better determine the appropriate dosage of a particular drug and 3) to identify novel targets for drug development. The identity of the genotype or expression patterns of individual patients can then be compared to the genotype or expression profile of the disease to determine the appropriate drug and dose to administer to the patient.
The ability to target populations expected to show the highest clinical benefit, based on the normal or disease genetic profile, can enable: 1) the repositioning of marketed drugs with disappointing market results; 2) the rescue of drug candidates whose clinical development has been discontinued as a result of safety or efficacy limitations, which are patient subgroup-specific; and 3) an accelerated and less costly development for drug candidates and more optimal drug labeling.
The methods and compositions disclosed herein can be used to detect nucleic acids associated with the genetic polymorphisms identified herein using a biological sample obtained from a patient. Biological samples can be obtained by standard procedures and can be used immediately or stored, under conditions appropriate for the type of biological sample, for later use. Any liquid or solid biological material obtained from the patient and believed to contain nucleic acids comprising the polymorphic region can be a suitable sample. The sample can be a tumor sample, a peripheral blood sample or a liquid biopsy.
Methods of obtaining test samples are known to those of skill in the art and include, but are not limited to, aspirations, tissue sections, swabs, drawing of blood or other fluids, surgical or needle biopsies.
In some aspects, the biological sample is a tissue or a cell sample. Suitable patient samples in the methods include, but are not limited to, blood, plasma, serum, a biopsy tissue, a fine needle biopsy sample, amniotic fluid, pleural fluid, saliva, semen, tissue or tissue homogenates, frozen or paraffin sections of tissue, or combinations thereof. In some aspects, the biological sample comprises, or alternatively consists essentially of, or yet further consists of, at least one of a tumor cell, a normal cell adjacent to a tumor, a normal cell corresponding to the tumor tissue type, a blood cell, a peripheral blood lymphocyte, or combinations thereof. In some aspects, the biological sample is an original sample recently isolated from the patient, a fixed tissue, a frozen tissue, a resection tissue, or a microdissected tissue. In some aspects, the biological samples are processed, such as by sectioning of tissues, fractionation, purification, nucleic acid isolation, or cellular organelle separation.
In some embodiments, nucleic acid (DNA or RNA) is isolated from the sample according to any methods known to those of skill in the art. In some aspects, genomic DNA is isolated from the biological sample. In some aspects, RNA is isolated from the biological sample. In some aspects, cDNA is generated from mRNA in the sample. In some embodiments, the nucleic acid is not isolated from the biological sample (e.g., the polymorphism is detected directly from the biological sample).
Methods to detect or assay for clustered mutations are known in the art and described herein. In some aspects, detection of clustered mutations or polymorphisms can be accomplished by molecular cloning of the specified allele and subsequent sequencing of that allele using techniques known in the art, in some aspects after isolation of a suitable nucleic acid sample. In some aspects, the gene sequences can be amplified directly from a genomic DNA preparation from the biological sample using PCR, and the sequence composition is determined by sequencing the amplified product (i.e., amplicon). Alternatively, the PCR product can be analyzed following digestion with a restriction enzyme, a method known as PCR-RFLP.
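For the purpose of illustration only, the PCR-RFLP readout mentioned above can be sketched as an in-silico digest: an amplicon is cut at each occurrence of a restriction site, and a variant that destroys the site changes the fragment pattern. The helper `digest` and the toy amplicon sequences are hypothetical; EcoRI (recognition site GAATTC, cutting after the first G) is used only as a familiar example enzyme.

```python
def digest(amplicon, site, cut_offset):
    """Return fragment lengths after cutting at every occurrence of `site`.

    `cut_offset` is the number of bases from the start of the site to the cut.
    """
    fragments = []
    start = 0
    pos = amplicon.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(cut - start)
        start = cut
        pos = amplicon.find(site, pos + 1)
    fragments.append(len(amplicon) - start)
    return fragments


ECORI_SITE, ECORI_CUT = "GAATTC", 1  # EcoRI cuts G^AATTC

# Toy amplicons (hypothetical): the variant changes GAATTC to GAGTTC.
reference = "ATGCCGTA" + "GAATTC" + "GGCTAACGT"
variant = "ATGCCGTA" + "GAGTTC" + "GGCTAACGT"

print(digest(reference, ECORI_SITE, ECORI_CUT))  # [9, 14]
print(digest(variant, ECORI_SITE, ECORI_CUT))    # [23]
```

In this sketch the reference allele yields two fragments and the variant allele remains uncut, which is the kind of difference that would be resolved by gel electrophoresis.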
In some embodiments, the clustered mutation or polymorphism is detected by allele-specific hybridization using probes overlapping the polymorphic site. In some aspects, the nucleic acid probes are between 5 and 40 nucleotides in length. In some aspects, the nucleic acid probes are about 5, about 10, about 15, about 20, about 25, about 30, about 35, or about 40 or more nucleotides flanking the polymorphic site.
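For the purpose of illustration only, the construction of a probe overlapping a polymorphic site can be sketched as follows. The function `allele_specific_probe` is hypothetical; it assumes a single-base polymorphism, a 0-based site index, and a probe roughly centered on the site, none of which is required by the disclosure. The 5 to 40 nucleotide bound reflects the range given above.

```python
def allele_specific_probe(sequence, site, allele, length=20):
    """Return a probe of `length` nt overlapping `site` and carrying `allele`.

    Assumes a single-base polymorphism at 0-based index `site`.
    """
    if not 5 <= length <= 40:
        raise ValueError("probe length must be between 5 and 40 nucleotides")
    if not 0 <= site < len(sequence):
        raise ValueError("site is outside the sequence")
    start = site - (length - 1) // 2  # place the site near the probe center
    end = start + length
    if start < 0 or end > len(sequence):
        raise ValueError("not enough flanking sequence for this probe length")
    window = sequence[start:end]
    offset = site - start
    # Substitute the allele of interest at the polymorphic position.
    return window[:offset] + allele + window[offset + 1:]


reference = "ACGTACGTACGTACGTACGTACGTA"  # toy sequence (hypothetical)
probe = allele_specific_probe(reference, site=12, allele="T", length=11)
print(probe)  # TACGTTCGTAC
```

A probe carrying the reference allele can be generated the same way by passing the reference base, giving a matched pair for allele-specific hybridization.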
In another embodiment of the disclosure, several nucleic acid probes capable of hybridizing specifically to the nucleic acid containing the allelic variant are attached to a solid phase support, e.g., a “chip” or “microarray”. Such gene chips or microarrays can be used to detect genetic variations by a number of techniques known to one of skill in the art. In one technique, oligonucleotides are arrayed on a gene chip for determining the DNA sequence by the sequencing by hybridization approach. The probes of the disclosure also can be used for fluorescent detection of a genetic sequence. A probe also can be affixed to an electrode surface for the electrochemical detection of nucleic acid sequences.
In one aspect, “gene chips” or “microarrays” containing probes or primers for the gene of interest are provided alone or in combination with other probes and/or primers. A suitable sample is obtained from the patient, and genomic DNA, RNA, or any combination thereof is extracted and amplified if necessary. The DNA or RNA sample is contacted with the gene chip or microarray panel under conditions suitable for hybridization of the gene(s) of interest to the probe(s) or primer(s) contained on the gene chip or microarray. The probes or primers can be detectably labeled, thereby identifying the polymorphism in the gene(s) of interest. Alternatively, a chemical or biological reaction can be used to identify the probes or primers which hybridized with the DNA or RNA of the gene(s) of interest. The genetic profile of the patient is then determined with the aid of the aforementioned apparatus and methods.
In some aspects, whole genome sequencing, in particular with the “next generation sequencing” techniques, which employ massively parallel sequencing of DNA templates, can be used to obtain genotypes of relevant polymorphisms. Exemplary NGS sequencing platforms for the generation of nucleic acid sequence data include, but are not limited to, Illumina's sequencing by synthesis technology (e.g., Illumina MiSeq or HiSeq System), Life Technologies' Ion Torrent semiconductor sequencing technology (e.g., Ion Torrent PGM or Proton system), the Roche (454 Life Sciences) GS series and Qiagen (Intelligent BioSystems) Gene Reader sequencing platforms.
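For the purpose of illustration only, once variant positions have been obtained by sequencing, candidate clustered mutations can be flagged by the distance between adjacent mutations. The following is a minimal sketch under stated assumptions: the function `flag_clustered`, the fixed 1,000 bp gap, and the minimum cluster size of two are hypothetical simplifications chosen for the example and are not values specified by this passage.

```python
def flag_clustered(positions, max_gap=1000, min_size=2):
    """Group positions on one chromosome into runs of nearby mutations.

    Adjacent mutations no more than `max_gap` bp apart are placed in the same
    run; runs with fewer than `min_size` mutations are discarded.
    """
    clusters = []
    run = []
    for pos in sorted(positions):
        if run and pos - run[-1] > max_gap:
            if len(run) >= min_size:
                clusters.append(run)
            run = []
        run.append(pos)
    if len(run) >= min_size:
        clusters.append(run)
    return clusters


# Toy positions (hypothetical) for somatic mutations on a single chromosome.
positions = [100, 350, 900, 50000, 120000, 120400]
print(flag_clustered(positions))  # [[100, 350, 900], [120000, 120400]]
```

A gene could then be treated as carrying a clustered mutation when one of the returned runs overlaps its coordinates; that mapping step is omitted from the sketch.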
In some aspects, nucleic acid comprising, or alternatively consisting essentially of, or yet further consisting of the polymorphism is amplified to produce an amplicon containing the polymorphism. Nucleic acids can be amplified by various methods known to the skilled artisan. Nucleic acid amplification can be linear or exponential. Amplification is generally carried out using polymerase chain reaction (PCR) technologies. Alternative or modified PCR amplification methods can also be used and include, for example, isothermal amplification methods, rolling circle methods, Hot-start PCR, real-time PCR, Allele-specific PCR, Assembly PCR or Polymerase Cycling Assembly (PCA), Asymmetric PCR, Colony PCR, Emulsion PCR, Fast PCR, nucleic acid ligation, Gap Ligation Chain Reaction (Gap LCR), Ligation-mediated PCR, Multiplex Ligation-dependent Probe Amplification (MLPA), Gap Extension Ligation PCR (GEXL-PCR), quantitative PCR (Q-PCR), Quantitative real-time PCR (QRT-PCR), multiplex PCR, Helicase-dependent amplification, Intersequence-specific (ISSR) PCR, Inverse PCR, Linear-After-The-Exponential-PCR (LATE-PCR), Methylation-specific PCR (MSP), Nested PCR, Overlap-extension PCR, PAN-AC assay, Reverse Transcription PCR (RT-PCR), Rapid Amplification of cDNA Ends (RACE PCR), Single molecule amplification PCR (SMA PCR), Thermal asymmetric interlaced PCR (TAIL-PCR), Touchdown PCR, long PCR, nucleic acid sequencing (including DNA sequencing and RNA sequencing), transcription, reverse transcription, duplication, DNA or RNA ligation, and other nucleic acid extension reactions known in the art. The skilled artisan will understand that other methods can be used either in place of, or together with, PCR methods, including enzymatic replication reactions developed in the future.
See, e.g., Saiki, “Amplification of Genomic DNA” in PCR Protocols, Innis et al., eds., Academic Press, San Diego, Calif., 13-20 (1990); Wharam, et al., 29 (11) Nucleic Acids Res, E54-E54 (2001); Hafner, et al., 30 (4) Biotechniques, 852-6, 858, 860 passim (2001).
In some aspects, nucleic acid comprising, or alternatively consisting essentially of, or yet further consisting of the polymorphism of interest is amplified to produce an amplicon. In some aspects, a nucleic acid containing the region of interest is amplified using a forward primer and a reverse primer that flank the region of interest. In some aspects, the amplicon containing the region of interest (e.g., an amplicon having the polymorphic sequence) is detected by hybridizing a nucleic acid probe containing the polymorphism or a complement thereof to the corresponding complementary strand of the amplicon and detecting the hybrid formed between the nucleic acid probe and the complementary strand of the amplicon. In some aspects, the amplicon containing the region of interest is sequenced (e.g., by dideoxy chain termination methods (Sanger method and variants thereof), Maxam & Gilbert sequencing, pyrosequencing, exonuclease digestion and next-generation sequencing methods).
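For the purpose of illustration only, the relationship between a forward primer, a reverse primer and the amplicon they define can be sketched as follows. The function `flanking_primers`, its parameters, and the toy template are hypothetical; the sketch only takes fixed-length windows on either side of the region and does not evaluate melting temperature, specificity or secondary structure, which practical primer design would consider.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def flanking_primers(template, region_start, region_end, primer_len=20, flank=2):
    """Pick primers flanking template[region_start:region_end] (0-based).

    `template` is the top strand written 5'->3'. Returns the forward primer,
    the reverse primer (both 5'->3') and the resulting amplicon.
    """
    fwd_start = region_start - flank - primer_len
    rev_end = region_end + flank + primer_len
    if fwd_start < 0 or rev_end > len(template):
        raise ValueError("not enough flanking sequence for the primers")
    forward = template[fwd_start:fwd_start + primer_len]
    # The reverse primer anneals to the top strand, so it is the reverse
    # complement of the top-strand sequence downstream of the region.
    reverse = reverse_complement(template[rev_end - primer_len:rev_end])
    amplicon = template[fwd_start:rev_end]
    return forward, reverse, amplicon


# Toy template (hypothetical) with the region of interest at indices 14-17.
template = "AAACCCGGGTTT" + "ACGTACGT" + "AAACCCGGGTTT"
forward, reverse, amplicon = flanking_primers(template, 14, 18, primer_len=6)
print(forward, reverse, len(amplicon))  # GGGTTT GGGTTT 20
```

The short primer length is used only to keep the toy example readable; the default of 20 nucleotides is likewise an illustrative choice.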
In some embodiments, the amplification includes a labeled primer or probe, thereby allowing detection of the amplification products corresponding to that primer or probe. In particular embodiments, the amplification can include a multiplicity of labeled primers or probes; such primers can be distinguishably labeled, allowing the simultaneous detection of multiple amplification products.
In some embodiments, the amplification products are detected by any of a number of methods such as gel electrophoresis, column chromatography, hybridization with a nucleic acid probe, or sequencing the amplicon.
Detectable labels can be used to identify the primer or probe hybridized to a genomic nucleic acid or amplicon. Detectable labels include but are not limited to fluorophores, isotopes (e.g., 32P, 33P, 35S, 3H, 14C, 125I, 131I), electron-dense reagents (e.g., gold, silver), nanoparticles, enzymes commonly used in an ELISA (e.g., horseradish peroxidase, beta-galactosidase, luciferase, alkaline phosphatase), chemiluminescent compounds, colorimetric labels (e.g., colloidal gold), magnetic labels (e.g., Dynabeads®), biotin, digoxigenin, haptens, proteins for which antisera or monoclonal antibodies are available, ligands, hormones, and oligonucleotides capable of forming a complex with the corresponding oligonucleotide complement.
In one embodiment, a primer or probe is labeled with a fluorophore that emits a detectable signal. The term “fluorophore” as used herein refers to a molecule that absorbs light at a particular wavelength (excitation frequency) and subsequently emits light of a longer wavelength (emission frequency). While a suitable reporter dye is a fluorescent dye, any reporter dye that can be attached to a detection reagent such as an oligonucleotide probe or primer is suitable for use in the methods described. Suitable fluorescent moieties include, but are not limited to, the following fluorophores working individually or in combination: 4-acetamido-4′-isothiocyanatostilbene-2,2′-disulfonic acid; acridine and derivatives, e.g., acridine, acridine isothiocyanate; Alexa Fluors: Alexa Fluor® 350, Alexa Fluor® 488, Alexa Fluor® 546, Alexa Fluor® 555, Alexa Fluor® 568, Alexa Fluor® 594, Alexa Fluor® 647 (Molecular Probes); 5-(2′-aminoethyl)aminonaphthalene-1-sulfonic acid (EDANS); 4-amino-N-[3-(vinylsulfonyl)phenyl]naphthalimide-3,5 disulfonate (Lucifer Yellow VS); N-(4-anilino-1-naphthyl)maleimide; anthranilamide; Black Hole Quencher™ (BHQ™) dyes (Biosearch Technologies); BODIPY dyes: BODIPY® R-6G, BODIPY® 530/550, BODIPY® FL; Brilliant Yellow; coumarin and derivatives: coumarin, 7-amino-4-methylcoumarin (AMC, Coumarin 120), 7-amino-4-trifluoromethylcoumarin (Coumarin 151); Cy2®, Cy3®, Cy3.5®, Cy5®, Cy5.5®; cyanosine; 4′,6-diamidino-2-phenylindole (DAPI); 5′,5″-dibromopyrogallol-sulfonephthalein (Bromopyrogallol Red); 7-diethylamino-3-(4′-isothiocyanatophenyl)-4-methylcoumarin; diethylenetriamine pentaacetate; 4,4′-diisothiocyanatodihydro-stilbene-2,2′-disulfonic acid; 4,4′-diisothiocyanatostilbene-2,2′-disulfonic acid; 5-[dimethylamino]naphthalene-1-sulfonyl chloride (DNS, dansyl chloride); 4-(4′-dimethylaminophenylazo)benzoic acid (DABCYL); 4-dimethylaminophenylazophenyl-4′-isothiocyanate (DABITC); Eclipse™ (Epoch Biosciences Inc.); eosin and derivatives: eosin, eosin isothiocyanate; erythrosin and derivatives: erythrosin B, erythrosin isothiocyanate; ethidium; fluorescein and derivatives: 5-carboxyfluorescein (FAM), 5-(4,6-dichlorotriazin-2-yl)aminofluorescein (DTAF), 2′,7′-dimethoxy-4′,5′-dichloro-6-carboxyfluorescein (JOE), fluorescein, fluorescein isothiocyanate (FITC), hexachloro-6-carboxyfluorescein (HEX), QFITC (XRITC), tetrachlorofluorescein (TET); fluorescamine; IR144; IR1446; lanthanide phosphors; Malachite Green isothiocyanate; 4-methylumbelliferone; ortho cresolphthalein; nitrotyrosine; pararosaniline; Phenol Red; B-phycoerythrin, R-phycoerythrin; allophycocyanin; o-phthaldialdehyde; Oregon Green®; propidium iodide; pyrene and derivatives: pyrene, pyrene butyrate, succinimidyl 1-pyrene butyrate; QSY® 7; QSY® 9; QSY® 21; QSY® 35 (Molecular Probes); Reactive Red 4 (Cibacron® Brilliant Red 3B-A); rhodamine and derivatives: 6-carboxy-X-rhodamine (ROX), 6-carboxyrhodamine (R6G), lissamine rhodamine B sulfonyl chloride, rhodamine (Rhod), rhodamine B, rhodamine 123, rhodamine green, rhodamine X isothiocyanate, riboflavin, rosolic acid, sulforhodamine B, sulforhodamine 101, sulfonyl chloride derivative of sulforhodamine 101 (Texas Red); terbium chelate derivatives; N,N,N′,N′-tetramethyl-6-carboxyrhodamine (TAMRA); tetramethyl rhodamine; and tetramethyl rhodamine isothiocyanate (TRITC).
In some aspects, the primer or probe is further labeled with a quencher dye such as TAMRA, DABCYL, or Black Hole Quencher® (BHQ), especially when the reagent is used as a self-quenching probe such as a TaqMan® (U.S. Pat. Nos. 5,210,015 and 5,538,848) or Molecular Beacon probe (U.S. Pat. Nos. 5,118,801 and 5,312,728), or other stemless or linear beacon probe (Livak et al., 1995, PCR Method Appl., 4:357-362; Tyagi et al., 1996, Nature Biotechnology, 14:303-308; Nazarenko et al., 1997, Nucl. Acids Res., 25:2516-2521; U.S. Pat. Nos. 5,866,336 and 6,117,635).
In some aspects, methods for real-time PCR use fluorescent primers/probes, such as the TaqMan® primers/probes (Heid, et al., Genome Res 6:986-994, 1996), molecular beacons, and Scorpion™ primers/probes. Real-time PCR quantifies the initial amount of the template with more specificity, sensitivity, and reproducibility than other forms of quantitative PCR, which detect the amount of final amplified product. Real-time PCR does not detect the size of the amplicon. The probes employed in Scorpion™ and TaqMan® technologies are based on the principle of fluorescence quenching and involve a donor fluorophore and a quenching moiety. The term “donor fluorophore” as used herein means a fluorophore that, when in close proximity to a quencher moiety, donates or transfers emission energy to the quencher. As a result of donating energy to the quencher moiety, the donor fluorophore will itself emit less light at a particular emission frequency than it would have in the absence of a closely positioned quencher moiety. The term “quencher moiety” as used herein means a molecule that, in close proximity to a donor fluorophore, takes up emission energy generated by the donor and either dissipates the energy as heat or emits light of a longer wavelength than the emission wavelength of the donor. In the latter case, the quencher is considered to be an acceptor fluorophore. The quenching moiety can act via proximal (i.e., collisional) quenching or by Förster or fluorescence resonance energy transfer (“FRET”). Quenching by FRET is generally used in TaqMan® primers/probes while proximal quenching is used in molecular beacon and Scorpion™ type primers/probes.
The detectable label can be incorporated into, associated with or conjugated to a nucleic acid primer or probe. Labels can be attached by spacer arms of various lengths to reduce potential steric hindrance or impact on other useful or desired properties. See, e.g., Mansfield, Mol. Cell. Probes (1995), 9:145-156.
Detectable labels can be incorporated into nucleic acid probes by covalent or non-covalent means, e.g., by transcription, such as by random-primer labeling using Klenow polymerase, or by nick translation, amplification, or equivalent methods known in the art. For example, a nucleotide base is conjugated to a detectable moiety, such as a fluorescent dye, e.g., Cy3™ or Cy5™, and then incorporated into nucleic acid probes during nucleic acid synthesis or amplification. Nucleic acid probes can thereby be labeled when synthesized using Cy3™- or Cy5™-dCTP conjugates mixed with unlabeled dCTP.
Nucleic acid probes can be labeled by using PCR or nick translation in the presence of labeled precursor nucleotides, for example, modified nucleotides synthesized by coupling allylamine-dUTP to the succinimidyl-ester derivatives of the fluorescent dyes or haptens (such as biotin or digoxigenin) can be used; this method allows custom preparation of most common fluorescent nucleotides, see, e.g., Henegariu et al., Nat. Biotechnol. (2000), 18:345-348.
Nucleic acid probes can be labeled by non-covalent means known in the art. For example, Kreatech Biotechnology's Universal Linkage System® (ULS®) provides a non-enzymatic labeling technology, wherein a platinum group forms a co-ordinative bond with DNA, RNA or nucleotides by binding to the N7 position of guanosine. This technology can also be used to label proteins by binding to nitrogen and sulfur containing side chains of amino acids. See, e.g., U.S. Pat. Nos. 5,580,990; 5,714,327; and 5,985,566; and European Patent No. 0539466.
Labeling with a detectable label also can include a nucleic acid attached to another biological molecule, such as a nucleic acid, e.g., an oligonucleotide, or a nucleic acid in the form of a stem-loop structure as a “molecular beacon” or an “aptamer beacon”. Molecular beacons as detectable moieties are described; for example, Sokol (Proc. Natl. Acad. Sci. USA (1998), 95:11538-11543) synthesized “molecular beacon” reporter oligodeoxynucleotides with matched fluorescent donor and acceptor chromophores on their 5′ and 3′ ends. In the absence of a complementary nucleic acid strand, the molecular beacon remains in a stem-loop conformation where fluorescence resonance energy transfer prevents signal emission. On hybridization with a complementary sequence, the stem-loop structure opens, increasing the physical distance between the donor and acceptor moieties, thereby reducing fluorescence resonance energy transfer and allowing a detectable signal to be emitted when the beacon is excited by light of the appropriate wavelength. See also, e.g., Antony (Biochemistry (2001), 40:9387-9395), describing a molecular beacon consisting of a G-rich 18-mer triplex-forming oligodeoxyribonucleotide. See also U.S. Pat. Nos. 6,277,581 and 6,235,504.
Aptamer beacons are similar to molecular beacons; see, e.g., Hamaguchi, Anal. Biochem. (2001), 294:126-131; Poddar, Mol. Cell. Probes (2001), 15:161-167; Kaboev, Nucleic Acids Res. (2000), 28: E94. Aptamer beacons can adopt two or more conformations, one of which allows ligand binding. A fluorescence-quenching pair is used to report changes in conformation induced by ligand binding. See also, e.g., Yamamoto et al., Genes Cells (2000), 5:389-396; Smirnov et al., Biochemistry (2000), 39:1462-1468.
The nucleic acid primer or probe can be indirectly detectably labeled via a peptide. A peptide can be made detectable by incorporating predetermined polypeptide epitopes recognized by a secondary reporter (e.g., leucine zipper pair sequences, binding sites for secondary antibodies, transcriptional activator polypeptide, metal binding domains, epitope tags). A label can also be attached via a second peptide that interacts with the first peptide (e.g., S—association).
As readily recognized by one of skill in the art, detection of the complex containing the nucleic acid from a sample hybridized to a labeled probe can be achieved through use of a labeled antibody against the label of the probe. In one example, the probe is labeled with digoxigenin and is detected with a fluorescently labeled anti-digoxigenin antibody. In another example, the probe is labeled with FITC and detected with a fluorescently labeled anti-FITC antibody. These antibodies are readily available commercially. In another example, the probe is labeled with FITC and detected with an anti-FITC primary antibody and a labeled anti-anti-FITC secondary antibody.
Nucleic acids can be amplified prior to detection or can be detected directly during an amplification step (i.e., “real-time” methods, such as in TaqMan® and Scorpion™ methods). In some embodiments, the target sequence is amplified using a labeled primer such that the resulting amplicon is detectably labeled. In some embodiments, the primer is fluorescently labeled. In some embodiments, the target sequence is amplified and the resulting amplicon is detected by electrophoresis.
With regard to the exemplary primers and probes, those skilled in the art will readily recognize that nucleic acid molecules can be double-stranded molecules and that reference to a particular site on one strand refers, as well, to the corresponding site on a complementary strand. In defining a variant position, allele, or nucleotide sequence, reference to an adenine, a thymine (uridine), a cytosine, or a guanine at a particular site on one strand of a nucleic acid molecule also defines the thymine (uridine), adenine, guanine, or cytosine (respectively) at the corresponding site on a complementary strand of the nucleic acid molecule. Thus, reference can be made to either strand in order to refer to a particular variant position, allele, or nucleotide sequence. Probes and primers can be designed to hybridize to either strand, and detection methods disclosed herein can generally target either strand.
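By way of non-limiting illustration, the strand equivalence described above can be expressed as a short Python sketch. The function names and the handling of uridine and of the ambiguity code N are assumptions made only for this example and are not part of the disclosed methods.

```python
from typing import Tuple

# Illustrative base pairing on DNA; U is treated like T and N is left
# ambiguous (both choices are assumptions for this sketch).
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C", "U": "A", "N": "N"}


def complement_base(base: str) -> str:
    """Return the base found at the corresponding site on the other strand."""
    return COMPLEMENT[base.upper()]


def reverse_complement(seq: str) -> str:
    """Return the complementary strand read in its own 5'-to-3' direction."""
    return "".join(complement_base(b) for b in reversed(seq))


def variant_on_opposite_strand(ref: str, alt: str) -> Tuple[str, str]:
    """Express a reference/alternate allele pair on the complementary strand."""
    return reverse_complement(ref), reverse_complement(alt)
```

For example, a C>T substitution reported on one strand corresponds to a G>A substitution on the complementary strand, so a probe can be designed against either representation.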
In some embodiments, the primers and probes comprise additional nucleotides corresponding to sequences of universal primers (e.g., T7, M13, SP6, T3) which add the additional sequence to the amplicon during amplification to permit further amplification and/or prime the amplicon for sequencing.
As noted above, the disclosure further provides methods of treating a patient selected by any method of the above embodiments, or identified by any of the above methods as likely to experience a more favorable clinical outcome following the therapy. In some embodiments, the methods entail administering such a therapy to the patient. The therapy can be any one of the group of: a first-line, a second-line, a third-line, a fourth-line, or a fifth-line therapy.
The agents or drugs can be administered as a composition. A “composition” typically intends a combination of the active agent and another carrier, e.g., compound or composition, inert (for example, a detectable agent or label) or active, such as an adjuvant, diluent, binder, stabilizer, buffers, salts, lipophilic solvents, preservative, or the like, and includes pharmaceutically acceptable carriers. Carriers also include pharmaceutical excipients and additives such as proteins, peptides, amino acids, lipids, and carbohydrates.
Various delivery systems are known and can be used to administer a chemotherapeutic agent of the disclosure, e.g., encapsulation in liposomes, microparticles, microcapsules, expression by recombinant cells, receptor-mediated endocytosis. See e.g., Wu and Wu (1987) J. Biol. Chem. 262:4429-4432 for construction of a therapeutic nucleic acid as part of a retroviral or other vector, etc. Methods of delivery include but are not limited to intra-arterial, intra-muscular, intravenous, intranasal and oral routes. In a specific embodiment, it can be desirable to administer the pharmaceutical compositions of the disclosure locally to the area in need of treatment; this can be achieved by, for example, and not by way of limitation, local infusion during surgery, by injection or by means of a catheter.
The agents identified herein as effective for their intended purpose can be administered to subjects or individuals identified by the methods herein as suitable for the therapy. Therapeutic amounts can be empirically determined and will vary with the pathology being treated, the subject being treated and the efficacy and toxicity of the agent.
Methods of administering pharmaceutical compositions are well known to those of ordinary skill in the art and include, but are not limited to, oral, microinjection, intravenous or parenteral administration. The compositions are intended for topical, oral, or local administration as well as intravenous, subcutaneous, or intramuscular administration. Administration can be effected continuously or intermittently throughout the course of the treatment. Methods of determining the most effective means and dosage of administration are well known to those of skill in the art and will vary with the cancer being treated and the subject being treated. Single or multiple administrations can be carried out with the dose level and pattern being selected by the treating physician.
Kits or panels for use in detecting the polymorphism of interest in patient biological samples are provided. In some embodiments, a kit comprises, or consists essentially of, or yet further consists of at least one reagent necessary to perform the assay. For example, the kit can comprise an enzyme, a buffer or any other necessary reagent (e.g., PCR reagents and buffers). For example, in some aspects, a kit contains, in an amount sufficient for at least one assay, any of the hybridization assay probes, amplification primers, and/or antibodies suitable for detection in a packaging material.
The various components of the kit can be provided in a variety of forms. For example, in some aspects, the required enzymes, the nucleotide triphosphates, the probes, primers, and/or antibodies are provided as a lyophilized reagent. These lyophilized reagents can be pre-mixed before lyophilization so that when reconstituted they form a complete mixture with the proper ratio of each of the components ready for use in the assay. In addition, the kits can contain a reconstitution reagent for reconstituting the lyophilized reagents of the kit. In exemplary kits for amplifying target nucleic acid derived from a colorectal cancer patient, the enzymes, nucleotide triphosphates and required cofactors for the enzymes are provided as a single lyophilized reagent that, when reconstituted, forms a proper reagent for use in the present amplification methods.
Typically, the kits will also include instructions recorded in a tangible form (e.g., contained on paper or an electronic medium) for using the packaged probes, primers, and/or antibodies in a detection assay for determining the presence or amount of the polymorphism of interest in a test sample.
In some aspects, the kits further comprise a solid support for anchoring the nucleic acid of interest on the solid support. The target nucleic acid can be anchored to the solid support directly or indirectly through a capture probe anchored to the solid support and capable of hybridizing to the nucleic acid of interest. Examples of such solid supports include but are not limited to beads, microparticles (for example, gold and other nanoparticles), microarrays, microwells, and multiwell plates. The solid surfaces can comprise a first member of a binding pair and the capture probe or the target nucleic acid can comprise a second member of the binding pair. Binding of the binding pair members will anchor the capture probe or the target nucleic acid to the solid surface. Examples of such binding pairs include but are not limited to biotin/streptavidin, hormone/receptor, ligand/receptor, and antigen/antibody.
In one aspect, the kit further comprises, or consists essentially of, or yet further consists of an effective amount of the therapy.
The kit can comprise at least one probe or primer which is capable of specifically hybridizing to the gene of interest and instructions for use. For example, in some aspects, the kits comprise at least one of the above described nucleic acids. Exemplary kits for amplifying at least a portion of the gene of interest comprise two primers. For example, in some embodiments, the kit comprises, or consists essentially of, or yet further consists of a forward primer and a reverse primer that flank the polymorphism.
In some embodiments, the kit further comprises, or consists essentially of, or yet further consists of a nucleic acid probe for the detection of the amplicon. In some embodiments, the nucleic acid probe has about 5, about 10, about 15, about 20, about 25, about 30, about 35, or about 40 or more contiguous nucleotides. In some aspects, the nucleic acid primers and/or probes are lyophilized.
Oligonucleotides, whether used as probes or primers, contained in a kit can be detectably labeled. Labels can be detected either directly, for example for fluorescent labels, or indirectly. Indirect detection can include any detection method known to one of skill in the art, including biotin-avidin interactions, antibody binding and the like. Fluorescently labeled oligonucleotides also can contain a quenching molecule. Oligonucleotides can be bound to a surface. In one embodiment, the surface is silica or glass. In another embodiment, the surface is a metal electrode.
The test samples used in the diagnostic kits include cells, protein or membrane extracts of cells, or biological fluids such as sputum, blood, serum, plasma, or urine. The test samples can also be a tumor cell, a normal cell adjacent to a tumor, a normal cell corresponding to the tumor tissue type, a blood cell, a peripheral blood lymphocyte, or combinations thereof. The test sample used in the above-described method will vary based on the assay format, nature of the detection method and the tissues, cells or extracts used as the sample to be assayed. Methods for preparing protein extracts or membrane extracts of cells are known in the art and can be readily adapted in order to obtain a sample which is compatible with the system utilized.
The kits can include all or some of the positive controls, negative controls, reagents, primers, sequencing markers, probes and antibodies described herein for determining the subject's genotype in the polymorphic region of the gene of interest or target region.
As amenable, these suggested kit components can be packaged in a manner customary for use by those of skill in the art. For example, these suggested kit components can be provided in solution or as a liquid dispersion or the like.
Typical packaging materials would include solid matrices such as glass, plastic, paper, foil, micro-particles and the like, capable of holding within fixed limits hybridization assay probes and/or amplification primers. Thus, for example, the packaging materials can include glass vials used to contain sub-milligram (e.g., picogram or nanogram) quantities of a contemplated probe, primer, or antibody, or they can be microtiter plate wells to which probes, primers, or antibodies have been operatively affixed, i.e., linked so as to be capable of participating in an amplification and/or detection method.
The instructions will typically indicate the reagents and/or concentrations of reagents and at least one assay method parameter which might be, for example, the relative amounts of reagents to use per amount of sample. In addition, such specifics as maintenance, time periods, temperature, and buffer conditions can also be included.
The diagnostic systems contemplate kits having any of the hybridization assay probes, amplification primers, or antibodies described herein, whether provided individually or in one of the combinations described above, for use in determining the presence or amount of a polymorphism of interest, or as identified herein.
The disclosure now being generally described, it will be more readily understood by reference to the following example, which is included merely for purposes of illustration of certain aspects and embodiments of the present disclosure and is not intended to limit the disclosure.
To identify clustered mutations, a sample dependent intra-mutational distance (IMD) cutoff was derived where mutations below the cutoff were unlikely to occur by chance (q-value <0.01). A statistical approach utilizing the IMD cutoff, variant allele frequencies (VAFs), and corrections for local sequence context was applied to each specimen (see
Examining 2,583 whole-genome sequenced cancers from the Pan-Cancer Analysis of Whole Genomes (PCAWG) project revealed a total of 1,686,013 clustered single-base substitutions and 21,368 clustered indels (
The overall survival was compared between patients with cancers harboring high and low numbers of clustered mutations within whole-genome sequenced PCAWG and whole-exome sequenced TCGA cancer types36. Better overall survival was observed only in whole-genome sequenced ovarian cancers containing high-levels of clustered substitutions or clustered indels (q-values <0.05;
Mutational signature analysis was performed for each category of clustered events elucidating 12 DBS, 5 MBS, 17 omikli, 9 kataegic, and 6 clustered indel signatures (
In cancer genomes, omikli were previously attributed to APOBEC3 mutagenesis6 with some indirect evidence from experimental models23,37,38. Applicant's analysis of sequencing data39 from the clonally expanded breast cancer cell line BT-474 with active APOBEC3 mutagenesis experimentally confirmed the existence of APOBEC3-associated omikli events (cosine similarity: 0.99;
From the 9 kataegic signatures, 4 have been reported previously including 2 associated with APOBEC3 deaminases (SBS2 and SBS13) and 2 associated with canonical or non-canonical AID activities (SBS84 and SBS85;
The remaining other clustered substitutions exhibited inconsistent VAFs likely representing mutations at highly mutable genomic regions or the effects of co-occurring large mutational events such as copy number alterations (
Different cancers revealed distinct tendencies of clustered indel mutagenesis (
The PCAWG project elucidated a constellation of mutations putatively driving cancer development10. The disclosed data reveals significant enrichments of clustered substitutions and clustered indels amongst these driver mutations. Specifically, whereas only 3.7% of all substitutions and 0.9% of all indels are clustered events, they contribute 8.4% and 6.9% of substitution and indel drivers, respectively (q-values <1e-5; Fisher's exact tests;
In each sample, kataegic mutations were separated into distinct events based on consistent VAFs across adjacent mutations and IMD distances greater than the sample-dependent IMD threshold. Applicant's analysis revealed that 36.2% of all kataegic events occurred within 10 kb of a structural breakpoint but not on detected focal amplifications (
Recurrent Kyklonic Mutagenesis of ecDNA
While only 9.6% of kataegic events occur within ecDNA regions, >30% of ecDNAs had one or more associated kyklonic events (
More recurrent APOBEC3 kataegis was observed across circular ecDNA regions compared to other forms of structural variations (
Recurrent kyklonic events were increased within or near known cancer-associated genes including TP53, CDK4, and MDM2, amongst others (
Validation of Kyklonic Events in ecDNA
Kyklonic events were further investigated across three additional independent cohorts, including: 61 sarcomas44, 280 lung cancers45, and 186 esophageal squamous cell carcinomas46. Comparable rates of clustered mutagenesis were found for both substitutions and indels as the ones reported in PCAWG with a 2.4- and 5.0-fold enrichment of clustered substitutions and indels within driver events, respectively (
Somatic variant calls of single-base substitutions, small insertions and deletions, and structural variations were downloaded for the 2,583 white-listed whole-genome sequenced samples from PCAWG along with the corresponding list of consensus driver events10. Epidemiological and clinical features for all available samples were downloaded from the official PCAWG release (https://dcc.icgc.org/releases/PCAWG). The collection of whole-exome sequenced samples from TCGA along with all available clinical features were downloaded from the Genomic Data Commons (https://gdc.cancer.gov/). The MSK-IMPACT Clinical Sequencing Cohort43 composed of 10,000 clinical cases was downloaded from cBioPortal (https://www.cbioportal.org/study/summary?id=msk_impact_2017). The subclassification of focal amplifications comprised of circular extrachromosomal DNA (ecDNA), linear amplifications, breakage-fusion-bridge cycles (BFBs), and heavily rearranged events, and their corresponding genomic locations were obtained for a subset of samples (n=1,291) as reported34.
Experimental models used to validate clustered events were derived from previous studies using primary Hupki mouse embryonic fibroblasts (MEFs) exposed to ultraviolet light41, human induced pluripotent stem cells (iPSC) exposed to benzo[a]pyrene40, and clonally expanded BT-474 human breast cancer cell line with episodically active APOBEC339.
Independent cohorts used to validate kyklonic events were collected from multiple sources. The 61 undifferentiated sarcomas44 and 187 high-confidence esophageal squamous cell carcinomas46 were downloaded from the European Genome-phenome Archive (EGAD00001004162 and EGAD00001006868, respectively). The 280 lung adenocarcinomas45 were downloaded from dbGaP under the accession number (phs001697.v1.p1). Clustered mutations in validation samples were analyzed using the same approach as the one utilized in the original cohort.
SigProfilerSimulator (v1.0.2) was used to derive an intra-mutational distance (IMD) cutoff51 that is unlikely to occur by chance based upon the tumor mutational burden and the mutational patterns for a given sample. Specifically, each tumor sample was simulated while maintaining the sample's mutational burden on each chromosome, the +/−2 bp sequence context for each mutation, and the transcriptional strand bias ratios across all mutations. All mutations in each sample were simulated 100 times and the IMD cutoff was calculated such that 90% of the mutations below this cutoff could not appear by chance (q-value <0.01). For example, in a sample with an IMD threshold of 500 bp, one may observe 1,000 mutations within this threshold with no more than 100 mutations expected based on the simulated data (q-value <0.01). P-values were calculated using z-tests by comparing the number of real mutations and the distribution of simulated mutations that occur below the same IMD threshold. A maximum cutoff of 10 kb was used for all IMD thresholds. By generating a background distribution that reflects the random distribution of events used to reduce the false positive rate, this model also considers regional heterogeneities of mutation rates, partially attributed to replication timing and expression, and variances in clonality by correcting for mutation-rich regions and mutation-poor regions within 1 Mb windows. The 1 Mb window size has been utilized and established as an appropriate scale when considering the variability in mutation rates associated with chromatin structure, replication timing, and genome architecture14,52,53. The 1 Mb window ensures that subsequent mutations likely occurred as single events using a maximum cutoff of 0.10 for differences in the variant allele frequencies (VAFs).
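By way of non-limiting illustration, the selection of a sample-dependent IMD cutoff from simulated data can be sketched in Python as follows. This sketch does not reproduce SigProfilerSimulator; the function names, the candidate-cutoff grid, the one-sided z-test, and the use of an uncorrected p-value in place of a q-value are all assumptions made only for this example.

```python
import math
from statistics import mean, pstdev
from typing import List, Optional, Sequence


def imds(positions: Sequence[int]) -> List[int]:
    """Distances between adjacent mutations on a single chromosome."""
    pos = sorted(positions)
    return [b - a for a, b in zip(pos, pos[1:])]


def count_below(distances: Sequence[int], cutoff: int) -> int:
    """Number of adjacent-mutation distances strictly below the cutoff."""
    return sum(1 for d in distances if d < cutoff)


def z_test_p(observed: int, simulated: Sequence[int]) -> float:
    """One-sided p-value that the observed count exceeds the simulated counts."""
    mu, sd = mean(simulated), pstdev(simulated)
    if sd == 0:
        return 0.0 if observed > mu else 1.0
    z = (observed - mu) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))


def choose_imd_cutoff(
    real: Sequence[int],
    sims: Sequence[Sequence[int]],
    candidates: Sequence[int],
    max_chance_fraction: float = 0.10,  # at most 10% expected by chance
    alpha: float = 0.01,                # p-value stand-in for the q-value
    max_cutoff: int = 10_000,           # 10 kb ceiling on all thresholds
) -> Optional[int]:
    """Largest candidate cutoff at which clustering exceeds the background."""
    best = None
    for cutoff in sorted(candidates):
        if cutoff > max_cutoff:
            break
        observed = count_below(real, cutoff)
        if observed == 0:
            continue
        sim_counts = [count_below(s, cutoff) for s in sims]
        few_by_chance = mean(sim_counts) <= max_chance_fraction * observed
        if few_by_chance and z_test_p(observed, sim_counts) < alpha:
            best = cutoff
    return best
```

In this sketch, a sample with 1,000 observed distances below 500 bp and roughly 80 such distances per simulation passes both criteria, whereas a sample whose observed count matches the simulated counts yields no cutoff.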
The regional IMD cutoff was determined using a sliding window approach that calculated the fold enrichment between the real and simulated mutation densities within 1 Mb windows across the genome. The IMD cutoffs were further increased, for regions that had higher than 9-fold enrichments of clustered mutations and where >90% of the clustered mutations were found within the original data, to capture additional clustered events while maintaining the original criteria (<10% of the mutations below this cutoff appear by chance; q-value <0.01). Lastly, as VAF of mutations may confound the definition of clustered events in ecDNA, Applicant calculated the distribution of inter-event distances within recurrently mutated ecDNA while disregarding the VAF of individual mutations. This resulted in the exact same separation of kataegic events using only the inter-event distances as a criterion for the grouping of mutations into a single event.
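By way of non-limiting illustration, the per-window fold enrichment between real and simulated mutation densities can be sketched in Python as follows. For brevity the sketch uses non-overlapping 1 Mb bins rather than a sliding window, and the pseudocount and function names are assumptions introduced only for this example.

```python
from collections import Counter
from typing import Dict, Sequence

WINDOW = 1_000_000  # 1 Mb window size


def window_counts(positions: Sequence[int], window: int = WINDOW) -> Counter:
    """Count mutations falling in each fixed-size window of one chromosome."""
    return Counter(p // window for p in positions)


def fold_enrichment(
    real_positions: Sequence[int],
    simulated_runs: Sequence[Sequence[int]],
    window: int = WINDOW,
    pseudocount: float = 1.0,  # assumed; avoids division by zero
) -> Dict[int, float]:
    """Ratio of observed to mean simulated mutation counts per window."""
    real = window_counts(real_positions, window)
    sims = [window_counts(run, window) for run in simulated_runs]
    result = {}
    for index, observed in real.items():
        expected = sum(s[index] for s in sims) / len(sims)
        result[index] = (observed + pseudocount) / (expected + pseudocount)
    return result
```

Windows with a high ratio correspond to mutation-rich regions for which a regional adjustment of the cutoff, as described above, could then be considered.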
Subsequently, all clustered mutations with consistent VAFs were classified into one of four categories (
The clustered mutational catalogues of the examined samples were summarized in SBS288 and ID83 matrices using SigProfilerMatrixGenerator55 (version 1.2.0) for each tissue type and each category of clustered events. For example, six matrices were constructed for clustered mutations found in Breast-AdenoCA: one matrix for DBSs, one matrix for MBSs, one matrix for omikli, one matrix for kataegis, one matrix for other clustered substitutions, and one matrix for clustered indels. The SBS288 classification considers the 5′ and 3′ bases immediately flanking each single-base substitution (referred to using the pyrimidine base in the Watson-Crick base pair) resulting in 96 individual mutation channels. Further, this classification considers the strand orientation for mutations that occur within genic regions resulting in three possible categories: (i) transcribed, where the pyrimidine base occurs on the template strand; (ii) untranscribed, where the pyrimidine base occurs on the coding strand; or (iii) non-transcribed, where the pyrimidine base occurs in an intergenic region. Note that mutations in genic regions that are bi-directionally transcribed were evenly split amongst the coding and template strand channels. Combined, this results in a classification consisting of 288 mutation channels, which were used as input for de novo signature extraction of clustered substitutions. The ID83 mutational classification has been previously described55.
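By way of non-limiting illustration, the channel counts described above (6 pyrimidine substitution types × 4 possible 5′ bases × 4 possible 3′ bases = 96 channels, and 96 × 3 strand categories = 288 channels) can be enumerated with a short Python sketch. The label format and the single-letter strand codes are assumptions chosen for readability in this example and are not asserted to match the output of any particular tool.

```python
from itertools import product
from typing import List

BASES = "ACGT"
# Substitutions written relative to the pyrimidine of the base pair.
SUBSTITUTIONS = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]
# Assumed codes: T = transcribed, U = untranscribed, N = non-transcribed.
STRANDS = ["T", "U", "N"]


def sbs96_channels() -> List[str]:
    """All trinucleotide-context channels, e.g. 'A[C>T]G'."""
    return [
        f"{five}[{sub}]{three}"
        for sub in SUBSTITUTIONS
        for five, three in product(BASES, repeat=2)
    ]


def sbs288_channels() -> List[str]:
    """SBS96 channels split by the three strand categories."""
    return [f"{strand}:{ch}" for strand in STRANDS for ch in sbs96_channels()]
```

Each clustered substitution in a sample would increment exactly one of these 288 channels, yielding one column of the input matrix for signature extraction.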
Mutational signatures were extracted from the generated matrices using SigProfilerExtractor (v1.1.0), a Python based tool that uses nonnegative matrix factorization to decipher both the number of operative processes within a given cohort and the relative activities of each process within each sample56. The algorithm was initialized using random initialization and by applying multiplicative updates using the Kullback-Leibler divergence with 500 replicates. Each de novo extracted mutational signature was subsequently decomposed into the COSMIC (v3) set of signatures (https://cancer.sanger.ac.uk/signatures/) requiring a minimum cosine similarity of 0.80 for all reconstructed signatures. All de novo extractions and subsequent decomposition were visually inspected and, as previously done1, manual corrections were performed for 2.2% of extractions (4 out of 180 extractions) where the total number of operative signatures was adjusted ±1. Consistent with prior visualizations10, Applicant included all cancer types within the PCAWG cohort which may comprise as few as one sample for certain cancer types. Similarly, consistent with prior visualizations1, decomposed signature activity plots required that each cancer type have more than 2 samples and used mutation thresholds for each clustered category; 25 mutations per sample were required for doublet-base substitutions, omikli events, and other clustered mutations; 15 mutations per sample were required for multi-base substitutions and kataegic events; 10 mutations were required per sample for clustered indels.
A subset of clustered mutational signatures was validated using previously sequenced in vitro cell line models. As done for PCAWG samples, Applicant generated a background model using SigProfilerSimulator51 to calculate the clustered IMD cutoff for each sample and partitioned each substitution into the appropriate category of clustered events. Mutational spectra were generated for each subclass within each sample using SigProfilerMatrixGenerator55 and were compared against the de novo signatures extracted from human cancer. The cosine similarity between the in vitro mutational spectra and de novo observed clustered signatures was calculated to assess the degree of similarity. Applicant notes that the average cosine similarity between two random nonnegative vectors is 0.75, and the cosine similarities above 0.81 reflect p-values below 0.0151.
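For reference, the cosine similarity used in this comparison is the dot product of two spectra divided by the product of their Euclidean norms. A minimal sketch follows, assuming each spectrum is a plain list of nonnegative channel counts or proportions of equal length; the function name is illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of channels")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for an all-zero vector")
    return dot / (norm_a * norm_b)
```

Because mutational spectra are nonnegative, the value lies between 0 and 1, with 1 indicating identical relative channel proportions.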
Associations with Cancer Risk Factors
Homologous recombination deficiency (HRD) was defined for breast cancers using the status of BRCA1, BRCA2, RAD51C, and PALB257. Samples with a germline, somatic, or epigenetic alteration in one of these genes were considered HR-deficient, while samples without any known alterations in these genes were considered HR-proficient. The number of clustered indels was compared between HR-deficient and HR-proficient samples. The smoking status of lung cancers was determined using the clinical annotation from TCGA (https://portal.gdc.cancer.gov/repository). The number of clustered indels associated with tobacco smoking (ID6) was compared between samples annotated as lifelong non-smokers and samples annotated as current and reformed smokers. The status of alcohol consumption was determined using the annotations from the official PCAWG release (https://dcc.icgc.org/releases/PCAWG). The total number of clustered indels was compared between samples annotated with no alcohol consumption and those annotated as daily and weekly drinkers.
All RNA-seq expression data was downloaded as a part of the official PCAWG release (https://dcc.icgc.org/releases/PCAWG). The relative expression data found within this release were normalized using fragments per kilobase of exon per million mapped fragment (FPKM) normalization and upper quartile normalization. The relative expressions of a gene were compared between those harboring clustered or non-clustered events. Each distribution was then normalized to the average expression of the wild-type gene. Only genes with at least 10 total events (i.e., clustered and non-clustered mutations) including at least 5 clustered events were considered for examination.
The distance to the nearest structural variation breakpoint was calculated for each mutation in each subclass using the minimum distance to the nearest adjacent upstream or downstream breakpoint. Each distribution was modeled using a Gaussian mixture with an automatic selection criterion for the number of components ranging between one and five components using the minimum Bayesian information criteria (BIC) across all iterations. Modelling of kataegic events resulted in an optimal fit of three components, which was used to separate kataegic substitutions into SV-associated and non-SV associated mutations. Doublet-base substitutions and multi-base substitutions were both modelled using a single Gaussian distribution relating to non-SV associated mutations, while omikli and other clustered mutations were modelled using a mixture of two components likely reflecting leakage of smaller kataegic events contributing to a weak SV-associated distribution. To account for the frequency of breakpoints across each sample, Applicant normalized the minimum distance of each mutation to the nearest SV by calculating the expected distance between a mutation and SV for each sample using the total number of breakpoints and the overall length of a given chromosome (data not shown). After normalizing the kataegic events, Applicant observed an optimal solution of two components with one SV-associated distribution (on average each mutation occurs within one thousandth of the expected distance to nearest structural variation) and one non-SV associated distribution (on average occurring within the expected distance to the nearest structural variation). The normalized kyklonic events are consistent with the non-SV associated distribution reflecting kataegic events that occur on ecDNA typically of lengths 1-10 Mb35.
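The first step above, finding the minimum distance from a mutation to the nearest upstream or downstream breakpoint, can be sketched as follows. The sketch assumes breakpoints on the same chromosome are supplied as a sorted list of integer coordinates; the function name is illustrative, and the mixture modelling and per-sample normalization are not shown.

```python
from bisect import bisect_left


def nearest_breakpoint_distance(position, breakpoints):
    """Minimum distance from a mutation to the nearest breakpoint.

    `breakpoints` must be sorted in ascending order and lie on the same
    chromosome as `position`. Returns None when no breakpoint exists.
    """
    if not breakpoints:
        return None
    i = bisect_left(breakpoints, position)
    candidates = []
    if i < len(breakpoints):
        # Nearest breakpoint at or downstream of the mutation
        candidates.append(breakpoints[i] - position)
    if i > 0:
        # Nearest breakpoint upstream of the mutation
        candidates.append(position - breakpoints[i - 1])
    return min(candidates)
```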
The enrichment score of RTCA and YTCA penta-nucleotides quantifies the frequency for which each TpCpA>TpKpA mutation occurs at either an RTCA or YTCA context. To account for motif availability, this score is calculated using the +/−20 bp sequence context around each mutation and normalized by the number of cytosine bases and C>N mutations within the set of 41-mers surrounding each mutation of interest7.
All RNA-seq expression data was downloaded as a part of the official PCAWG release (https://dcc.icgc.org/releases/PCAWG). The relative expression data found within this release were normalized using fragments per kilobase of exon per million mapped fragment (FPKM) normalization and upper quartile normalization. The normalized expression of APOBEC3A/B was compared between samples harboring ecDNA and samples with no detected ecDNA, and between samples with and without kyklonas. All p-values were generated using a Mann-Whitney U test and were corrected for multiple hypothesis testing using the Benjamini-Hochberg false discovery rate procedure.
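The Benjamini-Hochberg correction referenced here and elsewhere in this disclosure can be illustrated with a short sketch. It converts a list of p-values into adjusted q-values by scaling each p-value by the number of tests divided by its rank and then enforcing monotonicity from the largest p-value downward. The function name is illustrative, and the sketch is not necessarily the implementation used for the reported analyses.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q-values in the input order."""
    m = len(p_values)
    # Indices of p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is the minimum
    # of its own scaled p-value and all q-values at higher ranks
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values
```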
Circular ecDNA and Kataegis
The collection of ecDNA ranges was intersected with the catalog of clustered mutations, which was used to determine the overlapped mutational burden for each subclass of clustered event and the mutational spectra of overlapping kataegic events. Enrichments of events were calculated using statistical background models generated using SigProfilerSimulator51 that shuffled the dominant mutation in each clustered event across the genome (i.e., the most frequent mutation type in a single event). The decomposed kyklonic mutational spectra were generated using the decomposition module within SigProfilerExtractor56. Only mutational signatures increasing the overall cosine similarity by at least 0.01 were used. In both the original and validation cohorts, SBS2 and SBS13 were sufficient to explain the kyklonic mutational spectra, with no other known mutational signature increasing the cosine similarity by more than 0.01. Comparisons between ecDNA with and without cancer genes were performed using the set of cancer genes from the Cancer Gene Census (CGC)58. All statistical comparisons and p-values were calculated using a two-tailed Mann-Whitney U test unless otherwise specified. For each set of tests, p-values were corrected for multiple hypothesis testing using the Benjamini-Hochberg false discovery rate procedure. The predicted effect of each overlapping variant was determined using ENSEMBL's Variant Effect Predictor tool by reporting only the most severe consequence59.
All survival analyses, including the generation of Kaplan-Meier curves, Cox regressions, and Log-rank tests, were performed using the Lifelines Python package (v0.24.4). Across the 30 distinct whole-genome sequenced cancer types included in the PCAWG study, only 6 cancer types contained enough samples to explore the associations between survival and overall number of clustered mutations. The sufficient sample size criteria required more than 50 samples with survival endpoints with at least 30 of the samples with an observed clustered event. Each cancer type was analyzed separately by comparing the survival of samples with a high clustered mutational burden (top 80th percentile across a given cancer type) to the survival of samples with a low clustered mutational burden (bottom 20th percentile across a given cancer type).
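To make the survival comparison concrete, the following is a minimal, dependency-free sketch of the Kaplan-Meier product-limit estimator. It is a stand-in for illustration only and does not reproduce the Lifelines API used for the reported analyses; the function name and input layout (parallel lists of follow-up times and event indicators) are assumptions.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival probability)] at each distinct event time.

    `durations` are follow-up times; `observed` flags whether the event
    (1/True) or censoring (0/False) occurred at that time.
    """
    records = sorted(zip(durations, observed))
    n_at_risk = len(records)
    survival = 1.0
    curve = []
    i = 0
    while i < len(records):
        t = records[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same follow-up time
        while i < len(records) and records[i][0] == t:
            deaths += 1 if records[i][1] else 0
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set
        n_at_risk -= removed
    return curve
```

One curve would be computed per group (for example, high versus low clustered mutational burden within a cancer type) before the groups are compared with a Log-rank test.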
Analysis of whole-exome sequenced samples from TCGA was altered to reflect the limited resolution for identifying clustered mutations within the exome. Specifically, SigProfilerSimulator (v1.0.2)51 was used to derive an IMD cutoff for each sample based on the tumor mutational burden within the exome and the mutational patterns for a given sample. Mutations were randomly shuffled while maintaining the mutational burden within the exome of each chromosome, the +/−2 bp sequence context for each mutation, and the transcriptional strand bias ratios across all mutations. Each sample was simulated 100 times and an IMD cutoff was calculated using the same methods as outlined for the detection of clustered events within PCAWG. Due to the limited number of detected events, 22 cancer types had sufficient data to perform survival analysis. Each cancer type was analyzed separately by comparing samples with at least a single clustered event to samples with no detected clustered events within the exome.
For both PCAWG and TCGA analyses, survival distributions within a given cancer type were compared using a Log-rank test. Cox regressions were performed to determine hazard ratios and to correct for age and total mutational burden. All p-values were also corrected for multiple hypothesis testing using the Benjamini-Hochberg false discovery rate procedure.
To investigate differential survival associated with the detection of clustered events within cancer driver genes, Kaplan-Meier survival curves were compared between individuals harboring clustered versus non-clustered mutations within a given cancer driver gene. The distributions were compared using a Log-rank test. Cox regressions were performed to determine the hazard ratios and to correct for age, total mutational burden, and cancer type across TCGA. Cox regressions performed for the MSK-IMPACT cohort were corrected for total mutational burden and cancer type. No corrections were performed for age as these metadata were not available for the MSK-IMPACT cohort. All p-values were also corrected for multiple hypothesis testing using the Benjamini-Hochberg false discovery rate procedure.
All three validation cohorts were analyzed analogously to the PCAWG cohort. Specifically, clustered mutations were classified by calculating a sample-dependent IMD threshold for clustered versus non-clustered mutations using a background model generated by SigProfilerSimulator51. All clustered mutations were subclassified into either DBS, MBS, omikli, kataegis, or other mutations. AmpliconArchitect (version 1.2) was used to determine regions of focal amplifications60, which were utilized for subsequent validation of kyklonic events by overlapping kataegic events with all detected focal amplifications. The decomposed kyklonic mutational spectra were generated using the decomposition module within SigProfilerExtractor56. Only mutational signatures increasing the overall cosine similarity by at least 0.01 were used. In both the original and validation cohorts, SBS2 and SBS13 were sufficient to explain the kyklonic mutational spectra, with no other known mutational signature increasing the cosine similarity by more than 0.01.
No data were generated specifically for this study. All data were and can be downloaded from the appropriate links, repositories, and references. Specifically, for the discovery cohort, all data and metadata were obtained from the official PCAWG release: https://dcc.icgc.org/releases/PCAWG. All data and metadata for TCGA samples were obtained from GDC: https://gdc.cancer.gov/. Genomics data for clonally expanded cell lines were downloaded from European Genome-phenome Archive: EGAD00001004201, EGAD00001004203, and EGAD00001004583. For the three validation cohorts, datasets were downloaded as submitted by the original publications and genomics data were downloaded from their respective repositories: EGAD00001004162 for 61 undifferentiated sarcomas44 (European Genome-phenome Archive), EGAD00001006868 for 187 high-confidence esophageal squamous cell carcinomas46 (European Genome-phenome Archive), and phs001697.v1.p1 for 280 lung adenocarcinomas45 (dbGaP). Somatic mutations and metadata for the MSK-IMPACT Clinical Sequencing Cohort composed of 10,000 clinical cases42 were downloaded from cBioPortal: https://www.cbioportal.org/study/summary?id=msk_impact_2017.
The SigProfiler compendium of tools is developed as a set of Python packages that are freely available for installation through PyPI or directly through GitHub (https://github.com/AlexandrovLab/). Each package is fully functional, free, and open source, is distributed under the permissive 2-Clause BSD License, and is accompanied by extensive documentation: (i) SigProfilerMatrixGenerator55 (version 1.2.0; https://github.com/AlexandrovLab/SigProfilerMatrixGenerator); (ii) SigProfilerSimulator51 (version 1.0.2; https://github.com/AlexandrovLab/SigProfilerSimulator); (iii) SigProfilerExtractor56 (version 1.1.0; https://github.com/AlexandrovLab/SigProfilerExtractor). Each SigProfiler tool also has an R wrapper available for installation through the GitHub repositories. AmpliconArchitect34 (version 1.2) is also freely available and can be downloaded from https://github.com/virajbdeshpande/AmpliconArchitect. The core computational pipelines used by the PCAWG Consortium for alignment, quality control, and variant calling are available to the public at https://dockstore.org/search?search=pcawg under the GNU General Public License v.3.0, which allows for reuse and distribution.
Clustered mutagenesis in cancer can occur through different mutational processes, with AID/APOBEC3 deaminases playing the most prominent role. In addition to enzymatic deamination, other endogenous and exogenous sources imprint many of the observed clustered indels and substitutions. Importantly, a multitude of mutational processes can give rise to omikli events, including tobacco carcinogens and exposure to ultraviolet light. Clustered substitutions and indels were highly enriched in driver events and associated with differential gene expression, implicating them in cancer development and cancer evolution. Some clustered mutational signatures are associated with known cancer risk factors or with the activity or failure of DNA repair processes. Importantly, clustered mutations in TP53, EGFR, and BRAF were associated with changes in overall survival and can be detected in most types of sequencing data, including clinically actionable targeted panels such as MSK-IMPACT. Clustered mutations with clinical significance were also detected in KIT, KMT2C, ELF3, APC, and ARID1A.
A large proportion of kataegic events occur within 10 kb of detected structural variant breakpoints with a mutational pattern suggesting the activity of APOBEC3. Multiple distinct kataegic events, independent of detected breakpoints, were observed on circular ecDNA, termed kyklonas, implicating recurrent APOBEC3 mutagenesis. The circular topology of ecDNAs47 and their rapid replication patterns are reminiscent of the structure and behavior of the circular genomes of several double-stranded DNA-based pathogens, including herpesviruses, papillomaviruses, and polyomaviruses32-35. Importantly, prior pan-virome studies have shown that these double-stranded DNA viral genomes often manifest mutations from APOBEC3 enzymes48-50. As such, recurrent APOBEC3 mutagenesis on ecDNA is likely representative of an anti-viral response in which the viral-like structure of the ecDNA is treated as an infectious agent and attacked by APOBEC3 enzymes. ecDNAs harbor a plethora of cancer-associated genes and are responsible for many gene amplification events that can accelerate tumor evolution. Repeated mutagenic attacks on these ecDNAs reveal functional effects within known oncogenes, implicating additional modes of oncogenesis that may ultimately contribute to subclonal tumor evolution, subsequent evasion of therapy, and clinical outcome. Further investigations with large-scale, clinically annotated, whole-genome sequenced cancers are required to fully understand the clinical implications of clustered mutations and kyklonas.
The clinical utility of detecting clustered events in driver genes was evaluated by comparing the survival amongst individuals with clustered mutations versus individuals harboring non-clustered mutations within each driver gene across all whole-exome sequenced samples in TCGA. For each of these comparisons, Applicant performed Cox regressions considering the effects from age and TMB while correcting for cancer type and multiple hypothesis testing. These results were validated in targeted panel sequencing data from the Memorial Sloan Kettering-Integrated Mutation Profiling of Actionable Cancer Targets (MSK-IMPACT) cohort42,43. These analyses revealed a significant difference in survival between individuals with clustered and individuals with non-clustered mutations detected in TP53, EGFR, and BRAF. Specifically, individuals with clustered events within BRAF had a better overall survival compared to ones with non-clustered events (q-values <0.05;
To determine the cutoff of the number of mutations in an omikli versus a kataegic event, Applicant modelled the distribution of clustered event sizes (excluding DBSs, MBSs, and other clustered events with disagreeable variant allele frequencies) using a mixture of two Poisson distributions (
Analyzing the mapping scores of clustered indels. Applicant examined the mapping scores across the genome for clustered indels to ensure that the majority of events fall within high confidence regions. For this analysis Applicant used a consensus list of blacklisted genomic regions developed by ENCODE1 and the complete set of clustered indels as identified from the 2,583 PCAWG samples.
Using the MSK-MET targeted panel sequencing cohort, differential survival was observed across four genes in primary diseases, including EGFR in non-small cell lung cancers, KIT in gastrointestinal stromal tumors, KMT2C in bladder cancers, and ELF3 in bladder cancers, and across two genes in metastatic diseases, including APC in colorectal cancers and ARID1A in bladder cancers. All associations resulted in a worse overall outcome with the presence of a clustered mutation within the gene of interest. See
Thus, it should be understood that although the present disclosure has been specifically disclosed by preferred embodiments and optional features, modification, improvement and variation of the disclosure embodied therein herein disclosed can be resorted to by those skilled in the art, and that such modifications, improvements and variations are considered to be within the scope of this disclosure. The materials, methods, and examples provided here are representative of preferred embodiments, are exemplary, and are not intended as limitations on the scope of the disclosure.
The disclosure has been described broadly and generically herein. Each of the narrower species and subgeneric groupings falling within the generic disclosure also form part of the disclosure. This includes the generic description of the disclosure with a proviso or negative limitation removing any subject matter from the genus, regardless of whether or not the excised material is specifically recited herein.
In addition, where features or aspects of the disclosure are described in terms of Markush groups, those skilled in the art will recognize that the disclosure is also thereby described in terms of any individual member or subgroup of members of the Markush group.
All publications, patent applications, patents, and other references mentioned herein are expressly incorporated by reference in their entirety, to the same extent as if each were incorporated by reference individually. Several references are identified by an Arabic number, and the full bibliographic citations for these references are provided below. In case of conflict, the present specification, including definitions, will control.
This application claims priority under 35 U.S.C. § 119 (e) to U.S. Provisional Application No. 63/289,601, filed Dec. 14, 2021, the contents of which are incorporated herein by reference in their entireties.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2022/052745 | 12/13/2022 | WO |
Number | Date | Country | |
---|---|---|---|
63289601 | Dec 2021 | US |