This application claims the benefit of priority from European Provisional Application No. EP19166374, filed Mar. 29, 2019, the entire contents of which are incorporated herein by reference.
The Sequence Listing in an ASCII text file, named as 38272PCT_SequenceListing.txt of 495 KB, created on Mar. 27, 2020, and submitted to the United States Patent and Trademark Office via EFS-Web, is incorporated herein by reference.
The aryl hydrocarbon receptor (AHR) is a ligand-activated transcription factor involved in the regulation of diverse processes such as embryogenesis, vasculogenesis, drug metabolism, cell motility and immune modulation, and cancer. In preclinical studies AHR activation by tryptophan metabolites generated through indoleamine-2,3-dioxygenase (IDO1) and/or tryptophan-2,3-dioxygenase (TDO2) promoted tumor progression by enhancing the motility, anoikis and clonogenic survival of the tumor cells as well as by suppressing anti-tumor immune responses.
As ligand binding is necessary for AHR activation, the expression level of AHR alone does not allow inference of its activation state. AHR activation is commonly detected by its nuclear translocation, the activity of cytochrome P-450 enzymes or the binding of AHR-ARNT to dioxin-responsive elements (DRE) using reporter assays. While all of these methods are applicable in vitro, they are laborious, require special equipment and are expensive. In addition, relying on cytochrome P-450 enzymes is limited to conditions where they are regulated, which is not always the case, given the ligand and cell type specificity of AHR activation.
AHR target gene expression is context-specific, and therefore an AHR activation signature consisting of diverse AHR target genes is required to efficiently detect AHR activation across different cells/tissues and in response to diverse AHR ligands.
The expression of a specific gene is usually regulated not by a single transcription factor but by several transcription factors acting separately or in combination. Therefore, a single marker is not a specific readout for a given transcription factor. With regard to biomarkers for AHR activity in particular, AHR target gene expression is highly cell type and context dependent, so a single marker might be a readout for AHR activity in one condition but not in another. In addition, a single biomarker cannot capture functional outcomes. (Rothhammer, V. & Quintana, F. J. 2019. The aryl hydrocarbon receptor: an environmental sensor integrating immune responses in health and disease. Nat Rev Immunol, doi:10.1038/s41577-019-0125-8).
The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
As used herein, the term “about” refers to a variation within approximately ±10% from a given value.
An “AHR signaling modulator” or an “AHR modulator” as used herein, refers to a modulator which affects AHR signaling in a cell. In some embodiments, an AHR signaling modulator exhibits direct effects on AHR signaling. In some embodiments, the direct effect on AHR is mediated through direct binding to AHR. In some embodiments, a direct modulator exhibits full or partial agonistic and/or antagonistic effects on AHR. In some embodiments, an AHR modulator is an indirect modulator.
In some embodiments, an AHR signaling modulator is a small molecule compound. The term “small molecule compound” herein refers to a small organic chemical compound, generally having a molecular weight of less than 2000 daltons, 1500 daltons, 1000 daltons, 800 daltons, or 600 daltons.
In some embodiments, an AHR modulator comprises a 2-phenylpyrimidine-4-carboxamide compound, a sulphur substituted 3-oxo-2,3-dihydropyridazine-4-carboxamide compound, a 3-oxo-6-heteroaryl-2-phenyl-2,3-dihydropyridazine-4-carboxamide compound, a 2-hetarylpyrimidine-4-carboxamide compound, a 3-oxo-2,6-diphenyl-2,3-dihydropyridazine-4-carboxamide compound, a 2-heteroaryl-3-oxo-2,3-dihydro-4-carboxamide compound, PDM 2, 1,3-dichloro-5-[(1E)-2-(4-methoxyphenyl)ethenyl]-benzene, α-Naphthoflavone, 6,2′,4′-Trimethoxyflavone, CH223191, a tetrahydropyridopyrimidine derivative, StemRegenin-1, GNF351, CB7993113, HP163, PX-A590, PX-A548, PX-A275, PX-A758, PX-A446, PX-A24590, PX-A25548, PX-A25275, PX-A25758, PX-A26446, an Indole AHR inhibitor, and an oxazole-containing (OxC) compound.
In some embodiments, a direct AHR modulator comprises:
(a) Drugs: e.g. Omeprazole, Sulindac, Leflunomide, Tranilast, Laquinimod, Flutamide, Nimodipine, Mexiletine, 4-Hydroxy-Tamoxifen, Vemurafenib etc.
(b) Synthetic compounds: e.g. 10-Chloro-7H-benzimidazo[2,1-a]benz[de]isoquinolin-7-one (10-Cl-BBQ), Pifithrin-α hydrobromide,
(c) Natural compounds: e.g., kynurenine, kynurenic acid, cinnabarinic acid, ITE, FICZ, indoles including indole-3-carbinol, indole-3-pyruvate, indole-aldehyde, microbial metabolites, dietary components, quercetin, resveratrol, curcumin, or
(d) Toxic compounds: e.g. TCDD, cigarette smoke, 3-methylcholanthrene, benzo(a)pyrene, 2,3,7,8-tetrachlorodibenzofuran, fuel emissions, halogenated and nonhalogenated aromatic hydrocarbons, pesticides.
In some embodiments, indirect AHR modulators affect AHR activation through modulation of the levels of AHR agonists or antagonists.
In some embodiments, the modulation of the levels of AHR agonists or antagonists is mediated through one or more of the following:
(a) regulation of enzymes modifying AHR ligands e.g. the cytochrome p450 enzymes by e.g. cytochrome p450 enzyme inhibitors including 3′-methoxy-4′-nitroflavone (MNF), alpha-naphthoflavone (α-NF), fluoranthene (FL), phenanthrene (Phe), pyrene (PY) etc.
(b) regulation of enzymes producing AHR ligands including direct and indirect inhibitors/activators/inducers of tryptophan-catabolizing enzymes e.g. IDO1 pathway modulators (indoximod, NLG802), IDO1 inhibitors (1-methyl-L-tryptophan, Epacadostat, PX-D26116, navoximod, PF-06840003, NLG-919A, BMS-986205, INCB024360A, KHK2455, LY3381916, MK-7162), TDO2 inhibitors (680C91, LM10, 4-(4-fluoropyrazol-1-yl)-1,2-oxazol-5-amine, fused imidazo-indoles, indazoles), dual IDO/TDO inhibitors (HTI-1090/SHR9146, DN1406131, RG70099, EPL-1410), immunotherapy including immune checkpoint inhibition, vaccination, and cellular therapies, chemotherapy, immune stimulants, radiotherapy, exposure to UV light, and targeted therapies such as e.g. imatinib etc.
In some embodiments, indirect AHR modulators affect AHR activation through modulation of the expression of the AHR including e.g. HSP 90 inhibitors such as 17-allylamino-demethoxygeldanamycin (17-AAG), celastrol.
In some embodiments, indirect AHR modulators affect AHR activation by affecting binding partners/co-factors modulating the effects of AHR including e.g. estrogen receptor alpha (ESR1).
Examples of AHR modulators are listed in U.S. Pat. No. 9,175,266, US2019/225683, WO2019101647A1, WO2019101642A1, WO2019101643A1, WO2019101641A1, WO2018146010A1, AU2019280023A1, WO2020039093A1, WO2020021024A1, WO2019206800A1, WO2019185870A1, WO2019115586A1, EP3535259A1, WO2020043880A1 and EP3464248A1, all of which are incorporated by reference in their entirety.
As used herein, the phrase “biological sample” refers to any sample taken from a living organism. In some embodiments, the living organism is a human. In some embodiments, the living organism is a non-human animal.
In some embodiments, a biological sample includes, but is not limited to, biological fluids comprising biomarkers, cells, tissues, and cell lines. In some embodiments, a biological sample includes, but is not limited to, primary cells, induced pluripotent cells (IPCs), hybridomas, recombinant cells, whole blood, stem cells, cancer cells, bone cells, cartilage cells, nerve cells, glial cells, epithelial cells, skin cells, scalp cells, lung cells, mucosal cells, muscle cells, skeletal muscle cells, striated muscle cells, smooth muscle cells, heart cells, secretory cells, adipose cells, blood cells, erythrocytes, basophils, eosinophils, monocytes, lymphocytes, T-cells, B-cells, neutrophils, NK cells, regulatory T-cells, dendritic cells, Th17 cells, Th1 cells, Th2 cells, myeloid cells, macrophages, monocyte derived stromal cells, bone marrow cells, spleen cells, thymus cells, pancreatic cells, oocytes, sperm, kidney cells, fibroblasts, intestinal cells, cells of the female or male reproductive tracts, prostate cells, bladder cells, eye cells, corneal cells, retinal cells, sensory cells, keratinocytes, hepatic cells, brain cells, and colon cells, and the transformed counterparts of said cell types.
The phrase “computer readable medium” refers to a computer readable storage device or a computer readable signal medium. A computer readable storage device may be, for example, a magnetic, optical, electronic, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing; however, the computer readable storage device is not limited to these examples, except that a computer readable storage device excludes a computer readable signal medium. Additional examples of the computer readable storage device can include: a portable computer diskette, a hard disk, a magnetic storage device, a portable compact disc read-only memory (CD-ROM), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical storage device, or any appropriate combination of the foregoing; however, the computer readable storage device is also not limited to these examples. Any tangible medium that can contain, or store, a program for use by or in connection with an instruction execution system, apparatus, or device could be a computer readable storage device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, such as, but not limited to, in baseband or as part of a carrier wave. A propagated signal may take any of a plurality of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium (exclusive of computer readable storage device) that can communicate, propagate, or transport a program for use by or in connection with a system, apparatus, or device. Program code embodied on a computer readable signal medium may be transmitted using any appropriate medium, including but not limited to wireless, wired, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
In some embodiments, the term “condition” includes, but is not limited to a disease, or a cellular state. In some embodiments, the condition comprises cancer, diabetes, autoimmune disorder, degenerative disorder, inflammation, infection, drug treatment, chemical exposure, biological stress, mechanical stress, or environmental stress.
In some embodiments, the condition is cancer. In some embodiments, the cancer is selected from Adrenocortical carcinoma (ACC), Bladder Urothelial Carcinoma (BLCA), Breast invasive carcinoma (BRCA), Cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC), Cholangiocarcinoma (CHOL), Colon adenocarcinoma (COAD), Lymphoid Neoplasm Diffuse Large B-cell Lymphoma (DLBC), Esophageal carcinoma (ESCA), Glioblastoma multiforme (GBM), Head and Neck squamous cell carcinoma (HNSC), Kidney Chromophobe (KICH), Kidney renal clear cell carcinoma (KIRC), Kidney renal papillary cell carcinoma (KIRP), Brain Lower Grade Glioma (LGG), Liver hepatocellular carcinoma (LIHC), Lung adenocarcinoma (LUAD), Lung squamous cell carcinoma (LUSC), Mesothelioma (MESO), Ovarian serous cystadenocarcinoma (OV), Pancreatic adenocarcinoma (PAAD), Pheochromocytoma and Paraganglioma (PCPG), Prostate adenocarcinoma (PRAD), Rectum adenocarcinoma (READ), Sarcoma (SARC), Skin Cutaneous Melanoma (SKCM), Stomach adenocarcinoma (STAD), Testicular Germ Cell Tumors (TGCT), Thyroid carcinoma (THCA), Thymoma (THYM), Uterine Corpus Endometrial Carcinoma (UCEC), Uterine Carcinosarcoma (UCS), and Uveal Melanoma (UVM).
In some embodiments, different outcomes of a condition comprise positive response to treatment and no response to treatment. In some embodiments, different outcomes of a condition comprise favorable prognosis and unfavorable prognosis. In some embodiments, the different outcomes of the condition comprise death from the condition and survival from the condition. In some embodiments, the different outcomes of the condition are not binary, i.e., there are different levels, degrees or gradations between two opposite outcomes.
The phrase “fold change” refers to the ratio between the values of a specific biomarker in two different conditions. In some embodiments, one of the two conditions could be a control. The phrase “absolute fold change” is used herein when the log-transformed values of a specific biomarker are compared between two conditions. Absolute fold change is calculated by raising the base of the logarithm to the power of the log fold change value and then reporting the modulus of the resulting number.
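The two definitions above can be illustrated with a minimal Python sketch. It assumes that the log fold change is the difference between the two log-transformed values and that the logarithm is base 2; the base and the function names are illustrative assumptions, not taken from the disclosure.

```python
def fold_change(value_a, value_b):
    """Ratio of a biomarker's value in condition A to its value in condition B."""
    if value_b == 0:
        raise ValueError("the reference value must be non-zero")
    return value_a / value_b


def absolute_fold_change(log_value_a, log_value_b, base=2.0):
    """Back-transform a difference of log-transformed values to the linear scale.

    Assumption: the log fold change is log(A) - log(B), and `base` is the
    base of the logarithm used for the transformation (2 here).
    """
    log_fc = log_value_a - log_value_b
    # For a positive base the power is already non-negative; abs() mirrors
    # the "modulus" step described in the text.
    return abs(base ** log_fc)
```

For example, log2 values of 3.0 and 1.0 give a log fold change of 2.0 and an absolute fold change of 4.0, while swapping the two conditions gives 0.25.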
As used herein, the phrase “functional outcome” or “functional group” refers to groups of biomarkers represented by common gene ontology (GO) terms. In some embodiments, the gene ontology terms include terms that describe biological processes. In some embodiments, the gene ontology terms include terms that describe molecular functions. In some embodiments, the gene ontology terms include terms that describe cellular components. In some embodiments, the phrase “functional outcome” or “functional group” includes, but is not limited to, angiogenesis, positive regulation of vasculature development, reactive oxygen species metabolic process, reactive nitrogen species metabolic process, organic hydroxy compound metabolic process, xenobiotic metabolic process, cellular ketone metabolic process, toxin metabolic process, alcohol metabolic process, response to drug, response to toxic substance, response to oxidative stress, response to xenobiotic stimulus, response to acid chemical, response to extracellular stimulus, cellular response to biotic stimulus, cellular response to external stimulus, positive regulation of response to external stimulus, response to immobilization stress, response to hyperoxia, cellular response to extracellular stimulus, regulation of hemopoiesis, regulation of blood coagulation, regulation of hemostasis, regulation of coagulation, regulation of homeostatic process, response to temperature stimulus, regulation of blood pressure, blood coagulation, positive regulation of cytokine production, cytokine biosynthetic process, positive regulation of defense response, chemokine production, regulation of response to cytokine stimulus, regulation of chemotaxis, lipid localization, lipid storage, positive regulation of lipid localization, regulation of lipid localization, negative regulation of transport, positive regulation of cell-cell adhesion, myeloid leukocyte migration, positive regulation of locomotion, positive regulation of cellular component movement, regulation of hormone levels, hormone-mediated signaling pathway, positive regulation of smooth muscle cell proliferation, smooth muscle cell proliferation, positive regulation of cell cycle, response to oxygen levels, regulation of DNA binding transcription factor activity, response to transforming growth factor beta, negative regulation of response to external stimulus, ovulation cycle, response to radiation, and sex differentiation.
The term “memory” as used herein comprises program memory and working memory. The program memory may have one or more programs or software modules. The working memory stores data or information used by the CPU in executing the functionality described herein.
The term “processor” may include a single core processor, a multi-core processor, multiple processors located in a single device, or multiple processors in wired or wireless communication with each other and distributed over a network of devices, the Internet, or the cloud. Accordingly, as used herein, functions, features or instructions performed or configured to be performed by a “processor”, may include the performance of the functions, features or instructions by a single core processor, may include performance of the functions, features or instructions collectively or collaboratively by multiple cores of a multi-core processor, or may include performance of the functions, features or instructions collectively or collaboratively by multiple processors, where each processor or core is not required to perform every function, feature or instruction individually. The processor may be a CPU (central processing unit). The processor may comprise other types of processors such as a GPU (graphics processing unit). In other aspects of the disclosure, instead of or in addition to a CPU executing instructions that are programmed in the program memory, the processor may be an ASIC (application-specific integrated circuit), analog circuit or other functional logic, such as an FPGA (field-programmable gate array), PAL (programmable array logic) or PLA (programmable logic array).
The CPU is configured to execute programs (also described herein as modules or instructions) stored in a program memory to perform the functionality described herein. The memory may be, but not limited to, RAM (random access memory), ROM (read-only memory) and persistent storage. The memory is any piece of hardware that is capable of storing information, such as, for example without limitation, data, programs, instructions, program code, and/or other suitable information, either on a temporary basis and/or a permanent basis.
The term “treatment,” as used herein, refers to a reduction, attenuation, diminution and/or amelioration of the symptoms of a disease. In some embodiments, an effective treatment for cancer achieves, for example, a shrinking of the mass of a tumor and the number of cancer cells. In some embodiments, a treatment avoids (prevents) and reduces the spread of a disease. In some embodiments, the disease is cancer, and treatment affects cancer metastases and/or the formation thereof. In some embodiments, a treatment is a naive treatment (before any other treatment of a disease had started), or a treatment after the first round of treatment (e.g. after surgery or after a relapse). In some embodiments, a treatment is a combined treatment, involving, for example, chemotherapy, surgery, and/or radiation treatment. In some embodiments, treatment can also modulate auto-immune response, infection and inflammation.
Aryl hydrocarbon receptor (AHR) target gene expression is context-specific, and therefore an AHR activation signature consisting of diverse AHR target genes is required to efficiently detect AHR activation across different cells/tissues and in response to diverse AHR ligands. It is therefore an object of the present disclosure to provide transcriptional AHR activation signatures that enable reliable detection of AHR activation in various human tissues and under different conditions, while maintaining sufficient complexity. Furthermore, additional genes are sought as markers that help to further understand the complex functions of AHR, in particular in the context of diseases and conditions related to AHR.
The present disclosure relates to the generation and uses of an improved set (or “panel”) of biomarkers (also “markers” or “genes”) that are AHR target genes, designated as “AHR biomarkers.” The AHR biomarkers described herein allow one to efficiently determine AHR activation groups and sub-groups, in particular for an improved classification of tumors. As used herein, AHR activation groups are called “AHR activation signatures.” The AHR biomarkers comprise markers that are important in diagnosis and therapy, for example for selecting patients for treatment with AHR activation modulating interventions, and monitoring of therapy response. In some embodiments, the AHR biomarkers are selected from biomarkers listed in Table 1.
An aspect of the present disclosure is directed to methods for determining an AHR signature for a given condition. In some embodiments, the AHR signature for a condition is a subset of biomarkers listed in Table 1.
In some embodiments, the method for determining an AHR activation signature for a condition comprises: (a) providing at least two biological samples of the condition, wherein the at least two biological samples represent at least two different outcomes for the condition; (b) detecting a biological state of each of the AHR biomarkers of Table 1 for the at least two biological samples; (c) categorizing the AHR biomarkers into at least two groups based on the change of biological state of each marker compared to a control; (d) categorizing the at least two groups into at least two subgroups based on at least one functional outcome of AHR signaling; and (e) designating the markers in the at least two subgroups that correlate with the at least two different outcomes as the AHR activation signature for the condition.
In some embodiments, the biological state detected at step (b) is RNA expression. In some embodiments, detecting a biological state comprises measuring levels of the biological state. In some embodiments, RNA expression of a biomarker is detected by methods known in the art including, but not limited to, qPCR, RT-qPCR, RNA-Seq, and in-situ hybridization. In some embodiments, the biological state of all AHR biomarkers listed in Table 1 is detected or measured.
In some embodiments, the categorizing in step (c) is achieved by supervised clustering. In some embodiments, the categorizing in step (c) is achieved by unsupervised clustering. In some embodiments, the clustering method comprises one or more methods including, but not limited to, K-means clustering, hierarchical clustering, principal component analysis and non-negative matrix factorization. In some embodiments, the categorizing in step (c) is achieved by a machine learning algorithm.
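As an illustration of unsupervised clustering in step (c), the following Python sketch applies plain K-means (Lloyd's algorithm) to per-marker profiles of log fold changes. It is a minimal sketch under stated assumptions: the input layout (one vector of log2 fold changes per marker), the deterministic seeding with the first k points, and the function name are all hypothetical choices, not the procedure of the disclosure.

```python
def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; centroids are seeded with the first k points."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        # (squared Euclidean distance).
        labels = [
            min(
                range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical log2 fold changes of four markers across two samples.
profiles = [[2.0, 2.1], [-1.0, -1.2], [1.9, 2.2], [-1.1, -0.9]]
labels, centroids = kmeans(profiles, k=2)
```

In this toy input the first and third markers fall into one cluster (consistently increased) and the second and fourth into the other (consistently decreased). A production analysis would more likely rely on an established statistics library and a principled choice of k.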
In some embodiments, the categorizing in step (c) comprises grouping together AHR biomarkers that display at least 1.5 absolute fold upregulation in the biological state. In some embodiments, the categorizing in step (c) comprises grouping together AHR biomarkers that display at least 2 absolute fold, at least 2.5 absolute fold, at least 3 absolute fold, at least 3.5 absolute fold, at least 4 absolute fold, at least 4.5 absolute fold, or at least 5 absolute fold upregulation in the biological state.
In some embodiments, the categorizing in step (c) comprises grouping together AHR biomarkers that display at least 0.67 absolute fold down-regulation in the biological state. In some embodiments, the categorizing in step (c) comprises grouping together AHR biomarkers that display at least 1 absolute fold, at least 2 absolute fold, at least 2.5 absolute fold, at least 3 absolute fold, at least 3.5 absolute fold, at least 4 absolute fold, at least 4.5 absolute fold, or at least 5 absolute fold down-regulation in the biological state.
In some embodiments, the categorizing in step (d) is achieved by supervised clustering. In some embodiments, the categorizing in step (d) is achieved by unsupervised clustering. In some embodiments, the clustering method comprises one or more methods including, but not limited to, K-means clustering, hierarchical clustering, principal component analysis and non-negative matrix factorization. In some embodiments, the categorizing in step (d) is achieved by a machine learning algorithm.
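The threshold-based grouping described above can be sketched in Python as follows. This is only one possible reading: it assumes that up-regulation means an absolute fold change at or above the upper threshold (1.5 by default) and that down-regulation means an absolute fold change at or below the lower threshold (0.67 by default). The marker names and the function name are hypothetical.

```python
def categorize_by_fold_change(abs_fold_changes, up_threshold=1.5, down_threshold=0.67):
    """Group markers into up-regulated, down-regulated and unchanged sets.

    Assumption: values are absolute fold changes relative to a control,
    so 1.0 means "no change".
    """
    groups = {"up": [], "down": [], "unchanged": []}
    # Sorting by marker name keeps the output order reproducible.
    for marker, afc in sorted(abs_fold_changes.items()):
        if afc >= up_threshold:
            groups["up"].append(marker)
        elif afc <= down_threshold:
            groups["down"].append(marker)
        else:
            groups["unchanged"].append(marker)
    return groups


# Hypothetical measurements for three placeholder markers.
groups = categorize_by_fold_change({"MARKER_A": 2.3, "MARKER_B": 0.5, "MARKER_C": 1.1})
```

Stricter embodiments (for example, at least 2 absolute fold up-regulation) correspond to passing a different `up_threshold`.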
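Where functional annotations are already available for each marker, the sub-grouping of step (d) could in the simplest case be approximated by a lookup rather than by clustering. The Python sketch below shows that simpler alternative; it is not the clustering approach named above, and the annotation table, marker names and function name are hypothetical.

```python
from collections import defaultdict


def subgroup_by_function(groups, annotations):
    """Split each fold-change group into subgroups keyed by functional term.

    `groups` maps a direction ("up"/"down") to marker names; `annotations`
    maps a marker name to its functional terms (e.g., GO terms).
    """
    subgroups = defaultdict(list)
    for direction, markers in groups.items():
        for marker in markers:
            # Markers without an annotation are collected under a placeholder term.
            for term in annotations.get(marker, ["unannotated"]):
                subgroups[(direction, term)].append(marker)
    return dict(subgroups)


# Hypothetical groups and annotations; the terms are functional outcomes
# of the kind named in the definitions.
example_groups = {"up": ["MARKER_A", "MARKER_B"], "down": ["MARKER_C"]}
example_annotations = {
    "MARKER_A": ["angiogenesis"],
    "MARKER_B": ["angiogenesis", "response to drug"],
}
subgroups = subgroup_by_function(example_groups, example_annotations)
```

A marker annotated with several terms appears in several subgroups, which reflects that one gene can contribute to more than one functional outcome.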
In some embodiments, the methods of the present disclosure are used to sub-classify tumors/cancer patients based on molecular characteristics known to affect prognosis and therapy response. To obtain even higher granularity it is important to analyze AHR activity in tumor subgroups with specific clinical characteristics. In some embodiments, the AHR signature and the methods described herein are used to analyze and compare clinically defined subgroups of cancer entities, and correlate AHR activity with clinical outcome.
In some embodiments, the AHR activation signature comprises about 5, about 10, about 20, about 30 of the AHR biomarkers according to Table 1 or at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80% or at least 90% or more or all of the AHR biomarkers according to Table 1.
In some embodiments, the AHR activation signature comprises an AHR signature listed in Table 2.
In some embodiments, the methods of the present disclosure are directed to determining a subset of an AHR activation signature, called “an AHR subsignature,” wherein the AHR subsignature is sufficient to categorize a sample to a specific AHR subgroup within the AHR activation state.
In some embodiments, the AHR subsignature comprises at least one biomarker from the AHR activation signature. In some embodiments, the AHR subsignature comprises biomarkers that are about 10%, about 20%, about 50%, about 60%, about 70%, about 80%, about 90% or all biomarkers from the AHR activation signature. In some embodiments, the AHR subsignature is selected from Table 3. In some embodiments, the AHR subsignature is selected from Table 4.
Another aspect of the disclosure is directed to determining an alternative AHR activation signature based on a second biological state that is different than the first biological state used in determining the AHR activation signature. In some embodiments, an AHR activation signature (a first or primary AHR activation signature) is determined for a condition based on a biological state (e.g., RNA expression) and functional outcome characterization of samples for the condition as described above in Section A. Further, the same samples used in generating the AHR activation signature based on the first biological state (e.g., RNA expression) are subjected to another 'omics analysis including, but not limited to genomics, epigenomics, lipidomics, proteomics, transcriptomics, metabolomics and glycomics analysis. Then, the results of the 'omics analysis are correlated with the groups determined by the first/primary AHR activation signature, thereby identifying an alternative (second/secondary) AHR activation signature. The alternative AHR signature is equivalent to the first AHR activation signature in that it allows determination of AHR activation state and characterization of a given sample (e.g., in terms of the outcome of the condition). In some embodiments, once a first AHR activation signature and a second AHR activation signature are determined/defined for a given condition, either AHR activation signature can be utilized to a) determine the AHR activation state, or b) categorize a sample based on the functional and clinical outcome of the condition. In a specific embodiment, the first AHR activation signature is based on RNA expression, and the second AHR activation signature is based on protein analysis. Alternative AHR signatures are useful for samples where, e.g., RNA amount or quality is not good enough for RNA expression analyses (e.g., paraffin-embedded samples, frozen samples).
Alternative AHR signatures may also lead to development of other diagnostic techniques (e.g., a protein-based assay looking at the alternative AHR signature of a condition based on proteomics).
In some embodiments, an alternative AHR signature is determined based on a second biological state which includes, but is not limited to, one of mutation state, methylation state, copy number, protein expression, metabolite abundance, and enzyme activity. In some embodiments, the second biological state of at least one biomarker is correlated with the at least two subgroups that correlate with the at least two different outcomes. In some embodiments, the second biological state is determined for markers that are not limited to the biomarkers listed in Table 1.
In some embodiments, the alternative AHR signature comprises an alternative AHR signature listed in Table 5. In some embodiments, the alternative AHR signature comprises an alternative AHR signature listed in Table 6.
Another aspect of the instant disclosure is directed to methods for determining the AHR activation state of a biological sample based on a given AHR activation signature specific for a condition. In some embodiments, the biological sample is taken from a subject. In some embodiments, a biological state is determined/measured for each AHR biomarker of the given AHR activation signature.
In some embodiments, the AHR activation signature is a subset of AHR biomarkers listed in Table 1. In some embodiments, the AHR activation signature has been previously determined by one or more methods described in Section A. In some embodiments, the AHR activation signature comprises an AHR signature listed in Table 2.
In some embodiments, the AHR activation signature is an alternative/secondary AHR activation signature. In some embodiments, the alternative/secondary AHR activation signature has been determined by one or more methods described in Section B. In some embodiments, the alternative AHR signature comprises an alternative AHR signature listed in Table 5. In some embodiments, the alternative AHR signature comprises an alternative AHR signature listed in Table 6.
In some embodiments, the biological state of each AHR biomarker is used to perform clustering of the AHR biomarkers into subgroups defined by the AHR activation signature, as described in Section A. In some embodiments, the AHR activation signature comprises an AHR signature listed in Table 2.
In some embodiments, the method further comprises treating the subject with an AHR signaling modulator (also “AHR modulator”). In some embodiments, the AHR signaling modulator is administered every day, every other day, twice a week, once a week or once a month. In some embodiments, the AHR signaling modulator is administered together with other drugs as part of a combination therapy.
In some embodiments, an effective amount of an AHR signaling modulator is about 0.01 mg/kg to 100 mg/kg. In other embodiments, the effective amount of an AHR signaling modulator is about 0.01 mg/kg, 0.05 mg/kg, 0.1 mg/kg, 0.2 mg/kg, 0.5 mg/kg, 1 mg/kg, 5 mg/kg, 8 mg/kg, 10 mg/kg, 15 mg/kg, 20 mg/kg, 30 mg/kg, 40 mg/kg, 50 mg/kg, 60 mg/kg, 70 mg/kg, 80 mg/kg, 90 mg/kg, 100 mg/kg, 150 mg/kg, 175 mg/kg or 200 mg/kg of AHR signaling modulator.
Another aspect of the disclosure relates to a method of treating and/or preventing an AHR-related disease or condition in a cell in a patient in need of said treatment, comprising performing a method according to the present invention, and providing a suitable treatment to said patient, wherein said treatment is based, at least in part, on the results of the method according to the present invention, such as providing a compound as identified or monitoring a treatment comprising the method(s) as described herein.
Another aspect of the present disclosure relates to a diagnostic kit comprising materials for performing a method according to the present invention in one or separate containers, optionally together with auxiliary agents and/or instructions for performing said method.
Another aspect of the instant disclosure is directed to screening for or identifying compounds which modulate AHR activity. Another aspect of the instant disclosure is directed to methods for determining the effects of a compound on AHR activation status of a cell.
In some embodiments, a cell is treated with a candidate compound, and in the cell, a biological state of each AHR biomarker of a given AHR activation signature is determined/measured.
In some embodiments, the AHR signature is specific for a condition.
In some embodiments, the AHR activation signature is a subset of AHR biomarkers listed in Table 1. In some embodiments, the AHR activation signature has been previously determined by one or more methods described in Section A. In some embodiments, the AHR activation signature comprises an AHR signature listed in Table 2.
In some embodiments, the AHR activation signature is an alternative/secondary AHR activation signature. In some embodiments, the alternative/secondary AHR activation signature has been determined by one or more methods described in Section B.
In some embodiments, the alternative AHR signature comprises an alternative AHR signature listed in Table 5. In some embodiments, the alternative AHR signature comprises an alternative AHR signature listed in Table 6.
In some embodiments, the biological state of each AHR biomarker in the biological sample is compared to the biological state of each AHR biomarker in a control sample.
In some embodiments, the biological state of each AHR biomarker is used to perform clustering of the AHR biomarkers into subgroups defined by the AHR activation signature, as described in Section A, and thereby determining the effect of the compound on AHR activation status of the cell, and/or categorizing the compound based on AHR activation status of the cell.
In some embodiments, the processor, the computer-readable storage device or the method of the present disclosure (“the technology described herein”) are applied to discover aryl hydrocarbon receptor (AHR) biomarkers and an AHR activation signature selected from the pool of AHR biomarkers.
Various aspects of the present disclosure may be embodied as a program, software, or computer instructions embodied or stored in a computer or machine usable or readable medium, or a group of media which causes the computer or machine to perform the steps of the method when executed on the computer, processor, and/or machine. A program storage device readable by a machine, e.g., a computer readable medium, tangibly embodying a program of instructions executable by the machine to perform various functionalities and methods described in the present disclosure is also provided.
In some embodiments, the present disclosure includes a system comprising a CPU, a display, a network interface, a user interface, a memory, a program memory and a working memory (
In some embodiments, a processor is programmed to perform:
(i) comparing biological states of aryl hydrocarbon receptor (AHR) biomarkers from at least two samples with known outcomes with biological states of the AHR biomarkers from a control sample;
(ii) categorizing the at least two samples into at least two groups based on the comparison in step (i);
(iii) categorizing the result of step (ii) into at least two subgroups based on at least one functional outcome; and
(iv) identifying AHR biomarkers that correlate with the known outcomes.
In some embodiments, a computer-readable storage device comprises instructions to perform:
(i) comparing biological states of aryl hydrocarbon receptor (AHR) biomarkers from at least two samples with known outcomes with biological states of the AHR biomarkers from a control sample;
(ii) categorizing the at least two samples into at least two groups based on the comparison in step (i);
(iii) categorizing the result of step (ii) into at least two subgroups based on at least one functional outcome; and
(iv) identifying AHR biomarkers that correlate with the known outcomes.
In some embodiments, a processor is programmed to perform:
(i) comparing biological states of aryl hydrocarbon receptor (AHR) biomarkers of an AHR activation signature from a sample with biological states of the AHR biomarkers of the AHR activation signature from a control sample, wherein the AHR activation signature is specific for a condition;
(ii) categorizing the sample into a group based on the comparison in step (i);
(iii) categorizing the result of step (ii) into a subgroup based on at least one functional outcome; and
(iv) determining AHR activation state of the sample.
In some embodiments, a computer-readable storage device comprises instructions to perform:
(i) comparing biological states of aryl hydrocarbon receptor (AHR) biomarkers of an AHR activation signature from a sample with biological states of the AHR biomarkers of the AHR activation signature from a control sample, wherein the AHR activation signature is specific for a condition;
(ii) categorizing the sample into a group based on the comparison in step (i);
(iii) categorizing the result of step (ii) into a subgroup based on at least one functional outcome; and
(iv) determining AHR activation state of the sample.
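By way of non-limiting illustration, steps (i) to (iv) above may be sketched as follows. The sketch assumes that the biological state is a positive expression value per biomarker, that the comparison in step (i) is a log2 ratio against the control sample, that the group in step (ii) is derived from the mean log2 ratio, and that the subgroup in step (iii) is the functional category whose biomarkers change most. The function name `activation_state`, the threshold and the biomarker and category names are hypothetical and are not taken from the disclosure.

```python
import math

def activation_state(sample, control, functions, threshold=math.log2(1.5)):
    """Classify one sample against a control (illustrative assumptions only).

    sample, control: dict biomarker -> positive expression value.
    functions: dict functional category -> list of biomarkers (hypothetical mapping).
    """
    # Step (i): compare each signature biomarker with the control sample.
    log_ratios = {g: math.log2(sample[g] / control[g]) for g in sample}
    score = sum(log_ratios.values()) / len(log_ratios)

    # Step (ii): categorize the sample into a group based on the comparison.
    if score >= threshold:
        group = "activated"
    elif score <= -threshold:
        group = "repressed"
    else:
        group = "unchanged"

    # Step (iii): subgroup by the functional category with the largest mean change.
    per_function = {}
    for name, genes in functions.items():
        values = [abs(log_ratios[g]) for g in genes if g in log_ratios]
        if values:
            per_function[name] = sum(values) / len(values)
    subgroup = max(per_function, key=per_function.get) if per_function else None

    # Step (iv): report the activation state.
    return group, subgroup, score
```

The mean log2 ratio is only one possible summary; a classifier or gene set test could be substituted without changing the overall sequence of steps.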
In some embodiments, the disclosure is directed to a method for determining AHR activation signature for a biological sample, comprising detecting at least one biological state of at least one AHR biomarker according to Table 1 for said sample, identifying a change of said biological state of said at least one AHR biomarker compared to a house keeping gene or control biomarker, and assigning said at least one AHR biomarker to said AHR activation signature for said biological sample, if said at least one biomarker provides a significance of said AHR activation signature of p<0.05 at a minimal number of markers in the signature and/or a fold of change of said AHR activation signature of at least about 1.5 at a minimal number of markers in the signature in the case of up-regulation or of at least about 0.67 at a minimal number of markers in the signature in the case of down-regulation. The method can be in vivo or in vitro, including that the exposure of the cells/samples to AHR modulators could be from external sources, applied directly to the cells or as a result of an endogenous modulator that affects AHR activation both directly or indirectly.
In some embodiments, a housekeeping gene refers to a constitutive gene that is expressed in all cells of the biological sample to be analyzed. Usually housekeeping genes are selected by the person of skill based on their requirement for the maintenance of basic cellular function in the cells of the sample as analyzed under normal, and patho-physiological conditions (if present in the context of the analysis). Examples of housekeeping genes are known to the person of skill, and may involve the ones as disclosed, e.g. in Eisenberg E, Levanon E Y (October 2013). “Human housekeeping genes, revisited”. Trends in Genetics. 29 (10): 569-574.
An aspect of the method according to the present disclosure further involves a step of identifying at least one suitable housekeeping gene and/or at least one suitable control biomarker for the sample to be analyzed, comprising detecting the expression and/or biological function of a potentially suitable housekeeping gene and/or control biomarker in said sample, and identifying said housekeeping gene and/or control biomarker as suitable, if said expression and/or biological function does not change or substantially change over time, when compared to the markers of the respective AHR signature as analyzed (control biomarker). Another suitable marker is the non-mutated version of a marker of the respective AHR signature as analyzed. Therefore, control biomarkers can be markers independent from the AHR signature or be part of the signature itself (particularly in case of mutations).
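By way of non-limiting illustration, the selection of a suitable housekeeping gene or control biomarker may be sketched as follows. The sketch assumes that "does not change or substantially change over time" is assessed by the coefficient of variation of expression across time points; this criterion, the cutoff of 0.1 and the name `stable_controls` are hypothetical choices made for the example.

```python
from statistics import mean, pstdev

def stable_controls(time_courses, max_cv=0.1):
    """Return candidate genes whose expression is stable over time.

    time_courses: dict gene -> list of positive expression values, one per time point.
    max_cv: assumed upper bound on the coefficient of variation (std / mean).
    """
    stable = []
    for gene, values in time_courses.items():
        cv = pstdev(values) / mean(values)
        if cv <= max_cv:
            stable.append(gene)
    return stable
```

A candidate passing this filter would then serve as the reference against which changes of the signature biomarkers are expressed.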
In some embodiments, the biological state as detected is selected from mutations, nucleic acid methylation, copy numbers, expression, amount of protein, metabolite, and activity of said at least one AHR biomarker.
In some embodiments, the at least one AHR biomarker is then assigned to said AHR activation signature for said biological sample. For this, in one embodiment the marker must show an absolute fold of change of said AHR activation signature of at least about 1.5 at a minimal number of markers in the signature in the case of up-regulation or of at least about 0.67 at a minimal number of markers in the signature in the case of down-regulation. Thus, a panel is created that contains as few as possible markers (i.e. 1, 2, 3, etc.) based on the most “prominent” changes as identified. This embodiment is particularly useful in cases where only a few markers are selected, e.g. in the context of a kit of markers and/or a point of care test, without the necessity of substantial machinery and equipment. In some embodiments, the absolute fold of change of said AHR activation signature is at least about 1.5, at least about 1.8, at least about 2, and at least about 3 or more in the case of up-regulation, or wherein said absolute fold of change of said AHR activation signature is at least about 0.67, at least about 0.57, at least about 0.25 or more in the case of down regulation.
In some embodiments, the AHR activation signature provides a significance of p<0.05, p<0.01, p<0.001, or p<0.0001 or at least an absolute fold of change of said AHR activation signature of at least about 1.5 in case of up-regulation or at least an absolute fold change of at least about 0.67 in the case of down regulation at a minimal number of markers in the signature.
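By way of non-limiting illustration, the significance and fold-change criteria above may be applied per biomarker as sketched below. The sketch assumes that fold changes are expressed on a linear scale relative to the housekeeping gene or control biomarker, and reads the down-regulation criterion as a ratio of about 0.67 or lower. The name `assign_to_signature`, the `require_both` flag (which switches between the "and" and "or" readings of "and/or") and the input layout are hypothetical.

```python
def assign_to_signature(stats, p_cutoff=0.05, up_cutoff=1.5,
                        down_cutoff=0.67, require_both=False):
    """Select biomarkers by significance and/or fold change.

    stats: dict biomarker -> (fold_change, p_value), fold change on a linear scale.
    Returns dict biomarker -> "up" or "down" for the biomarkers that are kept.
    """
    selected = {}
    for name, (fold_change, p_value) in stats.items():
        significant = p_value < p_cutoff
        changed = fold_change >= up_cutoff or fold_change <= down_cutoff
        keep = (significant and changed) if require_both else (significant or changed)
        if keep:
            selected[name] = "up" if fold_change >= 1 else "down"
    return selected
```

Tightening `p_cutoff` to 0.01, 0.001 or 0.0001, or raising `up_cutoff` to 1.8, 2 or 3, yields progressively smaller panels, in line with the goal of a minimal number of markers.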
In some embodiments, the AHR activation signature comprises about 5, about 10, about 20, about 30 of said AHR biomarkers according to Table 1 or at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80% or at least 90% or more or all of said AHR biomarkers according to Table 1.
In some embodiments, the AHR activation signature is identified in a sample under physiological conditions or under disease conditions, for example, in biological safety screenings, toxicology studies, cancer, autoimmune disorders, degeneration, inflammation and infection, or under stress conditions, for example, biological, mechanical and environmental stresses.
In some embodiments, the method further comprises the step of using the AHR activation signature for unsupervised clustering or supervised classification of the samples into AHR activation subgroups.
In some embodiments, the method further comprises a step of using an AHR activation signature for unsupervised clustering or supervised classification of said samples into AHR activation subgroups. Respective methods are known to the person of skill, for example K-means clustering, hierarchical clustering, principal component analysis and non-negative matrix factorization. Clustering of the biomarkers will depend on the sample and the circumstances to be analyzed, and may be based on the biological function of the biomarkers, and/or the respective functional subgroup of the AHR signature or other groups of interest, e.g., the signaling pathway or network. The AHR signature as established is also capable of detecting AHR activation across different cell/tissue types and in response to diverse ligands. Using the AHR signature, it is possible to determine AHR activation sub-groups by unsupervised clustering methods, which can be utilized for classification of samples. This is important, for example, in terms of selecting patients for treatment with AHR activation modulating interventions, and monitoring of therapy response.
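By way of non-limiting illustration, unsupervised clustering of samples into AHR activation subgroups may be sketched with a plain K-means procedure. The sketch assumes that each sample is represented by a vector of normalized values for the signature biomarkers. The deterministic initialization from the first k samples is a simplification for the example, and the name `kmeans` and the example values are hypothetical.

```python
import math

def kmeans(samples, k, iterations=100):
    """Plain K-means on equal-length numeric vectors; returns one label per sample."""
    # Simplified, deterministic start: use the first k samples as centroids.
    centroids = [list(s) for s in samples[:k]]
    labels = None
    for _ in range(iterations):
        # Assignment step: attach each sample to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(s, centroids[c]))
            for s in samples
        ]
        if new_labels == labels:
            break  # assignments no longer change
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [s for s, lab in zip(samples, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Four hypothetical samples described by three signature biomarkers each.
profiles = [[2.0, 1.8, 2.2], [0.1, -0.2, 0.0], [1.9, 2.1, 2.0], [0.0, 0.2, -0.1]]
subgroups = kmeans(profiles, k=2)
```

In practice the number of subgroups and the initialization would be chosen with care, and hierarchical clustering or non-negative matrix factorization could be used instead.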
In some embodiments, the AHR activation signature or AHR activation subgroups are further used to define AHR activation modulated functions, for example, angiogenesis, drug metabolism, external stress response, hemopoiesis, lipid metabolism, cell motility, and immune modulation.
In another aspect, the disclosure is directed to a method for monitoring AHR activation in a biological sample in response to at least one compound, comprising performing the method for determining AHR activation signature on samples that have been obtained during the course of contacting said sample with at least one pharmaceutically active compound, toxin or other modulator compound, wherein said modulator is preferably selected from an inhibitor or an agonist of said biological state.
In some embodiments, the method for monitoring AHR activation in a biological sample in response to at least one modulator compound comprises performing the method according to the present invention on biological samples that have been obtained during the course of contacting said sample with at least one modulator. The modulator compound can be directly applied to the sample in vitro or through different routes of administration, for example, parenteral preparations, ingestion, topical application, vaccines, i.v., or others, wherein a change in the AHR activation in the presence of said at least one compound compared to the absence of said at least one compound indicates an effect of said at least one compound on said AHR activation. In some embodiments, this modulator can be used in additional steps of the method where a classifier is used, or activation is evaluated based on the signature compared to housekeeping genes or control biomarkers as disclosed herein.
In some embodiments, the uses of the AHR-signature also include a method for monitoring an AHR-related disease or condition or function or effect in a cell, comprising performing a method according to the present invention, providing at least one modulator compound to said cell and detecting the change in at least one biological state of the genes of the AHR-signature in said cell in response to said at least one compound, wherein a change in the at least one biological state of the genes of said signature in the presence of said at least one compound compared to the absence of said at least one compound indicates an effect of said at least one compound on said AHR-related disease or condition or function or effect.
In some embodiments, the present disclosure relates to a method for screening for a modulator compound of AHR activation genes, comprising performing the method according to the present invention, and further comprising contacting at least one candidate modulator compound with said biological sample, wherein a change in the biological state of said at least one AHR biomarker of said signature in the presence of said at least one compound compared to the absence of said at least one compound identifies a modulator. The modulator compound of AHR activation genes can modulate said genes directly or indirectly, i.e., by acting on AHR directly, or indirectly by acting on a signaling pathway upstream of the AHR marker.
In some embodiments, the present disclosure relates to an in-vitro method for screening for a modulator of the expression of AHR-regulated genes, comprising contacting a cell with at least one candidate modulator compound, and detecting at least one of mutations, nucleic acid methylation, copy numbers, expression, amount of protein, metabolites and activity of said genes of the AHR-signature according to Table 1, wherein a change as detected of about 5, about 10, about 20, about 30 of said AHR biomarkers according to Table 1 or at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80% or at least 90% or more or all of said AHR biomarkers according to Table 1 in the presence of said at least one compound compared to the absence of said at least one compound identifies a modulator. This modulator in preferred embodiments can be used in additional steps of the method where a classifier is used or activation is evaluated based on the signature compared to housekeeping genes or control biomarkers as disclosed herein.
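By way of non-limiting illustration, the screening criterion above (a detected change in a given proportion of the signature biomarkers in the presence versus the absence of a candidate compound) may be sketched as follows. The sketch assumes positive expression values and counts a biomarker as changed when its treated-to-untreated ratio is at least 1.5 or at most 0.67, thresholds borrowed from the fold-change criteria stated elsewhere herein. The names `changed_biomarkers` and `is_modulator` are hypothetical.

```python
def changed_biomarkers(treated, untreated, up_cutoff=1.5, down_cutoff=0.67):
    """Return (list of changed biomarkers, fraction of the signature that changed)."""
    changed = []
    for name, baseline in untreated.items():
        ratio = treated[name] / baseline
        if ratio >= up_cutoff or ratio <= down_cutoff:
            changed.append(name)
    return changed, len(changed) / len(untreated)

def is_modulator(treated, untreated, min_fraction=0.10):
    """Flag a candidate compound when at least min_fraction of biomarkers change."""
    _, fraction = changed_biomarkers(treated, untreated)
    return fraction >= min_fraction
```

Setting `min_fraction` to 0.2, 0.3 and so on corresponds to the stricter percentages recited above.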
In another aspect, the present disclosure relates to a method for testing the biological safety of a compound, comprising performing a method according to the present invention, and further comprising the step of concluding on the safety of said compound based on said effect as identified. Because of the known relation of AHR to toxic compounds, another advantageous use is a method for testing the biological safety of a compound, comprising performing a method according to the present invention, and further comprising the step of concluding on the safety of said compound based on said effect as identified.
Another aspect of the present invention then relates to a method for producing a pharmaceutical preparation, wherein said compound/modulator as identified (screened) is further formulated into a pharmaceutical preparation by admixing said (at least one) compound as identified (screened) with a pharmaceutically acceptable carrier. Pharmaceutical preparations can be preferably present in the form of injectables, tablets, capsules, syrups, elixirs, ointments, creams, patches, implants, aerosols, sprays and suppositories (rectal, vaginal and urethral). Another aspect of the present invention then relates to a pharmaceutical preparation as prepared according to the invention.
Another aspect of the disclosure relates to the use of at least one biomarker or a set/panel of biomarkers of about 5, about 10, about 20, about 30 of said AHR biomarkers according to Table 1 or at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80% or at least 90% or more of the genes according to Table 1 for monitoring AHR activation in a biological sample according to the present invention, or for screening for a modulator of AHR activation genes according to the present invention, or for testing the biological safety according to the present invention or for a diagnosis according to the present invention.
In another aspect, the disclosure is directed to a method for screening for a modulator of AHR activation genes, comprising performing the method for determining AHR activation signature, and further comprising contacting at least one candidate modulator compound with said biological sample or modulating the levels of at least one candidate modulator with said biological sample, wherein a change in the biological state of said at least one AHR biomarker of said signature in the presence of said at least one compound compared to the absence of said at least one compound identifies a modulator, wherein said modulator is selected from an inhibitor or an agonist of said biological state.
In some embodiments, the modulator is selected from TCDD, FICZ, Kyn, SR1, CH223191, a proteinaceous AHR binding domain, a small molecule, a peptide, a mutated version of a protein, for example an intracellular or recombinantly introduced protein, and a library of said compounds, environmental substances, probiotics, toxins, aerosols, medicines, nutrients, galenic compositions, plant extracts, volatile compounds, homeopathic substances, incense, pharmaceutical drugs, vaccines, i.v. compounds or compound mixtures derived from organisms for example animals, plants, fungi, bacteria, archaea, chemical compounds, and compounds used in food or cosmetic industry.
In some embodiments, the at least one biological state of said at least one AHR biomarker according to Table 1 for said sample is detected using a high-throughput method.
In the methods of the present invention, in general the biomarkers can be detected and/or determined using any suitable assay. Detection is usually directed at the qualitative information (“marker yes-no”), whereas determining involves analysis of the quantity of a marker (e.g. expression level and/or activity). Detection is also directed at identifying mutations that cause altered functions of individual markers. The choice of the assay(s) depends on the parameter of the marker to be determined and/or the detection process.
Thus, the determining and/or detecting can preferably comprise a method selected from subtractive hybridization, microarray analysis, DNA sequencing, qPCR, ELISA, enzymatic activity tests, cell viability assays, for example an MTT assay, phosphoreceptor tyrosine kinase assays, phospho-MAPK arrays and proliferation assays, for example the BrdU assay, proteomics, HPLC and mass spectrometry.
In some embodiments, the methods of the instant disclosure are also amenable to automation, and said activity and/or expression is preferably assessed in an automated and/or high-throughput format. In some embodiments, this involves the use of chips and respective machinery, such as robots.
Another aspect of the present disclosure is directed to a diagnostic kit comprising materials for performing a method according to this disclosure in one or separate containers. In some embodiments, the kit further comprises auxiliary agents and/or instructions for performing said method. The kit may comprise the panel of biomarkers as identified herein or respective advantageous marker sub-panels as discussed herein. Furthermore, included can be dyes, biomarker-specific antibody, and oligos, e.g. for PCR-assays.
In some embodiments, the present disclosure is directed to a panel of biomarkers identified by a method according to the methods of this disclosure. In some embodiments, the present disclosure is directed to use of the panel of biomarkers for monitoring AHR activation in a biological sample, or for screening for a modulator of AHR activation genes.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one skilled in the art to which this invention belongs. Although any methods and materials similar or equivalent to those described herein can also be used in the practice or testing of the present invention, the preferred methods and materials are now described. All publications mentioned herein are incorporated herein by reference to disclose and describe the methods and/or materials in connection with which the publications are cited.
The following embodiments are part of the invention:
1. A method for determining AHR activation signature for a biological sample, comprising detecting at least one biological state of at least one AHR biomarker according to table 1 for said sample, identifying a change of said biological state of said at least one AHR biomarker compared to a house keeping gene or control biomarker, and assigning said at least one AHR biomarker to said AHR activation signature for said biological sample, if said at least one biomarker provides a significance of said AHR activation signature of p<0.05 at a minimal number of markers in the signature and/or a fold of change of said AHR activation signature of at least about 1.5 at a minimal number of markers in the signature in the case of up-regulation or of at least about 0.67 at a minimal number of markers in the signature in the case of down-regulation.
2. The method according to embodiment 1, wherein said biological sample is selected from a sample comprising biological fluids comprising biomarkers, human cells, tissues, whole blood, cell lines, primary cells, IPCs, hybridomas, recombinant cells, stem cells, and cancer cells, bone cells, cartilage cells, nerve cells, glial cells, epithelial cells, skin cells, scalp cells, lung cells, mucosal cells, muscle cells, skeletal muscle cells, striated muscle cells, smooth muscle cells, heart cells, secretory cells, adipose cells, blood cells, erythrocytes, basophils, eosinophils, monocytes, lymphocytes, T-cells, B-cells, neutrophils, NK cells, regulatory T-cells, dendritic cells, Th17 cells, Th1 cells, Th2 cells, myeloid cells, macrophages, monocyte derived stromal cells, bone marrow cells, spleen cells, thymus cells, pancreatic cells, oocytes, sperm, kidney cells, fibroblasts, intestinal cells, cells of the female or male reproductive tracts, prostate cells, bladder cells, eye cells, corneal cells, retinal cells, sensory cells, keratinocytes, hepatic cells, brain cells, kidney cells, and colon cells, and the transformed counterparts of said cell types thereof.
3. The method according to embodiment 1 or 2, wherein said biological state as detected is selected from mutations, nucleic acid methylation, copy numbers, expression, amount of protein, metabolite, and activity of said at least one AHR biomarker.
4. The method according to any one of embodiments 1 to 3, wherein said AHR activation signature provides a significance of p<0.05, preferably of p<0.01, and more preferably of p<0.001, and more preferably p<0.0001 or at least an absolute fold of change of said AHR activation signature of at least about 1.5 in case of up-regulation or at least an absolute fold change of at least about 0.67 in the case of down regulation at a minimal number of markers in the signature.
5. The method according to any one of embodiments 1 to 4, wherein said AHR activation signature comprises about 5, about 10, about 20, about 30 of said AHR biomarkers according to table 1 or at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80% or at least 90% or more or all of said AHR biomarkers according to table 1.
6. The method according to any one of embodiments 1 to 5, wherein said AHR activation signature is identified in a sample under physiological conditions or under disease conditions, for example, in biological safety screenings, toxicology studies, cancer, autoimmune disorders, degeneration, inflammation and infection, or under stress conditions, for example, biological, mechanical and environmental stresses.
7. The method according to any one of embodiments 1 to 6, wherein said method further comprises the step of using said AHR activation signature for unsupervised clustering or supervised classification of said samples into AHR activation subgroups.
8. The method according to any one of embodiments 1 to 7, wherein said AHR activation signature or AHR activation subgroups are further used to define AHR activation modulated functions, for example, angiogenesis, drug metabolism, external stress response, hemopoiesis, lipid metabolism, cell motility, and immune modulation.
9. A method for monitoring AHR activation in a biological sample in response to at least one compound, comprising performing the method according to any one of embodiments 1 to 8 on samples that have been obtained during the course of contacting said sample with at least one pharmaceutically active compound, toxin or other modulator compound, wherein said modulator is preferably selected from an inhibitor or an agonist of said biological state.
10. A method for screening for a modulator of AHR activation genes, comprising performing the method according to any one of embodiments 1 to 8, and further comprising contacting at least one candidate modulator compound with said biological sample or modulating the levels of at least one candidate modulator with said biological sample, wherein a change in the biological state of said at least one AHR biomarker of said signature in the presence of said at least one compound compared to the absence of said at least one compound identifies a modulator, wherein said modulator is preferably selected from an inhibitor or an agonist of said biological state.
11. The method according to any one of embodiments 9 to 10, wherein said modulator is selected from TCDD, FICZ, Kyn, SR1, CH223191, a proteinaceous AHR binding domain, a small molecule, a peptide, a mutated version of a protein, for example an intracellular or recombinantly introduced protein, and a library of said compounds, antibodies, environmental substances, probiotics, toxins, aerosols, medicines, nutrients, galenic compositions, plant extracts, volatile compounds, homeopathic substances, incense, pharmaceutical drugs, vaccines, i.v., compounds or compound mixtures derived from organisms for example animals, plants, fungi, bacteria, archaea, chemical compounds, and compounds used in food or cosmetic industry.
12. The method according to any one of embodiments 1 to 11, wherein said at least one biological state of said at least one AHR biomarker according to table 1 for said sample is detected using a high-throughput method.
13. A diagnostic kit comprising materials for performing a method according to any one of embodiments 1 to 12 in one or separate containers, optionally together with auxiliary agents and/or instructions for performing said method.
14. A panel of biomarkers identified by a method according to any one of embodiments 1 to 8.
15. Use of a panel of biomarkers according to embodiment 14 for monitoring AHR activation in a biological sample according to embodiment 9, or for screening for a modulator of AHR activation genes according to any one of embodiments 10 to 12.
The specific examples listed below are only illustrative and by no means limiting.
First, existing datasets for different AHR activation or inhibition conditions were identified in the GEO database (Edgar R. et al., Nucleic Acids Res.; 2002: 30(1):207-10). The search was performed using an in-house tool using several keywords. The list of datasets was manually curated and a cutoff for differentially expressed genes was set at a log2 fold change of 0.3 (and an adjusted p-value threshold of 0.05). In addition, AHR targets were retrieved from the Transcription Factor Target Gene Database (Plaisier C L, et al. Causal Mechanistic Regulatory Network for Glioblastoma Deciphered Using Systems Genetics Network Analysis. Cell Syst. 2016 August; 3(2):172-86) and merged with the gene list curated from the GEO search.
A semantic analysis was carried out to correctly identify the appearance of gene names including AHR in given freely available free texts with GeNo (Wermter, J., Tomanek, K. & Hahn, U. High-performance gene name normalization with GeNo. Bioinformatics 25, 815-821 (2009)) and gene interactions, called events, using BioSem (Bui, Q.-C. & Sloot, P. M. A.: A robust approach to extract biomedical events from literature.; Bioinformatics; 28, 2654-2661 (2012)). The output of BioSem was then stored in an ElasticSearch index (Elastic webpage). From this index, event items referencing AHR as an interaction member with a regulation event were selected. Results were manually curated to obtain the final list of literature mentioning AHR associated interaction events. Human orthologues were used to replace mouse genes in the NLP search results. The gene annotations of both text mining and dataset searches results were harmonized by cross referencing with the accepted HGNC symbols (HGNC website) as per the hg38 reference. Genes overlapping between the two lists were used to constitute the core AHR activation signature consisting of 166 genes (Table 1).
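By way of non-limiting illustration, the assembly of the core signature from the two gene lists may be sketched as follows. The sketch assumes that differential expression results are available per gene as a log2 fold change and an adjusted p-value, applies the cutoffs to the absolute log2 fold change, merges the result with transcription factor target genes, maps mouse symbols to human orthologues in the text-mining list, and takes the overlap. All names, the ortholog mapping and the example symbols are hypothetical, and HGNC harmonization is reduced here to a simple lookup.

```python
def significant_genes(results, min_abs_log2fc=0.3, max_adj_p=0.05):
    """Genes passing the fold-change and adjusted p-value cutoffs.

    results: dict gene -> (log2_fold_change, adjusted_p_value).
    """
    return {
        gene
        for gene, (log2fc, adj_p) in results.items()
        if abs(log2fc) >= min_abs_log2fc and adj_p < max_adj_p
    }

def core_signature(dataset_genes, tf_target_genes, text_mining_genes,
                   mouse_to_human=None):
    """Overlap of the dataset-derived list and the text-mining list."""
    mouse_to_human = mouse_to_human or {}
    # Merge the dataset-derived genes with the transcription factor targets.
    dataset_list = set(dataset_genes) | set(tf_target_genes)
    # Replace mouse genes by their human orthologues in the text-mining results.
    literature_list = {mouse_to_human.get(g, g) for g in text_mining_genes}
    return dataset_list & literature_list
```

Whether a gene lying exactly on a cutoff is retained is not specified in the text; the comparison operators above are one possible reading.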
Gene ontology analysis of the core AHR activation signature was performed using the clusterProfiler package (Yu, Guangchuang, et al. 2012. “ClusterProfiler: An R Package for Comparing Biological Themes Among Gene Clusters.” OMICS: A Journal of Integrative Biology 16 (5):284-87), applying the method described by Boyle et al. (2004) (Boyle, Elizabeth I. et al. 2004. “GO::TermFinder-open Source Software for Accessing Gene Ontology Information and Finding Significantly Enriched Gene Ontology Terms Associated with a List of Genes.” Bioinformatics (Oxford, England) 20 (18):3710-5). Bonferroni correction was used to control for multiple testing and a p-value cutoff of 0.01 was used for selecting enriched ontology terms. The semantic similarity algorithm GOSemSim (Yu, Guangchuang, et al. 2010. “GOSemSim: An R Package for Measuring Semantic Similarity Among GO Terms and Gene Products.” Bioinformatics 26 (7):976-78) was used for grouping of ontology terms, followed by filtering of higher-level/general ontology terms. The remaining ontology terms were categorized into eight groups descriptive of AHR activation mediated biological processes.
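By way of non-limiting illustration, an over-representation test with Bonferroni correction of the kind referred to above may be sketched as follows. The sketch assumes a one-sided hypergeometric test per ontology term and a correction factor equal to the number of terms tested. The function names and the term-to-gene mapping are hypothetical, and the sketch does not reproduce the clusterProfiler implementation.

```python
from math import comb

def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K carry the annotation."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

def enriched_terms(signature, term_to_genes, background, cutoff=0.01):
    """Return dict term -> Bonferroni-adjusted p-value for terms below the cutoff."""
    background = set(background)
    signature = set(signature) & background
    n_tests = len(term_to_genes)
    enriched = {}
    for term, genes in term_to_genes.items():
        annotated = set(genes) & background
        overlap = len(signature & annotated)
        p = hypergeom_pvalue(overlap, len(signature), len(annotated), len(background))
        p_adj = min(1.0, p * n_tests)  # Bonferroni correction
        if p_adj < cutoff:
            enriched[term] = p_adj
    return enriched
```

Grouping of the enriched terms by semantic similarity would follow as a separate step and is not shown.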
Additional datasets, not used in defining the AHR biomarker set of Table 1, were used for validation (
Array datasets—The Affymetrix microarray chips “human gene 2.0 ST” were analyzed using the oligo package and annotated using NetAffx (Carvalho, B.; et al. Exploration, Normalization, and Genotype Calls of High Density Oligonucleotide SNP Array Data. Biostatistics, 2006). Other Affymetrix chips were analyzed using the Affy and Affycoretools packages. Raw CEL files were imported from disk or downloaded from Gene Expression Omnibus (GEO) using GEOquery (Davis S, Meltzer P (2007). “GEOquery: a bridge between the Gene Expression Omnibus (GEO) and BioConductor.” Bioinformatics, 14, 1846-1847), followed by RMA normalization and summarization. Illumina and Agilent array datasets were analyzed using lumi (Du, P., Kibbe, W. A. and Lin, S. M., (2008) ‘lumi: a pipeline for processing Illumina microarray’, Bioinformatics 24(13):1547-1548; and Lin, S. M., Du, P., Kibbe, W. A., (2008) ‘Model-based Variance-stabilizing Transformation for Illumina Microarray Data’, Nucleic Acids Res. 36, e11) and limma (Ritchie, M E, et al. (2015). limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Research 43(7), e47).
RNA-seq datasets—Raw counts and metadata were downloaded from GEO using GEOquery and saved as a DGElist (Robinson, M D, McCarthy, D J, Smyth, G K (2010). edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139-140). The harmonized HT-Seq counts of TCGA datasets were downloaded using TCGAbiolinks (Colaprico A, et al. (2015). “TCGAbiolinks: An R/Bioconductor package for integrative analysis of TCGA data.” Nucleic Acids Research. doi: 10.1093/nar/gkv1507) from GDC (the NIH GDC website), and only patients with the identifier “primary solid tumor” were retained, with the exception of melanoma that was split into datasets for primary and advanced melanoma cohorts. Genes with less than 10 counts were filtered followed by TMM normalization (Robinson, M D, and Oshlack, A (2010). A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biology 11, R25) and variance modelling using voom (Robinson, M D, and Oshlack, A (2010). A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biology 11, R25).
RPPA datasets—Level 4 standardized data was downloaded from The Cancer Proteome Atlas (TCPA) (the TCPA website). The patient datasets were reduced to the overlap between both RPPA and RNAseq data sets.
The eBayes adjusted moderated t-statistic was applied for differential gene expression using limma (Ritchie, M E, et al. (2015). limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Research 43(7), e47) and limma-trend (Phipson, B, et al. (2016). Robust hyperparameter estimation protects against hypervariable genes and improves power to detect differential expression. Annals of Applied Statistics 10(2), 946-963) or the limma RNA-seq pipeline (Ritchie, M E, et al. (2015). limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Research 43(7), e47). Batch effects, when present, were accounted for in the linear regression. Gene set testing of AHR activation was performed using roast (Wu, D., et al. (2010). ROAST: rotation gene set tests for complex microarray experiments. Bioinformatics 26, 2176-2182).
Association of AHR Activation with Patient Groups of Median Separated Enzyme Expression
To assess the association of AHR activation with Trp degrading enzymes, TCGA patients were divided at the median into groups of high or low expression of IDO1 or TDO2, and differential gene expression and gene set testing were conducted as described above.
Using the AHR signature, the single sample gene set enrichment score was estimated using the GSVA package (Hänzelmann S, Castelo R, Guinney J (2013). “GSVA: gene set variation analysis for microarray and RNA-Seq data.” BMC Bioinformatics, 14, 7); the inventors refer to this score as the AHR activation score. This score is used for defining gene co-expression networks representing AHR functional outcomes, and for comparing the status of AHR modulation in patients of different clinical subtypes.
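As a non-limiting illustration, the median separation described above can be sketched as follows. The patient identifiers and expression values are hypothetical, and assigning values equal to the median to the low group is an assumption, since the tie-handling rule is not specified.

```python
from statistics import median


def split_by_median(expression):
    """Split patients into high and low groups by the median expression
    of one enzyme (for example IDO1 or TDO2).

    Assumption: values equal to the median are placed in the low group.
    """
    cutoff = median(expression.values())
    high = sorted(p for p, v in expression.items() if v > cutoff)
    low = sorted(p for p, v in expression.items() if v <= cutoff)
    return high, low


# Hypothetical normalized expression values for four patients.
enzyme_expression = {"P1": 1.0, "P2": 5.0, "P3": 3.0, "P4": 7.0}
high_group, low_group = split_by_median(enzyme_expression)
```

The two resulting groups would then serve as the contrast for the differential expression and gene set tests.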
Gene Correlation Networks Associated with AHR Activation
The normalized and voomed DGEList of publicly available GEO data was used for weighted gene co-expression network analysis (WGCNA) (Langfelder P, Horvath S (2008) WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 2008, 9:559). Soft thresholds were estimated for signed hybrid networks in single block settings. Adjacency and topological overlap matrices were calculated using bi-correlation matrices, and eigengenes representing the first principal component of each module were returned. WGCNA modules associated with AHR activation were selected by performing a global test (Goeman, J. J., van de Geer, S. A., de Kort, F., and van Houwelingen, J. C. (2004). A global test for groups of genes: testing association with a clinical outcome. Bioinformatics, 20(1):93-99; Goeman, J. J., van de Geer, S. A., and van Houwelingen, J. C. (2006). Testing against a high-dimensional alternative. Journal of the Royal Statistical Society Series B Statistical Methodology, 68(3):477-493; and Goeman, J. and Finos, L. (2012). The inheritance procedure: multiple testing of tree-structured hypotheses. Statistical Applications in Genetics and Molecular Biology, 11(1):1-18) using the AHR activation score as the response and the WGCNA modules as model predictors. Additionally, using Pearson correlation, as implemented in the Hmisc package (Harrell Miscellaneous webpage from R Archive Network), AHR activation scores were correlated with WGCNA modules. Modules with a p-value of 0.05 or less in both the global test and the Pearson correlation were selected as the AHR associated modules, regardless of the direction of association, i.e. both positively and negatively associated modules were retained if they satisfied the p-value cutoff in both tests.
K-means consensus clustering (Monti, S., et al. (2003); Machine Learning, 52, 91-118; and Wilkerson, M. D., Hayes, D. N. (2010). “ConsensusClusterPlus: a class discovery tool with confidence assessments and item tracking.” Bioinformatics, 26(12), 1572-1573), using AHR associated modules, was performed to define patient subgroups with AHR activation. The number of clusters for each tumor type was assessed using consensus heatmaps, cumulative distribution function plots, elbow plots and samples' cluster identities. The values of k explored were 2-20, with k=2-4 providing the most stable clusters. The group separation was further examined using principal component analysis.
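The module selection rule described above can be sketched, purely for illustration, as follows. The module names, p-values and eigengene values are hypothetical; the p-values are assumed to have been computed beforehand by the global test and the Pearson correlation test, and the sketch only shows the correlation coefficient and the overlap rule.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def select_associated_modules(global_test_p, pearson_p, alpha=0.05):
    """Keep modules with p <= alpha in BOTH tests, whatever the direction."""
    return sorted(
        m for m in global_test_p
        if m in pearson_p and global_test_p[m] <= alpha and pearson_p[m] <= alpha
    )


# Hypothetical p-values per WGCNA module from the two tests.
global_p = {"blue": 0.01, "red": 0.20, "green": 0.03}
corr_p = {"blue": 0.04, "red": 0.01, "green": 0.50}
associated = select_associated_modules(global_p, corr_p)

# The sign of r gives the direction of association for a retained module.
r_blue = pearson_r([0.1, 0.2, 0.3], [0.2, 0.4, 0.6])
```

Only modules significant in both tests are retained; the sign of the correlation can then be used to label a retained module as positively or negatively associated.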
U-87MG cells were obtained from ATCC. U-87MG cells were cultured in phenol red-free, high glucose DMEM medium (Gibco, 31053028) supplemented with 10% FBS (Gibco, 10270106), 2 mM L-glutamine, 1 mM sodium pyruvate, 10 U/mL penicillin and 100 μg/mL streptomycin (referred to as complete DMEM). Cell lines were cultured at 37° C. and 5% CO2. Cell lines were authenticated and certified to be free of mycoplasma contamination.
For treatment of adherent cells with AHR ligands, 4×10⁵ cells per well were seeded in six well plates and incubated for 24 h prior to treatment. Non-adherent cells were seeded at 5×10⁵ cells/mL in 24 well plates and treated immediately. For verification of the generated AHR signature, cells were treated with the established AHR agonists TCDD (10 nM, American Radiolabeled Chemicals Inc.), FICZ (100 nM, Cayman Chemicals, 19529), Kyn (50 μM, Sigma-Aldrich), KynA (50 μM, Sigma-Aldrich, K3375) and indole-3-carboxaldehyde (6.25 μM to 100 μM, Sigma-Aldrich, 129445) for 24 h.
Stable knockdown of AHR in U-87MG cells was achieved using shERWOOD UltramiR Lentiviral shRNA targeting AHR (transOMIC Technologies, TLHSU1400-196-GVO-TRI). Glioma cells were infected with viral supernatants containing either shAHR or shControl (shC) sequences to generate stable cell lines. Both shAHR sequences displayed similar knockdown efficiency, and stable cell lines with shAHR #1 were used for experiments.
shERWOOD UltramiR shRNA sequences are:
Total RNA was harvested from cultured cells using the RNeasy Mini Kit (Qiagen) followed by cDNA synthesis using the High Capacity cDNA reverse transcriptase kit (Applied Biosystems). The StepOne Plus real-time PCR system (Applied Biosystems) was used to perform real-time PCR of cDNA samples using SYBR Select Master mix (Thermo Scientific). Data were processed and analysed using the StepOne Software v 2.3. Relative quantification of target genes was done against RNA18S as the reference gene using the 2^(-ΔΔCt) method. Human primer sequences are:
Graphical and statistical analysis of gene expression (real-time PCR) data was done using GraphPad Prism software versions 6.0 and 8.0. Unless otherwise indicated, data represent the mean±S.E.M. of at least 3 independent experiments. In cases where data were expressed as absolute fold change, these values were log10-transformed and the resulting values were used for statistical analysis. Depending on the data, the following statistical analyses were applied: two-tailed Student's t-test (paired or unpaired) and repeated measures ANOVA with Dunnett's multiple comparisons test. Significant differences were reported as *p<0.05, **p<0.01, ***p<0.001, ****p<0.0001. NS indicates no significant difference. For bioinformatics analysis, unless stated otherwise, all pairwise comparisons were performed using Kruskal-Wallis and Wilcoxon rank sum tests, and all reported p-values were adjusted using the Benjamini-Hochberg procedure.
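For illustration only, the relative quantification against a reference gene and the subsequent log10 transformation of fold changes described above can be sketched as follows. The Ct values are hypothetical, and the function name is introduced here solely for the example.

```python
from math import log10


def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the delta-delta-Ct method.

    Each Ct of the target gene is first normalized to the reference gene
    (for example RNA18S) in the same sample; the treated sample is then
    compared with the control sample.
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_treated - delta_control
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values: the target amplifies 3 cycles earlier after treatment.
fold_change = relative_expression(22.0, 10.0, 25.0, 10.0)

# Fold changes are log10-transformed before statistical testing.
log_fold_change = log10(fold_change)
```

A difference of three cycles corresponds to an eight-fold induction, because each cycle is assumed to double the amount of product.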
The AHR signature was validated using roast gene set enrichment in distinct datasets of cells treated with TCDD (
In addition, the inventors performed qRT-PCRs of selected signature genes in conditions of AHR activation with TCDD, FICZ or Kyn as well as combined ligand activation and AHR knockdown (
To assess the relative contribution of IDO1 and TDO2 to the AHR activity detected by the AHR-signature, the inventors performed a weighted gene co-expression network analysis (WGCNA) across the 32 TCGA tumor entities. The association between AHR activity (denoted by the AHR-score) and the WGCNA modules was tested to determine which modules show positive or negative associations with the AHR-score (as previously described). The relative contribution of IDO1 and TDO2 to AHR activity was assessed by inspecting the incidence of either of the two enzymes in the positive AAMs (
In three different cancer examples that were clustered using the AAMs, the groups that had high AHR-scores (
The survival difference between the groups was estimated by fitting a multivariate age-adjusted cox proportional hazard model. Kaplan-Meier curves were used for visualizing the fitted cox proportional hazard models. The AHR defined sub-groups showed significant differences in overall survival outcome (
The AHR signature genes are grouped into 56 gene ontology terms according to biological process, representing different AHR biological functions (these smaller gene groups are denoted AHR-GOs). Using analytic rank-based enrichment (PMID: 27322546), the biological process activity (BPA) normalized enrichment score (PMID: 31653878) was estimated for each tumor sample, and the scores were then compared between the AHR sub-groups for each cancer. The AHR-GOs BPA scores were averaged for each AHR-subgroup per cancer type and a circular barplot per group was generated (
To define subsets of the AHR signature that could be used for each of the 32 cancer types, and subsequently for the cancer sub-groups, we constructed prediction models for the AHR-scores using the least absolute shrinkage and selection operator (lasso) method and a random forest based model of recursive feature elimination (RFE). Lasso is a regularized regression method that adds a penalty on the absolute size of the predictor coefficients to the residual sum of squares, which shrinks the coefficients (some of them to zero), thereby decreasing the variance and improving the accuracy of the model. The tuning parameter of the lasso is termed lambda, which was determined by cross validation. RFE was run using random forest control functions, and cross validation was performed using leave-one-out cross validation (LOOCV). Random forest models using all AHR signature genes were created and feature selection was made based on the root mean squared error (RMSE) of the models. The overlap between the lasso and RFE results comprises the least number of AHR signature genes required for calling AHR activation for the different cancer types (Table 2). Furthermore, these AHR signature subsets were evaluated across the cancer sub-groups identified (
Consensus NMF was applied using the AAMs previously defined. NMF is a matrix factorization method that constrains the matrices to non-negative values and decomposes the feature matrix into two matrices W and H, which can be used to approximate the original matrix by finding W and H whose sum of linear combinations (weighted sum of bases) minimizes an error function. The cluster identity is represented by H. The clustering results were determined by evaluating the consensus heatmaps, consensus silhouette coefficient, cophenetic index, sparseness coefficient, and dispersion (
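By way of a non-limiting sketch, the lasso shrinkage and the overlap with the RFE result described above can be illustrated as follows. The sketch uses plain coordinate descent with soft-thresholding on centered data without an intercept, a fixed lambda rather than one chosen by cross validation, placeholder gene names, and an RFE result that is simply assumed; it is not the implementation used by the inventors.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize (1/(2n)) * RSS + lam * sum(|beta_j|) by coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (z / n) if z > 0 else 0.0
    return beta


# Placeholder genes and toy data: the score depends strongly on GENE_A only.
genes = ["GENE_A", "GENE_B"]
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [2.05, -1.95, 1.95, -2.05]
beta = lasso_coordinate_descent(X, y, lam=0.1)

lasso_selected = {g for g, b in zip(genes, beta) if b != 0.0}
rfe_selected = {"GENE_A"}  # assumed outcome of a separate RFE run
minimal_signature = lasso_selected & rfe_selected
```

In this toy example the weak predictor is shrunk exactly to zero, so only the strong predictor is selected by the lasso and retained in the overlap.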
RPPA data of tumor samples were grouped according to class assignments of the AHR cancer subtypes from the different clustering solutions described above. RPPA features were filtered to the top 20% showing the highest variation across the different tumors. By comparing the differential regulation of these features across the AHR subgroups for each cancer, we defined RPPA features that could be used for calling AHR activity in both a cancer specific and cancer sub-group specific manner (
Tumors are increasingly sub-classified based on molecular characteristics known to affect prognosis and therapy response. To obtain even higher granularity it is important to analyze AHR activity in tumor subgroups with specific clinical characteristics. Using the AHR signature and the methods described above, the inventors analyzed and compared clinically defined subgroups of prevalent cancer entities, for which the inventors show examples of AHR activity and clinical outcomes:
Comparison of AHR activity in the histology subtypes Lung Adenocarcinoma (LUAD) versus Lung Squamous Cell Carcinoma (LUSC) revealed a similar distribution of AHR high and low patients in each histological subtype (
Comparison of NSCLC patients with EGFR activating mutations or ALK/ROS1 rearrangement versus a cohort with no mutation/rearrangement by means of the AHR signature revealed that neither EGFR nor ALK mutations differ between AHR high and low groups (
Analysis of PDL-1 (CD274) expression in LUAD and LUSC with high versus low AHR activity revealed increased PDL-1 expression in the AHR high groups (
Analysis of human papillomavirus (HPV) positive versus HPV negative HNSCC, based on either the clinical annotation or p16 expression, revealed similar distributions of AHR high and AHR low groups among HPV positive and negative tumors (
Using an AHR signature comprising all the biomarkers in Table 1 allows the detection of AHR modulation caused by both direct and indirect AHR modulators in a cell type and ligand type independent fashion. This approach allowed us to detect the modulation of AHR in HepG2 cells treated with the environmental toxin BaP (
Using the methodology described herein, the inventors have defined AHR activation signatures for 32 different cancer types. The cancers were selected from The Cancer Genome Atlas (TCGA) Program of The National Cancer Institute. The TCGA cancers and the AHR activation signatures are listed in Table 2.
The inventors further classified the AHR activation signatures of Table 2 using K-means clustering, and determined different subsignatures within the AHR activation signature as shown in Table 3.
The inventors further classified the AHR activation signatures of Table 2 using non-negative matrix factorization (NMF) clustering to determine different subsignatures within the AHR activation signature as shown in Table 4. Interestingly, different clustering methodologies gave very similar results, validating the strength of the AHR biomarkers and AHR activation signatures.
The inventors have determined alternative (secondary) AHR activation signatures based on proteomics (Reverse Phase Protein Array (RPPA)) data using K-means clustering as shown in Table 5. These alternative AHR activation signatures can be used to determine the AHR activation status of a sample.
The inventors have determined alternative (secondary) AHR activation signatures based on proteomics (Reverse Phase Protein Array (RPPA)) data using NMF clustering as shown in Table 6. These alternative AHR activation signatures can be used to determine the AHR activation status of a sample using protein biomarkers listed in Table 6.
Table 6: Tabular representation of the different RPPA features that could be used to call AHR activation for the 32 TCGA cancers divided among the different AHR subgroups for each cancer entity defined by consensus NMF clustering.
Number | Date | Country | Kind |
---|---|---|---|
19166374.9 | Mar 2019 | EP | regional |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/IB2020/000236 | 3/28/2020 | WO | 00 |