Oncogenic KRAS is a potent initiator of tumorigenesis, yet its nascent effects on the noncoding genome are incompletely understood.
In one aspect, the disclosure provides methods for detecting a RAS pathway mutation in a subject. The methods include obtaining a biological sample from the subject, isolating nucleic acids from the biological sample, and analyzing the expression level of noncoding RNAs in the nucleic acids in conjunction with a corresponding reference level in a control sample, wherein a differential expression level of the noncoding RNAs compared to the corresponding reference level in the control sample indicates that the subject has a RAS pathway mutation.
In some embodiments, the method further comprises, after analyzing, administering to the subject one or more anticancer agents. In certain embodiments, the anticancer agent is an inhibitor of a K-ras gene. In other embodiments, the anticancer agent is an inhibitor of the gene that is identified to have the differential expression level compared to the corresponding reference level for the gene in the control sample.
In some embodiments, the cancer comprises a KRAS mutation. The KRAS mutation can be in a tissue of the subject, such as lung tissue. In certain embodiments, the cancer is lung cancer, such as lung adenocarcinoma.
In some embodiments, the method comprises analyzing the expression level of a gene involved in the interferon (IFN) alpha or gamma response. In certain embodiments, an increase in the expression level of the gene involved in the IFN alpha or gamma response relative to a corresponding reference level for the gene in the control sample from the control subject indicates that the subject has cancer.
In some embodiments, the method comprises analyzing the expression level of a gene encoding a KRAB zinc-finger (KZNF) protein. In certain embodiments, a decrease in the expression level of the gene encoding the KZNF protein relative to a corresponding reference level for the gene in the control sample from the control subject indicates that the subject has cancer.
In some embodiments, measuring the expression level of the one or more genes comprises performing polymerase chain reaction (PCR), reverse transcriptase polymerase chain reaction (RT-PCR), single-cell RNA-sequencing, microarray analysis, a Northern blot, serial analysis of gene expression (SAGE), immunoassay, hybridization capture, cDNA sequencing, direct RNA sequencing, nanopore sequencing, a CRISPR based technology and/or mass spectrometry. Specifically, when PCR is used to measure the expression level, at least one set of oligonucleotide primers comprising a forward primer and a reverse primer capable of amplifying a polynucleotide sequence of the gene can be used.
In some embodiments, the biological sample is a blood sample, a urine sample, a saliva sample, or a tissue sample (e.g., a blood sample). In some embodiments, the subject suspected of having cancer or in need of treatment is a mammal (e.g., a human).
As used herein, the term “RAS pathway mutation” refers to a genetic mutation in a RAS pathway gene. Optionally, the RAS pathway mutation comprises a mutation in KRAS, NRAS, HRAS, EGFR, NF1, MET or BRAF. Optionally, the mutation is in KRAS. As used herein, the term “KRAS mutation” refers to a genetic mutation in the KRAS gene, which acts as an on-off switch in cell signaling and controls cell proliferation.
As used herein, the term “long noncoding RNA” or “lncRNA” refers to RNA polynucleotides that are not translated into proteins. Long ncRNAs may vary in length from several hundred bases to tens of kilobases (e.g., at least 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, or 2000 bases) and may be located separately from protein coding genes, or reside near or within protein coding genes.
As used herein, the term “polynucleotide” refers to an oligonucleotide, or nucleotide, and fragments or portions thereof, and to DNA or RNA of genomic or synthetic origin, which may be single- or double-stranded, and represent the sense or anti-sense strand. A single polynucleotide can be translated into a single polypeptide unless it is noncoding.
As used herein, the terms “peptide” and “polypeptide” are used interchangeably and describe a single polymer in which the monomers are amino acid residues which are joined together through amide bonds. A polypeptide is intended to encompass any amino acid sequence, either naturally occurring, recombinant, or synthetically produced.
As used herein, the term “substantial identity” or “substantially identical,” used in the context of nucleic acids or polypeptides, refers to a sequence that has at least 50% sequence identity with a reference sequence. Alternatively, percent identity can be any integer from 50% to 100%. In some embodiments, a sequence is substantially identical to a reference sequence if the sequence has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the reference sequence as determined using, e.g., BLAST.
For sequence comparison, typically one sequence acts as a reference sequence, to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are entered into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. Default program parameters can be used, or alternative parameters can be designated. The sequence comparison algorithm then calculates the percent sequence identities for the test sequences relative to the reference sequence, based on the program parameters.
A comparison window includes reference to a segment of any one of the number of contiguous positions, e.g., a segment of at least 10 residues. In some embodiments, the comparison window has from 10 to 600 residues, e.g., about 10 to about 30 residues, about 10 to about 20 residues, about 50 to about 200 residues, or about 100 to about 150 residues, in which a sequence may be compared to a reference sequence of the same number of contiguous positions after the two sequences are optimally aligned.
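By way of illustration only, the percent identity determination described above can be sketched in Python as follows. The sketch assumes the two sequences have already been optimally aligned by a separate tool and padded with "-" gap characters to equal length. The function names, the gap convention, and the choice of aligned columns as the denominator are assumptions made for this example; alignment programs differ in how they define the denominator.

```python
from typing import Optional


def percent_identity(aligned_test: str, aligned_ref: str,
                     start: int = 0, window: Optional[int] = None) -> float:
    """Percent identity of two pre-aligned sequences over a comparison window.

    Assumption: both inputs are the same length and use '-' for gaps.
    The denominator is the number of aligned columns in the window.
    """
    if len(aligned_test) != len(aligned_ref):
        raise ValueError("aligned sequences must have equal length")
    end = len(aligned_ref) if window is None else start + window
    pairs = list(zip(aligned_test[start:end], aligned_ref[start:end]))
    if not pairs:
        raise ValueError("empty comparison window")
    # A column counts as a match only if both residues agree and neither is a gap.
    matches = sum(1 for a, b in pairs if a == b and a != "-")
    return 100.0 * matches / len(pairs)


def is_substantially_identical(aligned_test: str, aligned_ref: str,
                               threshold_pct: float = 50.0) -> bool:
    """Apply a percent identity cutoff (50% is the lowest value recited above)."""
    return percent_identity(aligned_test, aligned_ref) >= threshold_pct
```

For instance, two aligned 10-residue sequences that differ at a single position give a value of 90.0, which satisfies any of the recited thresholds up to 90%.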
Algorithms that are suitable for determining percent sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms, which are described in Altschul et al. (1990) J. Mol. Biol. 215: 403-410 and Altschul et al. (1997) Nucleic Acids Res. 25: 3389-3402, respectively. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (NCBI) web site. The algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a word size (W) of 28, an expectation (E) of 10, M=1, N=-2, and a comparison of both strands.
For amino acid sequences, the BLASTP program uses as defaults a word size (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 (1992)).
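By way of illustration only, the seed-and-extend procedure described above for nucleotide sequences can be sketched in Python as follows. This is a simplified sketch and not the NCBI implementation: it seeds on exact word matches only (the neighborhood threshold T is not modeled), performs ungapped extension, and omits the statistical analysis. The function names and the small values of W and X used in the usage note are assumptions chosen for readability.

```python
from typing import Dict, List, Tuple


def find_seeds(query: str, subject: str, w: int) -> List[Tuple[int, int]]:
    """Return (query_pos, subject_pos) pairs where words of length w match exactly."""
    index: Dict[str, List[int]] = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)
    seeds = []
    for i in range(len(query) - w + 1):
        for j in index.get(query[i:i + w], []):
            seeds.append((i, j))
    return seeds


def _extend(query: str, subject: str, qi: int, sj: int, step: int,
            start_score: int, m: int, n: int, x: int) -> Tuple[int, int]:
    """Extend in one direction; return (best score, residues kept at best score)."""
    score = best = start_score
    length = best_len = 0
    while 0 <= qi < len(query) and 0 <= sj < len(subject):
        score += m if query[qi] == subject[sj] else n
        length += 1
        if score > best:
            best, best_len = score, length
        # Halt when the score falls off by X from its maximum, or reaches zero or below.
        if best - score >= x or score <= 0:
            break
        qi += step
        sj += step
    return best, best_len


def extend_seed(query: str, subject: str, qi: int, sj: int, w: int,
                m: int = 1, n: int = -2, x: int = 5) -> Tuple[int, int, int, int]:
    """Extend a word hit in both directions.

    Returns (query_start, subject_start, length, score) of the ungapped segment pair.
    """
    best, right = _extend(query, subject, qi + w, sj + w, 1, w * m, m, n, x)
    best, left = _extend(query, subject, qi - 1, sj - 1, -1, best, m, n, x)
    return qi - left, sj - left, w + right + left, best
```

For example, with the query "TTACGTACGTGG", the subject "CCACGTACGTAA", W=4, M=1, N=-2, and X=3, the word hit at position 2 of both sequences extends to the 8-residue segment "ACGTACGT" with a score of 8; extension halts on each side once two consecutive mismatches drop the score by more than X from its maximum.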
The BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin & Altschul, Proc. Nat'l. Acad. Sci. USA 90:5873-5877 (1993)). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, an amino acid sequence is considered similar to a reference sequence if the smallest sum probability in a comparison of the test amino acid sequence to the reference amino acid sequence is less than about 0.01, more preferably less than about 10^-5, and most preferably less than about 10^-20.
Most of the human genome is noncoding and transcribed into RNA (1, 2), but how the noncoding transcriptome contributes to cancer formation is poorly understood. About half of the human genome is comprised of transposable elements (TEs) (3), whose expression patterns are often altered in cancer (4). Additionally, TEs contribute substantially to the noncoding transcriptome and are present in the exonic sequences of thousands of long noncoding RNAs (lncRNAs) and other classes of regulatory RNAs (5). Noncoding RNA networks become disrupted in cancer (6, 7) and epigenetic reprogramming, where early activation of RAS signaling leads to coordinate activation of noncoding RNAs in single cells (8). While RAS genes are among the most frequently mutated oncogenes in cancer (9), the extent to which RAS regulates the noncoding transcriptome during cellular transformation remains unknown.
To determine the landscape of noncoding RNAs affected by oncogenic RAS signaling, we performed RNA sequencing (RNA-seq) on human lung epithelial cells (AALE) that undergo malignant transformation upon introduction of mutant KRAS (10). We compared the transcriptomes of AALE cells transduced with control vector to those of AALE cells transformed by mutant KRAS and analyzed the distribution of differentially expressed transcripts across the genome.
The transcriptomes of human lung and kidney cells transformed with mutant KRAS were analyzed to define the landscape of RAS-regulated noncoding RNAs. It was determined that oncogenic RAS upregulates noncoding transcripts throughout the genome, many of which arise from transposable elements. These repetitive sequences are preferential targets of KRAB zinc-finger proteins, which are broadly downregulated in mutant KRAS cells and lung adenocarcinomas. Moreover, KRAS-mediated reprogramming of repetitive noncoding RNA induces an interferon response that contributes to cellular transformation. The results reveal the extent to which mutant KRAS remodels the noncoding transcriptome, expanding the scope of genomic elements regulated by this fundamental signaling pathway.
Provided are methods for detecting a RAS pathway mutation in a subject. The methods include the steps of obtaining a biological sample from the subject, isolating nucleic acids from the biological sample, and analyzing the expression level of noncoding RNAs in the nucleic acids in conjunction with a corresponding reference level in a control sample, wherein a differential expression level of the noncoding RNAs compared to the corresponding reference level in the control sample indicates that the subject has a RAS pathway mutation. The method includes determining whether the subject has a KRAS mutation based on the differential expression levels of the noncoding RNAs in the biological sample of the subject compared to the expression levels of the corresponding reference genes in a control sample, optionally, from a control subject. Optionally, the RAS pathway mutation is in KRAS, NRAS, HRAS, EGFR, NF1, MET or BRAF. Optionally, the RAS pathway mutation is in KRAS. Optionally, the subject has or is suspected of having cancer. Optionally, the cancer is a RAS mutant cancer. Optionally, the RAS mutant cancer is a lung cancer, pancreatic cancer, colorectal cancer, or melanoma.
Optionally, the method further comprises analyzing the expression level of a gene involved in the interferon (IFN) alpha, IFN gamma, or KRAB-Zn finger response. An increase in the expression level of the gene involved in the IFN alpha or IFN gamma response, or a decrease in the expression level of the gene involved in the KRAB-Zn finger response, relative to a corresponding reference level for the gene in the control sample from the control subject indicates that the subject has cancer. Optionally, the method comprises analyzing the expression level of noncoding RNAs transcribed from transposable elements. Optionally, the transposable element is selected from the group consisting of a LINE-1 element, LINE-2 element, ERVK element, ERV1 element, ERVL element, ERVL-MaLR element, Alu element, hAT-Charlie element, MIR element, and combinations thereof. Optionally, the LINE-1 element is L1MC4a. Optionally, the Alu element is selected from the group consisting of AluSu, AluSg, AluJo, AluY, AluSz6, and combinations thereof. Optionally, the hAT-Charlie element is MER20. Optionally, the transposable element is a transposable element from Table 1.
Optionally, analyzing the expression level of the noncoding RNAs comprises performing polymerase chain reaction (PCR), reverse transcriptase polymerase chain reaction (RT-PCR), single-cell RNA-sequencing, microarray analysis, a Northern blot, serial analysis of gene expression (SAGE), immunoassay, hybridization capture, cDNA sequencing, direct RNA sequencing, nanopore sequencing, mass spectrometry, a CRISPR based technology, or combinations thereof. Optionally, measuring the expression level of the noncoding RNAs comprises performing sequencing. Optionally, the sequencing comprises obtaining one or more sequencing reads of the noncoding RNAs. Optionally, the analyzing comprises aligning the sequencing reads of the noncoding RNAs to repetitive sequences in a human genome.
The biological sample can include extracellular vesicles isolated from cells from the subject. Optionally, the nucleic acids from the biological sample include polyadenylated RNAs. Optionally, the biological sample is a blood sample, a urine sample, a saliva sample, or a tissue sample. Optionally, the biological sample is a tissue sample. The subject can be a mammal, e.g., a human.
Optionally, the methods further include administering to the subject one or more anticancer agents. Optionally, the anticancer agent is an inhibitor of a RAS pathway gene. Optionally, the anticancer agent is an inhibitor of KRAS.
In the provided methods, if the gene in the biological sample from the subject displays a differential expression level relative to the corresponding reference gene in the control sample from the control subject, i.e., higher or lower than the expression level of the gene in the control sample by at least 2%, 4%, 6%, 8%, 10%, 20%, 30%, 40%, or 50%, then the subject may have cancer and/or a KRAS mutation. In certain embodiments, the cancer and/or the KRAS mutation may be in a tissue of the subject (e.g., lung).
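By way of illustration only, the comparison against a reference level described above can be sketched in Python as follows. The sketch assumes expression levels have already been quantified and normalized on a common scale; the function names, the dictionary layout, and the transcript labels and values in the usage note are hypothetical and are not data from the disclosure.

```python
from typing import Dict


def percent_change(sample_level: float, reference_level: float) -> float:
    """Signed percent difference of the sample level relative to the reference level."""
    if reference_level <= 0:
        raise ValueError("reference level must be positive")
    return 100.0 * (sample_level - reference_level) / reference_level


def flag_differential(sample: Dict[str, float], reference: Dict[str, float],
                      threshold_pct: float = 10.0) -> Dict[str, str]:
    """Label each transcript 'higher' or 'lower' when it differs from the reference
    level by at least threshold_pct (e.g., 2, 4, 6, 8, 10, 20, 30, 40, or 50)."""
    flagged = {}
    for name, ref_level in reference.items():
        if name not in sample:
            continue  # transcript not measured in the subject sample
        change = percent_change(sample[name], ref_level)
        if abs(change) >= threshold_pct:
            flagged[name] = "higher" if change > 0 else "lower"
    return flagged
```

For example, with hypothetical reference levels of 100, 100, and 80 and sample levels of 150, 101, and 40 for three transcripts, a 10% threshold flags the first as higher and the third as lower, while the second (a 1% difference) is not flagged.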
In some embodiments, the method comprises analyzing the expression level of one or more genes involved in the interferon (IFN) alpha or gamma response. The expression level of one or more genes involved in the IFN alpha or gamma response can increase in response to a KRAS mutation. In yet other embodiments, the method comprises analyzing the expression level of a gene encoding a KRAB zinc-finger (KZNF) protein. The expression level of a gene encoding a KZNF protein can decrease in response to a KRAS mutation.
In the methods described herein, in some embodiments, the subject is suspected of having a RAS pathway mutation, e.g., a RAS pathway mutation in a lung, colorectal, or pancreatic tissue of the subject.
In the methods described herein, in some embodiments, the cancer is a lung cancer (e.g., lung adenocarcinoma). The cancer may be characterized by an oncogenic defect in the RAS pathway. In particular embodiments, the oncogenic defect comprises an activating mutation in KRAS.
In some embodiments of the methods described herein, an increased expression level of a noncoding RNA in a biological sample from a subject compared to a corresponding reference expression level of the same gene in a control sample from a control subject may indicate that the subject has cancer. In some embodiments of the methods described herein, once it is determined that a subject (e.g., a subject suspected of having cancer) has an increased expression level of the gene relative to a control sample, the subject may be administered a therapeutically effective amount of an inhibitor to inhibit the expression level of the gene.
An inhibitor of the gene refers to an agent that inhibits or decreases the expression level and/or the activity of the gene. An inhibitor may inhibit or decrease the transcription of the gene, bind to the gene, and/or inhibit interaction between the gene and another protein or nucleic acid. In some embodiments, an inhibitor may be an inhibitory RNA (e.g., small interfering RNA (siRNA), an antisense RNA, microRNA (miRNA), and short hairpin RNA (shRNA)), an aptamer, an antibody, a CRISPR RNA or a small molecule.
In some embodiments, an inhibitor may be an inhibitory RNA, e.g., small interfering RNA (siRNA), an antisense RNA, microRNA (miRNA), a CRISPR RNA or short hairpin RNA (shRNA). In some embodiments, the inhibitory RNA targets a sequence that is identical or substantially identical (e.g., at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical) to a target sequence in the gene. A target sequence in the gene may be a portion of the gene comprising at least 20, at least 30, at least 40, at least 50, at least 60, at least 70, at least 80, at least 90, or at least 100 contiguous nucleotides, e.g., from 20-500, 20-250, 20-100, 50-500, or 50-250 contiguous nucleotides.
In some embodiments of the methods described herein, once it is determined that a subject (e.g., a subject suspected of having cancer) has an increased expression level of one or more noncoding RNAs relative to a control sample, the subject may be administered a therapeutically effective amount of an siRNA that inhibits or decreases the expression level of the gene. An siRNA may be produced from a short hairpin RNA (shRNA). An shRNA is an artificial RNA molecule with a hairpin turn that can be used to silence target gene expression via the siRNA it produces in cells. See, e.g., Fire et al., Nature 391:806-811, 1998; Elbashir et al., Nature 411:494-498, 2001; Chakraborty et al., Mol Ther Nucleic Acids 8:132-143, 2017; and Bouard et al., Br. J. Pharmacol. 157:153-165, 2009. Expression of shRNA in cells is typically accomplished by delivery of plasmids or through viral or bacterial vectors. Suitable viral vectors include, but are not limited to, adeno-associated viruses (AAVs), adenoviruses, and lentiviruses. After the vector has integrated into the host genome, the shRNA is then transcribed in the nucleus by polymerase II or polymerase III (depending on the promoter used). The resulting pre-shRNA is exported from the nucleus, then processed by Dicer and loaded into the RNA-induced silencing complex (RISC). The sense strand is degraded by RISC and the antisense strand directs RISC to an mRNA that has a complementary sequence. A protein called Ago2 in the RISC then cleaves the mRNA, or in some cases, represses translation of the mRNA, leading to its destruction and an eventual reduction in the protein encoded by the mRNA. Thus, the shRNA leads to targeted gene silencing.
In some embodiments, once it is determined that a subject (e.g., a subject suspected of having cancer) has an increased expression level of one or more noncoding RNAs relative to a control sample, the subject may be administered a therapeutically effective amount of an shRNA capable of hybridizing to a portion of the gene. The shRNA may be encoded in a vector. In some embodiments, the vector further comprises appropriate expression control elements known in the art, including, e.g., promoters (e.g., inducible promoters or tissue specific promoters), enhancers, and transcription terminators.
In some embodiments, once it is determined that a subject (e.g., a subject suspected of having cancer) has an increased expression level of one or more genes in Tables 1-3 relative to a control sample, the subject may be administered a therapeutically effective amount of an siRNA capable of hybridizing to a portion of the gene. The siRNA may be encoded in a vector. In some embodiments, the vector further comprises appropriate expression control elements known in the art, including, e.g., promoters (e.g., inducible promoters or tissue specific promoters), enhancers, and transcription terminators.
Techniques and methods for measuring the expression levels of genes are available in the art. For example, detection and/or quantification of noncoding RNAs may be accomplished by any one of a number of methods or assays employing recombinant DNA or RNA technologies known in the art, including but not limited to, polymerase chain reaction (PCR), single-cell RNA-sequencing, reverse transcription PCR (RT-PCR), microarrays, Northern blot, serial analysis of gene expression (SAGE), immunoassay, hybridization capture, cDNA sequencing, direct RNA sequencing, nanopore sequencing, CRISPR based technology, and mass spectrometry.
In some embodiments, hybridization capture methods may be used for detection and/or quantification of the noncoding RNAs. Some examples of hybridization capture methods include, e.g., capture hybridization analysis of RNA targets (CHART), chromatin isolation by RNA purification (ChIRP), CRISPR based technology and RNA affinity purification (RAP). In general, cells and tissues expressing the RNA of interest can be cross-linked and solubilized by shearing. The RNA of interest can then be enriched using rationally designed biotin tagged antisense oligonucleotides. The captured RNA complexes can then be rinsed and eluted. The eluted material can be analyzed for the molecules of interest. The associated RNAs are commonly analyzed with qPCR or high throughput sequencing, and the recovered proteins can be analyzed with Western blots or mass spectrometry. General techniques for performing hybridization capture methods are described in the art and can be found in, e.g., Machyna and Simon, Briefings in Functional Genomics 17(2):96-103, 2018, which is incorporated herein by reference in its entirety. Further, Li et al, JCI Insight. 3(7):e98942, 2018 also describes methods of studying RNA (e.g., extracellular RNA) and is incorporated herein by reference in its entirety.
In some embodiments, microarrays may be used to measure the expression levels of the genes. An advantage of microarray analysis is that the expression of each of the genes can be measured simultaneously, and microarrays can be specifically designed to provide a diagnostic expression profile for a particular disease or condition (e.g., cancer). Microarrays may be prepared by selecting probes which comprise a polynucleotide sequence, and then immobilizing such probes to a solid support or surface. For example, the probes may comprise DNA sequences, RNA sequences, or copolymer sequences of DNA and RNA. The polynucleotide sequences of the probes may also comprise DNA and/or RNA analogues, or combinations thereof. For example, the polynucleotide sequences of the probes may be full or partial fragments of genomic nucleic acids. The polynucleotide sequences of the probes may also be synthesized nucleotide sequences, such as synthetic oligonucleotide sequences. Probes may be immobilized to a solid support which may be either porous or non-porous. For example, the probes may be polynucleotide sequences which are attached to a nitrocellulose or nylon membrane or filter covalently at either the 3′ or the 5′ end of the polynucleotide. Such hybridization probes are well-known in the art (see, e.g., Sambrook, et al., Molecular Cloning: A Laboratory Manual (3rd Ed., 2001)). In one embodiment, a microarray may include a support or surface with an ordered array of binding (e.g., hybridization) sites or “probes” each representing one of the genes described herein. More specifically, each probe of the array may be located at a known, predetermined position on the solid support such that the identity (i.e., the sequence) of each probe can be determined from its position in the array (i.e., on the support or surface). Each probe may be covalently attached to the solid support at a single site.
Quantitative reverse transcriptase PCR (qRT-PCR) can also be used to determine the expression profiles of the genes. The first step in gene expression profiling by RT-PCR is the reverse transcription of the RNA template into cDNA, followed by its exponential amplification in a PCR reaction. The two most commonly used reverse transcriptases are avian myeloblastosis virus reverse transcriptase (AMV-RT) and Moloney murine leukemia virus reverse transcriptase (MMLV-RT). The reverse transcription step is typically primed using specific primers, random hexamers, or oligo-dT primers, depending on the circumstances and the goal of expression profiling. For example, extracted RNA can be reverse-transcribed using a GeneAmp RNA PCR kit (Perkin Elmer, CA, USA), following the manufacturer's instructions. The derived cDNA can then be used as a template in the subsequent PCR reaction. Although the PCR step can use a variety of thermostable DNA-dependent DNA polymerases, it typically employs the Taq DNA polymerase, which has a 5′-3′ nuclease activity but lacks a 3′-5′ proofreading exonuclease activity. Thus, TAQMAN PCR typically utilizes the 5′-nuclease activity of Taq polymerase to hydrolyze a hybridization probe bound to its target amplicon, but any enzyme with equivalent 5′ nuclease activity can be used. Two oligonucleotide primers are used to generate an amplicon typical of a PCR reaction. A third oligonucleotide, or probe, may be designed to detect nucleotide sequence located between the two PCR primers. The probe is non-extendible by Taq DNA polymerase enzyme, and may be labeled with a reporter fluorescent dye and a quencher fluorescent dye. Any laser-induced emission from the reporter dye is quenched by the quenching dye when the two dyes are located close together as they are on the probe. During the amplification reaction, the Taq DNA polymerase enzyme cleaves the probe in a template-dependent manner.
The resultant probe fragments disassociate in solution, and signal from the released reporter dye is free from the quenching effect of the second fluorophore. One molecule of reporter dye is liberated for each new molecule synthesized, and detection of the unquenched reporter dye provides the basis for quantitative interpretation of the data.
Serial Analysis of Gene Expression (SAGE) can also be used to determine RNA expression level. SAGE analysis does not require a special device for detection, and may be used for simultaneously detecting the expression of a large number of transcription products. First, RNA is extracted, converted into cDNA using a biotinylated oligo (dT) primer, and treated with a four-base recognizing restriction enzyme (Anchoring Enzyme: AE) resulting in AE-treated fragments containing a biotin group at their 3′ terminus. Next, the AE-treated fragments are incubated with streptavidin for binding. The bound cDNA is divided into two fractions, and each fraction is then linked to a different double-stranded oligonucleotide adapter (linker) A or B. These linkers are composed of: (1) a protruding single strand portion having a sequence complementary to the sequence of the protruding portion formed by the action of the anchoring enzyme, (2) a 5′ nucleotide recognizing sequence of the IIS-type restriction enzyme (cleaves at a predetermined location no more than 20 bp away from the recognition site) serving as a tagging enzyme (TE), and (3) an additional sequence of sufficient length for constructing a PCR-specific primer. The linker-linked cDNA is cleaved using the tagging enzyme, and only the linker-linked cDNA sequence portion remains, which is present in the form of a short-strand sequence tag. Next, pools of short-strand sequence tags from the two different types of linkers are linked to each other, followed by PCR amplification using primers specific to linkers A and B. As a result, the amplification product is obtained as a mixture comprising myriad sequences of two adjacent sequence tags (ditags) bound to linkers A and B. The amplification product is treated with the anchoring enzyme, and the free ditag portions are linked into strands in a standard linkage reaction. The amplification product is then cloned.
Determination of the clone's nucleotide sequence can be used to obtain a readout of consecutive ditags of constant length. The presence of the gene corresponding to each tag can then be identified from the nucleotide sequence of the clone and information on the sequence tags.
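By way of illustration only, reading tags out of a sequenced clone as described above can be sketched in Python as follows. The sketch assumes an anchoring enzyme recognition site of CATG, a fixed tag length of 10 bases, and ditags in which the second tag is present as its reverse complement; these values and the function names are assumptions for this example and are not specified in the description. A tag that itself contains the anchoring site would be split incorrectly by this simplified approach.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_tags(concatemer: str, anchor_site: str = "CATG",
               tag_len: int = 10) -> Counter:
    """Split a clone sequence into ditags at the anchoring site and count tags.

    Assumption: each ditag is tag_1 followed by the reverse complement of tag_2,
    so tag_2 is recovered by reverse-complementing the last tag_len bases.
    """
    counts: Counter = Counter()
    for ditag in concatemer.upper().split(anchor_site):
        if len(ditag) < 2 * tag_len:
            continue  # too short to hold two full tags; skip
        counts[ditag[:tag_len]] += 1
        counts[reverse_complement(ditag[-tag_len:])] += 1
    return counts
```

The resulting tag counts can then be matched against tag sequences expected for known transcripts, so that the relative abundance of each tag serves as a readout of the expression level of the corresponding transcript.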
In methods described herein, a subject may be administered one or more anticancer agents alone or in combination with one or more inhibitors that inhibit the expression levels of one or more noncoding RNAs. An anticancer agent may be a RAS pathway inhibitor, a cytotoxic agent, a chemotherapeutic agent, or an immunosuppressive agent. An anticancer agent may be a natural or synthetic agent. In some embodiments, an anticancer agent may be capable of treating cancer, activating immune response, and/or reducing tumor load. In some embodiments, an anticancer agent may inhibit the proliferation of and/or kill cancer cells. An anticancer agent may be a small molecule, a peptide, or a protein. In some embodiments, an anticancer agent may be an agent that inhibits and/or down regulates the activity of a protein that prevents immune cell activation or a protein that exerts immunosuppressive effects.
Examples of anticancer agents include, but are not limited to, RAS pathway inhibitors such as the mutant KRAS specific inhibitors including Sotorasib/AMG 510 (LUMAKRAS™), Adagrasib (MRTX849), MRTX1133, and GDC-6036; alkylating agents such as thiotepa and cyclophosphamide (CYTOXAN®); alkyl sulfonates such as busulfan, improsulfan and piposulfan; aziridines such as benzodopa, carboquone, meturedopa, and uredopa; ethylenimines and methylamelamines including altretamine, triethylenemelamine, triethylenephosphoramide, triethylenethiophosphoramide and trimethylomelamine; acetogenins (especially bullatacin and bullatacinone); delta-9-tetrahydrocannabinol (dronabinol, MARINOL®); beta-lapachone; lapachol; colchicines; betulinic acid; a camptothecin (including the synthetic analogue topotecan (HYCAMTIN®), CPT-11 (irinotecan, CAMPTOSAR®), acetylcamptothecin, scopolectin, and 9-aminocamptothecin); bryostatin; callystatin; CC-1065 (including its adozelesin, carzelesin and bizelesin synthetic analogues); podophyllotoxin; podophyllinic acid; teniposide; cryptophycins (particularly cryptophycin 1 and cryptophycin 8); dolastatin; duocarmycin (including the synthetic analogues, KW-2189 and CB1-TM1); eleutherobin; pancratistatin; a sarcodictyin; spongistatin; nitrogen mustards such as chlorambucil, chlornaphazine, chlorophosphamide, estramustine, ifosfamide, mechlorethamine, mechlorethamine oxide hydrochloride, melphalan, novembichin, phenesterine, prednimustine, trofosfamide, uracil mustard; nitrosoureas such as carmustine, chlorozotocin, fotemustine, lomustine, nimustine, and ranimustine; antibiotics such as the enediyne antibiotics (e.g., calicheamicin, especially calicheamicin gamma1I and calicheamicin omegaI1 (see, e.g., Nicolaou et al. Angew. Chem Intl. Ed.
Engl., 33: 183-186 (1994)); CDP323, an oral alpha-4 integrin inhibitor; dynemicin, including dynemicin A; an esperamicin; neocarzinostatin chromophore and related chromoprotein enediyne antibiotic chromophores), aclacinomysins, actinomycin, authramycin, azaserine, bleomycin, cactinomycin, carabicin, caminomycin, carzinophilin, chromomycins, dactinomycin, daunorubicin, detorubicin, 6-diazo-5-oxo-L-norleucine, doxorubicin (including ADRIAMYCIN®, morpholino-doxorubicin, cyanomorpholino-doxorubicin, 2-pyrrolino-doxorubicin, doxorubicin HCl liposome injection (DOXIL®), liposomal doxorubicin TLC D-99 (MYOCET®), peglylated liposomal doxorubicin (CAELYX®), and deoxydoxorubicin), epirubicin, esorubicin, idarubicin, marcellomycin, mitomycins such as mitomycin C, mycophenolic acid, nogalamycin, olivomycins, peplomycin, porfiromycin, puromycin, quelamycin, rodorubicin, streptonigrin, streptozocin, tubercidin, ubenimex, zinostatin, zorubicin; anti-metabolites such as methotrexate, gemcitabine (GEMZAR®), tegafur (UFTORAL®), capecitabine (XELODA®), an epothilone, and 5-fluorouracil (5-FU); combretastatin; folic acid analogues such as denopterin, methotrexate, pteropterin, trimetrexate; purine analogs such as fludarabine, 6-mercaptopurine, thiamiprine, thioguanine; pyrimidine analogs such as ancitabine, azacitidine, 6-azauridine, 5-azacytidine, carmofur, cytarabine, dideoxyuridine, doxifluridine, enocitabine, floxuridine; androgens such as calusterone, dromostanolone propionate, epitiostanol, mepitiostane, testolactone; anti-adrenals such as aminoglutethimide, mitotane, trilostane; folic acid replenisher such as frolinic acid; aceglatone; aldophosphamide glycoside; aminolevulinic acid; eniluracil; amsacrine; bestrabucil; bisantrene; edatraxate; defofamine; demecolcine; diaziquone; elformithine; elliptinium acetate; an epothilone; etoglucid; gallium nitrate; hydroxyurea; lentinan; lonidainine; maytansinoids such as maytansine and ansamitocins; mitoguazone; mitoxantrone; mopidanmol; 
nitraerine; pentostatin; phenamet; pirarubicin; losoxantrone; 2-ethylhydrazide; procarbazine; PSK® polysaccharide complex (JHS Natural Products, Eugene, Oreg.); razoxane; rhizoxin; sizofuran; spirogermanium; tenuazonic acid; triaziquone; 2,2′, 2′-trichlorotriethylamine; trichothecenes (especially T-2 toxin, verracurin A, roridin A and anguidine); urethan; vindesine (ELDISINE®, FILDESIN®); dacarbazine; mannomustine; mitobronitol; mitolactol; pipobroman; gacytosine; arabinoside (“Ara-C”); thiotepa; taxoid, e.g., paclitaxel (TAXOL®, Bristol-Myers Squibb Oncology, Princeton, N.J.), albumin-engineered nanoparticle formulation of paclitaxel (ABRAXANE™), and docetaxel (TAXOTERE®, Rhome-Poulene Rorer, Antony, France); chloranbucil; 6-thioguanine; mercaptopurine; methotrexate; platinum agents such as cisplatin, oxaliplatin (e.g., ELOXATIN®), and carboplatin; vincas, which prevent tubulin polymerization from forming microtubules, including vinblastine (VELBAN®), vincristine (ONCOVIN®), vindesine (ELDISINE®, FILDESIN®), and vinorelbine (NAVELBINE®); etoposide (VP-16); ifosfamide; mitoxantrone; leucovorin; novantrone; edatrexate; daunomycin; aminopterin; ibandronate; topoisomerase inhibitor RFS 2000; difluoromethylornithine (DMFO); retinoids such as retinoic acid, including bexarotene (TARGRETIN®); bisphosphonates such as clodronate (for example, BONEFOS® or OSTAC®), etidronate (DIDROCAL®), NE-58095, zoledronic acid/zoledronate (ZOMETA®), alendronate (FOSAMAX®), pamidronate (AREDIA®), tiludronate (SKELID®), or risedronate (ACTONEL®); troxacitabine (a 1,3-dioxolane nucleoside cytosine analog); antisense oligonucleotides, particularly those that inhibit expression of genes in signaling pathways implicated in aberrant cell proliferation, such as, for example, PKC-alpha, Raf, H-Ras, and epidermal growth factor receptor (EGF-R) (e.g., erlotinib (Tarceva™)); and VEGF-A that reduce cell proliferation; vaccines such as THERATOPE® vaccine and gene therapy vaccines, for example, 
ALLOVECTINO vaccine, LEUVECTIN® vaccine, and VAXID® vaccine; topoisomerase 1 inhibitor (e.g., LURTOTECAN®); rmRH (e.g., ABARELIX®); BAY439006 (sorafenib; Bayer); SU-11248 (sunitinib, SUTENT®, Pfizer); perifosine, COX-2 inhibitor (e.g. celecoxib or etoricoxib), proteosome inhibitor (e.g. PS341); bortezomib (VELCADE®); CCI-779; tipifarnib (R11577); orafenib, ABT510; Bcl-2 inhibitor such as oblimersen sodium (GENASENSE®); pixantrone; EGFR inhibitors; tyrosine kinase inhibitors; serine-threonine kinase inhibitors such as rapamycin (sirolimus, RAPAMUNE®); farnesyltransferase inhibitors such as lonafarnib (SCH 6636, SARASAR™); and pharmaceutically acceptable salts, acids or derivatives of any of the above; as well as combinations of two or more of the above such as CHOP, an abbreviation for a combined therapy of cyclophosphamide, doxorubicin, vincristine, and prednisolone; and FOLFOX, an abbreviation for a treatment regimen with oxaliplatin (ELOXATIN™) combined with 5-FU and leucovorin.
In some embodiments, an anticancer agent is cisplatin, carboplatin, oxaliplatin, bleomycin, mitomycin C, calicheamicins, maytansinoids, doxorubicin, idarubicin, daunorubicin, epirubicin, busulfan, carmustine, lomustine, semustine, methotrexate, 6-mercaptopurine, fludarabine, 5-azacytidine, pentostatin, cytarabine, gemcitabine, 5-fluorouracil, hydroxyurea, etoposide, teniposide, topotecan, irinotecan, chlorambucil, cyclophosphamide, ifosfamide, melphalan, bortezomib, vincristine, vinblastine, vinorelbine, paclitaxel, or docetaxel.
In some embodiments, the anticancer agent is a chemotherapeutic agent. In some embodiments, chemotherapeutic agents may kill cancer cells or inhibit cancer cell growth. Chemotherapeutic agents may function in a non-specific manner, for example, by inhibiting the process of cell division known as mitosis. Examples of chemotherapeutic agents include, but are not limited to, antimicrotubule agents (e.g., taxanes and vinca alkaloids), topoisomerase inhibitors, antimetabolites (e.g., nucleoside analogs acting as such, for example, gemcitabine), mitotic inhibitors, alkylating agents, antitumor antibiotics, anthracyclines, intercalating agents, agents capable of interfering with a signal transduction pathway, agents that promote apoptosis, proteasome inhibitors, and the like.
Alkylating agents are most active in the resting phase of the cell. These types of drugs are cell-cycle non-specific. Exemplary alkylating agents include, but are not limited to, nitrogen mustards, ethylenimine derivatives, alkyl sulfonates, nitrosoureas and triazenes, such as uracil mustard (Aminouracil Mustard®, Chlorethaminacil®, Demethyldopan®, Desmethyldopan®, Haemanthamine®, Nordopan®, Uracil nitrogen Mustard®, Uracillost®, Uracilmostaza®, Uramustin®, Uramustine®), chlormethine (Mustargen®), cyclophosphamide (Cytoxan®, Neosar®, Clafen®, Endoxan®, Procytox®, Revimmune™), ifosfamide (Mitoxana®), melphalan (Alkeran®), chlorambucil (Leukeran®), pipobroman (Amedel®, Vercyte®), triethylenemelamine (Hemel®, Hexalen®, Hexastat®), triethylenethiophosphoramine, thiotepa (Thioplex®), busulfan (Busilvex®, Myleran®), carmustine (BiCNU®), lomustine (CeeNU®), streptozocin (Zanosar®), and dacarbazine (DTIC-Dome®). Additional exemplary alkylating agents include, without limitation, Oxaliplatin (Eloxatin®); Temozolomide (Temodar® and Temodal®); Dactinomycin (also known as actinomycin-D, Cosmegen®); Melphalan (also known as L-PAM, L-sarcolysin, and phenylalanine mustard, Alkeran®); Altretamine (also known as hexamethylmelamine (HMM), Hexalen®); Carmustine (BiCNU®); Bendamustine (Treanda®); Busulfan (Busulfex® and Myleran®); Carboplatin (Paraplatin®); Lomustine (also known as CCNU, CeeNU®); Cisplatin (also known as CDDP, Platinol® and Platinol®-AQ); Chlorambucil (Leukeran®); Cyclophosphamide (Cytoxan® and Neosar®); Dacarbazine (also known as DTIC, DIC and imidazole carboxamide, DTIC-Dome®); Ifosfamide (Ifex®); Prednimustine; Procarbazine (Matulane®); Mechlorethamine (also known as nitrogen mustard, mustine and mechlorethamine hydrochloride, Mustargen®); Streptozocin (Zanosar®); Thiotepa (also known as thiophosphoamide, TESPA and TSPA, Thioplex®); Cyclophosphamide (Endoxan®, Cytoxan®, Neosar®, Procytox®, Revimmune®); and Bendamustine HCl (Treanda®).
Antitumor antibiotics are chemotherapeutic agents obtained from natural products produced by species of soil bacteria, e.g., Streptomyces. These drugs act during multiple phases of the cell cycle and are considered cell-cycle specific. There are several types of antitumor antibiotics, including, but not limited to, anthracyclines (e.g., Doxorubicin, Daunorubicin, Epirubicin, Mitoxantrone, and Idarubicin), chromomycins (e.g., Dactinomycin and Plicamycin), mitomycin, and bleomycin.
Antimetabolites are types of chemotherapeutic agents that are cell-cycle specific. When cells incorporate these antimetabolite substances into the cellular metabolism, they are unable to divide. This class of chemotherapeutic agents includes folic acid antagonists such as Methotrexate; pyrimidine antagonists such as 5-Fluorouracil, Floxuridine, Cytarabine, Capecitabine, and Gemcitabine; purine antagonists such as 6-Mercaptopurine and 6-Thioguanine; and adenosine deaminase inhibitors such as Cladribine, Fludarabine, Nelarabine and Pentostatin.
Exemplary anthracyclines that can be used include, e.g., Doxorubicin (Adriamycin® and Rubex®); Bleomycin (Blenoxane®); Daunorubicin (daunorubicin hydrochloride, daunomycin, and rubidomycin hydrochloride, Cerubidine®); Daunorubicin liposomal (daunorubicin citrate liposome, DaunoXome®); Mitoxantrone (DHAD, Novantrone®); Epirubicin (Ellence); Idarubicin (Idamycin®, Idamycin PFS®); Mitomycin C (Mutamycin®); Geldanamycin; Herbimycin; Ravidomycin; and Desacetylravidomycin.
Antimicrotubule agents include vinca alkaloids and taxanes. Exemplary vinca alkaloids include, but are not limited to, vinorelbine tartrate (Navelbine®); vincristine (Oncovin®); vindesine (Eldisine®); and vinblastine (also known as vinblastine sulfate, vincaleukoblastine and VLB, Alkaban-AQ® and Velban®). Exemplary taxanes that can be used include, but are not limited to, paclitaxel and docetaxel. Non-limiting examples of paclitaxel agents include nanoparticle albumin-bound paclitaxel (ABRAXANE, marketed by Abraxis Bioscience), docosahexaenoic acid bound-paclitaxel (DHA-paclitaxel, Taxoprexin, marketed by Protarga), polyglutamate bound-paclitaxel (PG-paclitaxel, paclitaxel poliglumex, CT-2103, XYOTAX, marketed by Cell Therapeutics), the tumor-activated prodrug (TAP), ANG105 (Angiopep-2 bound to three molecules of paclitaxel, marketed by ImmunoGen), paclitaxel-EC-1 (paclitaxel bound to the erbB2-recognizing peptide EC-1; see Li et al., Biopolymers (2007) 87:225-230), and glucose-conjugated paclitaxel (e.g., 2′-paclitaxel methyl 2-glucopyranosyl succinate, see Liu et al., Bioorganic & Medicinal Chemistry Letters (2007) 17:617-620).
Exemplary proteasome inhibitors that can be used include, but are not limited to, Bortezomib (Velcade®); Carfilzomib (PX-171-007, (S)-4-Methyl-N-((S)-1-(((S)-4-methyl-1-((R)-2-methyloxiran-2-yl)-1-oxopentan-2-yl)amino)-1-oxo-3-phenylpropan-2-yl)-2-((S)-2-(2-morpholinoacetamido)-4-phenylbutanamido)-pentanamide); marizomib (NPI-0052); ixazomib citrate (MLN-9708); delanzomib (CEP-18770); and O-Methyl-N-[(2-methyl-5-thiazolyl)carbonyl]-L-seryl-O-methyl-N-[(1S)-2-[(2R)-2-methyl-2-oxiranyl]-2-oxo-1-(phenylmethyl)ethyl]-L-serinamide (ONX-0912).
In some embodiments, the chemotherapeutic agent is selected from the group consisting of chlorambucil, cyclophosphamide, ifosfamide, melphalan, streptozocin, carmustine, lomustine, bendamustine, uramustine, estramustine, nimustine, ranimustine, mannosulfan, busulfan, dacarbazine, temozolomide, thiotepa, altretamine, 5-fluorouracil (5-FU), 6-mercaptopurine (6-MP), capecitabine, cytarabine, floxuridine, fludarabine, gemcitabine, hydroxyurea, methotrexate, pemetrexed, daunorubicin, doxorubicin, epirubicin, idarubicin, SN-38, ARC, NPC, camptothecin, topotecan, 9-nitrocamptothecin, 9-aminocamptothecin, rubifen, gimatecan, diflomotecan, BN80927, DX-8951f, MAG-CPT, amsacrine, etoposide, etoposide phosphate, teniposide, paclitaxel, docetaxel, baccatin III, 10-deacetyltaxol, 7-xylosyl-10-deacetyltaxol, cephalomannine, 10-deacetyl-7-epitaxol, 7-epitaxol, 10-deacetylbaccatin III, 10-deacetyl cephalomannine, irinotecan, albumin-bound paclitaxel, oxaliplatin, cisplatin, irinotecan liposome, and combinations thereof.
In certain embodiments, the chemotherapeutic agent is administered at a dose and a schedule that may be guided by doses and schedules approved by the U.S. Food and Drug Administration (FDA) or other regulatory body, subject to empirical optimization.
In still further embodiments, more than one chemotherapeutic agent may be administered simultaneously, or sequentially in any order during the entire or portions of the treatment period. The two agents may be administered following the same or different dosing regimens.
Cell Lines. The AALE stable cell lines pBABE-mCherry Puro (control) and pBABE-FLAGKRAS(G12D) Zeo (mutant KRAS) were generated using retroviral transduction, followed by selection in puromycin or zeocin, respectively, 2 days post-infection. Both lines were cultured in SABM Basal Medium (Lonza SABM basal medium) with added supplements and growth factors (Lonza SAGM SingleQuot Kit Suppl. & Growth Factors). AALE cell lines were maintained using Lonza's Reagent Pack subculture reagents. The HA1E cell lines were generated using lentiviral transduction (pLX317) to produce control and mutant HA1E pLX317-KRAS(G12V) stable cell lines under puromycin selection, and cells were cultured in MEM-alpha (Invitrogen) with 10% FBS (Sigma) and 1% penicillin/streptomycin (Gibco). All cell lines tested negative for mycoplasma.
siRNA Knockdowns. AALEs were seeded at 1×10⁶ cells per well of a 6-well plate in complete growth medium, then reverse transfected with 30 pmol siRNA using Lipofectamine RNAiMAX according to the manufacturer's protocol. Cells were grown for 3 days in transfection medium under standard culture conditions and then harvested for RNA isolation and qPCR as previously described.
Cell Viability Assay. 2×10⁴ cells were taken from each siRNA transfection well at the time of transfection and seeded into individual wells of an ultra-low adhesion 96-well plate. The cells were grown in standard culture conditions for 4 days. They were then harvested, and ATP production was measured using the CellTiter-Glo Luminescent Cell Viability Assay (Promega) following the manufacturer's protocol. Luminescence was measured on a PerkinElmer VICTOR Light 1420 Luminescence Counter.
RNA Isolation & Purification. For AALE cell lines, bulk RNA was isolated from cells using the Quick-RNA MiniPrep kit (Zymo Research) and quantified on a NanoDrop-8000 Spectrophotometer. For HA1E cell lines, bulk RNA was isolated using the RNeasy Mini Kit (Qiagen) and quantified with the Qubit RNA BR assay kit (Thermo).
qPCR. cDNA was transcribed from 1 μg RNA using iScript cDNA Synthesis Kit (Bio-Rad) according to manufacturer protocol. cDNA was diluted 1:6 and run with iTaq Universal SYBR Green Supermix (Bio-Rad) on ViiA 7 Real-Time PCR System according to manufacturer protocol. Cycle Threshold (CT) values were converted using Standard analysis. Values obtained for target genes were normalized to HPRT.
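By way of non-limiting illustration, the normalization of target-gene CT values to HPRT may be sketched as follows. The text above states only that CT values were converted using a standard analysis; the 2^-ΔΔCT convention used here is an assumption, and the function name `relative_expression` and the example CT values are hypothetical.

```python
def relative_expression(ct_target, ct_hprt, ct_target_ctrl, ct_hprt_ctrl):
    """Fold change of a target gene versus control by the 2^-ddCT convention.

    Assumption: the 'standard analysis' is the 2^-ddCT method with HPRT as
    the reference gene; the source does not name the method explicitly.
    """
    # Normalize the target gene to HPRT within the mutant KRAS sample.
    delta_ct_sample = ct_target - ct_hprt
    # Normalize the target gene to HPRT within the control sample.
    delta_ct_control = ct_target_ctrl - ct_hprt_ctrl
    # Difference of the two normalized values.
    delta_delta_ct = delta_ct_sample - delta_ct_control
    # Each cycle corresponds to a nominal two-fold change in template.
    return 2.0 ** (-delta_delta_ct)


# Hypothetical CT values, for illustration only.
fold_change = relative_expression(22.0, 20.0, 24.0, 20.0)
```

A value above 1 indicates higher normalized abundance in the sample than in the control, and a value below 1 indicates lower abundance.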
RNA-seq. For AALE cell lines, 1 μg of total RNA was used as input for the TruSeq Stranded mRNA Sample Prep Kit (Illumina) according to the manufacturer's protocol. Library quality was determined with the High Sensitivity DNA Kit on a Bioanalyzer 2100 (Agilent Technologies). Multiplexed libraries were sequenced as HiSeq4000 100PE runs. For HA1E cell lines, 1 μg of total RNA was used for mRNA enrichment with the Dynabeads mRNA DIRECT kit (Thermo). First strand cDNA was generated with AffinityScript Multiple Temperature reverse transcriptase with oligo dT primers. Second strand cDNA was generated with the mRNA Second Strand Synthesis Module (New England Biolabs). DNA was cleaned up twice with Agencourt AMPure XP beads. The Qubit dsDNA High Sensitivity Assay was used for concentration measurement. 1 ng of dsDNA was further subjected to library preparation with the Nextera XT DNA sample prep kit (Illumina) per the manufacturer's instructions. Library size distribution was confirmed with a Bioanalyzer (Agilent). Multiplexed libraries were sequenced as NextSeq500 75PE runs.
Single-cell RNA-seq. For single-cell RNA-seq, 1×10⁶ cells were harvested and resuspended in 1 mL 1×PBS/0.04% BSA (1000 cells/μl) according to the cell preparation guidelines in the 10x Genomics Chromium Single Cell 3′ Reagent Kit User Guide. GEMs were generated from an input of 3,500 cells. We used the 10x Genomics Chromium Single Cell 3′ Reagent Kits version 2 for both the GEM generation and subsequent library preparation and followed the manufacturer's reagent kit protocol. Quantification of all RNA-seq libraries was performed by QB3 at UC Berkeley. RNA-seq libraries were sequenced as HiSeq4000 100PE runs.
ATAC-seq. 100,000 cells were collected and centrifuged at 500×g for 5 minutes at 4 °C. Pellets were washed with ice-cold PBS and centrifuged. Pellets were resuspended in ice-cold lysis buffer. Tagmentation reaction and purification were conducted according to the manufacturer's protocol (Active Motif). Libraries were sequenced on a NextSeq500 as 2×75 paired-end reads.
Extracellular RNA. The exoRNeasy serum/plasma maxi kit (Qiagen) was used to isolate extracellular vesicles, which were quantified using Nanoparticle Tracking Analysis (Malvern, UK). 30 ml of cell culture supernatant was filtered to remove particles larger than 0.8 μm. The filtrate was precipitated with kit buffer and filtered through a column to collect extracellular vesicles. These vesicles were then lysed with QIAzol® lysis reagent. Total RNA was isolated using the indicated phase separation method and used to make libraries for RNA-seq, which were sequenced on a NextSeq500.
Exosomal RNA. Exosomes were isolated using the Exosome Total Isolation Chip (ExoTIC) as previously described (23). The ExoTIC device was first flushed with 2 mL of 1×PBS buffer. Then, the EVs from culture media were isolated as follows: a 5 mL volume of culture medium was drawn up in the same syringe and connected with the ExoTIC device. This syringe, along with the ExoTIC device, was fixed onto a syringe pump. A pump flow rate of 5 mL/h was applied to filter the culture media, concentrating EVs in front of the nanoporous membrane. Free proteins, nucleic acids, etc., which are smaller than the membrane pore size (~50 nm), pass through the filter pores. The EV-containing retentate was then washed by running 5 mL of 1×PBS through the device using the same syringe. The ExoTIC device was then disconnected from the syringe, and the purified EV solution was collected via the device inlet using a 200 μL pipet. RNA extracted from the purified EV sample using the miRNeasy Mini Kit (Qiagen) was used to make libraries for RNA-seq, which were sequenced on a NextSeq500.
Statistical Analysis. All quantitative data for functional assays are reported as mean ± standard deviation. Statistical significance was calculated using a t-test, and p-values <0.05 were considered significant.
Data and Code Availability. All code for figures, file parsing, and data processing is available via https://github.com/rreggiar/aale.kras. Sequencing data are accessible via GEO accession GSE120566. Additionally, UCSC Genome Browser tracks used in figures are accessible at https://genome.ucsc.edu/s/rreggiar/kras.and.ctrl.atac.
RNA-seq Analysis. All fastq files were trimmed with Trimmomatic 2 (0.38) (27) using the Illumina NextSeq PE adapters. The resulting trimmed files were assessed with FastQC (28) and then passed through the following analytical pipeline. Salmon (1.0): pseudoalignment of RNA-seq reads was performed with Salmon (29) using the arguments --validateMappings --gcBias --seqBias --numBootstraps 20 and an index created from the GENCODE version 32 transcriptome fasta file using standard arguments. STAR (2.7.3a): trimmed reads were aligned to the human genome with default arguments using a 2-pass approach described previously (30). The resulting .sam files were converted to .bam, sorted, and indexed using Samtools with default arguments for all procedures (31). Sleuth (0.30.0): transcript differential expression was performed using Sleuth (32), with Wasabi (1.0.1) used to convert the Salmon output into the proper format. Upon completion, the transcripts with q-values below 0.05 in the likelihood-ratio test were used to filter the Salmon output, from which the log2 fold change was manually calculated and paired to the Sleuth output. Sleuth was primarily used for quantifying differential expression of Transposable Element loci, in which case the provided reference was the repeat-masked loci sequences from the UCSC Genome Browser. DESeq2 (1.24.0): Salmon output was imported into a DESeq object using tximport (33), and differential expression analysis was performed with standard arguments (34). All results were filtered to have padj <= 0.01. Where R could only generate 0.00 for a padj value, that value was reset to the lowest non-zero padj value in the data set. Where count data were used, they were normalized across samples using DESeq.
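By way of non-limiting illustration, the handling of padj values described above (filtering at padj <= 0.01 and resetting values reported as 0.00 to the lowest non-zero padj in the data set) may be sketched as follows. The function names and the example values are hypothetical, and missing (NA) padj values are assumed to have been removed beforehand.

```python
def reset_zero_padj(padj_values):
    """Replace padj values of exactly 0 with the smallest non-zero padj."""
    nonzero = [p for p in padj_values if p > 0]
    if not nonzero:
        # Nothing to substitute; return the values unchanged.
        return list(padj_values)
    floor = min(nonzero)
    return [floor if p == 0 else p for p in padj_values]


def filter_significant(padj_values, threshold=0.01):
    """Keep only padj values at or below the threshold."""
    return [p for p in padj_values if p <= threshold]


# Hypothetical adjusted p-values, for illustration only.
padj = [0.0, 1e-300, 0.004, 0.02]
cleaned = reset_zero_padj(padj)
significant = filter_significant(cleaned)
```

Resetting zeros in this way keeps quantities such as -log10(padj) finite when the values are later plotted or ranked.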
Transposable Element Content Analysis. Exon and 5′/3′ UTR Overlap: a whole-genome .gtf file was downloaded from the UCSC Genome Browser table browser utility. This file was parsed and merged with the GENCODE v.29 reference transcriptome. This modified .gtf (now a .bed file) was passed to bedtools (35), where the overlap function was used with the following arguments: -a modified.gtf bed -b all.ucsc.rmsk.genes.bed -wao -s > retained.overlap.bed, alongside a whole-genome .gtf retrieved as described above but generated from the repeat-masked browser track. The resulting overlapped bed file was processed and visualized using custom R scripts. Differential Expression: differential transcript abundance was determined using the Salmon and Sleuth procedures described above, provided with a custom index comprising both the GENCODE version 29 transcripts and all transcripts extracted from the Hammell lab GTF file as described in the single-cell procedures. Sleuth output was filtered and visualized using R and Tidyverse.
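By way of non-limiting illustration, the effect of the -wao and -s arguments (report every feature of the first file together with the number of overlapping bases, counting only features on the same strand) may be sketched in plain Python as follows. This is a simplified stand-in and not the bedtools implementation; the function name, tuple layout, and coordinates are hypothetical.

```python
def stranded_overlaps(a_features, b_features):
    """Report strand-matched overlaps between two BED-like feature lists.

    Each feature is (chrom, start, end, name, strand) in 0-based, half-open
    BED coordinates. Every A feature yields at least one row of the form
    (a_name, b_name, overlap_bp); an A feature with no overlap yields
    (a_name, None, 0).
    """
    rows = []
    for a_chrom, a_start, a_end, a_name, a_strand in a_features:
        found = False
        for b_chrom, b_start, b_end, b_name, b_strand in b_features:
            # Same chromosome and same strand are both required.
            if a_chrom != b_chrom or a_strand != b_strand:
                continue
            overlap_bp = min(a_end, b_end) - max(a_start, b_start)
            if overlap_bp > 0:
                rows.append((a_name, b_name, overlap_bp))
                found = True
        if not found:
            rows.append((a_name, None, 0))
    return rows


# Hypothetical coordinates, for illustration only.
exons = [("chr1", 100, 200, "exon1", "+"), ("chr1", 500, 600, "exon2", "-")]
repeats = [("chr1", 150, 250, "AluY", "+"), ("chr1", 550, 580, "MER20", "+")]
overlaps = stranded_overlaps(exons, repeats)
```

In this example the second exon lies on the minus strand, so the plus-strand repeat within its coordinates is not counted.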
Zinc Finger Protein Analysis. ChIP-exo data and supplementary information were extracted from supplementary data provided by Imbeault et al. (24). ZNF genes were cross-referenced with DESeq2 and RepeatMasker (36) outputs to extract relevant differential expression data of ZNF proteins and Transposable Element transcripts using R. RepeatMasker output from promoter analyses was cross-referenced with ChIP-exo target data to identify potential regulatory targets of differentially expressed KZNFs. Only KZNF targets with a ‘score’ (see Imbeault et al.) >= 75 were kept for analysis. Analysis of all data was performed and visualized in R using custom scripts.
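By way of non-limiting illustration, the cross-referencing step above may be sketched as a filter over ChIP-exo hits that retains pairs with a score of at least 75, a differentially expressed KZNF, and a differentially expressed TE. The record keys (`kznf`, `te`, `score`), the function name, and the placeholder gene names ZNF_A and ZNF_B are hypothetical and do not reflect the actual file layout.

```python
def candidate_kznf_targets(chip_hits, de_kznfs, de_tes, min_score=75):
    """Return sorted, de-duplicated (KZNF, TE) pairs passing all filters."""
    kznf_set = set(de_kznfs)
    te_set = set(de_tes)
    pairs = {
        (hit["kznf"], hit["te"])
        for hit in chip_hits
        if hit["score"] >= min_score
        and hit["kznf"] in kznf_set
        and hit["te"] in te_set
    }
    return sorted(pairs)


# Placeholder records, for illustration only.
hits = [
    {"kznf": "ZNF_A", "te": "MER20", "score": 80},
    {"kznf": "ZNF_A", "te": "L1MC4a", "score": 60},  # below the score cutoff
    {"kznf": "ZNF_B", "te": "MER20", "score": 90},   # KZNF not in the DE list
]
targets = candidate_kznf_targets(hits, ["ZNF_A"], ["MER20", "L1MC4a"])
```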
Gene Set Enrichment Analysis. Genes determined to be significantly differentially expressed in DESeq2 output were first ‘pre-ranked’ in R by the following metric: Score metric = sign(log2FoldChange) * -log10(p-value). The resulting ranked file objects were processed using the R package fgsea alongside gene set files downloaded from msigdb using the R package msigdbr. Additional code was written for select visualizations.
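By way of non-limiting illustration, the pre-ranking step may be sketched as follows, reading the score metric as sign(log2FoldChange) × -log10(p-value), which is the usual pre-ranking convention. The function names and example values are hypothetical, and p-values are assumed to be strictly greater than zero.

```python
import math


def prerank_score(log2_fold_change, p_value):
    """Signed significance score: direction of change times -log10(p)."""
    # Evaluates to +1, -1, or 0 depending on the direction of change.
    sign = (log2_fold_change > 0) - (log2_fold_change < 0)
    return sign * -math.log10(p_value)


def prerank(genes):
    """Sort (name, log2fc, p_value) records from most up to most down."""
    scored = [(name, prerank_score(lfc, p)) for name, lfc, p in genes]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Placeholder gene records, for illustration only.
ranked = prerank([
    ("gene_a", -1.5, 0.001),
    ("gene_b", 2.0, 0.01),
    ("gene_c", 0.4, 0.5),
])
```

Genes that are strongly and significantly upregulated fall at the top of the ranked list and strongly downregulated genes at the bottom, which is the ordering a pre-ranked enrichment test consumes.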
Gene Ontology Analysis. Upregulated gene names were extracted from DESeq2 output using bash command line tools. Name lists were pasted into the Gene Ontology Consortium's Enrichment Analysis tool powered by PANTHER. Output data was exported as .txt files and parsed using bash command line tools. Parsed data was visualized using custom R scripts.
Single Cell Analysis. Cell Ranger: single-cell output data were processed using the 10x pipeline Cell Ranger. The mkfastq functionality was used to generate fastq files for further downstream analysis. Salmon alevin: the fastq files generated above were passed to Salmon alevin with the default arguments for CHROMIUM V2 data. alevin was used to pseudoalign the libraries to a reference combining the GENCODE v32 transcriptome with the repeat-masked loci sequences extracted from hg38 via the UCSC Genome Browser. A Salmon index was built from this reference with standard arguments. The alevin output matrices were imported into R using tximport. STARsolo: this feature within the STAR software was used to generate single-cell SAM files for downstream processing and was run with the recommended arguments. Seurat (3.0): normalization and UMAP clustering were performed with Seurat following their described approach, optimized to our data set (see code notebook). Additional code was written to extract count data from Seurat single-cell objects using the SingleCellExperiment R package (37).
TCGA ZNF Analysis. TCGA-LUAD and GTEx lung phenotype and normalized count data were downloaded from the UCSC Xena browser TOIL data repository (https://xenabrowser.net/datapages/?cohort=TCGA%20TARGET%20GTEx&addHub=https%3A%2Fxenatreehousegiuscs.edu&removeHub=https%3A%2Fxena.treehouse.gi.ucsc.edu%3A443). The files were combined, and patients were grouped by their KRAS mutation status and identity. These data were compared to and visualized alongside data generated from our analysis using custom R code. Significance was determined with a one-way t test implemented in the R t.test( ) function.
RNA Editing. RNAEditingIndexer: Single-cell SAM files were subsampled into 3 equal subsets per cluster based on barcode. Each SAM file was then converted into a BAM as described above and used as input for the RNAEditingIndexer script with bed files generated by extracting the locations of detected Transposable Element loci from Sleuth output (21).
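By way of non-limiting illustration, dividing the cell barcodes of each cluster into three subsets of near-equal size may be sketched as follows. The text above does not state how barcodes were assigned to subsets; the seeded random shuffle used here, the function name, and the example barcodes are assumptions.

```python
import random


def split_barcodes(cluster_to_barcodes, n_subsets=3, seed=0):
    """Partition each cluster's barcodes into n_subsets disjoint groups.

    Barcodes are sorted and then shuffled with a fixed seed so that the
    partition is reproducible; group sizes differ by at most one.
    """
    rng = random.Random(seed)
    partition = {}
    for cluster, barcodes in cluster_to_barcodes.items():
        shuffled = sorted(barcodes)
        rng.shuffle(shuffled)
        # Taking every n-th barcode yields disjoint, near-equal groups.
        partition[cluster] = [shuffled[i::n_subsets] for i in range(n_subsets)]
    return partition


# Placeholder barcodes, for illustration only.
clusters = {"cluster_0": ["BC%02d" % i for i in range(7)]}
subsets = split_barcodes(clusters)
```

Each subset of barcodes can then be used to select the matching reads from the single-cell alignment file, giving three pseudo-replicates per cluster.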
ATAC-seq Analysis. ENCODE: The ENCODE ATAC-seq pipeline was used for alignment, quality control, and MACS analysis with default arguments to produce output for downstream analysis. HOMER: Narrow peak files produced above were processed using a variety of HOMER tools, with default arguments where not explicitly stated: findMotifsGenome was used to find enriched motifs in each ATAC library and their subsets. annotatePeaks was used to generate detailed annotations of the motifs found above and their context. It was also used to create gene set differential accessibility histograms by using the -hist argument set to 1. mergePeaks allowed us to identify unique and overlapping peaks across the two libraries. makeTagDirectory was used to generate peak height estimates to be applied on all histogram comparisons (38).
Quantification and statistical analysis. All statistical analyses were performed with R (version 3.6.1) running from the Rocker ‘Tidyverse’ Docker container (rocker/tidyverse:3.6.1). An unpaired, two-sided t test was performed with the t.test( ) function on samples with 3 biological or technical replicates. Linear regression was carried out with the lm( ) function.
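By way of non-limiting illustration, the test statistic of an unpaired t test on triplicate samples may be sketched as follows. The sketch assumes the unequal-variance (Welch) form, which is the default of the R t.test( ) function; the text above does not state whether equal variances were assumed. The function name and example values are hypothetical, and the p-value, which requires the t distribution, is omitted.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return the Welch t statistic and Welch-Satterthwaite degrees of freedom."""
    # Squared standard error of each sample mean (sample variance / n).
    se2_a = variance(sample_a) / len(sample_a)
    se2_b = variance(sample_b) / len(sample_b)
    t_stat = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    dof = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(sample_a) - 1) + se2_b ** 2 / (len(sample_b) - 1)
    )
    return t_stat, dof


# Placeholder triplicate measurements, for illustration only.
t_stat, dof = welch_t([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```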
Additional Code. All analysis was performed in the R programming language with supplemental scripts written in Bash.
To determine the transcriptomic landscape of protein-coding and noncoding RNAs regulated by oncogenic RAS signaling, RNA sequencing (RNA-seq) was performed on human airway epithelial cells (AALE) that undergo malignant transformation upon introduction of mutant KRAS (
To explore the biological pathways that are perturbed by oncogenic RAS signaling, gene set enrichment analysis (GSEA) (15) was performed using genes that were differentially expressed in our mutant KRAS AALE cells. GSEA revealed that the most significantly enriched pathway was the interferon (IFN) alpha response, while the third most enriched pathway was the IFN gamma response (
It was then investigated whether this mutant KRAS ISG signature was specific to lung cells or if other cell types responded similarly. RNA-seq was performed on human embryonic kidney cells (HA1E) that were primed for oncogenic KRAS-driven transformation (17), and the effect of mutant KRAS on their transcriptomes was analyzed. As in transformed AALEs, thousands of protein-coding RNAs (n=2635 up, n=2639 down) and hundreds of lncRNAs (n=165 up, n=223 down) were differentially expressed (
To assess the heterogeneity of the KRAS ISG signature in mutant KRAS AALEs, single-cell RNA-seq (scRNA-seq) (n=1503 cells) (
To elucidate the molecular mechanisms involved in inducing intrinsic ISG signatures in mutant KRAS AALE cells, an Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq) (18) was performed. In mutant KRAS AALEs, open chromatin was strongly enriched at gene promoters for upregulated KRAS signaling genes, as well as KRAS ISG signature genes (
The molecular basis for intrinsic ISG signature activation in mutant KRAS AALE cells was investigated by analyzing the abundance of repetitive noncoding RNAs transcribed from TEs, which induce an IFN response in cancer cells when aberrantly expressed (19, 20). The LINE-1 element L1MC4a, the Alu elements AluSx, AluSg, AluJo, AluY, and AluSz6, and the hAT-Charlie element MER20 were all significantly upregulated in mutant KRAS AALE cells (
TE expression heterogeneity was examined in our single-cell RNA-seq data from mutant KRAS AALEs. Substantial heterogeneity was not observed in ALU, LINE, MER, or LTR class TE expression (
To test whether extracellular RNAs that are released from mutant KRAS cells might also exhibit differential RNA editing, extracellular vesicles were isolated from the culture media of control and mutant KRAS AALEs (22, 23). Extracellular vesicles isolated from mutant KRAS AALE cell culture media comprised two size classes of vesicles, ~150 nm and ~213 nm in diameter, while vesicles from control AALE media were ~196 nm in diameter (
Given the known roles of KRAB zinc-finger proteins (KZNFs) in TE silencing (24), the involvement of KZNFs in TE regulation in mutant KRAS AALEs was examined. Mutant KRAS AALEs showed a broad and significant down-regulation of repressive KRAB domain-containing zinc finger proteins (
KZNF chromatin immunoprecipitation sequencing (ChIP-seq) data (24) was analyzed using the University of California Santa Cruz (UCSC) Repeat Browser platform (25). Several of the significantly down-regulated KZNFs in mutant KRAS AALEs bind to the consensus TE sequences of MER20 and L1MC4a elements (
Collectively, our findings reveal the genomic impact of oncogenic KRAS signaling on repetitive noncoding RNAs and ISGs. Our conclusions are based on deeply sequencing and analyzing the transcriptomes of mutant KRAS-transformed AALE cells at the population, single-cell, and extracellular levels, as well as the epigenomic level, building on work identifying noncoding RNAs that are coordinately regulated with RAS signaling genes during epigenomic reprogramming (7). The molecular basis for the intrinsic ISG signature observed in mutant KRAS AALE cells is different from TE-induced IFN responses in cancer cells treated with DNA methyltransferase inhibitors (19, 20). Instead, the results indicate a prominent role for broad KZNF suppression during early stages of mutant KRAS-driven cellular transformation. The Examples herein suggest that oncogenic RAS signaling contributes to the early induction of intrinsic ISG signatures that are observed across many cancers and cancer cell lines with ADAR dependencies (16, 26).
In summary, mutant KRAS both directly and indirectly activates repetitive noncoding RNAs through activation of RAS pathway transcription factors and repression of KZNFs that target these TE-containing loci in the human genome. Moreover, mutant KRAS reprograms the epigenome to both directly and indirectly activate intrinsic ISG signatures through opening chromatin at ISG promoters and activating repetitive noncoding RNAs that are recognized by dsRNA-binding RNA sensors such as PKR and MDA5 (
One or more features from any embodiment described herein or in the figures may be combined with one or more features of any other embodiment described herein or in the figures without departing from the scope of the disclosure.
All publications, patents and patent applications cited in this specification are herein incorporated by reference as if each individual publication or patent application were specifically and individually indicated to be incorporated by reference. Although the foregoing disclosure has been described in some detail by way of illustration and example for purposes of clarity of understanding, it will be readily apparent to those of ordinary skill in the art in light of the teachings of this disclosure that certain changes and modifications may be made thereto without departing from the spirit or scope of the appended claims.
1. S. Djebali et al., Landscape of transcription in human cells. Nature 489, 101-108 (2012).
2. E. S. Lander et al., Initial sequencing and analysis of the human genome. Nature 409, 860-921 (2001).
3. K. H. Burns, Transposable elements in cancer. Nat Rev Cancer 17, 415-424 (2017).
4. J. L. Rinn, H. Y. Chang, Long Noncoding RNAs: Molecular Modalities to Organismal Functions. Annu Rev Biochem 89, 283-308 (2020).
5. D. Kelley, J. Rinn, Transposable elements reveal a stem cell-specific class of long noncoding RNAs. Genome Biol 13, R107 (2012).
6. F. J. Slack, A. M. Chinnaiyan, The Role of Non-coding RNAs in Oncology. Cell 179, 1033-1055 (2019).
7. D. H. Kim et al., Single-cell transcriptome analysis reveals dynamic changes in lncRNA expression during reprogramming. Cell Stem Cell 16, 88-101 (2015).
8. O. A. Kent et al., Repression of the miR-143/145 cluster by oncogenic Ras initiates a tumor-promoting feed-forward pathway. Genes Dev 24, 2754-2759 (2010).
9. Cancer Genome Atlas Research Network, Comprehensive molecular profiling of lung adenocarcinoma. Nature 511, 543-550 (2014).
10. E. L. Jackson et al., Analysis of lung tumor initiation and progression using conditional expression of oncogenic K-ras. Genes Dev 15, 3243-3248 (2001).
11. D. K. Simanshu, D. V. Nissley, F. McCormick, RAS Proteins and Their Regulators in Human Disease. Cell 170, 17-33 (2017).
12. Cancer Genome Atlas Research Network, Integrated Genomic Characterization of Pancreatic Ductal Adenocarcinoma. Cancer Cell 32, 185-203.e13 (2017).
13. Y. Liu et al., Comparative Molecular Analysis of Gastrointestinal Adenocarcinomas. Cancer Cell 33, 721-735.e8 (2018).
14. A. S. Lundberg et al., Immortalization and transformation of primary human airway epithelial cells by gene transfer. Oncogene 21, 4577-4586 (2002).
15. R. K. Powers, A. Goodspeed, H. Pielke-Lombardo, A. C. Tan, J. C. Costello, GSEA-InContext: identifying novel and common patterns in expression experiments. Bioinformatics 34, i555-i564 (2018).
16. H. Liu et al., Tumor-derived IFN triggers chronic pathway agonism and sensitivity to ADAR loss. Nat Med 25, 95-102 (2019).
17. E. Kim et al., Systematic Functional Interrogation of Rare Cancer Variants Identifies Oncogenic Alleles. Cancer Discov 6, 714-726 (2016).
18. J. D. Buenrostro, B. Wu, H. Y. Chang, W. J. Greenleaf, ATAC-seq: A Method for Assaying Chromatin Accessibility Genome-Wide. Curr Protoc Mol Biol 109, 21.29.1-21.29.9 (2015).
19. K. B. Chiappinelli et al., Inhibiting DNA Methylation Causes an Interferon Response in Cancer via dsRNA Including Endogenous Retroviruses. Cell 162, 974-986 (2015).
20. D. Roulois et al., DNA-Demethylating Agents Target Colorectal Cancer Cells by Inducing Viral Mimicry by Endogenous Transcripts. Cell 162, 961-973 (2015).
21. S. H. Roth, E. Y. Levanon, E. Eisenberg, Genome-wide quantification of ADAR adenosine-to-inosine RNA editing activity. Nat Methods 16, 1131-1138 (2019).
22. D. Enderle et al., Characterization of RNA from Exosomes and Other Extracellular Vesicles Isolated by a Novel Spin Column-Based Method. PLoS One 10, e0136133 (2015).
23. F. Liu et al., The Exosome Total Isolation Chip. ACS Nano 11, 10712-10723 (2017).
24. M. Imbeault, P. Y. Helleboid, D. Trono, KRAB zinc-finger proteins contribute to the evolution of gene regulatory networks. Nature 543, 550-554 (2017).
25. J. D. Fernandes et al., The UCSC repeat browser allows discovery and visualization of evolutionary conflict across repeat families. Mob DNA 11, 13 (2020).
26. H. S. Gannon et al., Identification of ADAR1 adenosine deaminase dependency in a subset of cancer cells. Nat Commun 9, 5450 (2018).
27. A. M. Bolger, M. Lohse, B. Usadel, Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114-2120 (2014).
28. J. Brown, M. Pirrung, L. A. McCue, FQC Dashboard: integrates FastQC results into a web-based, interactive, and extensible FASTQ quality control tool. Bioinformatics, (2017).
29. R. Patro, G. Duggal, M. I. Love, R. A. Irizarry, C. Kingsford, Salmon provides fast and bias-aware quantification of transcript expression. Nat Methods 14, 417-419 (2017).
30. A. Dobin et al., STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15-21 (2013).
31. H. Li et al., The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078-2079 (2009).
32. H. Pimentel, N. L. Bray, S. Puente, P. Melsted, L. Pachter, Differential analysis of RNA-seq incorporating quantification uncertainty. Nat Methods 14, 687-690 (2017).
33. C. Soneson, M. I. Love, M. D. Robinson, Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences. F1000Res 4, 1521 (2015).
34. M. I. Love, W. Huber, S. Anders, Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15, 550 (2014).
35. C. Guo et al., Tau Activates Transposable Elements in Alzheimer's Disease. Cell Rep 23, 2874-2880 (2018).
36. S. Tempel, Using and understanding RepeatMasker. Methods Mol Biol 859, 29-51 (2012).
37. T. Stuart et al., Comprehensive Integration of Single-Cell Data. Cell 177, 1888-1902.e21 (2019).
38. S. Heinz et al., Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol Cell 38, 576-589 (2010).
This application claims priority to U.S. Provisional Application No. 63/242,247, filed Sep. 9, 2021, the disclosure of which is hereby incorporated by reference in its entirety for all purposes.
Number | Date | Country
---|---|---
63242247 | Sep 2021 | US