The present invention relates to RNA-containing compositions and methods of their use.
The recent development of total RNA sequencing has allowed a better appreciation of the complexity and breadth of the entire transcriptome (Djebali et al., “Landscape of Transcription in Human Cells,” Nature 489:101-108 (2012); ENCODE Project Consortium, “An Integrated Encyclopedia of DNA Elements in the Human Genome,” Nature 489:57-74 (2012); Harrow et al., “GENCODE: The Reference Human Genome Annotation for the ENCODE Project,” Genome Res. 22:1760-1774 (2012), and Martin et al., “Next-Generation Transcriptome Assembly,” Nature Rev. Genet. 12:671-682 (2011)). Analysis by the Encyclopedia of DNA Elements (“ENCODE”) consortium unexpectedly showed that far more of the mammalian genome than previously appreciated is transcribed into non-coding RNA (“ncRNA”). Several short ncRNAs have conserved metabolic and regulatory functions, and some anti-viral properties have been assigned to novel classes of ncRNA such as eukaryotic small-interfering RNA, piwi-interacting RNA, and prokaryotic CRISPR RNA (Rinn et al., “Genome Regulation by Long Noncoding RNAs,” Ann. Rev. Biochem. 81:145-66 (2012)). In eukaryotes, long non-coding RNAs (“lncRNA”), such as long-intergenic non-coding RNA, have been associated with transcriptional, post-transcriptional, and epigenetic regulation (Atianand et al., “Molecular Basis of DNA Recognition in the Immune System,” J. Immunol. 190:1911-1918 (2013) and Zhang et al., “The Ways of Action of Long Non-Coding RNAs in Cytoplasm and Nucleus,” Gene 547:1-9 (2014)).
It is now evident that germ line and cancer cells can have atypical ncRNA transcription, including repetitive elements from regions usually silenced in steady state (Leonova et al., “P53 Cooperates with DNA Methylation and a Suicidal Interferon Response to Maintain Epigenetic Silencing of Repeats and Noncoding RNAs,” Proc. Natl. Acad. Sci. 110:E89-E98 (2013) and Ting et al., “Aberrant Overexpression of Satellite Repeats in Pancreatic and Other Epithelial Cancers,” Science 331:593-596 (2011)). In eukaryotes, transcription of endogenous retroviruses and mobile elements is mostly repressed epigenetically through processes such as histone modification and DNA methylation, preventing disruptive or deregulatory effects due to integration into coding regions. In mammals, DNA methylation targets the cytidine in CpG motifs to form 5-methyl cytosine contributing to down-regulation of transcription for methylated sequences (Jones et al., “The Role of DNA Methylation in Mammalian Epigenetics,” Science 293:1068-1070 (2001)). Epigenetic regulation is strongly associated with developmental process whereas its deregulation, such as by disruption of DNA methylation, can be associated with de-differentiation and carcinogenic processes (Feinberg et al., “The History of Cancer Epigenetics,” Nature Rev. Cancer 4:143-153 (2004) and Yi et al., “Multiple Roles of p53-Related Pathways in Somatic Cell Reprogramming and Stem Cell Differentiation,” Cancer Res. 72:5635-5645 (2012)).
When expressed, endogenous retroviral RNA can activate the innate immune response via several pathways (Zeng et al., “MAVS cGAS and Endogenous Retroviruses in T-independent B Cell Responses,” Science 346:1486-1492 (2014)). In cancers, such as those driven by p53 mutations and epigenetic alterations, ncRNA associated with repetitive elements can be induced (Leonova et al., “P53 Cooperates with DNA Methylation and a Suicidal Interferon Response to Maintain Epigenetic Silencing of Repeats and Noncoding RNAs,” Proc. Natl. Acad. Sci. 110:E89-E98 (2013) and Ting et al., “Aberrant Overexpression of Satellite Repeats in Pancreatic and Other Epithelial Cancers,” Science 331:593-596 (2011)). In a study of mouse and human epithelial malignancies (Ting et al., “Aberrant Overexpression of Satellite Repeats in Pancreatic and Other Epithelial Cancers,” Science 331:593-596 (2011)), several repetitive elements emanating from genomic dark matter and often repressed in steady state conditions, particularly pericentromeric repeats such as GSAT (major satellite) in mouse and HSATII in humans, were only transcribed in cancer cells. A strong induction of repetitive elements from the mouse genome (particularly GSAT, B1, and B2) along with several other ncRNAs in cells bearing p53 oncogenic mutations and exposed to epigenome altering demethylating agents has been demonstrated (Leonova et al., “P53 Cooperates with DNA Methylation and a Suicidal Interferon Response to Maintain Epigenetic Silencing of Repeats and Noncoding RNAs,” Proc. Natl. Acad. Sci. 110:E89-E98 (2013)). Anomalous expression of the murine repetitive element GSAT was shown to trigger transcription of repeat-dependent activated interferon response (TRAIN), which can regulate apoptosis related cell death. The mechanism is that the double strands form immediately via bi-directional transcription. 
That is, as GSAT is being transcribed in the positive sense by one polymerase (pol II), its complementary DNA strand is being transcribed by pol III at the same time. In this model, single-stranded GSAT is never produced; the double-stranded RNA is formed during transcription. There has been no indication in Leonova et al., “P53 Cooperates with DNA Methylation and a Suicidal Interferon Response to Maintain Epigenetic Silencing of Repeats and Noncoding RNAs,” Proc. Natl. Acad. Sci. 110:E89-E98 (2013) or elsewhere that single-stranded GSAT RNA would be immunostimulatory.
The present invention is directed to overcoming these and other deficiencies in the art.
One aspect of the present invention relates to a composition comprising an isolated, single-stranded RNA molecule having a nucleotide sequence comprising 20 or more bases and a pattern of CpG dinucleotides defined by a strength of statistical bias greater than or equal to zero, and a pharmaceutically acceptable carrier suitable for injection.
Another aspect of the present invention relates to a kit comprising a cancer vaccine and the composition of the present invention as an adjuvant to the cancer vaccine.
A further aspect of the present invention relates to a method of treating a subject for a tumor. This method involves administering to a subject the composition of the present invention (i.e., a composition comprising an isolated, single-stranded RNA molecule having a nucleotide sequence comprising 20 or more bases and a pattern of CpG dinucleotides defined by a strength of statistical bias greater than or equal to zero, and a pharmaceutically acceptable carrier suitable for injection) under conditions effective to treat the subject for the tumor.
Another aspect of the present invention relates to a method of stimulating an immune response. This method involves providing the composition of the present invention (i.e., a composition comprising an isolated, single-stranded RNA molecule having a nucleotide sequence comprising 20 or more bases and a pattern of CpG dinucleotides defined by a strength of statistical bias greater than or equal to zero, and a pharmaceutically acceptable carrier suitable for injection) and contacting a cell or tissue with the composition under conditions effective to induce or increase an immune response against cancer in the cell or tissue.
A set of novel mathematical tools originally developed to analyze potentially immunostimulatory motif usage in viral and host genome coding sequences was used here. These methods were recently recast in the language of statistical physics and are extended here to analyze ncRNA motif usage (Greenbaum et al., “Patterns of Evolution and Host Gene Mimicry in Influenza and Other RNA Viruses,” PLoS Path. 4:e1000079 (2008) and Greenbaum et al., “Quantitative Theory of Entropic Forces Acting on Constrained Nucleotide Sequences Applied to Viruses,” Proc. Natl. Acad. Sci. 111:5054-5059 (2014)). For the first time, large-scale patterns of motif usage in human and murine transcriptomes, which are used to find anomalous ncRNA expressed in cancer transcriptomes (Rinn et al., “Genome Regulation by Long Noncoding RNAs,” Ann. Rev. Biochem. 81:145-66 (2012) and Ulitsky et al., “lincRNAs: Genomics Evolution and Mechanisms,” Cell 154:26-46 (2013)), were analyzed. As a result, features of ncRNA over-expressed in cancerous cells relative to normal cells were characterized (Leonova et al., “P53 Cooperates with DNA Methylation and a Suicidal Interferon Response to Maintain Epigenetic Silencing of Repeats and Noncoding RNAs,” Proc. Natl. Acad. Sci. 110:E89-E98 (2013); Ting et al., “Aberrant Overexpression of Satellite Repeats in Pancreatic and Other Epithelial Cancers,” Science 331:593-596 (2011); Levine et al., “The maintenance of epigenetic states by p53: the guardian of the epigenome,” Oncotarget 3:1503-1504 (2012)). This analysis includes several large datasets of functionally characterized ncRNA, in addition to pseudogenes and repetitive elements such as satellite DNA, endogenous retroviruses, and long and short interspersed elements. It is demonstrated that many ncRNAs preferentially expressed in cancerous cells display anomalous motif usage patterns compared to the vast majority of ncRNAs, whose patterns of motif usage are shown to be consistent with those in coding regions.
Based on their unusual pattern of motif usage and differential expression in cancerous versus normal cells, it is predicted that the ncRNA HSATII (human) and the ncRNA GSAT (murine) incorporate immunostimulatory motifs in humans and mice, respectively. Remarkably, this prediction is validated by demonstrating that both directly stimulate antigen-presenting cells; accordingly, they are labeled immunostimulatory ncRNAs (“i-ncRNAs”).
Other features and advantages of the invention will be apparent from the following detailed description and claims.
The invention described herein relates to RNA-containing compositions and methods of their use.
In a first aspect, the present invention relates to a composition comprising an isolated, single-stranded RNA molecule having a nucleotide sequence comprising 20 or more bases and a pattern of CpG dinucleotides defined by a strength of statistical bias greater than or equal to zero, and a pharmaceutically acceptable carrier suitable for injection.
The composition of the present invention may be a pharmaceutical composition in the form of a vaccine, or a pharmaceutical composition intended to be co-administered with a vaccine, e.g., as an adjuvant.
In one embodiment, the RNA molecule in the composition of the present invention is an isolated RNA molecule. The term “isolated RNA molecule” includes RNA molecules which are separated from other nucleic acid molecules which are present in the natural source of the RNA. An “isolated” nucleic acid molecule is free of sequences which naturally flank the nucleic acid (i.e., sequences located at the 5′ and 3′ ends of the nucleic acid molecule). For example, in various embodiments, the isolated RNA molecule contains a defined number of bases. Moreover, an “isolated” nucleic acid molecule is substantially free of other cellular material, or culture medium, when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized.
In one embodiment, the RNA molecule is a single-stranded RNA molecule.
In another embodiment, the composition comprises an isolated RNA molecule having a nucleotide sequence comprising 20 or more bases and a pattern of CpG dinucleotides defined by a strength of statistical bias greater than or equal to zero, with the proviso that the RNA molecule is not GSAT.
Suitable RNA molecules in the composition of the present invention include, without limitation, an RNA molecule having the nucleotide sequence of SEQ ID NOs:1-319, or a fragment thereof. Such RNA molecules can be isolated using standard molecular biology techniques and the sequence information provided herein. In one embodiment, using all or a portion of the nucleic acid sequence of SEQ ID NOs:1-319 as a hybridization probe, RNA molecules can be isolated using standard hybridization and cloning techniques (e.g., as described in Sambrook, J. et al. Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989, which is hereby incorporated by reference in its entirety).
Moreover, an RNA molecule in the composition of the present invention can be isolated by the polymerase chain reaction (PCR) using synthetic oligonucleotide primers. In one embodiment, the primers are designed based upon the sequence (or a portion thereof) of any one or more of SEQ ID NOs:1-319.
The RNA molecule in the composition is an RNA molecule of about 20 or more bases in length. The length of the RNA molecule (i.e., the total number of bases) may vary depending on the pattern of CpG dinucleotides and the strength of statistical bias. In one embodiment, the RNA molecule has about 20-1200 bases, about 20-1100 bases, about 20-1000 bases, about 20-900 bases, about 20-800 bases, about 20-700 bases, about 20-600 bases, about 20-500 bases, about 20-450 bases, about 20-400 bases, about 20-350 bases, about 20-300 bases, about 20-250 bases, about 20-200 bases, about 20-190 bases, about 20-185 bases, about 20-180 bases, about 20-175 bases, about 20-170 bases, about 20-165 bases, about 20-160 bases, about 20-155 bases, about 20-150 bases, about 20-145 bases, about 20-140 bases, about 20-135 bases, about 20-130 bases, about 20-125 bases, about 20-120 bases, about 20-115 bases, about 20-110 bases, about 20-105 bases, about 20-100 bases, about 20-95 bases, about 20-90 bases, about 20-85 bases, about 20-80 bases, about 20-75 bases, about 20-70 bases, about 20-65 bases, about 20-60 bases, about 20-55 bases, about 20-50 bases, about 20-45 bases, about 20-40 bases, about 20-35 bases, or about 20-30 bases.
The RNA molecule of the composition has a pattern of CpG dinucleotides defined by a strength of statistical bias greater than or equal to zero. A physical system can be defined by the various states in which it can exist, and all the parameters involved in known constraints. When no assumption is made about the particular state the system is in, the system can be defined by the probability distribution of each of the states being occupied.
An RNA molecule with a pattern of motifs (e.g., CpG dinucleotides) can be defined by its length, nucleotide frequencies (i.e., the proportion of each nucleotide present in the sequence), and the number of times the motif is observed in the sequence. An RNA molecule of length L can take 4^L different states, with each of those states being characterized by a number of motifs.
When considering the probability of a number of motifs (e.g., CpG dinucleotides) observed in a particular sequence, a random-nucleotide model can be used to define the probability distribution of observing a given number of motifs in all 4^L possible sequences of length L, and with nucleotide frequencies according to the proportion observed in the given sequence. The random model gives rise to a distribution of states for such a sequence, each state having a number of motifs.
To quantify deviation of the particular observed sequence (i.e., state) from the random expectation, an additional parameter, referred to here as selective force, or simply force (e.g., force on CpG or force on UpA) may be added to the model. This additional parameter introduces a statistical bias in the probability distribution towards observing a particular state (i.e., a particular number of observed motifs). In the absence of this statistical bias, the probability of a given state (i.e., the number of observed motifs in a particular sequence) simplifies to the product of its nucleotide frequencies, whereas positive force shifts the distribution towards a larger number of observed motifs than what one would expect under the purely random model. Given a particular sequence, the “strength of statistical bias” is defined herein as the value of the force that maximizes the probability of the observed sequence. That is, the strength of statistical bias is the value for the force that results in a probability distribution of the number of motifs for a given sequence with length L and nucleotide frequencies such that the mean of the probability distribution is equal to the observed number of motifs in the sequence, as demonstrated in Example 5 (infra).
The larger the deviation of the number of the motifs observed in a given sequence is from random, the larger the force required to generate a distribution in which the number of observed motifs in the sequence is equal to the mean of the distribution.
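By way of illustration only, the relationship between force and the expected number of motifs described above can be sketched numerically. The sketch below is a minimal, non-authoritative Python example and is not the procedure of Example 5. It assumes a model in which each sequence is weighted by the product of its nucleotide frequencies multiplied by exp(force × number of CpG dinucleotides), assumes the sequence contains only A, C, G, and U (or T), and solves by bisection for the force at which the mean number of CpG dinucleotides equals the observed number. The function names, the search interval, and the tolerance are hypothetical choices made for this sketch.

```python
import math

BASES = "ACGU"


def count_motif(seq, motif="CG"):
    """Count occurrences of a dinucleotide motif in seq."""
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == motif)


def log_partition(freqs, length, force, motif="CG"):
    """Log of the sum, over all 4^length sequences, of
    (product of nucleotide frequencies) * exp(force * motif count).
    Computed one position at a time so that 4^length states are never listed."""
    weights = dict(freqs)  # weight of one-base prefixes ending in each base
    log_z = 0.0
    for _ in range(length - 1):
        step = {}
        for b in BASES:
            total = 0.0
            for a in BASES:
                boost = math.exp(force) if a + b == motif else 1.0
                total += weights[a] * boost
            step[b] = total * freqs[b]
        norm = sum(step.values())
        log_z += math.log(norm)  # rescale each step to avoid overflow
        weights = {b: w / norm for b, w in step.items()}
    return log_z


def expected_motifs(freqs, length, force, eps=1e-4):
    """Mean motif count under the biased model: d(log Z)/d(force),
    approximated here by a central finite difference."""
    upper = log_partition(freqs, length, force + eps)
    lower = log_partition(freqs, length, force - eps)
    return (upper - lower) / (2 * eps)


def strength_of_bias(seq, lo=-20.0, hi=20.0, tol=1e-6):
    """Force at which the model mean equals the observed CpG count.
    The search interval [lo, hi] is an assumption of this sketch."""
    seq = seq.upper().replace("T", "U")
    freqs = {b: seq.count(b) / len(seq) for b in BASES}
    observed = count_motif(seq)
    if observed == 0:
        return float("-inf")  # no finite force gives a mean of zero
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if expected_motifs(freqs, len(seq), mid) < observed:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

Under these assumptions, a sequence with more CpG dinucleotides than its nucleotide frequencies predict yields a positive value, and a sequence with fewer yields a negative value. Because the mean motif count increases monotonically with force in this model, the bisection has a single solution within the search interval whenever one exists there.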
The strength of statistical bias can be used as a parameter for identifying anomalous (i.e., outlier) states in a system, including anomalous use of motifs (e.g., CpG dinucleotides and other dinucleotide or trinucleotide repeats) in nucleotide sequences. In order to identify outliers, one must identify a threshold for which any strength of statistical bias that meets or exceeds the threshold will be considered anomalous. In order to identify a threshold, one may generate the distribution of observed strengths of statistical bias against a collection of samples chosen to represent the system (i.e., a reference set or panel). For example, a reference set for nucleotide sequences may include a set of biologically similar sequences, such as non-coding RNAs drawn from a database, such as the ENCODE database, as described in the Examples (infra). After the distribution of observed strengths of statistical bias is generated, it may be fit to a Gaussian distribution, characterized by a mean and standard deviation, and utilized as a null hypothesis (i.e., null distribution) against which to test the strength of statistical bias on any single sample. Once a statistical threshold is set, the identification of anomalous states may be carried out based only on the strength of statistical bias for the particular state in question, without the use of a reference set.
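The thresholding procedure described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the analysis of the Examples: the reference values are synthetic placeholders rather than ENCODE-derived data, the function names are hypothetical, and the one-sided upper-tail test and the significance level are choices made only for this sketch. It uses the standard-library `statistics.NormalDist` class.

```python
from statistics import NormalDist, mean, stdev


def fit_null(reference_forces):
    """Fit a Gaussian null distribution to forces observed in a reference set."""
    return NormalDist(mu=mean(reference_forces), sigma=stdev(reference_forces))


def upper_tail_p(null, force):
    """Probability, under the null, of a force at least this large."""
    return 1.0 - null.cdf(force)


def threshold_for(null, alpha):
    """Force above which a sample would be called anomalous at level alpha."""
    return null.inv_cdf(1.0 - alpha)


# Synthetic placeholder values (NOT data from the source); they stand in for
# forces on CpG computed across a reference panel of non-coding RNAs.
reference = [-1.2, -1.0, -0.9, -1.1, -0.8, -1.3, -1.0, -0.7]
null = fit_null(reference)

candidate_force = 0.0  # hypothetical sample under test
p_value = upper_tail_p(null, candidate_force)
cutoff = threshold_for(null, alpha=0.001)
is_anomalous = candidate_force >= cutoff
```

Once a cutoff has been fixed in this way, a new sequence can be classified from its own strength of statistical bias alone, which mirrors the statement above that the reference set is no longer needed after the threshold is set.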
The present invention, as demonstrated in Example 6 (infra), has defined the statistical threshold for identifying sequences with anomalous patterns of CpG dinucleotides as those sequences having a strength of statistical bias greater than or equal to zero.
Specific exemplary RNA molecules of the composition include, without limitation, SEQ ID NOs:1-96 (
The RNA molecule in the composition of the present invention has an immunostimulating effect on cells, including tumor cells. As used herein, the term “immunostimulating effect” or “stimulating an immune response” includes eliciting an immune response, e.g., inducing or increasing T cell-mediated and/or B cell-mediated immune responses that are influenced by modulation of T cell costimulation. Exemplary immune responses include B cell responses (e.g., antibody production), T cell responses (e.g., cytokine production, and cellular cytotoxicity), and activation of cytokine responsive cells, e.g., macrophages. Eliciting an immune response includes an increase in any one or more immune responses. It will be understood that upmodulation of one type of immune response may lead to a corresponding downmodulation in another type of immune response. For example, upmodulation of the production of certain cytokines (e.g., IL-10) can lead to downmodulation of cellular immune responses. The RNA molecule elicits an immunostimulating effect on immune cells. As used herein, the term “immune cell” includes cells that are of hematopoietic origin and that play a role in the immune response. Immune cells include lymphocytes, such as B cells and T cells; natural killer cells; and myeloid cells, such as monocytes, macrophages, eosinophils, mast cells, basophils, and granulocytes. The term “T cell” includes CD4+ T cells and CD8+ T cells. The term T cell also includes both T helper 1 type T cells and T helper 2 type T cells.
In formulating the RNA-containing composition of the present invention, the amount of RNA molecule included in the composition will vary depending on the choice of RNA molecule, its immunostimulating activity, and its intended treatment and subject.
In the composition of the present invention, the RNA molecule is incorporated into pharmaceutical compositions suitable for administration (e.g., by injection). Such compositions typically comprise the RNA molecule and a carrier, e.g., a pharmaceutically acceptable carrier. The pharmaceutically acceptable carrier suitable for injection is, according to one embodiment, a carrier for the RNA molecule. As used herein the language “pharmaceutically acceptable carrier” is intended to include any and all solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. Supplementary active compounds can also be incorporated into the compositions.
The pharmaceutically acceptable carrier may be a stabilizer, an emulsion, liposome, microsphere, immune stimulating complex, nanospheres, montanide, squalene, cyclic dinucleotides, complementary immune modulators, or any combination thereof. The carrier should be suitable for the desired mode of delivery of the composition (i.e., suitable for injection). Exemplary modes of delivery include, without limitation, intravenous, intra-arterial, intramuscular, intracavitary, subcutaneous, intradermal, transcutaneous, intrapleural, intraperitoneal, intraventricular, intra-articular, intraocular, intratumoral, or intraspinal injection.
A pharmaceutical composition of the invention is formulated to be compatible with its intended route of administration. Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol, or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates, or phosphates; and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be enclosed in ampoules, disposable syringes, or multiple dose vials made of glass or plastic.
Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersions. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.) or phosphate buffered saline (PBS). The composition must be sterile and should be fluid to the extent that easy syringeability exists. It must be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, liquid polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion, and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. It may be preferable to include isotonic agents, for example, sugars, polyalcohols such as mannitol and sorbitol, and sodium chloride in the composition. Prolonged absorption of the injectable compositions can be brought about by including in the composition an agent which delays absorption, for example, aluminum monostearate and gelatin.
Sterile injectable solutions can be prepared by incorporating the active compound (i.e., RNA molecule) in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle which contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying which yields a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.
It is especially advantageous to formulate parenteral compositions in dosage unit form for ease of administration and uniformity of dosage. Dosage unit form as used herein refers to physically discrete units suited as unitary dosages for the subject to be treated, each unit containing a predetermined quantity of active compound (i.e., RNA molecule) calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier. The specifications for the dosage unit forms of the invention are dictated by and directly dependent on the unique characteristics of the active compound, the particular therapeutic effect to be achieved, and the limitations inherent in the art of compounding such an active compound for the treatment of individuals.
Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures in cell cultures or experimental animals. The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the methods of the invention (described infra), the therapeutically effective dose can be estimated initially from cell culture assays. A dose may be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (i.e., the concentration of the test compound which achieves a half-maximal activity) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma may be measured, for example, by high performance liquid chromatography.
As defined herein, a therapeutically effective amount of an RNA molecule (i.e., an effective dosage) ranges from about 0.001 to 30 mg/kg body weight, or about 0.01 to 25 mg/kg body weight, or about 0.1 to 20 mg/kg body weight, or about 1 to 10 mg/kg, 2 to 9 mg/kg, 3 to 8 mg/kg, 4 to 7 mg/kg, or 5 to 6 mg/kg body weight. The skilled artisan will appreciate that certain factors may influence the dosage required to effectively treat a subject, including but not limited to, the severity of the disease or disorder, previous treatments, the general health and/or age of the subject, and other diseases present. Moreover, treatment of a subject with a therapeutically effective amount of an agent can include a single treatment or, preferably, can include a series of treatments.
In one embodiment, a subject is treated with the composition of the present invention in the range of between about 0.1 to 20 mg/kg body weight, one time per week for between about 1 to 10 weeks, preferably between 2 to 8 weeks, more preferably between about 3 to 7 weeks, and even more preferably for about 4, 5, or 6 weeks. It will also be appreciated that the effective dosage of composition used for treatment may increase or decrease over the course of a particular treatment. Changes in dosage may result and become apparent from the results of diagnostic assays.
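The arithmetic behind the weight-based weekly regimen of this embodiment can be illustrated with a short Python sketch. It is offered only as an illustration of the calculation and not as dosing guidance. The function name is hypothetical, and treating the "about 0.1 to 20 mg/kg" and "about 1 to 10 weeks" ranges as hard limits is an assumption of this sketch.

```python
def weekly_schedule(body_weight_kg, dose_mg_per_kg, weeks):
    """Return (week number, dose in mg) pairs for once-weekly administration.

    The range checks mirror the embodiment's stated ranges; enforcing them
    as strict bounds is an assumption made for this illustration.
    """
    if not 0.1 <= dose_mg_per_kg <= 20:
        raise ValueError("dose outside the 0.1 to 20 mg/kg range")
    if not 1 <= weeks <= 10:
        raise ValueError("duration outside the 1 to 10 week range")
    per_dose_mg = body_weight_kg * dose_mg_per_kg
    return [(week, per_dose_mg) for week in range(1, weeks + 1)]


# Hypothetical example: a 70 kg subject at 5 mg/kg once weekly for 4 weeks.
schedule = weekly_schedule(70, 5, 4)
```

A constant per-dose amount is used here for simplicity; as noted above, the effective dosage may increase or decrease over the course of a particular treatment.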
In one embodiment, nucleic acid molecules can be inserted into vectors and used as gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example, intravenous injection, local administration (U.S. Pat. No. 5,328,470, which is hereby incorporated by reference in its entirety) or by stereotactic injection (Chen et al., “Regression of Experimental Gliomas by Adenovirus-Mediated Gene Transfer In Vivo,” Proc. Natl. Acad. Sci. USA 91:3054-3057 (1994), which is hereby incorporated by reference in its entirety). The pharmaceutical preparation of the gene therapy vector can include the gene therapy vector in an acceptable diluent or can comprise a slow release matrix in which the gene delivery vehicle is imbedded. Alternatively, where the complete gene delivery vector can be produced intact from recombinant cells, e.g., retroviral vectors, the pharmaceutical preparation can include one or more cells which produce the gene delivery system. The pharmaceutical compositions can be included in a container, pack, or dispenser together with instructions for administration.
The composition of the present invention can also include an effective amount of an additional adjuvant or mitogen.
Suitable additional adjuvants include, without limitation, Freund's complete or incomplete adjuvant, mineral gels such as aluminum hydroxide, surface active substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, dinitrophenol, Bacille Calmette-Guerin, Corynebacterium parvum, non-toxic Cholera toxin, N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP), N-acetyl-nor-muramyl-L-alanyl-D-isoglutamine (CGP 11637, referred to as nor-MDP), N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1′-2′-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine (CGP 19835A, referred to as MTP-PE), and RIBI, which contains three components extracted from bacteria, monophosphoryl lipid A, trehalose dimycolate, and cell wall skeleton (MPL+TDM+CWS) in a 2% squalene/TWEEN® 80 emulsion.
As used herein, “mitogen” refers to any agent that stimulates lymphocytes to proliferate independently of an antigen. The mitogen, in combination with the RNA molecule in the composition of the present invention, helps to promote an immunostimulating effect on tumor cells. Exemplary mitogens include, without limitation, CpG oligodeoxynucleotides that stimulate immune activation as described in U.S. Pat. No. 6,194,388; U.S. Pat. No. 6,207,646; U.S. Pat. No. 6,214,806; U.S. Pat. No. 6,218,371; U.S. Pat. No. 6,239,116; U.S. Pat. No. 6,339,068; U.S. Pat. No. 6,406,705; and U.S. Pat. No. 6,429,199, each of which is hereby incorporated by reference in its entirety. Any suitable dosage of mitogen can be used to promote an immunostimulating effect on tumor cells. For example, a suitable dosage of mitogen comprises about 50 ng up to about 100 μg per ml, about 100 ng up to about 25 μg per ml, or about 500 ng up to about 5 μg per ml.
The composition may also include an antigen or an antigen-encoding RNA molecule. As used herein, “antigen” refers to any agent that induces an immune response, i.e., a protective immune response, against the antigen, and thereby affords protection against a pathogen or disease (e.g., cancer). The antigen can take any suitable form including, without limitation, whole virus or bacteria; virus-like particle; anti-idiotype antibody; bacterial, viral, or parasite subunit vaccine or recombinant vaccine; and bacterial outer membrane (“OM”) bleb formations containing one or more of bacterial OM proteins.
The antigen can be present in the compositions in any suitable amount that is sufficient to generate an immunologically desired response. The amount of antigen or antigen-encoding RNA molecule to be included in the composition will depend on the immunogenicity of the antigen itself and the efficacy of any adjuvants co-administered therewith. In general, an immunologically or prophylactically effective dose comprises about 1 μg to about 1,000 μg of the antigen, about 5 μg to about 500 μg, or about 10 μg to about 200 μg.
According to another embodiment, the composition (i.e., a first pharmaceutical composition) may further include a cancer vaccine (i.e., as a second pharmaceutical composition) that includes an antigen or a nucleic acid molecule encoding the antigen, and a pharmaceutically suitable carrier. According to this embodiment, the first pharmaceutical composition is intended to be co-administered with the second pharmaceutical composition for purposes of enhancing the efficacy of the vaccine. The first pharmaceutical composition is formulated for and/or administered in a manner that achieves an immunostimulating effect on tumor cells.
Cancer vaccines are known, and include, for example, sipuleucel-T (Provenge®, manufactured by Dendreon), which is approved for use in some men with metastatic prostate cancer. This vaccine is designed to stimulate an immune response to prostatic acid phosphatase (“PAP”), an antigen that is found on most prostate cancer cells. Sipuleucel-T is customized to each patient. The vaccine is created by isolating immune system cells called antigen-presenting cells (“APCs”) from a patient's blood through a procedure called leukapheresis. The APCs are sent to Dendreon, where they are cultured with a protein called PAP-GM-CSF. This protein consists of PAP linked to another protein called granulocyte-macrophage colony-stimulating factor (“GM-CSF”). The latter protein stimulates the immune system and enhances antigen presentation. APCs cultured with PAP-GM-CSF constitute the active component of sipuleucel-T. Each patient's cells are returned to the patient's treating physician and infused into the patient. Patients receive three treatments, usually 2 weeks apart, with each round of treatment requiring the same manufacturing process. Although the precise mechanism of action of sipuleucel-T is not known, it appears that the APCs that have taken up PAP-GM-CSF stimulate T cells of the immune system to kill tumor cells that express PAP.
Vaccines to prevent HPV infection and to treat several types of cancer are being studied in clinical trials. Active clinical trials of cancer treatment vaccines include vaccines for bladder cancer, brain tumors, breast cancer, cervical cancer, Hodgkin lymphoma, kidney cancer, leukemia, lung cancer, melanoma, multiple myeloma, non-Hodgkin lymphoma, pancreatic cancer, prostate cancer, and solid tumors. Active clinical trials of cancer preventive vaccines include those for cervical cancer and solid tumors. Cancer vaccines approved from these and other trials may be suitable cancer vaccines for use in combination with the composition of the present invention.
Another aspect of the present invention relates to a kit comprising a cancer vaccine and the composition of the present invention, as well as instructions and a suitable delivery device, which can optionally be pre-filled with the vaccine formulation (i.e., the composition of the present invention and the cancer vaccine). An exemplary delivery device includes, without limitation, a syringe comprising an injectable dose.
A further aspect of the present invention relates to a method of treating a subject for a tumor. This method involves administering to a subject the composition of the present invention under conditions effective to treat the subject for the tumor.
In one embodiment of this and other methods described herein, the subject is a mammal including, without limitation, humans, non-human primates, dogs, cats, rodents, horses, cattle, sheep, and pigs. Both juvenile and adult mammals can be treated. The subject to be treated in accordance with the present invention can be a healthy subject, a subject with a tumor, a subject with cancer, a subject being treated for cancer, a subject in cancer remission, or a subject that has an immune deficiency or is immunosuppressed. Although otherwise healthy, the elderly and the very young may have a less effective (or less developed) immune system and they may benefit greatly from the enhanced immune response.
Tumors include, without limitation, sarcoma, melanoma, lymphoma, leukemia, neuroblastoma, or carcinoma cell tumors.
In carrying out this and the other methods described herein, administering may be carried out as described supra, including, for example, intratumorally or systemically using a pharmaceutical composition as described supra, and amounts, dosages, and administration frequencies described supra.
A further aspect of the present invention relates to a method of stimulating an immune response against cancer in a cell or tissue. This method involves providing the composition of the present invention and contacting a cell or tissue with the composition under conditions effective to stimulate an immune response against cancer in the cell or tissue.
Cancers suitable for treatment in carrying out this aspect of the present invention include, for example and without limitation, those that are incident to pathogen infection, e.g., cervical cancer, vaginal cancer, vulvar cancer, oropharyngeal cancers, anal cancer, penile cancer, and squamous cell carcinoma of the skin caused by papillomavirus infection (D'Souza et al., “Case-Control Study of Human Papillomavirus and Oropharyngeal Cancer,” NEJM 356(19):1944-1956 (2007); Harper et al., “Sustained Immunogenicity and High Efficacy Against HPV 16/18 Related Cervical Neoplasia: Long-term Follow up Through 6.4 Years in Women Vaccinated with Cervarix (GSK's HPV-16/18 AS04 candidate vaccine),” Gynecol. Oncol. 109:158-159 (2008), each of which is hereby incorporated by reference in its entirety); liver cancer caused by Hepatitis B virus infection (Chang et al., “Decreased Incidence of Hepatocellular Carcinoma in Hepatitis B Vaccinees: A 20-Year Follow-up Study,” J. Natl. Cancer Inst. 101:1348-1355 (2009), which is hereby incorporated by reference in its entirety) or Hepatitis C virus infection; Burkitt lymphoma, non-Hodgkin lymphoma, Hodgkin lymphoma, and nasopharyngeal carcinoma caused by the Epstein-Barr virus; Kaposi sarcoma caused by the Kaposi sarcoma-associated herpesvirus; adult T-cell leukemia/lymphoma caused by the human T-cell lymphotropic virus type 1; stomach cancer and mucosa-associated lymphoid tissue lymphoma caused by the bacterium Helicobacter pylori; bladder cancer caused by the parasite Schistosoma hematobium; and cholangiocarcinoma caused by the parasite Opisthorchis viverrini. An enhanced immune response achieved by the methods of treatment and compositions of the present invention may enhance the preventative efficacy of such vaccines for the prevention of cancers.
In one embodiment, this and other methods of the present invention are carried out to treat cancers that have already developed in a subject. Thus, the methods and compositions of the present invention are intended to delay or stop cancer cell growth; to cause tumor shrinkage; to prevent cancer from coming back; or to eliminate cancer cells that have not been killed by other forms of treatment.
According to one embodiment, a composition to be administered includes the antigen that is intended to generate the desired immune response as well as the RNA molecule having a pattern of CpG dinucleotides defined by a strength of statistical bias greater than or equal to zero. Thus, the antigen and the RNA molecule are co-administered simultaneously. The composition may be administered as a vaccine in a single dose or in multiple doses, which can be the same or different.
This embodiment may optionally include further administration of a composition of the present invention that includes the RNA molecule but not the antigen. This composition can be administered once or twice daily within several days preceding vaccine administration and for a period of time following vaccine administration. By way of example, post-vaccine administration can be carried out for up to about six weeks following each vaccine administration, preferably at least about two to three weeks, or at least about 3 to 10 days following each vaccine administration.
According to a second embodiment, a vaccine composition to be administered includes the antigen that is intended to generate the desired immune response but not the RNA molecule. However, the RNA molecule can be co-administered at about the same time. For instance, the dosage of the vaccine can be administered intraperitoneally or intranasally, and a dosage of the RNA molecule can be administered orally at about the same time (same day). The dosage containing the RNA molecule can also be administered once or twice daily for up to about six weeks following the vaccine administration.
In carrying out this method of the present invention, contacting the cell or tissue with the composition may be carried out in vitro or in vivo.
According to another aspect of the present invention, the RNA-containing composition has an immunostimulating effect that primes (e.g., stimulates, induces, enhances, alters, or modulates) the anti-pathogen response of a subject's innate immune system in non-tumor cells. Such a response may find use, e.g., as an adjuvant to a vaccine, a vaccine supplement, or under conditions where such an immunostimulating effect is desirable.
Yet a further aspect of the present invention relates to a method for identifying RNA molecules with immunostimulating patterns of CpG dinucleotides. This method involves providing an RNA molecule, determining the length and frequency of nucleotides in the RNA molecule, determining the number of CpG dinucleotides present in the RNA molecule, calculating the strength of statistical bias on CpG dinucleotides for the RNA molecule, defining a threshold of statistical bias, determining if the strength of statistical bias on CpG dinucleotides for the RNA molecule meets or exceeds the threshold, and characterizing the RNA molecule sequence as possessing an immunostimulating pattern if it meets or exceeds the threshold of statistical bias.
In carrying out this method of the present invention, nucleotide frequencies are calculated by counting the number of times that a nucleotide occurs and dividing that number by the total length of the sequence, L (which may include ambiguously defined bases that cannot be assigned as A, C, G, U, or T). For example, fθ(A), the frequency of A nucleotides, would be the number of occurrences of the base, A, in S0 divided by L, the length of S0, even when ambiguous bases are included.
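The frequency calculation described above can be sketched in Python as follows. This is a minimal illustration only; the function name `nucleotide_frequencies` and the example sequence are hypothetical, and the treatment of T as U is an assumption made so that DNA-style input can be handled.

```python
from collections import Counter


def nucleotide_frequencies(seq):
    """Return f(A), f(C), f(G), f(U), each divided by the full length L.

    Ambiguous bases (e.g. N) are not assigned to any nucleotide, but they
    still count toward L, as in the description above.
    """
    seq = seq.upper().replace("T", "U")  # assumption: treat DNA-style T as U
    length = len(seq)
    if length == 0:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    return {base: counts[base] / length for base in "ACGU"}


# Hypothetical 10-nucleotide sequence containing one ambiguous base.
freqs = nucleotide_frequencies("ACGUNACGUA")
```

Because ambiguous bases contribute to L but not to any count, the four frequencies may sum to slightly less than one.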
In a further embodiment, the strength of statistical bias on CpG dinucleotides for the RNA molecule sequence (x(S0)) is determined by maximizing the probability of a sequence (S0) over x, where
$$P(S \mid x, m) = \frac{1}{Z_m(x)} \left[\prod_{i=1}^{L} f_\theta(s_i)\right] \exp\bigl(x\,N_m(S)\bigr),$$
Zm(x) is the normalization constant,
P(S|x, m) is the probability of the sequence given the force (x) and motif m,
x is the force on the motif m that introduces a statistical bias over P,
Nm(S) is the number of observed motifs, and
fθ(si) are the nucleotide frequencies.
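One way to carry out this maximization numerically is sketched below in Python. The sketch assumes that the model has the maximum-entropy form in which P(S|x, m) is proportional to the product of the nucleotide frequencies multiplied by exp(x·Nm(S)), that the motif is a dinucleotide, and that the sequence contains only A, C, G, and U. The function names, the search interval of -10 to 10, and the use of a ternary search are illustrative choices and are not taken from the specification.

```python
import math

BASES = "ACGU"


def log_partition(x, freqs, length, motif="CG"):
    """log Zm(x) for a dinucleotide motif, via a 4-state transfer recursion."""
    first, second = motif
    boost = math.exp(x)
    v = {b: freqs[b] for b in BASES}  # weight of prefixes ending in base b
    log_z = 0.0
    for _ in range(length - 1):
        total = sum(v.values())
        nxt = {}
        for b in BASES:
            s = total
            if b == second:
                # prefixes ending in `first` gain a factor exp(x) when followed by `second`
                s += v[first] * (boost - 1.0)
            nxt[b] = freqs[b] * s
        norm = sum(nxt.values())
        log_z += math.log(norm)  # rescale each step to avoid under/overflow
        v = {b: nxt[b] / norm for b in BASES}
    return log_z + math.log(sum(v.values()))


def count_motif(seq, motif="CG"):
    """Number of (overlapping) occurrences of a dinucleotide motif."""
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == motif)


def force(seq, motif="CG", lo=-10.0, hi=10.0, iters=100):
    """Force x maximizing log P(S0|x, m), found by ternary search on [lo, hi]."""
    seq = seq.upper().replace("T", "U")
    length = len(seq)
    freqs = {b: seq.count(b) / length for b in BASES}
    n_obs = count_motif(seq, motif)

    def log_lik(x):
        # terms that do not depend on x are dropped
        return x * n_obs - log_partition(x, freqs, length, motif)

    for _ in range(iters):  # log_lik is concave in x, so ternary search applies
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if log_lik(m1) < log_lik(m2):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2.0


# Hypothetical sequences: one enriched in CpG, one with no CpG at all.
x_enriched = force("CGCGCGCGAUAUAUAU")
x_depleted = force("GGGGCCCCAAAAUUUU")
```

When a sequence contains no occurrence of the motif, the likelihood has no finite maximizer, and the search simply returns a value close to the lower bound of the interval.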
Defining a threshold of statistical bias can be carried out by providing a reference set comprising a plurality of RNA molecule sequences, calculating the strength of statistical bias on CpG dinucleotides for each RNA molecule sequence in the reference set, generating a distribution of the strengths of statistical bias on CpG dinucleotides for the RNA molecule sequences in the reference set to define a null distribution, setting a statistical significance level, and determining the value of the strength of statistical bias that meets or exceeds the statistical significance value.
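The thresholding steps can be sketched in Python as follows, assuming that the forces for the reference set have already been calculated and that the null distribution is approximated by a Gaussian fitted to them. The function names, the significance level `alpha`, and the reference values are hypothetical and serve only to illustrate the procedure.

```python
from statistics import NormalDist


def force_threshold(reference_forces, alpha=0.001):
    """Force value exceeded with probability alpha under a Gaussian null."""
    null = NormalDist.from_samples(reference_forces)  # fit mean and std. dev.
    return null.inv_cdf(1.0 - alpha)


def has_immunostimulating_pattern(force_value, threshold):
    """True when the force meets or exceeds the threshold."""
    return force_value >= threshold


# Hypothetical CpG forces for a small reference set of RNA sequences.
reference = [-1.2, -0.9, -1.0, -1.1, -0.8, -1.0]
threshold = force_threshold(reference)
```

An empirical percentile of the reference forces could be used in place of the Gaussian fit when the reference set is large enough.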
The present invention may be further illustrated by reference to the following examples, which should not be construed as limiting.
Using a novel approach from statistical physics, the experiments described herein quantify global transcriptome-wide motif usage for the first time in human and murine ncRNAs determining that most have motif usage consistent with the coding genome. However, an outlier subset of tumor-associated ncRNAs typically of recent evolutionary origin has motif usage that is often indicative of pathogen-associated RNA. For instance, as demonstrated in these examples, the tumor associated human repeat HSATII is enriched in motifs containing CpG dinucleotides in AU-rich contexts which most of the human genome and human adapted viruses have evolved to avoid. It is further demonstrated that a key subset of these ncRNAs function as immunostimulatory “self-agonists” and directly activate cells of the mononuclear phagocytic system to produce pro-inflammatory cytokines. These ncRNAs arise from endogenous repetitive elements that are normally silenced, yet are often very highly expressed in cancers. The innate response in tumors may partially originate from direct interaction of immunogenic ncRNAs expressed in cancer cells with innate pattern recognition receptors and thereby assign a new danger-associated function to a set of dark matter repetitive elements. These findings potentially reconcile several observations concerning the role of ncRNA expression in cancers and their relationship to the tumor microenvironment.
Employing the GENCODE database of long non-coding RNA transcripts from humans and mice (Versions 19 and 2 for human and mouse, respectively) the strength of statistical bias (referred to as a force) on sequence motif usage for all contained lncRNAs was calculated as described in Example 5 (infra). GENCODE lncRNA established a baseline of sequence motif usage expressed in a broad array of cells and tissues so that these patterns of motif usage could be compared with those of ncRNAs expressed in certain cancers. For each sequence, the force (i.e. strength of statistical bias) on all two and three nucleotide motifs was calculated using EQUATION 5 (infra) to calculate the probability of observing a sequence with that number of motifs. The number of sequences in GENCODE for which a given dinucleotide is aberrantly expressed is illustrated in
Average force (i.e. strength of statistical bias) on a given motif in the Human and Mouse GENCODE dataset, for lncRNAs with length greater than 500 nucleotides. The forces (i.e. strengths of statistical bias) are listed for the significant motifs in humans. The force is a measure of the strength of statistical bias to enhance or suppress a motif versus what is expected from that sequence's nucleotide content.
These dinucleotide motif usage patterns are similar in human and mouse genomes across the wide array of cells and cell lines contained in GENCODE (Djebali et al., “Landscape of Transcription in Human Cells,” Nature 489:101-108 (2012) and Harrow et al., “GENCODE: The Reference Human Genome Annotation for the ENCODE Project,” Genome Res. 22:1760-1774 (2012), which are hereby incorporated by reference in their entirety). Strikingly, avoidance of the CpG and UpA dinucleotide motifs in this dataset is stronger than in coding regions (
Trinucleotide motifs with significant forces are listed in Table 1, along with dinucleotide motifs. Trinucleotide motifs with significant forces (i.e. strengths of statistical bias) acting on them are conserved between humans and mice, as was the case for dinucleotides, with the exception of UAC and UAG (which are significant in humans but less so in mice). Except for UAG (a chain termination codon used in coding RNAs), whenever a trinucleotide motif is significantly enhanced or avoided in humans, its reverse complement is also significantly enhanced or avoided, suggesting avoidance of complementary motifs. The strongest forces (i.e. strengths of statistical bias) suppress CpG and CpG-containing trinucleotides, particularly when an A or U is next to the core CpG motif. This is consistent with the avoidance of CpGs in AU contexts observed in influenza viruses replicating in humans (Greenbaum et al., “Quantitative Theory of Entropic Forces Acting on Constrained Nucleotide Sequences Applied to Viruses,” Proc. Natl. Acad. Sci. 111:5054-5059 (2014); Greenbaum et al., “Patterns of Oligonucleotide Sequences in Viral and Host Cell RNA Identify Mediators of the Host Innate Immune System,” PLoS One 4:e5969 (2009); Jimenez-Baranda et al., “Oligonucleotide Motifs that Disappear During the Evolution of Influenza Virus in Humans Increase Alpha Interferon Secretion by Plasmacytoid Dendritic Cells,” J. Virol. 85:3893-3904 (2011), which are hereby incorporated by reference in their entirety). Given the apparent bias against CpG and UpA, it was further determined whether these were linked. Pearson correlation between these forces across all GENCODE ncRNA in humans and mice showed no correlation between CpG and UpA biases (r=0.0006;
Prior work revealed aberrant expression of non-coding RNA across a spectrum of mouse and human cancers (Leonova et al., “P53 Cooperates with DNA Methylation and a Suicidal Interferon Response to Maintain Epigenetic Silencing of Repeats and Noncoding RNAs,” Proc. Natl. Acad. Sci. 110:E89-E98 (2013) and Ting et al., “Aberrant Overexpression of Satellite Repeats in Pancreatic and Other Epithelial Cancers,” Science 331:593-596 (2011), which are hereby incorporated by reference in their entirety). These sequences were found in the Repbase database of human and murine repetitive elements and the FANTOM database of murine non-coding elements (currently NONCODE) (Jurka et al., “Repbase Update A Database of Eukaryotic Repetitive Elements,” Cytogenetic and Genome Res. 110:462-467 (2005) and Xie et al., “NONCODEv4: Exploring the World of Long Non-Coding RNA Genes,” Nucleic Acids Res. 42:D98-D103 (2014), which are hereby incorporated by reference in their entirety). A high induction of GSAT in a murine testicular teratoma and liposarcoma tumor model was also found (
Listed above are the repetitive elements from Repbase with a significantly high CpG force. These elements are typically not found to be expressed in normal tissue, yet some may be expressed in cancer cells and cell lines.
The forces which quantify the strength of the statistical bias on the often underrepresented CpG and UpA dinucleotides were used to differentiate between ncRNAs found preferentially in cancerous cells and the total lncRNA referenced in GENCODE for humans and mice, as these two dinucleotides essentially account for all significant trinucleotide motifs in this set. The distributions of forces (i.e. strengths of statistical bias) on CpG and UpA were used to define a null hypothesis, which was approximated by a Gaussian distribution (
Many of the ncRNAs from Leonova et al., “P53 Cooperates with DNA Methylation and a Suicidal Interferon Response to Maintain Epigenetic Silencing of repeats and Noncoding RNAs,” Proc. Natl. Acad. Sci. 110:E89-E98 (2013) and Ting et al., “Aberrant Overexpression of Satellite Repeats in Pancreatic and Other Epithelial Cancers,” Science 331:593-596 (2011), which are hereby incorporated by reference in their entirety are outliers of at least three standard deviations with respect to at least one of the significant motifs implicated in the previous section, accounting for 70.46% of the modulated Repbase RNA expression induced in pancreatic cancer along with even higher percentages (74.86% and 85.30%, respectively) in the smaller sets of prostate and lung cancers. HSATII is the most differentially expressed (by a considerable margin) in the pancreatic cancer data and HSATII and BSR are the highest in prostate and lung. In p53 knockout murine cell lines treated with demethylation agents, around 68 ncRNAs are significantly modulated (Leonova et al., “P53 Cooperates with DNA Methylation and a Suicidal Interferon Response to Maintain Epigenetic Silencing of Repeats and Noncoding RNAs,” Proc. Natl. Acad. Sci. 110:E89-E98 (2013), which is hereby incorporated by reference in its entirety). Among those, 78.96% of the total expression comes from outliers as defined above, with the vast majority coming from GSAT and B2. Overall, it was observed that repetitive sequences containing unusual motif usage had varying degrees of conservation. However, the subset preferentially expressed in cancerous cells and tissues are encoded by sequences of more recent evolutionary origin. 
HSATII and GSAT are only conserved back to primates and mouse, respectively, and 21 of the 22 ncRNAs from Ting et al., “Aberrant Overexpression of Satellite Repeats in Pancreatic and Other Epithelial Cancers,” Science 331:593-596 (2011), hereby incorporated by reference in its entirety, are conserved in humans and primates but no further back in evolution. Any function is likely to be species specific.
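The share of expression attributed to outliers, as reported above, can be illustrated with a short Python sketch. It is a simplification: a single motif is considered, whereas the analysis described above treats an element as an outlier if it deviates for at least one significant motif. The function name, element names, and numerical values are hypothetical and are not data from the study.

```python
def outlier_expression_share(elements, null_mean, null_sd, n_sd=3.0):
    """Fraction of total expression contributed by outlier elements.

    elements: iterable of (name, force, expression) tuples.
    An element is an outlier when its force lies at least n_sd standard
    deviations from the mean of the null distribution.
    """
    elements = list(elements)
    total = sum(expr for _, _, expr in elements)
    if total == 0:
        return 0.0
    outlier = sum(
        expr for _, f, expr in elements
        if abs(f - null_mean) >= n_sd * null_sd
    )
    return outlier / total


# Hypothetical repeats: (name, CpG force, expression level).
toy = [("repeat_a", 0.5, 80.0), ("repeat_b", -1.0, 20.0)]
share = outlier_expression_share(toy, null_mean=-1.0, null_sd=0.2)
```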
This analysis highlights that many ncRNAs upregulated in cancer display abnormal nucleotide motif usage that had previously been related to immunogenic properties in viruses. The innate immune system contains several effector cells that react to immunogenic nucleic acids such as exogenous viral and bacterial nucleic acids as well as endogenous nucleic acids which can be released upon cell death (Atianand et al., “Molecular Basis of DNA Recognition in the Immune System,” J. Immunol. 190:1911-1918 (2013), which is hereby incorporated by reference in its entirety). Among those effectors, the mononuclear phagocytic system (macrophages, monocytes, and dendritic cells (“DCs”)) contains key regulators of innate immune activation and adaptive immunity (Guilliams et al., “Dendritic Cells, Monocytes and Macrophages: A Unified Nomenclature Based on Ontogeny,” Nature Rev. Immunol. 14:571-578 (2014); Kroemer et al., “Immunogenic Cell Death in Cancer Therapy,” Ann. Rev. Immunol. 31:51-72 (2013); Sabado et al., “Dendritic Cell Immunotherapy,” Ann. New York Acad. Sci. 1284:31-45 (2013), which are hereby incorporated by reference in their entirety). DCs efficiently sense and sample their environment to integrate information and mount a proper response which may be tolerogenic or immunogenic. To test whether ncRNA with highly unusual motif usage could be recognized as a danger-associated molecular pattern (“DAMP”) by some nucleic acid sensing pattern recognition receptors (“PRRs”), the effect of human HSATII and murine GSAT following transfection in human monocyte derived DCs (“moDCs”) and murine bone marrow derived macrophages was studied. Liposomal transfection was required for stimulation, whereas naked RNA had no effect, which is consistent with recognition by an endosomal or intracellular sensor (
Different ncRNA were generated by in vitro transcription using minigenes coding for the two main candidate outliers computationally predicted to have immunogenic motif usage (HSATII and GSAT). RNA from minigenes was derived as controls, encoding scrambled versions with the same nucleotide content but normal motif usage (labeled “HSATII-sc” and “GSAT-sc”) and repetitive elements of comparable length, but which have normal motif usage patterns (RMER33 and UCON18), as described below. In human moDCs liposomal transfection of HSATII induced significant production of interleukin 6 and 12 (IL-6 and IL-12), and TNFalpha relative to both endogenous controls and their scrambled versions (
HSATII and GSAT ncRNA induced IL-12 in human moDCs similarly to the TLR3 ligand poly-IC (a synthetic dsRNA mimic;
Pathogen-associated molecular patterns (“PAMPs”) and danger-associated molecular patterns (DAMPs) activate innate immune cells through pattern recognition receptors (PRRs). To better characterize the mechanisms involved in sensing i-ncRNA, the immunomodulatory properties of HSATII and GSAT on a panel of imBMs that lack specific PRRs or effector molecules in their downstream signaling pathways was studied (
MYD88 is a key cytosolic adaptor protein that is used by all TLRs except TLR3 to activate the transcription factor NFkB. Similarly, the mutated form of UNC93b essentially eliminated inflammatory responses in imBMs. While less well characterized than MYD88, this protein is known to interact with several endosomal Toll-like receptors (TLR3, 7, and 9), and has been implicated in TLR trafficking between the endoplasmic reticulum and endosomes, and their resultant maturation (Casrouge et al., “Herpes Simplex Virus Encephalitis in Human UNC-93B Deficiency,” Science 314:308-312 (2006); Lee et al., “UNC93B1 Mediates Differential Trafficking of Endosomal TLRs,” eLife 2:e00291 (2013); Tabeta et al., “The Unc93B1 Mutation 3d Disrupts Exogenous Antigen Presentation and Signaling via Toll-like Receptors 3, 7 and 9,” Nature Immunol. 7:156-164 (2006), which are hereby incorporated by reference in their entirety). The requirement for TLR3, TLR7, and TLR9, which are known to recognize double-stranded RNA, single-stranded RNA, and CpG DNA, respectively, was tested (
There is a surprising similarity to be drawn between foreign viral nucleotide sequences and select ncRNAs silent in normal cells, yet transcribed in cancer cells, activating innate immunity (Jimenez-Baranda et al., “Oligonucleotide Motifs That Disappear During the Evolution of Influenza Virus in Humans Increase Alpha Interferon Secretion by Plasmacytoid Dendritic Cells,” J. Virol. 85:3893-3904 (2011); Casrouge et al., “Herpes Simplex Virus Encephalitis in Human UNC-93B Deficiency,” Science 314:308-312 (2006); Bogunovic et al., “Immune Profile and Mitotic Index of Metastatic Melanoma Lesions Enhance Clinical Staging in Predicting Patient Survival,” Proc. Natl. Acad. Sci. 106:20429-20434 (2009); Cosset et al., “Comprehensive Metagenomic Analysis of Glioblastoma Reveals Absence of Known Virus Despite Antiviral-Like Type I Interferon Gene Response,” International J. Cancer 135:1381-1389 (2014), which are hereby incorporated by reference in their entirety). It was determined that ncRNAs expressed predominantly in normal cells from humans and mice reflect patterns of nucleotide sequence motif avoidance, such as underrepresentation of CpG containing sequences and reduced UpA, similar to protein coding RNA. This often includes a many-fold underrepresentation of CpG containing sequences and reduced UpA motif usage when compared to expected levels. However, the genome also harbors repetitive elements, which often have usage of CpG and UpA motifs that differs from that observed in RNA expressed in normal cells and tissues. Sets of these ncRNA, typically newer genome entries over evolutionary time scales, can be expressed at very high levels in cancerous cells and tumors. This is why human and mouse elements expressed in cancer cells can have different sequences but can share high CpG content and are not generally observed in the human or mouse transcriptome in normal cells.
It was previously proposed that immunostimulatory and proinflammatory properties of highly inflammatory influenza and other RNA viruses derive in part from RNA containing CpGs in AU-rich contexts, which are avoided in RNA viruses circulating in humans. Experimental evidence has supported this hypothesis (Jimenez-Baranda et al., “Oligonucleotide Motifs That Disappear During the Evolution of Influenza Virus in Humans Increase Alpha Interferon Secretion by Plasmacytoid Dendritic Cells,” J. Virol. 85:3893-3904 (2011); Atkinson et al., “The Influence of CpG and UpA Dinucleotide Frequencies on RNA Virus Replication and Characterization of the Innate Cellular Pathways Underlying Virus Attenuation and Enhanced Replication,” Nucleic Acids Res. 42:4527-4545 (2014) and Vabret et al., “The Biased Nucleotide Composition of HIV-1 Triggers Type I Interferon Response and Correlates with Subtype D Increased Pathogenicity,” PLoS One 7:e33501 (2012), which are hereby incorporated by reference in their entirety). The analysis was recently recast in the language of statistical physics in a way that is theoretically insightful and computationally efficient (Greenbaum et al., “Quantitative Theory of Entropic Forces Acting on Constrained Nucleotide Sequences Applied to Viruses,” Proc. Natl. Acad. Sci. 111:5054-5059 (2014), which is hereby incorporated by reference in its entirety). In this language, the evolution and optimization of nucleotide sequence motifs is driven by the interplay between selective and entropic forces. The latter randomize motif frequencies in a genome under constraints while the former are largely Darwinian, optimizing for functions enhancing viral replication and spreading. However, ncRNAs mostly transcribed in cancerous cells would not be exposed to the same selective and entropic forces as coding and ncRNA transcribed in normal cells.
Based on motif usage patterns, it is predicted that many ncRNA may have immunogenic properties, presenting danger-associated molecular patterns.
HSATII and murine GSAT were focused on experimentally, as they are preferentially and highly expressed in carcinogenic processes and exhibit abnormal patterns of motif usage. In particular, human HSATII is enriched in CpG motifs in AU-rich contexts avoided in genomes of humans and human adapted viruses. It is demonstrated that their computationally predicted immunogenic properties lead to the induction of inflammatory cytokines in human and murine innate cells (
A key role for MYD88 and UNC93b as regulators of GSAT immunogenicity was identified, but without evidence for the common endosomal nucleic acid sensors typically regulated by UNC93b or associated with the MYD88 adaptor (TLRs 2, 4, 7, and 9). These results indicate that in the murine imBM background there is potent induction of TNFalpha. Further studies will be required to elucidate whether TLR13, identified in murine cells and which recognizes ribosomal bacterial and viral RNA, is involved or whether there exist intracellular sensors of i-ncRNA associated with MYD88 (Li et al., “Sequence Specific Detection of Bacterial 23S Ribosomal RNA by TLR13,” eLife 1:e00102 (2012); Oldenburg et al., “TLR13 Recognizes Bacterial 23S rRNA Devoid of Erythromycin Resistance-Forming Modification,” Science 337:1111-1115 (2012); Shi et al., “A novel Toll-like Receptor That Recognizes Vesicular Stomatitis Virus,” J. Biol. Chem. 286:4517-4524 (2012), which are hereby incorporated by reference in their entirety), as there are for dsDNA (DHX-9 or -36) (Kim et al., “Aspartate-Glutamate-Alanine-Histidine Box Motif (DEAH)/RNA Helicase A Helicases Sense Microbial DNA in Human Plasmacytoid Dendritic Cells,” Proc. Natl. Acad. Sci. 107:15181-15186 (2010), which is hereby incorporated by reference in its entirety). Interestingly, sequence alignment shows that GSAT contains a subsequence conserved in immunogenic RNA isolated from bacterial ribosomal RNA, which specifically activates murine TLR13 (Oldenburg et al., “TLR13 Recognizes Bacterial 23S rRNA Devoid of Erythromycin Resistance-Forming Modification,” Science 337:1111-1115 (2012), which is hereby incorporated by reference in its entirety).
Activation of innate immune signaling can contribute either to carcinogenesis or antitumoral immunity. Toll-like receptor signaling and MYD88 have been associated with tumor development (Wang et al., “Toll-like Receptors and Cancer: MYD88 Mutation and Inflammation,” Frontiers in Immunology 5(367):1-10 (2014), which is hereby incorporated by reference in its entirety). Given that HSATII and GSAT expression has been found to be pervasive in many tumor types and induces responses that differ by species or cell type, the role of i-ncRNA in tumorigenesis is likely dependent on the particular RNA expressed and other properties of the tumor microenvironment. For instance, HSATII activates macrophages and monocytes in this study, suggesting it may be a mechanism for attraction and retention of tumor associated macrophages. These macrophages have consistently been shown to be a poor prognostic indicator in cancer, leading to increased tumorigenesis, metastasis, and immunoevasion (Noy et al., “Tumor-Associated Macrophages: From Mechanisms to Therapy,” Immunity 41:49-61 (2014), which is hereby incorporated by reference in its entirety). Under this hypothesis, HSATII is used by the tumor to keep macrophages in the tumor microenvironment while driving out T cells. Interestingly, the virus-like behavior of HSATII transcripts is found not only in the immune response to these elements, but also in their ability to reverse transcribe in cancer cells akin to retroviruses (Bersani et al., “Pericentromeric Satellite Repeat Expansions Through RNA-Derived DNA Intermediates in Cancer,” Proc. Natl. Acad. Sci. 112(49):15148-15153 (2015), which is hereby incorporated by reference in its entirety).
i-ncRNA, not subject to the same forces as ncRNA transcribed in steady state, may retain or evolve to mimic features of foreign RNA, as seen by comparing HSATII and GSAT to typical human ncRNA and foreign genomic material in
An RNA sequence of length L, hereafter called S0, and a motif m (a series of contiguous nucleotides, e.g., CpG) are considered. L is the total sequence length, counting the nucleotides A, C, G, and U along with any nucleotide bases that are not clearly defined. The objective is to define a probabilistic model over the set of the 4^L sequences, S=(s1 s2 . . . si . . . sL), such that the average value of the number, Nm(S), of occurrences of the motif m in S coincides with the number, Nm(S0), of occurrences of that motif in S0. To do so, a random-nucleotide model is considered, where nucleotides are independently distributed according to the frequencies fθ(s), where s=A, C, G, U, found in S0 (or where s=A, C, G, T when S0 is represented as an un-transcribed DNA sequence). The frequency of a nucleotide is calculated by counting the number of times that nucleotide occurs and dividing that number by the total length of the sequence, L (where L includes any ambiguously defined bases that cannot be assigned as A, C, G, U, or T). For example, fθ(A), the frequency of A nucleotides, would be the number of occurrences of the base, A, in S0 divided by L, the length of S0, even when ambiguous bases are included.
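The counting step described above can be sketched as follows. This is a minimal illustration only: the function names and the toy sequence are hypothetical, and overlapping motif occurrences are assumed to be counted separately.

```python
from collections import Counter

def nucleotide_frequencies(seq, alphabet="ACGU"):
    """Frequency of each nucleotide: its count divided by the full length L.

    L includes ambiguous bases (e.g. N), so the frequencies may sum to
    less than 1 when such bases are present.
    """
    seq = seq.upper()
    counts = Counter(seq)
    return {s: counts.get(s, 0) / len(seq) for s in alphabet}

def count_motif(seq, motif):
    """Number of occurrences of a contiguous motif (overlaps counted)."""
    seq, motif = seq.upper(), motif.upper()
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == motif)

# Toy sequence (hypothetical) containing one ambiguous base, N.
s0 = "ACGCGUNA"
freqs = nucleotide_frequencies(s0)
n_cpg = count_motif(s0, "CG")
```

For a DNA representation of S0, the same functions can be called with `alphabet="ACGT"`.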
The probability of a sequence S in this least-constrained, maximum entropy model is
P(S|x,m)=[1/Zm(x)] fθ(s1) fθ(s2) . . . fθ(sL) exp(x Nm(S)) [EQUATION 1]
where the normalization factor
Zm(x)=ΣS fθ(s1) fθ(s2) . . . fθ(sL) exp(x Nm(S)) [EQUATION 2]
ensures the probability is correctly normalized. Parameter x, referred to as a selective force (or just force) on the motif m, introduces a statistical bias over P (Greenbaum et al., “Quantitative Theory of Entropic Forces Acting on Constrained Nucleotide Sequences Applied to Viruses,” Proc. Natl. Acad. Sci. 111:5054-5059 (2014), which is hereby incorporated by reference in its entirety). The force quantifies the strength of statistical bias, which may be due to selection on a motif. In the absence of bias (x=0) the probability of S simplifies to the product of its nucleotide frequencies, and the number of motifs is what one would expect in a typical sequence with nucleotide frequencies given by fθ(s). Positive values of x push the distribution towards sequences with Nm(S) larger than what one would expect, while negative values of x favor sequences with a smaller Nm(S) than expected.
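The effect of the force can be checked by brute force on very short sequences. The sketch below assumes the model has the standard maximum entropy form in which each sequence is weighted by the product of its nucleotide frequencies times exp(x·Nm(S)) and then normalized; the function names are hypothetical, and exhaustive enumeration is only practical for small L.

```python
import itertools
import math

def count_motif(seq, motif):
    """Number of occurrences of a contiguous motif (overlaps counted)."""
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == motif)

def brute_force_model(freqs, length, motif, x):
    """Return (Z, mean motif count) by enumerating all 4^L sequences."""
    alphabet = sorted(freqs)
    z = 0.0
    weighted_count = 0.0
    for letters in itertools.product(alphabet, repeat=length):
        n = count_motif("".join(letters), motif)
        # Unbiased weight (product of frequencies) times the bias exp(x * n).
        w = math.prod(freqs[s] for s in letters) * math.exp(x * n)
        z += w
        weighted_count += w * n
    return z, weighted_count / z

uniform = {s: 0.25 for s in "ACGU"}
z0, mean0 = brute_force_model(uniform, 4, "CG", 0.0)       # no bias
_, mean_pos = brute_force_model(uniform, 4, "CG", 1.0)     # positive force
_, mean_neg = brute_force_model(uniform, 4, "CG", -1.0)    # negative force
```

With x=0 the normalization equals 1 and the mean CpG count is the unbiased expectation, (L−1)·f(C)·f(G); a positive force raises the mean and a negative force lowers it.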
The value of the force, x(S0), is computed by maximizing the probability P(S0|x,m) of the sequence S0 over x. This is equivalent to finding the value of x such that the average number of motifs
⟨Nm⟩=ΣS Nm(S) P(S|x,m)=d log Zm(x)/dx [EQUATION 3]
equals Nm(S0). By scanning the sequences S0 in the GENCODE database, the forces x(S0) shown in
The logarithm of the number of sequences having Nm(S) repetitions of m is bounded from above by the entropy of the random-nucleotide model; the equality is reached in the absence of bias only (x=0). The difference between those entropies is the entropy cost corresponding to the constraint on the average number of occurrences of m, and is denoted by σm. It is the Legendre transform of log Zm(x), see EQUATION 2 and EQUATION 3 (supra).
σm=x(S0)Nm(S0)−log Zm(x(S0)) [EQUATION 4]
Efficient computational techniques allow calculation of the sum over the 4^L sequences in EQUATION 2 in a time growing only linearly with L.
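One way such a linear-time calculation can be organized for a dinucleotide motif is a transfer-matrix (dynamic programming) recursion over sequence positions. The sketch below is an illustration under stated assumptions and not necessarily the technique used here: it assumes the maximum entropy form in which each sequence is weighted by the product of its nucleotide frequencies times exp(x·Nm(S)), handles two-nucleotide motifs only, obtains the average motif count by a finite-difference derivative of log Zm(x), and finds the force by bisection on a fixed bracket. All function names are hypothetical.

```python
import math

ALPHABET = "ACGU"

def log_partition(freqs, length, motif, x):
    """log Zm(x) for a dinucleotide motif, computed in O(L) steps.

    v[s] holds the rescaled total weight of all prefixes ending in s;
    each step extends every prefix by one nucleotide.
    """
    first, second = motif
    boost = math.exp(x) - 1.0  # extra weight when the motif is completed
    v = {s: freqs[s] for s in ALPHABET}
    log_z = 0.0
    for _ in range(length - 1):
        total = sum(v.values())
        new_v = {}
        for t in ALPHABET:
            incoming = total + (v[first] * boost if t == second else 0.0)
            new_v[t] = freqs[t] * incoming
        norm = sum(new_v.values())
        log_z += math.log(norm)  # rescale to avoid under/overflow
        v = {t: w / norm for t, w in new_v.items()}
    return log_z + math.log(sum(v.values()))

def mean_count(freqs, length, motif, x, h=1e-5):
    """Average motif count, as the numerical derivative of log Zm(x)."""
    upper = log_partition(freqs, length, motif, x + h)
    lower = log_partition(freqs, length, motif, x - h)
    return (upper - lower) / (2.0 * h)

def fit_force(freqs, length, motif, n_obs, lo=-20.0, hi=20.0, iters=60):
    """Bisection for the x at which the mean count equals n_obs.

    The mean count increases with x; the bracket [lo, hi] is an
    assumption, and the result saturates at its ends (e.g. n_obs = 0).
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mean_count(freqs, length, motif, mid) < n_obs:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def entropy_cost(freqs, length, motif, n_obs):
    """sigma_m = x * Nm(S0) - log Zm(x), evaluated at the fitted force."""
    x = fit_force(freqs, length, motif, n_obs)
    return x * n_obs - log_partition(freqs, length, motif, x)

# Hypothetical example: 30 CpG in a 200-nt sequence with uniform composition.
uniform = {s: 0.25 for s in ALPHABET}
x_fit = fit_force(uniform, 200, "CG", n_obs=30)
sigma = entropy_cost(uniform, 200, "CG", n_obs=30)
```

In this example the unbiased expectation is about 12 CpG, so an observed count of 30 corresponds to a positive force and a positive entropy cost.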
The aim is to find anomalous motif usage in a sequence, where the number of motif occurrences differs from what is expected by chance in the random-nucleotide model, that is, where the motif is associated with a significantly nonzero force. The likelihood of observing the natural sequence S0 with a given motif count is expressed as
This likelihood is therefore directly related to the entropic cost: the larger the cost, the more likely the motif is to be statistically significant.
GSAT and HSATII were demonstrated to be immunogenic, and were outliers relative to the distribution of strengths of statistical bias on CpG and UpA dinucleotides. Since GSAT was less of an outlier than HSATII, GSAT is used to define a minimal threshold of the strength of statistical bias for an immunogenic non-coding RNA. In the mouse GENCODE dataset, version 2 (which is hereby incorporated by reference in its entirety), of long non-coding RNA transcripts, the mean value of the strength of statistical bias on CpG dinucleotides is −1.3678 with a standard deviation of 0.5788, and the mean value of the strength of statistical bias on UpA dinucleotides is −0.5691 with a standard deviation of 0.2455. In the human GENCODE dataset, version 19 (which is hereby incorporated by reference in its entirety), of long non-coding RNA transcripts, the mean value of the strength of statistical bias on CpG dinucleotides is −1.4341 with a standard deviation of 0.6505, and the mean value of the strength of statistical bias on UpA dinucleotides is −0.6152 with a standard deviation of 0.2834. The strength of statistical bias on GSAT is 0 for CpG dinucleotides and −0.8566 for UpA dinucleotides. These values are 2.3629 standard deviations away from the mean of the mouse GENCODE distribution of strengths of statistical bias on CpG dinucleotides and 0.8831 standard deviations away from the mean for UpA dinucleotides. Because the strength of statistical bias on UpA dinucleotides is not significant for GSAT, it was not deemed necessary for defining GSAT as an outlier.
The CpG strength of statistical bias on GSAT is 2.3629 standard deviations from the mean of the distribution of strengths of statistical bias on CpG for the mouse GENCODE dataset and 2.2046 standard deviations away from the mean for the human GENCODE dataset. Therefore, an outlier in the human dataset was defined as a sequence whose strength of statistical bias on CpG dinucleotides has a Z-score (the strength of statistical bias on CpG minus the mean strength of statistical bias, with that difference divided by the standard deviation) greater than 2.2046, and an outlier in the mouse dataset as a sequence having a Z-score greater than 2.3629. This ensures both that the sequence is an outlier and that CpG is over-represented relative to the GENCODE distribution.
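The threshold test described above can be sketched as follows. The means, standard deviations, and Z-score thresholds are the values stated in the text; the dictionary and function names are hypothetical, and the forces passed in the example calls are illustrative values, not measured data.

```python
# Mean and standard deviation of the CpG force in each GENCODE lncRNA set,
# as stated in the text.
GENCODE_CPG = {
    "mouse": (-1.3678, 0.5788),
    "human": (-1.4341, 0.6505),
}

# Z-score thresholds defined by GSAT, as stated in the text.
Z_THRESHOLD = {"mouse": 2.3629, "human": 2.2046}

def z_score(force, species):
    """(force - mean) / standard deviation for the chosen distribution."""
    mean, sd = GENCODE_CPG[species]
    return (force - mean) / sd

def is_cpg_outlier(force, species):
    """True when the CpG force exceeds the GSAT-defined threshold."""
    return z_score(force, species) > Z_THRESHOLD[species]

# A CpG force of 0 against the human distribution reproduces the
# stated human threshold after rounding to four decimals.
human_z_at_zero = round(z_score(0.0, "human"), 4)

# Illustrative (hypothetical) forces.
strong = is_cpg_outlier(0.5, "human")
typical = is_cpg_outlier(-1.0, "mouse")
```

Because the comparison is one-sided, only sequences with CpG over-representation relative to the GENCODE distribution are flagged.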
Mouse repetitive elements from the Repbase database that meet this threshold are listed in Table 3, and their corresponding nucleotide sequences are displayed in
lncRNAs from the Mouse ENCODE dataset that meet this threshold are listed in Table 4, and their corresponding nucleotide sequences are displayed in
Human repetitive elements from the Repbase database that meet this threshold are listed in Table 5, and their corresponding nucleotide sequences are displayed in
Elements from the Human ENCODE dataset that meet this threshold are listed in Table 6, and their corresponding nucleotide sequences are displayed in
For HSATII and GSAT, negative controls were designed in two ways and both negative controls were compared to HSATII and GSAT for all experiments. First, full RNA sequences of both satellites were randomly permuted until scrambled sequences were generated that fell within one half of a standard deviation from the mean value of the strength of statistical bias against CpG and UpA dinucleotides for humans and mice, respectively. These sequences are denoted as HSATII-sc and GSAT-sc. In other words, these sequences had the same length and nucleotide content as HSATII and GSAT but fell within the inner ellipse in
GSAT RNA expression levels were investigated by a custom Taqman Assay in normal mouse tissue versus mouse tumor tissue samples (
Sequences encoding murine GSAT and human HSATII were generated by custom gene synthesis (Genscript) and cloned into a pCDNA3 backbone (EcoRI/EcoRV) that carries a T7 promoter on the + strand and an SP6 promoter on the − strand (Invitrogen). Sequences encoding GSAT-sc, HSATII-sc, UCON38, and RMER16A3 were generated as minigenes and sub-cloned in a pIDT-blue backbone with a T7 promoter on the + strand and a T3 promoter on the − strand surrounding the sequence of interest (IDT). To produce high quality RNA, plasmids were digested with the restriction enzymes NotI/NdeI (pCDNA3) and ApaLI (pIDT-blue) to isolate the fragment containing the sequence of interest by gel purification (Qiagen). Then the sequences of interest containing the T7 promoter were amplified by PCR (Accuprime-PFX Invitrogen) using the following primer pairs:
PCR products were purified by PCR-Cleanup (Qiagen) and checked by electrophoresis (0.8% agarose gel). RNAs were generated by in vitro transcription using the mMESSAGE mMACHINE T7 Ultra kit (Ambion) followed by a capping and short polyA reaction. RNAs were then purified using RNA-cleanup (Qiagen), quantified using a NanoDrop, and checked by electrophoresis after denaturation at 65° C. for 10 minutes (15% agarose gel).
MoDCs and imBM were both stimulated by i-ncRNA in the same way. The culturing of these cells is described below. Briefly, cells were plated in 96-well flat-bottom plates at 200,000 cells per well for primary cells (MoDCs) and 100,000 cells per well for lines (imBM). i-ncRNA were transfected via liposomes formed using DOTAP (Roche Life Science) at a ratio of 1 μg DNA per 6 μl DOTAP diluted in HBS following the user-guide recommendations. The cells were stimulated using 2 μg/ml of purified i-ncRNA versus 10 μg/ml total RNA. The following agonists were used to stimulate control pathways: for TLR4, 100 ng/ml Ultrapure LPS (Invivogen); for TLR2, 500 ng/ml Pam2CSK4 (Invivogen); for TLR3, 2 μg/ml HMW PolyIC (Invivogen); for TLR7/8, 1 μg/ml CL097 (Invivogen) and 100 ng/ml R848 (Invivogen); for TLR9, 3 μM CpG B-ODN 1826; and for STING, 5 μg/ml CDN (Aduro).
Human moDCs: Human monocyte-derived DCs were differentiated as previously described (Frleta et al., “HIV-1 Infection-Induced Apoptotic Microparticles Inhibit Human DCs via CD44,” J. Clinical Invest. 122:4685 (2012), which is hereby incorporated by reference in its entirety). Briefly, PBMCs were prepared by centrifugation over Ficoll-Hypaque gradients (BioWhittaker) from healthy donor buffy coats (New York Blood Center). Monocytes were isolated from PBMCs by adherence and then treated with 100 U/ml GM-CSF (Leukine Sanofi Oncology) and 300 U/ml IL-4 (R&D) in RPMI plus 5% human AB serum (Gemini Bio Products). Differentiation media was renewed on day 2 and day 4 of culture. Mature moDCs were harvested for use on days 5 to 7. For all experiments, harvested DCs were washed and equilibrated in serum-free X-Vivo 15 media (Lonza).
Murine imBMs: Macrophages were immortalized by infecting bone marrow progenitors with oncogenic v-myc/v-raf expressing J2 retrovirus as previously described (Blasi et al., “Selective Immortalization of Murine Macrophages from Fresh Bone Marrow by a raf/myc Recombinant Murine Retrovirus,” Nature 318:667-670 (1985), which is hereby incorporated by reference in its entirety) and differentiated in macrophage differentiation media containing M-CSF. ImBM were maintained in 10% FCS PSN DMEM (Gibco). ImBM lines were provided by several collaborators and also obtained from BEI Resources: ICE (Casp1/Casp11), MAVS, IFN-R, IRF3-7, STING and their rescues, Unc93b1 3d/3d, TLR 3, 4, 7, 9, 2-9, 2-4, MYD88, TRIF, TRAM, and TRIF-TRAM.
To characterize whether this pathway could be modulated in the models, production of type I interferon in response to i-ncRNA stimulation was evaluated using human and murine interferon-stimulated response element (ISRE) reporter cell lines, and transcriptional regulation of a panel of immune genes related to the interferon pathway was monitored. Whereas the effect on the inflammatory response was significant in terms of TNFalpha, IL-6, or IL-12 production, the effect on the type I interferon pathway was less prominent.
Neither TLR2 nor TLR4 was required, indicating the observed effect was independent of contamination from bacterial products such as lipoproteins and endotoxins (
Although preferred embodiments have been depicted and described in detail herein, it will be apparent to those skilled in the relevant art that various modifications, additions, substitutions, and the like can be made without departing from the spirit of the invention and these are therefore considered to be within the scope of the invention as defined in the claims which follow.
This application claims the benefit of U.S. Provisional Patent Application Ser. No. 62/116,298, filed Feb. 13, 2015, which is hereby incorporated by reference in its entirety.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US16/18001 | 2/16/2016 | WO | 00 |
Number | Date | Country | |
---|---|---|---|
62116298 | Feb 2015 | US |