The invention encompasses products and methods relating to microRNAs involved in various cancers.
MicroRNAs (miRNAs) mediate degradation (Baek et al. 2008) or translational repression (Selbach et al. 2008) of gene transcripts associated with an array of biological processes including many of the hallmarks of cancer (Dalmay and Edwards 2006; D Hanahan and R A Weinberg 2000; Douglas Hanahan and Robert A Weinberg 2011; Ruan et al. 2009). Not surprisingly, dysregulated miRNAs can be readily detected in tumor biopsies (Jiang et al. 2009) and are known to be diagnostic and prognostic indicators (Zen and Chen-Yu Zhang 2010). In some cases miRNAs have also been shown to be potential therapeutic targets (Garofalo and Croce 2011; Nana-Sinkam and Croce 2011). Conservative estimates suggest that each human miRNA regulates several hundred transcripts (Baek et al. 2008; Selbach et al. 2008) and thus miRNA mediated regulation results in statistically significant gene co-expression signatures that are readily discovered through transcriptome profiling (Brueckner et al. 2007; Ceppi et al. 2009; Tsung-Cheng Chang et al. 2007; Fasanaro et al. 2009; Frankel et al. 2008; Georges et al. 2008; Grimson et al. 2007; Lin He et al. 2007; Hendrickson et al. 2008; Charles D Johnson et al. 2007; Karginov et al. 2007; Lee P Lim et al. 2005; Linsley et al. 2007; Malzkorn et al. 2010; Ozen et al. 2008; Sengupta et al. 2008; Tan et al. 2009; Tsai et al. 2009; Valastyan et al. 2009; Wang-Xia Wang et al. 2010; Xiaowei Wang and Xiaohui Wang 2006; Frank Weber et al. 2006).
There are two commonly used strategies to identify the miRNA regulator(s) responsible for the observed co-expression of a set of genes: 1) enrichment of predicted 3′ UTR binding sites for a known miRNA (Betel et al. 2010, 2008; Friedman et al. 2009; Kertesz et al. 2007); or 2) de novo identification of a 3′ UTR motif that is complementary to a seed sequence of a miRNA in miRBase (Fan et al. 2009; Goodarzi et al. 2009; Kozomara and Griffiths-Jones 2011; Linhart et al. 2008). Algorithms utilizing the first strategy incorporate some combination of seed complementarity, cross-species conservation, and thermodynamic properties of the binding site. These algorithms include PITA (Kertesz et al. 2007), TargetScan (Friedman et al. 2009), and both miRanda (Betel et al. 2008) and miRSVR (Betel et al. 2010) from microRNA.org. While the combined modeling of two or more miRNA-binding properties within these algorithms boosts signal, the multiple hypothesis testing required to identify bona fide miRNA-binding sites also leads to high false negative rates (~32-52%) (Sethupathy et al. 2006).
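The seed-complementarity criterion shared by both strategies can be illustrated with a minimal sketch. It assumes a 7-nucleotide seed at positions 2-8 of the mature miRNA and perfect Watson-Crick pairing only; the function names and the toy 3′ UTR are hypothetical, and the sketch is not the scoring scheme of PITA, TargetScan, miRanda, or miRSVR.

```python
# Illustrative only: locate 3' UTR sites perfectly complementary to a miRNA seed.
# Assumes a 7-nt seed at miRNA positions 2-8 and Watson-Crick pairing only.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA string (A, C, G, U)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed_sites(mirna, utr, start=2, length=7):
    """Return 0-based UTR positions that perfectly match the seed's reverse complement."""
    seed = mirna[start - 1:start - 1 + length]
    site = reverse_complement(seed)
    utr = utr.upper().replace("T", "U")  # accept DNA or RNA input
    return [i for i in range(len(utr) - length + 1) if utr[i:i + length] == site]


LET7A = "UGAGGUAGUAGGUUGUAUAGUU"        # mature let-7a, used here only as an example
TOY_UTR = "AAGCUACCUCAGGGAUUCUACCUCAUU"  # hypothetical 3' UTR fragment

print(seed_sites(LET7A, TOY_UTR))
```

Real predictors add conservation and thermodynamic terms on top of this match, which is where the trade-off between signal and error rate described above arises.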
Despite some progress in assessing the risk of cancer, a need exists for accurate methods of assessing such risks or developing conditions. Treatment of pre-cancer with drugs could postpone or prevent cancer; yet few pre-cancer patients are treated. A major reason is that no simple and unambiguous laboratory test exists to determine the actual risk of an individual to develop cancer. Thus, there remains a need in the art for methods of identifying, diagnosing, and treating these individuals.
The present application provides prognostic methods for determining risk for developing cancer or predicting progression of cancer, and for predicting response to a drug or treatment regimen; diagnostic methods for identifying type(s) of cancer and for identifying a response to a drug or monitoring a treatment regimen; therapeutic methods for directing appropriate treatments for patients at risk of progression, for directing appropriate treatments for patients with an identified type of cancer, for administering a drug that increases a miRNA useful for the treatment of cancer and for administering a drug to inhibit a miRNA identified as being involved in causing or exacerbating cancer; computer systems based on algorithms useful in the prognostic, diagnostic and/or therapeutic methods; miRNA products (including, but not limited to, products useful as biomarkers) and panels (i.e., sets of miRNA products); and products (e.g., arrays or kits of reagents) to detect miRNAs or panels of miRNAs and methods of using the detection products.
In a first aspect, a Framework for Inference of Regulation by miRNAs (FIRM) is provided. FIRM integrates the three best-performing algorithms to infer miRNAs that mediate regulation from co-expression signatures. In an exemplary embodiment, FIRM limits the Weeder-miRvestigator method to only those inferences of miRNA-mediated regulation with a perfect 7- or 8-mer miRvestigator complementarity p-value (p-value=6.1×10⁻⁵ or 1.5×10⁻⁵, respectively) to a miRNA seed in miRBase. Inferences of miRNA-mediated regulation from the PITA and TargetScan methods, which test for enrichment of predicted miRNA target genes, are filtered to include only those with Benjamini-Hochberg FDR=0.00. FIRM produces a listing (i.e., a panel) of all co-expression signatures predicted to be regulated by an miRNA. See also, the embodiments represented in
FIRM is, at the most basic level, an assemblage of methods combined to produce a data set of co-expression signatures predicted to be regulated by one or more miRNAs. The methods are performed by one or more computer processors executing one or more sets of instructions. The instructions may be hard-encoded into the processor, as in an application-specific integrated circuit (ASIC), may be semi-permanently encoded into the processor, as is the case in, for example, a field-programmable gate array (FPGA), or may be stored on a memory device and executed by a general purpose processor that, after retrieving the instructions from the memory device, becomes a special purpose processor programmed to perform the methods. Generally, the methods may be stored (or encoded, in hardware implementations such as ASICs and FPGAs) as one or more modules or routines. While described below with respect to three methods (and, accordingly, three modules or routines), the methods of which FIRM is comprised may form more than three routines or fewer than three routines. Additionally, individual steps of the methods need not necessarily be performed in the order described. That is, unless a data dependency exists between two steps, it is possible—as will be understood—for steps to be performed in orders other than those described. Further, any particular step may, as will also be understood, represent one or more sub-steps, operations, functions, etc. As but one illustrative example, any particular method step may include retrieving input data from memory, performing one or more processing steps on the data, and storing one or more outputs to the memory.
An interface is optionally provided to allow one or more users to access the combined results (block 808). In one embodiment, the interface takes the form of a Web page available via a network connection (e.g., the Internet), allowing one or more users to access, search, and filter the combined data from any web-enabled device (e.g., workstations, laptop computers, smart phones, tablet devices, etc.). In another embodiment, the interface takes the form of an additional routine operating on a processor (the same processor or a different processor) communicatively connected to a memory on which the combined results are stored. For example, the interface routine may execute on a computing device and, via a network, may access/retrieve the combined results from a database or memory device located remotely. Alternatively, the interface routine may execute on the processor executing the routines related to blocks 802-806.
In any event, the combined data may later be used for any purpose as generally described throughout the remainder of this application (block 810).
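The filtering and combining steps described for FIRM can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the `Inference` record, the method labels, and the function names are hypothetical, and the FDR cutoff for the enrichment methods is left as a caller-supplied parameter rather than hard-coded. The two complementarity p-values are the ones quoted above for perfect 7-mer and 8-mer matches.

```python
from dataclasses import dataclass

# Complementarity p-values quoted in the text for perfect seed matches.
P_PERFECT_7MER = 6.1e-5
P_PERFECT_8MER = 1.5e-5


@dataclass
class Inference:
    signature: str  # co-expression signature identifier (hypothetical naming)
    mirna: str      # inferred miRNA regulator
    method: str     # "weeder-mirvestigator", "pita", or "targetscan"
    score: float    # complementarity p-value, or BH-adjusted FDR for enrichment methods


def passes_filter(inf, fdr_cutoff):
    """Keep perfect 7-/8-mer matches, or enrichment results at or below the FDR cutoff."""
    if inf.method == "weeder-mirvestigator":
        return inf.score <= P_PERFECT_7MER  # also admits the stricter 8-mer value
    return inf.score <= fdr_cutoff


def combine(inferences, fdr_cutoff):
    """Return a panel: signature -> set of miRNAs supported by any retained inference."""
    panel = {}
    for inf in inferences:
        if passes_filter(inf, fdr_cutoff):
            panel.setdefault(inf.signature, set()).add(inf.mirna)
    return panel
```

Taking the union across methods is one possible way to combine the results; requiring agreement between two or more methods would be a stricter alternative.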
By “statistically significant”, it is meant that the inference is greater than what might be expected to happen by chance alone (which could be a “false positive”). Statistical significance can be determined by any method known in the art. Commonly used measures of significance include the p-value, which presents the probability of obtaining a result at least as extreme as a given data point, assuming the data point was the result of chance alone. A result is often considered highly significant at a p-value of 0.05 or less.
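When many inferences are tested at once, raw p-values are usually adjusted for multiple testing. The Benjamini-Hochberg procedure named earlier can be sketched as below; this is a generic textbook version with a hypothetical function name, not code taken from FIRM.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p-value
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is the minimum
    # of p * m / rank over all ranks at or above the current one.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for four tests.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

An inference is then retained when its adjusted value falls at or below the chosen FDR cutoff.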
In another aspect, miRNAs are described herein as associated with particular cancers or cancer characteristics. The miRNAs can be measured in an individual and used to evaluate the risk that an individual will develop cancer in the future, for example, the risk that an individual will develop cancer in the next 1, 2, 2.5, 5, 7.5, or 10 years. As used herein, “measuring” includes at least “detecting” a biomarker, but can also include determining the level/quantity of a biomarker. Exemplary miRNAs are shown in the figures. The miRNAs can be employed for methods, kits, computer readable media, systems, and other aspects of the invention which employ individual miRNAs or sets of miRNAs. A panel of miRNAs may comprise one or more miRNAs. MicroRNAs are set out in
In still another aspect, methods of calculating a risk score for developing cancer are provided, comprising (a) obtaining inputs about an individual comprising the level of biomarkers in at least one biological sample from said individual; and (b) calculating a cancer risk score from said inputs; wherein said biomarkers comprise one or more biomarkers selected from
Cancers include, but are not limited to, cancers such as those set out in
In yet another aspect of evaluating risk for developing cancer, the method comprises: (a) obtaining biomarker measurement data, wherein the biomarker measurement data is representative of measurements of biomarkers in at least one biological sample from an individual; and (b) evaluating risk for developing cancer based on an output from a model, wherein the model is executed based on an input of the biomarker measurement data; wherein the biomarkers comprise one or more biomarkers selected from
In an additional aspect, the invention is a method of evaluating risk for developing cancer comprising: obtaining biomarker measurements from at least one biological sample from an individual who is a subject that has not been previously diagnosed as having cancer; comparing the biomarker measurements to normal control levels; and evaluating the risk for the individual developing a cancer from the comparison; wherein the biomarkers are defined as set forth in the preceding paragraph.
Similarly, methods are provided of evaluating risk for developing cancer, the method comprising: obtaining biomarker measurement data, wherein the biomarker measurement data is representative of measurements of biomarkers in at least one biological sample from an individual; and evaluating risk for developing cancer based on an output from a model, wherein the model is executed based on an input of the biomarker measurement data; wherein said biomarkers are defined as above.
In some embodiments, the step of evaluating risk comprises computing an index value using the model based on the biomarker measurement data, wherein the index value is correlated with risk of developing cancer in the subject. In some embodiments, evaluating risk comprises normalizing the biomarker measurement data to reference values.
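One way to picture the normalization and index computation is the sketch below. It assumes z-score normalization against reference means and standard deviations and a logistic model with pre-fitted weights; the biomarker names, reference values, and weights are all hypothetical placeholders, and the specification does not prescribe this particular model.

```python
import math


def normalize(levels, reference):
    """z-score each biomarker level against its reference (mean, standard deviation)."""
    return {
        name: (levels[name] - mean) / sd
        for name, (mean, sd) in reference.items()
    }


def index_value(levels, reference, weights, intercept=0.0):
    """Map normalized biomarker levels to an index in (0, 1) with a logistic model."""
    z = normalize(levels, reference)
    linear = intercept + sum(weights[name] * z[name] for name in weights)
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical reference values and weights, for illustration only.
REFERENCE = {"miRNA_A": (10.0, 2.0), "miRNA_B": (5.0, 1.0)}
WEIGHTS = {"miRNA_A": 0.8, "miRNA_B": -0.5}

print(index_value({"miRNA_A": 12.0, "miRNA_B": 4.0}, REFERENCE, WEIGHTS))
```

With these placeholder weights, an individual whose levels equal the reference means receives an index of 0.5, and the index rises or falls as levels move away from the reference.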
In another aspect, a method of calculating a risk score for cancer progression is provided, comprising (a) obtaining inputs about an individual suffering from cancer comprising the level of biomarkers in at least one biological sample from said individual; and (b) calculating a cancer risk score from said inputs; wherein said biomarkers comprise one or more biomarkers selected from
In some embodiments of the methods disclosed herein, the obtaining biomarker measurement data step comprises measuring the level of at least one of the biomarkers in at least one biological sample from said individual. Optionally, the method includes a step (prior to the step of obtaining biomarker measurement data) of obtaining at least one biological sample from the individual.
In some embodiments, at least one biomarker input is obtained from one or more biological samples collected from the individual, such as from a blood sample, saliva sample, urine sample, cerebrospinal fluid sample, sample of another bodily fluid, or other biological sample including, but not limited to, those described herein.
In some embodiments, at least one biomarker input is obtained from a preexisting record, such as a record stored in a database, data structure, other electronic medical record, or paper, microfiche, or other non-electronic record.
In some embodiments, the biomarkers comprise at least one, at least two, at least three, at least four, at least five, at least six, at least seven, at least eight, at least nine, at least ten, at least eleven, at least twelve, at least thirteen, or more (up to all or all) biomarkers selected from
In another aspect, the invention embraces a method comprising advising an individual of said individual's risk of developing cancer or risk of cancer progression, wherein said risk is based on factors comprising a cancer risk score, and wherein said cancer risk score is calculated as described above. The advising can be performed by a health care practitioner, including, but not limited to, a physician, nurse, nurse practitioner, pharmacist, pharmacist's assistant, physician's assistant, laboratory technician, dietician, or nutritionist, or by a person working under the direction of a health care practitioner. The advising can be performed by a health maintenance organization, a hospital, a clinic, an insurance company, a health care company, or a national, federal, state, provincial, municipal, or local health care agency or health care system. The health care practitioner or person working under the direction of a health care practitioner obtains the medical history of the individual from the individual or from the medical records of the individual. The advising can be done automatically, for example, by a computer, microprocessor, or dedicated device for delivering such advice. The advising can be done by a health care practitioner or a person working under the direction of a health care practitioner via a computer, such as by electronic mail or text message.
In some embodiments of the invention, the cancer risk score is calculated automatically. The cancer risk score can be calculated by a computer, a calculator, a programmable calculator, or any other device capable of computing, and can be communicated to the individual by a health care practitioner, including, but not limited to, a physician, nurse, nurse practitioner, pharmacist, pharmacist's assistant, physician's assistant, laboratory technician, dietician, or nutritionist, or by a person working under the direction of a health care practitioner, or by an organization such as a health maintenance organization, a hospital, a clinic, an insurance company, a health care company, or a national, federal, state, provincial, municipal, or local health care agency or health care system, or automatically, for example, by a computer, microprocessor, or dedicated device for delivering such advice.
In another embodiment, methods providing two or more cancer risk scores to a person, organization, or database are disclosed, where the two or more cancer risk scores are derived from biomarker information representing the biomarker status of the individual at two or more points in time. In any of the foregoing embodiments, the entity performing the method can receive consideration for performing any one or more steps of the methods described.
In another aspect, a method is provided of ranking or grouping a population of individuals, comprising obtaining a cancer risk score for individuals comprised within said population, wherein said cancer risk score is calculated as described above; and ranking individuals within the population relative to the remaining individuals in the population or dividing the population into at least two groups, based on factors comprising said obtained cancer risk scores. The ranking or grouping of the population of individuals can be utilized for one or more of the following purposes: to determine an individual's eligibility for health insurance; to determine an individual's premium for health insurance; to determine an individual's premium for membership in a health care plan, health maintenance organization, or preferred provider organization; to assign health care practitioners to an individual in a health care plan, health maintenance organization, or preferred provider organization; to recommend therapeutic intervention or lifestyle intervention to an individual or group of individuals; to manage the health care of an individual or group of individuals; to monitor the health of an individual or group of individuals; or to monitor the health care treatment, therapeutic intervention, or lifestyle intervention for an individual or group of individuals.
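The ranking and grouping steps can be sketched in a few lines. The individual identifiers, scores, threshold, and group labels below are hypothetical, and a single threshold is only one of many ways to divide a population into at least two groups.

```python
def rank_population(scores):
    """Return individual ids ordered from highest to lowest risk score (ties by id)."""
    return sorted(scores, key=lambda ident: (-scores[ident], ident))


def group_population(scores, threshold):
    """Split individuals into two groups around a caller-chosen score threshold."""
    return {
        "elevated": sorted(i for i, s in scores.items() if s >= threshold),
        "not_elevated": sorted(i for i, s in scores.items() if s < threshold),
    }


# Hypothetical risk scores for four individuals.
SCORES = {"ind1": 0.82, "ind2": 0.15, "ind3": 0.47, "ind4": 0.82}

print(rank_population(SCORES))
print(group_population(SCORES, threshold=0.5))
```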
In another aspect, a panel of biomarkers is provided comprising biomarkers selected from
In another aspect, one or more data structures or databases are provided comprising values for one or more biomarkers in
In another aspect, diagnostic test systems are provided comprising (1) means for obtaining test results comprising levels of multiple biomarkers in at least one biological sample; (2) means for collecting and tracking test results for one or more individual biological sample; (3) means for calculating an index value from inputs, wherein said inputs comprise measured levels of biomarkers, and further wherein said measured levels of biomarkers comprise the levels of one or more biomarkers selected from
A diagnostic system is any system capable of carrying out the methods of the invention. Computing systems, environments, and/or configurations that may be suitable for use with the methods or system of the claims include, but are not limited to, personal computers, server computers, hand-held or laptop devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
In some embodiments, a diagnostic test system comprises: means for obtaining test results data representing levels of multiple biomarkers in at least one biological sample; means for collecting and tracking test results data for one or more individual biological samples; means for computing an index value from biomarker measurement data, wherein said biomarker measurement data is representative of measured levels of biomarkers, and further wherein said measured levels of biomarkers comprise the levels of a set or panel of biomarkers as defined elsewhere herein; and means for reporting said index value. In some variations of the diagnostic test system, the index value is a cancer risk score. In some preferred variations, the cancer risk score is computed according to the methods described herein for computing such scores. In some variations, the means for collecting and tracking test results data for one or more individuals comprises a data structure or database. In some variations, the means for computing a cancer risk score comprises a computer or microprocessor. In some variations, the means for reporting the cancer risk score comprises a visible display, an audio output, a link to a data structure or database, or a printer.
In some embodiments, a medical diagnostic test system for evaluating risk for developing a cancer or risk for cancer progression comprises: a data collection tool adapted to collect biomarker measurement data representative of measurements of biomarkers in at least one biological sample from an individual; an analysis tool comprising a statistical analysis engine adapted to generate a representation of a correlation between a risk for developing a cancer and measurements of the biomarkers, wherein the representation of the correlation is adapted to be executed to generate a result; and an index computation tool adapted to analyze the result to determine the individual's risk for developing a cancer or for cancer progression, and represent the result as an index value; wherein said biomarkers are defined as a set or panel as described elsewhere herein. In some variations, the analysis tool comprises a first analysis tool comprising a first statistical analysis engine, the system further comprising a second analysis tool comprising a second statistical analysis engine adapted to select the representation of the correlation between the risk for developing a cancer or risk for cancer progression and measurements of the biomarkers from among a plurality of representations capable of representing the correlation. In some variations, the system further comprises a reporting tool adapted to generate a report comprising the index value.
In some embodiments, a system for diagnosing susceptibility to cancer in a human subject comprises (a) at least one processor; (b) at least one computer-readable medium; (c) a susceptibility database operatively coupled to a computer-readable medium of the system and containing information associating measurements of one or more biomarkers selected from
In some embodiments, a system for diagnosing cancer in a human subject comprises (a) at least one processor; (b) at least one computer-readable medium; (c) a susceptibility database operatively coupled to a computer-readable medium of the system and containing information associating measurements of biomarkers selected from
In the systems in the preceding two paragraphs, the input about the human subject can be a biological sample from the human subject, and the measurement tool comprises a tool to measure one or more biomarkers selected from
In some embodiments of systems comprising a communication tool operatively connected to the analysis tool or routine, the systems comprise a routine stored on a computer-readable medium of the system and adapted to be executed on a processor of the system, to: generate a communication containing the conclusion; and transmit the communication to the subject or the medical practitioner, or enable the subject or medical practitioner to access the communication.
In some embodiments, any of the systems comprise a medical protocol database operatively connected to a computer-readable medium of the system and containing information correlating the conclusion and medical protocols for human subjects at risk for or suffering from cancer; and a medical protocol tool (or routine), operatively connected to the medical protocol database and the analysis tool or routine, stored on a computer-readable medium of the system, and adapted to be executed on a processor of the system, to compare the conclusion from the analysis routine with respect to cancer for the subject and the medical protocol database, and generate a protocol report with respect to the probability that one or more medical protocols in the database will reduce susceptibility to cancer, delay onset of cancer, increase the likelihood of detecting cancer at an early stage to facilitate early treatment or treat the cancer. Where the communication tool is operatively connected to the medical protocol tool or routine, the system may generate a communication that further includes the protocol report.
Yet another aspect is a computer readable medium having computer executable instructions for evaluating risk for developing a cancer, the computer readable medium comprising: a routine, stored on the computer readable medium and adapted to be executed by a processor, to store biomarker measurement data representing a set or panel of biomarkers; and a routine stored on the computer readable medium and adapted to be executed by a processor to analyze the biomarker measurement data to evaluate a risk for developing a cancer or for risk of cancer progression. The panels of biomarkers are defined as described in any of the preceding paragraphs.
Still another aspect is a method of developing a model for evaluation of risk for developing a cancer or for cancer progression, the method comprising: obtaining biomarker measurement data, wherein the biomarker measurement data is representative of measurements of biomarkers from a population and includes endpoints of the population; inputting the biomarker measurement data of at least a subset of the population into a model; and training the model for endpoints using the inputted biomarker measurement data to derive a representation of a correlation between a risk of developing a cancer or for cancer progression and measurements of biomarkers in at least one biological sample from an individual; wherein said biomarkers for which measurement data is obtained comprise a set or panel of markers of the invention as defined elsewhere herein.
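The model-development steps (obtain population data with endpoints, input a subset, train) can be illustrated with a small logistic regression fitted by batch gradient descent. The choice of logistic regression, the learning rate, the epoch count, and the toy data are all assumptions made for this sketch; the specification does not restrict the model to this form.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the logistic loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_risk(model, x):
    """Return the modeled probability of the endpoint for one individual."""
    w, b = model
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


# Hypothetical normalized measurements of two biomarkers and binary endpoints
# (1 = developed cancer during follow-up, 0 = did not).
X_TRAIN = [[-1.0, 0.5], [-0.8, 0.2], [1.0, -0.3], [1.2, -0.6]]
Y_TRAIN = [0, 0, 1, 1]

model = train_logistic(X_TRAIN, Y_TRAIN)
print(predict_risk(model, [1.1, -0.5]))
```

In practice the population would be split so that part of it is held out to check the trained representation against endpoints it has not seen.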
Another aspect is a kit comprising reagents for measuring a panel of biomarkers, wherein the panel of biomarkers is defined as described in any of the preceding paragraphs, or in the figures, or in other descriptions of preferred panels of markers found herein. In some embodiments, such reagents are packaged together. In some embodiments, the reagents are primers used to amplify miRNA(s) in a panel. In some embodiments, the reagents are DNA arrays that hybridize to miRNA(s) in a panel. In some embodiments, the kit further includes an analysis program for evaluating risk of an individual developing a cancer from measurements of the group of biomarkers from at least one biological sample from the individual.
In measuring miRNA, an amplification reaction using appropriate primers as reagents may be done quantitatively, and the amount of amplified RNA can then be determined with an appropriate probe with a detectable label. The probe may be an oligonucleotide, including oligos with nonnative linkages such as phosphorothioate or phosphoramidate, or a peptide nucleic acid (PNA). Nonnative bases may also be included. Thus, a kit may comprise a reagent for an assay, which reagent is specific for the miRNA(s), as well as additional reagents needed in order to quantitate the results. Specific miRNA levels can also be measured using general molecular biology techniques commonly known in the art such as Northern blot, quantitative reverse transcription polymerase chain reaction (qRT-PCR), next-generation sequencing, or microarray. qRT-PCR is a more sensitive and efficient procedure to detect specific messenger RNA or microRNA. The RNA sample is first reverse transcribed; the target sequences can then be amplified using a thermostable DNA polymerase. The concentration of a particular RNA sequence in a sample can be determined by examining the amount of amplified products. Microarray technology allows simultaneous measurement of the concentrations of multiple RNA species. Oligonucleotides complementary to specific miRNA sequences are immobilized on a solid support. The RNA in the sample is labeled with ColorMatrix™ or fluorescent dye. After subsequent hybridization of the labeled material to the solid support, the intensities of the fluorescent or ColorMatrix™ dye remaining on the solid support determine the concentrations of specific RNA sequences in the samples. The concentration of specific miRNA species can also be determined by the NanoString™ nCounter™ system, which provides direct digital readout of the number of RNA molecules in the sample without the use of amplification.
NanoString™ technology involves mixing the RNA sample with pairs of capture and reporter probes, tailored to each RNA sequence of interest. After hybridization and washing away excess probes, probe-bound target nucleic acids are stretched on a surface and scanned to detect the fluorescent barcodes of the reporter probes. This allows for up to 1000-plex measurement with high sensitivity and without amplification bias. Technologies such as electrochemical biosensor arrays, surface plasmon resonance, and other targeted capture assays can also be utilized to quantify molecular markers simultaneously by measuring changes in electro-current, light absorption, fluorescence, or enzymatic substrate reactions.
Another aspect includes methods for the prophylactic treatment of a subject at risk for a cancer according to procedures described herein. In some embodiments, the invention includes a method of prophylaxis for cancer comprising: obtaining risk score data representing a cancer risk score for an individual, wherein the cancer risk score is computed according to a method or improvement of the invention; and generating prescription treatment data representing a prescription for a treatment regimen to delay or prevent the onset of cancer to an individual identified by the cancer risk score as being at elevated risk for cancer. In some embodiments, a method of prophylaxis for cancer comprises: evaluating risk, for at least one subject, of developing a cancer according to the method or improvement of the invention; and treating a subject identified as being at elevated risk for a cancer with a treatment regimen to delay or prevent the onset of cancer.
Another aspect includes methods for the therapeutic treatment of a subject identified as having a cancer according to procedures described herein.
In some embodiments, methods for the prophylactic or therapeutic treatment of a subject comprise administering a drug that increases the amount of a miRNA identified herein that is produced by the body to fight a cancer. In some embodiments, methods comprise administering a drug to inhibit a miRNA or decrease the amount of a miRNA identified herein that is part of the cause of or exacerbates a cancer. In some embodiments, methods comprise both administering a drug that increases the amount of a miRNA identified herein that is produced by the body to fight a cancer, and administering a drug to inhibit a miRNA or decrease the amount of a miRNA identified herein that is part of the cause of or exacerbates a cancer. In some embodiments, the subject is treated with the drug and also receives any other standard of care treatment for the cancer. A drug can be any product including, but not limited to: small molecules; RNAs or vectors encoding RNAs, such as miRNAs (including miRNAs identified herein), snRNAs and antisense RNAs; peptides or polypeptides; and antibody products that penetrate cells.
A further aspect is a method of evaluating the current status of a cancer in an individual comprising obtaining biomarker measurement data and evaluating the current status of a cancer in the individual based on an output from a model, wherein the biomarkers are any biomarker of the invention.
The foregoing paragraphs are not intended to define every aspect of the invention, and additional aspects are described in other sections. This entire document is intended to be related as a unified disclosure, and it should be understood that all combinations of features described herein are contemplated, even if the combination of features is not found together in the same sentence, or paragraph, or section of this document. With respect to aspects of the invention described as a genus, all individual species are individually considered separate aspects of the invention. With respect to aspects described as a range, all sub-ranges and individual values are specifically contemplated.
Aspects and embodiments of the invention are illustrated by the following non-limiting example.
A generalized framework for the inference of regulation by miRNAs (FIRM) was constructed. In Example 1, a compendium of transcriptome profiles was compiled from studies that had interrogated differential expression of genes in response to targeted perturbation of specific miRNAs (Brueckner et al 2007; Ceppi et al. 2009; Tsung-Cheng Chang et al. 2007; Fasanaro et al. 2009; Frankel et al. 2008; Georges et al. 2008; Grimson et al. 2007; Lin He et al. 2007; Hendrickson et al. 2008; Charles D Johnson et al. 2007; Karginov et al. 2007; Lee P Lim et al. 2005; Linsley et al. 2007; Malzkorn et al. 2010; Ozen et al. 2008; Sengupta et al. 2008; Tan et al. 2009; Tsai et al. 2009; Valastyan et al. 2009; Wang-Xia Wang et al. 2010; Frank Weber et al. 2006). In Example 2, using this compendium of miRNA-perturbed transcriptomes it was demonstrated that functional miRNA binding sites (8 bp of complementarity) preferentially reside in the 3′ UTRs. Further, using preferential 3′ UTR localization as a heuristic was demonstrated to significantly increase sensitivity and specificity of miRNA-binding site discovery by Weeder-miRvestigator. In Example 3, using the compendium of miRNA-perturbed transcriptomes the best performing algorithms were identified and integrated into a generalized framework for inference of miRNA regulatory networks. Finally, the utility of this framework was demonstrated by applying it to a set of 2,240 co-expression signatures from 46 different cancers. The original study was able to associate only four signatures to putative regulation by a known miRNA (Goodarzi et al. 2009). In contrast, using the integrated framework 1,324 signatures were explained as potential outcomes of regulation by specific miRNAs in miRBase. By applying functional enrichment and semantic similarity identified within this expansive network specific miRNAs associated with hallmarks of cancer were identified. 
Further, filtering gene co-expression signatures for specific hallmarks of cancer such as “tissue invasion and metastasis” generated a metastatic cancer-miRNA regulatory network of 33 miRNAs. Importantly, this revealed that a relatively small subset of miRNAs regulate multiple oncogenic processes across different cancers. Through in-depth analyses of data from prior studies as well as new data from targeted miRNA-perturbation experiments, the role of miR-29 family members in lung adenocarcinoma was validated and gene targets for regulation by the relatively unknown miR-767-5p were discovered. Example 4 relates to the use of the FIRM approach to identify other miRNAs associated with hallmarks of cancer. The discussion in Example 5 illustrates how these analyses and validations show that the cancer-miRNA regulatory network can be used to accelerate discovery of miRNA-based biomarkers and therapeutics.
Sequences and RefSeq gene definition files were downloaded from the UCSC genome browser FTP site (ftp://hgdownload.cse.ucsc.edu/goldenPath/currentGenomes/Homo_sapiens). Details can be found in the Supplementary Method section below. The Weeder de novo motif detection algorithm (Pavesi et al. 2006) was then used to identify over-represented miRNA binding sites in the 3′ UTR of putatively miRNA co-regulated genes (Fan et al. 2009; Linhart et al. 2008).
miRvestigator Identification of Complementary miRNA for 3′ UTR Motif
MiRvestigator employs a hidden Markov model (HMM) to align and compute a probability describing the complementarity of a specific miRNA seed to a 3′ UTR motif (Plaisier et al. 2011). The miRvestigator HMM is described in detail in the supplementary methods. The 3′ UTR motif is first converted to a miRvestigator HMM and the Viterbi algorithm is used to provide a complementarity p-value by comparing the HMM to all potential seed sequences from miRBase. There are different models for the base-pairing of miRNA seeds to the complementary protein coding transcript binding sites as described in
Motifs were simulated based upon the reverse complement of the 8 bp seed sequence 5′-UGGAAUGU-3′ for miR-1 (MIMAT0000416). The miRNA seed signal determined the percentage assigned to the seed nucleotide in each column of the PSSM, and the remaining signal was distributed randomly among the other three nucleotides. We simulated motifs with different entropies by adding between 10 and 75% noise at 5 percent intervals to each seed nucleotide position. A seed nucleotide signal of 25 percent is the random case, as one of the other three nucleotides is likely to have a higher frequency than the seed nucleotide. Thirty sequences were simulated by randomly sampling 8-mers from the distribution of 8-mers in 3′ UTRs and inserting an instance of the reverse complement of the miR-1 seed sequence at varying proportions (0 to 100%). The receiver operating characteristic (ROC) area under the curve (AUC) was calculated using the ROCR package (Sing et al. 2005).
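The motif simulation described above can be sketched in Python as follows. This is a minimal illustration only, not the code used in the example: the function names are hypothetical, the random split of the residual probability uses an arbitrary scheme, and treating the seed signal as one minus the added noise is an assumption.

```python
import random

SEED = "UGGAAUGU"  # 8 bp miR-1 seed sequence quoted in the text
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}
NUCLEOTIDES = "ACGU"


def reverse_complement(rna):
    """Return the reverse complement of an RNA string."""
    return "".join(COMPLEMENT[n] for n in reversed(rna))


def simulate_pssm(site, signal, rng):
    """Build a PSSM (list of per-column dicts) for `site`.

    The site nucleotide receives probability `signal` in its column; the
    remainder is split at random among the other three nucleotides
    (the splitting scheme is an assumption).
    """
    pssm = []
    for nt in site:
        others = [n for n in NUCLEOTIDES if n != nt]
        weights = [rng.random() for _ in others]
        total = sum(weights)
        column = {nt: signal}
        for n, w in zip(others, weights):
            column[n] = (1.0 - signal) * w / total
        pssm.append(column)
    return pssm


rng = random.Random(0)
site = reverse_complement(SEED)
# Assumed mapping: 25% noise corresponds to a 75% seed signal.
pssm = simulate_pssm(site, signal=0.75, rng=rng)
```

Under this mapping, 75% noise leaves a seed signal of 25%, which corresponds to the random case noted above.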
Assessing Bias in the Distribution of miRNA Binding Sites
Instances of Weeder motif binding sites from either full transcripts (5′ UTR, coding sequence (CDS), 3′ UTR) or just 3′ UTRs of genes matching to the perturbed miRNA were identified for the compendium of experimentally determined miRNA target gene sets. Significance for the normalized counts per 1 Kbp was calculated for the distribution of matches in each gene region and for each experimentally determined miRNA target gene set by comparison to 1,000,000 randomly sampled gene sets of the same size. A combined p-value was computed by using Stouffer's Z-score method. The ROCR package was again used to compute ROC curves and ROC AUCs for each method. The pROC package was used to calculate the 95% confidence interval and pairwise p-values to determine if there is a significant difference between the ROC curves of the methods (Robin et al. 2011).
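For illustration, the unweighted Stouffer Z-score combination mentioned above can be sketched in Python. The function name is hypothetical, the inputs are assumed to be one-sided p-values strictly between 0 and 1, and equal weighting of the gene sets is an assumption, since the text does not state whether weights were used.

```python
from statistics import NormalDist


def stouffer_combined_p(p_values):
    """Combine one-sided p-values with the unweighted Stouffer Z-score method.

    Each p-value must lie strictly between 0 and 1.
    """
    norm = NormalDist()
    # Small p-values map to large positive Z-scores.
    z_scores = [-norm.inv_cdf(p) for p in p_values]
    z_combined = sum(z_scores) / len(z_scores) ** 0.5
    # Upper-tail probability of the combined Z-score.
    return norm.cdf(-z_combined)
```

An empirical p-value of exactly zero from the random sampling would need to be floored (for example at one over the number of samples) before being passed to such a function.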
Identifying Enriched Predicted miRNA Binding Sites
The PITA, TargetScan, miRanda and miRSVR miRNA target gene prediction databases were downloaded from their respective web sites. The significance for enrichment of genes with a predicted miRNA binding site was calculated using the hypergeometric p-value for each miRNA. The miRNA(s) with the smallest hypergeometric p-value are considered the most likely to regulate the signature. Multiple hypothesis testing correction was applied using the Benjamini-Hochberg approach for controlling the false discovery rate (FDR) equal to or less than 0.001 (FDR<0.001), and requiring at least 10% of the genes to be targeted by the specific miRNA.
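The enrichment test and filtering described above can be sketched as follows. This is a hedged Python illustration with hypothetical function and parameter names; the analyses in the example were not necessarily implemented this way.

```python
from math import comb


def hypergeom_sf(overlap, population, targets, drawn):
    """P(X >= overlap) for a signature of `drawn` genes sampled from
    `population` genes, of which `targets` carry a predicted binding site."""
    upper = min(targets, drawn)
    total = comb(population, drawn)
    tail = sum(
        comb(targets, k) * comb(population - targets, drawn - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def passes_filters(adjusted_p, overlap, signature_size):
    """Apply the thresholds from the text: FDR at most 0.001 and at least
    10% of the signature genes targeted by the miRNA."""
    return adjusted_p <= 0.001 and overlap / signature_size >= 0.10
```

Exact integer binomial coefficients are used here to avoid floating-point underflow for very small tail probabilities.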
Selecting Optimal Methods to Infer miRNA Regulatory Network
Each inference method was applied to the compendium of 50 miRNA target gene sets (Supplementary Table 2). The ROCR and pROC packages in R were used to compute ROC curves, ROC AUC and p-values between ROC curves.
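As a point of reference, the ROC AUC reported by packages such as ROCR equals the probability that a randomly chosen positive case scores higher than a randomly chosen negative case. A minimal Python sketch of that quantity is given below; the function name is hypothetical, and this is not the ROCR or pROC implementation.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of positive/negative pairs in which the positive
    has the higher score (ties count as one half)."""
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

Here a positive case would be a gene set for which the method recovered the perturbed miRNA, with the score taken from the method's ranking statistic; that pairing is an assumption about how the comparison was framed.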
miR2Disease Overlap
First, we created a mapping between the 46 cancer subtypes and the disease classifications in the manually curated miR2Disease database. Instances were then identified where an inferred miRNA regulator was previously observed to be dysregulated or causal in the same cancer type. Significance of the enrichment of overlap between miR2Disease and the cancer-miRNA regulatory network was calculated using a hypergeometric p-value in R.
Enrichment of GO biological process terms in each cancer co-expression signature was assessed using the topGO package in R (Alexa et al. 2006) by computing a hypergeometric p-value with Benjamini-Hochberg correction (FDR<0.05). All GO terms passing the significance threshold for a co-expression signature were included in downstream analyses. Semantic similarity between a significantly enriched GO term and each hallmark of cancer was assessed by using the Jiang and Conrath similarity measure as implemented in the R package GOSim (Fröhlich et al. 2007). For each co-expression signature the similarity scores between its enriched GO terms and the GO terms for each hallmark of cancer were computed, and the maximum for each hallmark was returned. Similarity scores greater than or equal to 0.8 were considered sufficient for inferring a link between the enriched GO terms for a co-expression signature and a hallmark of cancer. Random sampling of 1,000 GO terms and computing the Jiang and Conrath scores demonstrated that a similarity score greater than or equal to 0.8 resulted in a permuted p-value<5.1×10−4.
miR-29 Family Co-Expression Signature Overlaps
A hypergeometric p-value was used to test for significant overlap between the lung adenocarcinoma signature genes and the genes up-regulated in vitro upon knock-down of miR-29 family miRNAs.
The 3′ UTRs for genes of interest were amplified from cDNA (primers in Supplementary Table 12) and cloned into the pmirGLO Dual-Luciferase miRNA target expression vector behind firefly luciferase. The sequence and orientation for all 3′ UTRs inserted into pmirGLO were verified by sequencing. HEK293 cells were plated at a density of 100,000 cells per well and co-transfected in 96 well plates 24 hours after plating. Cells were transfected using DharmaFect DUO (Dharmacon) with 75 ng of the 3′ UTR fused reporter vector and 50 nM of miR-29a, miR-29b, miR-29c, miR-767-5p or cel-miR-67 (negative control) miRNA mimic (Dharmacon). Twenty-four hours after transfection firefly and renilla luciferase activities were measured using the Dual-Glo assay (Promega) on a Synergy H4 hybrid multi-mode microplate reader (BioTek) per manufacturer recommendations. Experiments were conducted in biological triplicates. Luminescence measurements were first background subtracted using a vehicle only control, and then firefly luminescence was normalized to renilla luminescence. Experimental comparisons are made to vector only controls. Student's t-test and fold-changes were calculated using standard methods. MiRNA binding sites for MMP2 and SPARC were deleted using recombinant PCR (primers in Supplementary Table 12). Dose response curves for COL3A1 and SPARC were conducted using 50 nM, 5 nM, 500 pM, 50 pM and 5 pM miRNA mimic concentrations.
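The luminescence normalization described above can be illustrated with a short Python sketch. The function names are hypothetical, and the signed fold-change convention (values below 1 reported as negative reciprocals) is an assumption inferred from the negative fold-changes reported in the examples, not a stated method.

```python
from statistics import mean


def normalized_ratio(firefly, renilla, firefly_bg, renilla_bg):
    """Background-subtract both signals, then normalize firefly to renilla."""
    return (firefly - firefly_bg) / (renilla - renilla_bg)


def fold_change(treated, control):
    """Signed fold-change of mean normalized ratios (assumed convention):
    a ratio below 1 is reported as its negative reciprocal."""
    ratio = mean(treated) / mean(control)
    return ratio if ratio >= 1.0 else -1.0 / ratio
```

With this convention, a halving of normalized luciferase activity relative to the vector only control is reported as a fold-change of −2.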
Sequences and RefSeq gene definition files were downloaded from the UCSC genome browser FTP site (ftp://hgdownload.cse.ucsc.edu/goldenPath/currentGenomes/Homo_sapiens). To reduce overlap, the set of RefSeq genes that mapped to an Entrez gene was collapsed and the regulatory regions were merged to include all potential regulatory sequences. The RefSeq to Entrez gene mapping was downloaded from the NCBI Gene FTP site (ftp://ftp.ncbi.nih.gov/gene/DATA/gene2refseq.gz). To provide a 3′ untranslated region (UTR) for as many genes as possible we set the minimum 3′ UTR length to the median annotated 3′ UTR length of 844 bp (Kertesz et al. 2007). The same approach was used for the 5′ UTR with a minimum 5′ UTR length of 183 bp. The coding sequences were acquired as they were annotated, and were not filtered in any way. All annotated introns were removed as they are present only transiently in expressed transcripts. The Weeder de novo motif detection algorithm (Pavesi et al. 2006) was then used to identify over-represented miRNA binding sites in the 3′ UTR of putatively miRNA co-regulated genes (Fan et al. 2009; Linhart et al. 2008).
miRvestigator Hidden Markov Model (HMM) from Position Specific Scoring Matrix
Two general problems are faced when comparing a miRNA seed, which is a string of nucleotides 8 base pairs long (and may be complementary for 6, 7 or 8 base pairs), to a PSSM (a matrix with a variable number of columns, each holding 4 nucleotide probabilities that must sum to 1). First, the miRNA seed sequence must be aligned to the PSSM, and second, the certainty of the match between the miRNA seed and the PSSM must be computed. The Viterbi algorithm identifies the optimal path through an HMM for an observed sequence of events, and therefore can solve both of these problems simultaneously by turning the PSSM into a hidden Markov model (HMM) and the miRNA seed nucleotide sequence into the observed sequence of events. The overall structure of the miRvestigator HMM is described in
The significance of the Viterbi optimal state path probability for a given miRNA is then calculated by exhaustively computing the complete distribution of Viterbi optimal state path probabilities for all potential miRNA k-mer seed sequences (where k=6, 7 or 8 base pairs). Only k-mers which are present in the regulatory regions of the transcripts being investigated are included in the exhaustive computation. The complete distribution of Viterbi probabilities is then used to provide a p-value for each miRBase miRNA seed sequence by counting the number of k-mers with a Viterbi optimal state path probability greater than or equal to that of the miRNA seed of interest and dividing by the total number of potential k-mers. This provides a p-value for the alignment and match of each miRNA seed sequence to a PSSM identified from cis-regulatory regions. The miRNAs are then ranked based upon the Viterbi optimal state path p-values, and the miRNA(s) with the smallest p-value are the most likely to regulate the set of transcripts.
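The exhaustive p-value calculation can be illustrated with the Python sketch below. It is a simplified stand-in, not the miRvestigator HMM: the Viterbi optimal state path probability is replaced by the best ungapped product of PSSM column probabilities, all names are hypothetical, and the candidate k-mers are assumed to be written in the same orientation as the motif.

```python
from itertools import product


def best_alignment_score(kmer, pssm):
    """Highest product of PSSM column probabilities over all ungapped
    offsets (a simplified stand-in for the Viterbi path probability)."""
    best = 0.0
    for offset in range(len(pssm) - len(kmer) + 1):
        score = 1.0
        for i, nt in enumerate(kmer):
            score *= pssm[offset + i][nt]
        best = max(best, score)
    return best


def empirical_p_value(site, pssm, background_kmers):
    """Fraction of background k-mers scoring at least as well as `site`."""
    observed = best_alignment_score(site, pssm)
    hits = sum(
        1 for k in background_kmers
        if best_alignment_score(k, pssm) >= observed
    )
    return hits / len(background_kmers)


# Toy 6-column motif strongly favoring ACAUUC, scored against every 6-mer.
consensus = "ACAUUC"
toy_pssm = [{n: (0.85 if n == c else 0.05) for n in "ACGU"} for c in consensus]
all_6mers = ["".join(p) for p in product("ACGU", repeat=6)]
p_consensus = empirical_p_value(consensus, toy_pssm, all_6mers)
```

In the toy case only the consensus itself scores as high as the consensus, so its p-value is 1 in 4,096. The procedure described above restricts the background to k-mers observed in the regulatory regions, which would shrink the denominator accordingly.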
Modeling Wobble Base-Pairing with miRvestigator HMM
Wobble base-pairing was included in the miRvestigator HMM for the case where a G=U wobble base-pairing defines the miRNA to protein coding transcript complementarity (Baek et al. 2008; Guo et al. 2010; Hendrickson et al. 2009; Selbach et al. 2008). The individual miRNA to protein coding transcript G=U wobble base-pairing is a problem that will need to be solved at the level of de novo motif identification. A wobble base-pairing state is added to the model only if a G and/or U have a nucleotide seed frequency of 25%. For the case where the G seed nucleotide frequency is greater than 25% and the U seed nucleotide frequency is below 25% the wobble state emits the nucleotide A with a probability of 1. For the case where the U seed nucleotide frequency is greater than 25% and the G seed nucleotide frequency is below 25% the wobble state emits the nucleotide C with a probability of 1. For the case where both the G and U seed nucleotide frequencies are greater than 25% the wobble state emits A and C with a probability of 0.5. When a wobble state is added the transition probability from the PSSMn state to the WOBBLEn+1 state is set to 0.19, the transition probability from the PSSMn state to the PSSMn+1 state is set to 0.8, and the transition probability from the PSSMn state to the NM2 state remains at 0.01. The transition probability from the wobble state WOBBLEn to PSSMn+1 is set to 1, which precludes a wobble base-pairing at the terminus of a state path for either transitioning to the NM2 state or to the end state.
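The wobble-state rules above can be summarized in a short Python sketch. The function and state names are hypothetical; treating a frequency of exactly 25% as not exceeding the threshold, and the 0.99 PSSM-to-PSSM transition probability when no wobble state is present, are assumptions rather than values stated in the text.

```python
def wobble_emissions(g_freq, u_freq, threshold=0.25):
    """Emission probabilities of the wobble state for one motif position,
    or None when no wobble state is added."""
    if g_freq > threshold and u_freq > threshold:
        return {"A": 0.5, "C": 0.5}
    if g_freq > threshold:
        return {"A": 1.0}
    if u_freq > threshold:
        return {"C": 1.0}
    return None


def pssm_state_transitions(next_has_wobble):
    """Transition probabilities out of state PSSM_n."""
    if next_has_wobble:
        return {"PSSM_n+1": 0.80, "WOBBLE_n+1": 0.19, "NM2": 0.01}
    # Assumed: without a wobble state, the 0.19 returns to PSSM_n+1.
    return {"PSSM_n+1": 0.99, "NM2": 0.01}


# A wobble state always continues to the next PSSM state.
WOBBLE_TRANSITIONS = {"PSSM_n+1": 1.0}
```

In both cases the outgoing probabilities of a PSSM state sum to 1, and the single transition out of a wobble state reflects the constraint that a wobble pairing cannot end a state path.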
Inferring miRNA Mediated Regulation through Analysis of Co-Expressed Genes
The inference of a miRNA regulatory network can be accomplished in two ways. The first approach requires prior knowledge of genome-wide binding site locations for known miRNAs (Sethupathy et al. 2006). There are many algorithms that utilize this target enrichment strategy for inference of miRNA regulatory networks (Betel et al. 2010; Grimson et al. 2007; Linhart et al. 2008). The second approach performs the de novo discovery of conserved putative miRNA-binding sites within the 3′ UTRs of co-expressed genes. Weeder is one such algorithm that accurately discovers conserved cis-regulatory elements in 3′ UTRs (Fan et al. 2009; Linhart et al. 2008). The information on conserved cis-regulatory sequences can then be utilized for pattern matching to seed sequences of known miRNAs in miRBase. We had previously reported a web framework using the miRvestigator algorithm for performing such pattern matching (Plaisier et al. 2011). Here, we present results on the performance of Weeder and miRvestigator applied to simulated datasets. We then utilize a compendium of experimentally generated data from targeted miRNA perturbation studies to demonstrate that restricting Weeder's search space to 3′ UTR sequences increases the sensitivity and specificity of Weeder-miRvestigator. Finally, we use the compendium to compare the performance of algorithms for the inference of miRNA regulation and combine the optimal methods into an integrated framework.
We constructed a framework for accurate inference of miRNA-mediated regulation using as input just the 3′ UTR sequences of co-expressed genes by coupling Weeder de novo motif detection and miRvestigator for subsequent association to known miRNA seeds (
Restricting Searches to 3′ UTR Increases Sensitivity and Specificity of Weeder-miRvestigator
MiRNA target prediction algorithms (including PITA, TargetScan, miRanda, and miRSVR) improved their performance by restricting searches to the 3′ UTRs of transcripts where it has been demonstrated statistically that functional miRNA binding sites are preferentially located (Grimson et al. 2007). To determine the validity of this heuristic we investigated the distribution of functional miRNA binding sites within co-regulated transcripts by applying Weeder-miRvestigator to full transcript sequences (5′ UTR, coding sequence (CDS) and 3′ UTR). First, we compiled a compendium of miRNA target gene sets from 50 transcriptomes that were generated by perturbing specific miRNAs (22 independent studies, 41 unique miRNAs, Supplementary Table 2). The analysis was then restricted to target gene sets in the compendium where Weeder-miRvestigator was able to identify the corresponding perturbed miRNA (27 of 50 sets). The 3′ UTRs were significantly enriched for miRNA-binding sites with 8 bp complementarity to the miRNA seed sequence (p-value=3.2×10−5,
Selecting Optimal Methods to Infer a Comprehensive miRNA Regulatory Network
While multiple hypothesis testing correction procedures can reduce the number of false positives (incorrectly inferred regulatory interactions), they also result in a higher false negative rate (i.e. missing regulatory interactions). Therefore, we hypothesized that integrating results from multiple inference methods would construct a more comprehensive cancer-miRNA regulatory network, as each method identifies a different subset of the miRNA regulatory network. To assess this we first identified the best performing network inference methods by computing a ROC curve from the predictions of applying each method to the compendium of experimentally determined miRNA target gene sets. In addition to Weeder-miRvestigator, we tested four additional algorithms that infer miRNA regulation through enrichment of predicted binding sites in 3′ UTRs of co-expressed genes: PITA, TargetScan, miRanda and miRSVR. This comparative analysis demonstrated that Weeder-miRvestigator, PITA and TargetScan are the best performing algorithms for inference of miRNA mediated regulation (
A previous study published by Goodarzi, et al. analyzed transcriptome profiles from 46 different cancers and identified 2,240 cancer-subtype characteristic co-expression signatures. Interestingly, the authors were able to associate only four of these signatures to regulation by a specific miRNA in miRBase (Goodarzi et al. 2009). We analyzed these co-expression signatures using FIRM with the intent of constructing a comprehensive cancer-miRNA regulatory network. Weeder-miRvestigator, PITA and TargetScan predicted miRNA regulators for 119, 662 and 1,029 co-expression signatures, respectively (Weeder-miRvestigator criteria: perfect 7-mer or 8-mer match, FDR<0.05, Supplementary Table 4; PITA and TargetScan criteria: FDR<0.001 and enrichment>10%, Supplementary Tables 5 and 6, respectively). There was significant overlap in pairwise comparisons of predictions for the same cancer (Weeder-miRvestigator vs. PITA=0.045, Weeder-miRvestigator vs. TargetScan=0.019 and PITA vs. TargetScan=7.4×10−22;
The Cancer-miRNA Network Recapitulates miR2Disease and Discovers miRNAs that are Causal in Cancers
We investigated whether the cancer-miRNA regulatory network was able to recapitulate miRNAs that are both dysregulated in tumors and causally linked to specific oncogenic processes. We performed this analysis by comparing the cancer-miRNA network to entries in miR2Disease, a manually curated database of miRNAs that are dysregulated and causally associated with 163 human diseases, including the 46 cancers in our study. Remarkably, there was significant enrichment of known dysregulated miRNAs in the cancer-miRNA network. Altogether 191 putative miRNA regulators in our inferred network were previously shown to be dysregulated in patient tumors of the same cancer type (p-value=2.1×10−91, Supplementary Table 7). Importantly, there were significant overlaps with predictions by each of the three algorithms (Weeder-miRvestigator p-value=0.029, PITA p-value=7.4×10−23 and TargetScan p-value=1.1×10−32). This result further demonstrates the value of combining the three algorithms in FIRM to infer a more comprehensive miRNA regulatory network.
Using miR2Disease, we further investigated whether the dysregulated miRNAs predicted by FIRM were also known to causally influence cancer phenotypes. It was striking that over a third of the putative miRNA regulators that were dysregulated were also known to causally affect cancer phenotypes (66 miRNAs, p-value=1.4×10−34, Supplementary Table 7). Among these, three of the most highly connected miRNAs (miR-29b, miR-200b and miR-296-5p) were dysregulated in at least 8 cancers and causal in at least 4 cancers. These results demonstrate that the network inferred by FIRM had captured disease-relevant miRNA regulation of cancer. It also suggests that the network contains novel testable hypotheses regarding the role of miRNAs in regulation of cancer beyond what is documented in miR2Disease. A key next step is the prioritization of these novel testable hypotheses by integrating orthogonal information.
Identifying miRNAs Regulating the Hallmarks of Cancer
Associating a miRNA to a co-expression signature in patient tumors does not by itself implicate it in the regulation of key oncogenic processes. However, the network enables the discovery of cancer-relevant miRNAs through analysis of target genes for functional enrichment of one or more hallmarks of cancer (Douglas Hanahan and Robert A Weinberg 2011; D Hanahan and R A Weinberg 2000): 1) “self sufficiency in growth signals”, 2) “insensitivity to antigrowth signals”, 3) “evading apoptosis”, 4) “limitless replicative potential”, 5) “sustained angiogenesis”, 6) “tissue invasion and metastasis”, 7) “genome instability and mutation”, 8) “tumor promoting inflammation”, 9) “reprogramming energy metabolism”, and 10) “evading immune detection”. We analyzed genes within each of the co-expression signatures for hallmarks of cancer through their associations to specific Gene Ontology (GO) biological process terms.
In total 627 of the 2,240 co-expression signatures were significantly enriched for GO terms (FDR<0.05), and 314 were associated with a putative miRNA in the regulatory network (Supplementary Table 8). To further filter this set and discover specific co-expression signatures associated with oncogenesis, we manually curated the lowest level GO terms for each of the 10 hallmarks of cancer (Supplementary Table 9), e.g. the hallmark of cancer “Evading Apoptosis” is associated with the GO term “Positive Regulation of Anti-Apoptosis”. Based on semantic similarity between GO terms we then associated 158 of the 314 putatively miRNA regulated co-expression signatures to one or more hallmarks of cancer (Jiang-Conrath Semantic Similarity Score>0.8, permuted p-value<5.1×10−4, Supplementary Table 8).
Metastatic potential is one of the defining features of malignant tumors, making putative miRNA-regulators of “tissue invasion and metastasis” excellent biomarker candidates. As an initial filter we selected 85 of the 158 “hallmarks of cancer”-associated co-expression signatures that had significant overlap (p-value<0.05) between GO annotated- and putatively miRNA-regulated genes. Next, we extracted from these 85 co-expression signatures a subnetwork of 33 miRNAs and their predicted regulatory influences on 47 co-expression signatures associated with “tissue invasion and metastasis”, i.e. the metastatic cancer miRNA-regulatory network (
A Relatively Small Subset of miRNAs Regulate Oncogenic Processes in Diverse Cancers
Regulation of the same oncogenic process by the same miRNA across different cancers reinforces the likelihood that the inferred miRNA regulation is real. In the cancer-miRNA regulatory network the number of co-expression signatures regulated by a miRNA follows a power-law distribution (γ=2.1±0.0; goodness of fit p-value<1.0×10−4) with each miRNA predicted to regulate on average 3.3±3.3 co-expression signatures (Barabasi and Albert 1999). This suggests that some miRNAs regulate common biological processes across multiple cancers. Therefore, we filtered the cancer-miRNA regulatory network for miRNAs predicted to regulate genes within two or more co-expression signatures enriched for the same GO term(s). This analysis recovered 24 miRNAs that were predicted to combinatorially regulate 74 non-redundant co-expression signatures. Again, using semantic similarity to the hallmarks of cancer we discovered a subnetwork of 38 co-expression signatures from 30 cancer types that are regulated by 13 highly connected miRNAs (miR-29a/b/c, miR-130a, miR-296-5p, miR-338-5p, miR-369-5p, miR-656, miR-760, miR-767-5p, miR-890, miR-939, miR-1275, miR-1276 and miR-1291), i.e. a cross-cancer-miRNA regulatory network (
Extracellular Matrix Genes Co-Regulated by miR-29 Family in Lung Adenocarcinoma
In both the metastatic and cross-cancer-miRNA regulatory network, the miR-29 family (miR-29a, miR-29b and miR-29c) was predicted to be responsible for 8 co-expression signatures, five of which were associated with four hallmarks of cancer, viz. “tissue invasion and metastasis”, “sustained angiogenesis”, “insensitivity to anti-growth signals” and “self sufficiency in growth signals” (
Two independent studies demonstrated that over-expression of miR-29a reduces the invasiveness of lung carcinoma cell lines (Muniyappa et al. 2009) and that knock-down of miR-29b increases invasiveness (Rothschild et al. 2012), serving as independent validation of the network-predicted role of the miR-29 family as regulators of “activating invasion and metastasis” in lung cancer. The direction of this association is concordant with a different set of studies which independently discovered that miR-29 family members were down-regulated in lung adenocarcinomas relative to normal lung (Landi et al. 2010; Yanaihara et al. 2006). Taken together, these orthogonal sets of results strongly suggest that down-regulation of the miR-29 family increases tumor invasiveness, thereby decreasing patient survival (Rothschild et al. 2012).
A major strength of the cancer-miRNA regulatory network is that it identifies specific genes that are directly regulated by a specific miRNA. For instance, the miR-29 family is implicated in modulating metastatic potential of patient tumors because it is predicted to directly regulate 79 and 64 genes in two co-expression signatures, “AD Lung Beer 31” and “AD Lung Bhattacharjee 59”. Notably, the two co-expression signatures have a significant overlap of 32 genes (p-value=2.1×10−46). We assessed whether these genes were indeed targets for regulation by the miR-29 family by investigating if they were differentially regulated when endogenous miRNAs of the miR-29 family were knocked-down in a fetal lung fibroblast cell line (Cushing et al. 2011). Sixteen genes from “AD Lung Beer 31”, and 9 genes from “AD Lung Bhattacharjee 59” were up-regulated in response to knock-down of the three miR-29 family members (p-values=6.1×10−14 and 1.5×10−8, respectively). Altogether 17 genes from both co-expression signatures were up-regulated in the Cushing et al. study (Table 1), and notably all of these genes contain one or more miR-29 family binding sites in their 3′ UTRs (Table 1).
Differential regulation of the seventeen genes in the Cushing et al. study does not demonstrate direct regulation by miR-29 family miRNAs through physical interaction with predicted binding sites within 3′ UTRs of these genes. However, it is possible to demonstrate direct miRNA regulation by fusing the 3′ UTR of each putative target gene to a luciferase reporter, selectively deleting specific binding sites, and performing luciferase assays in cell lines that are co-transfected with the wildtype or mutated reporter-fusion construct and the synthetic miRNA mimic (at different concentrations) (Lal et al. 2011). We selected a total of 8 genes (COL3A1, COL4A1, COL4A2, FBN1, MMP2, PDGFRB, SERPINH1, and SPARC; see Table 1) to investigate using the aforementioned luciferase assay whether they were direct targets for regulation by miR-29 family miRNAs (miR-29a, miR-29b and miR-29c). These genes were selected because they were predicted by the FIRM methods to (i) be in co-expression signatures regulated by the miR-29 family, (ii) contain miR-29 family binding sites, (iii) have functional association to “tissue invasion and metastasis” (e.g. collagens, metallo-proteases, etc.), and (iv) be up-regulated by miR-29 family knock-down in lung fibroblasts in the Cushing et al. study.
First, we used qRT-PCR to demonstrate that the miR-29a mimic significantly down-regulates transcript levels of luciferase when it is fused to 3′ UTRs of either COL3A1 or SPARC (COL3A1 p-value=3.2×10−2, fold-change=−3.9; SPARC p-value=4.2×10−2, fold-change=−1.7). This validates our central thesis that perturbing a miRNA results in observable changes in transcript levels of the predicted target transcripts with corresponding miRNA-binding sites in the 3′ UTR. We then assayed the effects of all three miR-29 mimics (miR-29a, miR-29b and miR-29c) on normalized luciferase activity relative to a control (i.e. no miRNA mimic). Significant reduction in normalized luciferase expression (p-value<0.05) was observed for 7 of the 8 genes tested (Table 2), and there was no consequence when luciferase was fused to the negative control 3′ UTR from HIST1H2AC (miR-29a: p-value=0.99, fold-change=1.2). Deletion of all the putative miR-29 binding sites from the 3′ UTRs of MMP2 and SPARC abolished down-regulation of luciferase activity by the miR-29 family mimics, conclusively demonstrating that miR-29 directly regulates abundance of predicted target transcripts via binding to the predicted 3′ UTR sites (MMP2-deletion: 1 site deleted, fold-change=1.1, p-value=8.6×10−1; SPARC-deletion: 2 sites deleted, fold-change=1.4, p-value=1.0;
Finally, titration of the miR-29a mimic demonstrated that it down-regulates COL3A1 and SPARC in a dose-dependent manner (
miR-767-5p Regulates a Collagen-Specific Subset of miR-29 Target Genes
Analysis of predicted regulation by miR-29 demonstrates that the cancer-miRNA regulatory network makes accurate predictions that can be validated experimentally through a combination of miRNA perturbation and targeted mutagenesis of specific binding sites in the 3′ UTRs. We conducted further experimental analysis of predicted regulation by miR-767-5p to assess the specificity of using FIRM inferences to identify genes regulated by a miRNA. We selected miR-767-5p because this miRNA partially shares the miR-29 seed sequence. Specifically, both the metastatic and cross cancer-miRNA regulatory networks contain the PITA predictions that miR-767-5p regulates genes associated with four hallmarks of cancer (“insensitivity to antigrowth signals”, “self sufficiency in growth signals”, “sustained angiogenesis” and “tissue invasion and metastasis”) from four co-expression signatures (AD Ovarian Welsh 20, HSCC Head-Neck Chung 1, and SQ Bhattacharjee 18 and 44) across 3 cancer types (Bhattacharjee et al. 2001; Chung et al. 2004; Welsh et al. 2001).
Unlike the miR-29 family, miR-767-5p has not been previously associated with any oncogenic processes. Therefore, we first evaluated whether there is any evidence for expression of miR-767-5p in head and neck, lung, or ovarian cancers to support the prediction by the cancer-miRNA regulatory network. A scan of miRNA-seq data from The Cancer Genome Atlas (TCGA) shows that miR-767-5p is indeed expressed in lung squamous cell carcinoma, head and neck squamous cell carcinoma, and ovarian serous cystadenocarcinoma (data not shown). Additionally, the MirZ miRNA expression atlas identifies miR-767-5p expression in astrocytoma, osteosarcoma and teratocarcinoma cell lines (Hausser et al. 2009). Future studies with the completed TCGA data will be able to determine whether miR-767-5p is differentially expressed between tumor and normal and whether miR-767-5p is predictive of patient survival. Based on this evidence we proceeded to test the effect of perturbing miR-767-5p on transcript abundance of the PITA predicted targets. Over-expression of miR-767-5p using a miRNA mimic led to significant reduction (p-value<0.05) in the normalized luciferase activity for 3 of the 4 predicted miRNA target genes (COL3A1, COL5A2, COL10A1 and LOX; Table 2).
In addition to validating a novel oncogenesis-associated miRNA, the aforementioned rationale for selecting miR-767-5p was that it also shares 6 bp of similarity to the 8 bp seed region of the miR-29 family leading to a significant overlap between their predicted target genes (65% for PITA and 35% for TargetScan). This may explain why miR-767-5p and the miR-29 family are both predicted regulators of the HSCC Head-Neck Chung 1 co-expression signature. However, the two seed sequences have little similarity in the 3′ region (
The FIRM approach was used to identify miRNAs regulating a number of hallmarks of cancer as described above as well as additional hallmarks of cancer. The miRNAs associated with additional hallmarks of cancer are set out in
As genome-wide analyses for discovery of molecular signatures of complex disease become routine, it is imperative that these data are integrated into predictive and actionable models that drive targeted hypothesis-driven discovery of diagnostics, prognostics and, ultimately, therapeutics. The systems integration of disparate kinds of information boosts signal to noise, enabling the discovery of biologically meaningful patterns as we have demonstrated here through inference of a cancer miRNA regulatory network. The success of the FIRM approach depended not only on integration of three best performing algorithms that use complementary strategies for inference of miRNA regulatory networks, but also on the integration of disparate data types such as gene co-expression, and distributions of both known and de novo discovered miRNA binding sites (
Further, we have also demonstrated that by incorporating the mechanistic basis of miRNA regulation, i.e., binding to complementary sequences in the 3′ UTRs of co-expressed genes, the network can be more easily assayed with targeted experimental and functional evaluation. In doing so, we were able to demonstrate that the cancer-miRNA regulatory network had captured a significant proportion of known miRNA dysregulation and its causal influence on cancer phenotypes. In fact, the network also made specific, experimentally testable novel predictions regarding the role of 158 miRNAs in mediating co-expression of genes associated with oncogenic processes. Among these were 33 miRNAs that were predicted to regulate metastatic processes, including a core set of 13 miRNAs that were predicted to regulate the same set of oncogenic processes across different cancer types. Our focused investigation of the role of miR-29 family in promoting metastasis in lung adenocarcinoma demonstrates how these network predictions can drive discovery of new biology.
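The mechanistic basis mentioned above, binding of a miRNA seed to complementary sequence in a 3′ UTR, can be illustrated with a minimal Python sketch. It is not the FIRM or miRvestigator implementation. It assumes a simple 7-nucleotide seed definition (miRNA positions 2 to 8) and exact Watson-Crick matching, and the miRNA and UTR sequences in the usage note are hypothetical, chosen only for illustration. The function names are likewise hypothetical.

```python
# Sketch only: exact seed-complement scanning in a 3' UTR.
# Assumption: the seed is miRNA positions 2-8 (1-based, inclusive).

RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq):
    """Return the reverse complement of an RNA sequence (A, C, G, U only)."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(seq))


def seed_site(mirna, start=2, end=8):
    """Return the UTR motif (5'->3') that pairs with miRNA positions start..end."""
    seed = mirna.upper()[start - 1:end]
    return reverse_complement_rna(seed)


def find_seed_sites(utr, mirna):
    """Return 0-based start positions of exact seed-complement matches in the UTR.

    The UTR may be given in the DNA or RNA alphabet; T is converted to U.
    Overlapping matches are reported.
    """
    utr = utr.upper().replace("T", "U")
    site = seed_site(mirna)
    positions = []
    index = utr.find(site)
    while index != -1:
        positions.append(index)
        index = utr.find(site, index + 1)
    return positions
```

For example, with the hypothetical miRNA `UACGGAUCCAGUUCGAAUGGCA`, `seed_site` gives the motif `GAUCCGU`, and `find_seed_sites("AAAGATCCGTCCCGATCCGTAA", "UACGGAUCCAGUUCGAAUGGCA")` reports two matches. Published predictors such as PITA and TargetScan additionally weigh site accessibility and conservation, which this sketch deliberately omits.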
As a generalizable framework for inferring miRNA-mediated regulation, FIRM will also benefit from simultaneous measurement of changes in miRNA and mRNA levels in patient tumors. However, negative correlation with gene expression changes alone does not accurately identify bona fide targets for the miRNA (Tsunglin Liu et al. 2007; Ritchie et al. 2009; Liang Wang et al. 2009). Thus, clustering of the gene expression data and subsequent analysis with FIRM will be necessary for the inference of accurate miRNA regulatory networks. Correlation with the putative miRNA regulators could be used post hoc as a secondary screen to filter the predicted list of targets and to prioritize miRNAs for further experimental validation. We have demonstrated the power of this approach by performing targeted experiments to test predictions from the cancer-miRNA regulatory network. These experiments have discovered novel regulation of specific oncogenesis-associated genes by miRNAs that are shared across different cancer types. Importantly, in addition to providing mechanistic linkages between a known tumor suppressor miRNA (miR-29) and regulation of specific genes with metastatic potential, we have also discovered a novel oncogenesis-associated miRNA (miR-767-5p). The choice of miRNAs for validating network predictions has also helped to highlight the sensitivity and specificity of FIRM performance. As such, we have demonstrated not only the extraordinary value of the cancer-miRNA network in cancer research, but also the power of FIRM to construct, from easily generated gene expression data, similar miRNA regulatory networks for any disease.
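The post hoc correlation screen described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not part of FIRM: it assumes matched miRNA and mRNA measurements across the same samples, uses Pearson correlation (the text does not specify a correlation measure), and applies an arbitrary cutoff of -0.5. The function names and the gene labels in the usage note are hypothetical.

```python
# Sketch only: keep predicted targets whose expression is negatively
# correlated with the putative miRNA regulator across matched samples.
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.

    Returns 0.0 when either sequence has zero variance.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def filter_targets(mirna_expr, target_expr, max_r=-0.5):
    """Return {gene: r} for predicted targets with correlation r <= max_r.

    mirna_expr  : miRNA levels, one value per sample
    target_expr : dict mapping gene name to mRNA levels in the same sample order
    max_r       : assumed cutoff; more negative values are stricter
    """
    kept = {}
    for gene, values in target_expr.items():
        r = pearson(mirna_expr, values)
        if r <= max_r:
            kept[gene] = r
    return kept
```

Calling `filter_targets([1, 2, 3, 4, 5], {"GENE_A": [5, 4, 3, 2, 1], "GENE_B": [1, 2, 3, 4, 5]})` retains only the hypothetical `GENE_A`. As the text notes, such a filter is a secondary screen applied to an already predicted target list, since negative correlation alone does not identify bona fide targets.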
We contemplate integrating inference of miRNA regulation into the clustering procedure. This will act as a constraint for accurate discovery of genes co-regulated by the same miRNA. The cMonkey biclustering algorithm already incorporates de novo discovery of transcription factor binding sites within gene promoters to limit the space of gene-gene associations to accurately discover sets of genes that are regulated by the same transcription factor (Reiss et al. 2006). The incorporation of constraints based on mechanisms of miRNA regulation will greatly improve the ability of cMonkey to model eukaryotic transcriptional regulatory networks. We contemplate that the ability of cMonkey to discover conditional coregulation of genes increases the sensitivity of FIRM and also provides the context (disease type, stage of progression, etc.) for regulatory influence of a miRNA.
Availability of miRvestigator, FIRM and Cancer-miRNA Regulatory Network
MiRvestigator was developed as an open source project using the Python programming language and is available both as a web service (http://mirvestigator.systemsbiology.net) and as source code (http://github.com/cplaisier/miRvestigator) (Plaisier et al. 2011). FIRM and the cancer-miRNA regulatory network are freely available at http://cmrn.systemsbiology.net.
To facilitate reader access and usability, we have developed and hosted a freely available website (http://cmrn.systemsbiology.net) containing: 1) all data contained within the cancer-miRNA regulatory network, 2) the compendium of 50 experimentally defined miRNA target gene sets, and 3) the FIRM framework to infer miRNA regulatory networks from gene co-expression information. Our hope is that this will provide cancer researchers with a usable interface to explore the cancer-miRNA regulatory network, computational biologists with a valuable resource to compare methods of inferring miRNA-mediated regulation, and researchers with the tools to infer miRNA regulatory networks for their disease of interest.
While the present invention has been described in terms of various embodiments and examples, it is understood that variations and improvements will occur to those skilled in the art. Therefore, only such limitations as appear in the claims should be placed on the invention.
All documents referred to in this application, including priority documents, are hereby incorporated by reference in their entirety with particular attention to the content for which they are referred.
Alexa A, Rahnenführer J, and Lengauer T. 2006. Improved scoring of functional groups from gene expression data by decorrelating GO graph structure. Bioinformatics 22: 1600-1607.
Baek D, Villén J, Shin C, Camargo F D, Gygi S P, and Bartel D P. 2008. The impact of microRNAs on protein output. Nature 455: 64-71.
Barabási A-L, and Albert R. 1999. Emergence of scaling in random networks. Science 286: 509-512.
Bartel D P. 2009. MicroRNAs: target recognition and regulatory functions. Cell 136: 215-233.
Beer D G, Kardia S L R, Huang C-C, Giordano T J, Levin A M, Misek D E, Lin L, Chen G, Gharib T G, Thomas D G, et al. 2002. Gene-expression profiles predict survival of patients with lung adenocarcinoma. Nat. Med. 8: 816-824.
Betel D, Koppal A, Agius P, Sander C, and Leslie C. 2010. Comprehensive modeling of microRNA targets predicts functional non-conserved and non-canonical sites. Genome Biol. 11: R90.
Betel D, Wilson M, Gabow A, Marks D S, and Sander C. 2008. The microRNA.org resource: targets and expression. Nucleic Acids Res. 36: D149-153.
Bhattacharjee A, Richards W G, Staunton J, Li C, Monti S, Vasa P, Ladd C, Beheshti J, Bueno R, Gillette M, et al. 2001. Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses. Proc. Natl. Acad. Sci. U.S.A. 98: 13790-13795.
Boll K, Reiche K, Kasack K, Morbt N, Kretzschmar A K, Tomm J M, Verhaegh G, Schalken J, von Bergen M, Horn F, et al. 2012. MiR-130a, miR-203 and miR-205 jointly repress key oncogenic pathways and are downregulated in prostate carcinoma. Oncogene. http://www.ncbi.nlm.nih.gov/pubmed/22391564 (Accessed Apr. 12, 2012).
Brennecke J, Stark A, Russell R B, and Cohen S M. 2005. Principles of microRNA-target recognition. PLoS Biol. 3: e85.
Brueckner B, Stresemann C, Kuner R, Mund C, Musch T, Meister M, Sültmann H, and Lyko F. 2007. The human let-7a-3 locus contains an epigenetically regulated microRNA gene with oncogenic function. Cancer Res. 67: 1419-1423.
Ceppi M, Pereira P M, Dunand-Sauthier I, Bums E, Reith W, Santos M A, and Pierre P. 2009. MicroRNA-155 modulates the interleukin-1 signaling pathway in activated human monocyte-derived dendritic cells. Proc. Natl. Acad. Sci. U.S.A. 106: 2735-2740.
Chang T-C, Wentzel E A, Kent O A, Ramachandran K, Mullendore M, Lee K H, Feldmann G, Yamakuchi M, Ferlito M, Lowenstein C J, et al. 2007. Transactivation of miR-34a by p53 broadly influences gene expression and promotes apoptosis. Mol. Cell 26: 745-752.
Chung C H, Parker J S, Karaca G, Wu J, Funkhouser W K, Moore D, Butterfoss D, Xiang D, Zanation A, Yin X, et al. 2004. Molecular classification of head and neck squamous cell carcinomas using patterns of gene expression. Cancer Cell 5: 489-500.
Cushing L, Kuang P P, Qian J, Shao F, Wu J, Little F, Thannickal V J, Cardoso W V, and Lu J. 2011. miR-29 is a major regulator of genes associated with pulmonary fibrosis. Am. J. Respir. Cell Mol. Biol. 45: 287-294.
Dalmay T, and Edwards D R. 2006. MicroRNAs and the hallmarks of cancer. Oncogene 25: 6170-6175.
Fan D, Bitterman P B, and Larsson O. 2009. Regulatory element identification in subsets of transcripts: comparison and integration of current computational methods. RNA 15: 1469-1482.
Fasanaro P, Greco S, Lorenzi M, Pescatori M, Brioschi M, Kulshreshtha R, Banfi C, Stubbs A, Calin G A, Ivan M, et al. 2009. An integrated approach for experimental target identification of hypoxia-induced miR-210. J. Biol. Chem. 284: 35134-35143.
Frankel L B, Christoffersen N R, Jacobsen A, Lindow M, Krogh A, and Lund A H. 2008. Programmed cell death 4 (PDCD4) is an important functional target of the microRNA miR-21 in breast cancer cells. J. Biol. Chem. 283: 1026-1033.
Friedman R C, Farh K K-H, Burge C B, and Bartel D P. 2009. Most mammalian mRNAs are conserved targets of microRNAs. Genome Res. 19: 92-105.
Fröhlich H, Speer N, Poustka A, and Beissbarth T. 2007. GOSim - an R-package for computation of information theoretic GO similarities between terms and gene products. BMC Bioinformatics 8: 166.
Garofalo M, and Croce C M. 2011. microRNAs: Master regulators as potential therapeutics in cancer. Annu. Rev. Pharmacol. Toxicol. 51: 25-43.
Georges S A, Biery M C, Kim S-Y, Schelter J M, Guo J, Chang A N, Jackson A L, Carleton M O, Linsley P S, Cleary M A, et al. 2008. Coordinated regulation of cell cycle transcripts by p53-Inducible microRNAs, miR-192 and miR-215. Cancer Res. 68: 10105-10112.
Goodarzi H, Elemento O, and Tavazoie S. 2009. Revealing global regulatory perturbations across human cancers. Mol. Cell 36: 900-911.
Grimson A, Farh K K-H, Johnston W K, Garrett-Engele P, Lim L P, and Bartel D P. 2007. MicroRNA targeting specificity in mammals: determinants beyond seed pairing. Mol. Cell 27: 91-105.
Hanahan D, and Weinberg R A. 2000. The hallmarks of cancer. Cell 100: 57-70.
Hausser J, Berninger P, Rodak C, Jantscher Y, Wirth S, and Zavolan M. 2009. MirZ: an integrated microRNA expression atlas and target prediction resource. Nucleic Acids Res. 37: W266-272.
He L, He X, Lim L P, de Stanchina E, Xuan Z, Liang Y, Xue W, Zender L, Magnus J, Ridzon D, et al. 2007. A microRNA component of the p53 tumour suppressor network. Nature 447: 1130-1134.
Hendrickson D G, Hogan D J, Herschlag D, Ferrell J E, and Brown P O. 2008. Systematic identification of mRNAs recruited to argonaute 2 by specific microRNAs and corresponding changes in transcript abundance. PLoS ONE 3: e2126.
Jiang Q, Wang Y, Hao Y, Juan L, Teng M, Zhang X, Li M, Wang G, and Liu Y. 2009. miR2Disease: a manually curated database for microRNA deregulation in human disease. Nucleic Acids Res. 37: D98-104.
Johnson C D, Esquela-Kerscher A, Stefani G, Byrom M, Kelnar K, Ovcharenko D, Wilson M, Wang X, Shelton J, Shingara J, et al. 2007. The let-7 microRNA represses cell proliferation pathways in human cells. Cancer Res. 67: 7713-7722.
Karginov F V, Conaco C, Xuan Z, Schmidt B H, Parker J S, Mandel G, and Hannon G J. 2007. A biochemical approach to identifying microRNA targets. Proc. Natl. Acad. Sci. U.S.A. 104: 19291-19296.
Kertesz M, Iovino N, Unnerstall U, Gaul U, and Segal E. 2007. The role of site accessibility in microRNA target recognition. Nat. Genet. 39: 1278-1284.
Kozomara A, and Griffiths-Jones S. 2011. miRBase: integrating microRNA annotation and deep-sequencing data. Nucleic Acids Res. 39: D152-157.
Lal A, Thomas M P, Altschuler G, Navarro F, O'Day E, Li X L, Concepcion C, Han Y-C, Thiery J, Rajani D K, et al. 2011. Capture of microRNA-bound mRNAs identifies the tumor suppressor miR-34a as a regulator of growth factor signaling. PLoS Genet. 7: e1002363.
Landi M T, Zhao Y, Rotunno M, Koshiol J, Liu H, Bergen A W, Rubagotti M, Goldstein A M, Linnoila I, Marincola F M, et al. 2010. MicroRNA expression differentiates histology and predicts survival of lung cancer. Clin. Cancer Res. 16: 430-441.
Lim L P, Lau N C, Garrett-Engele P, Grimson A, Schelter J M, Castle J, Bartel D P, Linsley P S, and Johnson J M. 2005. Microarray analysis shows that some microRNAs downregulate large numbers of target mRNAs. Nature 433: 769-773.
Linhart C, Halperin Y, and Shamir R. 2008. Transcription factor and microRNA motif discovery: the Amadeus platform and a compendium of metazoan target sets. Genome Res. 18: 1180-1189.
Linsley P S, Schelter J, Burchard J, Kibukawa M, Martin M M, Bartz S R, Johnson J M, Cummins J M, Raymond C K, Dai H, et al. 2007. Transcripts targeted by the microRNA-16 family cooperatively regulate cell cycle progression. Mol. Cell. Biol. 27: 2240-2252.
Liu T, Papagiannakopoulos T, Puskar K, Qi S, Santiago F, Clay W, Lao K, Lee Y, Nelson S F, Kornblum H I, et al. 2007. Detection of a microRNA signal in an in vivo expression set of mRNAs. PLoS ONE 2: e804.
Malzkorn B, Wolter M, Liesenberg F, Grzendowski M, Stühler K, Meyer H E, and Reifenberger G. 2010. Identification and functional characterization of microRNAs involved in the malignant progression of gliomas. Brain Pathol. 20: 539-550.
Muniyappa M K, Dowling P, Henry M, Meleady P, Doolan P, Gammell P, Clynes M, and Barron N. 2009. MiRNA-29a regulates the expression of numerous proteins and reduces the invasiveness and proliferation of human carcinoma cell lines. Eur. J. Cancer 45: 3104-3118.
Nana-Sinkam S P, and Croce C M. 2011. MicroRNAs as therapeutic targets in cancer. Transl Res 157: 216-225.
Ozen M, Creighton C J, Ozdemir M, and Ittmann M. 2008. Widespread deregulation of microRNA expression in human prostate cancer. Oncogene 27: 1788-1793.
Pavesi G, Mereghetti P, Zambelli F, Stefani M, Mauri G, and Pesole G. 2006. MoD Tools: regulatory motif discovery in nucleotide sequences from co-regulated or homologous genes. Nucleic Acids Res. 34: W566-570.
Plaisier C L, Bare J C, and Baliga N S. 2011. miRvestigator: web application to identify miRNAs responsible for co-regulated gene expression patterns discovered through transcriptome profiling. Nucleic Acids Res. 39: W125-131.
Reiss D J, Baliga N S, and Bonneau R. 2006. Integrated biclustering of heterogeneous genomewide datasets for the inference of global regulatory networks. BMC Bioinformatics 7: 280.
Ritchie W, Rajasekhar M, Flamant S, and Rasko J E J. 2009. Conserved expression patterns predict microRNA targets. PLoS Comput. Biol. 5: e1000513.
Robin X, Turck N, Hainard A, Tiberti N, Lisacek F, Sanchez J-C, and Muller M. 2011. pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics 12: 77.
Rothschild S I, Tschan M P, Federzoni E A, Jaggi R, Fey M F, Gugger M, and Gautschi O. 2012. MicroRNA-29b is involved in the Src-ID1 signaling pathway and is dysregulated in human lung adenocarcinoma. Oncogene. http://www.ncbi.nlm.nih.gov/pubmed/22249264 (Accessed Apr. 12, 2012).
Ruan K, Fang X, and Ouyang G. 2009. MicroRNAs: novel regulators in the hallmarks of human cancer. Cancer Lett. 285: 116-126.
Selbach M, Schwanhäusser B, Thierfelder N, Fang Z, Khanin R, and Rajewsky N. 2008. Widespread changes in protein synthesis induced by microRNAs. Nature 455: 58-63.
Sengupta S, den Boon J A, Chen I-H, Newton M A, Stanhope S A, Cheng Y-J, Chen C-J, Hildesheim A, Sugden B, and Ahlquist P. 2008. MicroRNA 29c is down-regulated in nasopharyngeal carcinomas, up-regulating mRNAs encoding extracellular matrix proteins. Proc. Natl. Acad. Sci. U.S.A. 105: 5874-5878.
Sethupathy P, Megraw M, and Hatzigeorgiou A G. 2006. A guide through present computational approaches for the identification of mammalian microRNA targets. Nat. Methods 3: 881-886.
Sing T, Sander O, Beerenwinkel N, and Lengauer T. 2005. ROCR: visualizing classifier performance in R. Bioinformatics 21: 3940-3941.
Tan L P, Seinen E, Duns G, de Jong D, Sibon O C M, Poppema S, Kroesen B-J, Kok K, and van den Berg A. 2009. A high throughput experimental approach to identify miRNA targets in human cells. Nucleic Acids Res. 37: e137.
Tsai W-C, Hsu P W-C, Lai T-C, Chau G-Y, Lin C-W, Chen C-M, Lin C-D, Liao Y-L, Wang J-L, Chau Y-P, et al. 2009. MicroRNA-122, a tumor suppressor microRNA that regulates intrahepatic metastasis of hepatocellular carcinoma. Hepatology 49: 1571-1582.
Vaira V, Faversani A, Dohi T, Montorsi M, Augello C, Gatti S, Coggi G, Alfieri D C, and Bosari S. 2011. miR-296 regulation of a cell polarity-cell plasticity module controls tumor progression. Oncogene. http://www.ncbi.nlm.nih.gov/pubmed/21613016 (Accessed Oct. 8, 2011).
Valastyan S, Reinhardt F, Benaich N, Calogrias D, Szasz A M, Wang Z C, Brock J E, Richardson A L, and Weinberg R A. 2009. A pleiotropically acting microRNA, miR-31, inhibits breast cancer metastasis. Cell 137: 1032-1046.
Wang L, Oberg A L, Asmann Y W, Sicotte H, McDonnell S K, Riska S M, Liu W, Steer C J, Subramanian S, Cunningham J M, et al. 2009. Genome-wide transcriptional profiling reveals microRNA-correlated genes and biological processes in human lymphoblastoid cell lines. PLoS ONE 4: e5878.
Wang W-X, Wilfred B R, Hu Y, Stromberg A J, and Nelson P T. 2010. Anti-Argonaute RIP-Chip shows that miRNA transfections alter global patterns of mRNA recruitment to microribonucleoprotein complexes. RNA 16: 394-404.
Weber F, Teresi R E, Broelsch C E, Frilling A, and Eng C. 2006. A limited set of human MicroRNA is deregulated in follicular thyroid carcinoma. J. Clin. Endocrinol. Metab. 91: 3584-3591.
Welsh J B, Zarrinkar P P, Sapinoso L M, Kern S G, Behling C A, Monk B J, Lockhart D J, Burger R A, and Hampton G M. 2001. Analysis of gene expression profiles in normal and neoplastic ovarian tissue samples identifies candidate molecular markers of epithelial ovarian cancer. Proc. Natl. Acad. Sci. U.S.A. 98: 1176-1181.
Yanaihara N, Caplen N, Bowman E, Seike M, Kumamoto K, Yi M, Stephens R M, Okamoto A, Yokota J, Tanaka T, et al. 2006. Unique microRNA molecular profiles in lung cancer diagnosis and prognosis. Cancer Cell 9: 189-198.
Zen K, and Zhang C-Y. 2010. Circulating MicroRNAs: a novel class of biomarkers to diagnose and monitor human cancers. Med Res Rev http://www.ncbi.nlm.nih.gov/pubmed/21064190 (Accessed Oct. 8, 2011).
Guo H, Ingolia N T, Weissman J S, and Bartel D P. 2010. Mammalian microRNAs predominantly act to decrease target mRNA levels. Nature 466: 835-840.
Hendrickson D G, Hogan D J, McCullough H L, Myers J W, Herschlag D, Ferrell J E, and Brown P O. 2009. Concordant regulation of translation and mRNA abundance for hundreds of targets of a human microRNA. PLoS Biol. 7: e1000238.
Kertesz M, lovino N, Unnerstall U, Gaul U, and Segal E. 2007. The role of site accessibility in microRNA target recognition. Nat. Genet. 39: 1278-1284.
Linhart C, Halperin Y, and Shamir R. 2008. Transcription factor and microRNA motif discovery: the Amadeus platform and a compendium of metazoan target sets. Genome Res. 18: 1180-1189.
Pavesi G, Mereghetti P, Zambelli F, Stefani M, Mauri G, and Pesole G. 2006. MoD Tools: regulatory motif discovery in nucleotide sequences from co-regulated or homologous genes. Nucleic Acids Res. 34: W566-570.
Selbach M, Schwanhäusser B, Thierfelder N, Fang Z, Khanin R, and Rajewsky N. 2008. Widespread changes in protein synthesis induced by microRNAs. Nature 455: 58-63.
This invention was made with U.S. Government support under NIH (P50GM076547 and 1R01GM077398-01A2), DoE (DE-FG02-04ER64685) and NSF (DBI-0640950). The U.S. Government has certain rights in the invention.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US14/44385 | 6/26/2014 | WO | 00 |
Number | Date | Country | |
---|---|---|---|
61840255 | Jun 2013 | US | |
61888346 | Oct 2013 | US |