The invention relates to assessing the character and level of RNA molecules in human tissues and bodily fluids, especially plasma. In particular, it relates to the nature and level of a multitude of both endogenous and exogenous RNA in these samples, including determining microbiome composition and function for a test subject.
Many biological insights have emerged from the analysis of DNA and RNA sequences. Important discoveries, such as various pathology-causing variants in the human genome and the history of human migration, were made possible by the availability of sequencing technology. Normal human physiology is the result of a well-orchestrated balance between genetic (intrinsic) and environmental (extrinsic) factors, and the availability of the complete human genome sequence facilitates the study of complex human-environmental interactions. Recently this has included the human-microbiome interaction, especially the gut microbiome. These microbes interact intimately with gut epithelium and the alteration in the spectrum of the gut microbiome has been linked to various physiopathological conditions, such as diarrhea, diabetes, obesity, inflammatory pathologies and cancer as well as to the general state of health.
The recent development of highly parallelized next generation (NextGen) sequencing technologies has further advanced the use of sequencing as a tool for studying complex biological systems by genome sequencing and transcriptome analysis. One advantage of using a sequence-based approach for transcriptome analysis is the ability to identify novel transcripts, such as alternative usage of exons or polyadenylation sites of known transcripts. The recent explosion of information on microRNA (miRNA) and other noncoding RNAs (ncRNAs) is the result in part of applying these new technologies. To date more than 1000 different human miRNA species have been identified (see miRBase, see the website for mirbase.org). Recently, a significant number of these RNA molecules have been observed in the extracellular environment and have been implicated as important mediators in cell-cell communication.
The present invention relates to the application of RNA identification techniques such as parallel rapid sequencing and microarray mass spectrometry techniques to identify and quantify the RNA molecules circulating in blood, residing in tissues, or present in other bodily fluids. It has been found that not all of the circulating RNA molecules are endogenous to human or other animal subjects, and many are characteristic of exogenous substances or organisms, such as bacteria, archaea, fungi, or substances that have been consumed such as food or infectious organisms. These exogenous RNAs have also been observed in tissues. A variety of applications is disclosed as part of the invention.
Thus, in one aspect, the invention is directed to a method to assess the physiological state of a test subject, which method comprises obtaining a test spectrum of the identity and level of RNA molecules present in a sample of a tissue or biological fluid from said test subject; and comparing said spectrum with a control spectrum comparably obtained from one or more normal, control subjects; whereby a significant difference between the test spectrum and said control spectrum indicates a physiological condition in said test subject that is other than normal.
In another aspect, the invention is directed to a method to determine microbiome composition and function of a test subject, which method comprises obtaining a test spectrum of the identity and level of RNA molecules present in a sample of a tissue or biological fluid from said test subject; and associating the identity and/or level of RNA molecules in said spectrum with individual microorganisms; whereby the microbiome of said subject is determined.
In still another aspect the invention is directed to a method to assess the effect of a treatment or protocol that has been administered to a test subject, which method comprises obtaining a test spectrum of the identity and level of RNA molecules present in a sample of a tissue or biological fluid from said test subject; and comparing said spectrum with a control spectrum comparably obtained from one or more subjects that have not been administered said treatment or protocol or from said subject prior to administration of said treatment or protocol; whereby a significant difference between the test spectrum and said control spectrum indicates the effect of said treatment or protocol on said test subject.
In still another aspect the invention is directed to a method to determine whether a test subject has been subjected to a treatment or protocol or is afflicted with a disease or condition, which method comprises obtaining a test spectrum of the identity and level of RNA molecules present in a sample of a tissue or biological fluid from said test subject; and comparing said spectrum with a control spectrum comparably obtained from one or more control subjects that have been administered said treatment or protocol or are known to be afflicted with said disease or condition; whereby a significant similarity between the test spectrum and said control spectrum indicates the subject has been administered said treatment or protocol or is afflicted with said disease or condition.
In still another aspect the invention is directed to a method to determine whether a subject has ingested one or more substances, which method comprises obtaining a test spectrum of the identity and level of RNA molecules present in a sample of a tissue or biological fluid from said subject; and comparing said test spectrum with a control spectrum comparably obtained from one or more subjects that have ingested said one or more substances, whereby a significant similarity between the test spectrum and said control spectrum indicates the subject has ingested said one or more substances.
In still another aspect the invention is directed to a method to determine whether a subject has ingested one or more substances which method comprises obtaining a test spectrum of the identity and level of RNA molecules present in a sample of a tissue or biological fluid from said subject; and associating the identity and/or level of RNA molecules in said spectrum with said one or more substances; whereby assessing the presence and/or level of one or more RNA molecules as characteristic of said one or more substances determines whether said ingestion has occurred. This general principle can be expanded to correlate dietary patterns with patterns found in the microbiome. Thus, combinatorial techniques can be used to correlate differences in dietary patterns with regard to single types of nutrients or multiplicities of types of nutrients with changes in the microbiome. This may guide practitioners in prescribing appropriate dietary changes for subjects.
In still another aspect the invention is directed to a method to identify a biological pathway that is affected in a subject afflicted with an abnormal condition, which method comprises identifying at least one RNA molecule in the RNA spectrum of a sample of tissue or biological fluid of said subject, the presence or level of which is different from that in a control spectrum comparably obtained from control subjects; testing the effect of said RNA molecule on the transcriptome of cells of the same species as the test subject; identifying at least one element of said transcriptome that is affected; and associating said element with a biological pathway.
In addition to the methods of the invention, the information useful in conducting the methods can be tabulated and stored on computer-readable media. Thus, the invention further includes a database contained on a computer-readable medium which comprises a record of the identity and levels of RNA contained in an RNA spectrum associated with at least one of: 1) tissue or biological fluid of normal subjects; 2) tissue or biological fluid of subjects affected by known conditions; 3) tissue or biological fluid of subjects administered known treatments; 4) tissue or biological fluid of subjects known to have ingested specified substances.
The methods of the invention may be performed on human subjects or on any vertebrate subject, including laboratory animals as well as livestock, companion animals, horses, and the like.
The present invention takes advantage of the availability of RNA identification techniques such as high throughput parallel sequencing techniques, such as the commercially available NextGen techniques as well as microarray/mass spectrometry techniques to explore the implications of the spectrum of RNA molecules found in bodily fluids and tissues. Although the examples herein focus on plasma, RNA profiles may also be obtained from other biological fluids such as saliva, semen, lymph, urine and in tissues themselves either as secretions or extracts. Depending on interest, the subjects may be laboratory models such as rabbits, mice, rats, guinea pigs, etc., or other animals such as livestock, birds, fish, as well as animals in general such as companion animals, racehorses and marsupials. A number of applications of such spectra are part of the present invention.
By “RNA spectrum” of a biological fluid or tissue we mean the identity and quantity or concentration of a multiplicity of RNA sequences or molecules present in the tissue or biological fluid. As shown in the examples below, tissues or fluids may contain not only RNA representing the transcriptome and miRNAs, but may also contain exogenous sequences characteristic of microorganisms, i.e., the microbiome represented in the fluid or tissue by its specific RNA spectral signature. Other exogenous RNAs may result from ingested materials such as plant materials or animals ingested as food as well as microbial contaminants of these ingested materials or other substances. Thus, the information obtained by determining the RNA spectrum may have forensic value to determine whether ingestion of materials having informative RNA patterns has occurred. Typically, the RNA sequences or molecules are 10-40 nucleotides in length, or may be 15-35 nucleotides in length or may be 20-25 nucleotides in length. All integer values between the designated ranges are included—thus, sequences or molecules of 10-35 nucleotides in length also include those 14-30 nucleotides in length, or 16-29 nucleotides in length, etc.
The identification of these RNA molecules or sequences is performed by matching them to publicly available or other databases that contain sequence information regarding the microRNA (miRNA), genetic sequences, or transcriptomes of the organism from which the sampled tissue or biological fluid is derived. The matching can be conducted using a number of strategies, for example, allowing no mismatches, or one mismatch or two mismatches to account for allelic variations, etc. Typically, microRNA sequences or molecules in the RNA spectrum are not permitted any mismatches because of the high sequence similarity among miRNAs, but RNA sequences or molecules that otherwise match the transcriptome or the genomic sequences of the organism may be allowed greater flexibility. This permits molecules or sequences in the spectrum that cannot be matched endogenously to be compared more efficiently to other databases that represent the genomes, transcriptomes, or microRNA of microorganisms or substances such as food substances that might be present in a microbiome or other exogenous sequences in the organism tested.
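The matching strategies described above can be illustrated with a minimal sketch. It is not the alignment software used in the examples; it is a brute-force substitution-only search, and the database contents, the names `find_match`, `mirna_db` and `transcript_db`, and the toy sequences are all hypothetical.

```python
def count_mismatches(read, window):
    """Number of positions at which two equal-length sequences differ."""
    return sum(1 for a, b in zip(read, window) if a != b)


def find_match(read, references, max_mismatches):
    """Return the name of the first reference that contains the read with
    at most max_mismatches substitutions, or None if there is no such hit."""
    n = len(read)
    for name, seq in references.items():
        for start in range(len(seq) - n + 1):
            if count_mismatches(read, seq[start:start + n]) <= max_mismatches:
                return name
    return None


# Hypothetical toy databases (made-up sequences, not real database entries).
mirna_db = {"toy-miRNA": "ACGGTCAATGCCTTAGGACTGA"}
transcript_db = {"toy-transcript": "ATGGCCAAGCTGACCGGATCCTTAGCAAGGCTGA"}

# This read differs from the toy transcript by a single substitution.
read = "AAGCTGACCGGTTCCTTAGC"

strict_hit = find_match(read, mirna_db, 0)         # miRNA: no mismatches allowed
exact_hit = find_match(read, transcript_db, 0)     # transcript, no mismatches
relaxed_hit = find_match(read, transcript_db, 1)   # transcript, one mismatch
```

Under these assumptions the read is found in the toy transcript only when one mismatch is tolerated, which is the kind of flexibility the text reserves for transcriptome and genomic matching but not for miRNA.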
The number of RNA molecules composing a determined RNA spectrum is arbitrary, but typically the spectrum will comprise more than one such RNA molecule. However, determination of the nature and quantity even of a single RNA is informative under some circumstances—e.g., an RNA specifically characteristic of anthrax would demonstrate ingestion of this microorganism. Typically, however, a multiplicity of RNA molecules is identified and optionally quantitated to obtain a specific “RNA spectrum” of a fluid or tissue derived from a subject. Thus, the number of RNA molecules to be characterized and optionally quantitated may be as few as two or as many as several hundred. All integer numbers between 2 and 100 are also included as if specifically set forth herein. Thus, the spectrum may contain, for example, 3, 5, 20, 50 or 100 such molecules; again, it is to be emphasized that any and all specific integers between these boundaries are to be considered specifically set forth herein.
The “microbiome” of a sample of tissue or fluid is an RNA spectrum that represents RNA associated with microorganisms and viruses. Microorganisms include fungi, bacteria, archaea and protozoa, and any single-celled or non-cellular microbe.
The sample size for determination may be quite small and is arbitrary and suited to the specific method for determination of the spectrum.
Many of the applications of the invention involve comparisons between test and control spectra. These spectra are “significantly similar” if statistical tests indicate that they vary overall by <10%, preferably <5% and preferably <1%. Conversely, they are “substantially different” if they differ by at least 1% overall, preferably 5% overall and more preferably 10% or more overall. In many cases, it is not necessary to apply statistics; a graphic display of a manageable number of RNA molecules in each spectrum may be sufficient for simple observation to determine whether the spectra are similar or different. Many algorithms are also available to determine statistical similarities and differences and any such algorithms may be applied to make this determination.
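As one illustration of such a comparison, the sketch below computes an overall percent difference between two spectra. The text does not define how overall variation is measured, so the metric here (half the summed absolute difference between normalized profiles, often called total variation distance) is an assumption, as are the function name and the toy levels.

```python
def overall_difference_percent(test, control):
    """Percent difference between two RNA spectra given as {rna_id: level}.

    Assumed metric: total variation distance between the two profiles after
    each is normalized to sum to 1 (0 = identical, 100 = no overlap)."""
    test_total = sum(test.values())
    control_total = sum(control.values())
    if test_total == 0 or control_total == 0:
        raise ValueError("each spectrum needs at least one nonzero level")
    ids = set(test) | set(control)
    summed = sum(
        abs(test.get(i, 0) / test_total - control.get(i, 0) / control_total)
        for i in ids
    )
    return 50.0 * summed


# Hypothetical toy spectra (arbitrary units).
control = {"rna_a": 50, "rna_b": 30, "rna_c": 20}
same_profile = {"rna_a": 500, "rna_b": 300, "rna_c": 200}
shifted_profile = {"rna_a": 20, "rna_b": 30, "rna_c": 50}

d_same = overall_difference_percent(same_profile, control)
d_shifted = overall_difference_percent(shifted_profile, control)
```

The resulting percentage can then be compared against whichever of the thresholds stated above is chosen; any other statistical measure of similarity could be substituted.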
As noted above, the substances that may contribute to the RNA spectrum are ingested substances, and “ingestion” includes not only oral uptake, but any means of providing the substance to the subject, including injection, transmucosal delivery, transdermal delivery, and any mechanism that succeeds in providing the substance to the subject. Thus, the substance may be supplied, for example, to a tumor by direct administration to the tumor such as by injection, and may be provided in a multiplicity of forms. The examples below illustrate the effect of oral ingestion of foodstuffs, but the presence of insect RNA in plasma indicates that inhalation may also be a route of administration effective in delivering exogenous RNA. Any material capable of generating, or having associated with it, RNA is included within a “substance” to be ingested. “Substance” is not limited to single molecules but includes mixtures, composites, organisms, materials in general, including those containing contaminants.
By associating the identities of RNA molecules in the spectrum with their sources is meant that, by virtue of the nature of the sequence of the RNA, it can be determined to have originated in a particular source. Thus, if the RNA is characteristic of a particular substance or organism or microbe, its presence and/or quantity is informative as to the exposure of the subject to the substance or organism. Some, indeed many, RNA molecules are not uniquely characteristic of a particular source exogenous to the subject, but the level present in the fluid or tissue may indicate that the RNA present endogenously has been supplemented. Further, the substance itself may not contain or generate RNA but may stimulate alterations in the patterns of RNA of the subject. Thus, toxins, pharmaceuticals, and other inorganic or organic small molecules or non-living molecules in general, by virtue of their perturbation of the metabolism and physiology of the subject, will alter the RNA spectrum. This expands the applications for forensic purposes. For example, the presence or absence of a pattern characteristic of arsenic poisoning or ricin poisoning will indicate whether such poisoning has occurred.
The inventors have also found that the nature of the RNA spectrum is useful to determine metabolic and other physiological pathways that are associated with particular diseases or conditions. Thus, the nexus between the impact of particular RNA molecules on known pathways can be determined by measuring the effects of such RNAs on cells of the same species as the subject. For example, if the subject shows elevated levels of an RNA in plasma that is associated with enhancing a pathway associated with oncogenesis, the presence and amount of this RNA in the spectrum may indicate the relevance of this pathway to tumor progression, thus providing a target for treatment.
In still another embodiment, the invention takes advantage of the discovery by applicants that RNA molecules are protected in plasma and the circulatory system in general by association with protein and/or lipid complexes. By disrupting these complexes, such as by treatment with proteases and/or lipases, the RNA can be freed to be used more conveniently for diagnostic purposes or as a target for therapeutics if desired. Thus, for example, if a particular miRNA is believed to cause deleterious effects, treatment with the liberating enzymes may precede exposure of that RNA to the activity of, for example, RNase. Similarly, the activity of a desirable RNA may be enhanced by liberating it from its protective shields.
The following examples are provided to illustrate but not to limit the invention.
In these examples, plasma or other fluid was analyzed for the various RNA molecules present. Their levels or concentrations in the fluid were also determined.
When plasma was used as a test substrate from human subjects, samples were obtained from Proteogenex (Culver City, Calif.). All samples were collected from donors with proper approvals from institutional review boards. The plasma was prepared from EDTA blood by centrifugation at 1000×g for 15 minutes to separate the plasma and blood cells.
For plasma samples generally, or for cases wherein the sample was a finger-prick of whole blood, total RNA was extracted from 100 μl of the sample using the miRNeasy® kit (Qiagen, Valencia, Calif.). The quality and quantity of the RNA were evaluated with Agilent 2100 Bioanalyzer (Santa Clara, Calif.) and NanoDrop 1000 spectrophotometer (Thermo Scientific, Wilmington, Del.). Generally, we obtained about 100 ng of RNA per ml of sample. As a control we also obtained total RNA from Ambion (Life Technologies, Carlsbad, Calif.).
Libraries of RNA to be sequenced were prepared with small RNA sample preparation kits from Illumina (Illumina, San Diego, Calif.). The 3′ and 5′ adapters, and the reverse transcription primer were diluted in nuclease-free water to the concentration specified by Illumina. RNA isolated from 200 μl of plasma was concentrated and mixed with the diluted 3′ adapter in a final volume of 6 μl of nuclease-free water. To eliminate secondary structures, the tube was incubated at 70° C. for 2 minutes, then immediately cooled on ice. The ligation reaction was set up by adding 1 μl of 10× T4 RNL2 reaction buffer, 0.8 μl of 100 mM MgCl2, 1.5 μl of T4 RNA ligase 2, and 0.5 μl of RNaseOUT™ RNase inhibitor (Life Technologies, Carlsbad, Calif.) and then incubated at 22° C. for 1 hour. After ligating the 3′ adapter, 1 μl of the 5′ adapter, 1 μl of 10 mM ATP, and 1 μl of T4 RNA ligase were added, then incubated at 20° C. for 4 hours.
For cDNA synthesis, 4 μl of RNA ligated with both 5′ and 3′ adapters was mixed with 1 μl of diluted reverse transcription primer and incubated at 70° C. for 2 minutes, then cooled on ice. Two μl of 5× first-strand synthesis buffer, 0.5 μl of 12.5 mM dNTP mix, 1 μl of 100 mM DTT, and 0.5 μl of RNaseOUT™ were added to the annealed primer-template mixture. The sample was then heated at 48° C. for 3 minutes. One μl of SuperScript II Reverse Transcriptase was added to the sample and incubated at 44° C. for 1 hour. The first-strand cDNA was then amplified with GX1 and GX2 primers under the following conditions: 98° C. for 30 seconds, followed by 20 cycles of 10 seconds at 98° C., 30 seconds at 60° C., and 15 seconds at 72° C., then holding for 10 minutes at 72° C., then holding at 4° C.
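As a consistency check on the first-strand reaction described above, the following sketch adds up the listed volumes and derives the final concentration of each component whose stock concentration is stated. It assumes that every listed component, including the reverse transcriptase, contributes to the final volume and that no volume is lost on heating; the variable names are illustrative only.

```python
# First-strand cDNA reaction as listed in the text:
# name -> (volume in microliters, stock concentration or None if not stated).
components = {
    "adapter-ligated RNA": (4.0, None),
    "reverse transcription primer": (1.0, None),
    "first-strand buffer (x)": (2.0, 5.0),
    "dNTP mix (mM)": (0.5, 12.5),
    "DTT (mM)": (1.0, 100.0),
    "RNase inhibitor": (0.5, None),
    "reverse transcriptase": (1.0, None),
}

# Assumed final volume: the simple sum of all listed volumes.
total_volume = sum(volume for volume, _ in components.values())

# Final concentration = stock concentration x volume / total volume.
final_concentration = {
    name: stock * volume / total_volume
    for name, (volume, stock) in components.items()
    if stock is not None
}

for name, value in final_concentration.items():
    print(f"{name}: {value:g} in {total_volume:g} ul")
```

Under these assumptions the volumes sum to 10 μl, so the 5× buffer is diluted to 1×, the dNTP mix to 0.625 mM and the DTT to 10 mM.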
Since the amount of RNA in the sample is low, we did not use the small RNA-enriched fraction for sequencing library preparation; rather, we selected and purified through a 6% Novex® TBE PAGE gel (Life Technologies, Carlsbad, Calif.) a larger library insert size, covering 20 to 100 nucleotides in length. We thus expected to get a lower percentage of sequence reads for miRNA, but would gain the ability to see the general spectrum of RNA in samples, including other ncRNAs, such as bacterial small RNAs (50-500 nt), and degraded messenger RNAs (mRNA).
The quality and quantity of the library was assessed by using the Agilent 2100 Bioanalyzer with the DNA 1000 chip. The prepared library was then run on Illumina Genome Analyzer IIx at the genomic facility at the Institute for Systems Biology.
Based on results from 9 individual human subjects, over 20 million sequence reads per sample were obtained with 35-cycle runs on the Illumina Genome Analyzer IIx. After trimming the adapter sequences and removing low quality sequences, adapter-only sequences, and sequences containing only polyA, we generally had 2 to 4 million “processed” reads with an average length of 23 nucleotides. These data are shown in Table 1.
(a) Based on Mayo Scoring System for Assessment of Ulcerative Colitis Activity
As noted, the total number of reads is greatly diminished by processing as described above which eliminates artifacts due to polyA, adapters, etc.
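The read processing described above (adapter trimming, then removal of low quality, adapter-only and polyA-only sequences) can be sketched as follows. The adapter sequence, the quality and length cutoffs, and the function name are hypothetical placeholders; the text does not state the actual values used.

```python
ADAPTER = "ACGTTGCAACGT"   # hypothetical placeholder, not the kit's adapter


def process_read(seq, quals, adapter=ADAPTER, min_quality=20, min_length=10):
    """Return the trimmed read, or None if the read should be discarded.

    min_quality and min_length are assumed cutoffs for illustration only."""
    # Trim the 3' adapter and everything after it (exact match only).
    idx = seq.find(adapter)
    if idx != -1:
        seq, quals = seq[:idx], quals[:idx]
    # Adapter-only reads leave nothing (or too little) after trimming.
    if len(seq) < min_length:
        return None
    # Reads consisting only of A are treated as polyA artifacts.
    if set(seq) == {"A"}:
        return None
    # Discard reads whose mean base quality is below the cutoff.
    if sum(quals) / len(quals) < min_quality:
        return None
    return seq


# Toy 35-nucleotide reads: a 23-nt insert followed by the 12-nt adapter.
insert = "ACGGTCAATGCCTTAGGACTGAT"
kept = process_read(insert + ADAPTER, [30] * 35)
adapter_only = process_read(ADAPTER + "A" * 23, [30] * 35)
poly_a = process_read("A" * 35, [30] * 35)
low_quality = process_read(insert + ADAPTER, [5] * 35)
```

Only the first toy read survives, trimmed back to its 23-nucleotide insert; the other three are discarded for the reasons named in the text.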
A NextGen sequence read simulator, ART, available at bioinformatics.joyhz.com/ART/, was used to generate artificial transcriptome data from human, mouse, bovine and yeast. Transcript sequences from ENSEMBL and miRNA sequences from miRBase were combined and used as reference sequences. The Illumina read error profile was selected in the program to generate artificial reads of either 23 or 35 nucleotides in length from the reference sequences. With a 2 mismatch allowance, over 98% of the sequences from our simulated dataset could be mapped to the corresponding transcriptome (Table 3). This provided some assurance that our protocol can map most (~98%) of the NextGen sequencing data under a 2 mismatch allowance.
The nature of the RNA could thus be ascertained. The results for the 9 subjects shown in Table 1 are shown in Table 2 and the results for other species as well as human are shown in Table 3.
a Due to high sequence similarity for various miRNA species, we did not allow any sequence mismatch in miRNA alignment.
b Numbers in parentheses represent the number of samples in each group.
The processed sequences were first screened against endogenous (human) sequence databases, including known human miRNA and human transcripts, followed by human genomic sequence. To get complementary and efficient mapping results, the alignment tool BLAST was used to search miRNA, and Bowtie was used to search other large databases. For the endogenous sequence mapping, except miRNA, we applied three different levels of error tolerance: 0 mismatches (termed Strategy 0), 1 mismatch (termed Strategy 1) and 2 mismatches (termed Strategy 2). The remaining unmapped sequences were then compared to sequences from the known human microbiome, miRNA sequences from other species, and the non-redundant nucleic acid sequence collection from NCBI. Due to the high sequence similarity for miRNA, we did not allow any sequence mismatch for either endogenous or exogenous miRNA mappings. We also did not allow any sequence mismatches for exogenous sequence mapping. Species classification was based on the NCBI Taxonomy database at ncbi.nlm.nih.gov/taxonomy.
As shown in Table 2 for the 9 human subjects, a large portion of the RNA could not be matched to the database, although this percentage diminished as less rigorous requirements for matching were employed, as in the strategy allowing for two mismatches.
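The sequential screening described above, in which reads mapped at one step are set aside and only the remainder is passed to the next database, can be sketched as below. The brute-force search merely stands in for BLAST and Bowtie, and the tier contents, labels and toy reads are hypothetical.

```python
def has_hit(read, reference, max_mismatches):
    """True if the read occurs in the reference with at most
    max_mismatches substitutions (no insertions or deletions)."""
    n = len(read)
    return any(
        sum(a != b for a, b in zip(read, reference[i:i + n])) <= max_mismatches
        for i in range(len(reference) - n + 1)
    )


def map_and_remove(reads, tiers):
    """Assign each read to the first tier in which it maps.

    tiers is an ordered list of (label, reference sequences, max_mismatches).
    Reads that map at one tier are removed before the next tier is searched."""
    assigned = {label: [] for label, _, _ in tiers}
    remaining = list(reads)
    for label, references, max_mismatches in tiers:
        unmapped = []
        for read in remaining:
            if any(has_hit(read, ref, max_mismatches) for ref in references):
                assigned[label].append(read)
            else:
                unmapped.append(read)
        remaining = unmapped
    assigned["unmapped"] = remaining
    return assigned


# Hypothetical toy tiers following the order in the text (Strategy 2 shown:
# two mismatches for endogenous transcripts, none for miRNA or exogenous).
tiers = [
    ("human miRNA", ["ACGGTCAATGCCTTAGGACTGA"], 0),
    ("human transcripts", ["ATGGCCAAGCTGACCGGATCCTTAGCAAGGCTGA"], 2),
    ("exogenous", ["GGGTTTCCCAAAGGGTTTCCCAAA"], 0),
]
reads = [
    "ACGGTCAATGCCTTAGGACT",   # exact match to the toy miRNA
    "AAGCTGACCGGTTCCTTAGC",   # one substitution from the toy transcript
    "TTTCCCAAAGGGTTTCCC",     # exact match to the toy exogenous sequence
    "CCCCCCCCCCCCCCCCCCCC",   # maps nowhere
]
result = map_and_remove(reads, tiers)
```

Because each read is consumed by the first tier it matches, a sequence that resembles both an endogenous and an exogenous reference is counted as endogenous, which keeps the exogenous tally conservative.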
On first examination, we noticed that less than 1.5% of the processed reads actually mapped to human miRNAs. About 11% of the remaining reads mapped to human transcripts and human genome sequence when no sequence mismatch was allowed (Table 2). With a higher tolerance of sequence mismatches, the fraction of reads that can be mapped rose to about 42% to known human transcripts and 15% to other human genomic sequences (under two mismatch allowance). However, this still leaves over 40% of the processed reads with an unknown origin.
In order to identify the origin of those unmapped sequences in our sequencing results and to ensure that there was no error introduced in preparing the sequencing library that could account for the unknowns, we conducted a systematic search against various sequence databases. We used a “map and remove” approach to analyze the sequence as shown in
a Numbers in parentheses represent number of samples in each group.
b To increase the sequence mapping accuracy, we did not allow any sequence mismatch except in the endogenous sequence search step.
To eliminate the possibility of bacterial and fungal contamination during plasma preparation and handling, we generated sequencing libraries from other types of samples including human tissue (commercially obtained normal lung RNA), bovine milk (commercial whole milk), and mouse plasma (C57BL/6J), and proceeded through the same analysis scheme. Sequences from bacteria, fungi and other species can also be seen in these samples (
The overall percentages of exogenous sequences for mouse plasma were lower than those for human plasma samples. The human lung tissue had a very small fraction of exogenous sequences: less than 1% of the processed sequences under Strategies 1 and 2 were from exogenous sources. The commercially obtained milk contained a significant fraction of sequences mapped to bacteria.
To ensure that the exogenous sequences we observed were not derived from any contaminated instruments or reagents, we analyzed two public domain NextGen sequencing data sets: SRR332232, serum small RNA sequencing results from a normal Chinese individual, and SRR014350, yeast transcriptome data from a yeast culture. The yeast culture should not have any exogenous sequences since it was grown in a sterile, defined culture medium. The yeast dataset yielded less than 0.15% of the reads mapped to sequences other than yeast (
a Numbers in parentheses are the accession numbers.
As noted in Example 2, allowing 2 mismatches identifies 98% of the endogenous sequences in humans. The exogenous sequence mapping results from Strategy 2 (2 mismatches allowed for endogenous sequence mapping steps and no mismatch allowed in exogenous sequence mapping) were used for further analysis.
We observed reads from human plasma covering all major bacterial phyla and two archaeal phyla [Euryarchaeota (which includes methanogens typically found in intestines) and Crenarchaeota] as shown in
As shown in Table 7, significant differences were observed in the sequence distribution patterns among plasma samples from normal subjects and patients with either colorectal cancer or ulcerative colitis. Firmicutes, typically one of the two most abundant bacterial phyla in the human gut microbiome, is the third most abundant sequence population in plasma.
A significant number of the reads mapped to bacteria are from various ribosomal RNAs and tRNAs. High sequence similarity of these sequences among different microbial species can easily lead to misassignment of sequence reads. Thus, to increase the reliability of mapping results, we removed reads that mapped to bacterial rRNAs and tRNAs and reanalyzed. Removing rRNA and tRNA sequences affected our ability to detect species from Chloroflexi, Deferribacteres, Fibrobacteres and some other phyla (
The bacterium that accounts for the highest number of reads is an uncultured bacterium. This is followed by Pseudomonas fluorescens, an important beneficial bacterium in agricultural settings (
Fungi represent the largest source of exogenous RNA, about 14% of the processed reads under Strategy 2 in our human plasma samples as shown in Example 2 (Table 4). As with bacteria, the species mapped covered all major fungal phyla, and Ascomycota is the most abundant phylum whether or not rRNA and tRNA reads are included (
Metarhizium anisopliae, a common fungus in soil, had the most mapped reads, and Thielavia terrestris, a thermophilic fungus, became the species with the most abundant reads after removing tRNA and rRNA sequences (
We recently developed a qPCR-based protocol to measure the level of RNA molecules directly from a small amount of plasma without RNA isolation (Wang, et al., in preparation). Using this approach we were able to detect Pseudomonas putida (bacterium) 16S RNA and Ceratocystiopsis minuta (fungus) 18S RNA along with the human 28S rRNA from freshly obtained plasma from finger-prick blood samples (
We also compared the data in
After removing the reads that mapped to rRNAs and tRNAs to increase the accuracy of mapping results, we found reads that mapped to food items. We did not analyze sequences mapped to metazoan species since the risk of coincidental sequence match caused by sequencing error is much higher between human and some metazoan samples. The most abundant food item identified from our plasma samples then is corn (Zea mays) followed by rice (Oryza sativa Japonica Group) (
Our sequencing results also revealed the presence of exogenous miRNAs from other species. Due to the extreme sequence similarity of miRNA sequences among some species, it is often difficult to determine the exact origin of those exogenous miRNAs. Some of the highly abundant exogenous miRNA species detected in our plasma samples are listed in Table 8.
With the exception of miR-168a, which derives from common cereal grains such as corn or rice, the exogenous miRNAs were from various common household insects, including the housefly, mosquito and bees. The number of reads for those insect miRNAs varied widely among individual donors.
An acetaminophen overdose mouse model for drug-induced liver injury was employed to determine the effect of liver injury on the RNA spectrum. Several hundred transcripts in plasma were affected including those representing transcripts that are highly concentrated in liver such as albumin, apolipoprotein A2 (apoA2) and transferrin. All of these were significantly increased as compared to untreated controls. The level of albumin spiked after 3 hours and decreased over a 24 hour period, as did that of apoA2. The transferrin levels were increased to a lesser extent but held reasonably steady over a 24-hour period (
In addition, we used a gene enrichment analysis from the Database for Annotation, Visualization and Integrated Discovery (DAVID) found on the web at david.abcc.ncifcrf.gov/home/jsp. The enrichment of organ-specific transcripts in blood as well as liver is shown in Table 9.
The numbers in the table are p values that represent the likelihood of the tissue origin of the RNA sequences observed in plasma, and are smaller the greater the likelihood this is the case. Thus, in the case of liver, the certainty that the increase in transcripts from liver was most certain at 3 hours and less so at 24 hours. As shown in Table 9, some transcripts derived from liver increased significantly in plasma post-acetaminophen administration which suggests RNA released from hepatocyte due to acetaminophen induced liver injury. Histopathology examination of the liver tissues indicates typical zoon 3 hepatocyte death induced by acetaminophen overdose. The other major organ listed in Table 9 is kidney. Histopathological examination on the kidney tissues also indicates renal tubular injury induced by acetaminophen overdose. These findings provide the evidences of using the spectrum of endogenous RNA to predict pathology occurred in specific tissues.
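To make the meaning of such enrichment p values concrete, the sketch below computes a one-sided hypergeometric tail probability for tissue-specific transcripts among the transcripts observed in plasma. DAVID itself reports a modified Fisher exact statistic (the EASE score), so this plain hypergeometric test is a simpler stand-in rather than the calculation behind Table 9, and the counts used are toy numbers.

```python
from math import comb


def enrichment_p_value(population, annotated, sample, hits):
    """One-sided hypergeometric p value: the probability of drawing at least
    `hits` annotated transcripts when `sample` transcripts are drawn from a
    background of `population`, of which `annotated` carry the annotation."""
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Toy numbers for illustration only: a background of 20 transcripts, 5 of
# them liver-specific; 4 transcripts are elevated in plasma and 3 of those
# are liver-specific.
p_liver = enrichment_p_value(population=20, annotated=5, sample=4, hits=3)
print(f"p = {p_liver:.4f}")
```

A smaller value means the observed overlap is less likely to arise by chance, which corresponds to the reading of the table given in the text.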
Asymptomatic sarcoidosis, a systemic inflammatory disease with granulomas in multiple tissues, also produced a pattern of transcripts detectable in plasma that is characteristic of various organs, as shown in Table 10.
To explore the stability of exogenous RNA in plasma, we treated the plasma with one of the following: RNase A (Fermentas™, Thermo Scientific™, Wilmington, Del.) at a concentration of 1 μg/ml; DNase I (Fermentas™, Thermo Scientific™, Wilmington, Del.) at a concentration of 1 unit/ml; proteinase K (Fermentas™, Thermo Scientific™, Wilmington, Del.) at a concentration of 0.05 mg/ml; 0.1% Triton™ X-100; or proteinase K for 20 minutes followed, after heat inactivation of the proteinase K at 70° C. for 10 minutes, by additional RNase A at 1 μg/ml. The treatments were carried out prior to RNA isolation.
As with the endogenous miRNA (miR-16), the levels of a specific exogenous miRNA (miR-263a-5p) and of an exogenous RNA (16S rRNA from Pseudomonas putida) were reduced significantly after the Triton™ X-100, protease, RNase, and protease followed by RNase treatments (
It has been demonstrated that certain cells can take up miRNA contained in lipid vesicles, resulting in a changed gene expression profile. We transfected several synthetic, double-stranded RNA molecules selected from observed exogenous miRNA sequences and from some highly abundant exogenous sequences (bacterial rRNAs) that have the potential to form pre-miRNA-like secondary structures (
The mouse Dicer-deficient (DCR −/−) fibroblast cell line was generated from a conditional cre and floxed Dicer allele transgenic mouse available from Jax (located on the web at jaxmice.jax.org/strain/006001.html), kindly provided by Dr. Jacques Peschon. Part of the RNase III domain encoded in exon 23 of the Dicer gene was deleted following cre excision. DCR −/− cells were maintained in Dulbecco's modified Eagle's medium with high glucose, supplemented with 10% FBS, 1% non-essential amino acids and 1% GlutaMAX™. The cells were routinely incubated at 37° C. in a humidified atmosphere with 5% CO2.
Lipofectamine™ RNAiMAX was purchased from Invitrogen (Life Technologies, Carlsbad, Calif.). Custom-designed exogenous RNA used in transfection was obtained from Ambion (Life Technologies, Carlsbad, Calif.). DCR −/− cells were seeded at a density of 1×10⁵ cells in 6-well tissue culture plates 24 h prior to transfection with 10 nM of synthetic RNAs. Cells exposed to transfection reagents only were used as controls. After 24 hours in the transfection media, the cells were harvested for RNA isolation and the transfection efficiency was validated with qPCR.
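One common way to express qPCR results such as the transfection validation above is relative quantification by the 2^-ΔΔCt method. The text does not state which quantification method was used, so the sketch below is an assumption made for illustration; the function name, the use of a reference gene, and all Ct values are hypothetical.

```python
def fold_change_ddct(ct_target_treated, ct_ref_treated,
                     ct_target_control, ct_ref_control):
    """Relative expression (treated vs. control) by the 2^-ddCt method.

    Assumes roughly 100% amplification efficiency for both assays.
    """
    # Normalize the target to the reference gene within each sample.
    d_ct_treated = ct_target_treated - ct_ref_treated
    d_ct_control = ct_target_control - ct_ref_control
    # Compare the normalized values between treated and control.
    dd_ct = d_ct_treated - d_ct_control
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values, for illustration only (not data from this work).
fold = fold_change_ddct(22.0, 18.0, 25.0, 18.0)
print(f"fold change: {fold:.1f}")  # fold change: 8.0
```

A lower Ct in the transfected cells than in the reagent-only control, after normalization, corresponds to a fold change above 1.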
Effects of exogenous RNAs on the transcriptome were assessed using the Agilent mouse 4×44K microarray (Agilent, Santa Clara, Calif.). Total RNA was isolated with an miRNeasy® column (Qiagen, Valencia, Calif.), and both Cy3- and Cy5-labeled cRNA samples were prepared with a two-color labeling kit (Agilent Technologies, Santa Clara, Calif.) and then hybridized at 65° C. for 17 h. Signal intensity was calculated from digitized images captured by an Agilent scanner (Santa Clara, Calif.), and data analysis was performed using GeneSpring GX software (Agilent Technologies, Santa Clara, Calif.).
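For a two-color array, the quantity usually analyzed is the per-probe log2 ratio of the two channel intensities. The sketch below is a simplified illustration and not the GeneSpring GX workflow: the function name, the choice of Cy5 as the numerator, the clamping floor, and the intensities are all assumptions, and the normalization steps applied in real analyses are omitted.

```python
from math import log2


def log2_ratios(cy5, cy3, floor=1.0):
    """Per-probe log2(Cy5/Cy3) for background-corrected intensities.

    Intensities below `floor` are clamped so that zero or negative
    values do not break the logarithm.
    """
    if len(cy5) != len(cy3):
        raise ValueError("channels must have the same number of probes")
    return [log2(max(r, floor) / max(g, floor)) for r, g in zip(cy5, cy3)]


# Hypothetical intensities for three probes (not data from this work).
ratios = log2_ratios([800.0, 200.0, 0.0], [200.0, 200.0, 4.0])
print(ratios)  # [2.0, 0.0, -2.0]
```

Working on the log2 scale makes a fourfold increase (+2) and a fourfold decrease (-2) symmetric around zero.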
The expression profiles of a number of genes in the cells were affected by some of the exogenous RNA sequences. We verified the changes in mRNA levels of some of these affected genes by qPCR (
[Table fragment; surviving entries: Pseudomonas; Rhodococcus; Rhodococcus; Rhodococcus]
Two of the insect miRNAs, miR-263a-5p and bantam, did not produce any significant effects on the cellular transcriptome, which shows that the process of transfection itself was not the cause of the observed gene expression changes. Thus, RNA sequences in plasma have biological effects on human cells.
This application claims priority from U.S. Ser. No. 61/658,876 filed 12 Jun. 2012. The content of this document is incorporated herein by reference.
Number | Date | Country
---|---|---
61658876 | Jun 2012 | US