ncRNA AND USES THEREOF

Information

  • Patent Application
  • 20230016456
  • Publication Number
    20230016456
  • Date Filed
    July 18, 2022
  • Date Published
    January 19, 2023
Abstract
The present invention relates to compositions and methods for cancer diagnosis, research and therapy, including but not limited to, cancer markers. In particular, the present invention relates to ncRNAs as diagnostic markers and clinical targets for prostate, lung, breast and pancreatic cancer.
Description
SEQUENCE LISTING

The text of the computer readable sequence listing filed herewith, titled “UM-31566-305 SQL”, created Jul. 18, 2022, having a file size of 38,384 bytes, is hereby incorporated by reference in its entirety.


FIELD OF THE INVENTION

The present invention relates to compositions and methods for cancer diagnosis, research and therapy, including but not limited to, cancer markers. In particular, the present invention relates to ncRNAs as diagnostic markers and clinical targets for prostate, lung, breast and pancreatic cancer.


BACKGROUND OF THE INVENTION

A central aim in cancer research is to identify altered genes that are causally implicated in oncogenesis. Several types of somatic mutations have been identified including base substitutions, insertions, deletions, translocations, and chromosomal gains and losses, all of which result in altered activity of an oncogene or tumor suppressor gene. First hypothesized in the early 1900's, there is now compelling evidence for a causal role for chromosomal rearrangements in cancer (Rowley, Nat Rev Cancer 1: 245 (2001)). Recurrent chromosomal aberrations were thought to be primarily characteristic of leukemias, lymphomas, and sarcomas. Epithelial tumors (carcinomas), which are much more common and contribute to a relatively large fraction of the morbidity and mortality associated with human cancer, comprise less than 1% of the known, disease-specific chromosomal rearrangements (Mitelman, Mutat Res 462: 247 (2000)). While hematological malignancies are often characterized by balanced, disease-specific chromosomal rearrangements, most solid tumors have a plethora of non-specific chromosomal aberrations. It is thought that the karyotypic complexity of solid tumors is due to secondary alterations acquired through cancer evolution or progression.


Two primary mechanisms of chromosomal rearrangements have been described. In one mechanism, promoter/enhancer elements of one gene are rearranged adjacent to a proto-oncogene, thus causing altered expression of an oncogenic protein. This type of translocation is exemplified by the apposition of immunoglobulin (IG) and T-cell receptor (TCR) genes to MYC leading to activation of this oncogene in B- and T-cell malignancies, respectively (Rabbitts, Nature 372: 143 (1994)). In the second mechanism, rearrangement results in the fusion of two genes, which produces a fusion protein that may have a new function or altered activity. The prototypic example of this translocation is the BCR-ABL gene fusion in chronic myelogenous leukemia (CML) (Rowley, Nature 243: 290 (1973); de Klein et al., Nature 300: 765 (1982)). Importantly, this finding led to the rational development of imatinib mesylate (Gleevec), which successfully targets the BCR-ABL kinase (Deininger et al., Blood 105: 2640 (2005)). Thus, diagnostic methods that specifically identify epithelial tumors are needed.


SUMMARY OF THE INVENTION

The present invention relates to compositions and methods for cancer diagnosis, research and therapy, including but not limited to, cancer markers. In particular, the present invention relates to ncRNAs as diagnostic markers and clinical targets for prostate, lung, breast and pancreatic cancer.


Embodiments of the present invention provide compositions, kits, and methods useful in the detection and screening of prostate cancer. Experiments conducted during the course of development of embodiments of the present invention identified upregulation of non-coding RNAs in prostate cancer. Some embodiments of the present invention provide compositions and methods for detecting expression levels of such ncRNAs. Identification of ncRNAs finds use in screening, diagnostic and research uses.


For example, in some embodiments, the present invention provides a method of screening for the presence of prostate cancer in a subject, comprising contacting a biological sample from a subject with a reagent for detecting the level of expression of one or more non-coding RNAs (ncRNA) (e.g., PCAT1, PCAT14, PCAT43 and PCAT109); and detecting the level of expression of the ncRNA in the sample, for example, using an in vitro assay, wherein an increased level of expression of the ncRNA in the sample (e.g., relative to the level in normal prostate cells, increase in level relative to a prior time point, increase relative to a pre-established threshold level, etc.) is indicative of prostate cancer in the subject. In some embodiments, the ncRNAs are described by SEQ ID NOs: 1-9. In some embodiments, the sample is tissue, blood, plasma, serum, urine, urine supernatant, urine cell pellet, semen, prostatic secretions or prostate cells. In some embodiments, the detection is carried out utilizing a sequencing technique, a nucleic acid hybridization technique, a nucleic acid amplification technique, or an immunoassay. However, the invention is not limited to the technique employed. In some embodiments, the nucleic acid amplification technique is polymerase chain reaction, reverse transcription polymerase chain reaction, transcription-mediated amplification, ligase chain reaction, strand displacement amplification or nucleic acid sequence based amplification. In some embodiments, the prostate cancer is localized prostate cancer or metastatic prostate cancer. In some embodiments, the reagent is a pair of amplification oligonucleotides or an oligonucleotide probe.
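
The comparison of a measured expression level against a reference level can be sketched as follows. This is a minimal illustration only, assuming that expression has already been quantified as a normalized value; the function name `is_elevated`, the fold threshold, and the sample values are hypothetical and are not taken from the disclosure.

```python
def is_elevated(sample_level, reference_level, fold_threshold=2.0):
    """Return True if the sample level is at least fold_threshold times the reference.

    reference_level may stand for a normal-tissue level, a prior time point,
    or a pre-established threshold level (all hypothetical inputs here).
    """
    if reference_level <= 0:
        raise ValueError("reference_level must be positive")
    return sample_level / reference_level >= fold_threshold


# Hypothetical normalized expression values for a single ncRNA
print(is_elevated(sample_level=8.4, reference_level=1.2))
print(is_elevated(sample_level=1.0, reference_level=1.0))
```

In practice the reference and the threshold would be established empirically for each ncRNA and assay; the fixed two-fold default above is an assumption made only for illustration.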


Additional embodiments provide a method of screening for the presence of prostate cancer in a subject, comprising contacting a biological sample from a subject with a reagent for detecting the level of expression of two or more (e.g., 10 or more, 25 or more, 50 or more, 100 or more or all 121) non-coding RNAs (ncRNA) selected from, for example, PCAT1, PCAT2, PCAT3, PCAT4, PCAT5, PCAT6, PCAT7, PCAT8, PCAT9, PCAT10, PCAT11, PCAT12, PCAT13, PCAT14, PCAT15, PCAT16, PCAT17, PCAT18, PCAT19, PCAT20, PCAT21, PCAT22, PCAT23, PCAT24, PCAT25, PCAT26, PCAT27, PCAT28, PCAT29, PCAT30, PCAT31, PCAT32, PCAT33, PCAT34, PCAT35, PCAT36, PCAT37, PCAT38, PCAT39, PCAT40, PCAT41, PCAT42, PCAT43, PCAT44, PCAT45, PCAT46, PCAT47, PCAT48, PCAT49, PCAT50, PCAT51, PCAT52, PCAT53, PCAT54, PCAT55, PCAT56, PCAT57, PCAT58, PCAT59, PCAT60, PCAT61, PCAT62, PCAT63, PCAT64, PCAT65, PCAT66, PCAT67, PCAT68, PCAT69, PCAT70, PCAT71, PCAT72, PCAT73, PCAT74, PCAT75, PCAT76, PCAT77, PCAT78, PCAT79, PCAT80, PCAT81, PCAT82, PCAT83, PCAT84, PCAT85, PCAT86, PCAT87, PCAT88, PCAT89, PCAT90, PCAT91, PCAT92, PCAT93, PCAT94, PCAT95, PCAT96, PCAT97, PCAT98, PCAT99, PCAT100, PCAT101, PCAT102, PCAT103, PCAT104, PCAT105, PCAT106, PCAT107, PCAT108, PCAT109, PCAT110, PCAT111, PCAT112, PCAT113, PCAT114, PCAT115, PCAT116, PCAT117, PCAT118, PCAT119, PCAT120, or PCAT121; and detecting the level of expression of the ncRNA in the sample using an in vitro assay, wherein an increased level of expression of the ncRNA in the sample relative to the level in normal prostate cells is indicative of prostate cancer in the subject.


Further embodiments of the present invention provide an array, comprising reagents for detecting the level of expression of two or more (e.g., 10 or more, 25 or more, 50 or more, 100 or more or all 121) non-coding RNAs (ncRNA) selected from, for example, PCAT1, PCAT2, PCAT3, PCAT4, PCAT5, PCAT6, PCAT7, PCAT8, PCAT9, PCAT10, PCAT11, PCAT12, PCAT13, PCAT14, PCAT15, PCAT16, PCAT17, PCAT18, PCAT19, PCAT20, PCAT21, PCAT22, PCAT23, PCAT24, PCAT25, PCAT26, PCAT27, PCAT28, PCAT29, PCAT30, PCAT31, PCAT32, PCAT33, PCAT34, PCAT35, PCAT36, PCAT37, PCAT38, PCAT39, PCAT40, PCAT41, PCAT42, PCAT43, PCAT44, PCAT45, PCAT46, PCAT47, PCAT48, PCAT49, PCAT50, PCAT51, PCAT52, PCAT53, PCAT54, PCAT55, PCAT56, PCAT57, PCAT58, PCAT59, PCAT60, PCAT61, PCAT62, PCAT63, PCAT64, PCAT65, PCAT66, PCAT67, PCAT68, PCAT69, PCAT70, PCAT71, PCAT72, PCAT73, PCAT74, PCAT75, PCAT76, PCAT77, PCAT78, PCAT79, PCAT80, PCAT81, PCAT82, PCAT83, PCAT84, PCAT85, PCAT86, PCAT87, PCAT88, PCAT89, PCAT90, PCAT91, PCAT92, PCAT93, PCAT94, PCAT95, PCAT96, PCAT97, PCAT98, PCAT99, PCAT100, PCAT101, PCAT102, PCAT103, PCAT104, PCAT105, PCAT106, PCAT107, PCAT108, PCAT109, PCAT110, PCAT111, PCAT112, PCAT113, PCAT114, PCAT115, PCAT116, PCAT117, PCAT118, PCAT119, PCAT120, or PCAT121. In some embodiments, the reagent is a pair of amplification oligonucleotides or an oligonucleotide probe.


In some embodiments, the present invention provides a method for screening for the presence of lung cancer in a subject, comprising contacting a biological sample from a subject with a reagent for detecting the level of expression of one or more non-coding RNAs (e.g., M41 or ENST-75); and detecting the level of expression of the ncRNA in the sample, for example, using an in vitro assay, wherein an increased level of expression of the ncRNA in the sample (e.g., relative to the level in normal lung cells, increase in level relative to a prior time point, increase relative to a pre-established threshold level, etc.) is indicative of lung cancer in the subject.


In some embodiments, the present invention provides a method for screening for the presence of breast cancer in a subject, comprising contacting a biological sample from a subject with a reagent for detecting the level of expression of one or more non-coding RNAs (e.g., TU0011194, TU0019356, or TU0024146); and detecting the level of expression of the ncRNA in the sample, for example, using an in vitro assay, wherein an increased level of expression of the ncRNA in the sample (e.g., relative to the level in normal breast cells, increase in level relative to a prior time point, increase relative to a pre-established threshold level, etc.) is indicative of breast cancer in the subject.


In some embodiments, the present invention provides a method for screening for the presence of pancreatic cancer in a subject, comprising contacting a biological sample from a subject with a reagent for detecting the level of expression of one or more non-coding RNAs (e.g., TU0009141, TU0062051, or TU0021861); and detecting the level of expression of the ncRNA in the sample, for example, using an in vitro assay, wherein an increased level of expression of the ncRNA in the sample (e.g., relative to the level in normal pancreatic cells, increase in level relative to a prior time point, increase relative to a pre-established threshold level, etc.) is indicative of pancreatic cancer in the subject.


Additional embodiments are described herein.





DESCRIPTION OF THE FIGURES


FIG. 1 shows that prostate cancer transcriptome sequencing reveals dysregulation of exemplary transcripts identified herein. a. A global overview of transcription in prostate cancer. b. A line graph showing the cumulative fraction of genes that are expressed at a given RPKM level. c. Conservation analysis comparing unannotated transcripts to known genes and intronic controls shows a low but detectable degree of purifying selection among intergenic and intronic unannotated transcripts. d-g. Intersection plots displaying the fraction of unannotated transcripts enriched for H3K4me2 (d), H3K4me3 (e), Acetyl-H3 (f) or RNA polymerase II (g) at their transcriptional start site (TSS) using ChIP-Seq and RNA-Seq data for the VCaP prostate cancer cell line. h. A heatmap representing differentially expressed transcripts, including novel unannotated transcripts, in prostate cancer.
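
For readers unfamiliar with the RPKM unit used in panel b, the following minimal sketch shows the standard definition (reads per kilobase of transcript per million mapped reads) and one way a cumulative fraction at a given RPKM level could be computed. The function names and all numeric values are hypothetical, and the interpretation of the cumulative fraction as "fraction of transcripts at or above a level" is an assumption rather than a description of how the figure was generated.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


def fraction_at_or_above(rpkm_values, level):
    """Fraction of transcripts whose RPKM is at or above the given level."""
    if not rpkm_values:
        raise ValueError("rpkm_values must not be empty")
    return sum(1 for v in rpkm_values if v >= level) / len(rpkm_values)


# Hypothetical transcript: 500 reads, 2,000 bp long, 10 million mapped reads
print(rpkm(500, 2000, 10_000_000))

# Hypothetical RPKM values for four transcripts
print(fraction_at_or_above([0.5, 1.0, 5.0, 25.0], level=1.0))
```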



FIG. 2 shows that unannotated intergenic transcripts differentiate prostate cancer and benign prostate samples. a. A histogram plotting the genomic distance between an unannotated ncRNA and the nearest protein-coding gene. b. A Circos plot displaying the location of annotated transcripts and unannotated transcripts on Chr15q. c. A heatmap of differentially expressed or outlier unannotated intergenic transcripts clusters benign samples, localized tumors, and metastatic cancers by unsupervised clustering analyses. d. Cancer outlier profile analysis (COPA) of the prostate cancer transcriptome reveals known outliers (SPINK1, ERG, and ETV1), as well as numerous unannotated transcripts.
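
The outlier analysis referred to in panel d can be illustrated with a short sketch. COPA is commonly described as median-centering each transcript's expression values, scaling by the median absolute deviation (MAD), and scoring the transcript by an upper percentile of the transformed values. The sketch below assumes that formulation; the choice of the 90th percentile, the nearest-rank percentile rule, and the sample values are hypothetical and are not taken from the disclosure.

```python
import statistics


def copa_score(values, percentile=90):
    """Score one transcript by an upper percentile of its MAD-scaled values."""
    values = list(values)
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; values cannot be scaled")
    transformed = sorted((v - med) / mad for v in values)
    # Nearest-rank percentile, computed with integer arithmetic
    rank = max(1, (percentile * len(transformed) + 99) // 100)
    return transformed[rank - 1]


# Hypothetical expression values across ten samples
outlier_like = [1, 1, 2, 2, 2, 3, 3, 3, 40, 50]  # high in two samples only
uniform_like = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]   # no outlier subset

print(copa_score(outlier_like))
print(copa_score(uniform_like))
```

A transcript that is strongly expressed in only a subset of samples receives a much larger score than one with evenly spread expression, which is the property that makes this kind of analysis suited to nominating outlier transcripts.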



FIG. 3 shows validation of tissue-specific prostate cancer-associated non-coding RNAs. a-c. Quantitative real-time PCR was performed on a panel of prostate and non-prostate samples to measure expression levels of three nominated non-coding RNAs (ncRNAs), PCAT-43, PCAT-109, and PCAT-14, upregulated in prostate cancer compared to normal prostate tissues. a. PCAT-43 is a 20 kb ncRNA located 40 kb upstream of PMEPA1 on chr20q13.31. b. PCAT-109, located in a large, 0.5 Mb gene desert region on chr2q31.3 displays widespread transcription in prostate tissues, particularly metastases. c. PCAT-14, a genomic region on chr22q11.23 encompassing a human endogenous retrovirus exhibits marked upregulation in prostate tumors but not metastases.



FIG. 4 shows that prostate cancer ncRNAs populate the Chr8q24 gene desert. a. A schematic of the chr8q24 region. b. Comprehensive analysis of the chr8q24 region by RNA-Seq and ChIP-Seq reveals numerous transcripts supported by histone modifications, such as Acetyl-H3 and H3K4me3, demarcating active chromatin. c. RT-PCR and Sanger sequencing validation of the PCAT-1 exon-exon junction. d. The genomic location of PCAT-1 determined by 5′ and 3′ RACE. Sequence analysis of PCAT-1 shows that it is a viral long terminal repeat (LTR) promoter splicing to a mariner family transposase that has been bisected by an Alu repeat. e. qPCR on a panel of prostate and non-prostate samples shows prostate-specific expression and upregulation in prostate cancers and metastases compared to benign prostate samples. f. Four matched tumor/normal pairs included in the analysis in e. demonstrate somatic upregulation of PCAT-1 in matched cancer samples.



FIG. 5 shows that ncRNAs serve as urine biomarkers for prostate cancer. a-c. Three ncRNAs displaying biomarker status in prostate cancer tissues were evaluated on a cohort of urine samples from 77 patients with prostate cancer and 31 controls with negative prostate biopsy results and absence of the TMPRSS2-ERG fusion transcript. PCA3 (a); PCAT-1 (b); and PCAT-14 (c). d. Scatter plots demonstrating distinct patient subsets scoring positively for PCA3, PCAT-1, or PCAT-14 expression. e. A heatmap displaying patients positive and negative for several different prostate cancer biomarkers in urine sediment samples. f. A table displaying the statistical significance of the ncRNA signature. g. A model for non-coding RNA (ncRNA) activation in prostate cancer.



FIG. 6 shows ab initio assembly of the prostate cancer transcriptome. (a) Reads were mapped with TopHat and assembled into library-specific transcriptomes by Cufflinks. (b) Transcripts corresponding to processed pseudogenes were isolated, and the remaining transcripts were categorized based on overlap with an aggregated set of known gene annotations.



FIG. 7 shows classification tree results for Chromosome 1. The recursive regression and partitioning trees (rpart) machine learning algorithm was used to predict expressed transcripts versus background signal.



FIG. 8 shows transcript assembly of known genes. Ab initio transcript assembly on prostate transcriptome sequencing data was used to reconstruct the known prostate transcriptome. a. SPINK1, a biomarker for prostate cancer. b. PRUNE2 with the PCA3 non-coding RNA within its intronic regions. c. NFKB1. d. COL9A2.



FIG. 9 shows analysis of EST support for exemplary transcripts. ESTs from the UCSC database table “Human ESTs” were used to evaluate the amount of overlap between ESTs and novel transcripts. a. A line graph showing the fraction of genes whose transcripts are supported by a particular fraction of ESTs. b. A table displaying the number of ESTs supporting each class of transcripts.



FIG. 10 shows analysis of coding potential of unannotated transcripts. DNA sequences for each transcript were extracted and searched for open reading frames (ORFs) using the txCdsPredict program from the UCSC source tool set.
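
The idea of searching a transcript sequence for open reading frames can be illustrated with a deliberately simple sketch. This is not the txCdsPredict program and does not reproduce its scoring; it only scans the three forward reading frames for the longest ATG-to-stop stretch. The function name and the example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_length(seq):
    """Length in nucleotides (stop codon included) of the longest forward-strand ORF."""
    seq = seq.upper()
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None  # close the ORF and keep scanning this frame
    return best


# Hypothetical sequence containing ATG AAA TTT TAG in one frame
print(longest_orf_length("CCATGAAATTTTAGCC"))
```

A transcript whose longest ORF is short relative to its total length is one simple indicator of low coding potential, although dedicated tools weigh additional evidence.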



FIG. 11 shows repetitive content of novel transcripts. The percentage of repetitive sequences was assessed in all transcripts by calculating the percentage of repeatmasked nucleotides in each sequence.
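
The percentage calculation described above can be sketched as follows, assuming the sequences are soft-masked so that repeat-masked nucleotides appear in lowercase and unmasked nucleotides in uppercase. That input convention, the function name, and the example sequence are assumptions made for illustration.

```python
def repeat_percentage(soft_masked_seq):
    """Percentage of nucleotides that are repeat-masked (lowercase) in a sequence."""
    bases = [b for b in soft_masked_seq if b.isalpha()]
    if not bases:
        raise ValueError("sequence contains no nucleotides")
    masked = sum(1 for b in bases if b.islower())
    return 100.0 * masked / len(bases)


# Hypothetical soft-masked transcript: 6 of 12 nucleotides are masked
print(repeat_percentage("ACGTacgtacGT"))
```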



FIG. 12 shows distinct ChIP-Seq signatures for repeat-associated and nonrepeat novel ncRNAs. Unannotated transcripts were divided into two groups, repeat-associated and non-repeat, and intersected with ChIP-Seq data for Acetyl-H3 and H3K4me3, two histone modifications strongly associated with transcriptional start sites (TSS), in two prostate cancer cell lines. a. Acetyl-H3 in LNCaP cells. b. H3K4me3 in LNCaP cells. c. Acetyl-H3 in VCaP cells. d. H3K4me3 in VCaP cells.



FIG. 13 shows overlap of unannotated transcripts with ChIP-Seq data in VCaP cells. Previously published ChIP-Seq data for VCaP prostate cancer cells were intersected with unannotated prostate cancer transcripts and annotated control genes. a. H3K4me1 b. H3K36me3.



FIG. 14 shows overlap of unannotated transcripts with ChIP-Seq data in LNCaP cells. ChIP-Seq data for LNCaP prostate cancer cells were intersected with unannotated transcripts and annotated control genes. ncRNAs were divided into intergenic and intronic. a. H3K4me1 b. H3K4me2 c. H3K4me3 d. Acetyl-H3 e. H3K36me3 f. RNA polymerase II.



FIG. 15 shows validation of a novel transcript on chromosome 15. a. Coverage maps showing the average expression levels (RPKM) across the benign, localized tumor, and metastatic samples shows upregulation of a novel transcript downstream of TLE3. b. Several predicted isoforms of this transcript were nominated which retained common exons 1 and 2. c. The exon-exon boundary between exons 1 and 2, as well as an internal portion of exon 3, was validated by RT-PCR in prostate cell line models. d. Sanger sequencing of the RT-PCR product confirmed the junction of exon 1 and exon 2.



FIG. 16 shows clustering of prostate cancer with outliers. Transcripts with outlier profile scores in the top 10% were clustered using hierarchical trees.



FIG. 17 shows validation of novel transcripts in prostate cell lines. 11/14 unannotated transcripts selected for validation by RT-PCR and qPCR were confirmed in cell line models. a. RT-PCR gels showing expected bands for the 11 transcripts that validated. b. Representative qPCR results using primers selected from a. The primers used in b are indicated by a red asterisk in a.



FIG. 18 shows that PCAT-14 is upregulated by androgen signaling. VCaP and LNCaP cells were treated with 5 nM R1881 or vehicle (ethanol) control.



FIG. 19 shows that PCAT-14 is upregulated in matched tumor tissues. Four matched tumor-normal patient tissue samples were assayed for PCAT-14 expression by qPCR.



FIG. 20 shows analysis of PCAT-14 transcript structure. a. Representative 5′RACE results using a 3′ primer confirms the presence of the sense transcript PCAT-14. Predicted novel transcripts are displayed above the RACE results. b. DNA sequence analysis of PCAT-14 indicates expected splice donor sites, splice acceptor sites, and a polyadenylation site.



FIG. 21 shows analysis of PCAT-1 transcript structure. 5′ and 3′ RACE experiments showed a ncRNA transcript containing two exons.



FIG. 22 shows that knockdown of PCAT-1 does not affect invasion or proliferation of VCaP cells. VCaP cells were transfected with custom-made siRNAs targeting PCAT-1 or non-targeting controls. a. Knockdown efficiency for four siRNA oligos individually and pooled. b.-d. siRNAs 2-4 were tested for functional effect due to their higher efficiency of knockdown. b. A cell proliferation assay performed with a Coulter counter shows no significant difference in cell proliferation following knockdown of PCAT-1. c. A WST-1 assay indicates no change in VCaP cell viability following PCAT-1 knockdown. d. A transmembrane invasion assay shows no change in VCaP cell invasiveness following PCAT-1 knockdown.



FIG. 23 shows transcription of two Alu elements in a CACNA1D intron. a. Coverage maps representing average expression in RPKM in benign samples, localized tumors, and prostate metastases. b. RPKM expression values for the CACNA1D Alu transcript across the prostate transcriptome sequencing cohort. c. RT-PCR validation of the Alu transcript in cell line models. d. Sanger sequencing confirmation of RT-PCR fragments verifies the presence of AluSp transcript sequence. e. Raw sequencing data of a portion of the AluSp sequence.



FIG. 24 shows transcription of numerous repeat elements at the SChLAP1 locus. a. Coverage maps representing repeat elements transcribed at the chr2q31.3 locus. b. RPKM expression values for the LINE-1 repeat region on chr2q31.3 across the prostate transcriptome sequencing cohort. c. RT-PCR validation of the LINE-1 repetitive element in cell line models. A 402 bp fragment was amplified. d. Sanger sequencing of the PCR fragment confirms identity of the LINE-1 amplicon.



FIG. 25 shows that a heatmap of repeats clusters prostate cancer samples. Unannotated transcripts that contained repeat elements were used to cluster prostate cancer samples in an unsupervised manner.



FIG. 26 shows that the SChLAP1 locus spans >500 kb. Visualization of transcriptome sequencing data in the UCSC genome browser indicates that a large, almost 1 Mb section of chromosome 2 is highly activated in cancer, contributing to many individual transcripts regulated in a coordinated fashion.



FIG. 27 shows that the SChLAP1 locus is associated with ETS positive tumors. a. Expression of the SChLAP1 locus was assayed by qPCR as displayed in FIG. 3b on a cohort of 14 benign prostate tissues, 47 localized prostate tumors and 10 metastatic prostate cancers. b. Quantification of the SChLAP1 association with ETS status using the threshold indicated by the blue dotted line in a.



FIG. 28 shows the sequence of PCAT-1 and PCAT-14.



FIG. 29 shows that PCAT-1 expression sensitizes prostate cancer cells to treatment with PARP-1 inhibitors. (a-d) treatment with the PARP1 inhibitor olaparib, (e-h) treatment with the PARP1 inhibitor ABT-888. Stable PCAT-1 knockdown in LNCaP prostate cells reduces sensitivity to olaparib (a) and ABT-888 (e). Stable overexpression in Du145 prostate cancer and RWPE benign prostate cells increases sensitivity to olaparib (b,c) and ABT-888 (f,g). Overexpression of PCAT-1 in MCF7 breast cancer cells does not recapitulate this effect (d,h).



FIG. 30 shows that PCAT-1 expression sensitizes prostate cancer cells to radiation treatment. (a) Stable PCAT-1 knockdown in LNCaP prostate cells reduces sensitivity to radiation. (b,c) Stable overexpression in Du145 prostate cancer and RWPE benign prostate cells increases sensitivity to radiation. (d) Overexpression of PCAT-1 in MCF7 breast cancer cells does not recapitulate this effect.



FIG. 31 shows that unannotated intergenic transcripts differentiate prostate cancer and benign samples. (a) The genomic location and exon structure of SChLAP-1. SChLAP-1 is located on chromosome 2 in a previously unannotated region. (b) The isoform structure of SChLAP-1. (c) Cell fractionation into nuclear and cytoplasmic fractions demonstrates that SChLAP-1 is predominantly nuclear in its localization. (d) Expression of SChLAP-1 in a cohort of prostate cancer and benign tissues indicates that SChLAP-1 is a prostate cancer outlier associated with cancers.



FIG. 32 shows that SChLAP-1 is required for prostate cancer cell invasion and proliferation. (a) Prostate and non-prostate cancer cell lines were treated with SChLAP-1 siRNAs. (b and c) As in (a), prostate and non-prostate cell lines were assayed for cell proliferation following SChLAP-1 knockdown. (d) The three most abundant isoforms of SChLAP-1 were cloned and overexpressed in RWPE benign immortalized prostate cells at levels similar to LNCaP cancer cells. (e) RWPE cells overexpressing SChLAP-1 isoforms show an increased ability to invade through Matrigel in Boyden chamber assays.



FIG. 33 shows that deletion analysis of SChLAP-1 identifies a region essential for its function. (a) RWPE cells overexpressing SChLAP-1 deletion constructs or full-length isoform #1 were generated as shown in the schematic of the constructs. (b) RWPE cells overexpressing SChLAP-1 deletion construct 5 demonstrated an impaired ability to invade through Matrigel, while the other deletion constructs showed no reduction in their ability to induce RWPE cell invasion compared to the wild type SChLAP-1.



FIG. 34 shows detection of prostate cancer RNAs in patient urine samples. (a-e). (a) PCA3 (b) PCAT-14 (c) PCAT-1 (d) SChLAP-1 (e) PDLIM5



FIG. 35 shows multiplexing urine SChLAP-1 measurements with serum PSA improves prostate cancer risk stratification.



FIG. 36 shows analysis of the lung cancer transcriptome. (a) 38 lung cell lines were analyzed by RNA-Seq and then lncRNA transcripts were reconstructed. (b) Expression levels of transcripts observed in lung cell lines. (c) An outlier analysis of 13 unannotated transcripts shows the presence of novel lncRNAs in subtypes of lung cancer cell lines.



FIG. 37 shows discovery of M41 and ENST-75 in lung cancer. (a) The genomic location of M41, which resides in an intron of DSCAM. M41 is poorly conserved across species. (b) qPCR of M41 demonstrates outlier expression in 15-20% of lung adenocarcinomas as well as high expression in breast cells. (c) The genomic location of ENST-75, which demonstrates high conservation across species. (d) qPCR of ENST-75 shows up-regulation in lung cancer but not breast or prostate cancers. High expression is observed in normal testis.



FIG. 38 shows lncRNAs are drivers and biomarkers in lung cancer. (a) Knockdown of ENST-75 in H1299 cells with independent siRNAs achieving >70% knockdown. (b) Knockdown of ENST-75 in H1299 cells impairs cell proliferation. Error bars represent s.e.m. (c) ENST-75 expression in lung adenocarcinomas stratifies patient overall survival. (d) Serum detection levels of ENST-75 in normal and lung cancer patients. (e) Average ENST-75 expression in lung cancer patient sera compared to normal patient sera. Error bars represent s.e.m.



FIG. 39 shows nomination of cancer-associated lncRNAs in breast and pancreatic cancer. (a-c) (a) TU0011194 (b) TU0019356 (c) TU0024146 (d-f) Three novel pancreatic cancer lncRNAs nominated from RNA-Seq data. All show outlier expression patterns in pancreatic cancer samples but not benign samples. (d) TU0009141 (e) TU0062051 (f) TU0021861





DEFINITIONS

To facilitate an understanding of the present invention, a number of terms and phrases are defined below:


As used herein, the terms “detect”, “detecting” or “detection” may describe either the general act of discovering or discerning or the specific observation of a detectably labeled composition.


As used herein, the term “subject” refers to any organisms that are screened using the diagnostic methods described herein. Such organisms preferably include, but are not limited to, mammals (e.g., murines, simians, equines, bovines, porcines, canines, felines, and the like), and most preferably includes humans.


The term “diagnosed,” as used herein, refers to the recognition of a disease by its signs and symptoms, or genetic analysis, pathological analysis, histological analysis, and the like.


A “subject suspected of having cancer” encompasses an individual who has received an initial diagnosis (e.g., a CT scan showing a mass or increased PSA level) but for whom the stage of cancer or presence or absence of ncRNAs indicative of cancer is not known. The term further includes people who once had cancer (e.g., an individual in remission). In some embodiments, “subjects” are control subjects that are suspected of having cancer or diagnosed with cancer.


As used herein, the term “characterizing cancer in a subject” refers to the identification of one or more properties of a cancer sample in a subject, including but not limited to, the presence of benign, pre-cancerous or cancerous tissue, the stage of the cancer, and the subject's prognosis. Cancers may be characterized by the identification of the expression of one or more cancer marker genes, including but not limited to, the ncRNAs disclosed herein.


As used herein, the term “characterizing prostate tissue in a subject” refers to the identification of one or more properties of a prostate tissue sample (e.g., including but not limited to, the presence of cancerous tissue, the presence or absence of ncRNAs, the presence of pre-cancerous tissue that is likely to become cancerous, and the presence of cancerous tissue that is likely to metastasize). In some embodiments, tissues are characterized by the identification of the expression of one or more cancer marker genes, including but not limited to, the cancer markers disclosed herein.


As used herein, the term “stage of cancer” refers to a qualitative or quantitative assessment of the level of advancement of a cancer. Criteria used to determine the stage of a cancer include, but are not limited to, the size of the tumor and the extent of metastases (e.g., localized or distant).


As used herein, the term “nucleic acid molecule” refers to any nucleic acid containing molecule, including but not limited to, DNA or RNA. The term encompasses sequences that include any of the known base analogs of DNA and RNA including, but not limited to, 4-acetylcytosine, 8-hydroxy-N6-methyladenosine, aziridinylcytosine, pseudoisocytosine, 5-(carboxyhydroxylmethyl) uracil, 5-fluorouracil, 5-bromouracil, 5-carboxymethylaminomethyl-2-thiouracil, 5-carboxymethylaminomethyluracil, dihydrouracil, inosine, N6-isopentenyladenine, 1-methyladenine, 1-methylpseudouracil, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-methyladenine, 7-methylguanine, 5-methylaminomethyluracil, 5-methoxy-aminomethyl-2-thiouracil, beta-D-mannosylqueosine, 5′-methoxycarbonylmethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid, oxybutoxosine, pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, N-uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid, pseudouracil, queosine, 2-thiocytosine, and 2,6-diaminopurine.


The term “gene” refers to a nucleic acid (e.g., DNA) sequence that comprises coding sequences necessary for the production of a polypeptide, precursor, or RNA (e.g., rRNA, tRNA). The polypeptide can be encoded by a full length coding sequence or by any portion of the coding sequence so long as the desired activity or functional properties (e.g., enzymatic activity, ligand binding, signal transduction, immunogenicity, etc.) of the full-length or fragments are retained. The term also encompasses the coding region of a structural gene and the sequences located adjacent to the coding region on both the 5′ and 3′ ends for a distance of about 1 kb or more on either end such that the gene corresponds to the length of the full-length mRNA. Sequences located 5′ of the coding region and present on the mRNA are referred to as 5′ non-translated sequences. Sequences located 3′ or downstream of the coding region and present on the mRNA are referred to as 3′ non-translated sequences. The term “gene” encompasses both cDNA and genomic forms of a gene. A genomic form or clone of a gene contains the coding region interrupted with non-coding sequences termed “introns” or “intervening regions” or “intervening sequences.” Introns are segments of a gene that are transcribed into nuclear RNA (hnRNA); introns may contain regulatory elements such as enhancers. Introns are removed or “spliced out” from the nuclear or primary transcript; introns therefore are absent in the messenger RNA (mRNA) transcript. The mRNA functions during translation to specify the sequence or order of amino acids in a nascent polypeptide.


As used herein, the term “oligonucleotide” refers to a short length of single-stranded polynucleotide chain. Oligonucleotides are typically less than 200 residues long (e.g., between 15 and 100); however, as used herein, the term is also intended to encompass longer polynucleotide chains. Oligonucleotides are often referred to by their length. For example, a 24 residue oligonucleotide is referred to as a “24-mer.” Oligonucleotides can form secondary and tertiary structures by self-hybridizing or by hybridizing to other polynucleotides. Such structures can include, but are not limited to, duplexes, hairpins, cruciforms, bends, and triplexes.


As used herein, the terms “complementary” or “complementarity” are used in reference to polynucleotides (i.e., a sequence of nucleotides) related by the base-pairing rules. For example, the sequence “5′-A-G-T-3′,” is complementary to the sequence “3′-T-C-A-5′.” Complementarity may be “partial,” in which only some of the nucleic acids' bases are matched according to the base pairing rules. Or, there may be “complete” or “total” complementarity between the nucleic acids. The degree of complementarity between nucleic acid strands has significant effects on the efficiency and strength of hybridization between nucleic acid strands. This is of particular importance in amplification reactions, as well as detection methods that depend upon binding between nucleic acids.
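
By way of non-limiting illustration, the base-pairing rules described above can be sketched in Python. The function names `reverse_complement` and `complementarity` are hypothetical, and the sketch assumes unmodified DNA bases (A, C, G, T) only; it is not part of any disclosed assay.

```python
# Watson-Crick pairing rules for unmodified DNA bases (assumption: no base analogs).
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the fully complementary strand, written 5'->3'."""
    return "".join(PAIRS[base] for base in reversed(seq.upper()))


def complementarity(strand_a: str, strand_b: str) -> float:
    """Fraction of positions that base-pair when two equal-length strands,
    both written 5'->3', are aligned antiparallel."""
    if not strand_a or len(strand_a) != len(strand_b):
        raise ValueError("strands must be non-empty and of equal length")
    a = strand_a.upper()
    b = strand_b.upper()[::-1]  # reverse strand_b to align it antiparallel
    paired = sum(1 for x, y in zip(a, b) if PAIRS.get(x) == y)
    return paired / len(a)
```

With these definitions, the sequence 5′-A-G-T-3′ and its complement 3′-T-C-A-5′ (written 5′-A-C-T-3′) give a value of 1.0, corresponding to “complete” complementarity, whereas a value below 1.0 corresponds to “partial” complementarity.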


The term “homology” refers to a degree of complementarity. There may be partial homology or complete homology (i.e., identity). A partially complementary sequence, i.e., a nucleic acid molecule that at least partially inhibits a completely complementary nucleic acid molecule from hybridizing to a target nucleic acid, is “substantially homologous.” The inhibition of hybridization of the completely complementary sequence to the target sequence may be examined using a hybridization assay (Southern or Northern blot, solution hybridization and the like) under conditions of low stringency. A substantially homologous sequence or probe will compete for and inhibit the binding (i.e., the hybridization) of a completely homologous nucleic acid molecule to a target under conditions of low stringency. This is not to say that conditions of low stringency are such that non-specific binding is permitted; low stringency conditions require that the binding of two sequences to one another be a specific (i.e., selective) interaction. The absence of non-specific binding may be tested by the use of a second target that is substantially non-complementary (e.g., less than about 30% identity); in the absence of non-specific binding the probe will not hybridize to the second non-complementary target.


As used herein, the term “hybridization” is used in reference to the pairing of complementary nucleic acids. Hybridization and the strength of hybridization (i.e., the strength of the association between the nucleic acids) are impacted by such factors as the degree of complementarity between the nucleic acids, stringency of the conditions involved, the Tm of the formed hybrid, and the G:C ratio within the nucleic acids. A single molecule that contains pairing of complementary nucleic acids within its structure is said to be “self-hybridized.”
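
By way of non-limiting illustration, two of the factors named above (G:C content and Tm) can be estimated for a short oligonucleotide as sketched below. The Tm estimate uses the Wallace rule of thumb (2° C. per A or T, 4° C. per G or C), which is an assumption introduced here for illustration and is only a rough approximation for short oligonucleotides; the function names are hypothetical.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of residues in the sequence that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must be non-empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Approximate melting temperature in degrees C using the Wallace rule:
    Tm = 2*(A+T) + 4*(G+C). Intended only for short DNA oligonucleotides."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc
```

Under this approximation, an oligonucleotide richer in G and C residues has a higher estimated Tm than an A:T-rich oligonucleotide of the same length.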


As used herein, the term “stringency” is used in reference to the conditions of temperature, ionic strength, and the presence of other compounds such as organic solvents, under which nucleic acid hybridizations are conducted. Under “low stringency conditions,” a nucleic acid sequence of interest will hybridize to its exact complement, sequences with single base mismatches, closely related sequences (e.g., sequences with 90% or greater homology), and sequences having only partial homology (e.g., sequences with 50-90% homology). Under “medium stringency conditions,” a nucleic acid sequence of interest will hybridize only to its exact complement, sequences with single base mismatches, and closely related sequences (e.g., 90% or greater homology). Under “high stringency conditions,” a nucleic acid sequence of interest will hybridize only to its exact complement, and (depending on conditions such as temperature) sequences with single base mismatches. In other words, under conditions of high stringency the temperature can be raised so as to exclude hybridization to sequences with single base mismatches.
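
By way of non-limiting illustration, the three stringency categories defined above can be expressed as a simple decision rule. The sketch below is a simplification that treats “homology” as the percentage of matched positions over the probe length; the function name and the `high_allows_single_mismatch` flag (standing in for the temperature dependence noted above) are hypothetical.

```python
def expected_to_hybridize(length: int, mismatches: int, stringency: str,
                          high_allows_single_mismatch: bool = True) -> bool:
    """Apply the example thresholds from the stringency definition to a probe
    of the given length that has the given number of mismatches to a target."""
    if length <= 0 or not 0 <= mismatches <= length:
        raise ValueError("invalid length or mismatch count")
    matches = length - mismatches
    if stringency == "high":
        # Exact complement; single mismatches only if conditions permit.
        return mismatches == 0 or (mismatches == 1 and high_allows_single_mismatch)
    if stringency == "medium":
        # Exact, single mismatch, or closely related (90% or greater).
        return mismatches <= 1 or 10 * matches >= 9 * length
    if stringency == "low":
        # Additionally admits partial homology (50% or greater).
        return mismatches <= 1 or 2 * matches >= length
    raise ValueError("stringency must be 'low', 'medium', or 'high'")
```

Integer comparisons are used for the percentage thresholds so that boundary cases (e.g., exactly 90%) are not affected by floating-point rounding.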


The term “isolated” when used in relation to a nucleic acid, as in “an isolated oligonucleotide” or “isolated polynucleotide,” refers to a nucleic acid sequence that is identified and separated from at least one component or contaminant with which it is ordinarily associated in its natural source. Isolated nucleic acid is present in a form or setting that is different from that in which it is found in nature. In contrast, non-isolated nucleic acids are nucleic acids such as DNA and RNA found in the state they exist in nature. For example, a given DNA sequence (e.g., a gene) is found on the host cell chromosome in proximity to neighboring genes; RNA sequences, such as a specific mRNA sequence encoding a specific protein, are found in the cell as a mixture with numerous other mRNAs that encode a multitude of proteins. However, isolated nucleic acid encoding a given protein includes, by way of example, such nucleic acid in cells ordinarily expressing the given protein where the nucleic acid is in a chromosomal location different from that of natural cells, or is otherwise flanked by a different nucleic acid sequence than that found in nature. The isolated nucleic acid, oligonucleotide, or polynucleotide may be present in single-stranded or double-stranded form. When an isolated nucleic acid, oligonucleotide or polynucleotide is to be utilized to express a protein, the oligonucleotide or polynucleotide will contain at a minimum the sense or coding strand (i.e., the oligonucleotide or polynucleotide may be single-stranded), but may contain both the sense and anti-sense strands (i.e., the oligonucleotide or polynucleotide may be double-stranded).


As used herein, the term “purified” or “to purify” refers to the removal of components (e.g., contaminants) from a sample. For example, antibodies are purified by removal of contaminating non-immunoglobulin proteins; they are also purified by the removal of immunoglobulin that does not bind to the target molecule. The removal of non-immunoglobulin proteins and/or the removal of immunoglobulins that do not bind to the target molecule results in an increase in the percent of target-reactive immunoglobulins in the sample. In another example, recombinant polypeptides are expressed in bacterial host cells and the polypeptides are purified by the removal of host cell proteins; the percent of recombinant polypeptides is thereby increased in the sample.


As used herein, the term “sample” is used in its broadest sense. In one sense, it is meant to include a specimen or culture obtained from any source, as well as biological and environmental samples. Biological samples may be obtained from animals (including humans) and encompass fluids, solids, tissues, and gases. Biological samples include blood products, such as plasma, serum and the like. Such examples are not however to be construed as limiting the sample types applicable to the present invention.


DETAILED DESCRIPTION OF THE INVENTION

The present invention relates to compositions and methods for cancer diagnosis, research and therapy, including but not limited to, cancer markers. In particular, the present invention relates to ncRNAs as diagnostic markers and clinical targets for prostate, lung, breast and pancreatic cancer.


Experiments conducted during the development of embodiments of the present invention utilized RNA-Seq analyses of tissue samples and ab initio transcriptome assembly to predict the complete polyA+ transcriptome of prostate cancer. 6,144 novel ncRNAs found in prostate cancer were identified, including 121 ncRNAs that associated with disease progression (FIGS. 1, 2, 16 and 25). These data demonstrate the global utility of RNA-Seq in defining functionally-important elements of the genome.


The present invention is not limited to a particular mechanism. Indeed, an understanding of the mechanism is not necessary to practice the present invention. Nonetheless, although the biological role of these RNAs, especially the differentially-expressed ones, is not yet known, these results indicate a model in which specific intergenic loci are activated in prostate cancer, enabling the transcription of numerous disease-specific and tissue-specific ncRNAs (FIG. 5g). Clinically, these ncRNA signatures are suitable for urine-based assays to detect and diagnose prostate cancer in a non-invasive manner (See e.g., Example 1). It is further contemplated that specific ncRNA signatures occur universally in all disease states and applying these methodologies to other diseases reveals clinically important biomarkers, particularly for diseases that currently lack good protein biomarkers.


While traditional approaches have focused on the annotated reference genome, data generated during the course of development of embodiments of the present invention implicate large swaths of unannotated genomic loci in prostate cancer progression and prostate-specific expression. Examples include the SChLAP1 locus, which represents a >500 kb stretch of coordinately regulated expression, and the chr8q24 locus, which contains a prostate specific region with the prostate cancer biomarker PCAT-1. The fact that the SChLAP1 locus is almost exclusively expressed in prostate cancers harboring an ETS gene fusion further confirms the capacity of ncRNAs to identify patient disease subtypes. In addition, these analyses reveal novel cancer-specific drivers of tumorigenesis. For example, the long ncRNA HOTAIR is known to direct cancer-promoting roles for EZH2 in breast cancer (Gupta et al., Nature 464 (7291), 1071 (2010)), while in the PC3 prostate cancer cell line a similar role has been proposed for the ANRIL ncRNA (Yap et al., Mol Cell 38 (5), 662 (2010)).


I. Diagnostic and Screening Methods

As described above, embodiments of the present invention provide diagnostic and screening methods that utilize the detection of ncRNAs (e.g., PCAT-1, PCAT-14, PCAT-43 and PCAT-109; SEQ ID NOs: 1-9). Exemplary, non-limiting methods are described below.


Any patient sample suspected of containing the ncRNAs may be tested according to methods of embodiments of the present invention. By way of non-limiting examples, the sample may be tissue (e.g., a prostate biopsy sample or a tissue sample obtained by prostatectomy), blood, urine, semen, prostatic secretions or a fraction thereof (e.g., plasma, serum, urine supernatant, urine cell pellet or prostate cells). A urine sample is preferably collected immediately following an attentive digital rectal examination (DRE), which causes prostate cells from the prostate gland to shed into the urinary tract.


In some embodiments, the patient sample is subjected to preliminary processing designed to isolate or enrich the sample for the ncRNAs or cells that contain the ncRNAs. A variety of techniques known to those of ordinary skill in the art may be used for this purpose, including but not limited to: centrifugation; immunocapture; cell lysis; and, nucleic acid target capture (See, e.g., EP Pat. No. 1 409 727, herein incorporated by reference in its entirety).


The ncRNAs may be detected along with other markers in a multiplex or panel format. Markers are selected for their predictive value alone or in combination with the gene fusions. Exemplary prostate cancer markers include, but are not limited to: AMACR/P504S (U.S. Pat. No. 6,262,245); PCA3 (U.S. Pat. No. 7,008,765); PCGEM1 (U.S. Pat. No. 6,828,429); prostein/P501S, P503S, P504S, P509S, P510S, prostase/P703P, P710P (U.S. Publication No. 20030185830); RAS/KRAS (Bos, Cancer Res. 49:4682-89 (1989); Kranenburg, Biochimica et Biophysica Acta 1756:81-82 (2005)); and, those disclosed in U.S. Pat. Nos. 5,854,206, 6,034,218 and 7,229,774, each of which is herein incorporated by reference in its entirety. Markers for other cancers, diseases, infections, and metabolic conditions are also contemplated for inclusion in a multiplex or panel format.


In some embodiments, multiplex or array formats are utilized to detect multiple markers in combination. For example, in some embodiments, the level of expression of two or more (e.g., 10 or more, 25 or more, 50 or more, 100 or more or all 121) non-coding RNAs (ncRNA) selected from, for example, PCAT1, PCAT2, PCAT3, PCAT4, PCAT5, PCAT6, PCAT7, PCAT8, PCAT9, PCAT10, PCAT11, PCAT12, PCAT13, PCAT14, PCAT15, PCAT16, PCAT17, PCAT18, PCAT19, PCAT20, PCAT21, PCAT22, PCAT23, PCAT24, PCAT25, PCAT26, PCAT27, PCAT28, PCAT29, PCAT30, PCAT31, PCAT32, PCAT33, PCAT34, PCAT35, PCAT36, PCAT37, PCAT38, PCAT39, PCAT40, PCAT41, PCAT42, PCAT43, PCAT44, PCAT45, PCAT46, PCAT47, PCAT48, PCAT49, PCAT50, PCAT51, PCAT52, PCAT53, PCAT54, PCAT55, PCAT56, PCAT57, PCAT58, PCAT59, PCAT60, PCAT61, PCAT62, PCAT63, PCAT64, PCAT65, PCAT66, PCAT67, PCAT68, PCAT69, PCAT70, PCAT71, PCAT72, PCAT73, PCAT74, PCAT75, PCAT76, PCAT77, PCAT78, PCAT79, PCAT80, PCAT81, PCAT82, PCAT83, PCAT84, PCAT85, PCAT86, PCAT87, PCAT88, PCAT89, PCAT90, PCAT91, PCAT92, PCAT93, PCAT94, PCAT95, PCAT96, PCAT97, PCAT98, PCAT99, PCAT100, PCAT101, PCAT102, PCAT103, PCAT104, PCAT105, PCAT106, PCAT107, PCAT108, PCAT109, PCAT110, PCAT111, PCAT112, PCAT113, PCAT114, PCAT115, PCAT116, PCAT117, PCAT118, PCAT119, PCAT120, or PCAT121 is utilized in the research, screening, diagnostic and prognostic compositions and methods described herein.


i. DNA and RNA Detection


The ncRNAs of the present invention are detected using a variety of nucleic acid techniques known to those of ordinary skill in the art, including but not limited to: nucleic acid sequencing; nucleic acid hybridization; and, nucleic acid amplification.


1. Sequencing


Illustrative non-limiting examples of nucleic acid sequencing techniques include, but are not limited to, chain terminator (Sanger) sequencing and dye terminator sequencing. Those of ordinary skill in the art will recognize that, because RNA is less stable in the cell and more prone to nuclease attack experimentally, RNA is usually reverse transcribed to DNA before sequencing.


Chain terminator sequencing uses sequence-specific termination of a DNA synthesis reaction using modified nucleotide substrates. Extension is initiated at a specific site on the template DNA by using a short radioactive, or other labeled, oligonucleotide primer complementary to the template at that region. The oligonucleotide primer is extended using a DNA polymerase, standard four deoxynucleotide bases, and a low concentration of one chain terminating nucleotide, most commonly a di-deoxynucleotide. This reaction is repeated in four separate tubes with each of the bases taking turns as the di-deoxynucleotide. Limited incorporation of the chain terminating nucleotide by the DNA polymerase results in a series of related DNA fragments that are terminated only at positions where that particular di-deoxynucleotide is used. For each reaction tube, the fragments are size-separated by electrophoresis in a slab polyacrylamide gel or a capillary tube filled with a viscous polymer. The sequence is determined by reading which lane produces a visualized mark from the labeled primer as you scan from the top of the gel to the bottom.
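
By way of non-limiting illustration, the four-reaction logic described above can be sketched as a toy simulation. The sketch assumes idealized conditions (one terminated fragment at every position, perfect size resolution) and uses hypothetical function names; it shows only how ordering the bands of the four lanes by fragment length recovers the synthesized sequence.

```python
def sanger_lanes(synthesized: str) -> dict[str, list[int]]:
    """For each di-deoxynucleotide reaction (A, C, G, T), list the lengths of
    the fragments that terminate at that base in the synthesized strand."""
    lanes: dict[str, list[int]] = {base: [] for base in "ACGT"}
    for length, base in enumerate(synthesized.upper(), start=1):
        lanes[base].append(length)
    return lanes


def read_gel(lanes: dict[str, list[int]]) -> str:
    """Order all bands by fragment length (shortest first) and report which
    lane each band appears in, yielding the synthesized strand 5'->3'."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)
```

For example, a synthesized strand GATTACA produces bands of lengths 2, 5 and 7 in the A reaction, and reading all four lanes in order of increasing fragment length returns GATTACA.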


Dye terminator sequencing alternatively labels the terminators. Complete sequencing can be performed in a single reaction by labeling each of the di-deoxynucleotide chain-terminators with a separate fluorescent dye, which fluoresces at a different wavelength.


A variety of nucleic acid sequencing methods are contemplated for use in the methods of the present disclosure including, for example, chain terminator (Sanger) sequencing, dye terminator sequencing, and high-throughput sequencing methods. Many of these sequencing methods are well known in the art. See, e.g., Sanger et al., Proc. Natl. Acad. Sci. USA 74:5463-5467 (1977); Maxam et al., Proc. Natl. Acad. Sci. USA 74:560-564 (1977); Drmanac, et al., Nat. Biotechnol. 16:54-58 (1998); Kato, Int. J. Clin. Exp. Med. 2:193-202 (2009); Ronaghi et al., Anal. Biochem. 242:84-89 (1996); Margulies et al., Nature 437:376-380 (2005); Ruparel et al., Proc. Natl. Acad. Sci. USA 102:5932-5937 (2005); Harris et al., Science 320:106-109 (2008); Levene et al., Science 299:682-686 (2003); Korlach et al., Proc. Natl. Acad. Sci. USA 105:1176-1181 (2008); Branton et al., Nat. Biotechnol. 26(10):1146-53 (2008); and Eid et al., Science 323:133-138 (2009); each of which is herein incorporated by reference in its entirety.


2. Hybridization


Illustrative non-limiting examples of nucleic acid hybridization techniques include, but are not limited to, in situ hybridization (ISH), microarray, and Southern or Northern blot. In situ hybridization (ISH) is a type of hybridization that uses a labeled complementary DNA or RNA strand as a probe to localize a specific DNA or RNA sequence in a portion or section of tissue (in situ), or, if the tissue is small enough, the entire tissue (whole mount ISH). DNA ISH can be used to determine the structure of chromosomes. RNA ISH is used to measure and localize mRNAs and other transcripts (e.g., ncRNAs) within tissue sections or whole mounts. Sample cells and tissues are usually treated to fix the target transcripts in place and to increase access of the probe. The probe hybridizes to the target sequence at elevated temperature, and then the excess probe is washed away. The probe that was labeled with either radio-, fluorescent- or antigen-labeled bases is localized and quantitated in the tissue using either autoradiography, fluorescence microscopy or immunohistochemistry, respectively. ISH can also use two or more probes, labeled with radioactivity or the other non-radioactive labels, to simultaneously detect two or more transcripts.


In some embodiments, ncRNAs are detected using fluorescence in situ hybridization (FISH). In some embodiments, FISH assays utilize bacterial artificial chromosomes (BACs). These have been used extensively in the human genome sequencing project (see Nature 409: 953-958 (2001)) and clones containing specific BACs are available through distributors that can be located through many sources, e.g., NCBI. Each BAC clone from the human genome has been given a reference name that unambiguously identifies it. These names can be used to find a corresponding GenBank sequence and to order copies of the clone from a distributor.


The present invention further provides a method of performing a FISH assay on human prostate cells, human prostate tissue or on the fluid surrounding said human prostate cells or human prostate tissue. Specific protocols are well known in the art and can be readily adapted for the present invention. Guidance regarding methodology may be obtained from many references including: In situ Hybridization: Medical Applications (eds. G. R. Coulton and J. de Belleroche), Kluwer Academic Publishers, Boston (1992); In situ Hybridization: In Neurobiology; Advances in Methodology (eds. J. H. Eberwine, K. L. Valentino, and J. D. Barchas), Oxford University Press Inc., England (1994); In situ Hybridization: A Practical Approach (ed. D. G. Wilkinson), Oxford University Press Inc., England (1992); Kuo, et al., Am. J. Hum. Genet. 49:112-119 (1991); Klinger, et al., Am. J. Hum. Genet. 51:55-65 (1992); and Ward, et al., Am. J. Hum. Genet. 52:854-865 (1993). There are also kits that are commercially available and that provide protocols for performing FISH assays (available from e.g., Oncor, Inc., Gaithersburg, Md.). Patents providing guidance on methodology include U.S. Pat. Nos. 5,225,326; 5,545,524; 6,121,489 and 6,573,043. All of these references are hereby incorporated by reference in their entirety and may be used along with similar references in the art and with the information provided in the Examples section herein to establish procedural steps convenient for a particular laboratory.


3. Microarrays


Different kinds of biological assays are called microarrays including, but not limited to: DNA microarrays (e.g., cDNA microarrays and oligonucleotide microarrays); protein microarrays; tissue microarrays; transfection or cell microarrays; chemical compound microarrays; and, antibody microarrays. A DNA microarray, commonly known as a gene chip, DNA chip, or biochip, is a collection of microscopic DNA spots attached to a solid surface (e.g., glass, plastic or silicon chip) forming an array for the purpose of expression profiling or monitoring expression levels for thousands of genes simultaneously. The affixed DNA segments are known as probes, thousands of which can be used in a single DNA microarray. Microarrays can be used to identify disease genes or transcripts (e.g., ncRNAs) by comparing gene expression in disease and normal cells. Microarrays can be fabricated using a variety of technologies, including but not limited to: printing with fine-pointed pins onto glass slides; photolithography using pre-made masks; photolithography using dynamic micromirror devices; ink-jet printing; or, electrochemistry on microelectrode arrays.


Southern and Northern blotting are used to detect specific DNA or RNA sequences, respectively. DNA or RNA extracted from a sample is fragmented, electrophoretically separated on a matrix gel, and transferred to a membrane filter. The filter-bound DNA or RNA is subjected to hybridization with a labeled probe complementary to the sequence of interest. Hybridized probe bound to the filter is detected. A variant of the procedure is the reverse Northern blot, in which the substrate nucleic acid that is affixed to the membrane is a collection of isolated DNA fragments and the probe is RNA extracted from a tissue and labeled.


3. Amplification


Nucleic acids (e.g., ncRNAs) may be amplified prior to or simultaneous with detection. Illustrative non-limiting examples of nucleic acid amplification techniques include, but are not limited to, polymerase chain reaction (PCR), reverse transcription polymerase chain reaction (RT-PCR), transcription-mediated amplification (TMA), ligase chain reaction (LCR), strand displacement amplification (SDA), and nucleic acid sequence based amplification (NASBA). Those of ordinary skill in the art will recognize that certain amplification techniques (e.g., PCR) require that RNA be reverse transcribed to DNA prior to amplification (e.g., RT-PCR), whereas other amplification techniques directly amplify RNA (e.g., TMA and NASBA).


The polymerase chain reaction (U.S. Pat. Nos. 4,683,195, 4,683,202, 4,800,159 and 4,965,188, each of which is herein incorporated by reference in its entirety), commonly referred to as PCR, uses multiple cycles of denaturation, annealing of primer pairs to opposite strands, and primer extension to exponentially increase copy numbers of a target nucleic acid sequence. In a variation called RT-PCR, reverse transcriptase (RT) is used to make a complementary DNA (cDNA) from mRNA, and the cDNA is then amplified by PCR to produce multiple copies of DNA. For other various permutations of PCR see, e.g., U.S. Pat. Nos. 4,683,195, 4,683,202 and 4,800,159; Mullis et al., Meth. Enzymol. 155: 335 (1987); and, Murakawa et al., DNA 7: 287 (1988), each of which is herein incorporated by reference in its entirety.
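
By way of non-limiting illustration, the exponential increase in copy number described above can be modeled as N = N0 × (1 + E)^n, where N0 is the starting copy number, n is the number of cycles, and E is the per-cycle efficiency (E = 1 for ideal doubling). The efficiency parameter and the function name are assumptions introduced for this sketch; real reactions plateau and do not follow this model indefinitely.

```python
def pcr_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy number after a given number of PCR cycles.

    efficiency is the fraction of templates copied per cycle (0.0 to 1.0);
    a value of 1.0 corresponds to perfect doubling each cycle.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0.0 and 1.0")
    return initial_copies * (1.0 + efficiency) ** cycles
```

Under ideal doubling, 10 starting copies yield 10 × 2^30 (roughly 10^10) copies after 30 cycles, which illustrates why low-abundance transcripts become detectable after amplification.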


Transcription mediated amplification (U.S. Pat. Nos. 5,480,784 and 5,399,491, each of which is herein incorporated by reference in its entirety), commonly referred to as TMA, synthesizes multiple copies of a target nucleic acid sequence autocatalytically under conditions of substantially constant temperature, ionic strength, and pH in which multiple RNA copies of the target sequence autocatalytically generate additional copies. See, e.g., U.S. Pat. Nos. 5,399,491 and 5,824,518, each of which is herein incorporated by reference in its entirety. In a variation described in U.S. Publ. No. 20060046265 (herein incorporated by reference in its entirety), TMA optionally incorporates the use of blocking moieties, terminating moieties, and other modifying moieties to improve TMA process sensitivity and accuracy.


The ligase chain reaction (Weiss, R., Science 254: 1292 (1991), herein incorporated by reference in its entirety), commonly referred to as LCR, uses two sets of complementary DNA oligonucleotides that hybridize to adjacent regions of the target nucleic acid. The DNA oligonucleotides are covalently linked by a DNA ligase in repeated cycles of thermal denaturation, hybridization and ligation to produce a detectable double-stranded ligated oligonucleotide product.


Strand displacement amplification (Walker, G. et al., Proc. Natl. Acad. Sci. USA 89: 392-396 (1992); U.S. Pat. Nos. 5,270,184 and 5,455,166, each of which is herein incorporated by reference in its entirety), commonly referred to as SDA, uses cycles of annealing pairs of primer sequences to opposite strands of a target sequence, primer extension in the presence of a dNTPαS to produce a duplex hemiphosphorothioated primer extension product, endonuclease-mediated nicking of a hemimodified restriction endonuclease recognition site, and polymerase-mediated primer extension from the 3′ end of the nick to displace an existing strand and produce a strand for the next round of primer annealing, nicking and strand displacement, resulting in geometric amplification of product. Thermophilic SDA (tSDA) uses thermophilic endonucleases and polymerases at higher temperatures in essentially the same method (EP Pat. No. 0 684 315).


Other amplification methods include, for example: nucleic acid sequence based amplification (U.S. Pat. No. 5,130,238, herein incorporated by reference in its entirety), commonly referred to as NASBA; one that uses an RNA replicase to amplify the probe molecule itself (Lizardi et al., BioTechnol. 6: 1197 (1988), herein incorporated by reference in its entirety), commonly referred to as Qβ replicase; a transcription based amplification method (Kwoh et al., Proc. Natl. Acad. Sci. USA 86:1173 (1989)); and, self-sustained sequence replication (Guatelli et al., Proc. Natl. Acad. Sci. USA 87: 1874 (1990), each of which is herein incorporated by reference in its entirety). For further discussion of known amplification methods see Persing, David H., “In Vitro Nucleic Acid Amplification Techniques” in Diagnostic Medical Microbiology: Principles and Applications (Persing et al., Eds.), pp. 51-87 (American Society for Microbiology, Washington, D.C. (1993)).


4. Detection Methods


Non-amplified or amplified nucleic acids can be detected by any conventional means. For example, the ncRNAs can be detected by hybridization with a detectably labeled probe and measurement of the resulting hybrids. Illustrative non-limiting examples of detection methods are described below.


One illustrative detection method, the Hybridization Protection Assay (HPA), involves hybridizing a chemiluminescent oligonucleotide probe (e.g., an acridinium ester-labeled (AE) probe) to the target sequence, selectively hydrolyzing the chemiluminescent label present on unhybridized probe, and measuring the chemiluminescence produced from the remaining probe in a luminometer. See, e.g., U.S. Pat. No. 5,283,174 and Norman C. Nelson et al., Nonisotopic Probing, Blotting, and Sequencing, ch. 17 (Larry J. Kricka ed., 2d ed. 1995), each of which is herein incorporated by reference in its entirety.


Another illustrative detection method provides for quantitative evaluation of the amplification process in real-time. Evaluation of an amplification process in “real-time” involves determining the amount of amplicon in the reaction mixture either continuously or periodically during the amplification reaction, and using the determined values to calculate the amount of target sequence initially present in the sample. A variety of methods for determining the amount of initial target sequence present in a sample based on real-time amplification are well known in the art. These include methods disclosed in U.S. Pat. Nos. 6,303,305 and 6,541,205, each of which is herein incorporated by reference in its entirety. Another method for determining the quantity of target sequence initially present in a sample, but which is not based on a real-time amplification, is disclosed in U.S. Pat. No. 5,710,029, herein incorporated by reference in its entirety.
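
By way of non-limiting illustration, one commonly used way to calculate the initial amount of target from real-time data is a standard curve, in which the threshold cycle (Ct) of standards of known quantity is fit as a linear function of log10(quantity) and the fit is then inverted for an unknown sample. This approach is an assumption introduced here for illustration and is not asserted to be the method of the patents cited above; the function names are hypothetical.

```python
import math


def fit_standard_curve(quantities: list[float], ct_values: list[float]) -> tuple[float, float]:
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    if len(quantities) != len(ct_values) or len(quantities) < 2:
        raise ValueError("need at least two paired standards")
    xs = [math.log10(q) for q in quantities]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ct_values) / len(ct_values)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def initial_quantity(ct: float, slope: float, intercept: float) -> float:
    """Invert the standard curve to estimate the starting quantity."""
    return 10 ** ((ct - intercept) / slope)
```

In use, a dilution series of known standards is run alongside the test sample, the curve is fit once per run, and the Ct observed for the test sample is converted to an estimated starting quantity.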


Amplification products may be detected in real-time through the use of various self-hybridizing probes, most of which have a stem-loop structure. Such self-hybridizing probes are labeled so that they emit differently detectable signals, depending on whether the probes are in a self-hybridized state or an altered state through hybridization to a target sequence. By way of non-limiting example, “molecular torches” are a type of self-hybridizing probe that includes distinct regions of self-complementarity (referred to as “the target binding domain” and “the target closing domain”) which are connected by a joining region (e.g., non-nucleotide linker) and which hybridize to each other under predetermined hybridization assay conditions. In a preferred embodiment, molecular torches contain single-stranded base regions in the target binding domain that are from 1 to about 20 bases in length and are accessible for hybridization to a target sequence present in an amplification reaction under strand displacement conditions. Under strand displacement conditions, hybridization of the two complementary regions, which may be fully or partially complementary, of the molecular torch is favored, except in the presence of the target sequence, which will bind to the single-stranded region present in the target binding domain and displace all or a portion of the target closing domain. The target binding domain and the target closing domain of a molecular torch include a detectable label or a pair of interacting labels (e.g., luminescent/quencher) positioned so that a different signal is produced when the molecular torch is self-hybridized than when the molecular torch is hybridized to the target sequence, thereby permitting detection of probe:target duplexes in a test sample in the presence of unhybridized molecular torches. Molecular torches and a variety of types of interacting label pairs are disclosed in U.S. Pat. No. 6,534,274, herein incorporated by reference in its entirety.


Another example of a detection probe having self-complementarity is a “molecular beacon.” Molecular beacons include nucleic acid molecules having a target complementary sequence, an affinity pair (or nucleic acid arms) holding the probe in a closed conformation in the absence of a target sequence present in an amplification reaction, and a label pair that interacts when the probe is in a closed conformation. Hybridization of the target sequence and the target complementary sequence separates the members of the affinity pair, thereby shifting the probe to an open conformation. The shift to the open conformation is detectable due to reduced interaction of the label pair, which may be, for example, a fluorophore and a quencher (e.g., DABCYL and EDANS). Molecular beacons are disclosed in U.S. Pat. Nos. 5,925,517 and 6,150,097, each of which is herein incorporated by reference in its entirety.


Other self-hybridizing probes are well known to those of ordinary skill in the art. By way of non-limiting example, probe binding pairs having interacting labels, such as those disclosed in U.S. Pat. No. 5,928,862 (herein incorporated by reference in its entirety) might be adapted for use in the present invention. Probe systems used to detect single nucleotide polymorphisms (SNPs) might also be utilized in the present invention. Additional detection systems include “molecular switches,” as disclosed in U.S. Publ. No. 20050042638, herein incorporated by reference in its entirety. Other probes, such as those comprising intercalating dyes and/or fluorochromes, are also useful for detection of amplification products in the present invention. See, e.g., U.S. Pat. No. 5,814,447 (herein incorporated by reference in its entirety).


ii. Data Analysis


In some embodiments, a computer-based analysis program is used to translate the raw data generated by the detection assay (e.g., the presence, absence, or amount of a given marker or markers) into data of predictive value for a clinician. The clinician can access the predictive data using any suitable means. Thus, in some preferred embodiments, the present invention provides the further benefit that the clinician, who is not likely to be trained in genetics or molecular biology, need not understand the raw data. The data is presented directly to the clinician in its most useful form. The clinician is then able to immediately utilize the information in order to optimize the care of the subject.


The present invention contemplates any method capable of receiving, processing, and transmitting the information to and from laboratories conducting the assays, information providers, medical personnel, and subjects. For example, in some embodiments of the present invention, a sample (e.g., a biopsy or a serum or urine sample) is obtained from a subject and submitted to a profiling service (e.g., clinical lab at a medical facility, genomic profiling business, etc.), located in any part of the world (e.g., in a country different than the country where the subject resides or where the information is ultimately used) to generate raw data. Where the sample comprises a tissue or other biological sample, the subject may visit a medical center to have the sample obtained and sent to the profiling center, or subjects may collect the sample themselves (e.g., a urine sample) and directly send it to a profiling center. Where the sample comprises previously determined biological information, the information may be directly sent to the profiling service by the subject (e.g., an information card containing the information may be scanned by a computer and the data transmitted to a computer of the profiling center using an electronic communication system). Once received by the profiling service, the sample is processed and a profile is produced (i.e., expression data), specific for the diagnostic or prognostic information desired for the subject.


The profile data is then prepared in a format suitable for interpretation by a treating clinician. For example, rather than providing raw expression data, the prepared format may represent a diagnosis or risk assessment (e.g., presence or absence of a ncRNA) for the subject, along with recommendations for particular treatment options. The data may be displayed to the clinician by any suitable method. For example, in some embodiments, the profiling service generates a report that can be printed for the clinician (e.g., at the point of care) or displayed to the clinician on a computer monitor.


In some embodiments, the information is first analyzed at the point of care or at a regional facility. The raw data is then sent to a central processing facility for further analysis and/or to convert the raw data to information useful for a clinician or patient. The central processing facility provides the advantage of privacy (all data is stored in a central facility with uniform security protocols), speed, and uniformity of data analysis. The central processing facility can then control the fate of the data following treatment of the subject. For example, using an electronic communication system, the central facility can provide data to the clinician, the subject, or researchers.


In some embodiments, the subject is able to directly access the data using the electronic communication system. The subject may choose further intervention or counseling based on the results. In some embodiments, the data is used for research use. For example, the data may be used to further optimize the inclusion or elimination of markers as useful indicators of a particular condition or stage of disease or as a companion diagnostic to determine a treatment course of action.


iii. In Vivo Imaging


ncRNAs may also be detected using in vivo imaging techniques, including but not limited to: radionuclide imaging; positron emission tomography (PET); computerized axial tomography, X-ray or magnetic resonance imaging method, fluorescence detection, and chemiluminescent detection. In some embodiments, in vivo imaging techniques are used to visualize the presence of or expression of cancer markers in an animal (e.g., a human or non-human mammal). For example, in some embodiments, cancer marker mRNA or protein is labeled using a labeled antibody specific for the cancer marker. A specifically bound and labeled antibody can be detected in an individual using an in vivo imaging method, including, but not limited to, radionuclide imaging, positron emission tomography, computerized axial tomography, X-ray or magnetic resonance imaging method, fluorescence detection, and chemiluminescent detection. Methods for generating antibodies to the cancer markers of the present invention are described below.


The in vivo imaging methods of embodiments of the present invention are useful in the identification of cancers that express ncRNAs (e.g., prostate cancer). In vivo imaging is used to visualize the presence or level of expression of a ncRNA. Such techniques allow for diagnosis without the use of an unpleasant biopsy. The in vivo imaging methods of embodiments of the present invention can further be used to detect metastatic cancers in other parts of the body.


In some embodiments, reagents (e.g., antibodies) specific for the cancer markers of the present invention are fluorescently labeled. The labeled antibodies are introduced into a subject (e.g., orally or parenterally). Fluorescently labeled antibodies are detected using any suitable method (e.g., using the apparatus described in U.S. Pat. No. 6,198,107, herein incorporated by reference).


In other embodiments, antibodies are radioactively labeled. The use of antibodies for in vivo diagnosis is well known in the art. Sumerdon et al. (Nucl. Med. Biol 17:247-254 [1990]) have described an optimized antibody-chelator for the radioimmunoscintigraphic imaging of tumors using Indium-111 as the label. Griffin et al. (J Clin Onc 9:631-640 [1991]) have described the use of this agent in detecting tumors in patients suspected of having recurrent colorectal cancer. The use of similar agents with paramagnetic ions as labels for magnetic resonance imaging is known in the art (Lauffer, Magnetic Resonance in Medicine 22:339-342 [1991]). The label used will depend on the imaging modality chosen. Radioactive labels such as Indium-111, Technetium-99m, or Iodine-131 can be used for planar scans or single photon emission computed tomography (SPECT). Positron emitting labels such as Fluorine-18 can also be used for positron emission tomography (PET). For MRI, paramagnetic ions such as Gadolinium (III) or Manganese (II) can be used.


Radioactive metals with half-lives ranging from 1 hour to 3.5 days are available for conjugation to antibodies, such as scandium-47 (3.5 days), gallium-67 (2.8 days), gallium-68 (68 minutes), technetium-99m (6 hours), and indium-111 (3.2 days), of which gallium-67, technetium-99m, and indium-111 are preferable for gamma camera imaging, and gallium-68 is preferable for positron emission tomography.


A useful method of labeling antibodies with such radiometals is by means of a bifunctional chelating agent, such as diethylenetriaminepentaacetic acid (DTPA), as described, for example, by Khaw et al. (Science 209:295 [1980]) for In-111 and Tc-99m, and by Scheinberg et al. (Science 215:1511 [1982]). Other chelating agents may also be used, but the 1-(p-carboxymethoxybenzyl)EDTA and the carboxycarbonic anhydride of DTPA are advantageous because their use permits conjugation without affecting the antibody's immunoreactivity substantially.


Another method for coupling DTPA to proteins is by use of the cyclic anhydride of DTPA, as described by Hnatowich et al. (Int. J. Appl. Radiat. Isot. 33:327 [1982]) for labeling of albumin with In-111, but which can be adapted for labeling of antibodies. A suitable method of labeling antibodies with Tc-99m which does not use chelation with DTPA is the pretinning method of Crockford et al., (U.S. Pat. No. 4,323,546, herein incorporated by reference).


A method of labeling immunoglobulins with Tc-99m is that described by Wong et al. (Int. J. Appl. Radiat. Isot., 29:251 [1978]) for plasma protein, and recently applied successfully by Wong et al. (J. Nucl. Med., 23:229 [1981]) for labeling antibodies.


In the case of the radiometals conjugated to the specific antibody, it is likewise desirable to introduce as high a proportion of the radiolabel as possible into the antibody molecule without destroying its immunospecificity. A further improvement may be achieved by effecting radiolabeling in the presence of the ncRNA, to insure that the antigen binding site on the antibody will be protected. The antigen is separated after labeling.


In still further embodiments, in vivo biophotonic imaging (Xenogen, Alameda, Calif.) is utilized for in vivo imaging. This real-time in vivo imaging utilizes luciferase. The luciferase gene is incorporated into cells, microorganisms, and animals (e.g., as a fusion protein with a cancer marker of the present invention). When active, it leads to a reaction that emits light. A CCD camera and software are used to capture the image and analyze it.


iv. Compositions & Kits


Compositions for use in the diagnostic methods described herein include, but are not limited to, probes, amplification oligonucleotides, and the like.


The probe and antibody compositions of the present invention may also be provided in the form of an array.


II. Drug Screening Applications

In some embodiments, the present invention provides drug screening assays (e.g., to screen for anticancer drugs). The screening methods of the present invention utilize ncRNAs. For example, in some embodiments, the present invention provides methods of screening for compounds that alter (e.g., decrease) the expression or activity of ncRNAs. The compounds or agents may interfere with transcription, by interacting, for example, with the promoter region. The compounds or agents may interfere with mRNA (e.g., by RNA interference, antisense technologies, etc.). The compounds or agents may interfere with pathways that are upstream or downstream of the biological activity of ncRNAs. In some embodiments, candidate compounds are antisense or interfering RNA agents (e.g., oligonucleotides) directed against ncRNAs. In other embodiments, candidate compounds are antibodies or small molecules that specifically bind to a ncRNA regulator or expression product and inhibit its biological function.


In one screening method, candidate compounds are evaluated for their ability to alter ncRNA expression by contacting a compound with a cell expressing a ncRNA and then assaying for the effect of the candidate compounds on expression. In some embodiments, the effect of candidate compounds on expression of ncRNAs is assayed for by detecting the level of ncRNA expressed by the cell. mRNA expression can be detected by any suitable method.


EXPERIMENTAL

The following examples are provided in order to demonstrate and further illustrate certain preferred embodiments and aspects of the present invention and are not to be construed as limiting the scope thereof.


Example 1
A. Methods

Methods Summary


All prostate tissue samples were obtained from the University of Michigan Specialized Program Of Research Excellence (S.P.O.R.E.) using an IRB-approved informed consent protocol. Next generation sequencing and library preparation were performed as previously described (Maher et al., Proc Natl Acad Sci USA 106 (30), 12353 (2009)). Uniquely mapping sequencing reads were aligned with TopHat and sequencing data for all samples was merged. Ab initio transcriptome assembly was performed by aligning sequences with TopHat and using uniquely mapped read positions to build transcripts with Cufflinks. Informatics approaches were used to refine the assembly and predict expressed transcriptional units. Unannotated transcripts were nominated based upon their absence in the UCSC, RefSeq, ENSEMBL, ENCODE, and Vega databases. Differential expression was determined using the Significance Analysis of Microarrays (SAM) algorithm (Tusher et al., Proc Natl Acad Sci USA 98 (9), 5116 (2001)) on log2 mean expression in benign, cancer, and metastatic samples. Cancer outlier profile analysis (COPA) was performed as previously described (Tomlins et al., Science 310 (5748), 644 (2005)) with slight modifications. PCR experiments were performed according to standard protocols, and RACE was performed with the GeneRacer Kit (Invitrogen) according to manufacturer's instructions. ChIP-seq data was obtained from previously published data (Yu et al., Cancer Cell 17 (5), 443). siRNA knockdown was performed with custom siRNA oligos (Dharmacon) with Oligofectamine (Invitrogen). Transmembrane invasion assays were performed with Matrigel (BD Biosciences) and cell proliferation assays were performed by cell count with a Coulter counter. Urine analyses were performed as previously described (Laxman et al., Cancer Res 68 (3), 645 (2008)) with minor modifications.


Cell Lines and Tissues


The benign immortalized prostate cell line RWPE as well as PC3, Du145, LNCaP, VCaP, 22Rv1, CWR22, C4-2B, NCI-H660, MDA PCa 2b, WPMY-1, and LAPC-4 prostate cell lines were obtained from the American Type Culture Collection (Manassas, Va.). Benign non-immortalized prostate epithelial cells (PrEC) and prostate smooth muscle cells (PrSMC) were obtained from Lonza (Basel, Switzerland). Cell lines were maintained using standard media and conditions. For androgen treatment experiments, LNCaP and VCaP cells were grown in androgen depleted media lacking phenol red and supplemented with 10% charcoal-stripped serum and 1% penicillin-streptomycin. After 48 hours, cells were treated with 5 nM methyltrienolone (R1881, NEN Life Science Products) or an equivalent volume of ethanol. Cells were harvested for RNA at 6, 24, and 48 hours post-treatment. Prostate tissues were obtained from the radical prostatectomy series and Rapid Autopsy Program at the University of Michigan tissue core. These programs are part of the University of Michigan Prostate Cancer Specialized Program Of Research Excellence (S.P.O.R.E.). All tissue samples were collected with informed consent under an Institutional Review Board (IRB) approved protocol at the University of Michigan.


PC3, Du145, LNCaP, 22Rv1, and CWR22 cells were grown in RPMI 1640 (Invitrogen) and supplemented with 10% fetal bovine serum (FBS) and 1% penicillin-streptomycin. LNCaP CDS parent cells were grown in RPMI 1640 lacking phenol red (Invitrogen) supplemented with 10% charcoal-dextran stripped FBS (Invitrogen) and 1% penicillin-streptomycin. LNCaP CDS 1, 2, and 3 are androgen-independent subclones derived from extended cell culture in androgen-depleted media. VCaP and WPMY-1 cells were grown in DMEM (Invitrogen) and supplemented with 10% fetal bovine serum (FBS) with 1% penicillin-streptomycin. NCI-H660 cells were grown in RPMI 1640 supplemented with 0.005 mg/ml insulin, 0.01 mg/ml transferrin, 30 nM sodium selenite, 10 nM hydrocortisone, 10 nM beta-estradiol, 5% FBS and an extra 2 mM of L-glutamine (for a final concentration of 4 mM). MDA PCa 2b cells were grown in F-12K medium (Invitrogen) supplemented with 20% FBS, 25 ng/ml cholera toxin, 10 ng/ml EGF, 0.005 mM phosphoethanolamine, 100 pg/ml hydrocortisone, 45 nM selenious acid, and 0.005 mg/ml insulin. LAPC-4 cells were grown in Iscove's media (Invitrogen) supplemented with 10% FBS and 1 nM R1881. C4-2B cells were grown in 80% DMEM supplemented with 20% F12, 5% FBS, 3 g/L NaHCO3, 5 μg/ml insulin, 13.6 pg/ml triiodothyronine, 5 μg/ml transferrin, 0.25 μg/ml biotin, and 25 μg/ml adenine. PrEC cells were grown in PrEGM supplemented with 2 ml BPE, 0.5 ml hydrocortisone, 0.5 ml EGF, 0.5 ml epinephrine, 0.5 ml transferrin, 0.5 ml insulin, 0.5 ml retinoic acid, and 0.5 ml triiodothyronine, as part of the PrEGM BulletKit (Lonza). PrSMC cells were grown in SmGM-2 media supplemented with 2 ml BPE, 0.5 ml hydrocortisone, 0.5 ml EGF, 0.5 ml epinephrine, 0.5 ml transferrin, 0.5 ml insulin, 0.5 ml retinoic acid, and 0.5 ml triiodothyronine, as part of the SmGM-2 BulletKit (Lonza).


RNA-Seq Library Preparation.


Next generation sequencing of RNA was performed on 21 prostate cell lines, 20 benign adjacent prostates, 47 localized tumors, and 14 metastatic tumors according to Illumina's protocol using 2 μg of RNA. RNA integrity was measured using an Agilent 2100 Bioanalyzer, and only samples with a RIN score >7.0 were advanced for library generation. RNA was poly-A+ selected using the oligo-dT beads provided by Illumina and fragmented with the Ambion Fragmentation Reagents kit (Ambion, Austin, Tex.). cDNA synthesis, end-repair, A-base addition, and ligation of the Illumina PCR adaptors (single read or paired-end where appropriate) were performed according to Illumina's protocol. Libraries were then size-selected for 250-300 bp cDNA fragments on a 3.5% agarose gel and PCR-amplified using Phusion DNA polymerase (Finnzymes) for 15-18 PCR cycles. PCR products were then purified on a 2% agarose gel and gel-extracted. Library quality was credentialed by assaying each library on an Agilent 2100 Bioanalyzer for product size and concentration. Libraries were sequenced as 36-45mers on an Illumina Genome Analyzer I or Genome Analyzer II flowcell according to Illumina's protocol. All single read samples were sequenced on a Genome Analyzer I, and all paired-end samples were sequenced on a Genome Analyzer II.


RNA Isolation and cDNA Synthesis


Total RNA was isolated using Trizol and an RNeasy Kit (Invitrogen) with DNase I digestion according to the manufacturer's instructions. RNA integrity was verified on an Agilent Bioanalyzer 2100 (Agilent Technologies, Palo Alto, Calif.). cDNA was synthesized from total RNA using Superscript III (Invitrogen) and random primers (Invitrogen).


Quantitative Real-Time PCR


Quantitative Real-time PCR (qPCR) was performed using Power SYBR Green Mastermix (Applied Biosystems, Foster City, Calif.) on an Applied Biosystems 7900HT Real-Time PCR System. All oligonucleotide primers were obtained from Integrated DNA Technologies (Coralville, Iowa) and are listed in Table 13. The housekeeping gene, GAPDH, was used as a loading control. Fold changes were calculated relative to GAPDH and normalized to the median value of the benign samples.


Reverse-Transcription PCR


Reverse-transcription PCR (RT-PCR) was performed for primer pairs using Platinum Taq High Fidelity polymerase (Invitrogen). PCR products were resolved on a 2% agarose gel. PCR products were either sequenced directly (if only a single product was observed) or appropriate gel products were extracted using a Gel Extraction kit (Qiagen) and cloned into pCR4-TOPO vectors (Invitrogen). PCR products were bidirectionally sequenced at the University of Michigan Sequencing Core using either gene-specific primers or M13 forward and reverse primers for cloned PCR products. All oligonucleotide primers were obtained from Integrated DNA Technologies (Coralville, Iowa) and are listed in Table 13.


RNA-Ligase-Mediated Rapid Amplification of cDNA Ends (RACE)


5′ and 3′ RACE was performed using the GeneRacer RLM-RACE kit (Invitrogen) according to the manufacturer's instructions. RACE PCR products were obtained using Platinum Taq High Fidelity polymerase (Invitrogen), the supplied GeneRacer primers, and appropriate gene-specific primers indicated in Table 13. RACE PCR products were separated on 2% agarose gels. Gel products were extracted with a Gel Extraction kit (Qiagen), cloned into pCR4-TOPO vectors (Invitrogen), and sequenced bidirectionally using M13 forward and reverse primers at the University of Michigan Sequencing Core. At least three colonies were sequenced for every gel product that was purified.


Paired-End Next-Generation Sequencing of RNA


2 μg total RNA was selected for polyA+ RNA using Sera-Mag oligo(dT) beads (Thermo Scientific), and paired-end next-generation sequencing libraries were prepared as previously described (Maher et al., supra) using Illumina-supplied universal adaptor oligos and PCR primers (Illumina). Samples were sequenced in a single lane on an Illumina Genome Analyzer II flowcell using previously described protocols (Maher et al., supra). 36-45 mer paired-end reads were generated according to the protocol provided by Illumina.


siRNA Knockdown Studies


Cells were plated in 100 mm plates at a desired concentration and transfected with 20 μM experimental siRNA oligos or non-targeting controls twice, at 12 hours and 36 hours post-plating. Knockdowns were performed with Oligofectamine and Optimem. Knockdown efficiency was determined by qPCR. 72 hours post-transfection, cells were trypsinized, counted with a Coulter counter, and diluted to 1 million cells/mL. For proliferation assays, 200,000 cells were plated in 24-well plates and grown in regular media. 48 and 96 hours post-plating, cells were harvested and counted using a Coulter counter. For invasion assays, Matrigel was diluted 1:4 in serum-free media and 100 μL of the diluted Matrigel was applied to a Boyden chamber transmembrane insert and allowed to settle overnight at 37° C. 200,000 cells suspended in serum-free media were applied per insert and 500 μL of serum-containing media was placed in the bottom of the Boyden chamber (fetal bovine serum functioning as a chemoattractant). Cells were allowed to invade for 48 hours, at which time inserts were removed and noninvading cells and Matrigel were gently removed with a cotton swab. Invading cells were stained with crystal violet for 15 minutes and air-dried. For colorimetric assays, the inserts were treated with 200 μl of 10% acetic acid and the absorbance at 560 nm was measured using a spectrophotometer. For WST-1 assays, 20,000 cells were plated into 96-well plates and grown in 100 μL of serum-containing media. 48 and 96 hours post-plating, cells were measured for viability by adding 10 μL of WST-1 reagent to the cell media, incubating for 2 hours at 37° C. and measuring the absorbance at 450 nm using a spectrophotometer.


Urine qPCR


Urine samples were collected from 120 patients with informed consent following a digital rectal exam before either needle biopsy or radical prostatectomy at the University of Michigan with Institutional Review Board approval as described previously (Laxman et al., Cancer Res 68 (3), 645 (2008)). Isolation of RNA from urine and TransPlex whole transcriptome amplification were performed as described previously (Laxman et al., Neoplasia 8 (10), 885 (2006)). qPCR on urine samples was performed for KLK3 (PSA), TMPRSS2-ERG, GAPDH, PCA3, PCAT-1 and PCAT-14 using Power SYBR Mastermix (Applied Biosystems) as described above. Raw Ct values were extracted and normalized in the following manner. First, samples with GAPDH Ct values >25 or KLK3 Ct values >30 were removed from analysis to ensure sufficient prostate cell collection, leaving 108 samples for analysis. The GAPDH and KLK3 raw Ct values were averaged for each sample. ΔCt analysis was performed by measuring each value against the average of CtGAPDH and CtKLK3, and ΔCt values were normalized to the median ΔCt of the benign samples. Fold change was then calculated as 2^(-ΔCt). Samples were considered to be prostate cancer if histopathological analysis observed cancer or if the TMPRSS2-ERG transcript achieved a Ct value <37. Benign samples were defined as samples with normal histology and TMPRSS2-ERG transcript Ct values >37.
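The normalization steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code actually used: the function names are hypothetical, and the sign convention (ΔCt taken as target Ct minus reference Ct) is assumed rather than stated in the text.

```python
from statistics import median

GAPDH_MAX_CT = 25.0  # cutoff from the text: samples with GAPDH Ct > 25 are removed
KLK3_MAX_CT = 30.0   # cutoff from the text: samples with KLK3 Ct > 30 are removed


def passes_qc(ct_gapdh, ct_klk3):
    """Keep a sample only if both control Ct values are within the cutoffs."""
    return ct_gapdh <= GAPDH_MAX_CT and ct_klk3 <= KLK3_MAX_CT


def delta_ct(ct_target, ct_gapdh, ct_klk3):
    """Delta-Ct of a target against the mean of the GAPDH and KLK3 Ct values.

    Assumption: delta-Ct is target minus reference.
    """
    return ct_target - (ct_gapdh + ct_klk3) / 2.0


def fold_changes(delta_cts, benign_delta_cts):
    """Normalize delta-Ct values to the benign median and return 2^(-delta)."""
    reference = median(benign_delta_cts)
    return [2.0 ** -(d - reference) for d in delta_cts]
```

With this convention, a sample whose normalized ΔCt is two cycles below the benign median has a fold change of 4.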


Statistical Analyses for Experimental Studies


All data are presented as means±s.e.m. All experimental assays were performed in duplicate or triplicate.


Bioinformatics Analyses


To achieve an ab initio prediction of the prostate cancer transcriptome, existing publicly available tools for mapping, assembly, and quantification of transcripts were supplemented with additional informatics filtering steps to enrich the results for the most robust transcript predictions (FIG. 6a). Transcripts were then identified and classified by comparing them against gene annotation databases (FIG. 6b). Details of the bioinformatics analyses are provided below.


Mapping Reads with TopHat


Reads were aligned using TopHat v1.0.13 (Feb. 5, 2010) (Trapnell et al., Bioinformatics 25, 1105-11 (2009)), a gapped aligner capable of discovering splice junctions ab initio. Briefly, TopHat aligns reads to the human genome using Bowtie (Langmead et al., Genome Biol 10, R25 (2009)) to determine a set of “coverage islands” that may represent putative exons. TopHat uses these exons as well as the presence of GT-AG genomic splicing motifs to build a second set of reference sequences spanning exon-exon junctions. The unmapped reads from the initial genome alignment step are then remapped against this splice junction reference to discover all the junction-spanning reads in the sample. TopHat outputs the reads that successfully map to either the genome or the splice junction reference in SAM format for further analysis. For this study a maximum intron size of 500 kb, corresponding to over 99.98% of RefSeq (Wheeler et al. Nucleic Acids Res 28, 10-4 (2000)) introns was used. For sequencing libraries the insert size was determined using an Agilent 2100 Bioanalyzer prior to data analysis, and it was found that this insert size agreed closely with software predictions. An insert size standard deviation of 20 bases was chosen in order to match the most common band size cut from gels during library preparation. In total, 1.723 billion fragments were generated from 201 lanes of sequencing on the Illumina Genome Analyzer and Illumina Genome Analyzer II. Reads were mapped to the human genome (hg18) downloaded from the UCSC genome browser website (Karolchik et al., Nucleic Acids Res 31, 51-4 (2003); Kent et al., Genome Res 12, 996-1006 (2002)). 1.418 billion unique alignments were obtained, including 114.4 million splice junctions for use in transcriptome assembly. Reads with multiple alignments with less than two mismatches were discarded.


Ab Initio Assembly and Quantification with Cufflinks


Aligned reads from TopHat were assembled into sample-specific transcriptomes with Cufflinks version 0.8.2 (Mar. 26, 2010) (Trapnell et al., Nat Biotechnol 28, 511-5). Cufflinks assembles exonic and splice-junction reads into transcripts using their alignment coordinates. To limit false positive assemblies, a maximum intronic length of 300 kb, corresponding to the 99.93rd percentile of known introns, was used. After assembling transcripts, Cufflinks computes isoform-level abundances by finding a parsimonious allocation of reads to the transcripts within a locus. Transcripts with abundance less than 15% of the major transcript in the locus, and minor isoforms with abundance less than 5% of the major isoform, were filtered. Default settings were used for the remaining parameters.
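The abundance thresholds above amount to dropping any transcript whose abundance falls below a fixed fraction of the most abundant transcript in its locus. The sketch below illustrates only that thresholding idea; it is not Cufflinks' internal implementation, and the function name and data layout are hypothetical.

```python
def filter_by_major_fraction(abundances, min_fraction):
    """Drop transcripts below min_fraction of the locus maximum.

    abundances: dict mapping transcript id -> abundance within one locus.
    Returns a new dict containing only the transcripts that are kept.
    """
    if not abundances:
        return {}
    major = max(abundances.values())
    cutoff = min_fraction * major
    return {tid: a for tid, a in abundances.items() if a >= cutoff}
```

For a locus with abundances of 100, 20, and 10, a 15% threshold keeps the first two transcripts, while a 5% threshold keeps all three.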


The Cufflinks assembly stage yielded a set of transcript annotations for each of the sequenced libraries. The transcripts were partitioned by chromosome and the Cuffcompare utility provided by Cufflinks was used to merge the transcripts into a combined set of annotations. The Cuffcompare program performs a union of all transcripts by merging transcripts that share all introns and exons. The 5′ and 3′ exons of transcripts were allowed to vary by up to 100 nt during the comparison process.
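The merge criterion described above (shared introns, with the outer 5′ and 3′ boundaries allowed to differ by up to 100 nt) can be sketched as a simple comparison. This is an illustrative approximation, not the Cuffcompare source; the representation of a transcript as a sorted list of (start, end) exon tuples on one strand is an assumption made for the example.

```python
def intron_chain(exons):
    """Return the introns implied by a sorted list of (start, end) exons."""
    return [(exons[i][1], exons[i + 1][0]) for i in range(len(exons) - 1)]


def mergeable(exons_a, exons_b, tolerance=100):
    """True if two same-strand transcripts share every intron and their
    outer (5' and 3') boundaries differ by at most `tolerance` nt."""
    if intron_chain(exons_a) != intron_chain(exons_b):
        return False
    start_diff = abs(exons_a[0][0] - exons_b[0][0])
    end_diff = abs(exons_a[-1][1] - exons_b[-1][1])
    return start_diff <= tolerance and end_diff <= tolerance
```

Because identical intron chains imply identical internal exon boundaries, only the two terminal coordinates need the tolerance check.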


Distinguishing Transcripts from Background Signal


Cuffcompare reported a total of 8.25 million distinct transcripts. Manual inspection of these transcripts in known protein coding gene regions indicated that most of the transcripts were likely to be poor quality reconstructions of overlapping larger transcripts. Also, many of the transcripts were unspliced and had a total length smaller than the size selected fragment length of approximately 250 nt. Furthermore, many of these transcripts were only present in a single sample. A statistical classifier to predict transcripts over background signal was designed to identify highly recurrent transcripts that may be altered in prostate cancer. AceView annotations (Thierry-Mieg et al. Genome Biol 7 Suppl 1, S12 1-14 (2006)) were used as the reference for annotated genes. For each transcript predicted by Cufflinks the following statistics were collected: length (bp), number of exons, recurrence (number of samples in which the transcript was predicted), 95th percentile of abundance (measured in Fragments per Kilobase per Million reads (FPKM)) across all samples, and uniqueness of genomic DNA harboring the transcript (measured using the Rosetta uniqueness track from UCSC (Rhead et al. 2010. Nucleic Acids Res 38, D613-9)). Using this information, recursive partitioning and regression trees in R (package rpart) were used to predict, for each transcript, whether its expression patterns and structural properties resembled those of annotated genes. Classification was performed independently for each chromosome in order to incorporate the effect of gene density variability on expression thresholds. Transcripts that were not classified as annotated genes were discarded, and the remainder were subjected to additional analysis and filtering steps. By examining the decision tree results it was observed that the 95th percentile of expression across all samples as well as the recurrence of each transcript were most frequently the best predictors of expressed versus background transcripts (FIG. 7).
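Recursive partitioning builds a decision tree by repeatedly choosing the feature threshold that best separates the two classes. The sketch below shows one such split on a single feature (for example, recurrence) using Gini impurity. It is written in Python for illustration only and is not the rpart implementation; the function names and the choice of Gini impurity as the split criterion are assumptions.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(values, labels):
    """Find the threshold on one feature that minimizes weighted Gini impurity.

    Returns (threshold, impurity); samples with value <= threshold go left.
    The threshold is None if no split improves on the unsplit impurity.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold can separate equal feature values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
        if impurity < best[1]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2.0
            best = (threshold, impurity)
    return best
```

A full tree would apply the same search to every feature, keep the best split, and recurse on the two resulting subsets.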


Refinement of Transcript Fragments


The statistical classifier predicted a total of 2.88 million (34.9%) transcript fragments as “expressed” transcripts. A program was developed to extend and merge intron-redundant transcripts to produce a minimum set of transcripts that describes the assemblies produced by Cufflinks. The merging step produced a total of 123,554 independent transcripts. Transcript abundance levels were re-computed for these revised transcripts in Reads per Kilobase per Million (RPKM) units. These expression levels were used for the remainder of the study. Several additional filtering steps were used to isolate the most robust transcripts. First, transcripts with a total length less than 200 nt were discarded. Single exon transcripts with greater than 75% overlap to another longer transcript were also discarded. Transcripts that lacked a completely unambiguous genomic DNA stretch of at least 40 nt were also removed. Genomic uniqueness was measured using the Rosetta uniqueness track downloaded from the UCSC genome browser website. Transcripts that were not present in at least 5% of the cohort (>5 samples) at more than 5.0 RPKM were discarded.
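The RPKM calculation and two of the filters above can be sketched as follows. This is a minimal illustration, not the program used in the study; the function names are hypothetical, and the sketch assumes that transcripts failing the length or recurrence thresholds are the ones removed.

```python
def rpkm(read_count, transcript_length_nt, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (transcript_length_nt * total_mapped_reads)


def passes_length(transcript_length_nt, min_length=200):
    """True if the transcript is at least min_length nt long."""
    return transcript_length_nt >= min_length


def passes_recurrence(rpkm_by_sample, min_rpkm=5.0, min_samples=5):
    """True if the transcript exceeds min_rpkm in more than min_samples samples."""
    n_expressed = sum(1 for value in rpkm_by_sample if value > min_rpkm)
    return n_expressed > min_samples
```

For example, 500 reads on a 2,000 nt transcript in a library of 10 million mapped reads corresponds to 25 RPKM.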


In certain instances transcripts were observed that were interrupted by poorly mappable genomic regions. Additionally, for low abundance genes, fragmentation due to the lack of splice junction or paired-end read evidence needed to connect nearby fragments was observed. The difference in the Pearson correlation between expression of randomly chosen exons on the same transcript versus expression of spatially proximal exons on different transcripts was measured and it was found that in the cohort, a Pearson correlation >0.8 had a positive predictive value (PPV) of >95% for distinct exons to be part of the same transcript. Using this criterion, hierarchical agglomerative clustering to extend transcript fragments into larger transcriptional units was performed. Pairs of transcripts further than 100 kb apart, transcripts on opposite strands, and overlapping transcripts were not considered for clustering. Groups of correlated transcripts were merged, and introns <40 nt in length were removed.
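The pairwise test that gates this clustering can be sketched as below. The code shows only the eligibility check for one pair of fragments (same chromosome and strand, non-overlapping, within 100 kb, Pearson correlation >0.8); the agglomerative clustering itself is not shown. The dictionary layout and function names are hypothetical, and coordinates are assumed to be half-open.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: treat constant profiles as uncorrelated
    return cov / sqrt(var_x * var_y)


def can_cluster(a, b, max_gap=100_000, min_r=0.8):
    """a, b: dicts with 'chrom', 'strand', 'start', 'end', 'expr' keys."""
    if a["chrom"] != b["chrom"] or a["strand"] != b["strand"]:
        return False
    first, second = sorted((a, b), key=lambda t: t["start"])
    gap = second["start"] - first["end"]
    if gap < 0 or gap > max_gap:  # overlapping, or too far apart
        return False
    return pearson(a["expr"], b["expr"]) > min_r
```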


Comparison with Gene Annotation Databases


The 44,534 transcripts produced by the bioinformatics pipeline were classified by comparison with a comprehensive list of “annotated” transcripts from UCSC, RefSeq, ENCODE, Vega, and Ensembl. First, transcripts corresponding to processed pseudogenes were separated. This was done to circumvent a known source of bias in the TopHat read aligner. TopHat maps reads to genomic DNA in its first step, predisposing exon-exon junction reads to align to their spliced retroposed pseudogene homologues. Next, transcripts with >1 bp of overlap with at least one annotated gene on the correct strand were designated “annotated”, and the remainder were deemed “unannotated”. Transcripts with no overlap with protein coding genes were subdivided into intronic, intergenic, or partially intronic antisense categories based on their relative genomic locations.
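The strand-aware overlap rule above can be sketched as follows. This is a simplified illustration that compares whole transcript spans rather than individual exons, which is an assumption; the function names, the dictionary layout, and the half-open coordinate convention are likewise hypothetical.

```python
def overlap_bp(start_a, end_a, start_b, end_b):
    """Number of overlapping bases between two half-open intervals."""
    return max(0, min(end_a, end_b) - max(start_a, start_b))


def classify(transcript, annotated_genes, min_overlap=1):
    """Return 'annotated' if the transcript overlaps a same-strand gene on
    the same chromosome by more than min_overlap bp, else 'unannotated'."""
    for gene in annotated_genes:
        if gene["chrom"] != transcript["chrom"]:
            continue
        if gene["strand"] != transcript["strand"]:
            continue
        shared = overlap_bp(transcript["start"], transcript["end"],
                            gene["start"], gene["end"])
        if shared > min_overlap:
            return "annotated"
    return "unannotated"
```

A transcript that overlaps an annotated gene only on the opposite strand is therefore left in the unannotated set, where it can later be categorized as antisense.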


Informatics Filtering of Unspliced Pre-mRNA Isoforms


An increase in the percentage of intronic transcripts in the assembly relative to known intronic ncRNAs was observed. This led to the observation that in many cases unspliced pre-mRNAs appear at sufficient levels to escape the filtering steps employed by Cufflinks during the assembly stage. Intronic and antisense transcripts that were correlated (Pearson correlation >0.5) to their overlapping protein coding genes were removed. This effectively removed transcripts within genes such as PCA3 and HPN that were obvious pre-mRNA artifacts, while leaving truly novel intronic transcripts, such as those within FBXL7 and CDH13, intact. These steps produced a consensus set of 35,415 transcripts supporting long polyadenylated RNA molecules in human prostate tissues and cell lines. Per chromosome transcript counts closely mirrored known transcript databases (Table 2), indicating that the informatics procedures employed compensate well for gene density variability across chromosomes. Overall, a similar number of transcripts as present in either the RefSeq or UCSC databases (Wheeler et al. Nucleic Acids Res 28, 10-4 (2000)) was detected.


Coding Potential Analysis


To analyze coding potential, DNA sequences for each transcript were extracted and searched for open reading frames (ORFs) using the txCdsPredict program from the UCSC source tool set (Kent et al. Genome Res 12, 996-1006 (2002)). This program produces a score corresponding to the protein coding capacity of a given sequence, and scores >800 are ˜90% predictive of protein coding genes. Using this threshold to count transcripts with coding potential, only 5 of 6,641 unannotated genes were found to have scores >800, compared with 1,669 of 25,414 protein coding transcripts. Additionally, it was observed that protein coding genes possess consistently longer ORFs than either unannotated or annotated ncRNA transcripts, indicating that the vast majority of the unannotated transcripts represent ncRNAs (FIG. 10).
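
The ORF-length comparison can be illustrated with a simple forward-strand ORF scan. This sketch is not txCdsPredict and does not reproduce its scoring; the function name is hypothetical, and only ATG-initiated ORFs ending at an in-frame stop codon are counted.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_nt(seq: str) -> int:
    """Length in nt (start codon through stop codon) of the longest
    ATG-initiated ORF in the three forward reading frames."""
    seq = seq.upper()
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # open a new ORF
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None                   # close the ORF
    return best
```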


Separation of Transcripts into Repetitive and Non-Repetitive Categories


To separate transcripts into “repeat” and “non-repeat” transcripts, the genomic DNA corresponding to the transcript exons was extracted and the fraction of repeat-masked nucleotides in each sequence was calculated. For the designation of repeat classes, the RepeatMasker 3.2.7 UCSC Genome Browser track (Kent, supra) was used. It was observed that transcripts enriched with repetitive DNA tended to be poorly conserved and lacked ChIP-seq marks of active chromatin (FIG. 12). Transcripts containing >25% repetitive DNA (FIG. 11) were separated for the purposes of the ChIP-seq and conservation analyses discussed below.
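
The repeat-fraction calculation can be sketched as follows, assuming soft-masked genomic sequence in which repeat-masked bases are lowercase (the convention of UCSC genome FASTA files). The function names are hypothetical.

```python
def repeat_fraction(seq: str) -> float:
    """Fraction of bases in a soft-masked sequence that are lowercase,
    i.e. repeat-masked under the assumed convention."""
    bases = [c for c in seq if c.isalpha()]
    if not bases:
        return 0.0
    return sum(1 for c in bases if c.islower()) / len(bases)


def is_repeat_transcript(exon_seqs, threshold: float = 0.25) -> bool:
    """Classify a transcript as 'repeat' if more than `threshold` of its
    concatenated exonic sequence is repeat-masked."""
    return repeat_fraction("".join(exon_seqs)) > threshold
```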


Conservation Analysis


The SiPhy package (Garber et al. Bioinformatics 25, i54-62 (2009)) was used to estimate the local rate of variation (ω) of all non-repetitive transcript exons across 29 placental mammals. The program was run as described on the SiPhy website.


ChIP-Seq Datasets


Published ChIP-Seq datasets for H3K4me1, H3K4me2, H3K4me3, Acetylated H3, Pan-H3, and H3K36me3 were used (Yu et al. Cancer Cell 17, 443-54). These data are publicly available through the NCBI Gene Expression Omnibus (GEO GSM353632). The raw ChIP-Seq data were analyzed using the MACS (H3K4me1, H3K4me2, H3K4me3, Acetylated H3, and Pan-H3) or SICER (H3K36me3) peak finder programs using default settings. These peak finders were used based upon their preferential suitability to detect different types of histone modifications (Pepke et al., Nat Methods 6, S22-32 (2009)). The H3K4me3-H3K36me3 chromatin signature used to identify lincRNAs was determined from the peak coordinates by associating each H3K4me3 peak with the closest H3K36me3-enriched region up to a maximum of 10 kb away. The enhancer signature (H3K4me1 but not H3K4me3) was determined by subtracting the set of overlapping H3K4me3 peaks from the entire set of H3K4me1 peaks. These analyses were performed with the bx-python libraries distributed as part of the Galaxy bioinformatics infrastructure.
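
The two peak-set operations described above can be sketched in plain Python. This sketch does not use bx-python; it assumes all intervals lie on a single chromosome and are given as half-open (start, end) tuples, and the function names are hypothetical.

```python
def overlaps(a, b):
    """True if two half-open (start, end) intervals share at least 1 bp."""
    return a[0] < b[1] and b[0] < a[1]


def gap(a, b):
    """Distance in bp between two intervals; 0 if they overlap or touch."""
    return max(0, max(a[0], b[0]) - min(a[1], b[1]))


def pair_k4me3_with_k36me3(k4me3, k36me3, max_dist=10_000):
    """Associate each H3K4me3 peak with its closest H3K36me3 region,
    keeping the pair only if the region is within max_dist bp."""
    pairs = []
    for peak in k4me3:
        if not k36me3:
            break
        nearest = min(k36me3, key=lambda region: gap(peak, region))
        if gap(peak, nearest) <= max_dist:
            pairs.append((peak, nearest))
    return pairs


def enhancer_peaks(k4me1, k4me3):
    """H3K4me1 peaks that do not overlap any H3K4me3 peak."""
    return [p for p in k4me1 if not any(overlaps(p, q) for q in k4me3)]
```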


Differential Expression Analysis


To predict differentially expressed transcripts, a matrix of log-transformed, normalized RPKM expression values was prepared by using the base 2 logarithm after adding 0.1 to all RPKM values. The data were first centered by subtracting the median expression of the benign samples for each transcript. The Significance Analysis of Microarrays (SAM) method (Tusher et al., Proc Natl Acad Sci USA 98, 5116-21 (2001)) with 250 permutations and the Tusher et al. S0 selection method was used to predict differentially expressed genes. A delta value corresponding to the 90th percentile FDR desired for individual analyses was used. The MultiExperiment Viewer application (Chu et al., Genome Biol 9, R118 (2008)) was used to run SAM and generate heatmaps. It was confirmed that the results matched expected results through comparison with microarrays and known prostate cancer biomarkers.
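
The transformation and centering applied to each transcript before SAM can be sketched as follows. The function and argument names are hypothetical, and the SAM step itself is not shown.

```python
import math
from statistics import median


def log_center(rpkm, is_benign, pseudocount=0.1):
    """Return log2(RPKM + pseudocount) values for one transcript, centered
    on the median of the benign samples.

    rpkm and is_benign are parallel lists with one entry per sample."""
    logged = [math.log2(value + pseudocount) for value in rpkm]
    benign = [v for v, flag in zip(logged, is_benign) if flag]
    benign_median = median(benign)
    return [v - benign_median for v in logged]
```

With RPKM values of 0.9, 1.9, and 3.9 in three benign samples and 7.9 in one tumor sample, the log2 values are 0, 1, 2, and 3, the benign median is 1, and the centered tumor value is 2.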


Outlier Analysis


A modified COPA analysis was performed on the 81 tissue samples in the cohort. RPKM expression values were used and shifted by 1.0 in order to avoid division by zero. The COPA analysis had the following steps (MacDonald & Ghosh, Bioinformatics 22, 2950-1 (2006); Tomlins et al. Science 310, 644-8 (2005)): 1) Gene expression values were median centered, using the median expression value for the gene across all samples in the cohort. This sets the gene's median to zero. 2) The median absolute deviation (MAD) was calculated for each gene, and then each gene expression value was scaled by its MAD. 3) The 80th, 85th, 90th, and 98th percentiles of the transformed expression values were calculated for each gene and the average of those four values was taken. Then, genes were rank ordered according to this “average percentile”, which generated a list of outlier genes arranged by importance. 4) Finally, genes showing an outlier profile in the benign samples were discarded. Six novel transcripts ranked as both outliers and differentially-expressed genes in the analyses. These six were manually assigned either differentially-expressed or outlier status based on what each transcript's distribution across samples indicated.
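
Steps 1 through 3 of the modified COPA procedure can be sketched for a single gene as follows. The function names are hypothetical, the percentile uses linear interpolation (an assumption; the interpolation rule is not stated), and returning 0.0 when the MAD is zero is likewise an assumption made so that the sketch is well defined.

```python
from statistics import median


def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in 0..100) of a sorted list."""
    pos = (len(sorted_vals) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def copa_score(rpkm, shift=1.0, percentiles=(80, 85, 90, 98)):
    """Average of the selected percentiles of the median-centered,
    MAD-scaled expression values for one gene."""
    vals = [v + shift for v in rpkm]
    med = median(vals)
    centered = [v - med for v in vals]             # step 1: median centering
    mad = median([abs(v) for v in centered])       # step 2: MAD
    if mad == 0:
        return 0.0                                 # assumption: no spread
    scaled = sorted(v / mad for v in centered)
    return sum(percentile(scaled, q) for q in percentiles) / len(percentiles)
```

Genes would then be ranked by this score in descending order, and genes with an outlier profile in benign samples removed (step 4, not shown).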


Repeat Enrichment Analysis


To assess the enrichment of repetitive elements in the assembly, 100 random permutations of the transcript positions on the same chromosome and strand were generated. To mirror the original constraints used to nominate transcripts, it was ensured that permuted transcript positions contained a uniquely mappable stretch of genomic DNA at least 50 nt long. To account for the effects of mappability difficulties, each exon was padded by ±0 bp, 50 bp, 100 bp, or 500 bp of additional genomic sequence before intersecting the exons with repeat elements in the RepeatMasker 3.2.7 database. It was observed that padding by more than 50 bp did not improve enrichment results, and exons were therefore padded by ±50 bp in subsequent analyses and tests (Table 9). Finally, the Shapiro-Wilk test for normality was performed and it was verified that the number of matches to highly abundant repetitive element types was approximately normally distributed.


B. Results

Prostate Cancer Transcriptome Sequencing


Transcriptome sequencing (RNA-Seq) was performed on 21 prostate cell lines, 20 benign adjacent prostates (benign), 47 localized tumors (PCA), and 14 metastatic tumors (MET). A total of 201 RNA-Seq libraries from this cohort were sequenced yielding a total of 1.41 billion mapped reads, with a median 4.70 million mapped reads per sample (Table 1 for sample information).


To analyze these data, a method for ab initio transcriptome assembly to reconstruct transcripts and transcript abundance levels was used (FIG. 6 and Table 2) (Trapnell et al., Nat Biotechnol 28 (5), 511; Trapnell et al., Bioinformatics 25 (9), 1105 (2009)). Sample-specific transcriptomes were predicted, individual predictions were merged into a consensus transcriptome, and the most robust transcripts were retained (FIG. 7). The ab initio transcriptome assembly and subsequent refinement steps yielded 35,415 distinct transcriptional loci (see FIG. 8 for examples).


The assembled transcriptome was compared to the UCSC, Ensembl, RefSeq, Vega, and ENCODE gene databases to identify and categorize transcripts. While the majority of the transcripts (77.3%) corresponded to annotated protein coding genes (72.1%) and noncoding RNAs (5.2%), a significant percentage (19.8%) lacked any overlap and were designated “unannotated” (FIG. 1a). These included partially intronic antisense (2.44%), totally intronic (12.1%), and intergenic transcripts (5.25%). These results agree with previous data indicating that large fractions of the transcriptome represent unannotated transcription (Birney et al., Nature 447 (7146), 799 (2007); Carninci et al., Science 309 (5740), 1559 (2005)) and that significant percentages of genes may harbor related antisense transcripts (He et al., Science 322 (5909), 1855 (2008); Yelin et al., Nat Biotechnol 21 (4), 379 (2003)). Due to the added complexity of characterizing antisense or partially intronic transcripts without strand-specific RNA-Seq libraries, studies focused on totally intronic and intergenic transcripts.


Characterization of Novel Transcripts


Global characterization of novel transcripts corroborated previous reports that they are relatively poorly conserved and more lowly expressed than protein coding genes (Guttman et al., Nat Biotechnol 28 (5), 503; Guttman et al., Nature 458 (7235), 223 (2009)). Expression levels of unannotated prostate cancer transcripts were consistently higher than randomly permuted controls, but lower than annotated ncRNAs or protein coding genes (FIG. 1b). Unannotated transcripts also showed less overlap with known expressed sequence tags (ESTs) than protein-coding genes but more than randomly permuted controls (FIG. 5). Unannotated transcripts showed a clear but subtle increase in conservation over control genomic intervals (novel intergenic transcripts p=2.7×10⁻⁴±0.0002 for 0.4<ω<0.8; novel intronic transcripts p=2.6×10⁻⁵±0.0017 for 0<ω<0.4, FIG. 1c). Only a small subset of novel intronic transcripts showed increased conservation (FIG. 1c inset), but this conservation was quite profound. By contrast, a larger number of novel intergenic transcripts showed more mild increases in conservation. Finally, analysis of coding potential revealed that only 5 of 6,144 transcripts harbored a high quality open reading frame (ORF), indicating that the overwhelming majority of these transcripts represent ncRNAs (FIG. 10).


Next, published prostate cancer ChIP-Seq data for two prostate cell lines, VCaP and LNCaP (Yu et al., Cancer Cell 17 (5), 443), were used in order to interrogate the overlap of unannotated transcripts with histone modifications supporting active transcription (H3K4me1, H3K4me2, H3K4me3, H3K36me3, Acetyl-H3 and RNA polymerase II, see Table 3). Because unannotated ncRNAs showed two clear subtypes, repeat-associated and non-repeats (FIG. 11 and discussed below), it was contemplated that these two subtypes may display distinct histone modifications as noted in previous research (Day et al., Genome Biol 11 (6), R69). Whereas non-repeat transcripts showed strong enrichment for histone marks of active transcription at their putative transcriptional start sites (TSSs), repeat-associated transcripts showed virtually no enrichment (FIG. 12), and for the remaining ChIP-Seq analyses non-repeat transcripts only were considered. In this set of unannotated transcripts, strong enrichment was observed for histone modifications characterizing TSSs and active transcription, including H3K4me2, H3K4me3, Acetyl-H3 and RNA Polymerase II (FIG. 1d-g), but not for H3K4me1, which characterizes enhancer regions (FIGS. 13 and 14). Intergenic ncRNAs performed much better in these analyses than intronic ncRNAs (FIG. 1d-g).


To elucidate global changes in transcript abundance between prostate cancer and benign tissues, differential expression analysis was performed for all transcripts. A total of 836 genes differentially expressed between benign and PCA samples (FDR<0.01) were found, with protein-coding genes constituting 82.8% of all differentially-expressed genes (FIG. 1h and Table 4). This category contained the most significant transcripts, including numerous known prostate cancer genes such as AMACR and Hepsin (Dhanasekaran et al., Nature 412 (6849), 822 (2001)). Annotated ncRNAs represented 7.4% of differentially-expressed genes, including the ncRNA PCA3, which resides within an intron of the PRUNE2 gene and ranked #4 overall (12.2 fold change; adj. p<2×10⁻⁴, Wilcoxon rank sum test, Benjamini-Hochberg correction) (FIG. 8). Finally, 9.8% of differentially-expressed genes corresponded to unannotated ncRNAs, including 3.2% within gene introns and 6.6% in intergenic regions, indicating that these species contribute significantly to the complexity of the prostate cancer transcriptome.


Dysregulation of Unannotated Non-Coding RNAs


Recent reports of functional long intervening non-coding RNAs (lincRNAs) in intergenic regions (Dhanasekaran et al., Nature 412 (6849), 822 (2001); Gupta et al., Nature 464 (7291), 1071; Rinn et al., Cell 129 (7), 1311 (2007); Guttman et al., Nature 458 (7235), 223 (2009)) led to further exploration of intergenic ncRNAs. A total of 1859 unannotated intergenic RNAs were found throughout the human genome. The present invention is not limited to a particular mechanism. Indeed, an understanding of the mechanism is not necessary to practice the present invention. Nonetheless, it is contemplated that this is an underestimate due to the inability to detect small RNAs eliminated by the ˜250 bp size selection performed during RNA-Seq library generation (Methods). Overall, novel intergenic RNAs resided closer to protein-coding genes than protein-coding genes do to each other (the median distance to the nearest protein-coding gene is 4292 kb for novel genes and 8559 kb for protein-coding genes, FIG. 2a). For instance, if two protein-coding genes, Gene A and Gene B, are separated by the distance AB, then the furthest an unannotated ncRNA can be from both of them is 0.5*AB, which is exactly what was observed (4292/8559=0.501). Supporting this observation, 34.1% of unannotated transcripts are located ≥10 kb from the nearest protein-coding gene. As an example, the Chr15q arm was visualized using the Circos program. Eighty-nine novel intergenic transcripts were nominated across this chromosomal region, including several differentially-expressed loci centromeric to TLE3 (FIG. 2b), which were validated by PCR in prostate cancer cell lines (FIG. 15). A focused analysis of the 1859 novel intergenic RNAs yielded 106 that were differentially expressed in localized tumors (FDR<0.05; FIG. 2c). These Prostate Cancer Associated Transcripts (PCATs) were ranked according to their fold change in localized tumor versus benign tissue (Tables 5 and 6).


Similarly, performing a modified cancer outlier profile analysis (COPA) on the RNA-Seq dataset re-discovered numerous known prostate cancer outliers, such as ERG, ETV1, SPINK1, and CRISP3, and nominated numerous unannotated ncRNAs as outliers (FIG. 2d and Tables 6 and 7). Merging the results from the differential expression and COPA analyses resulted in a set of 121 unannotated transcripts that accurately discriminated benign, localized tumor, and metastatic prostate samples by unsupervised clustering (FIG. 2c). These data provide evidence that PCATs serve as biomarkers for prostate cancer and novel prostate cancer subtypes. Clustering analyses using novel ncRNA outliers also provide disease subtypes (FIG. 16).


Confirmation and Tissue-Specificity of ncRNAs


Validation studies were performed on 14 unannotated expressed regions, including ones both included and not present in the list of differentially expressed transcripts. Reverse transcription PCR (RT-PCR) and quantitative real-time PCR (qPCR) experiments demonstrated a ˜78% (11/14) validation rate in predicted cell line models for both transcript identity and expression level (FIG. 17). Next, three transcripts (PCAT-109, PCAT-14, and PCAT-43) selectively upregulated in prostate cancer compared to normal prostate were examined. From the sequencing data, each genomic locus shows significantly increased expression in prostate cancer and metastases, except for PCAT-14, which appears absent in metastases (FIG. 3a-c). PCAT-109 also ranks as the #5 best outlier in prostate cancer, just ahead of ERG (FIG. 2d and Table 6). qPCR on a cohort of 14 benign prostates, 47 tumors, and 10 metastases confirmed expression of these transcripts (FIG. 3a-c). All three appear to be prostate-specific, with no expression seen in breast or lung cancer cell lines or in 19 normal tissue types (Table 8). This tissue specificity was not necessarily due to regulation by androgen signaling, as only PCAT-14 expression was induced by treatment of androgen responsive VCaP and LNCaP cells with the synthetic androgen R1881, consistent with previous data from this genomic locus (FIG. 18) (Tomlins et al., Nature 448 (7153), 595 (2007); Stavenhagen et al., Cell 55 (2), 247 (1988)). PCAT-14, but not PCAT-109 or PCAT-43, also showed differential expression when tested on a panel of matched tumor-normal samples, indicating that this transcript, which is comprised of an endogenous retrovirus in the HERV-K family (Bannert and Kurth, Proc Natl Acad Sci USA 101 Suppl 2, 14572 (2004)), can be used as a somatic marker for prostate cancer (FIG. 19). 5′ and 3′ rapid amplification of cDNA ends (RACE) at this locus revealed the presence of individual viral protein open reading frames (ORFs) and a transcript splicing together individual ORF 5′ untranslated region (UTR) sequences (FIG. 20).


It was observed that the top-ranked intergenic ncRNA resided in the chromosome 8q24 gene desert near the c-Myc oncogene. This ncRNA, termed PCAT-1, is located on the edge of the prostate cancer susceptibility region 2 (FIG. 4a) and is about 0.5 Mb away from c-Myc. This transcript is supported by clear peaks in H3K4me3, Acetyl-H3, and RNA polymerase II ChIP-Seq data (FIG. 4b). The exon-exon junction in cell lines was validated by RT-PCR and Sanger sequencing of the junction (FIG. 4c), and 5′ and 3′ RACE was performed to elucidate transcript structure (FIG. 4d). By this analysis, PCAT-1 is a mariner family transposase (Oosumi et al., Nature 378 (6558), 672 (1995); Robertson et al., Nat Genet 12 (4), 360 (1996)) interrupted by an Alu retrotransposon and regulated by a viral long terminal repeat (LTR) promoter region (FIG. 4d and FIG. 21). By qPCR, PCAT-1 expression is specific to prostate tissue, with striking upregulation in prostate cancers and metastases compared to benign prostate tissue (FIG. 4e). PCAT-1 ranks as the second best overall prostate cancer biomarker, just behind AMACR (Table 3), indicating that this transcript is a powerful discriminator of this disease. Matched tumor-normal pairs similarly showed marked upregulation in the matched tumor samples (FIG. 4f). RNA interference (RNAi) was performed in VCaP cells using custom siRNAs targeting PCAT-1 sequences, and no change in cell proliferation or invasion upon PCAT-1 knockdown was observed (FIG. 22).


Selective Re-Expression of Repetitive Elements in Cancer


The presence of repetitive elements in PCAT-1 led to an exploration of repetitive elements. Repetitive elements, such as Alu and LINE-1 retrotransposons, are broadly known to be degenerate in humans (Oosumi et al., supra; Robertson et al., supra; Cordaux et al., Nat Rev Genet 10 (10), 691 (2009)), with only ˜100 LINE-1 elements (out of ˜500,000) showing possible retrotransposon activity (Brouha et al., Proc Natl Acad Sci USA 100 (9), 5280 (2003)). While transcription of these elements is frequently repressed through DNA methylation and repressive chromatin modifications (Slotkin and Martienssen, Nat Rev Genet 8 (4), 272 (2007)), in cancer widespread hypomethylation has been reported (Cho et al., J Pathol 211 (3), 269 (2007); Chalitchagorn et al., Oncogene 23 (54), 8841 (2004); Yegnasubramanian et al., Cancer Res 68 (21), 8954 (2008)). Moreover, recent evidence indicates that these elements have functional roles in both normal biology (Kunarso et al., Nat Genet.) and cancer (Lin et al., Cell 139 (6), 1069 (2009)), even if their sequences have mutated away from their evolutionary ancestral sequence (Chow et al., Cell 141 (6), 956). To date, only RNA-Seq platforms enable discovery and quantification of specific transposable elements expressed in cancer. As described above, it was observed that >50% of unannotated exons in the assembly overlap with at least one repetitive element (FIG. 11). Since these elements pose mappability challenges when performing transcriptome assembly with unique reads, these loci typically appear as “mountain ranges” of expression, with uniquely mappable regions forming peaks of expression separated by unmappable “ravines” (FIGS. 23 and 24). PCR and Sanger sequencing experiments were performed to confirm that these transposable elements of low mappability are expressed as part of these loci (FIGS. 23 and 24).


To probe this observation further, the exons from unannotated transcripts in the assembly, with the addition of the flanking 50, 100, or 500 bp of additional genomic sequence at the 5′ and 3′ ends of the exons, were generated, and the overlap of these intervals with repetitive elements was compared to that of randomly permuted genomic intervals of similar sizes. A highly significant enrichment for repetitive elements in the dataset was observed (OR 2.82 (95% CI 2.68-2.97), p<10⁻¹⁰⁰, Table 9). Examination of the individual repetitive element classes revealed a specific enrichment for SINE elements, particularly Alus (p≤2×10⁻¹⁶, Tables 10 and 11). A subset of LINE-1 and Alu transposable elements demonstrate marked differential expression in a subset of prostate cancer tumors (FIG. 25). One locus on chromosome 2 (also highlighted in FIG. 3b) is a 500+ kb region with numerous expressed transposable elements (FIG. 26). This locus, termed Second Chromosome Locus Associated with Prostate-1 (SChLAP1), harbors transcripts that perform extremely well in outlier analyses for prostate cancer (Tables 6 and 7). PCAT-109, discussed above, is one outlier transcript in this region. Moreover, the SChLAP1 locus is highly associated with patients positive for ETS gene fusions (p<0.0001, Fisher's exact test, FIG. 27), whereas this association was not observed with other expressed repeats. A direct regulatory role for ERG on this region was not identified using siRNA-mediated knockdown of ERG in the VCaP cell line. These data indicate that the dysregulation of repeats in cancer is highly specific, and that this phenomenon associates with only a subset of tumors and metastases. Thus, the broad hypomethylation of repeat elements observed in cancer (Cho et al., J Pathol 211 (3), 269 (2007); Chalitchagorn et al., Oncogene 23 (54), 8841 (2004); Yegnasubramanian et al., Cancer Res 68 (21), 8954 (2008)) does not account for the high specificity of repeat expression.
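
An odds ratio with a 95% confidence interval of the kind reported above can be computed from a 2×2 table of counts. The sketch below uses the Woolf (log odds ratio) interval, which is an assumption; the method used to obtain the reported interval is not stated. The function name and the example counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf confidence interval for a 2x2 table.

    a: observed intervals overlapping a repeat
    b: observed intervals not overlapping a repeat
    c: permuted intervals overlapping a repeat
    d: permuted intervals not overlapping a repeat
    All four counts must be non-zero."""
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper
```

For the hypothetical counts a=20, b=10, c=10, d=20, the odds ratio is 4.0 and the interval brackets it on both sides.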


Non-Invasive Detection of ncRNAs in Urine


Taken together, these data show an abundance of novel ncRNA biomarkers for prostate cancer, many of which appear to have tissue specificity. 77 urine sediments obtained from patients with prostate cancer and 31 control patients without known disease (Table 12 for sample details) were analyzed (Laxman et al., Cancer Res 68 (3), 645 (2008)). The control patients are defined as those lacking cancer histology upon prostate biopsy and lacking the TMPRSS2-ERG fusion transcript in urine sediment RNA (Laxman et al., supra). PCAT-1 and PCAT-14, as well as the known ncRNA biomarker PCA3, were selected for evaluation on this urine panel due to their biomarker status in patient tissue samples. qPCR analysis led to an observation of specificity in their ability to detect prostate cancer patients and not patients with normal prostates (FIG. 5a-c). In several cases, patients with ETS-negative prostate cancer that were misclassified as “benign” are clearly evident (FIGS. 5a and 5c). Moreover, PCAT-14 appears to perform almost as well as PCA3 as a urine biomarker, nearly achieving statistical significance (p=0.055, Fisher's exact test) despite the small number of patients used for this panel. It was next evaluated whether these unannotated ncRNAs identified a redundant set of patients that would also be identified by other urine tests, such as PCA3 or TMPRSS2-ERG transcripts. Comparing PCAT-1 and PCAT-14 expression in urine samples to PCA3 or to each other revealed that these ncRNAs identified distinct patient sets, indicating that a patient's urine typically harbors PCAT-1 or PCAT-14 transcripts but not both (FIG. 5d). Using the cut-offs displayed in FIG. 5a-c, a binary heatmap comparing these three ncRNAs with patients' TMPRSS2-ERG status was generated (FIG. 5e). 
The ncRNAs were able to detect additional ETS-negative patients with prostate cancer through this urine test, indicating that they have clinical utility as highly specific markers for prostate cancer using a multiplexed urine test. Combining PCAT-1, PCAT-14 and PCA3 into a single “non-coding RNA signature” generated a highly specific urine signature (p=0.0062, Fisher's exact test, FIG. 5f) that identifies a number of prostate cancer patients that is broadly comparable to the TMPRSS2-ERG fusion (33% vs. 45%).
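
A two-sided Fisher's exact test on a 2×2 table (for example, marker-positive versus marker-negative urine samples in cancer versus control patients) can be sketched with the hypergeometric distribution. The function name is hypothetical, no patient counts from the study are reproduced here, and the two-sided p-value is taken as the sum of all table probabilities no larger than that of the observed table, which is one common convention.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    low = max(0, col1 - row2)
    high = min(row1, col1)
    p_value = sum(prob(x) for x in range(low, high + 1)
                  if prob(x) <= p_observed * (1 + 1e-9))
    return min(1.0, p_value)
```

For the table [[3, 1], [1, 3]], the four tables at least as extreme as the observed one have probabilities 1/70, 16/70, 16/70, and 1/70, giving p = 34/70.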



FIG. 34 shows detection of prostate cancer RNAs in patient urine samples using qPCR. All RNA species were detectable in urine. FIG. 35 shows that multiplexing urine SChLAP-1 measurements with serum PSA improves prostate cancer risk stratification. Individually, SChLAP-1 is a predictor for prostate cancers with intermediate or high clinical risk of aggressiveness. Multiplexing this measurement with serum PSA improves upon serum PSA's ability to predict for more aggressive disease.


Additional Characterization

Additional experiments were conducted related to PCAT-1 and the SChLAP-1 region in prostate cancer. FIG. 29 demonstrates that PCAT-1 expression sensitizes prostate cancer cells to treatment with PARP-1 inhibitors. FIG. 30 demonstrates that PCAT-1 expression sensitizes prostate cells to radiation treatment.



FIG. 31 demonstrates that unannotated intergenic transcripts in SChLAP-1 differentiate prostate cancer and benign samples. FIG. 32 demonstrates that SChLAP-1 is required for prostate cancer cell invasion and proliferation. Prostate cell lines, but not non-prostate cells, showed a reduction in invasion by Boyden chamber assays. EZH2 and non-targeting siRNAs served as positive and negative controls, respectively. Deletion analysis of SChLAP-1 was performed. FIG. 33 shows that a region essential for its function was identified.


ncRNAs in Lung, Breast, and Pancreatic Cancers


Analysis of the lung cancer transcriptome (FIG. 36) was performed. 38 lung cell lines were analyzed by RNA-Seq and then lncRNA transcripts were reconstructed. Unannotated transcripts accounted for 27% of all transcripts. Novel transcripts were more highly expressed than annotated ncRNAs but not protein-coding transcripts. An outlier analysis of 13 unannotated transcripts shows novel lncRNAs in subtypes of lung cancer cell lines. FIG. 37 shows discovery of M41 and ENST-75 ncRNAs in lung cancer. FIG. 38 shows that lncRNAs are drivers and biomarkers in lung cancer. FIG. 39 shows identification of cancer-associated lncRNAs in breast and pancreatic cancer. Three novel breast cancer lncRNAs were nominated from RNA-Seq data (TU0011194, TU0019356, and TU0024146). All show outlier expression patterns in breast cancer samples but not benign samples. Three novel pancreatic cancer lncRNAs were nominated from RNA-Seq data (TU0009141, TU0062051, and TU0021861). All show outlier expression patterns in pancreatic cancer samples but not benign samples.




















TABLE 1

Library ID | Sample Name | Type | Sample Type | Read Type | Read Length | Total Reads ([illegible] for PE) | TopHat Mapped Reads | TopHat Splice Junction Reads | % Splice | Diagnosis | ETS status
ctp_42823AAXX_3 | PWR-1E | RNA-Seq | Cell Line | paired_end | 40 | 7353042 | 5357325 | 2091179 | 13.04% | Benign | Negative
mctp_[illegible]AAXX_5 | PrEC | RNA-Seq | Cell Line | single_read | 40 | [illegible] | 955130 | 107511 | 11.24% | Benign | Negative
mctp_[illegible]AAXX_0 | PrEC | RNA-Seq | Cell Line | single_read | 30 | 3319065 | 971560 | 53520 | 7.75% | Benign | Negative
mctp_31462AAXX_1 | PrEc | RNA-Seq | Cell Line | paired_end | 40 | 7748627 | 7443379 | 747751 | 10.05% | Benign | Negative
mctp_30351AAXX_7 | PrEC | RNA-Seq | Cell Line | paired_end | 40 | [illegible] | 9562343 | [illegible] | 9.33% | Benign | Negative
mctp_314[illegible]AAXX_2 | [illegible] | RNA-Seq | Cell Line | paired_end | 40 | 9464529 | 0625131 | 335563 | [illegible] | Benign | Negative
mctp_[illegible]AAXX_5 | RWPE | RNA-Seq | Cell Line | single_read | 35 | [illegible] | [illegible] | 149383 | 5.92% | Benign | Negative
mctp_[illegible]AAXX_7 | RWPE | RNA-Seq | Cell Line | single_read | 35 | 5347784 | 1710762 | 150130 | [illegible] | Benign | Negative
mctp_[illegible]AAXX_0 | RWPE | RNA-Seq | Cell Line | single_read | [illegible] | 4778245 | 1539225 | [illegible] | 8.84% | Benign | Negative
mctp_[illegible]AAXX_5 | RWPE | RNA-Seq | Cell Line | single_read | 35 | 4933610 | 1565250 | 137416 | 8.73% | Benign | Negative
mctp_[illegible]AAXX_7 | RWPE | RNA-Seq | Cell Line | single_read | 35 | 5005497 | 1622035 | 143185 | 8.84% | Benign | Negative
mctp_20FCBAAXX_0 | RWPE | RNA-Seq | Cell Line | single_read | 35 | 4855663 | 1507124 | [illegible] | 8.73% | Benign | Negative
mctp_30FOGAAXX_7 | RWPE | RNA-Seq | Cell Line | single_read | 35 | 4966436 | 1569635 | 138224 | 8.83% | Benign | Negative
mctp_20FOGAAXX_8 | RWPE | RNA-Seq | Cell Line | single_read | 35 | 4909235 | 1550957 | 133025 | [illegible] | Benign | Negative
mctp_30FOGAAXX_6 | RWPE | RNA-Seq | Cell Line | single_read | 35 | 4304467 | [illegible] | 138224 | 6.91% | Benign | Negative
mctp_[illegible]AAXX_3 | [illegible] | RNA-Seq | Cell Line | paired_end | [illegible] | 7595911 | 5153303 | 135045 | 6.77% | Benign | Negative
mctp_20F[illegible]AAXX_1 | [illegible] | RNA-Seq | Cell Line | single_read | 35 | 5301735 | 2345205 | 138674 | 5.77% | Localized | Negative
mctp_324[illegible]AAXX_5 | [illegible] | RNA-Seq | Cell Line | paired_end | 40 | 9254420 | [illegible] | 3011035 | 12.48% | Localized | Negative
mctp_42854AAXX_3 | CR-HPV 30 | RNA-Seq | Cell Line | paired_end | 40 | 13654861 | 14731630 | 169257 | 7.22% | Localized | Negative
mctp_42854AAX_7 | CWR22 | RNA-Seq | Cell Line | paired_end | 45 | [illegible] | 14791235 | [illegible] | 21.07% | Localized | Negative
mctp_[illegible]AAXX_2 | VCaP | RNA-Seq | Cell Line | single_read | 35 | [illegible] | 1400655 | [illegible] | 11.58% | Metastatic | ERG+
mctp_20[illegible]AAXX_7 | VCaP | RNA-Seq | Cell Line | single_read | 35 | [illegible] | 981204 | [illegible] | 10.35% | Metastatic | ERG+
mctp_[illegible] | VCaP | RNA-Seq | Cell Line | single_read | 35 | [illegible] | 957548 | 267745 | 11.98% | Metastatic | ERG+
mctp_[illegible] | VCaP | RNA-Seq | Cell Line | single_read | 35 | [illegible] | 958522 | 89663 | 9.14% | Metastatic | ERG+
mctp_20[illegible]AAXX_3 | VCaP | RNA-Seq | Cell Line | single_read | 35 | 5405236 | [illegible] | 36193 | 9.00% | Metastatic | ERG+
mctp_[illegible]AAXX_2 | VCaP | RNA-Seq | Cell Line | single_read | 35 | 5093526 | 935272 | 25342 | 9.23% | Metastatic | ERG+
mctp_[illegible]AAXX_1 | VCaP | RNA-Seq | Cell Line | single_read | 35 | 4275325 | 004030 | [illegible] | 9.11% | Metastatic | ERG+
mctp_[illegible]AAXX_1 | VCaP | RNA-Seq | Cell Line | single_read | 35 | 4717534 | [illegible] | 25147 | 9.07% | Metastatic | ERG+
mctp_AAXX_[illegible] | VCaP | RNA-Seq | Cell Line | single_read | 35 | 5034204 | 926214 | [illegible] | 9.05% | Metastatic | ERG+
mctp_AAXX_2 | VCaP | RNA-Seq | Cell Line | single_read | 40 | 4492727 | 007997 | [illegible] | 9.07% | Metastatic | ERG+
mctp_329F4AAXX_4 | [illegible] | RNA-Seq | Cell Line | paired_end | 35 | 12322606 | 15194197 | 35535 | 9.23% | Metastatic | ERG+
mctp_20FOGAAXX_4 | LNCaP | RNA-Seq | Cell Line | single_read | 35 | 5109483 | 1430543 | 73610 | 9.11% | Metastatic | ETV+
mctp_20FOGAAXX_1 | LNCaP | RNA-Seq | Cell Line | single_read | 35 | 5018345 | 1402514 | 1377759 | 9.12% | Metastatic | ETV+
mctp_20FOGAAXX_3 | LNCaP | RNA-Seq | Cell Line | single_read | 35 | 5206724 | 1425034 | 129478 | 6.35% | Metastatic | ETV+
mctp_
LNCaP
RNA-
Cell
single_
35
4930256
1399261
117293

text missing or illegible when filed

Metastatic
ETV+


20FCGAAXX_2

Seq
Line
read









mctp_
LNCaP
RNA-
Cell
single_
35
4593725
1370920
119462
 8.37%
Metastatic
ETV+


2056CAAXX_2

Seq
Line
read









mctp_
LNCaP
RNA-
Cell
single_
35
5402665
1510040
126377
 8.43%
Metastatic
ETV+



text missing or illegible when filed CAAXX_3


Seq
Line
read









mctp_
LNCaP
RNA-
Cell
single_
35
4933947
1304247

text missing or illegible when filed


text missing or illegible when filed

Metastatic
ETV+


2 text missing or illegible when filed E5CAAXX_4

Seq
Line
read









mctp_
LNCaP
RNA-
Cell
paired_
35
20714359
20272530
1057574
10.30%
Metastatic
Negative


42PMUAAXX_6
CD52
Seq
Line
end










text missing or illegible when filed _

LNCaP
RNA-
Cell
paired_
35
9545473

text missing or illegible when filed

973617

text missing or illegible when filed

Metastatic
Negative


42PMUAAXX_7
CD53
Seq
Line
end









mctp_
DU-145
RNA-
Cell
paired_
35
12804352
13651384
1378507
10.04%
Metastatic
Negative


22TASAAXX_7

Seq
Line
end









mctp_
DU-145
RNA-
Cell
paired_
35
25755349

text missing or illegible when filed

1572336
 9.85%
Metastatic
Negative


42TASAAXX_text missing or illegible when filed

Seq
Line
end









mctp_
DU-145
RNA-
Cell
paired_
35
14127745

text missing or illegible when filed

1425534
 9.93%
Metastatic
Negative


42TASAAXX_5

Seq
Line
end









mctp_
DU-145
RNA-
Cell
paired_
35

text missing or illegible when filed

13047940
1328224
10.32%
Metastatic
Negative


42TASAAXX_3

Seq
Line
end









mctp_
DU-145
RNA-
Cell
paired_
35

text missing or illegible when filed

23718573

text missing or illegible when filed

10.09%
Metastatic
Negative


42TASAAXX_2

Seq
Line
end









mctp_
DU-145
RNA-
Cell
paired_
35
10524533
5437257
553952
10.56%
Metastatic
Negative


42TASAAXX_text missing or illegible when filed

Seq
Line
end









mctp_
DU-145
RNA-
Cell
paired_
35
9239144
10026773
1013732
10.11%
Metastatic
Negative


42T text missing or illegible when filed AAXX_8

Seq
Line
end









mctp_
LNCaP CD5
RNA-
Cell
paired_
38
12368674
9515829
1431356
10.35%
Metastatic
Negative


42PFAAAXX_5
parent
Seq
Line
end









mctp_
LNCaP
RNA-
Cell
paired_
35
14459553
13998752
235383
10.08%
Metastatic
Negative


42PFAAAXX_5
CD52
Seq
Line
end









mctp_
DU-145
RNA-
Cell
single_
35

text missing or illegible when filed

3558542
225574
 9.18%
Metastatic
Negative


20BC5AAXX_5

Seq
Line
read









mctp_
DU-145
RNA-
Cell
single_
35
5059249
2437193
493465
 9.26%
Metastatic
Negative


25FERAAXX_2

Seq
Line
read









mctp_
DU-145
RNA-
Cell
single_
45
0596512
4162530

text missing or illegible when filed

11.97%
Metastatic
Negative



text missing or illegible when filed AAXX_3


Seq
Line
read









mctp_

text missing or illegible when filed

RNA-
Cell
paired_
40
14785826
16711055
1155473
10.75%
Metastatic
Negative


429F4AAXX_3

Seq
Line
end









mctp_
PC3
RNA-
Cell
paired_
40
10257396
10251560
237597
11.53%
Metastatic
Negative


3054YAAXX_1

Seq
Line
end









mctp_
PC3
RNA-
Cell
single_
35

text missing or illegible when filed

2547308
1581197
 9.33%
Metastatic
Negative


25F69AAXX_3

Seq
Line
read









mctp_
C4-25
RNA-
Cell
paired_
40
12759809
11823209
1534544
12.43%
Metastatic
Negative


429F4AAXX_1

Seq
Line
end









mctp_
MDAtext missing or illegible when filed
RNA-
Cell
paired_
40
13341323
14905946

text missing or illegible when filed

10.96%
Metastatic
Negative


42354AAXX_5

Seq
Line
end









mctp_
WPE1- text missing or illegible when filed
RNA-
Cell
paired_
40
10553920
9530521
1435670
12.49%
Metastatic
Negative


42B0AAXX_4

Seq
Line
end









mctp_
PrBe10013
RNA-
Tissue
paired_
40
15313395

text missing or illegible when filed

927598
 7.95%
Benign
Negative


42B text missing or illegible when filed AAXX_4

Seq

end









mctp_
PrBe10013
RNA-
Tissue
paired_
33
3922744
12263152
715431
 7.56%
Benign
Negative



text missing or illegible when filed


Seq

end









mctp_
PrBe10014
RNA-
Tissue
paired_
40
11242242
9015876
471853
 7.92%
Benign
Negative


42B45AAXX_text missing or illegible when filed

Seq

end









mctp_
PrBe10014
RNA-
Tissue
paired_

text missing or illegible when filed

5546531
5358075
321691
 7.38%
Benign
Negative


42 text missing or illegible when filed FAAXX_2

Seq

end









mctp_
PrBe10014
RNA-
Tissue
paired_
33
3977108
4253690
532270
 7.56%
Benign
Negative



text missing or illegible when filed


Seq

end









mctp_
PrBe10015
RNA-
Tissue
paired_
40
7584420
7527754
836452
 7.98%
Benign
Negative


42 text missing or illegible when filed JAAXX_7

Seq

end









mctp_
PrBe10015
RNA-
Tissue
paired_

text missing or illegible when filed

14331227
12877854
936352
 7.27%
Benign
Negative



text missing or illegible when filed


Seq

end









mctp_
PrBe10016
RNA-
Tissue
paired_
40
12122294
11750835
320710

text missing or illegible when filed

Benign
Negative


42540AAXX_1

Seq

end









mctp_
PrBe10016
RNA-
Tissue
paired_
35

text missing or illegible when filed

11367252
741959
 6.53%
Benign
Negative


42NY4AAXX_5

Seq

end









mctp_
PrBe10017
RNA-
Tissue
paired_
35
1259390
2156567
152020
 8.05%
Benign
Negative



text missing or illegible when filed _7


Seq

end









mctp_
PrBe10017
RNA-
Tissue
paired_
40
14245233
14383797
1025161
 7.85%
Benign
Negative


42CJFAAXX_5

Seq

end









mctp_
PrBe10018
RNA-
Tissue
paired_
30
26615395
17002419
1465145
 7.85%
Benign
Negative


42520AAXX_5

Seq

end









mctp_
PrBe10018
RNA-
Tissue
paired_
30
25877894
15409081
1418434
 7.54%
Benign
Negative


42NY4AAXX_text missing or illegible when filed

Seq

end









mctp_
aN10_6
RNA-
Tissue
paired_
40
10100950
13949254

text missing or illegible when filed

 7.88%
Benign
Negative


4203AAXX_5

Seq

end









mctp_
aN11_1
RNA-
Tissue
paired_
40
9792955

text missing or illegible when filed


text missing or illegible when filed

 7.87%
Benign
Negative



text missing or illegible when filed


Seq

end









mctp_
aN11_1
RNA-
Tissue
paired_
40
14655835
10917491

text missing or illegible when filed

 7.75%
Benign
Negative


42P6 text missing or illegible when filed AAXX_1

Seq

end









mctp_
aN13_2
RNA-
Tissue
paired_
40
14755537
15347535
1174593
 7.47%
Benign
Negative



text missing or illegible when filed AAXX_1


Seq

end









mctp_
aN13_2
RNA-
Tissue
paired_
40

text missing or illegible when filed

15070565
1231804
 5.47%
Benign
Negative


43P text missing or illegible when filed AAXX_4

Seq

end









mctp_
aN14_4
RNA-
Tissue
paired_
40

text missing or illegible when filed

3528550
733492
 4.45%
Benign
Negative


3054YAAXX_3

Seq

end









mctp_
aN14_4
RNA-
Tissue
paired_
40
12517092

text missing or illegible when filed

394315

text missing or illegible when filed

Benign
Negative


42P text missing or illegible when filed AAXX_2

Seq

end









mctp_
PrBe10002
RNA-
Tissue
paired_
40
10252325

text missing or illegible when filed

190504
 3.53%
Benign
Negative


3C553AAXX_5

Seq

end









mctp_
PrBe10002
RNA-
Tissue
single_
40
4309340
577148
39125
 6.92%
Benign
Negative



text missing or illegible when filed


Seq

read









mctp_
PrBe10003
RNA-
Tissue
single_
40
4734295
302030
17102
 5.72%
Benign
Negative



text missing or illegible when filed


Seq

read









mctp_
aN15_3
RNA-
Tissue
paired_
40
14095939
10090895
923550
 4.53%
Benign
Negative


42 text missing or illegible when filed

Seq

end









mctp_
aN15_3
RNA-
Tissue
paired_
40
2772663
3101379
714439
 4.70%
Benign
Negative


3054YAAXX_7

Seq

end









mctp_
aN23
RNA-
Tissue
single_
35
6359059

text missing or illegible when filed

171398
 5.02%
Benign
Negative



text missing or illegible when filed _6


Seq

read









mctp_
aN25
RNA-
Tissue
single_
35
5162304
2101754
100935
 4.93%
Benign
Negative


300M2AAXX_4

Seq

read









mctp_
aN25
RNA-
Tissue
single_
35
5687402
2632652
125775
 5.93%
Benign
Negative


300M2AAXX_3

Seq

read









mctp_
aN27
RNA-
Tissue
single_
35
4771581
1054625
93256
 5.67%
Benign
Negative


300M2AAXX_1

Seq

read









mctp_
aN27
RNA-
Tissue
single_
35

text missing or illegible when filed

1095978
105344
 7.44%
Benign
Negative


300M2AAXX_2

Seq

read









mctp_
aN25
RNA-
Tissue
single_
35
5661652

text missing or illegible when filed

57547
 7.10%
Benign
Negative


300M2AAXX_7

Seq

read









mctp_
aN29
RNA-
Tissue
single_
35
5201944
5472975
53453
 6.33%
Benign
Negative


300M2AAXX_8

Seq

read









mctp_
aN31
RNA-
Tissue
single_
36
4206556
1642631
122140
 5.31%
Benign
Negative


2QFCKAAXX_1

Seq

read









mctp_
aN31
RNA-
Tissue
single_
35
3624043
1504320
107996
 6.05%
Benign
Negative


2QFCKAAXX_2

Seq

read









mctp_
aN32
RNA-
Tissue
single_
36
4445595

text missing or illegible when filed

118140
 6.09%
Benign
Negative


2QFCKAAXX_4

Seq

read









mctp_
aN32
RNA-
Tissue
single_
35
4352455
1355243
115676
 3.34%
Benign
Negative


2QFCKAAXX_text missing or illegible when filed

Seq

read









mctp_
aN33
RNA-
Tissue
single_
35
5375947
2024752
122564
 3.16%
Benign
Negative


2QFCKAAXX_7

Seq

read









mctp_
aN33
RNA-
Tissue
single_
33

text missing or illegible when filed


text missing or illegible when filed


text missing or illegible when filed


text missing or illegible when filed

Benign
Negative


2QFCKAAXX_8

Seq

read









mctp_
aT12_4
RNA-
Tissue
paired_
40
10323732
10700318
391873
 3.34%
Localized
ERG+


4203NAAXX_5

Seq

end









mctp_
aT12_4
RNA-
Tissue
paired_
40
12591552
12687329
1035842
 3.16%
Localized
ERG+



text missing or illegible when filed AAXX_5


Seq

end









mctp_
aT54
RNA-
Tissue
single_
35

text missing or illegible when filed

2395362
155160
 6.35%
Localized
ERG+



text missing or illegible when filed AAXX_7


Seq

read









mctp_
aT5_5
RNA-
Tissue
paired_
40
14298078
15157510
1231913

text missing or illegible when filed

Localized
ERG+



text missing or illegible when filed WAAXX_3


Seq

end









mctp_
aT52
RNA-
Tissue
single_
35
5144018
2594536
146353
 5.56%
Localized
ERG+


20A0MAAXX_8

Seq

read









mctp_
aT76
RNA-
Tissue
single_
30
4482645
2095380
77035
 3.58%
Localized
ERG+



text missing or illegible when filed AAXX_3


Seq

read









mctp_

text missing or illegible when filed

RNA-
Tissue
paired_
40

text missing or illegible when filed

10269470
745408
 7.25%
Localized
ERG+


4203NAAXX_7

Seq

end









mctp_

text missing or illegible when filed

RNA-
Tissue
paired_
40
13155443
12759015
925564
 7.25%
Localized
ERG+


42P text missing or illegible when filed AAXX_7

Seq

end









mctp_
aT20
RNA-
Tissue
single_
35
4505834
2328259

text missing or illegible when filed

 7.05%
Localized
ETV1+



text missing or illegible when filed MAAXX_6


Seq

read









mctp_
aT52
RNA-
Tissue
paired_
34

text missing or illegible when filed

11236237
579321
 5.15%
Localized
ETV1+


30Y5NAAXX_5

Seq

end









mctp_
PrCa10001
RNA-
Tissue
single_
30
5073375
1003723
01777
 4.04%
Localized
Negative



text missing or illegible when filed AAXX_4


Seq

read









mctp_
PrCa10002
RNA-
Tissue
single_
40
3979845

text missing or illegible when filed

122303

text missing or illegible when filed

Localized
Negative



text missing or illegible when filed AAXX_3


Seq

read









mctp_
PrCa10002
RNA-
Tissue
single_
30
5337734
2135509
134750
 5.17%
Localized
Negative



text missing or illegible when filed AAXX_7


Seq

read









mctp_
PrCa10003
RNA-
Tissue
single_
40

text missing or illegible when filed

5325480
200975
 5.04%
Localized
Negative



text missing or illegible when filed W7AAXX_4


Seq

read









mctp_
PrCa10003
RNA-
Tissue
single_
40
2232675
956717
47049
 4.72%
Localized
Negative



text missing or illegible when filed


Seq

read









mctp_
PrCa10003
RNA-
Tissue
single_
30
4309584

text missing or illegible when filed

50319
 4.29%
Localized
Negative


30093AAXX_5

Seq

read









mctp_
PrCa10004
RNA-
Tissue
single_
30
4877518

text missing or illegible when filed

101279
 4.17%
Localized
Negative


20093AAXX_2

Seq

read









mctp_
PrCa10004
RNA-
Tissue
single_
40
8502651
43379331
251531
 6.03%
Localized
Negative


300W7AAXX_3

Seq

read









mctp_
PrCa10005
RNA-
Tissue
single_
30
4597349
2219400
56343
 3.95%
Localized
Negative



text missing or illegible when filed


Seq

read









mctp_
PrCa10005
RNA-
Tissue
single_
40
3750454

text missing or illegible when filed

211005
 5.52%
Localized
Negative



text missing or illegible when filed AAXX_5


Seq

read









mctp_
PrCa10013
RNA-
Tissue
paired_
38
7094073

text missing or illegible when filed

695536
 8.25%
Localized
Negative



text missing or illegible when filed


Seq

end









mctp_
PrCa10013
RNA-
Tissue
paired_
38
15129950
14550397
1105327
 3.12%
Localized
Negative


42 text missing or illegible when filed FAAAXX_5

Seq

end









mctp_
PrCa10013
RNA-
Tissue
paired_
40
21555634
13593357
1143752
 3.78%
Localized
Negative



text missing or illegible when filed FAAXX_4


Seq

end









mctp_
PrCa10014
RNA-
Tissue
paired_
40
12555996
11279993
923435
 3.19%
Localized
Negative


42308AAXX_5

Seq

end









mctp_
PrCa10014
RNA-
Tissue
paired_
40
9529325
7576252
705179

text missing or illegible when filed

Localized
Negative



text missing or illegible when filed


Seq

end









mctp_
PrCa10014
RNA-
Tissue
paired_
40
13185434
17250396
1316951
 7.71%
Localized
Negative



text missing or illegible when filed


Seq

end









mctp_
PrCa10015
RNA-
Tissue
paired_
33
13853345
15792364
1122174
 7.11%
Localized
Negative



text missing or illegible when filed


Seq

end









mctp_
PrCa10015
RNA-
Tissue
paired_
40
14322439
14744516

text missing or illegible when filed

 7.08%
Localized
Negative



text missing or illegible when filed _3


Seq

end









mctp_
PrCa10015
RNA-
Tissue
paired_
30

text missing or illegible when filed

10010115
675850
 5.75%
Localized
Negative


3055U2AAXX_4

Seq

end









mctp_
PrCa10015
RNA-
Tissue
paired_
33
11875130
13526717
954576

text missing or illegible when filed

Localized
Negative


42RY4AAXX_4

Seq

end









mctp_
PrCa10015
RNA-
Tissue
paired_
40
11853538
13459171
1027558
 7.53%
Localized
Negative


42643AAXX_6

Seq

end









mctp_
PrCa10017
RNA-
Tissue
paired_
40
7585235
7555652
622237
 5.24%
Localized
Negative


42643AAXX_3

Seq

end









mctp_
PrCa10017
RNA-
Tissue
paired_
35
13554764
11318051
352274
 7.53%
Localized
Negative


42PFAAAXX_1

Seq

end









mctp_
PrCa10018
RNA-
Tissue
paired_
33

text missing or illegible when filed

18635010
1472858
 7.93%
Localized
Negative



text missing or illegible when filed AAXX_5


Seq

end









mctp_
PrCa10018
RNA-
Tissue
paired_
42
22506693
14935573
1301243
 5.72%
Localized
Negative


420JFAAXX_2

Seq

end









mctp_
PrCa10018
RNA-
Tissue
paired_
34
8565225
10571523
549435
 5.17%
Localized
Negative


30Y5NAAXX_4

Seq

end









mctp_
PrCa10019
RNA-
Tissue
paired_
40

text missing or illegible when filed

12335106
804253
 7.23%
Localized
Negative



text missing or illegible when filed


Seq

end









mctp_
PrCa10021
RNA-
Tissue
paired_
40

text missing or illegible when filed

15470222
1147555
 7.42%
Localized
Negative



text missing or illegible when filed AAXX_5


Seq

end









mctp_
PrCa10013
RNA-
Tissue
paired_
40
9473417
11040935
935157
 8.51%
Localized
Negative



text missing or illegible when filed AAXX_2


Seq

end









mctp_
PrCa10024
RNA-
Tissue
paired_
40
5249645
5541745
432504
 7.81%
Localized
Negative


42C5JAAXX_6

Seq

end









mctp_
PrCa10024
RNA-
Tissue
paired_
38
2109134

text missing or illegible when filed

541558
 7.21%
Localized
Negative


42PF0AAXX_3

Seq

end









mctp_
PrCa10028
RNA-
Tissue
paired_
40
5344595
6256991
515414
 8.25%
Localized
Negative


42CJFAAXX_3

Seq

end









mctp_
PrCa10030
RNA-
Tissue
paired_
38
37239720
12312019
1255021
 5.95%
Localized
Negative


42TB9AAXX_3

Seq

end









mctp_
PrCa10031
RNA-
Tissue
paired_
35
37885940
19792732
1356072

text missing or illegible when filed

Localized
Negative


42TB9AAXX_1

Seq

end









mctp_
PrCa10032
RNA-
Tissue
paired_
38
26093884
18353947
1420206
 7.75%
Localized
Negative


42TB9AAXX_6

Seq

end









mctp_
PrCa10033
RNA-
Tissue
paired_
30
30735020
7145280
450739
 5.45%
Localized
Negative


42TB9AAXX_2

Seq

end









mctp_
PrCa10034
RNA-
Tissue
paired_
35
35494766
15616451
1416932
 7.81%
Localized
Negative


42TB9AAXX_7

Seq

end









mctp_
aT1_3
RNA-
Tissue
paired_
40
14031095
15120363
1839923

text missing or illegible when filed

Localized
Negative


42P6MAAXX_5

Seq

end









mctp_
aT1_3
RNA-
Tissue
paired_
40
34017923
15424771

text missing or illegible when filed


text missing or illegible when filed

Localized
Negative



text missing or illegible when filed WAAXX_2


Seq

end









mctp_
aT38
RNA-
Tissue
paired_
40
14028075
14206815
1075647
 7.57%
Localized
Negative


42848AAXX_7

Seq

end









mctp_
aT38
RNA-
Tissue
paired_
34
9548041
10557079
634116
 5.84%
Localized
Negative


30Y5MAAXX_3

Seq

end









mctp_
aT42
RNA-
Tissue
paired_
30
35907739
17536905
1251425
 5.41%
Localized
Negative


4252TAAXX_2

Seq

end









mctp_
aT42
RNA-
Tissue
single_
45
9446722
4597937
345381
 7.52%
Localized
Negative



text missing or illegible when filed DAAXX_5


Seq

read









mctp_
aT45
RNA-
Tissue
paired_
38
16395435
32740230
314457
 5.42%
Localized
Negative


42927AAXX_3

Seq

end









mctp_
aT45
RNA-
Tissue
single_
45
9154922
3915984
273181
 5.97%
Localized
Negative



text missing or illegible when filed DAAXX_6


Seq

read









mctp_
aT53
RNA-
Tissue
paired_
40
13164542
13049052
1855172

text missing or illegible when filed

Localized
Negative


42603AAXX_7

Seq

end









mctp_
aT55
RNA-
Tissue
single_
36

text missing or illegible when filed


text missing or illegible when filed

105234
 5.44%
Localized
Negative


20F55AAXX_5

Seq

read









mctp_
aT55
RNA-
Tissue
single_
40
7556627
3045289
195579
 5.22%
Localized
Negative


36CW7AAXX_2

Seq

read









mctp_
aT56
RNA-
Tissue
single_
36
4294127

text missing or illegible when filed

102305
 5.56%
Localized
Negative


72F65AAXX_1

Seq

read









mctp_
aT57
RNA-
Tissue
paired_
40
9490697
9403761
628415
 7.32%
Localized
Negative



text missing or illegible when filed AAXX_4


Seq

end









mctp_
aT58
RNA-
Tissue
paired_
40
4160283
4703591
355740
 5.22%
Localized
Negative


42CJFAAXX_8

Seq

end









mctp_
aT51
RNA-
Tissue
paired_
40
10252280
10445125
718210
 5.88%
Localized
Negative


42802AAXX_3

Seq

end









mctp_
aT text missing or illegible when filed
RNA-
Tissue
single_
36
3036317
2455353
153907

text missing or illegible when filed

Localized
Negative


20F56AAXX_7

Seq

read









mctp_
aT text missing or illegible when filed
RNA-
Tissue
single_
40
5055524
3791023
258911
 7.09%
Localized
Negative


30CW7AAXX_8

Seq

read









mctp_
aT56
RNA-
Tissue
single_
36
5184373

text missing or illegible when filed

149553

text missing or illegible when filed

Localized
Negative


42CJFAAXX_2

Seq

read









mctp_
aT6_1
RNA-
Tissue
paired_
40
935249
999294
78652
 7.89%
Localized
Negative


42P6VAAXX_2

Seq

end









mctp_
aT6_1
RNA-
Tissue
paired_
40
5438987
7353528
534419
 7.13%
Localized
Negative


428JFAAXX_7

Seq

end









mctp_
aT6_1
RNA-
Tissue
paired_
38
13242920

text missing or illegible when filed

610289

text missing or illegible when filed

Localized
Negative


429FAAAXX_4

Seq

end









mctp_
PrCa1007
RNA-
Tissue
single_
42
7909935
3245254
303738
 9.34%
Localized
Negative



text missing or illegible when filed _7


Seq

read









mctp_
PrCa1025
RNA-
Tissue
paired_
40

text missing or illegible when filed

9085984
903090
 9.94%
Localized
Negative


4203NAAXX_3

Seq

end









mctp_
PrCa1026
RNA-
Tissue
paired_
40
7781205
0535677
801237
 9.38%
Localized
Negative


4203NAAXX_3

Seq

end









mctp_
PrCa1027
RNA-
Tissue
paired_
40
80305352
11427244
1130543
 3.72%
Localized
Negative


4203NAAXX_5

Seq

end









mctp_
PrCa10029
RNA-
Tissue
paired_
38
8574521
9910831
734269
 7.41%
Localized
Negative


42TB9AAXX_4

Seq

end









mctp_

text missing or illegible when filed 29

RNA-
Tissue
paired_
38
13229393
14060533
186050
 7.54%
Localized
Negative



text missing or illegible when filed _5


Seq

end









mctp_

text missing or illegible when filed

RNA-
Tissue
paired_
40
9542504
3523157
638903
 7.41%
Localized
Negative


3054YAAXX_4

Seq

end









mctp_
aT47
RNA-
Tissue
paired_
40
7209523
7010780
554581
 5.05%
Localized
Negative


42B3YAAXX_4

Seq

end









mctp_
aM23
RNA-
Tissue
single_
36
4580303
2048558
115179
 5.67%
Metastatic
ERG+


20F56AAXX_3

Seq

read









mctp_
aM23
RNA-
Tissue
single_
36
4915455
2137036
127972
 5.85%
Metastatic
ERG+


20F56AAXX_4

Seq

read









mctp_
aM25
RNA-
Tissue
single_
36
5374558
1203543

text missing or illegible when filed

 4.51%
Metastatic
ERG+


20F59AAXX_4

Seq

read









mctp_
aM28
RNA-
Tissue
single_
30
5537535
2234539
79073
 3.54%
Metastatic
ERG+


20LV5AAXX_6

Seq

read









mctp_
aM2 text missing or illegible when filed
RNA-
Tissue
single_
30
5548789
2250521

text missing or illegible when filed

 3.55%
Metastatic
ERG+


20LV5AAXX_7

Seq

read









mctp_
aM28
RNA-
Tissue
single_
35
4905432
1339767
73792
 4.01%
Metastatic
ERG+



text missing or illegible when filed AAXX_4


Seq

read









mctp_
aM29
RNA-
Tissue
single_
36
3092573
1777721
75454
 4.13%
Metastatic
ERG+


205ETAAXX_5

Seq

read









mctp_
aM30
RNA-
Tissue
single_
36
5126432
2559849
150930
 5.99%
Metastatic
ERG+


2074YAAXX_1

Seq

read









mctp_
aM33
RNA-
Tissue
single_
40
4759734

text missing or illegible when filed

139352
 5.11%
Metastatic
ERG+



text missing or illegible when filed AAXX_4


Seq

read









mctp_
aM38
RNA-
Tissue
paired_
40
6778934

text missing or illegible when filed


text missing or illegible when filed

 7.62%
Metastatic
ERG+



text missing or illegible when filed _3


Seq

end









mctp_
aM15
RNA-
Tissue
paired_
35
83825385
11854423
950874
 8.14%
Metastatic
ERG+


42820AAXX_5

Seq

end









mctp_
aM15
RNA-
Tissue
single_
36

text missing or illegible when filed

2087670
95103
 4.58%
Metastatic
ERG+


2074VAAXX_3

Seq

read









mctp_
aM37
RNA-
Tissue
single_
36
4509553
1941952
98651
 4.32%
Metastatic
ETV1+


2074VAAXX_5

Seq

read









mctp_
aM41
RNA-
Tissue
single_
36
4450735
1702039
74579
 4.30%
Metastatic
ETV1+



text missing or illegible when filed AAXX_2


Seq

read









mctp_
aM41
RNA-
Tissue
single_
36
5532905
1051694

text missing or illegible when filed

 4.03%
Metastatic
ETV1+


20FETAAXX_text missing or illegible when filed

Seq

read









mctp_
aM41
RNA-
Tissue
single_
36
5222744
5134030
38780
 4.0%
Metastatic
ETV1+


2074NAAXX_2

Seq

read









mctp_

text missing or illegible when filed -57

RNA-
Tissue
paired_
40
9563724
10247077
1004315
 3.89%
Metastatic
Negative


3054YAAXX_6

Seq

end









mctp_

text missing or illegible when filed -93

RNA-
Tissue
paired_
40

text missing or illegible when filed

10358661
951083
 3.19%
Metastatic
Negative


3054YAAXX_5

Seq

end









mctp_
aM text missing or illegible when filed
RNA-
Tissue
single_
36
5201508
1335757
100573
 4.55%
Metastatic
Negative


20EXPAAXX_7

Seq

read









mctp_
aM29
RNA-
Tissue
paired_
40
9038499
3821509
572135
 5.49%
Metastatic
Negative


4241FAAXX_5

Seq

end









mctp_
aM35
RNA-
Tissue
single_
36
5587358
2277795
104747

text missing or illegible when filed

Metastatic
Negative


30EYPAAXX_5

Seq

read









mctp_
aM35
RNA-
Tissue
single_
40
9598611
3833469
193878
 5.05%
Metastatic
Negative



text missing or illegible when filed W7AAXX_6


Seq

read









mctp_
aM3 text missing or illegible when filed
RNA-
Tissue
paired_
40
7749510
2450300
141723
 5.83%
Metastatic
Negative


30 text missing or illegible when filed GAAXX_1

Seq

end









mctp_
aM3 text missing or illegible when filed
RNA-
Tissue
single_
36
5097473
2217807

text missing or illegible when filed


text missing or illegible when filed

Metastatic
Negative


205NAAAXX_text missing or illegible when filed

Seq

read









mctp_
aM39
RNA-
Tissue
single_
36
5516548
2539222
113774

text missing or illegible when filed

Metastatic
Negative


20EYPAAXX_2

Seq

read









mctp_
aM39
RNA-
Tissue
paired_
40
5279570
3585922
238234
 5.82%
Metastatic
Negative


30 text missing or illegible when filed GAAXX_5

Seq

end









mctp_
aM39
RNA-
Tissue
single_
36
5354844
1217551
102001

text missing or illegible when filed

Metastatic
Negative


20FETAAXX_7

Seq

read









mctp_
aM43
RNA-
Tissue
single_
36
5497705

text missing or illegible when filed


text missing or illegible when filed

 4.33%
Metastatic
Negative


20EXPAAXX_5

Seq

read









mctp_
aM43
RNA-
Tissue
single_
40
2456529
3952621

text missing or illegible when filed

 5.07%
Metastatic
Negative


30CW7AAXX_7

Seq

read













TOTAL

1723713421
1417527939
114448745
 8.97%








"text missing or illegible when filed" indicates data missing or illegible when filed






















TABLE 2

| Chromosome | Cuffcompare | Classification tree filter | Merge intron-redundant transcripts | Informatic filters | Join transcript fragments | Filter intronic pre-mRNA | UCSC Canonical | Refseq |
|---|---|---|---|---|---|---|---|---|
| chr1 | 759121 | 272072 | 12701 | 5030 | 4489 | 3652 | 2499 | 3334 |
| chr2 | 581574 | 206281 | 9353 | 3224 | 2856 | 2361 | 1579 | 2023 |
| chr3 | 518621 | 167071 | 5706 | 2917 | 2560 | 2053 | 1312 | 1816 |
| chr4 | 329950 | 103113 | 5160 | 2019 | 1731 | 1444 | 977 | 1238 |
| chr5 | 380613 | 126139 | 5833 | 2365 | 2067 | 1694 | 1104 | 1465 |
| chr6 | 396848 | 145607 | 7580 | 2590 | 2309 | 1874 | 1370 | 1667 |
| chr7 | 432152 | 134051 | 6432 | 2355 | 2132 | 1703 | 1325 | 1583 |
| chr8 | 308935 | 97724 | 4226 | 1729 | 1529 | 1243 | 848 | 1210 |
| chr9 | 359300 | 122626 | 4069 | 1937 | 1767 | 1402 | 1114 | 1272 |
| chr10 | 354625 | 103512 | 3509 | 1672 | 1508 | 1226 | 998 | 1382 |
| chr11 | 424606 | 165211 | 6909 | 2922 | 2640 | 2102 | 1566 | 2023 |
| chr12 | 425280 | 138650 | 6872 | 2653 | 2373 | 1858 | 1233 | 1668 |
| chr13 | 159649 | 68284 | 3616 | 1118 | 908 | 751 | 425 | 549 |
| chr14 | 261497 | 123741 | 4842 | 1806 | 1619 | 1308 | 885 | 1102 |
| chr15 | 291241 | 108058 | 5816 | 1884 | 1626 | 1321 | 1362 | 1127 |
| chr16 | 364747 | 124182 | 3968 | 2002 | 1835 | 1386 | 1093 | 1311 |
| chr17 | 473261 | 168469 | 5581 | 2780 | 2582 | 1950 | 1480 | 1907 |
| chr18 | 144300 | 49112 | 2504 | 785 | 682 | 539 | 377 | 459 |
| chr19 | 494738 | 189411 | 7209 | 3543 | 3239 | 2269 | 1668 | 2314 |
| chr20 | 217223 | 70308 | 3059 | 1243 | 1158 | 907 | 659 | 926 |
| chr21 | 113368 | 29728 | 939 | 495 | 436 | 354 | 306 | 427 |
| chr22 | 223385 | 73509 | 2401 | 1156 | 1068 | 798 | 633 | 771 |
| chrX | 222743 | 94591 | 4997 | 1516 | 1349 | 1161 | 959 | 1841 |
| chrY | 15190 | 4039 | 272 | 81 | 71 | 59 | 148 | 254 |
| Total | 8253710 | 2885489 | 123554 | 49822 | 44534 | 35415 | 35921 | 33669 |























TABLE 3

| GEO ID | File name | Pubmed ID | Antibody used | Antibody vendor | Peak Finder Used | # Uniquely mapped reads (in millions) | # Peaks Called |
|---|---|---|---|---|---|---|---|
| GSM353631 | VCaP_regular_medium_H3K4me1 | 20478527 | ab8855 | Abcam | MACS | 6.96 | 23126 |
| GSM353632 | VCaP_regular_medium_H3K4me2 | 20478527 | ab7756 | Abcam | MACS | 5.97 | 74153 |
| GSM353620 | VCaP_regular_medium_H3K4me3 | 20478527 | ab8580 | Abcam | MACS | 10.95 | 30043 |
| GSM353624 | VCaP_regular_medium_H3K36me3 | 20478527 | ab9050 | Abcam | SICER | 9.91 | 29860 |
| GSM353629 | VCaP_regular_medium_Ace_H3 | 20478527 | 06-593 | Millipore | MACS | 4.76 | 41971 |
| GSM353622 | VCaP_regular_medium_Pan_H3 | 20478527 | ab1791 | Abcam | MACS | 5.91 | control |
| GSM353623 | VCaP_regular_medium_[text missing or illegible when filed] | 20478527 | ab817 | Abcam | MACS | 6.88 | 16041 |
| GSM353634 | LNCaP_regular_medium_H3K4me1 | 20478527 | ab8895 | Abcam | MACS | 6.19 | 31109 |
| GSM353635 | LNCaP_regular_medium_H3K4me2 | 20478527 | ab7766 | Abcam | MACS | 6.14 | 62061 |
| GSM353626 | LNCaP_regular_medium_H3K4me3 | 20478527 | ab8580 | Abcam | MACS | 10.22 | 19838 |
| GSM353627 | LNCaP_regular_medium_H3K36me3 | 20478527 | ab9050 | Abcam | SICER | 9.15 | 24332 |
| GSM353628 | LNCaP_regular_medium_Ace_H3 | 20478527 | 06-599 | Millipore | MACS | 4.76 | 33211 |
| GSM353617 | LNCaP_Ethl_[text missing or illegible when filed] | 20478527 | ab817 | Abcam | MACS | 1.36 | 8232 |
| GSM353653 | tissue_H3K4me3 | 20478527 | ab8580 | Abcam | MACS | 11.85 | 23750 |

[text missing or illegible when filed] indicates data missing or illegible when filed



















TABLE 4

| Category | Type | Name | Interval | Fold change (Unlogged) | SAM score ((r)/(s + s0)) |
|---|---|---|---|---|---|
| PROTEIN | UPREG. | TU_0084471_0 | chr5: 33980375-34087770 | 12.75 | 7.71 |
| NOVEL | UPREG. | TU_0099865_0 | chr8: 128087842-128095202 | 7.07 | 7.41 |
| PROTEIN | UPREG. | TU_0123088_0 | chr2: 238147710-238169707 | 3.01 | 7.01 |
| ncRNA | UPREG. | TU_0102832_0 | chr9: 78569118-78593537 | 12.23 | 6.93 |
| PROTEIN | UPREG. | TU_0078322_0 | chr12: 32260254-32260805 | 4.52 | 6.82 |
| ncRNA | UPREG. | TU_0101270_0 | chr21: 41853044-41875166 | 9.82 | 6.79 |
| PROTEIN | UPREG. | TU_0027326_0 | chrX: 16874726-17077384 | 3.31 | 6.79 |
| PROTEIN | UPREG. | TU_0092114_0 | chr11: 60223535-60239968 | 7.40 | 6.65 |
| PROTEIN | UPREG. | TU_0044448_0 | chr13: 51509122-51537693 | 4.77 | 6.59 |
| PROTEIN | UPREG. | TU_0023159_0 | chr19: 40224450-40249318 | 3.69 | 6.56 |
| PROTEIN | UPREG. | TU_0092116_0 | chr11: 60238519-60239968 | 7.50 | 6.44 |
| PROTEIN | UPREG. | TU_0123090_0 | chr2: 238164428-238165452 | 3.57 | 6.24 |
| ncRNA | UPREG. | TU_0046239_0 | chr4: 1185645-1201937 | 5.19 | 6.22 |
| PROTEIN | UPREG. | TU_0122750_0 | chr2: 231610299-231625861 | 4.56 | 6.14 |
| PROTEIN | UPREG. | TU_0082723_0 | chr12: 120142512-120219979 | 3.26 | 6.13 |
| PROTEIN | UPREG. | TU_0123089_0 | chr2: 238164428-238165452 | 4.22 | 6.12 |
| PROTEIN | UPREG. | TU_0101111_0 | chr21: 36989329-37045253 | 4.04 | 6.04 |
| PROTEIN | UPREG. | TU_0090152_0 | chr11: 4965638-4969515 | 6.38 | 5.99 |
| PROTEIN | UPREG. | TU_0101113_0 | chr21: 36994126-37045253 | 3.76 | 5.98 |
| PROTEIN | UPREG. | TU_0045026_0 | chr13: 94660907-94668260 | 3.68 | 5.97 |
| ncRNA | UPREG. | TU_0101274_0 | chr21: 41869930-41870631 | 8.95 | 5.88 |
| PROTEIN | UPREG. | TU_0046235_0 | chr4: 1181913-1189142 | 4.28 | 5.87 |
| NOVEL | UPREG. | TU_0054603_0 | chr16: 82380933-82394836 | 7.25 | 5.84 |
| PROTEIN | UPREG. | TU_0101308_0 | chr21: 42605257-42608791 | 4.97 | 5.83 |
| PROTEIN | UPREG. | TU_0084137_0 | chr5: 13981150-13997615 | 3.91 | 5.80 |
| PROTEIN | UPREG. | TU_0084127_0 | chr5: 13882635-13892514 | 4.95 | 5.79 |
| PROTEIN | UPREG. | TU_0101119_0 | chr21: 37034016-37045253 | 3.56 | 5.78 |
| PROTEIN | UPREG. | TU_0054919_0 | chr16: 88188842-88191143 | 3.46 | 5.75 |
| PROTEIN | UPREG. | TU_0120963_0 | chr2: 172658361-172662549 | 27.56 | 5.66 |
| PROTEIN | UPREG. | TU_0044977_0 | chr13: 94524392-94621526 | 3.64 | 5.64 |
| PROTEIN | UPREG. | TU_0052614_0 | chr16: 20542057-20616514 | 6.65 | 5.63 |
| NOVEL | UPREG. | TU_0084303_0 | chr5: 15899476-15955226 | 7.46 | 5.61 |
| PROTEIN | UPREG. | TU_0060406_0 | chr1: 28134091-28158290 | 3.03 | 5.61 |
| PROTEIN | UPREG. | TU_0060407_0 | chr1: 28155047-28170460 | 2.41 | 5.60 |
| ncRNA | UPREG. | TU_0103252_0 | chr9: 96357168-96369978 | 5.00 | 5.58 |
| PROTEIN | UPREG. | TU_0034719_0 | chr14: 73490756-73555773 | 2.51 | 5.57 |
| PROTEIN | UPREG. | TU_0070457_0 | chr20: 2258975-2269890 | 6.49 | 5.56 |
| NOVEL | UPREG. | TU_0114240_0 | chr2: 1534883-1538193 | 5.25 | 5.54 |
| PROTEIN | UPREG. | TU_0087676_0 | chr5: 138643394-138648458 | 2.75 | 5.50 |
| PROTEIN | UPREG. | TU_0084138_0 | chr5: 13976388-13981285 | 4.09 | 5.48 |
| ncRNA | UPREG. | TU_0046237_0 | chr4: 1162036-1195088 | 4.29 | 5.47 |
| ncRNA | UPREG. | TU_0060421_0 | chr1: 28157480-28158290 | 3.12 | 5.44 |
| PROTEIN | UPREG. | TU_0061436_0 | chr1: 37954250-37957136 | 2.66 | 5.41 |
| PROTEIN | UPREG. | TU_0044894_0 | chr13: 94470096-94752898 | 2.85 | 5.38 |
| PROTEIN | UPREG. | TU_0034720_0 | chr14: 73486609-73503474 | 2.20 | 5.38 |
| PROTEIN | UPREG. | TU_0090153_0 | chr11: 4969009-4970186 | 7.37 | 5.34 |
| PROTEIN | UPREG. | TU_0061432_0 | chr1: 37954250-37958679 | 2.65 | 5.31 |
| PROTEIN | UPREG. | TU_0090268_0 | chr11: 6659768-6661138 | 1.76 | 5.30 |
| PROTEIN | UPREG. | TU_0084120_0 | chr5: 13743434-13864864 | 3.59 | 5.29 |
| PROTEIN | UPREG. | TU_0045059_0 | chr13: 94638351-94639152 | 2.93 | 5.28 |
| ncRNA | UPREG. | TU_0075807_0 | chr10: 101676895-101680049 | 2.61 | 5.27 |
| PROTEIN | UPREG. | TU_0078285_0 | chr12: 32150992-32421799 | 3.02 | 5.26 |
| PROTEIN | UPREG. | TU_0103019_0 | chr9: 87826642-87905011 | 2.77 | 5.22 |
| PROTEIN | UPREG. | TU_0046244_0 | chr4: 1185645-1216291 | 3.51 | 5.21 |
| PROTEIN | UPREG. | TU_0075664_0 | chr10: 98752046-98935267 | 4.15 | 5.20 |


PROTEIN
UPREG.
TU_0090949_0
chr11: 24475021-25059245
3.50
5.19


NOVEL
UPREG.
TU_0099864_0
chr8: 128094589-128103681
3.56
5.17


PROTEIN
UPREG.
TU_0030273_0
chrX: 106690714-106735138
3.52
5.15


PROTEIN
UPREG.
TU_0090128_0
chr11: 4656012-4675667
5.26
5.15


PROTEIN
UPREG.
TU_0017700_0
chr17: 51183394-51209728
2.05
5.13


ncRNA
UPREG.
TU_0018760_0
chr17: 71645643-71652049
6.41
5.08


PROTEIN
UPREG.
TU_0018765_0
chr17: 71652262-71747927
5.18
5.06


ncRNA
UPREG.
TU_0114235_0
chr2: 1521347-1608386
4.22
5.04


PROTEIN
UPREG.
TU_0084132_0
chr5: 13964466-13969509
4.30
5.03


NOVEL
UPREG.
TU_0049368_0
chr4: 106772318-106772770
3.40
5.03


PROTEIN
UPREG.
TU_0115204_0
chr2: 27175274-27195587
2.37
4.99


PROTEIN
UPREG.
TU_0115205_0
chr2: 27163593-27178264
2.49
4.98


PROTEIN
UPREG.
TU_0062449_0
chr1: 46418568-46424753
1.95
4.96


PROTEIN
UPREG.
TU_0072027_0
chr20: 35964872-36007156
3.91
4.95


ncRNA
UPREG.
TU_0086706_0
chr5: 116818427-116835522
2.91
4.92


PROTEIN
UPREG.
TU_0084136_0
chr5: 13972327-13976416
3.37
4.91


PROTEIN
UPREG.
TU_0042761_0
chr13: 23200813-23363662
3.54
4.90


PROTEIN
UPREG.
TU_0114168_0
chr15: 99658271-99847175
2.25
4.89


ncRNA
UPREG.
TU_0018764_0
chr17: 71650143-71652049
6.28
4.86


PROTEIN
UPREG.
TU_0085832_0
chr5: 76150810-76167055
3.84
4.86


NOVEL
UPREG.
TU_0090142_0
chr11: 4748677-4760303
12.08
4.86


PROTEIN
UPREG.
TU_0103018_0
chr9: 87745936-87851451
2.41
4.83


NOVEL
UPREG.
TU_0096472_0
chr11: 133844590-133862924
6.85
4.82


PROTEIN
UPREG.
TU_0029229_0
chrX: 70349443-70377690
2.34
4.81


NOVEL
UPREG.
TU_0084306_0
chr5: 15896315-15947088
5.37
4.78


PROTEIN
UPREG.
TU_0024934_0
chr19: 54352845-54407356
1.88
4.77


NOVEL
UPREG.
TU_0096473_0
chr11: 133844590-133862995
6.96
4.76


ncRNA
UPREG.
TU_0101131_0
chr21: 36994126-37041774
3.57
4.74


PROTEIN
UPREG.
TU_0008239_0
chr7: 7362390-7537552
3.00
4.73


PROTEIN
UPREG.
TU_0000022_0
chr6: 1567640-2190842
2.14
4.72


PROTEIN
UPREG.
TU_0065193_0
chr1: 145122471-145183544
2.72
4.72


PROTEIN
UPREG.
TU_0061439_0
chr1: 37954250-37971671
2.46
4.71


ncRNA
UPREG.
TU_0096470_0
chr11: 133841573-133850753
6.44
4.70


PROTEIN
UPREG.
TU_0046219_0
chr4: 993725-995193
3.90
4.69


NOVEL
UPREG.
TU_0078288_0
chr12: 32393283-32405731
2.47
4.67


PROTEIN
UPREG.
TU_0101115_0
chr21: 37000839-37005920
3.31
4.67


NOVEL
UPREG.
TU_0099884_0
chr8: 128301493-128307576
2.65
4.66


PROTEIN
UPREG.
TU_0008489_0
chr7: 23685881-23708938
1.70
4.64


PROTEIN
UPREG.
TU_0042767_0
chr13: 23186666-23204319
4.82
4.64


PROTEIN
UPREG.
TU_0061430_0
chr1: 37930752-37957012
2.30
4.64


PROTEIN
UPREG.
TU_0079451_0
chr12: 52696814-52736068
3.77
4.64


PROTEIN
UPREG.
TU_0069545_0
chr1: 226711356-226712534
2.36
4.63


PROTEIN
UPREG.
TU_0045837_0
chr13: 113151239-113151444
3.73
4.61


PROTEIN
UPREG.
TU_0101138_0
chr21: 36994126-37004010
3.54
4.61


PROTEIN
UPREG.
TU_0049362_0
chr4: 106693102-106771686
3.06
4.58


PROTEIN
UPREG.
TU_0055044_0
chr16: 88589437-88613428
2.23
4.55


PROTEIN
UPREG.
TU_0038605_0
chr3: 52689830-52704651
1.54
4.55


ncRNA
UPREG.
TU_0062653_0
chr1: 51756544-51799759
2.52
4.54


PROTEIN
UPREG.
TU_0080359_0
chr12: 63512292-63558861
1.87
4.53


PROTEIN
UPREG.
TU_0012481_0
chr7: 111155336-111217889
2.04
4.52


PROTEIN
UPREG.
TU_0076355_0
chr10: 115970327-115995953
10.34
4.52


PROTEIN
UPREG.
TU_0099892_0
chr8: 128817416-128822629
2.33
4.52


ncRNA
UPREG.
TU_0050484_0
chr1: 28706931-28707187
2.53
4.51


PROTEIN
UPREG.
TU_0046232_0
chr4: 1147069-1175181
2.75
4.50


PROTEIN
UPREG.
TU_0107858_0
chr22: 40664589-40673116
2.27
4.50


PROTEIN
UPREG.
TU_0042794_0
chr13: 23228589-23228839
3.47
4.49


PROTEIN
UPREG.
TU_0057850_0
chr1: 1523259-1525373
2.80
4.48


PROTEIN
UPREG.
TU_0023156_0
chr19: 40109515-40127909
2.56
4.48


PROTEIN
UPREG.
TU_0102821_0
chr9: 78263916-78312152
2.98
4.48


PROTEIN
UPREG.
TU_0081659_0
chr12: 108636297-108700791
2.90
4.47


PROTEIN
UPREG.
TU_0049370_0
chr4: 106776991-106847697
2.15
4.47


PROTEIN
UPREG.
TU_0047672_0
chr4: 41807710-41840313
2.51
4.47


PROTEIN
UPREG.
TU_0114959_0
chr2: 24865860-24869912
1.68
4.46


PROTEIN
UPREG.
TU_0037043_0
chr3: 13332730-13436812
1.77
4.46


PROTEIN
UPREG.
TU_0087443_0
chr5: 135237637-135247034
4.09
4.46


PROTEIN
UPREG.
TU_0086635_0
chr5: 114489075-114543909
2.02
4.43


PROTEIN
UPREG.
TU_0107859_0
chr22: 40664589-40665721
2.38
4.42


NOVEL
UPREG.
TU_0106548_0
chr22: 22209111-22212055
6.49
4.42


PROTEIN
UPREG.
TU_0067165_0
chr1: 160797907-160845907
1.81
4.40


PROTEIN
UPREG.
TU_0020146_0
chr19: 3728970-3737293
2.53
4.39


PROTEIN
UPREG.
TU_0107642_0
chr22: 39046992-39047479
1.69
4.38


PROTEIN
UPREG.
TU_0016185_0
chr17: 31415814-31422953
3.63
4.38


NOVEL
UPREG.
TU_0104717_0
chr9: 130697833-130698832
2.79
4.36


PROTEIN
UPREG.
TU_0052105_0
chr16: 4785874-4786488
2.99
4.36


PROTEIN
UPREG.
TU_0059663_0
chr1: 21795295-21850886
1.99
4.35


PROTEIN
UPREG.
TU_0108030_0
chr22: 43527117-43638770
1.74
4.34


PROTEIN
UPREG.
TU_0093781_0
chr11: 67151991-67154057
2.48
4.33


PROTEIN
UPREG.
TU_0086924_0
chr5: 126233852-126241807
2.89
4.32


PROTEIN
UPREG.
TU_0048191_0
chr4: 72423780-72424347
2.93
4.32


PROTEIN
UPREG.
TU_0034727_0
chr14: 73508223-73508442
2.29
4.32


PROTEIN
UPREG.
TU_0096297_0
chr11: 128342286-128353900
1.84
4.31


PROTEIN
UPREG.
TU_0007829_0
chr7: 3625233-4275129
4.39
4.30


PROTEIN
UPREG.
TU_0116252_0
chr2: 47449810-47467636
1.93
4.30


PROTEIN
UPREG.
TU_0115216_0
chr2: 27175274-27177799
2.02
4.27


PROTEIN
UPREG.
TU_0018409_0
chr17: 65013419-65049811
2.02
4.26


PROTEIN
UPREG.
TU_0099847_0
chr8: 126511614-126519830
2.75
4.25


PROTEIN
UPREG.
TU_0035152_0
chr14: 81062791-81063412
2.22
4.25


PROTEIN
UPREG.
TU_0040936_0
chr3: 155391785-155458293
2.10
4.25


PROTEIN
UPREG.
TU_0027558_0
chrX: 23595491-23614436
1.66
4.25


PROTEIN
UPREG.
TU_0076460_0
chr10: 121248954-121292235
1.66
4.24


PROTEIN
UPREG.
TU_0067170_0
chr1: 160826739-160826994
2.10
4.23


PROTEIN
UPREG.
TU_0103050_0
chr9: 89409681-89512477
2.30
4.23


PROTEIN
UPREG.
TU_0112868_0
chr15: 77390455-77402242
1.55
4.23


PROTEIN
UPREG.
TU_0090960_0
chr11: 25059388-25060757
3.35
4.23


PROTEIN
UPREG.
TU_0072165_0
chr20: 40142077-40204030
4.69
4.22


PROTEIN
UPREG.
TU_0044687_0
chr13: 74756644-74954891
2.04
4.21


ncRNA
UPREG.
TU_0096477_0
chr11: 133879414-133850753
4.43
4.21


PROTEIN
UPREG.
TU_0093947_0
chr11: 68208575-68215238
1.41
4.20


PROTEIN
UPREG.
TU_0103253_0
chr9: 96405246-96442373
1.69
4.20


PROTEIN
UPREG.
TU_0091863_0
chr11: 57008498-57039966
2.69
4.20


PROTEIN
UPREG.
TU_0106199_0
chr22: 18308042-18314411
3.94
4.20


NOVEL
UPREG.
TU_0090140_0
chr11: 4748163-4759145
6.33
4.20


PROTEIN
UPREG.
TU_0103051_0
chr9: 89302442-89409890
2.37
4.19


NOVEL
UPREG.
TU_0078290_0
chr12: 32394534-32410898
3.20
4.19


PROTEIN
UPREG.
TU_0029336_0
chrX: 70669659-70712461
1.70
4.18


PROTEIN
UPREG.
TU_0092155_0
chr11: 60871597-60886554
1.80
4.18


PROTEIN
UPREG.
TU_0095597_0
chr11: 114549577-114880335
1.75
4.18


PROTEIN
UPREG.
TU_0082724_0
chr12: 120230545-120274615
1.42
4.17


PROTEIN
UPREG.
TU_0079770_0
chr12: 55040666-55042824
4.25
4.16


PROTEIN
UPREG.
TU_0000263_0
chr6: 4060925-4080831
1.55
4.16


NOVEL
UPREG.
TU_0040394_0
chr3: 133418632-133441282
3.46
4.16


PROTEIN
UPREG.
TU_0066594_0
chr1: 154245443-154257363
1.40
4.15


PROTEIN
UPREG.
TU_0099852_0
chr8: 126515081-126519830
2.81
4.15


PROTEIN
UPREG.
TU_0100363_0
chr8: 144891741-144899598
2.24
4.14


PROTEIN
UPREG.
TU_0096461_0
chr11: 133751095-133757235
2.10
4.13


ncRNA
UPREG.
TU_0044488_0
chr13: 51641093-51641310
2.76
4.13


PROTEIN
UPREG.
TU_0048990_0
chr4: 95592056-95804933
2.30
4.13


NOVEL
UPREG.
TU_0078293_0
chr12: 32396393-32414822
2.90
4.13


PROTEIN
UPREG.
TU_0046201_0
chr4: 991841-1010686
2.57
4.12


PROTEIN
UPREG.
TU_0091866_0
chr11: 57008498-57010253
2.54
4.12


PROTEIN
UPREG.
TU_0011133_0
chr7: 94378726-94759741
1.77
4.12


PROTEIN
UPREG.
TU_0122941_0
chr2: 234410713-234427931
3.28
4.12


PROTEIN
UPREG.
TU_0084131_0
chr5: 13929889-13953380
2.62
4.12


NOVEL
UPREG.
TU_0084142_0
chr5: 14017046-14021379
3.59
4.11


PROTEIN
UPREG.
TU_0087955_0
chr5: 140931645-140931865
2.00
4.10


PROTEIN
UPREG.
TU_0085953_0
chr5: 79410392-79410908
3.35
4.10


PROTEIN
UPREG.
TU_0022288_0
chr19: 18357973-18360121
2.75
4.09


PROTEIN
UPREG.
TU_0085951_0
chr5: 79366959-79414885
3.01
4.09


PROTEIN
UPREG.
TU_0060849_0
chr1: 32572021-32574435
1.81
4.09


PROTEIN
UPREG.
TU_0087441_0
chr5: 134934290-134942617
2.74
4.09


PROTEIN
UPREG.
TU_0042725_0
chr13: 23148223-23200531
4.96
4.09


PROTEIN
UPREG.
TU_0039018_0
chr3: 66510805-66634168
1.69
4.08


PROTEIN
UPREG.
TU_0096299_0
chr11: 128340164-128347506
1.70
4.07


PROTEIN
UPREG.
TU_0022290_0
chr19: 18357973-18359195
2.64
4.07


PROTEIN
UPREG.
TU_0100684_0
chr8: 146190487-146191030
1.89
4.06


PROTEIN
UPREG.
TU_0042974_0
chr13: 26148671-26148967
2.81
4.06


NOVEL
UPREG.
TU_0084308_0
chr5: 15938753-15949124
4.09
4.06


NOVEL
UPREG.
TU_0082746_0
chr12: 120197102-120197416
4.97
4.06


PROTEIN
UPREG.
TU_0014355_0
chr17: 2650561-2887730
1.92
4.05


PROTEIN
UPREG.
TU_0114110_0
chr15: 99250537-99274351
2.01
4.05


PROTEIN
UPREG.
TU_0096341_0
chr11: 129534843-129585464
1.64
4.04


PROTEIN
UPREG.
TU_0052083_0
chr16: 4784094-4805339
2.71
4.04


NOVEL
UPREG.
TU_0078296_0
chr12: 32394534-32405549
2.92
4.04


PROTEIN
UPREG.
TU_0084126_0
chr5: 13892443-13903812
3.64
4.03


NOVEL
UPREG.
TU_0047312_0
chr4: 39217669-39222163
3.83
4.02


PROTEIN
UPREG.
TU_0008287_0
chr7: 8119340-8268973
1.65
4.02


PROTEIN
UPREG.
TU_0018937_0
chr17: 73714011-73714967
1.61
4.01


PROTEIN
UPREG.
TU_0048995_0
chr4: 95805027-95808417
2.47
4.00


PROTEIN
UPREG.
TU_0038694_0
chr3: 53810226-53855769
2.03
3.99


ncRNA
UPREG.
TU_0046233_0
chr4: 1202157-1232168
2.45
3.99


PROTEIN
UPREG.
TU_0019018_0
chr17: 75372094-75381243
2.25
3.98


PROTEIN
UPREG.
TU_0042326_0
chr3: 199123974-199125319
1.77
3.98


PROTEIN
UPREG.
TU_0099893_0
chr8: 128817416-128819105
2.23
3.98


PROTEIN
UPREG.
TU_0012491_0
chr7: 111304238-111362856
1.91
3.98


PROTEIN
UPREG.
TU_0112335_0
chr15: 70816880-70864494
1.71
3.97


PROTEIN
UPREG.
TU_0047964_0
chr4: 57020861-57038533
1.74
3.97


PROTEIN
UPREG.
TU_0052565_0
chr16: 19362784-19409995
1.98
3.96


NOVEL
UPREG.
TU_0042717_0
chr13: 23149908-23200198
4.95
3.96


PROTEIN
UPREG.
TU_0017374_0
chr17: 43380086-43404182
1.53
3.96


PROTEIN
UPREG.
TU_0071058_0
chr20: 20318209-20549154
2.02
3.96


PROTEIN
UPREG.
TU_0105741_0
chrY: 6971017-6998339
2.20
3.95


PROTEIN
UPREG.
TU_0018995_0
chr17: 74491566-74517485
1.64
3.94


PROTEIN
UPREG.
TU_0103055_0
chr9: 89512509-8913285
1.92
3.93


PROTEIN
UPREG.
TU_0041139_0
chr3: 171237964-171285906
1.91
3.93


PROTEIN
UPREG.
TU_0042325_0
chr3: 199124975-199143480
1.74
3.93


PROTEIN
UPREG.
TU_0020688_0
chr19: 8180084-8237335
1.60
3.93


PROTEIN
UPREG.
TU_0118314_0
chr2: 99086923-99100654
1.78
3.92


PROTEIN
UPREG.
TU_0017875_0
chr17: 54652767-54706896
2.33
3.92


PROTEIN
UPREG.
TU_0037277_0
chr3: 24134438-24511318
1.75
3.92


PROTEIN
UPREG.
TU_0047593_0
chr4: 40446539-40457235
1.90
3.91


PROTEIN
UPREG.
TU_0114108_0
chr15: 99235494-99274389
2.00
3.91


ncRNA
UPREG.
TU_0024530_0
chr19: 50889160-50909766
1.72
3.91


PROTEIN
UPREG.
TU_0008957_0
chr7: 38308886-38325338
2.62
3.91


PROTEIN
UPREG.
TU_0043122_0
chr13: 28981555-28989371
1.73
3.90


PROTEIN
UPREG.
TU_0076644_0
chr10: 127398227-127398596
2.06
3.90


PROTEIN
UPREG.
TU_0045423_0
chr13: 100053877-100125079
2.02
3.89


PROTEIN
UPREG.
TU_0045495_0
chr13: 107720446-107737194
2.06
3.88


PROTEIN
UPREG.
TU_0076648_0
chr10: 127412714-127442685
1.64
3.88


NOVEL
UPREG.
TU_0088857_0
chr5: 172259171-172275517
1.69
3.87


NOVEL
UPREG.
TU_0044453_0
chr13: 51505777-51524522
2.96
3.86


NOVEL
UPREG.
TU_0047330_0
chr4: 39217641-39222163
3.43
3.86


PROTEIN
UPREG.
TU_0100838_0
chr21: 30508275-30510244
2.43
3.86


NOVEL
UPREG.
TU_0106544_0
chr22: 22210421-22220506
4.27
3.85


ncRNA
UPREG.
TU_0100275_0
chr8: 144520506-144537551
2.11
3.85


PROTEIN
UPREG.
TU_0057466_0
chr18: 72853744-72866791
1.58
3.84


PROTEIN
UPREG.
TU_0040010_0
chr3: 126311839-126412928
2.16
3.84


PROTEIN
UPREG.
TU_0042800_0
chr13: 23360816-23370548
2.73
3.84


PROTEIN
UPREG.
TU_0117501_0
chr2: 74065748-74174193
1.71
3.83


PROTEIN
UPREG.
TU_0053389_0
chr16: 45673980-45701001
2.66
3.83


PROTEIN
UPREG.
TU_0087944_0
chr5: 140874777-140978925
1.47
3.83


PROTEIN
UPREG.
TU_0017393_0
chr17: 43389397-43390300
1.90
3.82


PROTEIN
UPREG.
TU_0008919_0
chr7: 38257158-38271020
1.93
3.82


PROTEIN
UPREG.
TU_0033383_0
chr14: 50259793-50367616
1.51
3.82


PROTEIN
UPREG.
TU_0049911_0
chr4: 139304784-139382952
2.48
3.82


PROTEIN
UPREG.
TU_0024366_0
chr19: 50100808-50104487
1.86
3.82


PROTEIN
UPREG.
TU_0070109_0
chr1: 243979271-244159914
1.56
3.81


PROTEIN
UPREG.
TU_0120975_0
chr2: 182104631-182107832
1.86
3.80


NOVEL
UPREG.
TU_0044933_0
chr13: 94755992-94760688
2.52
3.80


PROTEIN
UPREG.
TU_0103689_0
chr9: 111019219-111122750
1.75
3.80


PROTEIN
UPREG.
TU_0096460_0
chr11: 133734857-133786962
2.09
3.79


PROTEIN
UPREG.
TU_0071115_0
chr20: 24934888-24986948
1.48
3.79


PROTEIN
UPREG.
TU_0093783_0
chr11: 67153661-67153870
2.48
3.79


PROTEIN
UPREG.
TU_0047591_0
chr4: 40457999-40506655
1.79
3.79


PROTEIN
UPREG.
TU_0112336_0
chr15: 70830765-70838346
1.63
3.78


PROTEIN
UPREG.
TU_0066664_0
chr1: 154481433-154485049
2.29
3.78


PROTEIN
UPREG.
TU_0018812_0
chr17: 72119376-72151549
3.38
3.78


PROTEIN
UPREG.
TU_0110225_0
chr15: 43310091-43512722
3.60
3.78


ncRNA
UPREG.
TU_0054545_0
chr16: 79431010-79431852
10.26
3.78


PROTEIN
UPREG.
TU_0107643_0
chr22: 39072466-39093168
1.36
3.78


PROTEIN
UPREG.
TU_0025230_0
chr19: 55992773-56000199
1.86
3.78


PROTEIN
UPREG.
TU_0012480_0
chr7: 111153704-111155311
1.81
3.77


PROTEIN
UPREG.
TU_0070821_0
chr20: 8997167-9409281
1.64
3.77


PROTEIN
UPREG.
TU_0103873_0
chr9: 115151636-115178163
1.52
3.77


PROTEIN
UPREG.
TU_0018813_0
chr17: 72128611-72133119
3.56
3.76


NOVEL
UPREG.
TU_0112004_0
chr15: 67644390-67650387
3.56
3.76


PROTEIN
UPREG.
TU_0043118_0
chr13: 28981555-29067829
1.76
3.76


NOVEL
UPREG.
TU_0112003_0
chr15: 67645590-67775246
3.12
3.76


NOVEL
UPREG.
TU_0060446_0
chr1: 28438629-28450156
2.23
3.75


PROTEIN
UPREG.
TU_0122972_0
chr2: 236068012-236482693
1.69
3.75


NOVEL
UPREG.
TU_0106545_0
chr22: 22218478-22219162
3.99
3.74


PROTEIN
UPREG.
TU_0087283_0
chr5: 133753241-133766074
1.85
3.74


ncRNA
UPREG.
TU_0025312_0
chr19: 57059515-57145170
1.89
3.74


PROTEIN
UPREG.
TU_0079679_0
chr12: 54760142-54783545
1.58
3.73


PROTEIN
UPREG.
TU_0074564_0
chr10: 64241765-64246112
2.62
3.73


PROTEIN
UPREG.
TU_0106189_0
chr22: 18235213-18328816
1.82
3.73


PROTEIN
UPREG.
TU_0078994_0
chr12: 49412412-49428706
1.41
3.72


ncRNA
UPREG.
TU_0003229_0
chr6: 41598975-41621874
2.05
3.72


PROTEIN
UPREG.
TU_0040937_0
chr3: 155439710-155458293
1.96
3.72


PROTEIN
UPREG.
TU_0040093_0
chr3: 128830731-128874336
1.87
3.72


NOVEL
UPREG.
TU_0106542_0
chr22: 22211315-22220506
3.77
3.71


PROTEIN
UPREG.
TU_0019375_0
chr17: 77608812-77616980
1.63
3.71


PROTEIN
UPREG.
TU_0042563_0
chr13: 20264762-20334966
1.85
3.71


PROTEIN
UPREG.
TU_0103386_0
chr9: 9905734-99110148
1.89
3.71


PROTEIN
UPREG.
TU_0030004_0
chrX: 100534013-100534540
1.84
3.71


NOVEL
UPREG.
TU_0089906_0
chr11: 1042845-1045705
2.94
3.71


NOVEL
UPREG.
TU_0089014_0
chr5: 176014905-176015351
2.01
3.71


ncRNA
UPREG.
TU_0056173_0
chr18: 22523074-22537627
3.31
3.70


PROTEIN
UPREG.
TU_0052880_0
chr16: 28393117-28411069
1.48
3.70


PROTEIN
UPREG.
TU_0100355_0
chr8: 144884230-144910177
2.00
3.69


PROTEIN
UPREG.
TU_0096216_0
chr11: 125271293-125271517
2.08
3.69


PROTEIN
UPREG.
TU_0092161_0
chr11: 60884289-60892364
1.99
3.68


PROTEIN
UPREG.
TU_0086926_0
chr5: 126241953-126394149
2.27
3.68


NOVEL
UPREG.
TU_0088230_0
chr5: 148864170-148864752
1.94
3.68


ncRNA
UPREG.
TU_0099940_0
chr8: 129065546-129182684
1.61
3.68


PROTEIN
UPREG.
TU_0089017_0
chr5: 176222085-176240501
10.21
3.67


PROTEIN
UPREG.
TU_0078586_0
chr12: 46643629-46648944
1.47
3.67


PROTEIN
UPREG.
TU_0053467_0
chr16: 51028455-51138080
2.19
3.67


PROTEIN
UPREG.
TU_0089452_0
chr5: 179258704-179258997
1.62
3.67


PROTEIN
UPREG.
TU_0076329_0
chr10: 115501382-115531028
2.60
3.67


PROTEIN
UPREG.
TU_0047688_0
chr4: 42105164-42354144
1.68
3.67


PROTEIN
UPREG.
TU_0059142_0
chr1: 16203274-16206548
12.41
3.67


PROTEIN
UPREG.
TU_0116906_0
chr2: 63135968-63138462
2.81
3.66


PROTEIN
UPREG.
TU_0000154_0
chr6: 3063923-3099152
1.53
3.66


PROTEIN
UPREG.
TU_0088782_0
chr5: 170625426-170659593
1.78
3.66


NOVEL
UPREG.
TU_0089905_0
chr11: 1042845-1045705
2.77
3.66


PROTEIN
UPREG.
TU_0101704_0
chr9: 3265495-3516005
2.33
3.64


ncRNA
UPREG.
TU_0044897_0
chr13: 94746488-94760688
2.17
3.64


PROTEIN
UPREG.
TU_0071059_0
chr20: 20549245-20641260
2.39
3.64


ncRNA
UPREG.
TU_0046268_0
chr4: 1199698-1211108
1.93
3.63


PROTEIN
UPREG.
TU_0071601_0
chr20: 32827590-32828002
1.75
3.62


PROTEIN
UPREG.
TU_0100712_0
chr21: 15258179-15359100
2.14
3.62


PROTEIN
UPREG.
TU_0092156_0
chr11: 60885030-60893249
1.45
3.62


PROTEIN
UPREG.
TU_0091402_0
chr11: 46255779-46299542
1.71
3.62


PROTEIN
UPREG.
TU_0039018_0
chr3: 66376322-66514060
1.50
3.62


PROTEIN
UPREG.
TU_0100378_0
chr8: 144899799-144900640
2.00
3.62


NOVEL
UPREG.
TU_0112025_0
chr15: 67780574-67782345
3.42
3.62


PROTEIN
UPREG.
TU_0106031_0
chr22: 16336630-16412806
2.01
3.62


PROTEIN
UPREG.
TU_0050785_0
chr4: 174395360-174453821
2.36
3.61


PROTEIN
UPREG.
TU_0058834_0
chr1: 11768665-11783670
1.50
3.61


PROTEIN
UPREG.
TU_0039496_0
chr3: 106753939-106754201
1.99
3.61


ncRNA
UPREG.
TU_0098397_0
chr8: 69379259-69406175
2.73
3.61


PROTEIN
UPREG.
TU_0017847_0
chr17: 54188675-54413808
2.82
3.61


PROTEIN
UPREG.
TU_0108299_0
chr22: 49267227-49270226
2.03
3.60


PROTEIN
UPREG.
TU_0076846_0
chr10: 135042714-135056670
2.27
3.59


PROTEIN
UPREG.
TU_0096351_0
chr11: 129611827-129689996
1.61
3.59


PROTEIN
UPREG.
TU_0019298_0
chr17: 77242472-77300154
1.51
3.59


PROTEIN
UPREG.
TU_0057465_0
chr18: 72830973-7297379
1.56
3.59


PROTEIN
UPREG.
TU_0013475_0
chr7: 148137800-148212367
1.74
3.59


PROTEIN
UPREG.
TU_0001426_0
chr6: 28655044-28662198
2.56
3.59


NOVEL
UPREG.
TU_0106541_0
chr22: 22209111-22219162
4.02
3.58


PROTEIN
UPREG.
TU_0073803_0
chr10: 19005554-19007053
1.84
3.58


PROTEIN
UPREG.
TU_0040100_0
chr3: 129253916-129289610
1.39
3.58


PROTEIN
UPREG.
TU_0001431_0
chr6: 28978594-28999755
1.33
3.58


PROTEIN
UPREG.
TU_0076643_0
chr10: 127398227-127407663
1.73
3.57


PROTEIN
UPREG.
TU_0089137_0
chr5: 176814485-176815986
1.93
3.57


PROTEIN
UPREG.
TU_0098700_0
chr8: 82806988-82833618
1.76
3.57


PROTEIN
UPREG.
TU_0093785_0
chr11: 67186209-67198838
3.74
3.57


NOVEL
UPREG.
TU_0056168_0
chr18: 22477042-22477886
3.05
3.57


PROTEIN
UPREG.
TU_0067222_0
chr1: 164063363-164147501
1.63
3.57


PROTEIN
UPREG.
TU_0052172_0
chr16: 8799176-8799379
1.61
3.57


PROTEIN
UPREG.
TU_0008360_0
chr7: 16652301-16712672
1.46
3.57


PROTEIN
UPREG.
TU_0035610_0
chr14: 93580687-93582188
2.08
3.56


PROTEIN
UPREG.
TU_0000168_0
chr6: 3100128-3102765
2.10
3.56


PROTEIN
UPREG.
TU_0039649_0
chr3: 115160992-115164502
1.72
3.56


PROTEIN
UPREG.
TU_0052843_0
chr16: 27143818-27187607
1.42
3.56


NOVEL
UPREG.
TU_0024950_0
chr19: 54450100-54452968
2.11
3.55


PROTEIN
UPREG.
TU_0008504_0
chr7: 24656812-24693891
1.99
3.55


PROTEIN
UPREG.
TU_0061102_0
chr1: 35671678-35795597
1.44
3.55


PROTEIN
UPREG.
TU_0032890_0
chr14: 36736878-36788106
2.36
3.55


ncRNA
UPREG.
TU_0046241_0
chr4: 1158292-1167160
2.53
3.55


NOVEL
UPREG.
TU_0008499_0
chr7: 24236191-24236455
5.44
3.54


PROTEIN
UPREG.
TU_0100172_0
chr8: 142471307-142511866
1.78
3.54


NOVEL
UPREG.
TU_0086543_0
chr5: 110311813-110312092
1.53
3.53


PROTEIN
UPREG.
TU_0072450_0
chr20: 44619899-44747359
1.83
3.53


NOVEL
UPREG.
TU_0044931_0
chr13: 94755980-94759335
2.15
3.53


PROTEIN
UPREG.
TU_0093950_0
chr11: 68214746-68215218
1.49
3.53


PROTEIN
UPREG.
TU_0006239_0
chr6: 138649313-138671427
2.22
3.53


PROTEIN
UPREG.
TU_0065894_0
chr1: 150044684-150070988
1.54
3.52


PROTEIN
UPREG.
TU_0078675_0
chr12: 47602047-47602939
1.58
3.52


PROTEIN
UPREG.
TU_0052150_0
chr16: 8799176-8864674
1.42
3.52


NOVEL
UPREG.
TU_0112021_0
chr15: 67762926-67783593
2.66
3.52


PROTEIN
UPREG.
TU_0041581_0
chr3: 185450132-185459240
1.77
3.52


PROTEIN
UPREG.
TU_0017269_0
chr17: 42127174-42189979
1.59
3.52


PROTEIN
UPREG.
TU_0103138_0
chr9: 94055563-94056563
1.61
3.52


PROTEIN
UPREG.
TU_0078683_0
chr12: 47603989-47604485
1.69
3.52


PROTEIN
UPREG.
TU_0099209_0
chr11: 6453771-6453210
1.44
3.51


ncRNA
UPREG.
TU_0045193_0
chr13: 97851959-97852689
1.98
3.51


PROTEIN
UPREG.
TU_0050499_0
chr4: 156862572-156862939
1.82
3.51


PROTEIN
UPREG.
TU_0088025_0
chr5: 142130134-142254088
1.89
3.51


PROTEIN
UPREG.
TU_0052554_0
chr16: 19329285-19424714
1.78
3.51


PROTEIN
UPREG.
TU_0085653_0
chr5: 70918890-70990273
2.39
3.51


PROTEIN
UPREG.
TU_0101238_0
chr21: 41610494-41651888
1.89
3.50


PROTEIN
UPREG.
TU_0098689_0
chr8: 82355436-82355977
4.15
3.49


PROTEIN
UPREG.
TU_0100271_0
chr8: 144522379-144537551
1.93
3.49


PROTEIN
UPREG.
TU_0013258_0
chr7: 139750340-139773086
1.85
3.49


PROTEIN
UPREG.
TU_0122559_0
chr2: 224338108-224338327
2.32
3.49


PROTEIN
UPREG.
TU_0068947_0
chr1: 212567070-212567723
1.74
3.48


PROTEIN
UPREG.
TU_0101300_0
chr21: 42512421-42593934
1.60
3.48


PROTEIN
UPREG.
TU_0105268_0
chr9: 138238011-138277254
1.49
3.47


PROTEIN
UPREG.
TU_0080269_0
chr12: 62524730-62664317
2.05
3.47


PROTEIN
UPREG.
TU_0001992_0
chr6: 31939105-31955076
1.56
3.47


PROTEIN
UPREG.
TU_0018485_0
chr17: 70458432-70480451
1.58
3.47


ncRNA
UPREG.
TU_0050493_0
chr1: 28705947-28706605
1.60
3.46


PROTEIN
UPREG.
TU_0085975_0
chr5: 79478814-79495113
1.91
3.46


PROTEIN
UPREG.
TU_0018919_0
chr17: 73678343-73714970
1.48
3.46


ncRNA
UPREG.
TU_0054534_0
chr16: 79404014-79431652
9.85
3.46


PROTEIN
UPREG.
TU_0076107_0
chr10: 104454315-104488075
1.67
3.45


ncRNA
UPREG.
TU_0069658_0
chr1: 229724782-229731269
1.75
3.45


NOVEL
UPREG.
TU_0120387_0
chr2: 170267824-170281386
2.10
3.45


PROTEIN
UPREG.
TU_0015665_0
chr17: 24073407-24077926
1.52
3.45


ncRNA
UPREG.
TU_0070414_0
chr20: 1254059-1303172
1.68
3.45


NOVEL
UPREG.
TU_0072624_0
chr20: 47335522-47338977
1.65
3.45


PROTEIN
UPREG.
TU_0012495_0
chr7: 111373031-111411626
2.29
3.45


PROTEIN
UPREG.
TU_0076659_0
chr10: 127514501-127526128
1.31
3.45


PROTEIN
UPREG.
TU_0088525_0
chr5: 156625701-156755178
1.53
3.45


PROTEIN
UPREG.
TU_0046096_0
chr4: 759449-809939
2.01
3.44


ncRNA
UPREG.
TU_0074332_0
chr10: 43420869-43421283
1.52
3.44


PROTEIN
UPREG.
TU_0082983_0
chr12: 121778239-121779189
2.65
3.44


PROTEIN
UPREG.
TU_0008361_0
chr7: 16759923-16790805
1.58
3.44


PROTEIN
UPREG.
TU_0061443_0
chr1: 38032067-38039550
1.67
3.44


PROTEIN
UPREG.
TU_0042715_0
chr13: 23148223-23204319
3.68
3.43


ncRNA
UPREG.
TU_0119128_0
chr2: 118310197-118313068
1.62
3.43


PROTEIN
UPREG.
TU_0112349_0
chr15: 70834440-70835126
1.67
3.43


PROTEIN
UPREG.
TU_0027543_0
chrX: 21921233-21922374
2.48
3.43


PROTEIN
UPREG.
TU_0062582_0
chr1: 47489058-47552320
1.83
3.43


ncRNA
UPREG.
TU_0050791_0
chr4: 174322695-174323924
2.13
3.41


PROTEIN
UPREG.
TU_0048346_0
chr4: 77175264-77176185
2.48
3.41


NOVEL
UPREG.
TU_0093068_0
chr11: 64956616-64961189
2.13
3.41


PROTEIN
UPREG.
TU_0033869_0
chr14: 60248258-60260801
1.21
3.41


PROTEIN
UPREG.
TU_0000031_0
chr6: 2190031-2190908
2.44
3.41


PROTEIN
UPREG.
TU_0082131_0
chr12: 111151572-111152227
1.88
3.40


PROTEIN
UPREG.
TU_0038169_0
chr3: 49035494-49041923
1.35
3.40


NOVEL
UPREG.
TU_0044898_0
chr13: 94753009-94760688
2.11
3.40


PROTEIN
UPREG.
TU_0089144_0
chr5: 176814489-176815986
1.86
3.40


PROTEIN
UPREG.
TU_0094504_0
chr11: 74812477-74817273
2.40
3.40


PROTEIN
UPREG.
TU_0035633_0
chr14: 94304291-94305127
2.17
3.40


PROTEIN
UPREG.
TU_0085819_0
chr5: 75734806-76039614
1.64
3.40


PROTEIN
UPREG.
TU_0061431_0
chr1: 37961347-37973585
2.62
3.40


NOVEL
UPREG.
TU_0078299_0
chr12: 32290896-32292169
3.67
3.39


PROTEIN
UPREG.
TU_0004059_0
chr6: 52976378-53034598
1.65
3.39


PROTEIN
UPREG.
TU_0098927_0
chr8: 95722432-95788870
1.48
3.39


ncRNA
UPREG.
TU_0013886_0
chr7: 155957953-156090820
2.50
3.39


PROTEIN
UPREG.
TU_0068377_0
chr1: 201452418-201458956
1.84
3.39


NOVEL
UPREG.
TU_0101035_0
chr21: 35419563-36421930
1.84
3.39


PROTEIN
UPREG.
TU_0062957_0
chr1: 54089897-54128073
1.43
3.39


PROTEIN
UPREG.
TU_0099854_0
chr8: 127633901-127639897
1.65
3.38


PROTEIN
UPREG.
TU_0048743_0
chr4: 87924751-87955166
1.47
3.38


PROTEIN
UPREG.
TU_0086478_0
chr5: 102510255-102521832
1.95
3.38


PROTEIN
UPREG.
TU_0120565_0
chr2: 172672776-172675279
4.31
3.38


PROTEIN
UPREG.
TU_0122360_0
chr2: 219554051-219557439
2.92
3.38


PROTEIN
UPREG.
TU_0092154_0
chr11: 60857271-60874474
1.44
3.37


PROTEIN
UPREG.
TU_0015718_0
chr17: 24095069-24100305
1.64
3.37


PROTEIN
UPREG.
TU_0039284_0
chr3: 95208586-95249573
2.23
3.37


PROTEIN
UPREG.
TU_0082089_0
chr12: 111082307-111187476
1.44
3.37


PROTEIN
UPREG.
TU_0035148_0
chr14: 81009021-81069951
1.64
3.37


PROTEIN
UPREG.
TU_0054849_0
chr16: 87403253-87406669
1.47
3.37


PROTEIN
UPREG.
TU_0113376_0
chr15: 87432680-87545107
2.13
3.36


PROTEIN
UPREG.
TU_0019481_0
chr17: 77998514-77999441
1.55
3.36


PROTEIN
UPREG.
TU_0007004_0
chr6: 158396021-158440190
1.47
3.36


PROTEIN
UPREG.
TU_0092190_0
chr11: 60876795-60877493
1.85
3.36


ncRNA
UPREG.
TU_0001996_0
chr6: 31941546-31959679
1.43
3.36


NOVEL
UPREG.
TU_0066689_0
chr1: 154509233-154510967
1.61
3.36


PROTEIN
UPREG.
TU_0035151_0
chr14: 81015445-81021875
2.00
3.35


PROTEIN
UPREG.
TU_0092866_0
chr11: 63975211-63975675
3.20
3.35


PROTEIN
UPREG.
TU_0050482_0
chr4: 156807332-156877628
1.69
3.35


PROTEIN
UPREG.
TU_0022391_0
chr19: 19076718-19094443
1.60
3.35


PROTEIN
UPREG.
TU_0048729_0
chr4: 87734463-87924734
1.74
3.35


PROTEIN
UPREG.
TU_0103472_0
chr9: 100534124-100570357
1.61
3.35


PROTEIN
UPREG.
TU_0087465_0
chr5: 136431191-136431490
2.47
3.35


PROTEIN
UPREG.
TU_0058833_0
chr1: 11768665-11788581
1.45
3.34


PROTEIN
DOWNREG.
TU_0009047_0
chr7: 41967123-41970103
0.65
−3.35


PROTEIN
DOWNREG.
TU_0020039_0
chr19: 2948637-2980244
0.65
−3.36


PROTEIN
DOWNREG.
TU_0024046_0
chr19: 47194316-47201741
0.53
−3.36


PROTEIN
DOWNREG.
TU_0120035_0
chr2: 154042114-154043553
0.49
−3.36


PROTEIN
DOWNREG.
TU_0014542_0
chr17: 4790024-4790984
0.77
−3.36


PROTEIN
DOWNREG.
TU_0058703_0
chr1: 10457547-10613394
0.66
−3.37


NOVEL
DOWNREG.
TU_0084922_0
chr5: 44337219-44338127
0.51
−3.37


PROTEIN
DOWNREG.
TU_0067333_0
chr1: 167362572-167539064
0.68
−3.37


PROTEIN
DOWNREG.
TU_0030086_0
chrX: 101794939-101798995
0.64
−3.37


PROTEIN
DOWNREG.
TU_0031101_0
chrX: 134247418-134254372
0.69
−3.37


PROTEIN
DOWNREG.
TU_0063762_0
chr1: 87566944-87583813
0.66
−3.38


PROTEIN
DOWNREG.
TU_0107584_0
chr22: 38075931-38123808
0.66
−3.38


PROTEIN
DOWNREG.
TU_0102296_0
chr9: 34979701-34988409
0.57
−3.38


PROTEIN
DOWNREG.
TU_0038455_0
chr3: 51951847-51958668
0.65
−3.38


PROTEIN
DOWNREG.
TU_0062948_0
chr1: 53744574-53746867
0.46
−3.38


PROTEIN
DOWNREG.
TU_0092655_0
chr11: 63282470-63288729
0.73
−3.38


PROTEIN
DOWNREG.
TU_0035606_0
chr14: 93470258-93500717
0.58
−3.38


PROTEIN
DOWNREG.
TU_0055588_0
chr18: 10470831-10478699
0.58
−3.38


PROTEIN
DOWNREG.
TU_0056462_0
chr18: 41558112-41584622
0.49
−3.39


PROTEIN
DOWNREG.
TU_0002739_0
chr6: 35321958-35328561
0.55
−3.39


PROTEIN
DOWNREG.
TU_0030147_0
chrX: 102727067-102729284
0.65
−3.39


NOVEL
DOWNREG.
TU_0030209_0
chrX: 103250901-103253228
0.66
−3.39


ncRNA
DOWNREG.
TU_0068206_0
chr1: 200132176-200134973
0.60
−3.39


PROTEIN
DOWNREG.
TU_0081627_0
chr12: 108186419-108190411
0.63
−3.40


PROTEIN
DOWNREG.
TU_0068194_0
chr1: 200132176-200182322
0.59
−3.40


PROTEIN
DOWNREG.
TU_0049308_0
chr4: 104220026-104220361
0.46
−3.40


NOVEL
DOWNREG.
TU_0068431_0
chr1: 202350966-202363482
0.62
−3.40


PROTEIN
DOWNREG.
TU_0073506_0
chr10: 7630096-7723984
0.60
−3.40


PROTEIN
DOWNREG.
TU_0054695_0
chr16: 83411105-83499914
0.62
−3.40


PROTEIN
DOWNREG.
TU_0012556_0
chr7: 115934290-115935899
0.50
−3.41


PROTEIN
DOWNREG.
TU_0018647_0
chr17: 71259157-71294839
0.74
−3.41


NOVEL
DOWNREG.
TU_0030577_0
chrX: 118036531-118036860
0.43
−3.41


PROTEIN
DOWNREG.
TU_0089961_0
chr11: 2248339-2247566
0.52
−3.41


PROTEIN
DOWNREG.
TU_0000888_0
chr6: 19947236-19950403
0.56
−3.41


PROTEIN
DOWNREG.
TU_0002212_0
chr6: 32224073-32226328
0.56
−3.41


PROTEIN
DOWNREG.
TU_0024749_0
chr19: 52937559-52939100
0.58
−3.41


PROTEIN
DOWNREG.
TU_0101225_0
chr21: 40161189-40161418
0.52
−3.41


ncRNA
DOWNREG.
TU_0100030_0
chr8: 134653589-134655310
0.41
−3.41


PROTEIN
DOWNREG.
TU_0102256_0
chr9: 34356684-34366854
0.56
−3.41


PROTEIN
DOWNREG.
TU_0039040_0
chr3: 69107066-69108860
0.62
−3.42


ncRNA
DOWNREG.
TU_0115808_0
chr2: 37722515-37725828
0.61
−3.42


PROTEIN
DOWNREG.
TU_0115807_0
chr2: 37722515-37725828
0.61
−3.42


NOVEL
DOWNREG.
TU_0038811_0
chr3: 57890130-57890834
0.43
−3.43


PROTEIN
DOWNREG.
TU_0107000_0
chr22: 29790122-29830660
0.60
−3.43


PROTEIN
DOWNREG.
TU_0065126_0
chr1: 144274405-144279906
0.53
−3.43


PROTEIN
DOWNREG.
TU_0065093_0
chr1: 144167535-144181746
0.72
−3.43


PROTEIN
DOWNREG.
TU_0066887_0
chr1: 158352167-158379985
0.56
−3.44


PROTEIN
DOWNREG.
TU_0034681_0
chr14: 73248261-73250867
0.61
−3.44


PROTEIN
DOWNREG.
TU_0064872_0
chr1: 115373945-115394701
0.60
−3.44


PROTEIN
DOWNREG.
TU_0115146_0
chr2: 26806070-26809827
0.49
−3.44


PROTEIN
DOWNREG.
TU_0023552_0
chr19: 43433715-43439100
0.52
−3.44


PROTEIN
DOWNREG.
TU_0013056_0
chr2: 134269121-134269574
0.41
−3.44


PROTEIN
DOWNREG.
TU_0078015_0
chr12: 21809160-21817495
0.61
−3.45


PROTEIN
DOWNREG.
TU_0010849_0
chr7: 84462824-84464278
0.41
−3.45


PROTEIN
DOWNREG.
TU_0018278_0
chr17: 62235564-62237319
0.62
−3.45


PROTEIN
DOWNREG.
TU_0106896_0
chr22: 28206216-28217370
0.46
−3.46


PROTEIN
DOWNREG.
TU_0086308_0
chr5: 95158335-95154222
0.54
−3.46


PROTEIN
DOWNREG.
TU_0059500_0
chr1: 19842799-19857540
0.66
−3.46


PROTEIN
DOWNREG.
TU_0030156_0
chrX: 102749504-102752161
0.61
−3.46


PROTEIN
DOWNREG.
TU_0053209_0
chr16: 30815439-30839057
0.45
−3.46


PROTEIN
DOWNREG.
TU_0102372_0
chr9: 35672000-35681106
0.58
−3.46


PROTEIN
DOWNREG.
TU_0040491_0
chr3: 134947802-134980329
0.35
−3.46


PROTEIN
DOWNREG.
TU_0063025_0
chr1: 54832256-54849445
0.56
−3.46


PROTEIN
DOWNREG.
TU_0016741_0
chr17: 37808007-37818100
0.61
−3.47


PROTEIN
DOWNREG.
TU_0079872_0
chr12: 53272841-55276238
0.70
−3.47


NOVEL
DOWNREG.
TU_0072214_0
chr20: 42166331-42172501
0.45
−3.47


PROTEIN
DOWNREG.
TU_0069254_0
chr1: 223745864-223750945
0.54
−3.48


PROTEIN
DOWNREG.
TU_0014474_0
chr17: 4410320-4410614
0.34
−3.48


PROTEIN
DOWNREG.
TU_0002034_0
chr6: 31975375-31977685
0.61
−3.48


ncRNA
DOWNREG.
TU_0115805_0
chr2: 37722515-37727509
0.64
−3.48


PROTEIN
DOWNREG.
TU_0106487_0
chr22: 21742726-21797216
0.56
−3.48


PROTEIN
DOWNREG.
TU_0100880_0
chr21: 32808766-32809639
0.62
−3.48


PROTEIN
DOWNREG.
TU_0028960_0
chrX: 64873768-64873981
0.59
−3.48


PROTEIN
DOWNREG.
TU_0103717_0
chr9: 112675334-112676369
0.59
−3.48


PROTEIN
DOWNREG.
TU_0016732_0
chr17: 37807991-37828819
0.65
−3.48


PROTEIN
DOWNREG.
TU_0075573_0
chr10: 96987317-97040810
0.65
−3.48


PROTEIN
DOWNREG.
TU_0108979_0
chr15: 34659121-34889737
0.68
−3.48


PROTEIN
DOWNREG.
TU_0039868_0
chr3: 123526763-123543198
0.51
−3.48


PROTEIN
DOWNREG.
TU_0032236_0
chr14: 22885061-22893832
0.61
−3.48


PROTEIN
DOWNREG.
TU_0103902_0
chr9: 115957988-116128421
0.59
−3.49


PROTEIN
DOWNREG.
TU_0004251_0
chr6: 71069214-71069482
0.36
−3.49


PROTEIN
DOWNREG.
TU_0115344_0
chr2: 27568254-27571592
0.64
−3.49


NOVEL
DOWNREG.
TU_0094307_0
chr11: 7977293-7979927
0.69
−3.49


NOVEL
DOWNREG.
TU_0020914_0
chr19: 9718612-9721799
0.47
−3.49


PROTEIN
DOWNREG.
TU_0014009_0
chr7: 158513133-158630217
0.48
−3.50


PROTEIN
DOWNREG.
TU_0111467_0
chr15: 62817064-62854842
0.58
−3.50


NOVEL
DOWNREG.
TU_0088552_0
chr5: 157103352-157120455
0.64
−3.50


PROTEIN
DOWNREG.
TU_0016616_0
chr17: 36992038-37034423
0.44
−3.50


PROTEIN
DOWNREG.
TU_0109820_0
chr15: 41600571-41611159
0.56
−3.51


PROTEIN
DOWNREG.
TU_0083744_0
chr5: 236838-237985
0.50
−3.51


PROTEIN
DOWNREG.
TU_0038899_0
chr3: 58465926-58495812
0.58
−3.51


PROTEIN
DOWNREG.
TU_0018817_0
chr17: 72183287-72184800
0.61
−3.51


PROTEIN
DOWNREG.
TU_0096362_0
chr11: 129779777-129794214
0.56
−3.51


ncRNA
DOWNREG.
TU_0104765_0
chr9: 131134480-131144297
0.53
−3.51


PROTEIN
DOWNREG.
TU_0047809_0
chr4: 52581019-52582331
0.62
−3.52


PROTEIN
DOWNREG.
TU_0114638_0
chr2: 11804193-11884972
0.68
−3.52


PROTEIN
DOWNREG.
TU_0110215_0
chr15: 43246574-43254766
0.63
−3.52


PROTEIN
DOWNREG.
TU_0117024_0
chr2: 66515747-66653430
0.61
−3.52


PROTEIN
DOWNREG.
TU_0109004_0
chr15: 35178588-35180010
0.39
−3.53


PROTEIN
DOWNREG.
TU_0114005_0
chr15: 97462760-97493368
0.56
−3.53


PROTEIN
DOWNREG.
TU_0079534_0
chr12: 53260191-53268540
0.41
−3.53


PROTEIN
DOWNREG.
TU_0058435_0
chr1: 202366748-202385528
0.62
−3.53


PROTEIN
DOWNREG.
TU_0014730_0
chr17: 7034460-7061662
0.61
−3.53


PROTEIN
DOWNREG.
TU_0111099_0
chr15: 57738640-57756015
0.70
−3.54


PROTEIN
DOWNREG.
TU_0079355_0
chr12: 51906937-51912605
0.54
−3.54


PROTEIN
DOWNREG.
TU_0107389_0
chr22: 36670710-36671784
0.59
−3.54


PROTEIN
DOWNREG.
TU_0105434_0
chr9: 138991774-138996018
0.54
−3.54


ncRNA
DOWNREG.
TU_0122441_0
chr2: 220000172-220002664
0.38
−3.54


PROTEIN
DOWNREG.
TU_0074041_0
chr10: 29785041-30065975
0.64
−3.55


PROTEIN
DOWNREG.
TU_0114819_0
chr2: 23779564-23785016
0.65
−3.55


PROTEIN
DOWNREG.
TU_0013666_0
chr7: 150180552-150189309
0.34
−3.55


PROTEIN
DOWNREG.
TU_0036844_0
chr3: 9930678-9933062
0.54
−3.56


PROTEIN
DOWNREG.
TU_0014467_0
chr17: 4407802-4410614
0.49
−3.56


NOVEL
DOWNREG.
TU_0036397_0
chr14: 104617328-104624500
0.45
−3.56


PROTEIN
DOWNREG.
TU_0014721_0
chr17: 6882853-6884238
0.60
−3.57


PROTEIN
DOWNREG.
TU_0061867_0
chr1: 41618433-41621890
0.61
−3.57


PROTEIN
DOWNREG.
TU_0090901_0
chr11: 20061238-20099725
0.60
−3.57


PROTEIN
DOWNREG.
TU_0089503_0
chr5: 179949721-179951068
0.47
−3.57


NOVEL
DOWNREG.
TU_0112056_0
chr15: 69658838-69678469
0.46
−3.57


NOVEL
DOWNREG.
TU_0052454_0
chr16: 15702084-15702374
0.40
−3.57


PROTEIN
DOWNREG.
TU_0004248_0
chr6: 70983350-71069482
0.52
−3.57


PROTEIN
DOWNREG.
TU_0111118_0
chr15: 58426685-58428608
0.59
−3.58


PROTEIN
DOWNREG.
TU_0047256_0
chr4: 38781223-38804739
0.63
−3.58


PROTEIN
DOWNREG.
TU_0092308_0
chr11: 61395022-61326508
0.62
−3.58


PROTEIN
DOWNREG.
TU_0037381_0
chr3: 33159367-33165995
0.70
−3.59


PROTEIN
DOWNREG.
TU_0088765_0
chr5: 169737435-169749043
0.53
−3.60


PROTEIN
DOWNREG.
TU_0039072_0
chr3: 70098064-70100160
0.63
−3.60


NOVEL
DOWNREG.
TU_0112059_0
chr15: 69667695-69691724
0.41
−3.60


PROTEIN
DOWNREG.
TU_0030975_0
chrX: 130235170-130235814
0.49
−3.60


PROTEIN
DOWNREG.
TU_0038532_0
chr3: 52258212-52287726
0.77
−3.60


PROTEIN
DOWNREG.
TU_0014418_0
chr17: 3748115-3749717
0.39
−3.60


PROTEIN
DOWNREG.
TU_0001986_0
chr6: 31791087-31793378
0.48
−3.61


PROTEIN
DOWNREG.
TU_0111109_0
chr15: 58426685-58477514
0.66
−3.61


PROTEIN
DOWNREG.
TU_0064151_0
chr1: 98933515-98937074
0.46
−3.61


PROTEIN
DOWNREG.
TU_0111253_0
chr15: 61121812-61151157
0.63
−3.61


PROTEIN
DOWNREG.
TU_0058947_0
chr1: 13782811-13817026
0.61
−3.62


PROTEIN
DOWNREG.
TU_0031484_0
chrX: 151890690-151892673
0.59
−3.62


PROTEIN
DOWNREG.
TU_0076212_0
chr10: 105781059-105835687
0.47
−3.62


PROTEIN
DOWNREG.
TU_0062567_0
chr1: 47050692-47056967
0.47
−3.62


NOVEL
DOWNREG.
TU_0020667_0
chr19: 7888598-7889980
0.41
−3.62


PROTEIN
DOWNREG.
TU_0029358_0
chrX: 71263703-71268507
0.66
−3.63


PROTEIN
DOWNREG.
TU_0065339_0
chr1: 148457403-148475104
0.56
−3.63


PROTEIN
DOWNREG.
TU_0063765_0
chr1: 87583567-87587269
0.58
−3.63


NOVEL
DOWNREG.
TU_0036395_0
chr14: 104617328-104623671
0.53
−3.63


PROTEIN
DOWNREG.
TU_0103872_0
chr9: 115178483-115203441
0.59
−3.63


PROTEIN
DOWNREG.
TU_0050244_0
chr4: 148665059-148685558
0.63
−3.63


PROTEIN
DOWNREG.
TU_0031913_0
chr14: 20554755-20563715
0.64
−3.63


PROTEIN
DOWNREG.
TU_0065343_0
chr1: 148501147-148501585
0.37
−3.63


PROTEIN
DOWNREG.
TU_0084946_0
chr5: 50715235-50726033
0.60
−3.64


PROTEIN
DOWNREG.
TU_0090342_0
chr11: 8671475-8849482
0.64
−3.64


PROTEIN
DOWNREG.
TU_0120044_0
chr2: 155422693-155423038
0.26
−3.64


PROTEIN
DOWNREG.
TU_0023267_0
chr19: 40937280-40940189
0.52
−3.64


PROTEIN
DOWNREG.
TU_0023553_0
chr19: 43433715-43434071
0.51
−3.65


PROTEIN
DOWNREG.
TU_0115806_0
chr2: 37722515-37725663
0.60
−3.65


PROTEIN
DOWNREG.
TU_0085256_0
chr5: 59099679-59100724
0.53
−3.65


PROTEIN
DOWNREG.
TU_0038056_0
chr3: 48563574-48623119
0.68
−3.65


PROTEIN
DOWNREG.
TU_0022088_0
chr19: 16864768-16929718
0.55
−3.65


ncRNA
DOWNREG.
TU_0083408_0
chr12: 129197899-129212499
0.58
−3.65


PROTEIN
DOWNREG.
TU_0059155_0
chr1: 16397144-16405288
0.61
−3.65


PROTEIN
DOWNREG.
TU_0046595_0
chr4: 3264594-3411502
0.68
−3.65


PROTEIN
DOWNREG.
TU_0099476_0
chr8: 108331106-108578694
0.58
−3.66


PROTEIN
DOWNREG.
TU_0091498_0
chr11: 46834081-46849744
0.65
−3.66


PROTEIN
DOWNREG.
TU_0098389_0
chr8: 68586418-68699042
0.45
−3.66


PROTEIN
DOWNREG.
TU_0046627_0
chr4: 3735533-3740037
0.45
−3.67


NOVEL
DOWNREG.
TU_0103946_0
chr9: 116821701-116822181
0.48
−3.67


PROTEIN
DOWNREG.
TU_0008057_0
chr7: 5519816-5536775
0.62
−3.67


PROTEIN
DOWNREG.
TU_0100219_0
chr8: 143849604-143856276
0.59
−3.67


PROTEIN
DOWNREG.
TU_0087532_0
chr5: 137802544-137810548
0.53
−3.68


PROTEIN
DOWNREG.
TU_0066743_0
chr1: 154859563-154862200
0.43
−3.68


PROTEIN
DOWNREG.
TU_0052586_0
chr16: 19637116-19779369
0.64
−3.68


PROTEIN
DOWNREG.
TU_0075808_0
chr10: 88708340-88712998
0.51
−3.68


PROTEIN
DOWNREG.
TU_0032240_0
chr14: 22894093-22905632
0.57
−3.68


PROTEIN
DOWNREG.
TU_0046399_0
chr4: 2031053-2040569
0.44
−3.70


PROTEIN
DOWNREG.
TU_0081487_0
chr12: 104248577-104289423
0.56
−3.70


PROTEIN
DOWNREG.
TU_0096978_0
chr8: 22133174-22140355
0.47
−3.70


PROTEIN
DOWNREG.
TU_0054692_0
chr16: 83411105-83500616
0.62
−3.70


PROTEIN
DOWNREG.
TU_0067818_0
chr1: 180809414-180811333
0.72
−3.71


PROTEIN
DOWNREG.
TU_0098841_0
chr8: 92038228-92039575
0.39
−3.71


PROTEIN
DOWNREG.
TU_0121595_0
chr2: 202193170-202196672
0.62
−3.71


PROTEIN
DOWNREG.
TU_0023218_0
chr19: 40679964-40694184
0.55
−3.71


PROTEIN
DOWNREG.
TU_0112386_0
chr15: 71818130-71820041
0.55
−3.71


PROTEIN
DOWNREG.
TU_0024601_0
chr19: 51605296-51609005
0.56
−3.71


PROTEIN
DOWNREG.
TU_0055238_0
chr18: 2561572-2606627
0.59
−3.71


PROTEIN
DOWNREG.
TU_0085908_0
chr5: 78401241-78420780
0.52
−3.72


ncRNA
DOWNREG.
TU_0111315_0
chr15: 61676589-61681634
0.55
−3.72


PROTEIN
DOWNREG.
TU_0111311_0
chr15: 61676589-61681634
0.55
−3.72


PROTEIN
DOWNREG.
TU_0023241_0
chr19: 40856254-40861198
0.41
−3.72


PROTEIN
DOWNREG.
TU_0068139_0
chr1: 199127296-199147465
0.42
−3.72


ncRNA
DOWNREG.
TU_0102684_0
chr9: 70336502-70344481
0.56
−3.73


PROTEIN
DOWNREG.
TU_0068764_0
chr1: 207854842-207892483
0.49
−3.73


PROTEIN
DOWNREG.
TU_0053636_0
chr16: 55846971-55853340
0.58
−3.74


PROTEIN
DOWNREG.
TU_0084025_0
chr5: 6501949-6545706
0.54
−3.74


NOVEL
DOWNREG.
TU_0032151_0
chr14: 22508055-22508830
0.58
−3.74


PROTEIN
DOWNREG.
TU_0014680_0
chr17: 6295379-6305574
0.62
−3.74


PROTEIN
DOWNREG.
TU_0076124_0
chr10: 104619299-104651033
0.60
−3.75


PROTEIN
DOWNREG.
TU_0085198_0
chr5: 58300638-58305429
0.60
−3.75


PROTEIN
DOWNREG.
TU_0102686_0
chr9: 70337677-70344573
0.55
−3.76


PROTEIN
DOWNREG.
TU_0112385_0
chr15: 71818130-71831566
0.54
−3.76


PROTEIN
DOWNREG.
TU_0100875_0
chr21: 32705500-32809639
0.61
−3.78


PROTEIN
DOWNREG.
TU_0065928_0
chr1: 151800274-151855449
0.49
−3.78


PROTEIN
DOWNREG.
TU_0063298_0
chr1: 62474433-62474872
0.36
−3.78


PROTEIN
DOWNREG.
TU_0100851_0
chr21: 32604246-32608457
0.62
−3.79


PROTEIN
DOWNREG.
TU_0101015_0
chr21: 35010830-35012376
0.55
−3.79


ncRNA
DOWNREG.
TU_0031086_0
chrX: 133993992-133995935
0.73
−3.79


PROTEIN
DOWNREG.
TU_0068759_0
chr1: 207669209-207672813
0.45
−3.79


NOVEL
DOWNREG.
TU_0069253_0
chr1: 223741202-223745600
0.62
−3.79


PROTEIN
DOWNREG.
TU_0020150_0
chr19: 3877291-3879097
0.52
−3.79


ncRNA
DOWNREG.
TU_0084069_0
chr5: 9599340-9603383
0.50
−3.80


PROTEIN
DOWNREG.
TU_0016922_0
chr17: 38430856-38435173
0.51
−3.80


PROTEIN
DOWNREG.
TU_0013053_0
chr7: 134114695-134305949
0.56
−3.81


PROTEIN
DOWNREG.
TU_0017406_0
chr17: 43458534-43470076
0.58
−3.81


PROTEIN
DOWNREG.
TU_0014681_0
chr17: 6295379-6305877
0.50
−3.81


PROTEIN
DOWNREG.
TU_0058447_0
chr1: 9040090-9052233
0.36
−3.81


PROTEIN
DOWNREG.
TU_0055624_0
chr18: 11872611-11875972
0.64
−3.82


PROTEIN
DOWNREG.
TU_0003717_0
chr6: 43381215-43381963
0.49
−3.82


NOVEL
DOWNREG.
TU_0016578_0
chr17: 35881203-35884855
0.52
−3.82


PROTEIN
DOWNREG.
TU_0101224_0
chr21: 40161189-40223184
0.50
−3.82


PROTEIN
DOWNREG.
TU_0064871_0
chr1: 115391459-115433611
0.59
−3.83


PROTEIN
DOWNREG.
TU_0097462_0
chr8: 37773618-37822041
0.55
−3.83


PROTEIN
DOWNREG.
TU_0066742_0
chr1: 154860755-154862200
0.42
−3.83


PROTEIN
DOWNREG.
TU_0090638_0
chr11: 14242208-14246823
0.55
−3.83


PROTEIN
DOWNREG.
TU_0046626_0
chr4: 3735533-3740037
0.46
−3.83


PROTEIN
DOWNREG.
TU_0024608_0
chr19: 51842682-51856041
0.53
−3.83


PROTEIN
DOWNREG.
TU_0071146_0
chr20: 25381375-25432639
0.58
−3.84


PROTEIN
DOWNREG.
TU_0080097_0
chr12: 56301840-56307003
0.56
−3.85


PROTEIN
DOWNREG.
TU_0062615_0
chr1: 48974664-48997227
0.51
−3.85


PROTEIN
DOWNREG.
TU_0013669_0
chr7: 150272983-150305963
0.52
−3.86


PROTEIN
DOWNREG.
TU_0102682_0
chr9: 70197177-70337519
0.56
−3.86


PROTEIN
DOWNREG.
TU_0104855_0
chr9: 131689287-131691419
0.64
−3.86


PROTEIN
DOWNREG.
TU_0116336_0
chr2: 48677181-48685259
0.65
−3.86


PROTEIN
DOWNREG.
TU_0116619_0
chr2: 60532630-60533546
0.47
−3.87


PROTEIN
DOWNREG.
TU_0034452_0
chr14: 69415893-69568826
0.48
−3.87


PROTEIN
DOWNREG.
TU_0067213_0
chr1: 163086189-163087684
0.59
−3.87


PROTEIN
DOWNREG.
TU_0065337_0
chr1: 148457403-148475119
0.56
−3.87


NOVEL
DOWNREG.
TU_0062461_0
chr1: 46461750-46463004
0.51
−3.88


PROTEIN
DOWNREG.
TU_0080098_0
chr12: 56302807-56307707
0.56
−3.88


PROTEIN
DOWNREG.
TU_0034421_0
chr14: 68410559-68412495
0.62
−3.88


PROTEIN
DOWNREG.
TU_0016601_0
chr17: 36911114-36928728
0.39
−3.88


PROTEIN
DOWNREG.
TU_0079221_0
chr12: 51194638-51200498
0.43
−3.89


PROTEIN
DOWNREG.
TU_0112752_0
chr15: 76184009-76210733
0.55
−3.90


PROTEIN
DOWNREG.
TU_0028410_0
chrX: 48910899-48929704
0.68
−3.91


PROTEIN
DOWNREG.
TU_0076498_0
chr10: 123227854-123347940
0.55
−3.92


NOVEL
DOWNREG.
TU_0093208_0
chr11: 65396931-65397655
0.45
−3.92


PROTEIN
DOWNREG.
TU_0078129_0
chr12: 27016771-27017190
0.47
−3.92


PROTEIN
DOWNREG.
TU_0064620_0
chr1: 111962071-112059304
0.61
−3.92


PROTEIN
DOWNREG.
TU_0005224_0
chr6: 107917248-108088034
0.60
−3.93


PROTEIN
DOWNREG.
TU_0023668_0
chr19: 44114820-44158190
0.56
−3.93


PROTEIN
DOWNREG.
TU_0041856_0
chr3: 190990156-191097717
0.44
−3.93


PROTEIN
DOWNREG.
TU_0107364_0
chr22: 36658502-36671784
0.62
−3.93


PROTEIN
DOWNREG.
TU_0079224_0
chr12: 51194638-51199100
0.43
−3.94


PROTEIN
DOWNREG.
TU_0027357_0
chrX: 17728093-17737982
0.57
−3.94


PROTEIN
DOWNREG.
TU_0071013_0
chr20: 19141491-19652034
0.55
−3.95


PROTEIN
DOWNREG.
TU_0060281_0
chr1: 27204050-27211524
0.48
−3.95


PROTEIN
DOWNREG.
TU_0096007_0
chr11: 119487208-119514087
0.45
−3.95


PROTEIN
DOWNREG.
TU_0058810_0
chr1: 11631005-11637486
0.50
−3.95


ncRNA
DOWNREG.
TU_0102668_0
chr9: 67902293-67904671
0.52
−3.96


PROTEIN
DOWNREG.
TU_0103126_0
chr9: 93524079-93559558
0.55
−3.96


PROTEIN
DOWNREG.
TU_0098384_0
chr8: 68508843-68581618
0.43
−3.96


NOVEL
DOWNREG.
TU_0084058_0
chr5: 9602147-9603383
0.49
−3.96


ncRNA
DOWNREG.
TU_0018887_0
chr17: 73068191-73068659
0.29
−3.97


PROTEIN
DOWNREG.
TU_0020916_0
chr19: 9720305-9727203
0.55
−3.97


PROTEIN
DOWNREG.
TU_0018819_0
chr17: 72184340-72195820
0.59
−3.97


NOVEL
DOWNREG.
TU_0042081_0
chr3: 197374550-197376798
0.46
−3.97


PROTEIN
DOWNREG.
TU_0065864_0
chr1: 149850009-149852238
0.46
−3.98


PROTEIN
DOWNREG.
TU_0111301_0
chr15: 61676589-51684028
0.54
−3.98


PROTEIN
DOWNREG.
TU_0073443_0
chr10: 5556713-3558609
0.43
−3.99


PROTEIN
DOWNREG.
TU_0030581_0
chrX: 118096546-118104692
0.38
−3.99


PROTEIN
DOWNREG.
TU_0039780_0
chr3: 120843508-120866813
0.55
−4.00


PROTEIN
DOWNREG.
TU_0081660_0
chr12: 108705678-108718771
0.50
−4.00


PROTEIN
DOWNREG.
TU_0046397_0
chr4: 2032569-2050090
0.46
−4.00


PROTEIN
DOWNREG.
TU_0122440_0
chr2: 219991398-219999705
0.53
−4.01


PROTEIN
DOWNREG.
TU_0011534_0
chr7: 99083477-99096154
0.36
−4.01


PROTEIN
DOWNREG.
TU_0047206_0
chr4: 37815997-37817190
0.59
−4.02


PROTEIN
DOWNREG.
TU_0017005_0
chr17: 39308253-39337366
0.52
−4.02


PROTEIN
DOWNREG.
TU_0052436_0
chr16: 15704489-15858435
0.54
−4.03


PROTEIN
DOWNREG.
TU_0014761_0
chr17: 7128572-7131411
0.46
−4.03


PROTEIN
DOWNREG.
TU_0080075_0
chr12: 56290183-56301803
0.53
−4.03


PROTEIN
DOWNREG.
TU_0089295_0
chr5: 177597111-177621358
0.48
−4.03


PROTEIN
DOWNREG.
TU_0062594_0
chr16: 19775320-19780719
0.60
−4.03


PROTEIN
DOWNREG.
TU_0068168_0
chr1: 199700556-199742901
0.61
−4.04


ncRNA
DOWNREG.
TU_0102657_0
chr9: 67902293-67908869
0.54
−4.04


PROTEIN
DOWNREG.
TU_0003729_0
chr6: 43525496-43528789
0.55
−4.04


PROTEIN
DOWNREG.
TU_0071246_0
chr20: 29913077-29921837
0.42
−4.05


NOVEL
DOWNREG.
TU_0050224_0
chr4: 147115887-147190781
0.25
−4.06


PROTEIN
DOWNREG.
TU_0110166_0
chr15: 43172154-43198892
0.49
−4.07


PROTEIN
DOWNREG.
TU_0030085_0
chrX: 101782933-101800062
0.56
−4.07


PROTEIN
DOWNREG.
TU_0021042_0
chr19: 10435466-10441506
0.61
−4.08


PROTEIN
DOWNREG.
TU_0097463_0
chr8: 37812227-37826549
0.58
−4.08


PROTEIN
DOWNREG.
TU_0101681_0
chr9: 734412-736069
0.67
−4.08


PROTEIN
DOWNREG.
TU_0030157_0
chrX: 102750729-102751737
0.44
−4.09


NOVEL
DOWNREG.
TU_0098190_0
chr8: 61704765-61708199
0.40
−4.09


PROTEIN
DOWNREG.
TU_0062947_0
chr1: 53744955-53838542
0.42
−4.09


PROTEIN
DOWNREG.
TU_0078008_0
chr12: 21679541-21702042
0.57
−4.09


PROTEIN
DOWNREG.
TU_0017582_0
chr17: 45858594-45907395
0.54
−4.09


PROTEIN
DOWNREG.
TU_0000021_0
chr6: 1555144-1559122
0.53
−4.09


PROTEIN
DOWNREG.
TU_0031424_0
chrX: 149432223-149433104
0.47
−4.10


PROTEIN
DOWNREG.
TU_0065603_0
chr1: 149275738-149286201
0.42
−4.10


PROTEIN
DOWNREG.
TU_0037859_0
chr3: 45240966-45242817
0.49
−4.11


PROTEIN
DOWNREG.
TU_0102271_0
chr9: 34511045-34512853
0.50
−4.11


PROTEIN
DOWNREG.
TU_0035605_0
chr14: 93254401-93273368
0.49
−4.11


PROTEIN
DOWNREG.
TU_0064621_0
chr1: 112047963-112062396
0.54
−4.11


ncRNA
DOWNREG.
TU_0031098_0
chrX: 134057388-134058604
0.47
−4.11


PROTEIN
DOWNREG.
TU_0018799_0
chr17: 72061371-72080938
0.61
−4.11


PROTEIN
DOWNREG.
TU_0011129_0
chr7: 94135058-94136943
0.41
−4.11


NOVEL
DOWNREG.
TU_0036396_0
chr14: 104617328-104619095
0.41
−4.12


PROTEIN
DOWNREG.
TU_0086255_0
chr5: 92944260-92956054
0.57
−4.12


ncRNA
DOWNREG.
TU_0074501_0
chr10: 60429298-60431091
0.42
−4.12


PROTEIN
DOWNREG.
TU_0073757_0
chr10: 17672547-17699461
0.56
−4.13


PROTEIN
DOWNREG.
TU_0015457_0
chr17: 19581898-19587356
0.45
−4.13


PROTEIN
DOWNREG.
TU_0122402_0
chr2: 219821926-219824741
0.61
−4.13


PROTEIN
DOWNREG.
TU_0116618_0
chr2: 60532830-60633902
0.49
−4.13


PROTEIN
DOWNREG.
TU_0029963_0
chrX: 100220537-100238005
0.51
−4.15


PROTEIN
DOWNREG.
TU_0028949_0
chrX: 64804077-64878518
0.61
−4.15


PROTEIN
DOWNREG.
TU_0088443_0
chr5: 154178336-154210363
0.57
−4.16


PROTEIN
DOWNREG.
TU_0107371_0
chr22: 36668731-36671784
0.56
−4.17


PROTEIN
DOWNREG.
TU_0016830_0
chr17: 38070906-38071660
0.57
−4.17


PROTEIN
DOWNREG.
TU_0016596_0
chr17: 36923524-36946925
0.50
−4.17


PROTEIN
DOWNREG.
TU_0014764_0
chr17: 7131441-7134452
0.45
−4.18


PROTEIN
DOWNREG.
TU_0070473_0
chr20: 2621571-2702522
0.60
−4.18


PROTEIN
DOWNREG.
TU_0065602_0
chr1: 149282206-149286718
0.40
−4.19


PROTEIN
DOWNREG.
TU_0105435_0
chr9: 138997874-138999099
0.37
−4.19


PROTEIN
DOWNREG.
TU_0015445_0
chr17: 19415396-19422913
0.46
−4.20


PROTEIN
DOWNREG.
TU_0019012_0
chr17: 74597027-74990278
0.42
−4.21


PROTEIN
DOWNREG.
TU_0048538_0
chr4: 81336928-81344460
0.41
−4.22


PROTEIN
DOWNREG.
TU_0098385_0
chr8: 68508843-68509111
0.41
−4.22


PROTEIN
DOWNREG.
TU_0076499_0
chr10: 123227854-123248042
0.53
−4.23


PROTEIN
DOWNREG.
TU_0117482_0
chr2: 73973507-74000287
0.56
−4.23


PROTEIN
DOWNREG.
TU_0114778_0
chr2: 20264034-20288661
0.45
−4.24


PROTEIN
DOWNREG.
TU_0018316_0
chr17: 33917848-33935788
0.53
−4.25


PROTEIN
DOWNREG.
TU_0071893_0
chr20: 34603301-34611746
0.59
−4.25


PROTEIN
DOWNREG.
TU_0073523_0
chr10: 8136827-8157157
0.44
−4.26


PROTEIN
DOWNREG.
TU_0064500_0
chr1: 110061334-110079791
0.42
−4.27


PROTEIN
DOWNREG.
TU_0065862_0
chr1: 149850009-149852444
0.41
−4.27


PROTEIN
DOWNREG.
TU_0030064_0
chrX: 101268429-101269091
0.44
−4.28


PROTEIN
DOWNREG.
TU_0060278_0
chr1: 27192773-27200190
0.51
−4.28


PROTEIN
DOWNREG.
TU_0000013_0
chr6: 1257191-1259972
0.36
−4.29


PROTEIN
DOWNREG.
TU_0120707_0
chr2: 176665581-176669190
0.46
−4.31


PROTEIN
DOWNREG.
TU_0016744_0
chr17: 37790368-37809206
0.54
−4.31


PROTEIN
DOWNREG.
TU_0016827_0
chr17: 38065830-38071660
0.63
−4.31


PROTEIN
DOWNREG.
TU_0056190_0
chr18: 26824024-26842486
0.43
−4.33


PROTEIN
DOWNREG.
TU_0096964_0
chr8: 22027917-22043914
0.47
−4.35


PROTEIN
DOWNREG.
TU_0030062_0
chrX: 101267701-101269091
0.41
−4.36


ncRNA
DOWNREG.
TU_0120711_0
chr2: 176690351-176696560
0.49
−4.36


PROTEIN
DOWNREG.
TU_0011537_0
chr7: 99085728-99111736
0.39
−4.39


PROTEIN
DOWNREG.
TU_0107366_0
chr22: 36668731-36673469
0.54
−4.39


PROTEIN
DOWNREG.
TU_0065341_0
chr1: 148496551-148500610
0.35
−4.39


PROTEIN
DOWNREG.
TU_0015076_0
chr17: 12510065-12612990
0.50
−4.40


PROTEIN
DOWNREG.
TU_0087752_0
chr5: 139206352-139211418
0.44
−4.40


PROTEIN
DOWNREG.
TU_0108990_0
chr15: 34970176-35180015
0.51
−4.41


PROTEIN
DOWNREG.
TU_0062566_0
chr1: 47037330-47057598
0.43
−4.42


PROTEIN
DOWNREG.
TU_0018825_0
chr17: 72192513-72192794
0.47
−4.43


PROTEIN
DOWNREG.
TU_0002566_0
chr6: 33797424-33798978
0.37
−4.44


PROTEIN
DOWNREG.
TU_0074074_0
chr10: 29814868-29815135
0.26
−4.44


PROTEIN
DOWNREG.
TU_0110179_0
chr15: 43196205-43235205
0.43
−4.46


PROTEIN
DOWNREG.
TU_0082372_0
chr12: 116130336-116130610
0.41
−4.47


ncRNA
DOWNREG.
TU_0102658_0
chr9: 67902293-67908683
0.46
−4.48


PROTEIN
DOWNREG.
TU_0024160_0
chr19: 48777171-48778386
0.51
−4.49


PROTEIN
DOWNREG.
TU_0031081_0
chrX: 133993992-134013925
0.64
−4.49


PROTEIN
DOWNREG.
TU_0015447_0
chr17: 19421649-19423000
0.46
−4.50


PROTEIN
DOWNREG.
TU_0016834_0
chr17: 38072130-38072515
0.54
−4.50


PROTEIN
DOWNREG.
TU_0120709_0
chr2: 176677352-176697902
0.49
−4.50


PROTEIN
DOWNREG.
TU_0041205_0
chr3: 171619688-171634575
0.48
−4.53


PROTEIN
DOWNREG.
TU_0110178_0
chr15: 43196270-43241274
0.43
−4.54


PROTEIN
DOWNREG.
TU_0064473_0
chr1: 110000292-110079791
0.51
−4.58


ncRNA
DOWNREG.
TU_0120715_0
chr2: 176692475-176697902
0.50
−4.58


PROTEIN
DOWNREG.
TU_0110180_0
chr15: 43196205-43243358
0.43
−4.63


PROTEIN
DOWNREG.
TU_0024922_0
chr19: 54253368-54259943
0.42
−4.64


ncRNA
DOWNREG.
TU_0115816_0
chr2: 38109039-38116939
0.32
−4.64


ncRNA
DOWNREG.
TU_0067289_0
chr1: 166307141-166318970
0.48
−4.69


NOVEL
DOWNREG.
TU_0095765_0
chr11: 117640504-117642734
0.36
−4.69


PROTEIN
DOWNREG.
TU_0058445_0
chr1: 9017797-9040122
0.33
−4.70


PROTEIN
DOWNREG.
TU_0047068_0
chr4: 23402764-23403824
0.41
−4.72


PROTEIN
DOWNREG.
TU_0016882_0
chr17: 38260060-38263683
0.51
−4.82


NOVEL
DOWNREG.
TU_0098382_0
chr8: 68494189-68495887
0.29
−4.83


PROTEIN
DOWNREG.
TU_0110177_0
chr15: 43196768-43245735
0.47
−4.86


PROTEIN
DOWNREG.
TU_0089598_0
chr11: 303980-310982
0.35
−4.87


PROTEIN
DOWNREG.
TU_0107527_0
chr22: 37740155-37746215
0.44
−4.88


PROTEIN
DOWNREG.
TU_0107528_0
chr22: 37741248-37746215
0.43
−4.90


PROTEIN
DOWNREG.
TU_0032311_0
chr14: 23612588-23617134
0.32
−5.04






















TABLE 5








PCAT ID
Gene
Chromosomal Location
Expected score (dExp)
Observed score (d)
Fold change (PCA vs Benign)
q-value (%)





















PCAT-1
TU_0099865_0
chr8:128087842-128006202
−2.2654014
5.444088
5.9071784
0


PCAT-2
TU_0090142_0
chr11:4745877-4760303
−2.4400573
4.6781354
11.39658
0


PCAT-3
TU_0054603_0
chr16:82380933-82394836
−2.1786723
4.4612455
5.8916535
0


PCAT-4
TU_0090140_0
chr11:4748163-4759145
−2.1153426
4.4345
7.1999164
0


PCAT-5
TU_0078288_0
chr12:52392383-32405733
−1.9164219
4.312803
3.5655262
0


PCAT-6
TU_0099864_0
chr8:128094589-128103681
−1.7214081
4.265535
5.8997242
0


PCAT-7
TU_0084308_0
chr text missing or illegible when filed :15938755-15949124
−1.9636475
4.124071
4.747801
0


PCAT-8
TU_0084303_0
chr text missing or illegible when filed :15899476-15955226
−2.0245786
4.0520086
7.1035967
0


PCAT-9
TU_0082746_0
chr12:120197102-120197416
−1.861408
3.7551165
5.1431665
0


PCAT-10
TU_0078296_0
chr12:32394534-32405549
−1.5944241
3.6902914
3.084959
0


PCAT-11
TU_0078280_0
chr12:32394534-32410888
−1.5337954
3.675318
3.1572607
0


PCAT-12
TU_0002597_0
chr6:34335202-34338521
−1.6253148
3.6489774
3.352418
0


PCAT-13
TU_0049368_0
chr4:105772318-106772770
−1.6894134
3.6079373
2.8299345
0


PCAT-14
TU_0106548_0
chr22:22209111-22212055
−1.930075
3.591358
5.962547
0


PCAT-15
TU_0078293_0
chr12:32395393-32414822
−1.5212961
3.5705945
2.9219174
0


PCAT-16
TU_0099884_0
chr8:128301495-128307578
−1.4445064
3.5658843
2.516981
0


PCAT-17
TU_0112014_0
chr15:67755165-67739990
−1.6325295
3.562463
3.6594224
0


PCAT-18
TU_0084306_0
chr5:15896315-15947085
−1.845
3.5603588
5.746707
0


PCAT-19
TU_0114240_0
chr2:1534883-1538193
−1.6970209
3.5233572
4.339947
0


PCAT-20
TU_0008499_0
chr7:24236191-24236455
−1.8302055
3.5071697
6.6821446
0


PCAT-21
TU_0078599_0
chr12:32290896-322921 text missing or illegible when filed
−1.7297353
3.508232
3.2923654
0


PCAT-22
TU_0000033_0
chr6:1619605-1568581
−1.7680657
3.494188
2.2470818
0


PCAT-23
TU_0096472_0
chr11:133844590-133862924
−1.8782617
3.410355
5.9854193
0


PCAT-24
TU_0114250_0
chr2:1606782-1607314
−1.6662377
3.3910659
5.060926
0


PCAT-25
TU_0096473_0
chr11:133844590-133862995
−1.8963361
3.3859823
6.1071715
0


PCAT-26
TU_0100361_0
chr8:44914456-144930753
−1.6521469
3.3805158
3.8420231
0


PCAT-27
TU_0040394_0
chr3:133418632-133441282
−1.5208395
3.3201025
2.9724674
0


PCAT-28
TU_0045432_0
chr13:34032994-34050503
−1.6738471
3.2037551
3.2093527
0


PCAT-29
TU_0112020_0
chr15:67764259-57801825
−1.5803315
3.1957351
3.593551
0


PCAT-30
TU_0042717_0
chr13:23149908-23200198
−2.0654948
3.1685438
4.9699407
0


PCAT-31
TU_0078292_0
chr12:32290485-32406307
−1.4503003
3.151379
2.8911364
0


PCAT-32
TU_0084146_0
chr5:14025126-14062770
−1.6452767
3.1257985
2.6190455
0


PCAT-33
TU_0056158_0
chr18:22477042-224776 text missing or illegible when filed
−1.5381516
3.0557241
3.1951044
0


PCAT-34
TU_0040383_0
chr3:1333605431-133429262
−1.5558791
3.0416508
3.747 text missing or illegible when filed 42
0


PCAT-35
TU_0112025_0
chr15:67780574-67782345
−1.6815377
3.0412362
3.433415
0


PCAT-36
TU_0041688_0
chr3:186741299-186741933
−1.4745297
3.0062308
2.543468
0


PCAT-37
TU_01 text missing or illegible when filed 42_0
chr9:109187089-109157455
−1.7387192
2.998355
8.6124363
0


PCAT-38
TU_0040375_0
chr3:133280634-133394609
−1.5469993
2.9753562
 3. text missing or illegible when filed 68055
0


PCAT-39
TU_0047312_0
chr4:39217660-39222163
−1.6388935
2.9124916
3.6121209
0


PCAT-40
TU_0106545_0
chr22:22215478-22219102
−1.7586497
2.88 text missing or illegible when filed 56
5.7357745
0


PCAT-41
TU_0054541_0
chr16:7940 text missing or illegible when filed 000-79435056
−1.74853934
2.8839164
5.847557
0


PCAT-42
TU_0060446_0
chr1:28438529-28450156
−1.4880521
2.857332
1.9824111
0


PCAT-43
TU_0072907_0
chr20:55759486-55771583
−1.5254781
2.7566201
2.512179
0


PCAT-44
TU_00 text missing or illegible when filed 403_0
chr13:33844637-338457921
−1.5293877
2.7919009
3.6403422
0


PCAT-45
TU_0038678_0
chr3:3415954-53517078
−1.7047809
2.7858517
3.6008987
0


PCAT-46
TU_0101706_0
chr9:3408690-3415374
−1.4780945
2.7822090
3.3066912
0


PCAT-47
TU_0101709_0
chr9:3411967-3415374
−1.4652373
2.7822206
5.1886175
0


PCAT-48
TU_0106544_0
chr22:22210421-22220595
−1.6153599
2.7578135
5.7418716
0


PCAT-49
TU_0046121_0
chr4:756363-766599
−1.5697786
2.7573307
1.435532
0


PCAT-50
TU_0106542_0
chr22:22211315-22220506
−1.6098742
2.755721
3.3781004
0


PCAT-51
TU_0106541_0
chr22:22209111-22219162
−1.8595723
2.7341027
3.654145
0


PCAT-52
TU_0044453_0
chr13:51505777-51524522
−1.3416
2.732019
2.536953
0


PCAT-53
TU_0104717_0
chr9:130 text missing or illegible when filed 67833-130698832
−1.2938
2.7219732
2.3344588
0


PCAT-54
TU_0089014_0
chr5:176014905-176015351
−1.3967873
2.7047238
1.7803582
0


PCAT-55
TU_0108452_0
chr15:19344745-19352915
−1.5839852
2.6759455
1.8484153
0


PCAT-56
TU_0112003_0
chr15:67545590-67775246
−1.4385703
2.668052
3.045022
0


PCAT-57
TU_0078286_0
chr12:32395585-32405731
−1.3580805
2.6550874
2.6121044
0


PCAT-58
TU_0078303_0
chr12:32274210-32274530
−1.5020599
2.85856
3.3306372
0


PCAT-59
TU_0112004_0
chr15:67844390-67650387
−1.5175762
2.6509888
2.9933635
0


PCAT-60
TU_0071057_0
chr20:21428679-21429454
−1.4915688
2.649109
4.5481714
0


PCAT-61
TU_text missing or illegible when filed 2906_0
chr20:55759768-55770657
−1.5059631
2.645009
2.95756
0


PCAT-62
TU_0054240_0
chr text missing or illegible when filed :70155175-70173873
−1.4715649
2.6437716
3.5509577
0


PCAT-63
TU_0047330_0
chr4:39217641-39222163
−1.5139307
2.6277235
3.0695639
0


PCAT-64
TU_0055435_0
chr18:6715938-6719172
−1.6048826
2.6173768
2.9221427
0


PCAT-65
TU_0079791_0
chr12:54971053-54971481
−1.4415568
2.6910823
2.0141602
0


PCAT-66
TU_0043411_0
chr13:33918267-35916789
−1.495064
2.5 text missing or illegible when filed 1523
5.386 text missing or illegible when filed 2
0


PCAT-67
TU_0056121_0
chr18:20196762-20197522
−1.2526748
2.5938754
1.7191441
0


PCAT-68
TU_0043412_0
chr13:33918267-33935946
−1.5891836
2.590195
4.2804046
0


PCAT-69
TU_0065837_0
chr1:143791525-149795934
−1.3852053
2.5 text missing or illegible when filed 2297
2.9543975
0


PCAT-70
TU_0043401_0
chr13:33825711-33845275
−1.5994886
2.5853658
4.3461533
0


PCAT-71
TU_0006453_0
chr6:144659810-144660143
−1.4985942
2.5744107
2.2007995
0


PCAT-72
TU_0048556_0
chr4:50329012-80348259
−1.5744382
2.5690413
2.8022916
0


PCAT-73
TU_0084140_0
chr5:14003 text missing or illegible when filed -14054874
−1.4040573
2.5472755
2.5979335
0


PCAT-74
TU_0082983_0
chr12:1212776584-121777370
−1.5293782
2.5458217
2.6197503
0


PCAT-75
TU_0013212_0
chr7:138960883-139001515
−1.2296493
2.544434
3.8879753
0


PCAT-76
TU_0032912_0
chr20:55779532-55780517
−1.4502364
2.5406737
5.8653345
0


PCAT-77
TU_0112281_0
chr15:70586704-70590792
−1.4590155
2.5375097
2.4288568
0


PCAT-78
TU_0048767_0
chr4:88120065-88124880
−1.3735119
2.5323946
2.233308
0


PCAT-79
TU_0408455_0
chr15:19358326-19365341
−1.5651321
2.5281353
1.9462657
0


PCAT-80
TU_0091997_0
chr11:58560356-58573012
−1.3149303
2.5185204
2.1175686
0


PCAT-81
TU_0121658_0
chr2:262985284-202998634
−1.4014161
2.476237
2.2194188
0.859614


PCAT-82
TU_text missing or illegible when filed 71798_0
chr20:533775260-35778511
−1.3358665
2.4845917
1.8566333
0.850371


PCAT-83
TU_0049200_0
chr4:192460973-102476 text missing or illegible when filed 7
−1.3222212
2.456723
1.9456172
0.841324


PCAT-84
TU_0121714_0
chr2:203295212-203314868
−1.3457565
2.4496563
1.7624274
0.832468


PCAT-85
TU_0098937_0
chr8:95748751-95751321
−1.4532137
2.42248
2.2526834
0.823797


PCAT-86
TU_0108453_0
chr15:19358396-19354013
−1.803 text missing or illegible when filed 4539
2.4094539
3.539975
0.767811


PCAT-87
TU_0114170_0
chr15:98659312-99689199
−1.4358851
2.4052114
2.1252658
0.767811


PCAT-88
TU_0089956_0
chr11:1042845-1045705
−1.3899238
2.401665
2.6390955
0.767811


PCAT-89
TU_0001559_0
chr6:30283700-30286011
−1.3517065
2.3987799
1.5110788
0.767811


PCAT-90
TU_0050557_0
chr4:159976338-160016453
−1.17525
2.398508
2.0524442
0.767811


PCAT-91
TU_0078294_0
chr12:32395632-32413064
−1.4560982
2.3960867
2.1863208
0.767811


PCAT-92
TU_0044933_0
chr13:94755992-94760688
−1.2905197
2.3965187
2.189938
0.767811


PCAT-93
TU_0017730_0
chr17:52348635-52345880
−1.4169512
2.3874657
1.4708191
0.760428


PCAT-94
TU_0039020_0
chr3:66578320-65507777
−1.2662895
2.3720088
1.7112709
0.712473


PCAT-95
TU_0049213_0
chr4:162461960-102476087
−1.2725139
2.3671806
1.8876821
0.712473


PCAT-96
TU_0093070_0
chr11:64945809-64961189
−1.2954472
2.3645105
1.9128969
0.712473


PCAT-97
TU_0051053_0
chr4:187244297-187244767
1.8922831
−2.8485844
0.50983155
0.732264


PCAT-98
TU_0098190_0
chr text missing or illegible when filed 1704765-61705193
1. text missing or illegible when filed 25526 
−2.8612507
0.4027831
0.732264


PCAT-99
TU_0038811_0
chr3:57 text missing or illegible when filed 0136-57890834
1.9620296
−2.8837516
0.44431657
0.732264


PCAT-100
TU_0020914_0
chr19:9715612-9721799
1.8433232
−2.9243097
0.50623006
0.732264


PCAT-101
TU_0112056_0
chr15:69655838-69672469
1.837821
−3.0355222
0.46161976
0


PCAT-102
TU_0036396_0
chr14:104617328-104619095
1.849786
−3.1192882
0.45514825
0


PCAT-103
TU_0095765_0
chr11:117640504-117642734
2.1902219
−3.2632742
0.38160657
0


PCAT-104
TU_0050224_0
chr4:147115887-147190781
2.1981242
−3.2575357
0.28569755
0


PCAT-105
TU_0112059_0
chr15:59667895-65891724
1.814 text missing or illegible when filed 81
−3.3526626
0.4356746 text missing or illegible when filed
0


PCAT-106
TU_0095382_0
chr8: text missing or illegible when filed 494189-68495887
2.5413978
−4.0586042
0.30793378
0






text missing or illegible when filed indicates data missing or illegible when filed



















TABLE 6









PCAT ID
Gene
Chromosomal Location
Outlier Score
Median Expression (RPKM)
Maximum Expression (RPKM)




















PCAT-107
TU_0029004_0
chrX: 66691350-66692032
130.7349145
1
90.921


PCAT-108
TU_0054542_0
chr16: 79420131-79423590
127.0430957
5.60998
135.85


PCAT-109
TU_0120899_0
chr2: 180689090-180696402
123.5416436
1.0525222
94.6932


PCAT-110
TU_0054540_0
chr16: 79419351-79423673
119.090847
4.161985
94.4461


PCAT-111
TU_0120918_0
chr2: 181297540-181400892
112.710111
1.4533705
92.1795


PCAT-112
TU_0054538_0
chr16: 79408946-79450819
98.01851659
1.830343
93.1207


PCAT-113
TU_0059541_0
chr1: 20685471-20686432
68.3572507
1.783109
1375.15


PCAT-114
TU_0120924_0
chr2: 181331111-181427485
63.95455962
1.3891845
365.202


PCAT-115
TU_0074308_0
chr10: 42652247-42653596
60.91841567
1.393607
65.7712


PCAT-116
TU_0049192_0
chr4: 102257900-102306678
59.24997694
1.3854525
69.2423


PCAT-117
TU_0054537_0
chr16: 79406933-79430041
53.04481977
1.8534395
42.751


PCAT-118
TU_0120900_0
chr2: 80926864-130985967
55.8438747
1
67.6582


PCAT-119
TU_0114527_0
chr2: 0858318-10858530
54.76455104
1.2969775
35.0059


PCAT-120
TU_0120923_0
chr2: 81328093-181419225
52.9793227
1.2821
232.556


PCAT-121
TU_0049231_0
chr4: 02257900-102259695
52.77001947
2.34042
67.6276
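
By way of illustration only, the RPKM units (reads per kilobase of transcript per million mapped reads) used for the expression columns of Table 6 may be computed as in the following minimal sketch. The function name `rpkm` and the read counts in the usage line are hypothetical and are not taken from the data described herein; the sketch shows only the standard normalization and is not a description of the pipeline actually used.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads.

    read_count           -- reads mapped to the transcript
    transcript_length_bp -- transcript length in base pairs
    total_mapped_reads   -- all mapped reads in the sequencing library
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and library size must be positive")
    # 1e9 = 1e3 (per kilobase) * 1e6 (per million mapped reads)
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


if __name__ == "__main__":
    # Hypothetical example: 500 reads on a 2,000 bp transcript
    # in a library of 10 million mapped reads.
    print(rpkm(500, 2000, 10_000_000))
```

Because the value is normalized for both transcript length and library size, median and maximum RPKM values can be compared across transcripts and across samples.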





















TABLE 7

Rank  Gene          Chromosomal location         Outlier Score  Median Expression (RPKM)  Maximum Expression (RPKM)
1     CRISP3        chr6: 49803053-49813070      294.56446      1.5414775                 478.812
2     SPINK1        chr5: 147184335-147191453    177.19518      2.484455                  624.733
3     TU_0029004_0  chrX: 66691350-66692032      130.73491      1                         90.921
4     TU_0054542_0  chr16: 79420131-79423590     127.0431       5.60998                   135.85
5     TU_0120899_0  chr2: 180689090-180696402    123.54164      1.0525222                 94.6932
6     ERG           chr21: 38673821-38792298     119.446        3.421615                  178.826
7     TU_0054540_0  chr16: 79419351-79423673     119.09085      4.161985                  94.4461
8     ERG           chr21: 38673821-38792298     117.60294      3.470755                  176.186
9     ERG           chr21: 38673821-38955574     117.26408      3.385695                  170.663
10    ERG           chr21: 38673821-38955574     116.33448      3.40077                   170.443
11    TU_0120918_0  chr2: 181297540-181400892    112.71011      1.4533705                 92.1795
12    C7orf68       chr7: 127883119-127885708    105.18504      6.835525                  336.148
13    CSRP3         chr11: 19160153-19180106     101.12947      1                         148.45
14    C7orf68       chr7: 127883119-127885708    100.63202      7.08303                   337.76
15    COL2A1        chr12: 46653014-46684552     99.166329      1.2285615                 96.0977
16    C1orf64       chr1: 16203317-16205771      98.085922      3.62012                   252.013
17    TU_0054538_0  chr16: 79408946-79450819     98.018517      1.830343                  93.1207
18    COL2A1        chr12: 46653014-46684552     97.347905      1.2416035                 94.6672
19    CSRP3         chr11: 19160153-19180165     96.730187      1                         141.963
20    COL9A2        chr1: 40538749-40555526      74.408443      19.24815                  570.961
21    PLA2G7        chr6: 46780012-46811389      69.521175      10.83567                  97.8331
22    AGT           chr1: 228904891-228916959    69.319886      4.797365                  189.281
23    TU_0059541_0  chr1: 20685471-20686432      68.357251      1.783109                  1375.15
24    ETV1          chr7: 13897382-13992664      68.218569      1.932797                  138.519
25    ETV1          chr7: 13897382-13992664      67.723331      1.9899945                 142.406
26    ETV1          chr7: 13897382-13992664      67.680571      1.9915925                 143.632
27    PLA2G7        chr6: 46780011-46811110      67.089039      10.62                     95.3551
28    ETV1          chr7: 13897382-13997390      66.381191      2.697225                  143.975
29    ETV1          chr7: 13897382-13997575      65.563724      2.074935                  141.069
30    MUC6          chr11: 1002823-1026706       64.7328        1.466194                  351.862
31    TU_0120924_0  chr2: 181331111-181427485    63.95456       1.3891845                 365.202
32    ETV1          chr7: 13897382-13996167      63.929225      2.05648                   135.131
33    ETV1          chr7: 13897382-13996167      62.424072      2.03086                   131.644
34    TU_0074308_0  chr10: 42652247-42653596     60.918416      1.393607                  65.7712
35    TU_0049192_0  chr4: 102257900-102306678    59.249977      1.3854525                 69.2423
36    TU_0054537_0  chr16: 79406933-79430041     58.04482       1.8534395                 42.751
37    RGL3          chr19: 11365731-11391018     57.528689      7.660035                  91.2238
38    RGL3          chr19: 11365731-11391018     57.393056      7.6327                    90.6937
39    TMEM45B       chr11: 129190950-129235108   55.887845      4.87695                   60.0414
40    TU_0120900_0  chr2: 180926864-180985967    55.843875      1                         67.6582
41    PTK6          chr20: 61630219-61639151     55.101291      3.420545                  114.116
42    TU_0114527_0  chr2: 10858318-10858530      54.764551      1.2969775                 35.0059
43    TU_0112020_0  chr15: 67764259-67801825     53.882769      2.0281615                 88.99
44    TU_0120923_0  chr2: 181328093-181419226    52.979323      1.2821                    232.556
45    TU_0049231_0  chr4: 102257900-102259695    52.770019      1.34042                   67.6276
46    MON1B         chr16: 75782336-75791044     51.717027      26.00355                  187.807
47    TU_0054541_0  chr16: 79408800-79435066     50.445248      1.7164375                 32.5832
48    TU_0087466_0  chr5: 136779809-136798173    50.285169      1.2738505                 42.0309
49    DLX1          chr2: 172658453-172662647    50.048039      2.088625                  43.0035
50    TU_0108209_0  chr22: 46493579-46531245     47.753833      1.0491419                 25.6643
51    DLX1          chr2: 172658453-172662647    47.159314      1.9682735                 38.4705
52    SMC4          chr3: 161600123-161635435    47.127047      4.581655                  63.2353
53    SMC4          chr3: 161601040-161635435    46.967013      4.442065                  61.2756
54    TU_0102399_0  chr9: 35759438-35761676      46.664973      6.44675                   179.711
55    TU_0029005_0  chrX: 66690414-66704178      46.155567      1.0870047                 38.3022
56    C15orf48      chr15: 43510054-43512939     45.732195      19.02125                  223.42
57    C15orf48      chr15: 43510054-43512939     45.549287      21.28355                  248.097
58    EFNA3         chr1: 153317971-153325638    44.993943      3.68358                   70.5016
59    TU_0043412_0  chr13: 33918267-33935946     44.506741      1.311142                  15.1968
60    TU_0069093_0  chr1: 220878648-220886461    42.645673      1.443496                  160.898
61    UGT1A6        chr2: 234265059-234346684    42.500058      1.937622                  45.753
62    TU_0057051_0  chr18: 54524352-54598419     42.108622      2.418785                  56.0712
63    AMH           chr19: 2200112-2203072       41.744334      2.16026                   91.244
64    TU_0120908_0  chr2: 181147971-181168431    41.650097      1.0750564                 48.7957
65    TU_0099873_0  chr8: 128138926-128140075    41.420293      1.51101                   38.7353
66    HN1           chr17: 70642938-70662369     40.495209      16.35625                  110.208
67    TU_0022570_0  chr19: 20341299-20343938     39.984803      2.912835                  98.5739
68    TU_0098937_0  chr8: 95748751-95751321      39.740546      1.4422495                 51.5935
69    TU_0040375_0  chr3: 133280694-133394609    39.664781      2.149005                  50.9787
70    HN1           chr17: 70642938-70662370     39.655603      16.34725                  109.587
71    TU_0120929_0  chr2: 181328093-181423017    39.419483      1.2116475                 189.765
72    TU_0112004_0  chr15: 67644390-67650387     39.300923      6.10665                   76.723
73    TU_0108439_0  chr15: 19293567-19296333     39.131646      1                         27.7534
74    HN1           chr17: 70642938-70662369     39.00893       15.53595                  103.782
75    SULT1C2       chr2: 108271526-108292803    39.007062      1.2259165                 91.5617
76    STX19         chr3: 95215904-95230144      38.954223      4.521255                  46.0375
77    TU_0030420_0  chrX: 112642982-112685485    38.715477      1.0890785                 62.9419
78    TU_0099875_0  chr8: 128138047-128140075    38.489447      1.393413                  35.8984
79    UBE2T         chr1: 200567408-200577717    38.387515      3.070345                  85.9738
80    SULT1C2       chr2: 108271526-108292803    37.817555      1.215033                  88.0858
81    TU_0049429_0  chr4: 109263508-109272353    37.794245      1.09915225                29.1838
82    STMN1         chr1: 26099193-26105955      37.319869      14.3784                   187.062
83    UGT1A1        chr2: 234333657-234346684    37.267194      1.660554                  35.9476
84    LRRN1         chr3: 3816120-3864387        37.229013      3.8912                    137.117
85    TU_0086631_0  chr5: 113806149-113806936    36.896806      1.0501165                 29.6561
86    ORM2          chr9: 116131889-116135357    36.878688      3.614505                  120.139
87    TU_0084060_0  chr5: 7932238-7932523        36.807599      1                         23.1979
88    TU_0098644_0  chr8: 81204784-81207034      36.779294      1.6013735                 64.9663
89    ACSM1         chr16: 20542059-20610079     36.280896      13.3707                   317.077
90    STMN1         chr1: 26099193-26105231      35.882914      12.73275                  164.721
91    STMN1         chr1: 26099193-26105580      35.823453      14.31935                  185.329
92    TU_0120914_0  chr2: 181265370-181266053    35.551458      1.053468                  30.7074
93    UGT1A7        chr2: 234255322-234346684    35.073998      1.667349                  33.4378
94    TU_0087462_0  chr5: 136386339-136403134    34.992335      1.4450115                 27.1703
95    UGT1A3        chr2: 234302511-234346684    34.952247      1.6889365                 33.4202
96    UGT1A5        chr2: 234286376-234346684    34.950003      1.6639345                 33.2713
97    FOXD1         chr5: 72777840-72780108      34.875512      1.2373575                 10.80944
98    ADM           chr11: 10283217-10285499     34.855767      11.83635                  276.194
99    PPFIA4        chr1: 201286933-201314487    34.769924      1.566044                  43.9812
100   UGT1A10       chr2: 234209861-234346690    34.738527      1.652799                  32.7318
101   UGT1A4        chr2: 234292176-234346684    34.663597      1.655824                  32.9264
102   UGT1A9        chr2: 234245282-234346690    34.643086      1.655272                  32.852
103   TU_0090142_0  chr11: 4748677-4760303       34.517072      1.5226305                 51.3411
104   TU_0082746_0  chr12: 120197102-120197416   34.499713      2.531095                  59.9025
105   UGT1A8        chr2: 234191029-234346684    34.433379      1.6498025                 32.5849
106   TU_0112207_0  chr15: 70278422-70286121     34.308752      10.40266                  112.274
107   LOC145837     chr15: 67641112-67650833     34.291574      7.59729                   74.8194
108   TU_0050712_0  chr4: 170217424-170228463    34.23107       1.504313                  65.5606
109   TU_0043410_0  chr13: 33929484-33944669     34.112491      1.393529                  24.8401
110   SNHG1         chr11: 62376035-62379936     33.971989      33.74365                  270.512
111   MUC1          chr1: 153424923-153429324    33.838228      16.3238                   654.278
112   MUC1          chr1: 153424823-153429324    33.823147      15.8436                   644.44
113   TU_0099871_0  chr8: 128138047-128143500    33.697285      1.412872                  33.2958
114   TU_0040383_0  chr3: 133360541-133429262    33.548813      2.553955                  85.8384
115   MUC1          chr1: 153424923-153429324    33.495501      15.91355                  627.622
116   TU_0049202_0  chr4: 102257900-102304755    33.391066      1.5555505                 39.7522
117   TU_0120913_0  chr2: 181254530-181266950    33.188328      1                         43.8515
118   B4GALNT4      chr11: 359794-372116         33.176248      6.3749                    80.9639
119   TU_0100059_0  chr8: 141258835-141260573    33.169029      1.3615865                 44.8943
120   TOP2A         chr17: 35798321-35827695     33.132056      1.9725825                 34.1032
121   MUC1          chr1: 153424923-153429324    33.081326      15.9539                   632.042
122   TU_0001265_0  chr6: 27081719-27082291      33.045746      1.3381905                 100.5401
123   C7orf53       chr7: 111908143-111918171    33.024251      2.820945                  32.2465
124   SLC45A2       chr5: 33980477-34020537      32.952911      2.012104                  54.8589
125   TU_0099869_0  chr8: 128138047-128225937    32.928048      1.308804                  30.4667
126   UGT1A6        chr2: 234266250-234346690    32.918772      1.662221                  31.4671
127   TU_0120917_0  chr2: 181265370-181266950    32.796137      1.0771403                 36.3557
128   CACNA1D       chr3: 53504070-53821532      32.608994      4.51306                   44.9904
129   UBE2C         chr20: 43874661-43879003     32.456813      1.6391285                 58.398
130   ALDOC         chr17: 23924259-23928078     32.455953      14.98415                  228.812
131   MUC1          chr1: 153424923-153429324    32.44845       15.5895                   599.062
132   MMP11         chr22: 22445035-22456503     32.411555      3.257735                  73.9158
133   TU_0084303_0  chr5: 15899476-15955226      32.39036       2.21168                   14.4385
134   CACNA1D       chr3: 53504070-53821532      32.381439      4.484655                  44.6867
135   UBE2C         chr20: 43874661-43879003     32.358151      1.705223                  57.8559
136   CACNA1D       chr3: 53504070-53821532      32.353332      4.463805                  44.2455
137   FGFRL1        chr4: 995609-1010686         32.275762      26.0133                   450.449
138   FGFRL1        chr4: 996251-1010686         32.075261      27.0148                   468.809
139   FGFRL1        chr4: 995759-1010686         32.069901      26.92945                  467.246
140   MUC1          chr1: 153424923-153429324    32.011017      15.3218                   586.058
141   TU_0099922_0  chr8: 128979617-128981414    31.833339      3.32544                   32.6893
142   TU_0001173_0  chr6: 26385234-26386052      31.823293      2.339595                  71.3388
143   MUC1          chr1: 153424923-153429324    31.781267      15.22945                  587.582
144   TMEM178       chr2: 39746141-39798605      31.614406      13.40605                  182.08
145   UBE2C         chr20: 43874661-43879003     31.37539       1.7154185                 58.1531
146   KCNC2         chr12: 73720162-73889778     31.294059      1.8783795                 104.225
147   MAGEC2        chrX: 141117794-141120742    31.286618      1                         34.1099
148   SERHL2        chr22: 41279868-41300332     31.131788      3.670135                  61.9969
149   KCNC2         chr12: 73720162-73889778     31.126593      1.868714                  108.199
150   GRAMD4        chr22: 45401321-45454352     31.063732      5.977725                  79.8338









Table 8 shows the number of cancer-associated lncRNAs nominated for four major cancer types. The number validated is indicated in the column on the right. This table reflects ongoing efforts.













TABLE 8

Cancer type        # of cancer-specific lncRNAs nominated  # validated to date
Prostate cancer    121                                     11
Breast cancer      6                                       6
Lung cancer        36                                      32
Pancreatic cancer  34                                      0










All publications, patents, patent applications and accession numbers mentioned in the above specification are herein incorporated by reference in their entirety. Although the invention has been described in connection with specific embodiments, it should be understood that the invention as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications and variations of the described compositions and methods of the invention will be apparent to those of ordinary skill in the art and are intended to be within the scope of the following claims.

Claims
  • 1. A method of screening for the presence of prostate cancer in a subject, comprising (a) contacting a biological sample from a subject with a reagent for detecting the level of expression of one or more non-coding RNAs (ncRNA) selected from the group consisting of PCAT1, PCAT14, PCAT43 and PCAT109; and (b) detecting the level of expression of said ncRNA in said sample using an in vitro assay, wherein an increased level of expression of said ncRNA in said sample relative to the level in normal prostate cells is indicative of prostate cancer in said subject.
  • 2. The method of claim 1, wherein the sample is selected from the group consisting of tissue, blood, plasma, serum, urine, urine supernatant, urine cell pellet, semen, prostatic secretions and prostate cells.
  • 3. The method of claim 1, wherein detection is carried out utilizing a method selected from the group consisting of a sequencing technique, a nucleic acid hybridization technique, a nucleic acid amplification technique, and an immunoassay.
  • 4. The method of claim 3, wherein the nucleic acid amplification technique is selected from the group consisting of polymerase chain reaction, reverse transcription polymerase chain reaction, transcription-mediated amplification, ligase chain reaction, strand displacement amplification, and nucleic acid sequence based amplification.
  • 5. The method of claim 1, wherein said cancer is selected from the group consisting of localized prostate cancer and metastatic prostate cancer.
  • 6. The method of claim 1, wherein said reagent is selected from the group consisting of a pair of amplification oligonucleotides and an oligonucleotide probe.
  • 7. A method of screening for the presence of prostate cancer in a subject, comprising (a) contacting a biological sample from a subject with a reagent for detecting the level of expression of two or more non-coding RNAs (ncRNA) selected from the group consisting of PCAT1, PCAT2, PCAT3, PCAT4, PCAT5, PCAT6, PCAT7, PCAT8, PCAT9, PCAT10, PCAT11, PCAT12, PCAT13, PCAT14, PCAT15, PCAT16, PCAT17, PCAT18, PCAT19, PCAT20, PCAT21, PCAT22, PCAT23, PCAT24, PCAT25, PCAT26, PCAT27, PCAT28, PCAT29, PCAT30, PCAT31, PCAT32, PCAT33, PCAT34, PCAT35, PCAT36, PCAT37, PCAT38, PCAT39, PCAT40, PCAT41, PCAT42, PCAT43, PCAT44, PCAT45, PCAT46, PCAT47, PCAT48, PCAT49, PCAT50, PCAT51, PCAT52, PCAT53, PCAT54, PCAT55, PCAT56, PCAT57, PCAT58, PCAT59, PCAT60, PCAT61, PCAT62, PCAT63, PCAT64, PCAT65, PCAT66, PCAT67, PCAT68, PCAT69, PCAT70, PCAT71, PCAT72, PCAT73, PCAT74, PCAT75, PCAT76, PCAT77, PCAT78, PCAT79, PCAT80, PCAT81, PCAT82, PCAT83, PCAT84, PCAT85, PCAT86, PCAT87, PCAT88, PCAT89, PCAT90, PCAT91, PCAT92, PCAT93, PCAT94, PCAT95, PCAT96, PCAT97, PCAT98, PCAT99, PCAT100, PCAT101, PCAT102, PCAT103, PCAT104, PCAT105, PCAT106, PCAT107, PCAT108, PCAT109, PCAT110, PCAT111, PCAT112, PCAT113, PCAT114, PCAT115, PCAT116, PCAT117, PCAT118, PCAT119, PCAT120, and PCAT121; and (b) detecting the level of expression of said ncRNA in said sample using an in vitro assay, wherein an increased level of expression of said ncRNA in said sample relative to the level in normal prostate cells is indicative of prostate cancer in said subject.
  • 8. The method of claim 7, wherein said two or more ncRNAs is ten or more.
  • 9. The method of claim 7, wherein said two or more ncRNAs is 25 or more.
  • 10. The method of claim 7, wherein said two or more ncRNAs is 50 or more.
  • 11. The method of claim 7, wherein said two or more ncRNAs is 100 or more.
  • 12. The method of claim 7, wherein said two or more ncRNAs is all 121 ncRNAs.
  • 13. An array, comprising reagents for detection of two or more ncRNAs selected from the group consisting of PCAT1, PCAT2, PCAT3, PCAT4, PCAT5, PCAT6, PCAT7, PCAT8, PCAT9, PCAT10, PCAT11, PCAT12, PCAT13, PCAT14, PCAT15, PCAT16, PCAT17, PCAT18, PCAT19, PCAT20, PCAT21, PCAT22, PCAT23, PCAT24, PCAT25, PCAT26, PCAT27, PCAT28, PCAT29, PCAT30, PCAT31, PCAT32, PCAT33, PCAT34, PCAT35, PCAT36, PCAT37, PCAT38, PCAT39, PCAT40, PCAT41, PCAT42, PCAT43, PCAT44, PCAT45, PCAT46, PCAT47, PCAT48, PCAT49, PCAT50, PCAT51, PCAT52, PCAT53, PCAT54, PCAT55, PCAT56, PCAT57, PCAT58, PCAT59, PCAT60, PCAT61, PCAT62, PCAT63, PCAT64, PCAT65, PCAT66, PCAT67, PCAT68, PCAT69, PCAT70, PCAT71, PCAT72, PCAT73, PCAT74, PCAT75, PCAT76, PCAT77, PCAT78, PCAT79, PCAT80, PCAT81, PCAT82, PCAT83, PCAT84, PCAT85, PCAT86, PCAT87, PCAT88, PCAT89, PCAT90, PCAT91, PCAT92, PCAT93, PCAT94, PCAT95, PCAT96, PCAT97, PCAT98, PCAT99, PCAT100, PCAT101, PCAT102, PCAT103, PCAT104, PCAT105, PCAT106, PCAT107, PCAT108, PCAT109, PCAT110, PCAT111, PCAT112, PCAT113, PCAT114, PCAT115, PCAT116, PCAT117, PCAT118, PCAT119, PCAT120, and PCAT121.
  • 14. The array of claim 13, wherein said two or more ncRNAs is ten or more.
  • 15. The array of claim 13, wherein said two or more ncRNAs is 25 or more.
  • 16. The array of claim 13, wherein said two or more ncRNAs is 50 or more.
  • 17. The array of claim 13, wherein said two or more ncRNAs is 100 or more.
  • 18. The array of claim 13, wherein said two or more ncRNAs is all 121 ncRNAs.
  • 19. The array of claim 13, wherein said reagent is selected from the group consisting of a pair of amplification oligonucleotides and an oligonucleotide probe.
Parent Case Info

This application is a continuation of U.S. patent application Ser. No. 16/453,195, filed Jun. 26, 2019, which is a continuation of U.S. patent application Ser. No. 15/064,266, filed Mar. 8, 2016, which is a continuation of U.S. patent application Ser. No. 13/299,000, filed Nov. 17, 2011, which claims priority to provisional application 61/415,490, filed Nov. 19, 2010, which is herein incorporated by reference in its entirety.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT

This invention was made with government support under CA069568, CA132874 and CA111275 awarded by the National Institutes of Health and W81XWH-09-2-0014 awarded by the Army Medical Research and Materiel Command. The government has certain rights in the invention.

Provisional Applications (1)
Number Date Country
61415490 Nov 2010 US
Continuations (3)
Number Date Country
Parent 16453195 Jun 2019 US
Child 17813193 US
Parent 15064266 Mar 2016 US
Child 16453195 US
Parent 13299000 Nov 2011 US
Child 15064266 US