This application is a 371 of PCT/IT2006/000664, filed Sep. 19, 2006, which claims the benefit of Italian patent application RM2005A000475, filed Sep. 19, 2005, the contents of each of which are incorporated herein by reference.
The instant invention concerns sequences transcribed by RNA polymerase III from type III promoters and their use in medicine, agronomy and biotechnology.
The author has identified a previously unknown transcription of non-coding, non-polyadenylated genome elements that are synthesized by means of RNA Pol III type III promoters or of very similar elements. In addition to their identification and molecular characterization, these new transcription units were functionally analyzed and their regulatory features were identified. Each transcription unit is functionally related to a specific RNA Pol II transcript, giving rise to specific sense/antisense sequence molecules.
Recent advances in mammalian genome studies are bringing to light the widespread transcription of non-coding (nc) regions devoted to regulating the expression of the protein-coding genome [1-4]. The mechanisms of action of these transcripts are varied, although all of them regulate fundamental genetic pathways involved in determining the cell phenotype. The concomitant evolution of non-coding regulatory transcripts and of proteins that target different RNA:RNA or RNA:DNA complexes emphasizes the importance of studying the regulatory processes mediated by nucleic acid interactions. It is now clear that, in prokaryotes as well as in eukaryotes, different ncRNAs can act in cis and be simultaneously regulated in trans by other non-coding transcripts. The simultaneous occurrence of cis and trans regulatory elements brings to light the complexity of this network, in which the coexistence of different non-coding RNAs plays a key role in controlling the expression of other target genes [5]. In this context a prominent role is played by the growing family of microRNAs (miRNAs), which act at the post-transcriptional level by inhibiting the translation of protein-coding genes [6]. The known miRNAs, like protein-coding mRNAs, are synthesized as polyadenylated precursor molecules by the RNA Polymerase II transcription machinery [7]. Although the vast majority of the tools used in molecular biology are based on transcript collections obtained by oligo-dT RT-PCR (thus encompassing only polyadenylated RNA Polymerase II products), a wide contribution of non-polyadenylated transcripts to the human transcriptome has been shown [8]. However, the role of such transcripts in regulating the expression of the Pol II transcriptome remains largely unexplored.
Among the non-coding elements, one of the most investigated has been the Alu class of repetitive sequences, which represents about one tenth of the whole human genome. Although it is not yet possible to discern a single specific role for Alus, these short transcripts have been shown to be involved in several biological processes such as RNA editing (where Alus are preferential sites for A to I RNA editing, with profound implications both for gene expression regulation and for mammalian genome evolution) [9], alternative splicing (internal exons that contain an Alu sequence are almost always alternatively spliced) [10], chromosomal recombination (recombination between Alu elements underlies many genomic deletions associated with human genetic disorders) [11], gene expression regulation (functioning as naturally occurring antisense RNAs) [12], cell stress response (such as heat shock response and/or translation inhibition) [13] and as putative miRNA targets [14]. However, although the physiological role of Alus and of all the other 7SL-derived transcripts needs to be studied in further detail, the fact that their transcription is RNA Polymerase (Pol) III-dependent brings to light a previously unexpected role of this enzyme in gene expression regulation that needs to be investigated in detail.
In this work we focus on a specific class of non-coding RNAs, starting from a theoretical hypothesis on their putative function. Starting from the observation that RNA Polymerase (Pol) III is specialized in the transcription of non-coding RNA (ncRNA) genes, we postulated the presence in the genome of a large number of Pol III (or Pol III-like) transcription units, each specifically regulating one (or more) specific Pol II genes, thus constituting functional “co-gene”/gene pairs.
Therefore it is an object of the invention a nucleic acid molecule comprising a nucleotide sequence that is characterized by:
being transcribed by an RNA polymerase III,
not undergoing the addition of a polyadenylated tail (which, by contrast, occurs for Pol II-transcribed genes), and
being able to modulate the expression of one or more specific RNA polymerase II-transcribed target genes.
Preferably said nucleotide sequence comprises a sequence of at least 50 nucleotides that is at least 70% identical to a fragment of one of the strands of the specific RNA polymerase II-transcribed target genes.
More preferably said sequence of at least 50 nucleotides is in a sense or an antisense configuration with respect to the fragment of one of the strands of the specific RNA polymerase II-transcribed target genes.
In a particular aspect the nucleic acid of the invention is comprised in one of the sequences from SEQ ID No. 51 to SEQ ID No. 84; preferably the sequence of at least 50 nucleotides that is at least 70% identical to a fragment of one of the strands of the specific RNA polymerase II-transcribed target gene is comprised in one of the underlined fragments of the sequences from SEQ ID No. 51 to SEQ ID No. 84.
It is another object of the invention an expression vector comprising the nucleic acid according to the invention.
It is another object of the invention an array for the detection of specific nucleic acid sequences containing a repertoire of nucleic acids according to the invention.
It is another object of the invention the use of the nucleic acid according to the invention to modulate the expression of RNA polymerase II transcribed genes.
It is another object of the invention the use of the nucleic acid according to the invention to identify a target sequence for treatment and/or prevention of a molecular pathology, preferably an age related pathology, including Alzheimer disease; alternatively the pathology is caused by an alteration of cell proliferation, preferably the pathology is a tumor associated pathology.
It is another object of the invention a nucleic acid comprising at least one sequence being able to modulate the RNA polymerase III mediated expression of the nucleic acid as above described, preferably the sequence being able to modulate the RNA polymerase III mediated expression of the nucleic acid as above described is a promoter sequence.
In a particular aspect the sequence being able to modulate the RNA polymerase III mediated expression of the nucleic acid as above described is comprised in one of the sequences from SEQ ID No. 51 to SEQ ID No. 84. Preferably the sequence being able to modulate the RNA polymerase III mediated expression of the nucleic acid according to claims 1 to 5 is comprised in the bold regions of sequences from SEQ ID No. 51 to SEQ ID No. 84.
It is another object of the invention the use of the nucleic acid comprising the sequence being able to modulate the RNA polymerase III mediated expression of the nucleic acid as above described to modulate the expression of one or more specific RNA polymerase II-transcribed target genes.
It is another object of the invention the use of the nucleic acid comprising the sequence able to modulate the RNA polymerase III mediated expression of the nucleic acid as above described to identify a target sequence for treatment and/or prevention of a molecular pathology, preferably the pathology is an age related pathology, including Alzheimer disease. Alternatively the pathology is caused by an alteration of cell proliferation, preferably the pathology is a tumor associated pathology.
It is another object of the invention a vector comprising the nucleic acid comprising the sequence able to modulate the RNA polymerase III mediated expression of the nucleic acid as above described to get expression or silencing of an RNA polymerase II-transcribed specific nucleotide sequence.
The invention shall be described in the following non-limiting examples, by referring to figures.
All the sequence searches and alignments were carried out using the Basic Local Alignment Search Tool (“BLAST”) available at the web site of the (U.S.) National Center for Biotechnology Information. The sequences used as queries were the following: H1 PSE, nCACCATAAAnGTGAAAn (SEQ ID No. 1) or nTTTCACnTTTTATGGTGn (SEQ ID No. 2); U6 PSE (Acc No: M14486), CTTACCGTAACTTGAAAGT (SEQ ID No. 3); 7SL PSE (as reported in PMID: 2011518), TTGACC-TAAGTG (SEQ ID No. 4); DSE (Oct1 consensus sequence), ATTTGCAT (SEQ ID No. 5) or ATGCAAAT (SEQ ID No. 6), with or without a single-base mismatch.
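The query logic described above (a consensus with "n" wildcard positions, searched with or without a single mismatch) can be illustrated with a minimal Python sketch. This is not the BLAST algorithm used in the work; it is a plain sliding-window scan written only for illustration. The function names `window_matches` and `find_motif` are hypothetical, and the test sequence is the first line of Table 1.

```python
def window_matches(query, window, max_mismatches=1):
    """Return True if `window` matches `query`.

    An 'n' in the query matches any base; every other position must be
    identical, except for at most `max_mismatches` differences.
    """
    mismatches = 0
    for q, w in zip(query.upper(), window.upper()):
        if q == "N":
            continue  # wildcard position, never counted as a mismatch
        if q != w:
            mismatches += 1
            if mismatches > max_mismatches:
                return False
    return True


def find_motif(query, sequence, max_mismatches=1):
    """Return the 0-based start positions of all windows matching `query`."""
    k = len(query)
    return [
        i
        for i in range(len(sequence) - k + 1)
        if window_matches(query, sequence[i:i + k], max_mismatches)
    ]


# Queries taken from the text (SEQ ID No. 1 and SEQ ID No. 5).
H1_PSE = "nCACCATAAAnGTGAAAn"
DSE = "ATTTGCAT"

# First line of Table 1, used here only as a small test sequence.
TABLE1_FIRST_LINE = "ATTTGCATGTCGCTATGTGTTCTGGGAAATCACCATAAACGTGAAATGTC"

pse_hits = find_motif(H1_PSE, TABLE1_FIRST_LINE, max_mismatches=1)
dse_hits = find_motif(DSE, TABLE1_FIRST_LINE, max_mismatches=0)
```

A scan of this kind only covers one strand; the reverse-complement queries (SEQ ID No. 2 and SEQ ID No. 6) would be searched in the same way.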
Cell Culture, Transfection and Luciferase Assay
For transient transfections, HeLa cells (grown in DMEM supplemented with 10% FCS) were seeded in multiwell Petri dishes 16 hours before transfection. The expression constructs [21A, 21A(1), 21A(2), 21A(3)] containing the regions of interest cloned in the pTopo vectors (Invitrogen) were introduced into the cells using the Fugene 6 transfection reagent (Roche) according to the manufacturer's instructions. A plasmid expressing luciferase was used as a control of transfection efficiency (to which all the results were normalized). At 24, 48 and 72 hours after transfection, cells were harvested and firefly luciferase activity was measured with the Dual-Luciferase reporter assay system (Promega) according to the manufacturer's protocol. In order to specifically inhibit RNA Polymerase III and/or RNA Polymerase II, a cell-permeable chlorobenzenesulfonamide (ML-60218) (Calbiochem, California USA) and/or α-amanitin (Roche Diagnostics GmbH, Germany) were used at concentrations of 20 μM and 10 μg/ml respectively in the medium for 25 h (ML-60218) and 12 h (α-amanitin) before the luciferase activity detection.
RNAi-Silencing Assay
In order to test the promoter activity of the novel transcription units we prepared six plasmid constructs expressing a firefly luciferase silencing hairpin (obtained from Gregory Hannon's laboratory, Cold Spring Harbor Laboratory), whose transcription was driven by the 11A, 14A, 21A, 29A, 38A and 51A promoters respectively. The hairpin sequence [targeting a firefly luciferase mRNA from a co-transfected expression plasmid (Promega)] is:
Oligos used to subclone the novel Pol III Type III promoters within Not I/Hind III restriction sites (in capitals) were the following:
In this analysis the above constructs were co-transfected with a pGL3 plasmid (Promega) expressing Firefly (ff1) Luciferase as the target to be silenced and with a pRL plasmid (Promega) expressing a Renilla Luciferase, to which all the determinations were normalized. At 24, 48 and 72 hours after transfection, cells were harvested and firefly/Renilla luciferase activities were measured with the Dual-Luciferase reporter assay system (Promega) according to the manufacturer's protocol.
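The normalization described for the silencing assay (firefly signal divided by the Renilla signal of the same well, then compared with a control) can be sketched in Python as follows. The function names and the numeric readings are hypothetical and serve only as an illustration; the text does not specify which control sample the normalized values were compared with.

```python
def normalized_activity(firefly, renilla):
    """Firefly luciferase signal normalized to the co-transfected Renilla signal."""
    if renilla <= 0:
        raise ValueError("Renilla signal must be positive to normalize")
    return firefly / renilla


def relative_to_control(sample_ratio, control_ratio):
    """Express a normalized ratio as a fraction of the control ratio.

    A value below 1.0 indicates a reduced firefly signal, as would be
    expected if the hairpin silences the firefly luciferase target.
    """
    if control_ratio <= 0:
        raise ValueError("Control ratio must be positive")
    return sample_ratio / control_ratio


# Hypothetical luminometer readings (arbitrary units), for illustration only.
control = normalized_activity(firefly=8000.0, renilla=4000.0)
hairpin = normalized_activity(firefly=2000.0, renilla=4000.0)
residual = relative_to_control(hairpin, control)
```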
Plasmid Constructs Generation and Sequencing
The plasmid constructs p21A, p21A(1), p21A(2), p21A(3) were generated by amplifying the regions of interest from a genomic DNA preparation; the PCR products were then subcloned in a pTOPO Vector (Invitrogen) following the manufacturer's instructions. The oligos used to generate p21A PCR fragments were the following:
The plasmid construct pAnti-21A was generated by amplifying the transcribed region from the p21A plasmid using the following oligos:
thus generating the transcribed region in anti-sense configuration. The pAnti-21A promoter was obtained by amplifying p21A promoter with the following oligos:
The PCR products were digested with the restriction enzyme Bam HI, purified by gel electrophoresis and ligated by T4 ligase (Invitrogen). The insert obtained was then subcloned in pTOPO vector (Invitrogen) following manufacturer's instructions. Prior to transfection all the plasmids were sequenced by DNA Sequencing Kit (Applied Biosystems) following manufacturer's instructions.
RT-PCR Reactions
In order to isolate and sequence a partial 21A cDNA we performed different RT-PCR reactions. Starting from about 5 μg of total RNA, cDNA was synthesized using an Oligo(dT)12-18 primer or a random hexamer mix and the Superscript first-strand synthesis system for RT-PCR (Invitrogen). cDNAs were diluted 10-50 times, then subjected to PCR reactions. The oligos used to isolate the 21A RT-PCR product were: oligo forward 21AF 5′-gctcacgtagtcccagcacttt-3′ (SEQ ID No. 29) and oligo reverse 21AR 5′-actatgttgcccaagctggtct-3′ (SEQ ID No. 30).
PCR products were separated on 1.5-2% agarose gel. The DNA bands were cut, purified by the DNA Gel Extraction Kit (Millipore) and sequenced.
Real-Time Quantitative RT-PCR
The RNA for 21A was measured by real-time quantitative RT-PCR using the PE ABI PRISM® 7700 Sequence Detection System (Perkin Elmer) and the Sybr Green method. The sequences of the 21A forward and reverse primers, as designed by the Primer Express 1.5 software, were 5′-GCTGAGGCAGGAGGATCACT-3′ (SEQ ID No. 31) and 5′-GCACTACCACACCCAGCTAATTTT-3′ (SEQ ID No. 32). The sequences of the CENP-F forward and reverse primers were 5′-CTGCAGAAAGAACTCTCTCAACTTC-3′ (SEQ ID No. 33) and 5′-TCAACAATTAAGTAGCTGGAACCA-3′ (SEQ ID No. 34). As endogenous control, the expression of the glyceraldehyde 3-phosphate dehydrogenase (GAPDH) gene was examined. The sequences of the human GAPDH primers were 5′-GAAGGTGAAGGTCGGAGTC-3′ (SEQ ID No. 35) and 5′-GAAGATGGTGATGGGATTTC-3′ (SEQ ID No. 36). The sequences of the human 5S rRNA primers were 5′-TACGGCCATACCACCCTGAA-3′ (SEQ ID No. 37) and 5′-GCGGTCTCCCATCCAAGTAC-3′ (SEQ ID No. 38). Relative transcript levels were determined from the relative standard curve constructed from stock cDNA dilutions, and divided by the target quantity of the calibrator following the manufacturer's instructions.
Anti-21A siRNA Synthesis
The Anti-21A siRNA was synthesized against a region of the 21A transcript with no homology to CENP-F, so that the silencing effect was specific for the Pol III regulatory RNA and did not interfere with CENP-F mRNA stability. The siRNA synthesis was carried out using the siRNA Construction Kit (Ambion, USA) according to the manufacturer's protocol. The Sense/Antisense oligos used were: 5′-aaGTGTGGTGGCTCACcctgtctc-3′ (SEQ ID No. 39) and 5′-aaGTGAGCCACCACACcctgtctc-3′ (SEQ ID No. 40).
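The relationship between the two oligos can be checked with a short Python sketch: the uppercase core of the antisense oligo is the reverse complement of the uppercase core of the sense oligo. The helper names are hypothetical. The lowercase flanks are identical in both oligos and are assumed here to be kit-specific leader sequences that are not part of the targeting region; the text does not state this explicitly.

```python
# Translation table for DNA complementation (both cases handled).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def uppercase_core(oligo):
    """Keep only the uppercase bases of an oligo written in mixed case."""
    return "".join(base for base in oligo if base.isupper())


# Oligos as reported in the text (SEQ ID No. 39 and SEQ ID No. 40).
sense_oligo = "aaGTGTGGTGGCTCACcctgtctc"
antisense_oligo = "aaGTGAGCCACCACACcctgtctc"

cores_are_complementary = (
    reverse_complement(uppercase_core(sense_oligo))
    == uppercase_core(antisense_oligo)
)
```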
Proliferation Assay
We tested the proliferation of HeLa cells transfected with the 21A, 21A-1, 21A-2, 21A-3 and Anti-21A constructs by plating 5×10⁵ cells per well in round-bottomed 96-well plates; cells were incubated for 24, 48 or 72 hours after transfection and pulsed with ³H-thymidine (1.0 μCi/10 μl/well) (Amersham Biosciences) for the last 18 hours. We harvested the cells and evaluated cell proliferation by counting thymidine uptake. We calculated the average proliferation rate, measured as counts per minute (cpm), and the standard deviation (SD) for the triplicate wells of each sample.
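The summary statistics described above can be computed as in the following Python sketch. The cpm values are hypothetical, and the use of the sample standard deviation (n − 1 in the denominator) is an assumption, since the text does not specify which estimator was used.

```python
import statistics


def summarize_triplicate(cpm_values):
    """Return (mean, standard deviation) of the cpm counts of replicate wells.

    statistics.stdev computes the sample standard deviation (n - 1), which
    is assumed here; the text does not state the estimator.
    """
    if len(cpm_values) < 2:
        raise ValueError("At least two replicate wells are required")
    return statistics.mean(cpm_values), statistics.stdev(cpm_values)


# Hypothetical thymidine-uptake counts for the triplicate wells of one sample.
mean_cpm, sd_cpm = summarize_triplicate([100.0, 110.0, 120.0])
```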
RNA Isolation and Northern Blot Analysis
Total RNA was extracted using TRIzol reagent (Invitrogen), which is based on a single-step acid-phenol guanidinium method, according to the manufacturer's protocol. Total RNAs from HeLa cells were electrophoresed through 1.5% agarose gels in the presence of formaldehyde and blotted onto Hybond N membranes (Amersham). The blot was hybridized with an 85 bp long probe containing the region from nucleotide 1194 to nucleotide 1278 of the reported 21A sequence (see Table 1), spanning a region internal to the transcript and complementary (96%) to part of the CENP-F mRNA. The probe was obtained by PCR (using the 21A plasmid construct as template) with the following oligos: 21AF 5′-GCTCACGTAGTCCCAGCACTTT-3′ (SEQ ID No. 41); 21AR 5′-AGACCAGCTTGGGCAACATAGT-3′ (SEQ ID No. 42). Blot prehybridization was performed at 65° C. for 2 h in 333 mM NaH2PO4 pH 7.2, 6.66% sodium dodecyl sulphate and 250 mg/ml denatured salmon sperm DNA. Blot hybridization was performed at 65° C. for 18 hours in the same solution containing 10⁶ cpm/ml of denatured and labeled probe. After hybridization the blots were washed twice at 65° C. for 30 min in 0.2% sodium dodecyl sulphate, 2×SSPE and once at 65° C. for 30 min in 0.2% sodium dodecyl sulphate, 0.2×SSPE. Membranes were exposed to autoradiographic films for 24/48 hours and then developed.
Real-Time Quantitative RT-PCR
Total RNA preparations from different CENP-F (Centromeric Protein F) (Acc. No. NM_016343) samples were subjected to reverse transcription with the SuperScript II First Strand Synthesis Kit (Invitrogen) following the manufacturer's instructions. The cDNA obtained was measured by real-time quantitative RT-PCR using the PE ABI PRISM® 7700 Sequence Detection System (Perkin Elmer). The sequences of the forward and reverse primers, as designed by the Primer Express 1.5 software, were 5′-CTGCAGAAAGAACTCTCTCAACTTC-3′ (SEQ ID No. 43) and 5′-AGTTGTTAATTCATCGACCTTGGT-3′ (SEQ ID No. 44). The TaqMan™ fluorogenic probe used was 5′-FAM-AGTACCTGTTTTCTGCTTCTCCTGTGCAGC-TAMRA-3′ (SEQ ID No. 45).
The probe was placed at the junction between two exons. During PCR amplification, the 5′ nucleolytic activity of Taq polymerase cleaves the probe, separating the 5′ reporter fluorescent dye from the 3′ quencher dye. The threshold cycle, CT, which correlates inversely with the target mRNA levels, was measured as the cycle number at which the reporter fluorescent emission increases above a threshold level. As endogenous control, the expression of the glyceraldehyde 3-phosphate dehydrogenase (GAPDH) gene was examined by quantitative RT-PCR as described above. The sequences of the human GAPDH primers and probe were 5′-GAAGGTGAAGGTCGGAGTC-3′ (SEQ ID No. 46), 5′-GAAGATGGTGATGGGATTTC-3′ (SEQ ID No. 47) and 5′-TET-CAAGCTTCCCGTTCTCAGCC-TAMRA-3′ (SEQ ID No. 48).
Relative transcript levels were determined from the relative standard curve constructed from stock cDNA dilutions, and divided by the target quantity of the calibrator following manufacturer's instructions.
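A relative standard curve of this kind is commonly handled by fitting CT against the logarithm of the input quantity of the dilution series and interpolating unknown samples on the fitted line. The following Python sketch illustrates that general approach under stated assumptions; it is not the instrument software's procedure. The function names and all numeric values are hypothetical, and the additional normalization to the endogenous control in `relative_level` is an assumption about how the GAPDH measurement was used.

```python
import math


def fit_standard_curve(quantities, cts):
    """Least-squares fit of CT = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Interpolate a relative quantity from a measured CT on the curve."""
    return 10 ** ((ct - intercept) / slope)


def relative_level(target_qty, control_qty, calib_target_qty, calib_control_qty):
    """Target normalized to the endogenous control, divided by the calibrator."""
    return (target_qty / control_qty) / (calib_target_qty / calib_control_qty)


# Hypothetical dilution series of stock cDNA (relative units) and CT values.
slope, intercept = fit_standard_curve([1.0, 0.1, 0.01], [20.0, 23.0, 26.0])
sample_qty = quantity_from_ct(23.0, slope, intercept)
```

Because CT correlates inversely with the target level, the fitted slope is negative; a higher CT interpolates to a lower relative quantity.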
Western Blot Analysis
Equal amounts of protein (10 μg/sample) from each sample were loaded on standard 4-12% NU-PAGE gradient gels (Invitrogen S.r.l., Milano, Italy). Blotting onto Protran nitrocellulose membranes (Schleicher & Schuell, Dassel, Germany) was performed in the X-Cell Sure Lock™ Electrophoresis Cell (Invitrogen S.r.l.), according to the manufacturer's instructions. The membranes were saturated overnight in 3% non-fat milk in TTBS buffer (500 nM NaCl; 20 mM Tris/Cl, pH 7.5; 0.05% Tween-20) and incubated for 4 hours at room temperature with the human Anti-Mitosin/CenPF ab90 (ABCAM, Cambridge, UK) and/or anti-Alpha Tubulin (Sigma, Missouri USA) mouse monoclonal antibodies. The Anti-Mitosin antibody recognized a weak signal at a very high apparent molecular mass (350-400 kDa) while the Anti-Alpha Tubulin showed a clear signal at 45 kDa. The immunoreactive band was revealed by an alkaline phosphatase-conjugated affinity-purified monoclonal anti-rabbit mouse IgG (Sigma-Aldrich Inc.) and (in the experiment indicated in
Anti-21A siRNA Synthesis
The Anti-21A siRNA was synthesized against a region of the 21A transcript with no homology to CENP-F, so that the silencing effect was specific for the Pol III regulatory RNA and did not interfere with CENP-F mRNA stability. The siRNA synthesis was carried out using the siRNA Construction Kit (Ambion, USA) according to the manufacturer's protocol. The Sense/Antisense oligos used were: 5′-aaGTGTGGTGGCTCACcctgtctc-3′ (SEQ ID No. 49) and 5′-aaGTGAGCCACCACACcctgtctc-3′ (SEQ ID No. 50).
Results
In Silico Identification of a Novel Set of snRNA Gene-Like Transcriptional Units in the Human Genome
To test our hypothesis we focused on Pol III Type III extragenic promoters, which are located upstream of the transcribed region. We screened the human genome for regions containing the consensus sequences characteristic of Pol III Type III promoters: the Proximal Sequence Element (PSE) and the Distal Sequence Element (DSE) [15, 16]. First, we tested the PSE sequences of three well characterized Pol III Type III non-coding (nc) RNAs (U6, H1, 7SL) for their ability to identify a large number of similar (if not identical) elements in the human genome, using the BLAST (Basic Local Alignment Search Tool) algorithm as bioinformatic tool (available at the web site of the (U.S.) National Center for Biotechnology Information; “Short Nearly Exact Matches” option, “Homo sapiens” organism database; for the sequences used as queries see Materials and Methods). Interestingly, while the first search with U6 and 7SK did not identify a significant number of homologous regions scattered throughout the genome, the H1 consensus elements shared high homology with 60 novel putative consensus sequences. Among these we selected (by a BLAST analysis) those that contained a DSE consensus sequence within an arbitrarily defined distance of 1000 base pairs upstream of the PSE. The results revealed 33 putative novel PSE/DSE-dependent promoters. In order to test the functional relationship between the occurrence of the PSE and DSE consensus elements within that defined genomic distance, we examined the frequency of DSE consensus element occurrence versus the PSE-DSE distance in the whole pool of novel promoters. The results pointed to an inverse correlation between DSE occurrence and its distance from the PSE. A very high frequency of DSE elements was associated with a distance of about one nucleosome (about 200 bp) from the PSE, and the frequency decreased significantly at about 800 base pairs from the PSE [17].
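The selection step and the distance analysis described above can be sketched in Python as follows. Given the start coordinates of PSE and DSE hits on the same strand, the sketch keeps the DSE hits lying within 1000 bp upstream of a PSE and counts them per distance bin. The function names, the 200 bp bin size, the start-to-start definition of distance and the example coordinates are all assumptions made for illustration; the text does not specify how the distance was measured.

```python
from collections import Counter


def upstream_dse_distances(pse_starts, dse_starts, max_distance=1000):
    """Distances (in bp) from each PSE to every DSE located upstream of it.

    Distance is measured start to start (an assumption); only DSEs at a
    lower coordinate than the PSE and within `max_distance` are kept.
    """
    return [
        pse - dse
        for pse in pse_starts
        for dse in dse_starts
        if 0 < pse - dse <= max_distance
    ]


def bin_distances(distances, bin_size=200):
    """Count distances per bin; each key is the lower bound of its bin."""
    return Counter((d // bin_size) * bin_size for d in distances)


# Hypothetical coordinates on one genomic region, for illustration only.
distances = upstream_dse_distances(pse_starts=[1500], dse_starts=[1350, 900, 300, 1600])
histogram = bin_distances(distances)
```

Plotting the bin counts against the bin lower bounds would give the DSE frequency versus PSE-DSE distance profile discussed in the text.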
Although the restricted number of putative DSE elements did not permit a proper statistical analysis, the inverse correlation between DSE frequency and DSE-PSE distance was taken as a preliminary indication of their functional relationship in these novel promoters (
However, since Pol III Type III promoters were the basis of our search, some of their structural features needed to be considered: i) the occurrence of a PSE consensus sequence does not per se constitute the minimal Pol III Type III promoter, which is instead the result of the simultaneous occurrence, at an appropriate distance, of the PSE and an A/T rich element (TATA box). In fact, it has been clearly demonstrated that a PSE consensus lacking a downstream A/T rich element makes the promoter readable by RNA Pol II, as in the case of snRNA U2 [16]. In this context the transcription start site is not relevant for the choice of the RNA polymerase, at least in humans, although it seems to be of fundamental importance in Xenopus [18]. Therefore the putative transcription units identified by our search might be transcribed either by Pol II or by Pol III, depending on the occurrence of a functional A/T rich region downstream of the PSE. The occurrence of a TATA box-like consensus sequence downstream of the PSE in a large part of the novel element collection further supports a canonical Pol III Type III structure, pointing toward their Pol III dependency. Altogether these findings brought to light 33 novel putative transcription units whose promoter organization is compatible with Pol III transcription (Table 1).
Table 1
ATTTGCATGTCGCTATGTGTTCTGGGAAATCACCATAAACGTGAAATGTC
GGGCGGAGGGAAGCTCATCAGTGGGGCCACGAGCT6AGTGCGTCCTGTCA
CTCCACTCCCATGTCCCTTGGGAAGGTCTGAGACTAGGGCCAGAGGCGGC
CCTAACAGGGCTCTCCCTGAGCTTCGGGGAGGTGAGTTCCCAGAGAACGG
GGCTCCGCGCGAGGTCAGACTGGGCAGGAGATGCCGTGGACCCCGCCCTT
CGGGGAGGGGCCCGGCGGATGCCTCCTTTGCCGGAGCTTGGAACAGACTC
ACGGCCAGCGAAGTGAGTTCAATGGCTGAGGTGAGGTACCCCGCAGGGGA
CCTCATAACCCAATTCAGACTACTCTCCTCCGCCCA
TTTTTGGAAAAAAA
CCATAAATGTGAAATGCTACCTTTATCCTATGTATTTGAATATATATACATAT
ATATTCAAGTACATTCTCTCTATATATGTGTGTTTATAATATCATATATA
TACACACACATATGTGTGTGTGTGTGTGTGTGTGTGTGTTACCTCTTTCA
ATTCCATAGTGTTTTAAGTAATCTAATTTTGGCATACTGAAAATACTGAT
AAGAAAAATTCTTA
TTTTTTCTTTCAAAATTTCCTTTGCATTTATAATAC
GCATTTTTCTGATTAGTAATGTTGAGCATTTTTAATATGCCTCTGGGCTA
ACAGAACAAAGGAGATAATAATATATCATGCAAACACAAACCCATCTATA
TCTTGATTCAAATATAAATTGCAAAAAACGGCTTGAAATTACTATAGAAA
TTTCAACAGAAACAAGGTCTTAGATAAACAGTCCCCAAC
TTTTTTGGTAC
CAACTGACACATGGATACAAGGAGGCCAGACAGGGAAGGGACTTTCCAAG
ATTGCCCAGGGAGTTCCTGCAAGAGTCAAGATTAGCACCTTTGCTGGTGT
TTCTCCACCACATCACACTGTCTCCAAATCAGGCTATTCAATTGTGTCTT
AGGAAATAGCTAAGATATTCAAAGACAATATAGAGTAAAGGAAAGAGGAA
TGGCTATTACCAAAAAGCAAAACCACAAGTGTTGGTGAAGATGTGGAGAA
GCAACACAAACCCATCTATATCTTGATTCAAATATCAATTGCAAACGGCT
TGAAATACTATAGAAATTTGAACAGAAACAAGGTCTTAGATAAACAGTCC
CCAAC
TTTTTTGGTACCAGGGACCAGTTTTGTGGGAGACAATTTGTCCAC
TCCCTTCTTGTGTGCCCCCTAATTATTCACTCCCCAATGCCCAGACATTA
TGATGCCTTCTCCTGCTCAGAGACCTTTCTGGGAGGAAGACCTACTCAGA
CCTGGTATTCCCTCATCCTAGGCTCTACCCTATTTTTCATCCAGCTGTTA
AAGCTGAGTGACTAATTTCACACTTATGTACGAATGACCCATAACTGGCT
TAATGCTGTGACCATCTTGGGGGTATTCAAAGCTGATAAACACTTTTTTA
AGTTATATAATAATCAAAGAAGCTTATCTTTCTGCTTTATTTCAAATTTC
ACCCCACAGGCCTTACTTATTTTTAAGATCAATGATTTTGATGGGCCCCC
CCTTCCCACTCTTAATTCAGGGTATTTCTGGCCCCATCCGGATCCAAACT
CTAATGCTCATCTCTTCCATACTGTCCTTTGCAGGTCATCGGTATTGCAA
GAGTTGCATAAGGCCCAATTCAGTCTCTGCCCCAAAAGCTCAAGTCCAAA
CTTCAGAATCTGGGAGGACAAGGATTCAGGAAATTTTGTCAGAACTATGA
ACCATAAAGGTGAAAGACATCATAAACGGGAATTTAGACAATCCTCAGAA
CGTAGTCCCAGCACTTTGGGAGGCTGAGGCAGGAGGATCACTTGAGCCCA
GGAATTTGAGACCAGCTTGGGCAACATAGTGAGACCTCATCTCTTAAAAA
AAAAAATTAGCTGGGTGTGGTAGTGCACACCTGTGGTCCCAGCTACTTTA
GAGGCTGAGGTAGAGGATTGCTTGAGCCTGGGAAGTTGGGGCTGTAGTGA
GCTTTGATTGCATCACTGCACTCCAGCCTGGGTGACAGAGCAAGACCCTG
TCTCTAAAAAATTAAATAAATAATAAAAAAATTAAAAAGTAACTCCC
TTT
TCTTTATTTTCAGGCTTCCTTCCCACCTGCTAATTCAAACACTTTACAAC
ACATTTTTGAAACTGGGAAGATTCATATTTTAGTATCTGTCAAATGATGA
TAAATTCGGAAGCCAGTGTAATTTATACCCTAGGGGCTGAGGTCTAATTC
AACATATTCCAGTTTCTATTTTCTAAAGCTAAAGAAACATGTGTTACAAT
GTAGATAGGGAATACTTTCTTAATGAACCATGCTGAACTGTAAGATTTTT
ATGGTGAGGTTAATATAAAGAGACATTAAACAAATATATTTCTGCTCTTT
AAAGTGAAATATCCAGACGCTATCCACCAGATTATAGGAAATGCAAAGCA
TGGGAACTTCTAAAAGATAAATTGTTTTAATCAATATGTAGTAAAAAGGG
AAAGGGAACTGTTATTGAATAAAAGTGACATCGTGACCAAATGTAATGTA
ATAACTTTGGACACTGCTTGAAGAAACCAACTATAAAAATTCATATTGAG
TCAGTCAAGAACATGTTTATATTGACTGGAA
TTTTATTACTTTAAGGATT
TAGCCTTGTATGGAACTGCCAAAGTGGCTGTACCATTTTGTATTCCTACC
AGCAATGAATGAAAACACCTGTTGATCTGCATCCTTACCACTATATGATA
TTGTCATATTTCAGATTTTAATCCGTCTAATAGATGTGTAGTGGTAGATA
GTTGCTTAATTTTCAATTCTCTTATGACATACAATGTTTAACATCTTTTT
ATATGTATATTTGCTATCTGTATATCCTCTTTGGTGAGGTGTCTGTTCAG
ATCTTTTTCCCATTTTAAATTGGATTGTTTTCTTATTTTTGAGTTTTAAG
TGTTCTTTTTATATTTTAAGTGCAAGCCCTTTATCAGATATGTATTTTGT
GCATATTTTCCCACTCTGTGGCTTGTATTTTAATTCTCTTAATAATATCT
ATTTGAATTGTGTCATATTCTTCTTTGTTCTTCTTCTTGTGTATTATGTT
ATTTGCAAGGCTGTGTAATTCCAAGGACTGTTATTCTTGGATGCTATGAT
CTAGTTCAAGCTTGAGGGCTTACTGTGCTCTTGCAGGGAAAGATAAAAGA
AAGTGTCAGAGTGAAAGAATGGTCAAATGTATGAACTCTTC
TTTTATTTA
CCATAAATGTGAAATTACTAGAACTCACAATAAATAGAAGTTAGTAAAGA
CACTGAATTCTAACTAGACGCTATTGCTTGTTGAAGGCTTTGATCTTAGG
AGGATTAGAAAGCATTCTAGGCCAGGCACGGTGGCTTCCTGTGTGTAATC
CCAGCAGTTGGAGAGGCTGAGGCAGGCGGGTTGCTTGAGCTCAGGAATTT
GAGACCAGCCTGGGCAACATGGCAAGACCCTGTCTCTACAAAAACATACA
AAACTTAGCCAGGCGTGGTGATGGCCACGTATGGTCCCAGCTACTCAGGT
GGCTGAGGCAGGAGGATTGATGAACCTGGGAGGCTAAGGCTCTAGTGAGC
CATGATCACACCACTGCACTCCAGCCTGGGTGACAGAGCCACACCCTGTC
TCAAAG
GAAAAAAAAAAAAAAAAAAGAATTCTAGTGGTGTGGTGTGGAAG
GGATGACAGTCCCCACTGAGCCCCAGCCAATAGCCGGCATCAACTGCAAG
ACATGTGAGTAAGCGAACCCTCAGATGATTCCAGCCCCCAGCCTTTGAGC
TGCCCCAACTGATGCTTTGTGGAACAGAGAAAAGCTGTCCCCATTGAGCT
ATTTTAATAAAATCTGTTTTAAATTATTTACTTCCTGGAACAAATCTCCC
TGTTGTGTTGGTTTATGAACATGGTTCTATTGCCTTCAGTCTATTGTCGG
AAATAAAAACAGTCCTGCAGTTGTTGATTGAGTGTACTATGCCTTTAAGA
AGTCATGGCACTCATGCAACAGCCATGTAGTTGTTGATTGAGAGTACTGT
GTCTTAAAAAAAGAAC
TTTTGCTAAATAAACTGACTCTGTGAGCAGCCCT
CAGTTAACGTGCGTTTTCTCTTGTGGGCAGGGGTGGGGGTAACAAGGTGC
TTGGTGAGGAGCTCCTGAGACTCATTGTCCAGGAGAAGGAATGTCACAAG
ATCAATTGATCAGTTAGGGTGGAGCAGGAACAAATCACAATGGTGGAATG
TTATCCACGCCAATCTGAATTTCTCCATGACATGGACCAGGTGGGACTGT
GGGTTTGGTGCCATGTACATGACCTGTGACTTAGTGGATGGAGTTCCTTA
GGCCACAGCAGCCTCTGGCTCAATGAAGCTTGATCTACTGAGTACCTGGA
CCACATGGGGCTCTAGCAGCAGTCCTATCTTGAGCCCAGAACAGTAACTT
CCTTTATGGTGACTTTTCTCTCAAGGACCTCCACTGCTTTCTTCTACTAT
TAAGCGATTTTAGAATAAGA
TTTTATTCTTGCTTAATTCTTCTCTTCAGA
CATCCATCTGTTCATCTATCACTGTCTATATATCTATGTATCTATCTATC
CATCCATCCATGCATCCATCCATGCATCCATGCATCCATCTATCACTATC
CATCCATCCATCCATCCATCCATTCATCCATCTATCTGTCTTCTACCTAC
CTACCTATCTAACTCTCTGGAGAACTCTGACTAATAAACTAGCTTTATAA
ACATGTTATTCTCTCTCTGCAATGTCTATTGCTTTATCTTCAGGAACATT
CCACACATCCTGTAAGACTTCAGTTAAATTATCTCTCTGTTTCTTCTCCA
ATCATCCTCTGCCTTCCCTAGTCTCCTAACGTACTTTGTACATCTGTCAC
AAAAAAAAAAAAA
CAAGACTAATAATTAGGCAACTCATTGAGTAGGCTGT
TGAACCAGCTAAAGTGGGAAAGAAATTATTCAAGTTCTAAACCTTTCTAC
TTGCAAATTAGCCAAAATCAATTGCATTTTAAGCACTGCATCACCTTGAT
TAAGACTGTGTGGTACTGGCATAAAGACAGACAAACAGATCAATGGAATA
AAATTGAGAGTCCAGAAATAAACCTTCACATTTATGGTGAATTCATTTTT
TATTTGTATACCGATAACTATAAACCTTTGATAAAAAAAGTTGAAGAAGA
CACATATAAATAGAATAATATTCTGTGTTCATGAATCAAAAAATTTAACA
ATGTTAAAATGTCTGTATTAACCAAAGCAATATACAAATTCAATGCAATT
TCTATCAAAATTTCAAGGATATGCATCACAGAAATAGAAAAAAAATTCTT
GAAATTCATATGGAACCACAGACACATAAAAACAGAATAGGCAAAGGAAC
AATGAGAAAGCAAAACAAAGCTTGAGGCATCACACTTCCTAAGTTAAAAT
TATATTGCAAAGCTACAGTAATCAAAAACAGTATACAAATGGCATGAAAA
CGAAAATGTGGACCAACGGAACAGAATATAGAGAGCCAGAAACTTAACTA
A
TTTTCAACAAGGGTACCAACAGGACACCCTGAAGTAAAGATAGTTTCTT
ATATACTGACATCTCCAAGAGGAATTCAAGATAATATGGCAAATGGGAAT
TAAATGAGAAAGTAAAAGAGAGGTGACCATATGCAATCTGGAGCAGGTGG
TGCAAATCATCTGCTGAAGTCTGTTAAGAACGTTCAGCAATATACACTTC
TTGAATCAAGCACACACACGCAGCTGCATTGTGTTAACAAAGTGAAACAT
TATTAGGTCGCATACTCTGAGTGACAATATCCTCGAATGATCATTTCTGT
GAGAAATTATTAGCTATGAGAAATTCAATTTGGTAGCATTTGGGCATTAC
AGACAGCTAGTCTGTTTATATCAGCAAGGCTTTGATGTTAAGACTGTGTA
ACTGCAGCCACAGGAAAAGCAGACTGAATACAGGGTGGATAAGGTCACAG
ATATAAAAATCAGATAGAGTTCTGTTCTATTATCTACATAGTGTGTACTT
TGGGAAGTTACTTAATATTTCTAAGCCTCAGTTTCCTCATAAAAATAAAA
ATGGCAAGCAATATGAAAACTATCTAATAGAATATTTGTGACACTAAATT
GTAATAATGTATATAAATCACTTAGCCTAGTATGTGGCATTTATTAACAC
TCAAGCAJAGTGTAA
TTTTTTAAAAAAACTCTTATATCCCTTACATGACA
AAAAAAA
TTAGCATCAGTTTAAAAGAATATTTCTTTAATCAACAGTTCTG
GTCTATTCAGACATTCTCCTGTTTTGTTAAGTGGAAATCTGTGTGGTCT
TTTTGGCTAATTTGCAAGTAGAAAGGTTTAGAACTTGAATAATTTCTTTC
CCACTTTAGCTGGTTCAACAGCCTACTCAATGAGTTGCCTAATTATTAGT
CTTG
TTTTTTTTTTTTTTTTCATTTATTTACTTTAGACAATTCCTGAGTG
GTGAAATAGATTACATCTTGTAATAATACAGCTATGAAATTCTGACCAGA
ATGAAAATATGAGTATGAAGAGAGTAATCATTTGCTTATTAATTCAAGGA
ACAATTTGCCA
TTTTTCAAGTATTATGAAAATAAGAGACTGTTGGACTCT
TTCTTAAGGATTATGCCTTGAAGTCATATTTTACTTCTGTAGAGTTCTAT
TAGATCAAACAAGTTACCTGGCCAAGCCTAGGGTCATAGTGGGAGAATTT
TGAGGGCCATATGGCAAACAGCCCACCATACAATACATTCTCAAATGGCT
TCTCAAATTTTACATTCTTGTGAATGATTCTCTCTTCTTGTGATTAATTT
GAAATCATAATGAATTTGGTCGTATCTTATTTTTCCCCTTTGTTTGTTTC
ACAACATTTGTTGGAAAGTTCATCCTGTA
TTTTAGTAGTACATAAGTTGA
ACCATAAAGGTGAATAATGACACAGACACTTAGATTGGGGAATGAGAAAA
TTTAGGAATGATTCATTTTAAGATATGGCCATCAATTATTTATAAGGGTT
AAAA
TACAGGCATTAAGGGAATAATAGCTGAATAGTAATAATAATACATT
ATGTCAACAGCGGTGACAAAGGAAACACTCAATGTATTTATAGAGCTAAA
TAAACGGCAGATCTAGGTCCTACGTTTTGACTCTGAACAACCTTCTCGGT
TGGATTTTGCTTCTGCCTAAGGATTATTTTGGAAAGAGCTATTATTATCC
GTGATTTATCACGCTGCACTGGGGGGAACTCATACTTTCCACGGAGACAA
TTACTGAATTCTCACTGGAGGCGCTTAAAGGAGCCAGGACCTGTTCTGAG
GGTTCAGGTGGGAAAGGTGTGCCAGCAGGGGACTGCAGCCTGGCACCATG
GGACGTGTGTGCTGTTGACCACTTCTGTGCCCAGATCCCTCAGGCGCTTT
CTCATTAGATGCACCCTTCAATCTCCTGGTTATTGAAACAGGACTGGGGA
TGTTAACAGATTCTGGAAACGTTCTCACAGATACACACAGAAATAATGTT
TACACCCTTCCAAAGATACATTATACATTCCTATGTACACTCAAATATTA
TTTTTAAACTTCCATTCCAATCATTAAGTAGAAATGCATTTAAGAATCAT
TGTACCTGGCAGGATATTCACAGAAATAAAATATTTATTGGCCATCTACT
TTGTTTAAGACCTCTTAACAAACCATAACTTATTAAAGCATAAAGTAACA
TACATAGTAAATAC
TTTTAAAATCTGTAAACAACTAATTCCTTTCTTCTT
CAAACCTGAAAGCTTGTCGACTATCTCTGTACAGTCAGACAAGAGGTGTG
TGTATGTGTGTGCGTGTGTAAAGGCTGAATTTTTAATTTTTAATTTTTGG
CGAGCGTGTGAGATGCTCTCCATTCCTTCTTCCCCACCCTTCAAGATGCT
GACTCTCCCACCCCCGTCAAGATAACTTTATTTTGGAGAGGAATACCCCT
CATGGCACTTGGAGATTTGAAAGGACTGCAGGAAATTTGGTGGGCATTAT
TATTCTATAAGTGATTTATTTCTACCCAGGCAATAGGTTTATTAGATCAT
TTTATCTTCAGGAACATTCCACACATCCTGTAAGACTTCAGTTAAATTAT
CTCTCTGTTTCTTCTCCAATCATCCTCTGCCTTCCCTAGTCTCCTAACGT
ACTTTGTACATCTGTCACAAACCCCTCATCATATTTACTGTAATTTTTTT
GTTGTTACAAATAAAAGAGCAAGCAGGCCCCTCACTGTAATTCACCTGTA
TTTGCATTTAACTTATTAACCAAGGCATACTATTTCAAATAATCTAATAT
However, since the in silico search was based on the H1 PSE consensus, used as a query that allowed only the first and the last bases to differ in the targets, it can reasonably be supposed that only those promoters whose structure is very similar to that of H1 were identified (for the sequences used as query see the Materials and Methods section). This is further supported by the fact that, apart from H1, no other previously known Pol III Type III promoters were found in our PSE-based collection. Therefore this finding, together with the observation of the large sequence divergence among the PSE consensus sequences of U6, 7SK and H1, suggests that the use of a degenerate PSE consensus as query (most likely derived from a bioinformatic analysis of several known Pol III Type III promoter consensus elements) would bring to light a considerably higher number of novel PSE-dependent transcription units in the human genome, which would better clarify the likely impact of this effect at genome scale.
In order to further characterize the novel transcription units in silico, we arbitrarily assumed as transcribed the sequence stretch starting from the 21st nucleotide downstream of the predicted TATA box. In addition, a 4×T repeat was considered as a Pol III transcription STOP signal, although “read-through” events are possible and most likely affected by sequence context features [19, 20]. Although it has to be emphasized that the transcribed region of each element of this collection needs to be experimentally determined case by case (possibly in the context of its target gene of regulation), based on their in silico characterization we selected 33 novel transcripts to be subjected to additional analysis.
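The in silico rule described above can be sketched as follows. This is only a minimal illustration: the function name `predict_transcript`, the 0-based `tata_end` coordinate and the `keep_terminator` switch are assumptions introduced here, and the TATA box position is taken as already known rather than predicted.

```python
TATA_OFFSET = 21     # transcription assumed to start at the 21st nt downstream of the TATA box
TERMINATOR = "TTTT"  # a 4xT run is treated as the Pol III STOP signal


def predict_transcript(sequence: str, tata_end: int, keep_terminator: bool = False) -> str:
    """Return the putatively transcribed stretch of a genomic sequence.

    sequence        -- sense-strand DNA, 5'->3'
    tata_end        -- 0-based index of the last nucleotide of the predicted TATA box
                       (hypothetical input; the source does not specify a coordinate system)
    keep_terminator -- whether to include the 4xT run in the returned stretch
                       (an assumption; the source does not say)
    """
    seq = sequence.upper()
    start = tata_end + TATA_OFFSET            # 21st nucleotide after the TATA box
    if start >= len(seq):
        return ""                             # sequence too short to contain a transcript
    stop = seq.find(TERMINATOR, start)        # first 4xT run downstream of the start
    if stop == -1:
        return seq[start:]                    # no STOP found: possible read-through
    end = stop + len(TERMINATOR) if keep_terminator else stop
    return seq[start:end]
```

Because read-through events are possible, a stretch predicted this way is only a first approximation of the real transcript, which, as stated above, has to be determined experimentally case by case.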
In order to test whether a common secondary structure could be a hallmark of the novel molecules, an in silico analysis of their secondary structure was performed with the mfold algorithm [21]. Results showed that, although hairpins with short stems (5-7 base pairs) were frequent, no shared secondary structures were recurrent, indicating that a peculiar molecular organization is not a common hallmark of this set of non-coding molecules. Interestingly, although their average free energy (ΔG) was extremely variable (−42.7±41.2), four transcripts (11A, 20A, 21A and 29A) showed a ΔG value considerably lower than all the others (ΔG<−100). A statistical analysis of these ΔG differences confirmed that the ΔG of this group of transcripts is significantly lower than expected (Student's t-test, 33 degrees of freedom, significance level α=0.1, corresponding to a P-value of 0.0001), in keeping with a physiologically functional molecular organization.
In order to assess whether the pool of transcription units was prevalently constituted by repeats such as retroposons, we analyzed all the transcripts with the RepeatMasker algorithm [22], showing that: i) only 2 out of 34 (5.9%) are Short Interspersed Nuclear Elements (SINEs), namely 21A and 29A, which were marked as AluJb elements; ii) three of them (8.8%) are part of a Long Interspersed Nuclear Element (LINE), namely 24A, 37A and 38A; iii) two (5.9%) contained a MIR (17A and 40A); and iv) three contained different types of Long Terminal Repeats (30A, 32A and 44A) (Table 2).
Placing these results in the appropriate context (considering that Alus, LINEs and MIRs constitute about 15%, 30% and 1-5% of the human genome respectively), one would expect a higher frequency of repeats in this novel pool of sequences than is actually observed. In addition, we observed that no more than three of the repeat-containing elements are ascribed to the same class of molecules. Altogether these observations show that the novel PSE-dependent transcripts are not associated with a specific class of repetitive sequences scattered throughout the human genome; instead, they constitute a novel heterogeneous set of Type III promoter-driven elements.
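The comparison between the observed repeat content and the genomic background can be reproduced with a short tally. The counts, transcript names and genome fractions below are those quoted in the text; the variable and function names are illustrative assumptions, and no genome fraction is given in the text for Long Terminal Repeats, so none is used.

```python
TOTAL_UNITS = 34  # pool size used for the percentages quoted in the text

# Repeat-containing transcription units as listed in the text (Table 2)
repeat_hits = {
    "SINE (AluJb)": ["21A", "29A"],
    "LINE": ["24A", "37A", "38A"],
    "MIR": ["17A", "40A"],
    "LTR": ["30A", "32A", "44A"],
}

# Approximate genome fractions quoted in the text, as (low, high) per cent
genome_percent = {
    "SINE (AluJb)": (15, 15),
    "LINE": (30, 30),
    "MIR": (1, 5),
}


def observed_percent(units, total=TOTAL_UNITS):
    """Percentage of the pool that carries a given repeat class, to one decimal."""
    return round(100 * len(units) / total, 1)


for repeat_class, units in repeat_hits.items():
    line = f"{repeat_class}: {len(units)}/{TOTAL_UNITS} = {observed_percent(units)}%"
    if repeat_class in genome_percent:
        low, high = genome_percent[repeat_class]
        background = f"{low}%" if low == high else f"{low}-{high}%"
        line += f" (genome background about {background})"
    print(line)
```

The tally makes the argument explicit: the SINE and LINE fractions of the pool fall well below the genome-wide figures quoted above, and no single repeat class accounts for more than three elements.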
When these non-coding sequences were used to query the human genome database (BLAST analysis), results showed that 7 were internal to known or predicted protein-coding genes, 4 in antisense and 3 in sense configuration. Interestingly, most of the novel sequence elements not mapping in coding regions shared a high sequence homology (~80%) with a Pol II transcript/EST that maps to a different locus (Table 3). Such homologies reached much higher values (often about 90%) if only parts of the putative transcripts were considered. In fact, no ESTs entirely containing one of our transcription units were found, so that if a sense/antisense-based regulation occurs it should involve only parts of the ncRNA sequences, while the other parts could have structural properties that facilitate this regulatory action (perhaps binding specific structural proteins). Based on these observations, a novel control mechanism of gene expression could be postulated in which Pol III (or Pol III-like) elements act as trans-locus antisense of their homologous protein-coding RNAs. In this model the Pol III co-genes in antisense configuration with respect to one (or more) specific target genes could regulate their expression either by interfering with mRNA maturation (if the homologous region is internal to an intron) or by inhibiting protein translation (if the homology is associated with an exon).
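The postulated model maps each homology hit to a candidate regulatory mechanism according to its orientation and to the region of the target gene involved. A minimal sketch of that decision rule is given below; the `HomologyHit` record and the `proposed_mechanism` function are hypothetical names used only for illustration, not part of any analysis pipeline described here.

```python
from dataclasses import dataclass


@dataclass
class HomologyHit:
    """One homology between a Pol III co-gene and a Pol II target (illustrative record)."""
    co_gene: str
    target_gene: str
    orientation: str  # "antisense" or "sense" with respect to the target transcript
    region: str       # "intron" or "exon" of the target gene


def proposed_mechanism(hit: HomologyHit) -> str:
    """Return the mechanism postulated by the model for a given hit."""
    if hit.orientation != "antisense":
        # The model only covers antisense pairs; sense pairs would need other mechanisms.
        return "outside the antisense model"
    if hit.region == "intron":
        return "interference with mRNA maturation"
    if hit.region == "exon":
        return "inhibition of protein translation"
    raise ValueError(f"unknown region: {hit.region!r}")


# Example: an antisense co-gene whose homology falls within introns of its target
hit = HomologyHit(co_gene="21A", target_gene="CENP-F", orientation="antisense", region="intron")
print(proposed_mechanism(hit))
```

The rule is deliberately coarse: it encodes only the two cases named in the model and leaves sense-oriented pairs unclassified.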
21A as Co-Gene Experimental Model
To test our hypothesis we selected one of the novel transcription units (here referred to as 21A) that maps to 8q24.13. When aligned to the human genome it shows several homology hits, the highest of which are associated with multiple intronic regions of Centromeric Protein F (CENP-F; 1q32-q41) (Acc. No. NM_016343) [23], thus constituting its putative natural trans-chromosomal antisense.
To check for 21A expression in cultured cells, we performed Northern blot analysis on total HeLa cell RNA using a 21A dsDNA probe. Two positive bands were detected: one corresponding in size to the expected 21A transcript (~300 nt), and the other corresponding to a high molecular mass transcript (as expected for CENP-F mRNA).
Pol III-Dependency of the Novel Transcription Units
The same experiment as above was repeated after 24 hours of cell treatment with ML-60218, a cell-permeable indazolo-sulfonamide compound that displays broad-spectrum inhibitory activity against RNA Polymerase III [26]. Results showed an efficient luciferase-silencing activity in the absence of the Pol III inhibitor (as evidenced by a decreased luciferase emission), while after treatment with ML-60218 the luciferase signal was increased.
Altogether, these results show a decrease in hairpin synthesis by the novel transcription units as a consequence of the reduced Pol III activity, in accordance with the Pol III dependency of their transcription.
21A Acts as CENP-F Regulatory Co-Gene Modulating its Expression at Post-Transcriptional Level
To test whether the 21A transcript acts as an antisense inhibitor of CENP-F expression, we measured by Western analysis the CENP-F protein level in HeLa cells transiently transfected with four different constructs: i) the whole 21A region containing both DSE and PSE elements (p21A), ii) its upstream moiety, which contains the DSE and a MIR element (p21A-1), iii) the novel Pol III Type III transcription region, which includes an Alu Jb module (p21A-2), and iv) an empty vector as Mock control (pMock). Starting at 24 hours from transfection of the whole 21A region, inhibition of CENP-F accumulation (followed by a rapid degradation) was observed. Such inhibition was specifically associated with the constructs expressing the 21A RNA (p21A, p21A-2), while the MIR element in the upstream moiety of the fragment (p21A-1 construct) was ineffective.
21A Overexpression Specifically Inhibits Cell Proliferation in Humans
Given the central role of CENP-F in mitosis, we tested the effect of ectopic 21A expression on cell proliferation. By measuring [3H]-thymidine incorporation we observed a dramatic arrest of cell proliferation after 48 hours in 21A-transfected cells. Again, the effect was specifically associated with the downstream 21A transcribed region (p21A/p21A-2 constructs), while transfection of the MIR-containing upstream moiety (p21A-1 construct) did not alter cell proliferation.
To further support the antisense role of 21A, we transfected HeLa cells with a construct expressing the transcript in antisense configuration (here referred to as pAnti-21A), thus quenching the activity of the endogenous 21A molecules. Results showed an increased cell proliferation 24/48 hours after transfection. Similar results were obtained when a 21A-specific siRNA-expressing construct was transfected into HeLa cells, while the negative control sample (cells transfected with an unrelated chicken-specific siRNA) maintained a cell proliferation rate similar to that of pMock-transfected cells.
These data suggest that the decreased amount of 21A transcript consequent to its siRNA-mediated silencing, as well as its suppression by antisense technology, specifically increases CENP-F synthesis, in keeping with the proposed role of 21A as a CENP-F regulatory co-gene. In addition, it has to be considered that the increased proliferation rate observed here supports the idea of a widespread regulatory action of 21A, which may control at post-transcriptional level the expression of several target genes, similarly to what has been proposed for miRNAs [27].
The 21A Regulatory Effect is Human-Specific
Considering that a 21A-driven inhibition of cell proliferation is expected to be primate-specific (Alu sequences were not found in other mammalian orders), we tested for its possible occurrence in mouse. In fact, such an occurrence would be in keeping with an unspecific effect of 21A on cell proliferation, possibly due to the activation of a more general biological process such as, most likely, the interferon response (an antiviral cell reaction shared by all mammals), rather than with a specific multilocus 21A regulatory action. As expected, results showed that after transfection of p21A, p21A-1, p21A-2 and pMock the murine fibroblast NIH 3T3 cells did not show any proliferation decrease, as assessed by [3H]-thymidine incorporation.
21A is a Key Factor of Cell Proliferation Control
As demonstrated by the transfection experiments, 21A overexpression is inversely correlated with cell proliferation. In accordance with this finding, its expression is very low in fully proliferating HeLa cells. Therefore, in order to further demonstrate the inverse correlation between endogenous 21A expression and cell proliferation, we analyzed by quantitative Real Time RT-PCR its transcription level in different cell types with different proliferation potential. Results showed that in the three immortalized/fully proliferating cell lines analyzed here (HeLa, cervical adenocarcinoma; 293T, adenovirus-transformed renal epithelial cells; LAN5, neuroblastoma) the level of 21A transcription was very low compared to that of non-proliferating/resting peripheral blood lymphocytes (PBL), in which a 276-fold increase in 21A transcription was observed. In the same experiment, in accordance with an inverse correlation between endogenous 21A transcription and the cell proliferation rate, the 21A RNA level in primary skin fibroblasts (whose proliferation rate is significantly lower than that of the tumor cell lines analyzed here) showed a 23-fold increase compared to 293T cells and a very low expression level compared to the resting/non-proliferating PBL.
In order to check whether the endogenous 21A overexpression in non-proliferating cells was related to a widespread increase in RNA polymerase III activity rather than to a 21A-specific activation, we measured by Real Time RT-PCR the 5S rRNA expression level in the same samples. The results showed no direct correlation between 5S rRNA expression and the variations in cell proliferation rate, showing that the 21A overexpression in resting cells was the consequence of a 21A-specific transcription activation rather than of a broader, unspecific increase of Pol III activity.
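The text does not state how the fold differences were derived from the Real Time RT-PCR data. One widely used approach, assumed here purely for illustration, is the comparative Ct (2^−ΔΔCt) calculation, in which the target signal is normalized to a reference transcript and then to a calibrator sample. The function name, the choice of reference and every Ct value in the sketch below are hypothetical and are not data from this study.

```python
def fold_change(ct_target_sample: float, ct_ref_sample: float,
                ct_target_calibrator: float, ct_ref_calibrator: float) -> float:
    """Relative expression by the comparative Ct method (assumes ~100% PCR efficiency).

    Each Ct is a threshold cycle; lower Ct means more template.
    """
    delta_sample = ct_target_sample - ct_ref_sample              # normalize sample to reference
    delta_calibrator = ct_target_calibrator - ct_ref_calibrator  # normalize calibrator to reference
    delta_delta = delta_sample - delta_calibrator
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values: target detected 8 cycles earlier in the sample than in the
# calibrator, with the reference transcript unchanged between the two.
print(fold_change(ct_target_sample=22.0, ct_ref_sample=15.0,
                  ct_target_calibrator=30.0, ct_ref_calibrator=15.0))
```

Under this assumption each cycle of difference corresponds to a two-fold change, so a difference of about eight cycles between resting and proliferating cells would be of the same order as the fold increases reported above.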
We here propose that the non-coding fraction of the human genome includes a larger than expected number of ncRNA genes controlled by DSE and PSE promoter elements. Due to their promoter structure, a number of these genes is likely to be transcribed by Pol III. We refer to them as co-genes since they could specifically co-act with a protein-coding Pol II gene. Given the very high sequence homology between Pol III and Pol II transcript pairs and in the light of the results we have obtained investigating the regulatory activity of 21A transcription unit, we propose that a large part of these novel elements may act as antisense inhibitors of protein translation and/or mRNA maturation although some of them (those whose homology with the Pol II target gene is in sense configuration) could play a role in gene expression regulation with different mechanisms. Altogether these findings provide evidence for the existence of a ncRNA gene set associated to PSE/DSE-containing promoters, whose products co-act with a corresponding set of protein-coding targets.
In conclusion, this study provides: i) a collection of novel non-coding transcripts to be investigated for their potential regulatory action with respect to Pol II target genes; ii) a novel source of PSE-dependent promoters useful for the identification of common regulatory regions specific for this type of promoter; iii) a novel class of molecules involved in the RNA-mediated regulatory mechanisms of gene expression; and iv) a novel transcript (21A) whose intriguing role in tumor cell proliferation control would need to be investigated in detail in the context of cancer studies.
Number | Date | Country | Kind |
---|---|---|---|
RM2005A0475 | Sep 2005 | IT | national |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/IT2006/000664 | 9/19/2006 | WO | 00 | 9/8/2008 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2007/034527 | 3/29/2007 | WO | A |
Number | Name | Date | Kind |
---|---|---|---|
20030235833 | Suwa et al. | Dec 2003 | A1 |
20040098761 | Trick et al. | May 2004 | A1 |
Number | Date | Country |
---|---|---|
WO 2004067779 | Aug 2004 | WO |
Entry |
---|
Sulston et al. Genome Res. 8, 1097-1108, 1998. |
U.S. Appl. No. 12/066,829, result # 4, sequence search result for SEQ ID No. 62 for dated Oct. 12, 2010, pp. 1-11. |
Birren et al. Homo sapiens chromosome, clone RP11-318C2, complete sequence, GenBank No. AC026894, retrieved from NCBI: Genbank on Jan. 21, 2011, dated Aug. 2002, pp. 1-50. |
Wen et al. Homo sapien chromosome 8 cline RPI-316L14 map 8q24.2, complete sequence, pp. 1-40, Genbank [online] Bethesda, MD USA: United States National Library of Medicine [retrieved on Jan. 23, 2012]. Retrieved from: GenBank, Accession No. AF187000). |
Sequence alignment SEQ ID No. 57 and AF18700 dated Jul. 6, 2011, retrieved on Jan. 23, 2012 from SCORE, pp. 19-30. |
Number | Date | Country |
---|---|---|
20090023674 A1 | Jan 2009 | US |