This invention relates to DNA libraries based on plasmid or viral vectors that can express double-stranded RNA of 10-30 base pairs in length with all possible sequences, where each of the double stranded RNA is formed by a single RNA molecule in the form of hairpin, or formed by two separate RNA molecules with different 3′-overhangs. Each single member in such a DNA library encodes all components of a double stranded RNA as specified above. Such a library can be used in screening for double stranded RNA species that can induce a given phenotype without prior knowledge of their target genes. This invention further relates to a method to generate such a DNA library.
Messenger RNA (mRNA) is normally perceived as the information-carrying intermediate in protein synthesis: it is transcribed by RNA polymerase from a DNA template and subsequently translated by the ribosome to generate protein molecules. More recent data have demonstrated that many genes are transcribed into RNA molecules that are not translated into proteins at all (Okazaki Y et al., Nature; 420 (6915): 563-573 (2002)). Some of these untranslated RNA were found to regulate other mRNA by inducing their degradation in a sequence-specific manner (Ambros V., Cell; 113 (6): 673-676 (2003)). This is in good agreement with the recent finding that double stranded RNA and synthetic siRNA can also induce cognate mRNA degradation in a wide range of organisms (McManus M T, Sharp P A., Nature Rev Genet.; 3 (10): 737-747 (2002)). Long double stranded RNA was found to induce intensive non-specific inhibition of RNA synthesis in mammalian cells, but siRNA can bypass this obstacle and still maintain a strong inhibitory effect on a target gene that shares sequence identity with the siRNA (Elbashir S M et al., Nature; 411 (6836): 494-498 (2001)). This has made siRNA a primary tool for gene knockdown in functional genomics. siRNA also has the potential to become a drug that cures a disease by reducing the activity of a disease-related gene.
siRNA are generally double stranded RNA of 19-25 base pairs that are formed either by a single RNA molecule in the form of a hairpin or by two separate RNA molecules, with different 3′-overhangs. siRNA can be produced in three ways: chemical synthesis; expression from DNA vectors under the drive of a promoter; and RNase III (Dicer) cleavage of long double stranded RNA. All siRNA that have been used so far are designed to target a segment of a predefined gene.
The present invention relates to DNA libraries, each of which contains all possible permutations (permutation refers to different sequences) of double-stranded RNA of a certain length. Such DNA libraries can be easily configured to produce all permutations of siRNA. The invention provides a high throughput screening method for double stranded RNA (as well as siRNA) in a target-independent manner for indications related to any given phenotype. More specifically, the siRNA encoded by such libraries can be used in such screening either individually or as a mixture of any complexity, without the burden of knowing their sequences or their target genes. This method can overcome two major obstacles in siRNA application: 1) the incomplete knowledge about the transcriptome of each organism. According to recent data from mouse transcriptome analysis, our knowledge about the transcriptome of this best understood model animal is still far from complete. Much less is known about the transcriptome of human and other animals. Since the application of our library does not require any prior information about the target sequence, it will allow immediate implementation of genome-wide siRNA screening in any organism. 2) the extraordinarily high cost of siRNA. No matter how the siRNA is prepared, the cost of making siRNA targeting all known mRNA of an organism is extremely high. A single regenerable DNA library that contains all permutations of siRNA and that can be applied in any organism reduces the cost of siRNA production to a minimum level.
Accordingly, in one aspect the present invention relates to a DNA library for the production of a library of double stranded RNA molecules of a predefined length in the range of 10-30 base pairs in living cells, wherein the sequence(s) of the DNA region (or regions) encoding the double stranded part of the double stranded RNA molecule(s) is randomized in a number of nucleotide positions selected from 4 to all positions, and wherein both strands of said double stranded RNA molecule are produced from a single member of the DNA library. The invention also provides a kit containing the DNA library.
In another aspect the present invention provides a method of preparing the DNA library.
In yet another aspect the invention relates to an RNA library obtained from the DNA library.
Further aspects and advantages of the invention will become evident hereinafter from the following detailed description and attached claims.
Small interfering RNA (siRNA) is a term initially used to define short double stranded RNA that have a 19-21 nt double-stranded region nested between 3′-UU or TT or other single stranded overhangs. A number of variations of this original form of siRNA (such as the hairpin type) have been introduced lately. Such siRNA can be used to reduce the expression of genes having a sequence identical to the siRNA double stranded region in cells from a variety of different organisms. While longer double stranded DNA and RNA could also be produced by means of the methods of the invention, the libraries of the invention have been restricted to double stranded DNA and RNA of a length of 10-30 base pairs, since above a length of 30 base pairs the molecules are more likely to provoke an immune response and other disturbing side effects when transfected into living cells.
siRNA were initially synthesized chemically, but several methods have since been introduced to generate siRNA enzymatically, using viral promoters such as the T7 promoter, or RNA polymerase III promoters such as H1 or U6, in free form or in plasmid or viral vectors.
The current invention provides a method to construct DNA libraries encoding random siRNA libraries. Such a library differs from the prior art in that in the prior art, one would have to design the siRNA according to a known sequence of the gene, whereas from the present library one can screen through a fully random panel of different siRNA (without the need of prior knowledge of their sequences or their target sequences) to look for phenotypes associated with each siRNA, and then identify the genes related to each siRNA de novo.
The challenge of making a fully randomized DNA library based on plasmids or viral vectors encoding all permutations of siRNA is to make sure that each member of the DNA library expresses a distinct and complete double stranded RNA. None of the existing methods of making vector-based siRNA (short double stranded RNA) can meet this challenge.
The current invention describes the construction of a random DNA library with only one randomized region. For each plasmid, two promoters drive the transcription of this region from opposite directions to produce the two complementary RNA strands separately. Two transcription terminators are placed, one at each end of the randomized region, to make sure that RNA of a defined length is produced from each direction. The advantage of this approach is that it avoids the troublesome cloning procedure of the dual-region system, described further below, for creating two reverse-complementary regions in each individual plasmid. One example of the promoters that can be used in such a system is the RNA polymerase III promoters H1 or U6. For RNA polymerase III, a stretch of TTTTT is needed for proper termination of transcription. In order to use this RNA polymerase to drive expression of the same region from both directions, the TTTTT stretch has to be inserted at both ends of the randomized region. There is one problem though: the RNA polymerase III promoters have to be placed immediately next to the randomized region to ensure that transcription starts precisely at the beginning of the randomized region, but those promoters do not contain an AAAAA stretch that would allow the TTTTT terminator to appear on the opposite strand. The only way this can be done is to mutate the RNA polymerase III promoters to insert such an AAAAA stretch, and it was not known how the insertion of the AAAAA stretch would affect the transcription start and the rate of transcription. As will be shown below, we mutated the H1 RNA polymerase III promoter, inserted an AAAAA stretch at the end of the promoter, and verified that the mutated promoter supports proper transcription start and production of effective siRNA. Thus, we first started to construct a plasmid library with the termination signal placed on both sides of the randomized region (
Construction of the Vector with Dual-H1 Promoter Against Renilla Luciferase
A plasmid with two mutated RNA polymerase III promoters, each embedding one transcription terminator sequence for the other promoter, was constructed with the siRNA region designed to target a model molecule Renilla luciferase (
Mutation of the H1 RNA polymerase III promoters and construction of the example plasmid are described in detail below.
Efficient inhibition of luciferase expression by pDHRL. Take the above three clones (clone 1, clone 2 and clone 3) and transfect the plasmid into HEK293 cells on a 24-well plate, at 1.2 µg and 0.6 µg respectively, together with the plasmids encoding Renilla luciferase and firefly luciferase. 48 hours later, measure the Renilla and firefly luciferase activity. (
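The layout of the dual-promoter cassette can be illustrated with a minimal sketch. It assumes a simplified cassette in which the last five nucleotides of each mutated promoter are AAAAA and the terminator is TTTTT; the 19-nt insert and the helper names `reverse_complement` and `transcript_from` are hypothetical and are not the actual pDH sequences. The model also ignores the U residues contributed by the terminator, which would form the 3′ overhang.

```python
# Illustrative model of the cassette AAAAA - insert - TTTTT.
# The insert below is an arbitrary example, not a sequence from the plasmid.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return seq.translate(COMPLEMENT)[::-1]


def transcript_from(strand: str) -> str:
    """Transcribe one strand, written 5'->3', starting right after the
    AAAAA promoter tail and stopping at the first TTTTT terminator."""
    start = strand.index("AAAAA") + 5
    end = strand.index("TTTTT", start)
    return strand[start:end].replace("T", "U")


insert = "GATCCGTACGTTAGCCATG"           # hypothetical 19-nt region
top = "AAAAA" + insert + "TTTTT"          # strand read by the first promoter
bottom = reverse_complement(top)          # strand read by the second promoter

sense = transcript_from(top)
antisense = transcript_from(bottom)

print(top)
print(bottom)
print(sense, antisense)
```

Because the reverse complement of AAAAA is TTTTT, the opposite strand shows the same AAAAA, insert, TTTTT arrangement, so each promoter finds a terminator at the far end of the region and the two transcripts are complementary over their full length.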
Cloning the Randomized DNA into pDH to Form a Library that Encodes all Permutations of the siRNA
The construction of the randomized DNA library that encodes all permutations of siRNA is done in a similar way to the construction of the anti-luciferase siRNA encoding plasmid pDHRL, with the only difference that the second strand of the tester sequence was generated enzymatically to preserve the randomized nature of the sequence.
Three oligonucleotides were synthesized with 19, 20 and 21 nt of randomized region embedded within the two known sequences.
The oligonucleotides were allowed to anneal to a primer CCCCAAGCTTAAAAA and filled in with Klenow fragment in the presence of 1 mM dNTP in the proper buffer (all chemicals other than DNA oligonucleotides were purchased from New England Biolabs Inc. unless otherwise specified). The duplex oligos were cleaved with Bgl II-Hind III and cloned into the Bgl II-Hind III sites of pDH to form pDH-libraryA.
The quality of pDH-libraryA was first assessed by insert length analysis of 41 clones: a single clone, a 10-clone pool and a 30-clone pool were each used to prepare plasmid DNA, which was then cleaved with restriction enzyme. The results suggested that all clones have an insert of the same length (
One alternative to plasmid vectors for ectopic expression of a foreign gene is the various types of viral vectors. Since all cloning strategies for constructing viral vectors are common knowledge, and anybody with reasonable knowledge of the art can produce viral constructs that carry out expression functions similar to those of the plasmids, the disclosure of making DNA libraries as above will also enable the production of such DNA libraries in viral vectors.
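The fill-in step can be sketched as follows. The primer is the one given above; the template layout is an assumption, since the oligonucleotide sequences are not reproduced here: a hypothetical 5′ flank carrying a Bgl II site (AGATCT), the randomized region, and a 3′ end complementary to the primer. The names `FIVE_PRIME_FLANK`, `random_template` and `fill_in` are illustrative only.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")

PRIMER = "CCCCAAGCTTAAAAA"        # from the text; contains a Hind III site (AAGCTT)
FIVE_PRIME_FLANK = "GGGGAGATCT"   # hypothetical flank with a Bgl II site (AGATCT)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return seq.translate(COMPLEMENT)[::-1]


def random_template(region_length: int, rng: random.Random) -> str:
    """Build one template: flank, randomized region, primer binding site."""
    region = "".join(rng.choice("ACGT") for _ in range(region_length))
    return FIVE_PRIME_FLANK + region + reverse_complement(PRIMER)


def fill_in(template: str, primer: str) -> str:
    """Model the Klenow fill-in: the primer anneals to the 3' end of the
    template and is extended to the template's 5' end, so the new strand
    is the reverse complement of the whole template."""
    if not template.endswith(reverse_complement(primer)):
        raise ValueError("primer does not anneal to the 3' end of the template")
    return reverse_complement(template)


rng = random.Random(0)
for n in (19, 20, 21):
    template = random_template(n, rng)
    second_strand = fill_in(template, PRIMER)
    print(n, template, second_strand)
```

Generating the second strand from the first in this way keeps every random sequence paired with its exact complement, which two independently synthesized random oligonucleotides could not guarantee.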
Construction of DNA Libraries Containing a Pair of Randomized Regions with Inverted Sequences
Although the vectors with two promoters and two terminators as represented by pDHRL and pDH-libraryA are the preferred modes of the current invention, other methods of forming DNA libraries that encode all permutations of siRNA become obvious once the concept of a DNA library encoding all permutations of siRNA is disclosed here. One such method is to form a plasmid library that encodes all permutations of the hairpin form of the siRNA. As an example, such a library can be formed according to the following procedure.
Slight modification of the above cloning protocol as illustrated in
It has to be stressed that, due to the enzymatic handling of the library, all siRNA that contain the restriction enzyme sites are lost. This results in a loss of about 0.025% of the siRNA for each restriction enzyme used. In this sense the preferred mode of the current invention, based on two promoters and two terminators, suffers less siRNA loss and yields a more complete library than the library generated according to the above hairpin library protocol, owing to the different number of enzymes used in the two protocols. Since the library contains about 2.75×10¹¹ permutations in theory, the loss of siRNA species caused by the use of restriction enzymes has only a negligible effect on the quality of the library and on the screening of active siRNA against any specific gene. In the text of this invention, references to “all permutations of siRNA” should be understood as having this effect considered and included. This effect can be eliminated further by avoiding the use of restriction enzymes in the construction of the libraries.
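The order of magnitude of this loss can be checked with a short calculation. The sketch assumes a 6-bp recognition site (as for Bgl II and Hind III), uniformly random nucleotides, and ignores sites that straddle the fixed flanking sequences; the function names are illustrative.

```python
def site_match_probability(site_length: int) -> float:
    """Chance that one fixed window of random sequence equals a given site."""
    return 0.25 ** site_length


def max_fraction_with_site(region_length: int, site_length: int) -> float:
    """Union bound on the fraction of random regions that contain the
    site in at least one window."""
    windows = max(region_length - site_length + 1, 0)
    return min(1.0, windows * site_match_probability(site_length))


per_window = site_match_probability(6)        # 4**-6
upper_bound = max_fraction_with_site(19, 6)   # 14 windows in a 19-nt region
library_size = 4 ** 19

print(f"per-window match probability: {per_window:.4%}")
print(f"upper bound for a 19-nt region: {upper_bound:.4%}")
print(f"species remaining (lower bound): {library_size * (1 - upper_bound):.3e}")
```

The per-window probability, 4⁻⁶, is close to the figure of about 0.025% quoted above. Counting all 14 windows of a 19-nt region gives a looser ceiling of roughly 0.34% per enzyme, which still leaves the library essentially complete.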
Another note is that the sequences and restriction enzymes are only one set of examples that can be used to carry out the construction of the plasmid. The person skilled in the art can easily choose different restriction enzymes and corresponding sequences of the oligonucleotides to carry out the construction in similar manner in plasmids and viral vectors, according to the principle disclosed as above.
Generation of DNA Libraries that Encode Cell-Specific, Tissue-Specific or Species Specific Double Stranded RNA
With the disclosure of the random DNA libraries encoding all permutations of double stranded RNA of a given length, the method of establishing DNA libraries that encode cell-specific, tissue-specific or species specific double stranded RNA should be considered to be obvious to a person skilled in the field. One example of constructing such DNA libraries is presented below.
An oligonucleotide with a 19 nt randomized region is allowed to hybridize to mRNA purified from a specific cell type. The mRNA can be immobilized onto a streptavidin coated solid support (plastic beads, for example) via biotin added to the end of the mRNA with poly(A) polymerase. Immobilization of the mRNA can be done in other ways too. After hybridization, all unbound DNA oligonucleotides are washed away and the bound sub-random DNA oligonucleotides are collected and cloned into the vector using a protocol identical to those described for fully randomized DNA oligonucleotides. The libraries resulting from this process will be highly enriched for molecules that encode double stranded RNA with sequences identical to the source mRNA.
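The selection step can be pictured computationally as a filter on the random pool. The sketch below is a deliberate simplification: it treats an oligonucleotide as bound only if it is perfectly complementary to a 19-nt window of an mRNA, whereas real hybridization tolerates some mismatches depending on stringency. The toy mRNA and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return seq.translate(COMPLEMENT)[::-1]


def target_windows(mrnas, k=19):
    """Collect every k-nt window of the mRNA pool, written as DNA."""
    windows = set()
    for mrna in mrnas:
        dna = mrna.upper().replace("U", "T")
        for i in range(len(dna) - k + 1):
            windows.add(dna[i:i + k])
    return windows


def select_bound(oligos, mrnas, k=19):
    """Keep oligonucleotides whose reverse complement occurs in an mRNA,
    i.e. those that would stay on the support after washing."""
    windows = target_windows(mrnas, k)
    return [oligo for oligo in oligos if reverse_complement(oligo) in windows]


toy_mrna = "AUGGCUAGCUAGGAUCCGUACGUUAGCCAUGAA"            # hypothetical transcript
binder = reverse_complement(toy_mrna.replace("U", "T")[5:24])
non_binder = "ACGTACGTACGTACGTACG"

selected = select_bound([binder, non_binder], [toy_mrna])
print(selected)
```

Only the oligonucleotide complementary to the toy transcript survives the filter, which mirrors how the washed pool becomes enriched for sequences present in the source mRNA.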
It should be noted that although all cloning procedures herein are described in the context of a single plasmid vector, the principle should be applicable to all types of plasmids, and the cassette containing the mutated promoters, terminators and the coding region of the DNA libraries can be transferred between those different types of plasmids.
It should be further noted that although all cloning procedures are described in the context of a single type of promoter, the H1 promoter, the principle should be applicable to all RNA polymerase III promoters.
One alternative to plasmid vectors for ectopic expression of a foreign gene is the various types of viral vectors. Since all cloning strategies for constructing viral vectors are common knowledge, and anybody with reasonable knowledge of the art can produce viral constructs that carry out expression functions similar to those of the plasmids, the disclosure of making DNA libraries as above should also enable the production of such DNA libraries in viral vectors.
The current invention involves DNA libraries that can generate double stranded RNA of 10-30 base pairs in length, with at least one strand of the double stranded RNA having single stranded overhangs, and further involves methods to produce such DNA libraries. It is acknowledged that the most frequently used double stranded RNA is siRNA of 19-21 base pairs in length, normally with TT or UU overhangs on at least one of the strands. The advantage of the current invention is therefore discussed in comparison to siRNA generated by other methods.
In practice, only about one in three to five short double stranded RNA that fulfill the basic structural requirements (a 19-21 base pair double stranded region and 3′ single stranded overhangs, normally TT or UU but not limited to such overhangs) is effective in knocking down its target gene. To knock down the 30,000 human genes using siRNA, about 90,000-150,000 siRNA would therefore have to be synthesized, at a cost of 18-30 million US dollars.
A similar cost has to be allocated to any additional organism for which the full spectrum of siRNA is to be generated for all genes. The current invention can generate a siRNA library encoded in plasmids that contains in theory all the permutations (4¹⁹ = 2.75×10¹¹) of siRNA (19 base pair duplexes plus overhangs), and that can be used in any organism for which a proper promoter (or promoters) can be found; the size of libraries for double stranded RNA of other lengths can easily be calculated in a similar way. The cost of generating this library is just a minimal fraction of the cost of synthesizing all siRNA chemically. In other words, this is a library with a complexity of 2.75×10¹¹ that contains reagents that can silence any gene in a mammalian or non-mammalian system. This is a very powerful toolbox for high throughput genome-wide functional genomics and drug target screening, as well as nucleic acid drug development.
The complexity of this library can be further reduced dramatically by introducing a one-step oligoselection on the library oligonucleotides. Such an approach will lead to the creation of gene-, cell/tissue-, or organism-specific siRNA encoding libraries that have much lower complexity (10²-10⁸), without sacrificing the usefulness of the library. Such a low complexity library can be partially or completely sequenced using different sequencing methods, enabling the creation of plasmid collections that contain known siRNA encoders for each gene in an organism such as human, mouse or rat.
The above description is mostly based on a plasmid system, but the same library and collection can be easily established in viral vectors using the same principle.
A few key classes of application of the invention are listed here as examples:
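The figures in this comparison can be reproduced with a few lines of arithmetic. The gene count, the one-in-three-to-five success rate and the cost range are taken from the text; the per-siRNA cost is only what those numbers imply, not a quoted price, and the variable names are illustrative.

```python
GENES = 30_000
CANDIDATES_PER_GENE = (3, 5)                 # siRNA tried per gene, low and high
COST_RANGE_USD = (18_000_000, 30_000_000)    # total synthesis cost from the text

# Number of siRNA to synthesize at the low and high end.
sirna_needed = tuple(GENES * n for n in CANDIDATES_PER_GENE)

# Unit cost implied by dividing the quoted totals by those counts.
implied_unit_cost = tuple(
    cost / count for cost, count in zip(COST_RANGE_USD, sirna_needed)
)


def library_size(base_pairs: int) -> int:
    """Number of distinct sequences of a fully randomized region."""
    return 4 ** base_pairs


sizes = {bp: library_size(bp) for bp in (19, 20, 21)}

print(sirna_needed)
print(implied_unit_cost)
for bp, size in sizes.items():
    print(f"{bp} bp: {size:.3e} permutations")
```

Each additional randomized base pair multiplies the theoretical library size by four, which is how the sizes for other duplex lengths follow from the 19 base pair case.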
1) A full collection of siRNA encoding plasmids can be selected for any given gene from this library through standard screening (which could be automated).
2) A full collection of siRNA encoding plasmids for any given cell type, tissue or organism can be established according to the invention.
3) Such collections of siRNA encoding plasmids can then be easily evaluated for their individual capacity to knockdown gene expression.
4) Most powerfully, such DNA libraries can be used for phenotype-based screening of target genes without prior knowledge of the target sequence or the siRNA sequences, so that the artisan can avoid a biased pre-selection of target genes. This will become one of the most significant ways of performing functional annotation and drug target screening.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/SE2003/001077 | 6/23/2003 | WO | 00 | 12/20/2004 |

Number | Date | Country |
---|---|---|
60390108 | Jun 2002 | US |