Methods, libraries and computer program products for gene silencing with reduced off-target effects

Abstract
The present invention provides methods, libraries and computer program products for selecting siRNAs that have reduced off-target effects, and methods for gene silencing using these siRNAs. By comparing the nucleotide sequences at positions 2-7 or 2-8 of the sense and/or antisense regions of candidate siRNAs to the 3′ UTRs of mRNAs, one can select siRNAs that have reduced off-target effects.
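The comparison summarized in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that strands are written 5′ to 3′, that positions are 1-indexed from the 5′ end, and that a "match" means an exact occurrence in the 3′ UTR of the reverse complement of the seed. The function names (`normalize`, `seed`, `count_seed_matches`) and the example sequences are hypothetical.

```python
# Hypothetical helper names; sequences are assumed to be written 5' -> 3'.
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def normalize(seq):
    """Upper-case a sequence and write it in the RNA alphabet (T -> U)."""
    return seq.upper().replace("T", "U")


def reverse_complement(seq):
    """Return the reverse complement of an RNA (or DNA) sequence as RNA."""
    return normalize(seq).translate(_COMPLEMENT)[::-1]


def seed(strand, start=2, end=7):
    """Return positions start..end (1-indexed, inclusive) of a strand."""
    return normalize(strand)[start - 1:end]


def count_seed_matches(strand, utr, end=7):
    """Count exact, possibly overlapping, seed-complement sites in a 3' UTR.

    Use end=7 for the hexamer seed (positions 2-7) or end=8 for the
    heptamer seed (positions 2-8).
    """
    site = reverse_complement(seed(strand, 2, end))
    target = normalize(utr)
    count = 0
    pos = target.find(site)
    while pos != -1:
        count += 1
        pos = target.find(site, pos + 1)
    return count


# Made-up example strand and UTR fragment, for illustration only.
antisense = "UGAGGUAGUAGGUUGUAUA"
utr = "AAAUACCUCAGGGUACCUCUU"
matches = count_seed_matches(antisense, utr)  # 2 hexamer seed sites
```

Under these assumptions, a candidate siRNA whose seed has few or no complementary sites across a collection of 3′ UTRs would be preferred over one with many. The same function can be applied to the sense strand by passing it in place of the antisense strand.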
Description

BRIEF DESCRIPTION OF THE FIGURES


FIG. 1 is a representation of a microarray analysis that identifies off-targeted genes.



FIGS. 2A and 2B are representations of the results of an analysis showing that maximum sequence alignment fails to accurately predict off-targeted gene regulation by RNAi. The sense (top) and antisense (bottom) sequences of each siRNA were aligned separately to the sequences of their corresponding 347 experimentally validated off-targets and a comparable number of control untargeted genes to identify the alignments with the maximum percent identity. The number of alignments in each identity window was then plotted for the off-targeted (black) and untargeted (white) populations.



FIGS. 3A-3D are representations of a systematic single base mismatch analysis of siRNA functionality.



FIG. 4 is a representation of the variations of Smith-Waterman scoring parameters that fail to improve the ability to distinguish off-targets from untargeted genes.



FIGS. 5A-5C are bar graphs that show that exact complementarity between the siRNA seed sequence and the 3′ UTR (but not 5′ UTR or ORF) distinguishes off-targeted from untargeted genes.



FIG. 6 is a bar graph that demonstrates that the seed sequence association with off-targeting is not due to 3′ UTR length.



FIGS. 7A and 7B. FIG. 7A is a graph of the frequency of all possible heptamer sequences in a collection of human 3′ UTRs. FIG. 7B is a graph of the frequency of all possible hexamer sequences in a collection of human 3′ UTRs. While the frequency of some seeds is very low, others are quite high. The distribution of a subset of the heptamer and hexamer sequences is shown.



FIGS. 8A and 8B. FIG. 8A is a representation of the distribution of seeds by frequency in 3′ UTRs for Refseq 15 Human NM 3′ UTRs, from lmh_analysis.xls. FIG. 8B is a representation of the distribution of seeds by frequency in 3′ UTRs for Refseq 17 Rat NM 3′ UTRs, from SeedThresholdForMouseRat.xls.
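The kind of tabulation underlying FIGS. 7A, 7B, 8A and 8B can be sketched in Python as follows. This is an illustrative sketch only. It assumes that the "frequency" of a hexamer or heptamer is the number of 3′ UTRs in the collection that contain it at least once; the figures may have been computed differently. The function name `kmer_utr_frequency` and the input sequences are hypothetical.

```python
from collections import Counter
from itertools import product


def kmer_utr_frequency(utrs, k=6):
    """Count, for every possible k-mer, how many UTRs contain it at least once.

    Assumption: each UTR contributes at most 1 to a given k-mer's count.
    Use k=6 for hexamers and k=7 for heptamers.
    """
    # Start every possible k-mer at zero so that absent k-mers are reported.
    counts = Counter({"".join(p): 0 for p in product("ACGU", repeat=k)})
    for utr in utrs:
        seq = utr.upper().replace("T", "U")
        present = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        for kmer in present:
            if kmer in counts:  # skip k-mers with ambiguous bases such as N
                counts[kmer] += 1
    return counts


# Made-up UTR fragments, for illustration only.
example_utrs = ["AAAAAAAU", "AAAAAACG", "GGGNGGG"]
hexamer_counts = kmer_utr_frequency(example_utrs, k=6)
```

Sorting the resulting counts, for example with `sorted(hexamer_counts.items(), key=lambda kv: kv[1])`, gives a ranking from rarest to most common k-mer, which is one way to view the kind of distribution these figures describe.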


Claims
  • 1. A method of designing a library of siRNA sequences, said method comprising collecting a set of siRNA sequences of at least 100 siRNAs that target at least 25 different genes, wherein said siRNA sequences comprise 18-30 bases, and at least 25% of the siRNA sequences have at positions 2-7 of an antisense sequence a hexamer sequence that is the reverse complement of a sequence selected from the group consisting of:
  • 2. The method according to claim 1, wherein said set of siRNA sequences comprises sequences of at least 200 siRNAs.
  • 3. The method according to claim 1, wherein said set of siRNA sequences comprises sequences of at least 500 siRNAs.
  • 4. The method according to claim 1, wherein said set of siRNA sequences comprises sequences of at least 1000 siRNAs.
  • 5. The method according to claim 1, wherein said set of siRNA sequences targets at least 50 different genes.
  • 6. The method according to claim 1, wherein said set of siRNA sequences targets at least 100 different genes.
  • 7. The method according to claim 1, wherein said set of siRNA sequences targets at least 25,000 different genes, and wherein at least 25% of the siRNA sequences have at positions 2-7 of an antisense sequence a hexamer sequence that is the reverse complement of a sequence selected from the group consisting of
  • 8. The method according to claim 1, wherein at least 50% of the siRNA sequences have a hexamer sequence at positions 2-7 of said antisense sequence that is the reverse complement of a sequence selected from the group consisting of
  • 9. A library of siRNA sequences, said library comprising a collection of siRNA sequences of at least 100 siRNAs that target at least 25 different genes, wherein said siRNA sequences comprise 18-30 bases, and at least 25% of the siRNA sequences have a hexamer sequence at positions 2-7 of an antisense sequence selected from the group consisting of the reverse complement of
  • 10. The library according to claim 9, wherein said set of siRNA sequences comprises sequences of at least 200 siRNAs.
  • 11. The library according to claim 9, wherein said set of siRNA sequences targets at least 50 different genes.
  • 12. The library according to claim 9, wherein at least 50% of the siRNA sequences have a hexamer sequence at positions 2-7 of said antisense sequence selected from the group consisting of the reverse complement of
  • 13. A computer program product for use in conjunction with a computer system, the computer program product comprising a computer readable storage medium and a computer program mechanism embedded therein, the computer program mechanism comprising: a. an input module, wherein said input module permits a user to identify a target sequence; b. a database mining module, wherein said database mining module is coupled to said input module and is capable of searching a siRNA database comprised of siRNA sequences targeting at least 25 different genes, wherein said siRNA sequences comprise 18-30 bases, and c. an output module, wherein said output module is coupled to said siRNA database mining module and said output module is capable of providing to said user an identification of one or more siRNA sequences from said database where each siRNA that is identified comprises an antisense sequence that is at least 80% complementary to a region of said target sequence and at least 25% of the siRNA sequences identified from said database have a hexamer sequence at positions 2-7 of said antisense sequence selected from the group consisting of the reverse complement of
  • 14. The computer program product of claim 13 further comprising said siRNA database.
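The library composition recited in the claims above (a stated fraction of antisense strands whose positions 2-7 are the reverse complement of a listed sequence, with strands of 18-30 bases) can be checked with a short Python sketch. This is a hedged illustration, not part of the claims. The sequence group referred to in the claims is not reproduced here, so `placeholder_group` below contains made-up hexamers that stand in for it. The function names and example strands are likewise hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(seq):
    """Return the reverse complement of a sequence, written as RNA."""
    return seq.upper().replace("T", "U").translate(_COMPLEMENT)[::-1]


def library_seed_fraction(antisense_strands, reference_hexamers):
    """Fraction of strands whose positions 2-7 (1-indexed, from the 5' end)
    are the reverse complement of some hexamer in reference_hexamers."""
    allowed_seeds = {reverse_complement(h) for h in reference_hexamers}
    strands = [s.upper().replace("T", "U") for s in antisense_strands]
    if not strands:
        return 0.0
    hits = sum(1 for s in strands if s[1:7] in allowed_seeds)
    return hits / len(strands)


def within_length_limits(strand, shortest=18, longest=30):
    """Check the 18-30 base length range recited in the claims."""
    return shortest <= len(strand) <= longest


# Placeholder hexamers only; NOT the sequence group recited in the claims.
placeholder_group = {"CGCGUA", "UCGACG"}

# Made-up antisense strands, for illustration only.
library = [
    "AUACGCGAAAAAAAAAAAA",
    "UCGUCGAUUUUUUUUUUUU",
    "UGAGGUAGUAGGUUGUAUA",
    "AAAAAAAAAAAAAAAAAAA",
]
fraction = library_seed_fraction(library, placeholder_group)
meets_quarter_threshold = fraction >= 0.25
```

A real check would substitute the actual sequence group for `placeholder_group` and would also verify the number of siRNAs and of distinct target genes, which this sketch does not model.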
Provisional Applications (1)
Number Date Country
60782970 Mar 2006 US