Methods for identifying small molecules that bind specific RNA structural motifs

Information

  • Patent Application
  • 20040219545
  • Publication Number
    20040219545
  • Date Filed
    February 03, 2004
  • Date Published
    November 04, 2004
Abstract
The present invention relates to a method for screening and identifying test compounds that bind to a preselected target ribonucleic acid (“RNA”). Direct, non-competitive binding assays are advantageously used to screen libraries of compounds for those that selectively bind to a preselected target RNA. Binding of target RNA molecules to a particular test compound is detected using any physical method that measures the altered physical property of the target RNA bound to a test compound. The structure of the test compound attached to the labeled RNA is also determined. The methods used will depend, in part, on the nature of the library screened. The methods of the present invention provide a simple, sensitive assay for high-throughput screening of libraries of compounds to identify pharmaceutical leads.
Description


1. INTRODUCTION

[0002] The present invention relates to a method for screening and identifying test compounds that bind to a preselected target ribonucleic acid (“RNA”). Direct, non-competitive binding assays are advantageously used to screen libraries of compounds for those that selectively bind to a preselected target RNA. Binding of target RNA molecules to a particular test compound is detected using any physical method that measures the altered physical property of the target RNA bound to a test compound. The methods of the present invention provide a simple, sensitive assay for high-throughput screening of libraries of compounds to identify pharmaceutical leads.



2. BACKGROUND OF THE INVENTION

[0003] Protein-nucleic acid interactions are involved in many cellular functions, including transcription, RNA splicing, mRNA decay, and mRNA translation. Readily accessible synthetic molecules that can bind with high affinity to specific sequences of single- or double-stranded nucleic acids have the potential to interfere with these interactions in a controllable way, making them attractive tools for molecular biology and medicine. Successful approaches for blocking function of target nucleic acids include using duplex-forming antisense oligonucleotides (Miller, 1996, Progress in Nucl. Acid Res. & Mol. Biol. 52:261-291; Ojwang & Rando, 1999, Achieving antisense inhibition by oligodeoxynucleotides containing N7 modified 2′-deoxyguanosine using tumor necrosis factor receptor type 1, METHODS: A Companion to Methods in Enzymology 18:244-251) and peptide nucleic acids (“PNA”) (Nielsen, 1999, Current Opinion in Biotechnology 10:71-75), which bind to nucleic acids via Watson-Crick base-pairing. Triplex-forming anti-gene oligonucleotides can also be designed (Ping et al., 1997, RNA 3:850-860; Aggarwal et al., 1996, Cancer Res. 56:5156-5164; U.S. Pat. No. 5,650,316), as well as pyrrole-imidazole polyamide oligomers (Gottesfeld et al., 1997, Nature 387:202-205; White et al., 1998, Nature 391:468-471), which are specific for the major and minor grooves of a double helix, respectively.


[0004] In addition to synthetic nucleic acids (i.e., antisense, ribozymes, and triplex-forming molecules), there are examples of natural products that interfere with deoxyribonucleic acid (“DNA”) or RNA processes such as transcription or translation. For example, certain carbohydrate-based host cell factors, calicheamicin oligosaccharides, interfere with the sequence-specific binding of transcription factors to DNA and inhibit transcription in vivo (Ho et al., 1994, Proc. Natl. Acad. Sci. USA 91:9203-9207; Liu et al., 1996, Proc. Natl. Acad. Sci. USA 93:940-944). Certain classes of known antibiotics have been characterized and were found to interact with RNA. For example, the antibiotic thiostrepton binds tightly to a 60-mer from ribosomal RNA (Cundliffe et al., 1990, in The Ribosome: Structure, Function & Evolution (Schlessinger et al., eds.) American Society for Microbiology, Washington, D.C. pp. 479-490). Bacterial resistance to various antibiotics often involves methylation at specific rRNA sites (Cundliffe, 1989, Ann. Rev. Microbiol. 43:207-233). Aminoglycosidic aminocyclitol (aminoglycoside) antibiotics and peptide antibiotics are known to inhibit group I intron splicing by binding to specific regions of the RNA (von Ahsen et al., 1991, Nature (London) 353:368-370). Some of these same aminoglycosides have also been found to inhibit hammerhead ribozyme function (Stage et al., 1995, RNA 1:95-101). In addition, certain aminoglycosides and other protein synthesis inhibitors have been found to interact with specific bases in 16S rRNA (Woodcock et al., 1991, EMBO J. 10:3099-3103). An oligonucleotide analog of the 16S rRNA has also been shown to interact with certain aminoglycosides (Purohit et al., 1994, Nature 370:659-662). A molecular basis for hypersensitivity to aminoglycosides has been found to be located in a single base change in mitochondrial rRNA (Hutchin et al., 1993, Nucleic Acids Res. 21:4174-4179). Aminoglycosides have also been shown to inhibit the interaction between specific structural RNA motifs and the corresponding RNA binding protein. Zapp et al. (Cell, 1993, 74:969-978) have demonstrated that the aminoglycosides neomycin B, lividomycin A, and tobramycin can block the binding of Rev, a viral regulatory protein required for viral gene expression, to its viral recognition element in the IIB (or RRE) region of HIV RNA. This blockage appears to be the result of competitive binding of the antibiotics directly to the RRE RNA structural motif.


[0005] Single-stranded sections of RNA can fold into complex tertiary structures consisting of local motifs such as loops, bulges, pseudoknots, guanosine quartets and turns (Chastain & Tinoco, 1991, Progress in Nucleic Acid Res. & Mol. Biol. 41:131-177; Chow & Bogdan, 1997, Chemical Reviews 97:1489-1514; Rando & Hogan, 1998, Biologic activity of guanosine quartet forming oligonucleotides in “Applied Antisense Oligonucleotide Technology” Stein & Krieg (eds.) John Wiley and Sons, New York, pages 335-352). Such structures can be critical to the activity of the nucleic acid and affect functions such as regulation of mRNA transcription, stability, or translation (Weeks & Crothers, 1993, Science 261:1574-1577). The dependence of these functions on the native three-dimensional structural motifs of single-stranded stretches of nucleic acids makes it difficult to identify or design synthetic agents that bind to these motifs using the general, simple-to-use sequence-specific recognition rules for the formation of double- and triple-helical nucleic acids that are used in the design of antisense and ribozyme type molecules. Approaches to screening generally involve competitive assays designed to identify compounds that disrupt the interaction between a target RNA and a physiological host cell factor(s) that had been previously identified to specifically interact with that particular target RNA. In general, such assays require the identification and characterization of the host cell factor(s) deemed to be required for the function of the target RNA. Both the target RNA and its preselected host cell binding partner are used in a competitive format to identify compounds that disrupt or interfere with the interaction of the two components in the assay.


[0006] Citation or identification of any reference in Section 2 of this application is not an admission that such reference is available as prior art to the present invention.



3. SUMMARY OF THE INVENTION

[0007] The present invention relates to methods for identifying compounds that bind to preselected target elements of nucleic acids including, but not limited to, specific RNA sequences, RNA structural motifs, and/or RNA structural elements. The specific target RNA sequences, RNA structural motifs, and/or RNA structural elements are used as targets for screening small molecules and identifying those that directly bind these specific sequences, motifs, and/or structural elements. For example, methods are described in which a preselected target RNA having a detectable label is used to screen a library of test compounds, preferably under physiologic conditions. Any complexes formed between the target RNA and a member of the library are identified using physical methods that detect the altered physical property of the target RNA bound to a test compound. In particular, the present invention relates to methods for using a target RNA having a detectable label to screen a library of test compounds free in solution, in labeled tubes or microtiter plates, or in a microarray. Compounds in the library that bind to the labeled target RNA will form a detectably labeled complex. The detectably labeled complex can then be identified and removed from the uncomplexed, unlabeled test compounds in the library, and from uncomplexed, labeled target RNA, by a variety of methods, including but not limited to, methods that differentiate changes in the electrophoretic, chromatographic, or thermostable properties of the complexed target RNA. Such methods include, but are not limited to, electrophoresis, fluorescence spectroscopy, surface plasmon resonance, mass spectrometry, scintillation, proximity assay, structure-activity relationships (“SAR”) by NMR spectroscopy, size exclusion chromatography, affinity chromatography, and nanoparticle aggregation. The structure of the test compound attached to the labeled RNA is then determined. The methods used will depend, in part, on the nature of the library screened. For example, assays or microarrays of test compounds, each having an address or identifier, may be deconvoluted, e.g., by cross-referencing the positive sample to the original compound list that was applied to the individual test assays. Another method for identifying test compounds includes de novo structure determination of the test compounds using mass spectrometry or nuclear magnetic resonance (“NMR”). The test compounds identified are useful for any purpose to which a binding reaction may be put, for example in assay methods, diagnostic procedures, cell sorting, as inhibitors of target molecule function, as probes, as sequestering agents and the like. In addition, small organic molecules which interact specifically with target RNA molecules may be useful as lead compounds for the development of therapeutic agents.


[0008] The methods described herein for the identification of compounds that directly bind to a particular preselected target RNA are well suited for high-throughput screening. The direct binding method of the invention offers advantages over drug screening systems for competitors that inhibit the formation of naturally-occurring RNA binding protein:target RNA complexes; i.e., competitive assays. The direct binding method of the invention is rapid and can be set up to be readily performed, e.g., by a technician, making it amenable to high throughput screening. The method of the invention also eliminates the bias inherent in the competitive drug screening systems, which require the use of a preselected host cell factor that may not have physiological relevance to the activity of the target RNA. Instead, the methods of the invention are used to identify any compound that can directly bind to specific target RNA sequences, RNA structural motifs, and/or RNA structural elements, preferably under physiologic conditions. As a result, the compounds so identified can inhibit the interaction of the target RNA with any one or more of the native host cell factors (whether known or unknown) required for activity of the RNA in vivo.


[0009] The present invention may be understood more fully by reference to the detailed description and examples, which are intended to illustrate non-limiting embodiments of the invention.



3.1. Definitions

[0010] As used herein, a “target nucleic acid” refers to RNA, DNA, or a chemically modified variant thereof. In a preferred embodiment, the target nucleic acid is RNA. A target nucleic acid also refers to tertiary structures of the nucleic acids, such as, but not limited to loops, bulges, pseudoknots, guanosine quartets and turns. A target nucleic acid also refers to RNA elements such as, but not limited to, the HIV TAR element, internal ribosome entry site, “slippery site”, instability elements, and adenylate uridylate-rich elements, which are described in Section 5.1. Non-limiting examples of target nucleic acids are presented in Section 5.1 and Section 6.


[0011] As used herein, a “library” refers to a plurality of test compounds with which a target nucleic acid molecule is contacted. A library can be a combinatorial library, e.g., a collection of test compounds synthesized using combinatorial chemistry techniques, or a collection of unique chemicals of low molecular weight (less than 1000 daltons) that each occupy a unique three-dimensional space.


[0012] As used herein, a “label” or “detectable label” is a composition that is detectable, either directly or indirectly, by spectroscopic, photochemical, biochemical, immunochemical, or chemical means. For example, useful labels include radioactive isotopes (e.g., 32P, 35S, and 3H), dyes, fluorescent dyes, electron-dense reagents, enzymes and their substrates (e.g., as commonly used in enzyme-linked immunoassays, e.g., alkaline phosphatase and horseradish peroxidase), biotin-streptavidin, digoxigenin, or haptens and proteins for which antisera or monoclonal antibodies are available. Moreover, a label or detectable moiety can include an “affinity tag” that, when coupled with the target nucleic acid and incubated with a test compound or compound library, allows for the affinity capture of the target nucleic acid along with molecules bound to the target nucleic acid. One skilled in the art will appreciate that an affinity tag bound to the target nucleic acids has, by definition, a complementary ligand coupled to a solid support that allows for its capture. For example, useful affinity tags and complementary partners include, but are not limited to, biotin-streptavidin, complementary nucleic acid fragments (e.g., oligo dT-oligo dA, oligo T-oligo A, oligo dG-oligo dC, oligo G-oligo C), aptamers, or haptens and proteins for which antisera or monoclonal antibodies are available. The label or detectable moiety is typically bound, either covalently, through a linker or chemical bond, or through ionic, van der Waals or hydrogen bonds to the molecule to be detected.


[0013] As used herein, a “dye” refers to a molecule that, when exposed to radiation, emits radiation at a level that is detectable visually or via conventional spectroscopic means. As used herein, a “visible dye” refers to a molecule having a chromophore that absorbs radiation in the visible region of the spectrum (i.e., having a wavelength of between about 400 nm and about 700 nm) such that the transmitted radiation is in the visible region and can be detected either visually or by conventional spectroscopic means. As used herein, an “ultraviolet dye” refers to a molecule having a chromophore that absorbs radiation in the ultraviolet region of the spectrum (i.e., having a wavelength of between about 30 nm and about 400 nm). As used herein, an “infrared dye” refers to a molecule having a chromophore that absorbs radiation in the infrared region of the spectrum (i.e., having a wavelength between about 700 nm and about 3,000 nm). A “chromophore” is the network of atoms of the dye that, when exposed to radiation, emits radiation at a level that is detectable visually or via conventional spectroscopic means. One of skill in the art will readily appreciate that although a dye absorbs radiation in one region of the spectrum, it may emit radiation in another region of the spectrum. For example, an ultraviolet dye may emit radiation in the visible region of the spectrum. One of skill in the art will also readily appreciate that a dye can transmit radiation or can emit radiation via fluorescence or phosphorescence.
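
By way of illustration only, the wavelength ranges recited in the foregoing definitions can be expressed as a simple lookup. The following Python sketch is not part of the disclosed methods. The function name classify_dye, the returned strings, and the assignment of each boundary value (about 400 nm and about 700 nm) to the longer-wavelength class are assumptions made solely for the example.

```python
def classify_dye(absorption_nm: float) -> str:
    """Classify a dye by the approximate wavelength (nm) at which its chromophore absorbs.

    Ranges follow the definitions above:
      ultraviolet: about 30 nm to about 400 nm
      visible:     about 400 nm to about 700 nm
      infrared:    about 700 nm to about 3,000 nm
    Assumption: a boundary value is assigned to the longer-wavelength class.
    """
    if 30 <= absorption_nm < 400:
        return "ultraviolet dye"
    if 400 <= absorption_nm < 700:
        return "visible dye"
    if 700 <= absorption_nm <= 3000:
        return "infrared dye"
    return "outside the defined ranges"


if __name__ == "__main__":
    # Hypothetical absorption maxima, chosen only to exercise each branch.
    for wavelength in (260, 550, 800, 5000):
        print(wavelength, "nm:", classify_dye(wavelength))
```

The classification concerns absorption only. As noted above, a dye that absorbs in one region of the spectrum may emit in another, so the emission wavelength would need to be recorded separately.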


[0014] The phrase “pharmaceutically acceptable salt(s),” as used herein includes but is not limited to salts of acidic or basic groups that may be present in test compounds identified using the methods of the present invention. Test compounds that are basic in nature are capable of forming a wide variety of salts with various inorganic and organic acids. The acids that can be used to prepare pharmaceutically acceptable acid addition salts of such basic compounds are those that form non-toxic acid addition salts, i.e., salts containing pharmacologically acceptable anions, including but not limited to sulfuric, citric, maleic, acetic, oxalic, hydrochloride, hydrobromide, hydroiodide, nitrate, sulfate, bisulfate, phosphate, acid phosphate, isonicotinate, acetate, lactate, salicylate, citrate, acid citrate, tartrate, oleate, tannate, pantothenate, bitartrate, ascorbate, succinate, maleate, gentisinate, fumarate, gluconate, glucuronate, saccharate, formate, benzoate, glutamate, methanesulfonate, ethanesulfonate, benzenesulfonate, p-toluenesulfonate and pamoate (i.e., 1,1′-methylene-bis-(2-hydroxy-3-naphthoate)) salts. Test compounds that include an amino moiety may form pharmaceutically or cosmetically acceptable salts with various amino acids, in addition to the acids mentioned above. Test compounds that are acidic in nature are capable of forming base salts with various pharmacologically or cosmetically acceptable cations. Examples of such salts include alkali metal or alkaline earth metal salts and, particularly, calcium, magnesium, sodium, lithium, zinc, potassium, and iron salts.


[0015] By “substantially one type of test compound,” as used herein, is meant that the assay can be performed in such a fashion that at some point, only one compound need be used in each reaction so that, if the result is indicative of a binding event occurring between the target RNA molecule and the test compound, the test compound can be easily identified.







4. DESCRIPTION OF DRAWINGS

[0016]
FIG. 1. Gel retardation analysis to detect peptide-RNA interactions. In 20 μl reactions containing increasing concentrations of Tat47-58 peptide (0.1 μM, 0.2 μM, 0.4 μM, 0.8 μM, 1.6 μM) 50 pmole TAR RNA oligonucleotide was added in TK buffer. The reaction mixture was then heated at 90° C. for 2 min and allowed to cool slowly to 24° C. 10 μl of 30% glycerol was added to each sample and the samples were applied to a 12% non-denaturing polyacrylamide gel. The gel was electrophoresed using 1200 volt-hours at 4° C. in TBE Buffer. Following electrophoresis, the gel was dried and the radioactivity was quantitated with a phosphorimager. The concentration of peptide added is indicated above each lane.
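
By way of non-limiting illustration, the amounts of peptide implied by the titration of FIG. 1 can be worked out as follows. The Python sketch below assumes that the recited peptide concentrations are final concentrations in the 20 μl reaction and that 50 pmole is the total amount of TAR RNA per reaction. Neither assumption is confirmed by the description, and the function name pmol_in_reaction is hypothetical.

```python
REACTION_VOLUME_UL = 20.0                    # reaction volume recited for FIG. 1
RNA_PMOL = 50.0                              # TAR RNA per reaction (assumed to be the total)
PEPTIDE_UM = [0.1, 0.2, 0.4, 0.8, 1.6]       # Tat47-58 concentrations recited for FIG. 1


def pmol_in_reaction(conc_um: float, volume_ul: float) -> float:
    """Convert a concentration in micromolar and a volume in microliters to picomoles.

    1 uM equals 1 pmol per microliter, so pmol = uM * ul.
    """
    return conc_um * volume_ul


if __name__ == "__main__":
    for conc in PEPTIDE_UM:
        peptide_pmol = pmol_in_reaction(conc, REACTION_VOLUME_UL)
        ratio = peptide_pmol / RNA_PMOL
        print(f"{conc:.1f} uM peptide = {peptide_pmol:.0f} pmol; peptide:RNA = {ratio:.2f}")
```

Under these assumptions the series spans 2 pmol to 32 pmol of peptide, so the RNA would be in molar excess over the peptide in every lane.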


[0017]
FIG. 2. Gentamicin interacts with an oligonucleotide corresponding to the 16S rRNA. 20 μl reactions containing increasing concentrations of gentamicin (1 ng/ml, 10 ng/ml, 100 ng/ml, 1 μg/ml, 10 μg/ml, 50 μg/ml, 500 μg/ml) were added to 50 pmole RNA oligonucleotide in TKM buffer, heated at 90° C. for 2 min and allowed to cool slowly to 24° C. Then 10 μl of 30% glycerol was added to each sample and the samples were applied to a 13.5% non-denaturing polyacrylamide gel. The gel was electrophoresed using 1200 volt-hours at 4° C. in TBE Buffer. Following electrophoresis, the gel was dried and the radioactivity was quantitated using a phosphorimager. The concentration of gentamicin added is indicated above each lane.


[0018]
FIG. 3. The presence of 10 pg/ml gentamicin produces a gel mobility shift in the presence of the 16S rRNA oligonucleotide. 20 μl reactions containing increasing concentrations of gentamicin (100 ng/ml, 10 ng/ml, 1 ng/ml, 100 pg/ml, and 10 pg/ml) were added to 50 pmole RNA oligonucleotide in TKM buffer and were treated as described for FIG. 2.


[0019]
FIG. 4. Gentamicin binding to the 16S rRNA oligonucleotide is weak in the absence of MgCl2. Reaction mixtures containing gentamicin (1 mg/ml, 100 μg/ml, 10 μg/ml, 1 μg/ml, 0.1 μg/ml, and 10 ng/ml) were treated as described in FIG. 2 except that the TKM buffer does not contain MgCl2.


[0020]
FIG. 5. Gel retardation analysis to detect peptide-RNA interactions. In reactions containing increasing concentrations of Tat47-58 peptide (0.1 μM, 0.2 μM, 0.4 μM, 0.8 μM, 1.6 μM) 50 pmole TAR RNA oligonucleotide was added in TK buffer. The reaction mixture was then heated at 90° C. for 2 min and allowed to cool slowly to 24° C. The reactions were loaded onto a SCE9610 automated capillary electrophoresis apparatus (SpectruMedix; State College, Pa.). The peaks correspond to the amount of free TAR RNA (“TAR”) or the Tat-TAR complex (“Tat-TAR”). The concentration of peptide added is indicated below each lane.
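
By way of non-limiting illustration, peak data of the kind described for FIG. 5 could be reduced to a fraction of TAR RNA bound at each peptide concentration. The Python sketch below is an assumption about how such a reduction might be done and is not a procedure recited in this description. The function name fraction_bound and the peak areas used in the example are hypothetical.

```python
def fraction_bound(free_area: float, complex_area: float) -> float:
    """Return the fraction of labeled RNA found in the complex peak.

    free_area    : integrated signal of the free TAR RNA peak
    complex_area : integrated signal of the Tat-TAR complex peak
    Assumption: both peaks carry the same label, so signal is proportional to RNA amount.
    """
    if free_area < 0 or complex_area < 0:
        raise ValueError("peak areas must be non-negative")
    total = free_area + complex_area
    if total == 0:
        raise ValueError("no signal detected in either peak")
    return complex_area / total


if __name__ == "__main__":
    # Hypothetical peak areas (arbitrary units), not data from FIG. 5.
    example_peaks = {0.1: (90.0, 10.0), 0.4: (60.0, 40.0), 1.6: (20.0, 80.0)}
    for conc_um, (free, bound) in example_peaks.items():
        print(f"{conc_um} uM peptide: fraction bound = {fraction_bound(free, bound):.2f}")
```

The same calculation could be applied to band intensities quantitated with a phosphorimager, as in FIG. 1, subject to the same assumption about proportionality of signal.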







5. DETAILED DESCRIPTION OF THE INVENTION

[0021] The present invention relates to methods for identifying compounds that bind to preselected target elements of nucleic acids, in particular, RNAs, including but not limited to preselected target RNA sequences, structural motifs, or structural elements. Methods are described in which a preselected target RNA having a detectable label is used to screen a library of test compounds. Any complexes formed between the target RNA and a member of the library are identified using physical methods that detect the altered physical property of the target RNA bound to a test compound. Changes in the physical property of the RNA-test compound complex relative to the target RNA or test compound can be measured by methods such as, but not limited to, methods that detect a change in mobility due to a change in mass, change in charge, or a change in thermostability. Such methods include, but are not limited to, electrophoresis, fluorescence spectroscopy, surface plasmon resonance, mass spectrometry, scintillation, proximity assay, structure-activity relationships (“SAR”) by NMR spectroscopy, size exclusion chromatography, affinity chromatography, and nanoparticle aggregation. In particular, the present invention relates to methods for using a target RNA having a detectable label to screen a library of test compounds free in solution, in labeled tubes or microtiter plates, or in a microarray. Compounds in the library that bind to the labeled target RNA will form a detectably labeled complex. The detectably labeled complex can then be identified and removed from the unlabeled, uncomplexed test compounds in the library by a variety of methods capable of differentiating changes in the physical properties of the complexed target RNA. The structure of the test compound attached to the labeled RNA is also determined. The methods used will depend, in part, on the nature of the library screened. 
For example, assays or microarrays of test compounds, each having an address or identifier, may be deconvoluted, e.g., by cross-referencing the positive sample to an original compound list that was applied to the individual test assays. Another method for identifying test compounds includes de novo structure determination of the test compounds using mass spectrometry or nuclear magnetic resonance (“NMR”).
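
By way of non-limiting illustration, deconvolution of an addressed library by cross-referencing positive samples to the original compound list can be sketched as a lookup. The Python example below is a minimal sketch only. The function name deconvolute, the well-style addresses, and the compound identifiers are hypothetical and are not taken from this description.

```python
from typing import Dict, List


def deconvolute(compound_list: Dict[str, str], positive_addresses: List[str]) -> Dict[str, str]:
    """Map each positive address back to the compound originally applied there.

    compound_list      : address (e.g., a microtiter well or microarray position)
                         mapped to a compound identifier
    positive_addresses : addresses at which a labeled RNA:compound complex was detected
    """
    hits: Dict[str, str] = {}
    for address in positive_addresses:
        if address not in compound_list:
            # A positive at an unknown address indicates a bookkeeping error.
            raise KeyError(f"address {address!r} is not in the original compound list")
        hits[address] = compound_list[address]
    return hits


if __name__ == "__main__":
    # Hypothetical plate layout and hypothetical screening result.
    original_list = {"A1": "compound-001", "A2": "compound-002", "B1": "compound-003"}
    print(deconvolute(original_list, ["A2", "B1"]))
```

Where the library is not addressed, for example a mixture of compounds free in solution, a lookup of this kind is not available and the de novo structure determination described above would be used instead.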


[0022] Thus, the methods of the present invention provide a simple, sensitive assay for high-throughput screening of libraries of test compounds, in which the test compounds of the library that specifically bind a preselected target nucleic acid are easily distinguished from non-binding members of the library. The structures of the binding molecules are deciphered from the input library by methods depending on the type of library that is used. The test compounds so identified are useful for any purpose to which a binding reaction may be put, for example in assay methods, diagnostic procedures, cell sorting, as inhibitors of target molecule function, as probes, as sequestering agents and lead compounds for development of therapeutics, and the like. Small organic compounds that are identified to interact specifically with the target RNA molecules are particularly attractive candidates as lead compounds for the development of therapeutic agents.


[0023] The assay of the invention reduces bias introduced by competitive binding assays which require the identification and use of a host cell factor (presumably essential for modulating RNA function) as a binding partner for the target RNA. The assays of the present invention are designed to detect any compound or agent that binds to the target RNA, preferably under physiologic conditions. Such agents can then be tested for biological activity, without establishing or guessing which host cell factor or factors is required for modulating the function and/or activity of the target RNA.


[0024] Section 5.1 describes examples of protein-RNA interactions that are important in a variety of cellular functions and several target RNA elements that can be used to identify test compounds. Compounds that inhibit these interactions by binding to the RNA and successfully competing with the natural protein or host cell factor that endogenously binds to the RNA may be important, e.g., in treating or preventing a disease or abnormal condition, such as an infection or unchecked growth. Section 5.2 describes detectable labels for target nucleic acids that are useful in the methods of the invention. Section 5.3 describes libraries of test compounds. Section 5.4 provides conditions for binding a labeled target RNA to a test compound of a library and detecting RNA binding to a test compound using the methods of the invention. Section 5.5 provides methods for separating complexes of target RNAs bound to a test compound from an unbound RNA. Section 5.6 describes methods for identifying test compounds that are bound to the target RNA. Section 5.7 describes a secondary, biological screen of test compounds identified by the methods of the invention to test the effect of the test compounds in vivo. Section 5.8 describes the use of test compounds identified by the methods of the invention for treating or preventing a disease or abnormal condition in mammals.



5.1. Biologically Important RNA-Host Cell Factor Interactions

[0025] Nucleic acids, and in particular RNAs, are capable of folding into complex tertiary structures that include bulges, loops, triple helices and pseudoknots, which can provide binding sites for host cell factors, such as proteins and other RNAs. RNA-protein and RNA-RNA interactions are important in a variety of cellular functions, including transcription, RNA splicing, RNA stability and translation. Furthermore, the binding of such host cell factors to RNAs may alter the stability and translational efficiency of such RNAs, and accordingly affect subsequent translation. For example, some diseases are associated with protein overproduction or decreased protein function. In this case, the identification of compounds to modulate RNA stability and translational efficiency will be useful to treat and prevent such diseases.


[0026] The methods of the present invention are useful for identifying test compounds that bind to target RNA elements in a high throughput screening assay of libraries of test compounds in solution. In particular, the methods of the present invention are useful for identifying a test compound that binds to a target RNA element and inhibits the interaction of that RNA with one or more host cell factors in vivo. The molecules identified using the methods of the invention are useful for inhibiting the formation of specific RNA:host cell factor complexes in vivo.


[0027] In some embodiments, test compounds identified by the methods of the invention are useful for increasing or decreasing the translation of messenger RNAs (“mRNAs”), e.g., protein production, by binding to one or more regulatory elements in the 5′ untranslated region, the 3′ untranslated region, or the coding region of the mRNA. Compounds that bind to mRNA can, inter alia, increase or decrease the rate of mRNA processing, alter its transport through the cell, prevent or enhance binding of the mRNA to ribosomes, suppressor proteins or enhancer proteins, or alter mRNA stability. Accordingly, compounds that increase or decrease mRNA translation can be used to treat or prevent disease. For example, diseases associated with protein overproduction, such as amyloidosis, or with the production of mutant proteins, such as Ras, can be treated or prevented by decreasing translation of the mRNA that codes for the overproduced protein, thus inhibiting production of the protein. Conversely, the symptoms of diseases associated with decreased protein function, such as hemophilia, may be treated by increasing translation of mRNA coding for the protein whose function is decreased, e.g., factor IX in some forms of hemophilia.


[0028] The methods of the invention can be used to identify compounds that bind to mRNAs coding for a variety of proteins with which the progression of diseases in mammals is associated. These mRNAs include, but are not limited to, those coding for amyloid protein and amyloid precursor protein; anti-angiogenic proteins such as angiostatin, endostatin, METH-1 and METH-2; apoptosis inhibitor proteins such as survivin; clotting factors such as Factor IX, Factor VIII, and others in the clotting cascade; collagens; cyclins and cyclin inhibitors, such as cyclin dependent kinases, cyclin D1, cyclin E, WAF1, cdk4 inhibitor, and MTS1; cystic fibrosis transmembrane conductance regulator gene (CFTR); cytokines such as IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, IL-13, IL-14, IL-15, IL-16, IL-17 and other interleukins; hematopoietic growth factors such as erythropoietin (Epo); colony stimulating factors such as G-CSF, GM-CSF, M-CSF, SCF and thrombopoietin; growth factors such as BDNF, BMP, GGRP, EGF, FGF, GDNF, GGF, HGF, IGF-1, IGF-2, KGF, myotrophin, NGF, OSM, PDGF, somatotrophin, TGF-β, TGF-α and VEGF; antiviral cytokines such as interferons, antiviral proteins induced by interferons, TNF-α, and TNF-β; enzymes such as cathepsin K, cytochrome P-450 and other cytochromes, farnesyl transferase, glutathione-S transferases, heparanase, HMG CoA synthetase, N-acetyltransferase, phenylalanine hydroxylase, phosphodiesterase, ras carboxyl-terminal protease, telomerase and TNF converting enzyme; glycoproteins such as cadherins, e.g., N-cadherin and E-cadherin; cell adhesion molecules; selectins; transmembrane glycoproteins such as CD40; heat shock proteins; hormones such as 5-α reductase, atrial natriuretic factor, calcitonin, corticotrophin releasing factor, diuretic hormones, glucagon, gonadotropin, gonadotropin releasing hormone, growth hormone, growth hormone releasing factor, somatotropin, insulin, leptin, luteinizing hormone, luteinizing hormone releasing hormone, parathyroid hormone, thyroid hormone, and thyroid stimulating hormone; proteins involved in immune responses, including antibodies, CTLA4, hemagglutinin, MHC proteins, VLA-4, and kallikrein-kininogen-kinin system; ligands such as CD4; oncogene products such as sis, hst, protein tyrosine kinase receptors, ras, abl, mos, myc, fos, jun, H-ras, ki-ras, c-fms, bcl-2, L-myc, c-myc, gip, gsp, and HER-2; receptors such as bombesin receptor, estrogen receptor, GABA receptors, growth factor receptors including EGFR, PDGFR, FGFR, and NGFR, GTP-binding regulatory proteins, interleukin receptors, ion channel receptors, leukotriene receptor antagonists, lipoprotein receptors, opioid pain receptors, substance P receptors, retinoic acid and retinoid receptors, steroid receptors, T-cell receptors, thyroid hormone receptors, TNF receptors; tissue plasminogen activator; transmembrane receptors; transmembrane transporting systems, such as calcium pump, proton pump, Na/Ca exchanger, MRP1, MRP2, P170, LRP, and cMOAT; transferrin; and tumor suppressor gene products such as APC, brca1, brca2, DCC, MCC, MTS1, NF1, NF2, nm23, p53 and Rb. In addition to the eukaryotic genes listed above, the invention, as described, can be used to define molecules that interrupt viral, bacterial or fungal transcription or translation efficiencies and therefore form the basis for a novel anti-infectious disease therapeutic. Other target genes include, but are not limited to, those disclosed in Section 5.1 and Section 6.


[0029] The methods of the invention can be used to identify mRNA-binding test compounds for increasing or decreasing the production of a protein, thus treating or preventing a disease associated with decreasing or increasing the production of said protein, respectively. The methods of the invention may be useful for identifying test compounds for treating or preventing a disease in mammals, including cats, dogs, swine, horses, goats, sheep, cattle, primates and humans. Such diseases include, but are not limited to, amyloidosis, hemophilia, Alzheimer's disease, atherosclerosis, cancer, giantism, dwarfism, hypothyroidism, hyperthyroidism, inflammation, cystic fibrosis, autoimmune disorders, diabetes, aging, obesity, neurodegenerative disorders, and Parkinson's disease. Other diseases include, but are not limited to, those described in Section 5.1 and diseases caused by aberrant expression of the genes disclosed in Example 6. In addition to the eukaryotic genes listed above, the invention, as described, can be used to define molecules that interrupt viral, bacterial or fungal transcription or translation efficiencies and therefore form the basis for a novel anti-infectious disease therapeutic.


[0030] In other embodiments, test compounds identified by the methods of the invention are useful for preventing the interaction of an RNA, such as a transfer RNA (“tRNA”), an enzymatic RNA or a ribosomal RNA (“rRNA”), with a protein or with another RNA, thus preventing, e.g., assembly of an in vivo protein-RNA or RNA-RNA complex that is essential for the viability of a cell. The term “enzymatic RNA,” as used herein, refers to RNA molecules that are either self-splicing, or that form an enzyme by virtue of their association with one or more proteins, e.g., as in RNase P, telomerase or small nuclear ribonucleoprotein particles. For example, inhibition of an interaction between rRNA and one or more ribosomal proteins may inhibit the assembly of ribosomes, rendering a cell incapable of synthesizing proteins. In addition, inhibition of the interaction of precursor rRNA with ribonucleases or ribonucleoprotein complexes (such as RNase P) that process the precursor rRNA prevents maturation of the rRNA and its assembly into ribosomes. Similarly, a tRNA:tRNA synthetase complex may be inhibited by test compounds identified by the methods of the invention such that tRNA molecules do not become charged with amino acids. Such interactions include, but are not limited to, rRNA interactions with ribosomal proteins, tRNA interactions with tRNA synthetase, RNase P protein interactions with RNase P RNA, and telomerase protein interactions with telomerase RNA.


[0031] In other embodiments, test compounds identified by the methods of the invention are useful for treating or preventing a viral, bacterial, protozoan or fungal infection. For example, transcriptional up-regulation of the genes of human immunodeficiency virus type 1 (“HIV-1”) requires binding of the HIV Tat protein to the HIV trans-activation response region RNA (“TAR RNA”). HIV TAR RNA is a 59-base stem-loop structure located at the 5′-end of all nascent HIV-1 transcripts (Jones & Peterlin, 1994, Annu. Rev. Biochem. 63:717-43). Tat protein is known to interact with uracil 23 in the bulge region of the stem of TAR RNA. Thus, TAR RNA is a potential binding target for test compounds, such as small peptides and peptide analogs that bind to the bulge region of TAR RNA and inhibit formation of a Tat-TAR RNA complex involved in HIV-1 up-regulation (see Hwang et al., 1999, Proc. Natl. Acad. Sci. USA 96:12997-13002). Accordingly, test compounds that bind to TAR RNA are useful as anti-HIV therapeutics (Hamy et al., 1997, Proc. Natl. Acad. Sci. USA 94:3548-3553; Hamy et al., 1998, Biochemistry 37:5086-5095; Mei et al., 1998, Biochemistry 37:14204-14212), and therefore, are useful for treating or preventing AIDS.


[0032] The methods of the invention can be used to identify test compounds to treat or prevent viral, bacterial, protozoan or fungal infections in a patient. In some embodiments, the methods of the invention are useful for identifying compounds that decrease translation of microbial genes by interacting with mRNA, as described above, or for identifying compounds that inhibit the interactions of microbial RNAs with proteins or other ligands that are essential for viability of the virus or microbe. Examples of microbial target RNAs useful in the present invention for identifying antiviral, antibacterial, anti-protozoan and anti-fungal compounds include, but are not limited to, general antiviral and anti-inflammatory targets such as mRNAs of INFα, INFγ, RNAse L, RNAse L inhibitor protein, PKR, tumor necrosis factor, interleukins 1-15, and IMP dehydrogenase; internal ribosome entry sites; HIV-1 CT rich domain and RNase H mRNA; HCV internal ribosome entry site (required to direct translation of HCV mRNA), and the 3′-untranslated tail of HCV genomes; rotavirus NSP3 binding site, which binds the protein NSP3 that is required for rotavirus mRNA translation; HBV epsilon domain; Dengue virus 5′ and 3′ untranslated regions, including IRES; INFα, INFβ and INFγ; Plasmodium falciparum mRNAs; the 16S ribosomal subunit ribosomal RNA and the RNA component of RNase P of bacteria; and the RNA component of telomerase in fungi and cancer cells. Other target viral and bacterial mRNAs include, but are not limited to, those disclosed in Section 6.


[0033] One of skill in the art will appreciate that, although such target RNAs are functionally conserved in various species (e.g., from yeast to humans), they exhibit nucleotide sequence and structural diversity. Therefore, inhibition of, for example, yeast telomerase by an anti-fungal compound identified by the methods of the invention might not interfere with human telomerase and normal human cell proliferation.


[0034] Thus, the methods of the invention can be used to identify test compounds that interfere with one or more target RNA interactions with host cell factors that are important for cell growth or viability, or essential in the life cycle of a virus, a bacterium, a protozoan or a fungus. Such test compounds and/or congeners that demonstrate desirable biologic and pharmacologic activity can be administered to a patient in need thereof in order to treat or prevent a disease caused by viral, bacterial, protozoan, or fungal infections. Such diseases include, but are not limited to, HIV infection, AIDS, human T-cell leukemia, SIV infection, FIV infection, feline leukemia, hepatitis A, hepatitis B, hepatitis C, Dengue fever, malaria, rotavirus infection, severe acute gastroenteritis, diarrhea, encephalitis, hemorrhagic fever, syphilis, legionella, whooping cough, gonorrhea, sepsis, influenza, pneumonia, tinea infection, candida infection, and meningitis.


[0035] Non-limiting examples of RNA elements involved in the regulation of gene expression, i.e., mRNA stability, translational efficiency via translational initiation and ribosome assembly, etc., include the HIV TAR element, internal ribosome entry site, “slippery site”, instability elements, and adenylate uridylate-rich elements, as discussed below.



5.1.1. HIV TAR Element

[0036] Transcriptional up-regulation of the genes of human immunodeficiency virus type 1 (“HIV-1”) requires binding of the HIV Tat protein to the HIV trans-activation response region RNA (“TAR RNA”), a 59-base stem-loop structure located at the 5′ end of all nascent HIV-1 transcripts (Jones & Peterlin, 1994, Annu. Rev. Biochem. 63:717-43). Tat protein is known to interact with uracil 23 in the bulge region of the stem of TAR RNA. Thus, TAR RNA is a useful binding target for test compounds, such as small peptides and peptide analogs that bind to the bulge region of TAR RNA and inhibit formation of a Tat-TAR RNA complex involved in HIV-1 up-regulation (see Hwang et al., 1999, Proc. Natl. Acad. Sci. USA 96:12997-13002). Accordingly, test compounds that bind to TAR RNA can be useful as anti-HIV therapeutics (Hamy et al., 1997, Proc. Natl. Acad. Sci. USA 94:3548-3553; Hamy et al., 1998, Biochemistry 37:5086-5095; Mei et al., 1998, Biochemistry 37:14204-14212), and therefore, are useful for treating or preventing AIDS.



5.1.2. Internal Ribosome Entry Site (“IRES”)

[0037] Internal ribosome entry sites (“IRES”) are found in the 5′ untranslated regions (“5′ UTR”) of several mRNAs, and are thought to be involved in the regulation of translational efficiency. When the IRES element is present on an mRNA downstream of a translational stop codon, it directs ribosomal re-entry (Ghattas et al., 1991, Mol. Cell. Biol. 11:5848-5859), which permits initiation of translation at the start of a second open reading frame.


[0038] As reviewed by Jang et al., a large segment of the 5′ nontranslated region, approximately 400 nucleotides in length, promotes internal entry of ribosomes independent of the non-capped 5′ end of picornavirus mRNAs (mammalian plus-strand RNA viruses whose genomes serve as mRNA). This 400-nucleotide segment (IRES) maps approximately 200 nt downstream from the 5′ end and is highly structured. IRES elements of different picornaviruses, although functionally similar in vitro and in vivo, are not identical in sequence or structure. However, IRES elements of the genera entero- and rhinoviruses, on the one hand, and cardio- and aphthoviruses, on the other hand, reveal similarities corresponding to phylogenetic kinship. All IRES elements contain a conserved Yn-Xm-AUG unit (Y, pyrimidine; X, nucleotide) which appears essential for IRES function. The IRES elements of cardio-, entero- and aphthoviruses bind a cellular protein, p57. In the case of cardioviruses, the interaction between a specific stem-loop of the IRES is essential for translation in vitro. The IRES elements of entero- and cardioviruses also bind the cellular protein, p52, but the significance of this interaction remains to be shown. The function of p57 or p52 in cellular metabolism is unknown. Since picornaviral IRES elements function in vivo in the absence of any viral gene products, it is speculated that IRES-like elements may also occur in specific cellular mRNAs, releasing them from cap-dependent translation (Jang et al., 1990, Enzyme 44(1-4):292-309).



5.1.3. “Slippery Site”

[0039] Programmed, or directed, ribosomal frameshifting, when ribosomes shift from one translation reading frame to another and synthesize two viral proteins from a single viral mRNA, is directed by a unique site in viral mRNAs called the “slippery site.” The slippery site directs ribosomal frameshifting in the −1 or +1 direction that causes the ribosome to slip by one base in the 5′ direction thereby placing the ribosome in the new reading frame to produce a new protein.
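The −1 shift described above can be illustrated with a minimal Python sketch. The function names and the toy sequence are hypothetical and are not taken from this disclosure; the sketch only assumes that the ribosome reads frame-0 codons up to the end of the slippery site and then backs up one base in the 5′ direction, so that one nucleotide is read twice.

```python
def codons(rna, start=0):
    """Split an RNA string into complete triplets beginning at index `start`."""
    return [rna[i:i + 3] for i in range(start, len(rna) - 2, 3)]


def minus_one_shift(rna, slip_at):
    """Codons read in frame 0 up to index `slip_at`, then in the -1 frame.

    `slip_at` is the 0-based index at which the next frame-0 codon would
    start; after the slip, reading resumes one base earlier.
    """
    if slip_at % 3 != 0 or slip_at < 3:
        raise ValueError("slip_at must be a positive multiple of 3")
    before = codons(rna[:slip_at])       # codons decoded before the slip
    after = codons(rna, slip_at - 1)     # ribosome has moved back one base
    return before + after


# Toy sequence (illustrative only) containing a U-rich slippery stretch.
toy_rna = "AUGAAUUUUUUAGGGAAGAUC"
print(codons(toy_rna))               # frame-0 reading
print(minus_one_shift(toy_rna, 12))  # reading after a -1 slip at index 12
```

In this toy case the two readings share their first four codons and diverge afterwards, which is how a single mRNA can encode two different proteins.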


[0040] Programmed, or directed, ribosomal frameshifting is of particular value to viruses that package their plus strands, as it eliminates the need to splice their mRNAs, reduces the risk of packaging defective genomes, and regulates the ratio of viral proteins synthesized. Examples of programmed translational frameshifting (both +1 and −1 shifts) have been identified in ScV systems (Lopinski et al., 2000, Mol. Cell. Biol. 20(4):1095-103); retroviruses (Falk et al., 1993, J. Virol. 67:6273-6277; Jacks & Varmus, 1985, Science 230:1237-1242; Morikawa & Bishop, 1992, Virology 186:389-397; Nam et al., 1993, J. Virol. 67:196-203); coronaviruses (Brierley et al., 1987, EMBO J. 6:3779-3785; Herold & Siddell, 1993, Nucleic Acids Res. 21:5838-5842); giardiaviruses, which are also members of the Totiviridae (Wang et al., 1993, Proc. Natl. Acad. Sci. USA 90:8595-8599); two bacterial genes (Blinkowa & Walker, 1990, Nucleic Acids Res. 18:1725-1729; Craigen & Caskey, 1986, Nature 322:273); bacteriophage genes (Condron et al., 1991, Nucleic Acids Res. 19:5607-5612); astroviruses (Marczinke et al., 1994, J. Virol. 68:5588-5595); the yeast EST3 gene (Lundblad & Morris, 1997, Curr. Biol. 7:969-976); the rat, mouse, Xenopus, and Drosophila ornithine decarboxylase antizymes (Matsufuji et al., 1995, Cell 80:51-60); and a significant number of cellular genes (Herold & Siddell, 1993, Nucleic Acids Res. 21:5838-5842).


[0041] Drugs targeted to ribosomal frameshifting minimize the problem of virus drug resistance because this strategy targets a host cellular process rather than one introduced into the cell by the virus, which minimizes the ability of viruses to evolve drug-resistant mutants. Compounds that target the RNA elements involved in regulating programmed frameshifting should have several advantages, including (a) any selective pressure on the host cellular translational machinery to adapt to the drugs would have to occur at the host evolutionary time scale, which is on the order of millions of years, (b) ribosomal frameshifting is not used to express any host proteins, and (c) altering viral frameshifting efficiencies by modulating the activity of a host protein minimizes the likelihood that the virus will acquire resistance to such inhibition by mutations in its own genome.



5.1.4. Instability Elements

[0042] “Instability elements” may be defined as specific sequence elements that promote the recognition of unstable mRNAs by cellular turnover machinery. Instability elements have been found within mRNA protein coding regions as well as untranslated regions.


[0043] Altering the control of stability of normal mRNAs may lead to disease. The alteration of mRNA stability has been implicated in diseases such as, but not limited to, cancer, immune disorders, heart disease, and fibrotic disorders.


[0044] There are several examples of mutations that delete instability elements which then result in stabilization of mRNAs that may be involved in the onset of cancer. In Burkitt's lymphoma, a portion of the c-myc proto-oncogene is translocated to an Ig locus, producing a form of the c-myc mRNA that is five times more stable (see, e.g., Kapstein et al., 1996, J. Biol. Chem. 271(31):18875-84). The highly oncogenic v-fos mRNA lacks the 3′ UTR adenylate uridylate rich element (“ARE”) that is found in the more labile and weakly oncogenic c-fos mRNA (see, e.g., Schiavi et al., 1992, Biochim Biophys Acta. 1114(2-3):95-106). Differences between the benign cervical lesions brought about by nonintegrated circular human papillomavirus type 16 and its integrated form, which lacks the 3′ UTR ARE and correlates with cervical carcinomas, may be a consequence of stabilizing the E6/E7 transcripts encoding oncogenic proteins. Integration of the virus results in deletion of the ARE instability element, resulting in stabilization of the transcripts and over-expression of the proteins (see, e.g., Jeon & Lambert, 1995, Proc. Natl. Acad. Sci. USA 92(5):1654-8). Deletion of AREs from the 3′ UTR of the IL-2 and IL-3 genes promotes increased stabilization of these mRNAs, high expression of these proteins, and leads to the formation of cancerous cells (see, e.g., Stoecklin et al., 2000, Mol. Cell. Biol. 20(11):3753-63).


[0045] Mutations in trans-acting factors involved in mRNA turnover may also promote cancer. In monocytic tumors, the lymphokine GM-CSF mRNA is specifically stabilized as a consequence of an oncogenic lesion in a trans-acting factor that controls mRNA turnover rates. Furthermore, the normally unstable IL-3 transcript is inappropriately long-lived in mast tumor cells. Similarly, the labile GM-CSF mRNA is greatly stabilized in bladder carcinoma cells. See, e.g., Bickel et al., 1990, J. Immunol. 145(3):840-5.


[0046] The immune system is regulated by a large number of regulatory molecules that either activate or inhibit the immune response. It has now been clearly demonstrated that the stability of the transcripts encoding these proteins is highly regulated. Altered regulation of these molecules leads to mis-regulation of this process and can result in drastic medical consequences. For example, recent results using transgenic mice have shown that mis-regulation of the stability of the important modulator TNFα mRNA leads to diseases such as, but not limited to, rheumatoid arthritis and a Crohn's-like liver disease. See, e.g., Clark, 2000, Arthritis Res. 2(3):172-4.


[0047] Smooth muscle in the heart is modulated by the β-adrenergic receptor, which in turn responds to the sympathetic neurotransmitter norepinephrine and the adrenal hormone epinephrine. Chronic heart failure is characterized by impairment of smooth muscle cells, which results, in part, from the more rapid decay of the β-adrenergic receptor mRNA. See, e.g., Ellis & Frielle, 1999, Biochem. Biophys. Res. Commun. 258(3):552-8.


[0048] A large number of diseases result from over-expression of collagen. For example, cirrhosis results from damage to the liver as a consequence of cancer, viral infection, or alcohol abuse. Such damage causes mis-regulation of collagen expression, leading to the formation of large collagen deposits. Recent results indicate that the sizeable increase in collagen expression is largely attributable to stabilization of its mRNA. See, e.g., Lindquist et al., 2000, Am. J. Physiol. Gastrointest. Liver Physiol. 279(3):G471-6.



5.1.5. Adenylate Uridylate-Rich Elements (“ARE”)

[0049] Adenylate uridylate-rich elements (“ARE”) are found in the 3′ untranslated regions (“3′ UTR”) of several mRNAs, and are involved in the turnover of mRNAs, such as but not limited to those encoding transcription factors, cytokines, and lymphokines. AREs may function both as stabilizing and destabilizing elements. ARE mRNAs are classified into five groups, depending on sequence (Bakheet et al., 2001, Nucl. Acids Res. 29(1):246-254). An ongoing database at the web site http://rc.kfshrc.edu.sa/ared contains ARE-containing mRNAs and their cluster groups, which is incorporated by reference in its entirety. The ARE motifs are classified as follows:
Group I Cluster      (AUUUAUUUAUUUAUUUAUUUA)        SEQ ID NO: 1
Group II Cluster     (AUUUAUUUAUUUAUUUA) stretch    SEQ ID NO: 2
Group III Cluster    (WAUUUAUUUAUUUAW) stretch      SEQ ID NO: 3
Group IV Cluster     (WWAUUUAUUUAWW) stretch        SEQ ID NO: 4
Group V Cluster      (WWWWAUUUAWWWW) stretch        SEQ ID NO: 5


[0050] The ARE-mRNAs were clustered into five groups containing five, four, three and two pentameric repeats, while the last group contains only one pentamer within the 13-bp ARE pattern. Functional categories were assigned whenever possible according to NCBI-COG functional annotation (Tatusov et al., 2001, Nucleic Acids Research, 29(1): 22-28), in addition to the categories: inflammation, immune response, development/differentiation, using an extensive literature search.
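As an illustration of the grouping above, the following Python sketch assigns a 3′ UTR sequence to the most stringent ARE cluster it matches. It is a minimal sketch rather than the procedure used to build the cited database: the function name is hypothetical, the motifs are those of SEQ ID NOS: 1-5, and W is assumed to carry its standard IUPAC meaning (A or U).

```python
import re

# Motifs from SEQ ID NOS: 1-5, ordered from most to least stringent.
ARE_PATTERNS = [
    ("I", "AUUUAUUUAUUUAUUUAUUUA"),
    ("II", "AUUUAUUUAUUUAUUUA"),
    ("III", "WAUUUAUUUAUUUAW"),
    ("IV", "WWAUUUAUUUAWW"),
    ("V", "WWWWAUUUAWWWW"),
]


def classify_are(utr):
    """Return the ARE group ("I".."V") of a 3' UTR sequence, or None.

    DNA input is accepted; T is converted to U before matching.
    """
    seq = utr.upper().replace("T", "U")
    for group, motif in ARE_PATTERNS:
        # Assumption: W is the IUPAC code for A or U.
        if re.search(motif.replace("W", "[AU]"), seq):
            return group
    return None


print(classify_are("CCAUUUAUUUAUUUAUUUAUUUACC"))
print(classify_are("GGAUAUAUUUAUAUAGG"))
```

Because a sequence containing a Group I motif also contains the shorter motifs, the groups are tested in order and the first match is returned.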


[0051] Group I contains many secreted proteins including GM-CSF, IL-1, IL-11, IL-12 and Gro-β that affect the growth of hematopoietic and immune cells (Witsell & Schook, 1992, Proc. Natl Acad. Sci. USA, 89:4754-4758). Although TNFα is both a pro-inflammatory and anti-tumor protein, there is experimental evidence that it can act as a growth factor in certain leukemias and lymphomas (Liu et al., 2000, J. Biol. Chem. 275:21086-21093).


[0052] Unlike Group I, Groups II-V contain functionally diverse gene families comprising immune response, cell cycle and proliferation, inflammation and coagulation, angiogenesis, metabolism, energy, DNA binding and transcription, nutrient transportation and ionic homeostasis, protein synthesis, cellular biogenesis, signal transduction, and apoptosis (Bakheet et al., 2001, Nucl. Acids Res. 29(1):246-254).


[0053] Several groups have described ARE-binding proteins that influence the ARE-mRNA stability. Among the well-characterized proteins are the mammalian homologs of ELAV (embryonic lethal abnormal vision) proteins including AUF1, HuR and Hel-N2 (Zhang et al., 1993, Mol. Cell. Biol. 13:7652-7665; Levine et al., 1993, Mol. Cell. Biol. 13:3494-3504; Ma et al., 1996, J. Biol. Chem. 271:8144-8151). The zinc-finger protein tristetraprolin has been identified as another ARE-binding protein with destabilizing activity on TNF-α, IL-3 and GM-CSF mRNAs (Stoecklin et al., 2000, Mol. Cell. Biol. 20:3753-3763; Carballo et al., 2000, Blood 95:1891-1899).


[0054] Since ARE-containing genes are clearly important in biological systems, including but not limited to a number of the early response genes that regulate cell proliferation and responses to exogenous agents, the identification of compounds that bind to one or more of the ARE clusters and potentially modulate the stability of the target RNA can potentially be of value as a therapeutic.



5.2. Detectably Labeled Target RNAs

[0055] Target nucleic acids, including but not limited to RNA and DNA, useful in the methods of the present invention have a label that is detectable via conventional spectroscopic means or radiographic means. Preferably, target nucleic acids are labeled with a covalently attached dye molecule. Useful dye-molecule labels include, but are not limited to, fluorescent dyes, phosphorescent dyes, ultraviolet dyes, infrared dyes, and visible dyes. Preferably, the dye is a visible dye.


[0056] Useful labels in the present invention can include, but are not limited to, spectroscopic labels such as fluorescent dyes (e.g., fluorescein and derivatives such as fluorescein isothiocyanate (FITC) and Oregon Green™, rhodamine and derivatives (e.g., Texas red, tetramethylrhodamine isothiocyanate (TRITC), bora-3a,4a-diaza-s-indacene (BODIPY®) and derivatives, etc.), digoxigenin, biotin, phycoerythrin, AMCA, CyDye™, and the like), radiolabels (e.g., 3H, 125I, 35S, 14C, 32P, 33P, etc.), enzymes (e.g., horseradish peroxidase, alkaline phosphatase, etc.), spectroscopic colorimetric labels such as colloidal gold or colored glass or plastic (e.g., polystyrene, polypropylene, latex, etc.) beads, or nanoparticles, i.e., nanoclusters of inorganic ions with defined dimension from 0.1 to 1000 nm. Useful affinity tags and complementary partners include, but are not limited to, biotin-streptavidin, complementary nucleic acid fragments (e.g., oligo dT-oligo dA, oligo T-oligo A, oligo dG-oligo dC, oligo G-oligo C), aptamer-streptavidin, or haptens and proteins for which antisera or monoclonal antibodies are available. The label may be coupled directly or indirectly to a component of the detection assay (e.g., the detection reagent) according to methods well known in the art. A wide variety of labels may be used, with the choice of label depending on sensitivity required, ease of conjugation with the compound, stability requirements, available instrumentation, and disposal provisions.


[0057] In one embodiment, nucleic acids that are labeled at one or more specific locations are chemically synthesized using phosphoramidite or other solution or solid-phase methods. Detailed descriptions of the chemistry used to form polynucleotides by the phosphoramidite method are well known (see, e.g., Caruthers et al., U.S. Pat. Nos. 4,458,066 and 4,415,732; Caruthers et al., 1982, Genetic Engineering 4:1-17; Users Manual Model 392 and 394 Polynucleotide Synthesizers, 1990, pages 6-1 through 6-22, Applied Biosystems, Part No. 901237; Ojwang, et al., 1997, Biochemistry, 36:6033-6045). The phosphoramidite method of polynucleotide synthesis is the preferred method because of its efficient and rapid coupling and the stability of the starting materials. The synthesis is performed with the growing polynucleotide chain attached to a solid support, such that excess reagents, which are generally in the liquid phase, can be easily removed by washing, decanting, and/or filtration, thereby eliminating the need for purification steps between synthesis cycles.


[0058] The following briefly describes illustrative steps of a typical polynucleotide synthesis cycle using the phosphoramidite method. First, a solid support to which is attached a protected nucleoside monomer at its 3′ terminus is treated with acid, e.g., trichloroacetic acid, to remove the 5′-hydroxyl protecting group, freeing the hydroxyl group for a subsequent coupling reaction. After the deprotection is completed, an activated intermediate is formed by contacting the support-bound nucleoside with a protected nucleoside phosphoramidite monomer and a weak acid, e.g., tetrazole. The weak acid protonates the nitrogen atom of the phosphoramidite, forming a reactive intermediate. Nucleoside addition is generally complete within 30 seconds. Next, a capping step is performed, which terminates any polynucleotide chains that did not undergo nucleoside addition. Capping is preferably performed using acetic anhydride and 1-methylimidazole. The phosphite group of the internucleotide linkage is then converted to the more stable phosphotriester by oxidation using iodine as the preferred oxidizing agent and water as the oxygen donor. After oxidation, the hydroxyl protecting group of the newly added nucleoside is removed with a protic acid, e.g., trichloroacetic acid or dichloroacetic acid, and the cycle is repeated one or more times until chain elongation is complete. After synthesis, the polynucleotide chain is cleaved from the support using a base, e.g., ammonium hydroxide or t-butyl amine. The cleavage reaction also removes any phosphate protecting groups, e.g., cyanoethyl. Finally, the protecting groups on the exocyclic amines of the bases and any protecting groups on the dyes are removed by treating the polynucleotide solution in base at an elevated temperature, e.g., at about 55° C. Preferably the various protecting groups are removed using ammonium hydroxide or t-butyl amine.
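Because chains that fail to couple are terminated by capping, the fraction of full-length product falls with each cycle. The Python sketch below estimates that fraction; it is illustrative only. The function name and the 99% per-cycle coupling efficiency are assumptions, not values from this disclosure, and the calculation assumes the first nucleoside is already attached to the support, so an n-mer requires n − 1 couplings.

```python
def full_length_fraction(length, coupling_efficiency):
    """Expected fraction of chains reaching full length.

    Assumes the first nucleoside is pre-attached to the solid support, so a
    chain of `length` nucleotides needs (length - 1) successful couplings.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    if not 0.0 <= coupling_efficiency <= 1.0:
        raise ValueError("coupling_efficiency must be between 0 and 1")
    return coupling_efficiency ** (length - 1)


# Example: a 59-nucleotide target at an assumed 99% efficiency per cycle.
print(round(full_length_fraction(59, 0.99), 3))
```

Even a small loss per cycle compounds over a long chain, which is one reason the product is typically purified after synthesis.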


[0059] Any of the nucleoside phosphoramidite monomers can be labeled using standard phosphoramidite chemistry methods (Hwang et al., 1999, Proc. Natl. Acad. Sci. USA 96(23):12997-13002; Ojwang et al., 1997, Biochemistry 36:6033-6045 and references cited therein). Dye molecules useful for covalently coupling to phosphoramidites preferably comprise a primary hydroxyl group that is not part of the dye's chromophore. Illustrative dye molecules include, but are not limited to, disperse dye CAS 4439-31-0, disperse dye CAS 6054-58-6, disperse dye CAS 4392-69-2 (Sigma-Aldrich, St. Louis, Mo.), disperse red, and 1-pyrenebutanol (Molecular Probes, Eugene, Oreg.). Other dyes useful for coupling to phosphoramidites will be apparent to those of skill in the art, such as fluorescein, Cy3, and Cy5 fluorescent dyes, and may be purchased from, e.g., Sigma-Aldrich, St. Louis, Mo. or Molecular Probes, Inc., Eugene, Oreg.


[0060] In another embodiment, dye-labeled target RNA molecules are synthesized enzymatically using in vitro transcription (Hwang et al., 1999, Proc. Natl. Acad. Sci. USA 96(23):12997-13002 and references cited therein). In this embodiment, a template DNA is denatured by heating to about 90° C. and an oligonucleotide primer is annealed to the template DNA, for example by slow-cooling the mixture of the denatured template and the primer from about 90° C. to room temperature. A mixture of ribonucleoside-5′-triphosphates capable of supporting template-directed enzymatic extension of the primed template (e.g., a mixture including GTP, ATP, CTP, and UTP), including one or more dye-labeled ribonucleotides (Sigma-Aldrich, St. Louis, Mo.), is added to the primed template. Next, a polymerase enzyme is added to the mixture under conditions where the polymerase enzyme is active, which are well-known to those skilled in the art. A labeled polynucleotide is formed by the incorporation of the labeled ribonucleotides during polymerase-mediated strand synthesis.


[0061] In yet another embodiment of the invention, nucleic acid molecules are end-labeled after their synthesis. Methods for labeling the 5′-end of an oligonucleotide include but are by no means limited to: (i) periodate oxidation of a 5′-to-5′-coupled ribonucleotide, followed by reaction with an amine-reactive label (Heller & Morrison, 1985, in Rapid Detection and Identification of Infectious Agents, D. T. Kingsbury and S. Falkow, eds., pp. 245-256, Academic Press); (ii) condensation of ethylenediamine with 5′-phosphorylated polynucleotide, followed by reaction with an amine-reactive label (Morrison, European Patent Application 232 967); (iii) introduction of an aliphatic amine substituent using an aminohexyl phosphite reagent in solid-phase DNA synthesis, followed by reaction with an amine-reactive label (Cardullo et al., 1988, Proc. Natl. Acad. Sci. USA 85:8790-8794); and (iv) introduction of a thiophosphate group on the 5′-end of the nucleic acid, using phosphatase treatment followed by end-labeling with ATP-γS and kinase, which reacts specifically and efficiently with maleimide-labeled fluorescent dyes (Czworkowski et al., 1991, Biochem. 30:4821-4830).


[0062] A detectable label should not be incorporated into a target nucleic acid at the specific binding site at which test compounds are likely to bind, since the presence of a covalently attached label might interfere sterically or chemically with the binding of the test compounds at this site. Accordingly, if the region of the target nucleic acid that binds to a host cell factor is known, a detectable label is preferably incorporated into the nucleic acid molecule at one or more positions that are spatially or sequentially remote from the binding region.
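The sequence-based part of this placement rule can be sketched in Python as follows. The function name, the region boundaries, and the minimum gap of ten nucleotides are hypothetical choices made for illustration; the sketch considers only distance along the sequence and does not account for spatial proximity in the folded RNA.

```python
def remote_label_positions(length, binding_start, binding_end, min_gap):
    """Return 1-based positions at least `min_gap` nucleotides away from a
    binding region spanning `binding_start`..`binding_end` (inclusive)."""
    if not 1 <= binding_start <= binding_end <= length:
        raise ValueError("binding region must lie within the sequence")
    positions = []
    for pos in range(1, length + 1):
        if pos < binding_start:
            gap = binding_start - pos
        elif pos > binding_end:
            gap = pos - binding_end
        else:
            gap = 0  # inside the binding region
        if gap >= min_gap:
            positions.append(pos)
    return positions


# Illustrative values: a 59-nucleotide RNA with an assumed binding region
# at positions 22-25 and an assumed minimum gap of 10 nucleotides.
candidates = remote_label_positions(59, 22, 25, 10)
print(candidates[0], candidates[-1], len(candidates))
```

Positions returned by such a filter would still need to be checked against the known or predicted secondary structure, since nucleotides far apart in sequence can lie close together after folding.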


[0063] After synthesis, the labeled target nucleic acid can be purified using standard techniques known to those skilled in the art (see Hwang et al., 1999, Proc. Natl. Acad. Sci. USA 96(23):12997-13002 and references cited therein). Depending on the length of the target nucleic acid and the method of its synthesis, such purification techniques include, but are not limited to, reverse-phase high-performance liquid chromatography (“reverse-phase HPLC”), fast protein liquid chromatography (“FPLC”), and gel purification. After purification, the target RNA is refolded into its native conformation, preferably by heating to approximately 85-95° C. and slowly cooling to room temperature in a buffer, e.g., a buffer comprising about 50 mM Tris-HCl, pH 8 and 100 mM NaCl.


[0064] In another embodiment, the target nucleic acid can also be radiolabeled. A radiolabel, such as, but not limited to, an isotope of phosphorus, sulfur, or hydrogen, may be incorporated into a nucleotide, which is added either after or during the synthesis of the target nucleic acid. Methods for the synthesis and purification of radiolabeled nucleic acids are well known to one of skill in the art. See, e.g., Sambrook et al., 1989, in Molecular Cloning: A Laboratory Manual, pp 10.2-10.70, Cold Spring Harbor Laboratory Press, and the references cited therein, which are hereby incorporated by reference in their entireties.


[0065] In another embodiment, the target nucleic acid can be attached to an inorganic nanoparticle. A nanoparticle is a cluster of ions with controlled size from 0.1 to 1000 nm comprised of metals, metal oxides, or semiconductors including, but not limited to, Ag2S, ZnS, CdS, CdTe, Au, or TiO2. Nanoparticles have unique optical, electronic and catalytic properties relative to bulk materials which can be adjusted according to the size of the particle. Methods for the attachment of nucleic acids to nanoparticles are well known to one of skill in the art (see, e.g., Niemeyer, 2001, Angew. Chem. Int. Ed. 40:4129-4158, International Patent Publication WO/0218643, and the references cited therein, the disclosures of which are hereby incorporated by reference in their entireties).



5.3. Libraries of Small Molecules

[0066] Libraries screened using the methods of the present invention can comprise a variety of types of test compounds. In some embodiments, the test compounds are nucleic acid or peptide molecules. In a non-limiting example, peptide molecules can exist in a phage display library. In other embodiments, types of test compounds include, but are not limited to, peptide analogs including peptides comprising non-naturally occurring amino acids, e.g., D-amino acids, phosphorus analogs of amino acids, such as α-amino phosphonic acids and α-amino phosphinic acids, or amino acids having non-peptide linkages, nucleic acid analogs such as phosphorothioates and PNAs, hormones, antigens, synthetic or naturally occurring drugs, opiates, dopamine, serotonin, catecholamines, thrombin, acetylcholine, prostaglandins, organic molecules, pheromones, adenosine, sucrose, glucose, lactose and galactose. Libraries of polypeptides or proteins can also be used.


[0067] In a preferred embodiment, the combinatorial libraries are small organic molecule libraries, such as, but not limited to, benzodiazepines, isoprenoids, thiazolidinones, metathiazanones, pyrrolidines, morpholino compounds, and diazepindiones. In another embodiment, the combinatorial libraries comprise peptoids; random bio-oligomers; diversomers such as hydantoins, benzodiazepines and dipeptides; vinylogous polypeptides; nonpeptidal peptidomimetics; oligocarbamates; peptidyl phosphonates; peptide nucleic acid libraries; antibody libraries; or carbohydrate libraries. Combinatorial libraries are themselves commercially available (see, e.g., Advanced ChemTech Europe Ltd., Cambridgeshire, UK; ASINEX, Moscow, Russia; BioFocus plc, Sittingbourne, UK; Bionet Research (A division of Key Organics Limited), Camelford, UK; ChemBridge Corporation, San Diego, Calif.; ChemDiv Inc, San Diego, Calif.; ChemRx Advanced Technologies, South San Francisco, Calif.; ComGenex Inc., Budapest, Hungary; Evotec OAI Ltd, Abingdon, UK; IF LAB Ltd., Kiev, Ukraine; Maybridge plc, Cornwall, UK; PharmaCore, Inc., North Carolina; SIDDCO Inc, Tucson, Ariz.; TimTec Inc, Newark, Delaware; Tripos Receptor Research Ltd, Bude, UK; Toslab, Ekaterinburg, Russia).


[0068] In one embodiment, the combinatorial compound library for the methods of the present invention may be synthesized. There is a great interest in synthetic methods directed toward the creation of large collections of small organic compounds, or libraries, which could be screened for pharmacological, biological or other activity (Dolle, 2001, J. Comb. Chem. 3:477-517; Hall et al., 2001, J. Comb. Chem. 3:125-150; Dolle, 2000, J. Comb. Chem. 2:383-433; Dolle, 1999, J. Comb. Chem. 1:235-282). The synthetic methods applied to create vast combinatorial libraries are performed in solution or in the solid phase, i.e., on a solid support. Solid-phase synthesis makes it easier to conduct multi-step reactions and to drive reactions to completion with high yields because excess reagents can be easily added and washed away after each reaction step. Solid-phase combinatorial synthesis also tends to improve isolation, purification and screening. However, the more traditional solution phase chemistry supports a wider variety of organic reactions than solid-phase chemistry. Methods and strategies for the synthesis of combinatorial libraries can be found in A Practical Guide to Combinatorial Chemistry, A. W. Czarnik and S. H. Dewitt, eds., American Chemical Society, 1997; The Combinatorial Index, B. A. Bunin, Academic Press, 1998; Organic Synthesis on Solid Phase, F. Z. Dörwald, Wiley-VCH, 2000; and Solid-Phase Organic Syntheses, Vol. 1, A. W. Czarnik, ed., Wiley Interscience, 2001.


[0069] Combinatorial compound libraries of the present invention may be synthesized using apparatuses described in U.S. Pat. No. 6,358,479 to Frisina et al., U.S. Pat. No. 6,190,619 to Kilcoin et al., U.S. Pat. No. 6,132,686 to Gallup et al., U.S. Pat. No. 6,126,904 to Zuellig et al., U.S. Pat. No. 6,074,613 to Harness et al., U.S. Pat. No. 6,054,100 to Stanchfield et al., and U.S. Pat. No. 5,746,982 to Saneii et al. which are hereby incorporated by reference in their entirety. These patents describe synthesis apparatuses capable of holding a plurality of reaction vessels for parallel synthesis of multiple discrete compounds or for combinatorial libraries of compounds.


[0070] In one embodiment, the combinatorial compound library can be synthesized in solution. The method disclosed in U.S. Pat. No. 6,194,612 to Boger et al., which is hereby incorporated by reference in its entirety, features compounds useful as templates for solution phase synthesis of combinatorial libraries. The template is designed to permit reaction products to be easily purified from unreacted reactants using liquid/liquid or solid/liquid extractions. The compounds produced by combinatorial synthesis using the template will preferably be small organic molecules. Some compounds in the library may mimic the effects of non-peptides or peptides. In contrast to solid phase synthesis of combinatorial compound libraries, liquid phase synthesis does not require the use of specialized protocols for monitoring the individual steps of a multistep solid phase synthesis (Egner et al., 1995, J. Org. Chem. 60:2652; Anderson et al., 1995, J. Org. Chem. 60:2650; Fitch et al., 1994, J. Org. Chem. 59:7955; Look et al., 1994, J. Org. Chem. 49:7588; Metzger et al., 1993, Angew. Chem., Int. Ed. Engl. 32:894; Youngquist et al., 1994, Rapid Commun. Mass Spect. 8:77; Chu et al., 1995, J. Am. Chem. Soc. 117:5419; Brummel et al., 1994, Science 264:399; Stevanovic et al., 1993, Bioorg. Med. Chem. Lett. 3:431).


[0071] Combinatorial compound libraries useful for the methods of the present invention can be synthesized on solid supports. In one embodiment, a split synthesis method, a protocol of separating and mixing solid supports during the synthesis, is used to synthesize a library of compounds on solid supports (see Lam et al., 1997, Chem. Rev. 97:41-448; Ohlmeyer et al., 1993, Proc. Natl. Acad. Sci. USA 90:10922-10926 and references cited therein). Each solid support in the final library has substantially one type of test compound attached to its surface. Other methods for synthesizing combinatorial libraries on solid supports, wherein one product is attached to each support, will be known to those of skill in the art (see, e.g., Nefzi et al., 1997, Chem. Rev. 97:449-472 and U.S. Pat. No. 6,087,186 to Cargill et al. which are hereby incorporated by reference in their entirety).
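The split synthesis protocol described above can be illustrated with a minimal simulation. This is a sketch only: the function name `split_and_pool`, the building-block labels, the bead count and the random seed are hypothetical, and every coupling is assumed to go to completion. It shows why each bead ends up carrying a single compound while the pooled library covers the combinations of building blocks.

```python
import random


def split_and_pool(n_beads, building_blocks_per_step, seed=0):
    """Simulate split synthesis; return the compound (tuple of blocks) on each bead."""
    rng = random.Random(seed)
    beads = [() for _ in range(n_beads)]  # every bead starts with no building block
    for blocks in building_blocks_per_step:
        # Split: distribute each bead at random into one reaction vessel per block.
        vessels = {block: [] for block in blocks}
        for bead in beads:
            vessels[rng.choice(blocks)].append(bead)
        # Couple the vessel's block to every bead in it, then pool all beads again.
        beads = [bead + (block,)
                 for block, members in vessels.items()
                 for bead in members]
    return beads


# Three coupling steps with three hypothetical building blocks each.
steps = [["A1", "A2", "A3"], ["B1", "B2", "B3"], ["C1", "C2", "C3"]]
library = split_and_pool(2000, steps)
distinct_compounds = set(library)
```

With three steps of three blocks each, the pooled beads can carry at most 3 x 3 x 3 = 27 distinct compounds, and each bead carries exactly one of them.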


[0072] As used herein, the term “solid support” is not limited to a specific type of solid support. Rather a large number of supports are available and are known to one skilled in the art. Solid supports include silica gels, resins, derivatized plastic films, glass beads, cotton, plastic beads, polystyrene beads, alumina gels, and polysaccharides. A suitable solid support may be selected on the basis of desired end use and suitability for various synthetic protocols. For example, for peptide synthesis, a solid support can be a resin such as p-methylbenzhydrylamine (PMBHA) resin (Peptides International, Louisville, Ky.), polystyrenes (e.g., PAM-resin obtained from Bachem Inc., Peninsula Laboratories, etc.), including chloromethylpolystyrene, hydroxymethylpolystyrene and aminomethylpolystyrene, poly (dimethylacrylamide)-grafted styrene co-divinyl-benzene (e.g., POLYHIPE resin, obtained from Aminotech, Canada), polyamide resin (obtained from Peninsula Laboratories), polystyrene resin grafted with polyethylene glycol (e.g., TENTAGEL or ARGOGEL, Bayer, Tubingen, Germany), polydimethylacrylamide resin (obtained from Milligen/Biosearch, California), or Sepharose (Pharmacia, Sweden).


[0073] In one embodiment, the solid phase support is suitable for in vivo use, i.e., it can serve as a carrier or support for administration of the test compound to a patient (e.g., TENTAGEL, Bayer, Tubingen, Germany). In a particular embodiment, the solid support is palatable and/or orally ingestible.


[0074] In some embodiments of the present invention, compounds can be attached to solid supports via linkers. Linkers can be integral and part of the solid support, or they may be nonintegral, in which case they are either synthesized on the solid support or attached thereto after synthesis. Linkers are useful not only for providing points of test compound attachment to the solid support, but also for allowing different groups of molecules to be cleaved from the solid support under different conditions, depending on the nature of the linker. For example, linkers can be, inter alia, electrophilically cleaved, nucleophilically cleaved, photocleavable, enzymatically cleaved, cleaved by metals, cleaved under reductive conditions or cleaved under oxidative conditions.


[0075] In another embodiment, the combinatorial compound libraries can be assembled in situ using dynamic combinatorial chemistry as described in European Patent Application 1,118,359 A1 to Lehn; Huc & Nguyen, 2001, Comb. Chem. High Throughput. Screen. 4:53-74; Lehn and Eliseev, 2001, Science 291:2331-2332; Cousins et al., 2000, Curr. Opin. Chem. Biol. 4: 270-279; and Karan & Miller, 2000, Drug. Disc. Today 5:67-75, which are incorporated by reference in their entirety.


[0076] Dynamic combinatorial chemistry uses non-covalent interaction with a target biomolecule, including but not limited to a protein, RNA, or DNA, to favor assembly of the most tightly binding molecule that is a combination of constituent subunits present as a mixture in the presence of the biomolecule. According to the laws of thermodynamics, when a collection of molecules is able to combine and recombine at equilibrium through reversible chemical reactions in solution, molecules, preferably one molecule, that bind most tightly to a templating biomolecule will be present in greater amount than all other possible combinations. The reversible chemical reactions include, but are not limited to, imine, acyl-hydrazone, amide, acetal, or ester formation between carbonyl-containing compounds and amines, hydrazines, or alcohols; thiol exchange between disulfides; alcohol exchange in borate esters; Diels-Alder reactions; thermal- or photoinduced sigmatropic or electrocyclic rearrangements; or Michael reactions.
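The thermodynamic argument above can be illustrated numerically. The following is a minimal sketch under simplifying assumptions that are not part of this disclosure: all library members interconvert freely, the free concentration of the templating biomolecule is treated as constant (template in excess), and each member binds the template with a simple 1:1 association constant. The member names, association constants and template concentration are hypothetical.

```python
def templated_distribution(intrinsic_weights, association_constants, free_template_molar):
    """Return the equilibrium fraction (free + template-bound) of each library member.

    At equilibrium the free forms keep their intrinsic ratios, and the bound form
    of member i equals K_i * [template] * [free_i], so the total amount of member i
    is proportional to weight_i * (1 + K_i * [template]).
    """
    totals = {
        name: intrinsic_weights[name]
        * (1.0 + association_constants[name] * free_template_molar)
        for name in intrinsic_weights
    }
    norm = sum(totals.values())
    return {name: value / norm for name, value in totals.items()}


# Hypothetical three-member library, equally stable in the absence of template.
weights = {"A-B": 1.0, "A-C": 1.0, "B-C": 1.0}
K_assoc = {"A-B": 1e4, "A-C": 1e6, "B-C": 1e3}  # M^-1, illustrative values only

no_template = templated_distribution(weights, K_assoc, 0.0)
with_template = templated_distribution(weights, K_assoc, 1e-5)  # 10 uM free target
```

In this toy case the three members are equally populated without template, while in the presence of the template the tightest binder ("A-C") accounts for most of the mixture.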


[0077] In the preferred embodiment of this technique, the constituent components of the dynamic combinatorial compound library are allowed to combine and reach equilibrium in the absence of the target RNA and then incubated in the presence of the target RNA, preferably at physiological conditions, until a second equilibrium is reached. The second, perturbed, equilibrium (the so-called “templated mixture”) can, but need not necessarily, be fixed by a further chemical transformation, including but not limited to reduction, oxidation, hydrolysis, acidification, or basification, to prevent restoration of the original equilibrium when the dynamic combinatorial compound library is separated from the target RNA.


[0078] In the preferred embodiment of this technique, the predominant product or products of the templated dynamic combinatorial library can be separated from the minor products and directly identified. In another embodiment, the identity of the predominant product or products can be determined by a deconvolution strategy involving preparation of derivative dynamic combinatorial libraries, as described in European Patent Application 1,118,359 A1, which is incorporated by reference in its entirety, whereby each component of the mixture is, preferably one-by-one but possibly group-wise, left out of the mixture and the ability of the derivative library mixture at chemical equilibrium to bind the target RNA is measured. The components whose removal most greatly reduces the ability of the derivative dynamic combinatorial library to bind the target RNA are likely the components of the predominant product or products in the original dynamic combinatorial library.
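The one-by-one deconvolution strategy can be sketched as a leave-one-out loop. This is an illustrative sketch, not the procedure of the cited application: `measure_binding` stands for whatever binding assay is used and is supplied by the caller, and the `mock_assay` function, the component names and the signal values below are hypothetical stand-ins for real measurements.

```python
def deconvolute(components, measure_binding):
    """Rank components by how much leaving each one out reduces target binding."""
    full_library = frozenset(components)
    full_signal = measure_binding(full_library)
    drops = {}
    for left_out in components:
        derivative_library = full_library - {left_out}
        drops[left_out] = full_signal - measure_binding(derivative_library)
    # Largest drop first: these components most likely form the predominant product.
    return sorted(drops.items(), key=lambda item: item[1], reverse=True)


def mock_assay(library):
    """Hypothetical assay: strong binding only if both key components are present."""
    signal = 0.05 * len(library)  # weak background contribution per component
    if {"amine_2", "aldehyde_5"} <= library:
        signal += 1.0
    return signal


components = ["amine_1", "amine_2", "aldehyde_4", "aldehyde_5"]
ranking = deconvolute(components, mock_assay)
top_two = {name for name, _ in ranking[:2]}
```

Group-wise omission would follow the same pattern, with sets of components removed in each derivative library instead of a single component.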



5.4. Library Screening

[0079] After a target nucleic acid, such as but not limited to RNA or DNA, is labeled and a test compound library is synthesized or purchased or both, the labeled target nucleic acid is used to screen the library to identify test compounds that bind to the nucleic acid. Screening comprises contacting a labeled target nucleic acid with an individual, or small group, of the components of the compound library. Preferably, the contacting occurs in an aqueous solution, and most preferably, under physiologic conditions. The aqueous solution preferably stabilizes the labeled target nucleic acid and prevents denaturation or degradation of the nucleic acid without interfering with binding of the test compounds. The aqueous solution can be similar to the solution in which a complex between the target RNA and its corresponding host cell factor (if known) is formed in vitro. For example, TK buffer, which is commonly used to form Tat protein-TAR RNA complexes in vitro, can be used in the methods of the invention as an aqueous solution to screen a library of test compounds for TAR RNA binding compounds.


[0080] The methods of the present invention for screening a library of test compounds preferably comprise contacting a test compound with a target nucleic acid in the presence of an aqueous solution, the aqueous solution comprising a buffer and a combination of salts, preferably approximating or mimicking physiologic conditions. The aqueous solution optionally further comprises non-specific nucleic acids, such as, but not limited to, DNA; yeast tRNA; salmon sperm DNA; homoribopolymers such as, but not limited to, poly IC, polyA, polyU, and polyC; and non-specific RNA. The non-specific RNA may be an unlabeled target nucleic acid having a mutation at the binding site, which renders the unlabeled nucleic acid incapable of interacting with a test compound at that site. For example, if dye-labeled TAR RNA is used to screen a library, unlabeled TAR RNA having a mutation in the uracil 23/cytosine 24 bulge region may also be present in the aqueous solution. Without being bound by any theory, the addition of unlabeled RNA that is essentially identical to the dye-labeled target RNA except for a mutation at the binding site might minimize interactions of other regions of the dye-labeled target RNA with test compounds or with the solid support and prevent false positive results.


[0081] The solution further comprises a buffer, a combination of salts, and optionally, a detergent or a surfactant. The pH of the solution typically ranges from about 5 to about 8, preferably from about 6 to about 8, most preferably from about 6.5 to about 8. A variety of buffers may be used to achieve the desired pH. Suitable buffers include, but are not limited to, Tris, Mes, Bis-Tris, Ada, Aces, Pipes, Mopso, Bis-Tris propane, Bes, Mops, Tes, Hepes, Dipso, Mobs, Tapso, Trizma, Heppso, Popso, TEA, Epps, Tricine, Gly-Gly, Bicine, and sodium-potassium phosphate. The buffering agent is present at from about 10 mM to about 100 mM, preferably from about 25 mM to about 75 mM, most preferably from about 40 mM to about 60 mM. The pH of the aqueous solution can be optimized for different screening reactions, depending on the target RNA used and the types of test compounds in the library, and therefore, the type and amount of the buffer used in the solution can vary from screen to screen. In a preferred embodiment, the aqueous solution has a pH of about 7.4, which can be achieved using about 50 mM Tris buffer.


[0082] In addition to an appropriate buffer, the aqueous solution further comprises a combination of salts, from about 0 mM to about 100 mM KCl, from about 0 mM to about 1 M NaCl, and from about 0 mM to about 200 mM MgCl2. In a preferred embodiment, the combination of salts is about 100 mM KCl, 500 mM NaCl, and 10 mM MgCl2. Without being bound by any theory, Applicant has found that a combination of KCl, NaCl, and MgCl2 stabilizes the target RNA such that most of the RNA is not denatured or digested over the course of the screening reaction. The optimal concentration of each salt used in the aqueous solution is dependent on the particular target RNA used and can be determined using routine experimentation.


[0083] The solution optionally comprises from about 0.01% to about 0.5% (w/v) of a detergent or a surfactant. Without being bound by any theory, a small amount of detergent or surfactant in the solution might reduce non-specific binding of the target RNA to the solid support and control aggregation and increase stability of target RNA molecules. Typical detergents useful in the methods of the present invention include, but are not limited to, anionic detergents, such as salts of deoxycholic acid, 1-heptanesulfonic acid, N-laurylsarcosine, lauryl sulfate, 1-octane sulfonic acid and taurocholic acid; cationic detergents such as benzalkonium chloride, cetylpyridinium, methylbenzethonium chloride, and decamethonium bromide; zwitterionic detergents such as CHAPS, CHAPSO, alkyl betaines, alkyl amidoalkyl betaines, N-dodecyl-N,N-dimethyl-3-ammonio-1-propanesulfonate, and phosphatidylcholine; and non-ionic detergents such as n-decyl α-D-glucopyranoside, n-decyl β-D-maltopyranoside, n-dodecyl β-D-maltoside, n-octyl β-D-glucopyranoside, sorbitan esters, n-tetradecyl β-D-maltoside, octylphenoxy polyethoxyethanol (Nonidet P-40), nonylphenoxypolyethoxyethanol (NP-40), and tritons. Preferably, the detergent, if present, is a nonionic detergent. Typical surfactants useful in the methods of the present invention include, but are not limited to, ammonium lauryl sulfate, polyethylene glycols, butyl glucoside, decyl glucoside, Polysorbate 80, lauric acid, myristic acid, palmitic acid, potassium palmitate, undecanoic acid, lauryl betaine, and lauryl alcohol. More preferably, the detergent, if present, is Triton X-100 and present in an amount of about 0.1% (w/v).
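As a worked example of preparing the preferred aqueous solution stated above (about 50 mM Tris at pH 7.4; about 100 mM KCl, 500 mM NaCl and 10 mM MgCl2; about 0.1% (w/v) Triton X-100), the dilution arithmetic can be sketched as follows. The stock concentrations and the 10 mL batch size are assumptions chosen for illustration and are not specified in this disclosure; the function name `stock_volumes` is likewise hypothetical.

```python
def stock_volumes(final_volume_ml, targets, stocks):
    """Return the volume (mL) of each stock needed, plus water to final volume.

    Uses C_stock * V_stock = C_final * V_final for each component. Target and
    stock values for a given component must share the same unit (mM or % w/v).
    """
    volumes = {}
    for name, target in targets.items():
        stock = stocks[name]
        if target > stock:
            raise ValueError(f"stock of {name} is too dilute for the target")
        volumes[name] = final_volume_ml * target / stock
    water = final_volume_ml - sum(volumes.values())
    if water < 0:
        raise ValueError("stocks are too dilute to reach the final volume")
    volumes["water"] = water
    return volumes


# Preferred final concentrations from the text (mM, except Triton in % w/v).
targets = {"Tris pH 7.4": 50, "KCl": 100, "NaCl": 500, "MgCl2": 10, "Triton X-100": 0.1}
# Assumed stock solutions (hypothetical): 1 M Tris, 2 M KCl, 5 M NaCl, 1 M MgCl2, 10% Triton.
stocks = {"Tris pH 7.4": 1000, "KCl": 2000, "NaCl": 5000, "MgCl2": 1000, "Triton X-100": 10}

recipe = stock_volumes(10.0, targets, stocks)
```

The same function can be reused when the salt or buffer concentrations are re-optimized for a different target RNA, by changing only the `targets` dictionary.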


[0084] Non-specific binding of a labeled target nucleic acid to test compounds can be further minimized by treating the binding reaction with one or more blocking agents. In one embodiment, the binding reactions are treated with a blocking agent, e.g., bovine serum albumin (“BSA”), before being contacted with the labeled target nucleic acid. In another embodiment, the binding reactions are treated sequentially with at least two different blocking agents. This blocking step is preferably performed at room temperature for from about 0.5 to about 3 hours. In a subsequent step, the reaction mixture is further treated with unlabeled RNA having a mutation at the binding site. This blocking step is preferably performed at about 4° C. for from about 12 hours to about 36 hours before addition of the dye-labeled target RNA. Preferably, the solution used in the one or more blocking steps is substantially similar to the aqueous solution used to screen the library with the dye-labeled target RNA, e.g., in pH and salt concentration.


[0085] Once contacted, the mixture of labeled target nucleic acid and the test compound is preferably maintained at 4° C. for from about 1 day to about 5 days, preferably from about 2 days to about 3 days, with constant agitation. To identify the reactions in which binding to the labeled target nucleic acid occurred, after the incubation period, bound compounds are distinguished from free compounds using an electrophoretic technique (see Section 5.5.1), or any of the methods disclosed in Section 5.5 infra. In another embodiment, the complexed target nucleic acid does not need to be separated from the free target nucleic acid if a technique (e.g., spectrometry) that differentiates between bound and unbound target nucleic acids is used.


[0086] The methods for identifying small molecules bound to labeled nucleic acid will vary with the type of label on the target nucleic acid. For example, if a target RNA is labeled with a visible or fluorescent dye, the target RNA complexes are preferably identified using a chromatographic technique that separates bound from free target by an electrophoretic or size differential technique using individual reactions. The reactions corresponding to changes in the migration of the complexed RNA can be cross-referenced to the small molecule compound(s) added to said reaction. Alternatively, complexed target RNA can be screened en masse and then separated from free target RNA using an electrophoretic or size differential technique; the resultant complexed target is then analyzed using a mass spectrometric technique. In this fashion the bound small molecule can be identified on the basis of its molecular weight. This approach requires a priori knowledge of the exact molecular weights of all compounds within the library. In another embodiment, the test compounds bound to the target nucleic acid may not require separation from the unbound target nucleic acid if a technique such as, but not limited to, spectrometry is used.
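Identification of a bound small molecule by molecular weight amounts to a lookup of the observed mass against the known masses of the library members. The sketch below assumes that a ligand mass has already been extracted from the mass spectrum; the compound identifiers, the masses and the 10 ppm tolerance are hypothetical and serve only to show the matching step.

```python
def match_by_mass(observed_mass, library_masses, tolerance_ppm=10.0):
    """Return library members whose known mass matches the observed mass.

    Each match is (name, error_ppm); results are sorted by absolute error.
    """
    matches = []
    for name, known_mass in library_masses.items():
        error_ppm = (observed_mass - known_mass) / known_mass * 1e6
        if abs(error_ppm) <= tolerance_ppm:
            matches.append((name, error_ppm))
    return sorted(matches, key=lambda match: abs(match[1]))


# Hypothetical library with known exact masses (Da).
library_masses = {
    "cmpd_001": 312.1474,
    "cmpd_002": 312.1838,
    "cmpd_003": 455.2420,
}

hits = match_by_mass(312.1480, library_masses)
hit_names = [name for name, _ in hits]
```

If two library members have masses closer together than the instrument tolerance, the lookup returns both, and a further experiment would be needed to tell them apart.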



5.5. Separation Methods for Screening Test Compounds

[0087] Any method that detects an altered physical property of a target nucleic acid complexed to a test compound relative to the unbound target nucleic acid may be used for separation of the complexed and non-complexed target nucleic acids. Methods that can be utilized for the physical separation of complexed target RNA from unbound target RNA include, but are not limited to, electrophoresis, fluorescence spectroscopy, surface plasmon resonance, mass spectrometry, scintillation proximity assay, structure-activity relationships (“SAR”) by NMR spectroscopy, size exclusion chromatography, affinity chromatography, and nanoparticle aggregation.



5.5.1. Electrophoresis

[0088] Methods for separation of the complex of a target RNA bound to a test compound from the unbound RNA comprise any method of electrophoretic separation, including but not limited to, denaturing and non-denaturing polyacrylamide gel electrophoresis, urea gel electrophoresis, gel filtration, pulsed field gel electrophoresis, two dimensional gel electrophoresis, continuous flow electrophoresis, zone electrophoresis, agarose gel electrophoresis, and capillary electrophoresis.


[0089] In a preferred embodiment, an automated electrophoretic system comprising a capillary cartridge having a plurality of capillary tubes is used for high-throughput screening of test compounds bound to target RNA. Such an apparatus for performing automated capillary gel electrophoresis is disclosed in U.S. Pat. Nos. 5,885,430; 5,916,428; 6,027,627; and 6,063,251, the disclosures of which are incorporated by reference in their entireties.


[0090] The device disclosed in U.S. Pat. No. 5,885,430, which is incorporated by reference in its entirety, allows one to simultaneously introduce samples into a plurality of capillary tubes directly from microtiter trays having a standard size. U.S. Pat. No. 5,885,430 discloses a disposable capillary cartridge which can be cleaned between electrophoresis runs, the cartridge having a plurality of capillary tubes. A first end of each capillary tube is retained in a mounting plate, the first ends collectively forming an array in the mounting plate. The spacing between the first ends corresponds to the spacing between the centers of the wells of a microtiter tray having a standard size. Thus, the first ends of the capillary tubes can simultaneously be dipped into the samples present in the tray's wells. The cartridge is provided with a second mounting plate in which the second ends of the capillary tubes are retained. The second ends of the capillary tubes are arranged in an array which corresponds to the wells in the microtiter tray, which allows for each capillary tube to be isolated from its neighbors and therefore free from cross-contamination, as each end is dipped into an individual well.


[0091] Plate holes may be provided in each mounting plate and the capillary tubes inserted through these plate holes. In such a case, the plate holes are sealed airtight so that the side of the mounting plate having the exposed capillary ends can be pressurized. Application of a positive pressure in the vicinity of the capillary openings in this mounting plate allows for the introduction of air and fluids during electrophoretic operations and also can be used to force out gel and other materials from the capillary tubes during reconditioning. The capillary tubes may be protected from damage using a needle comprising a cannula and/or plastic tubes, and the like when they are placed in these plate holes. When metallic cannula or the like are used, they can serve as electrical contacts for current flow during electrophoresis. In the presence of a second mounting plate, the second mounting plate is provided with plate holes through which the second ends of the capillary tubes project. In this instance, the second mounting plate serves as a pressure containment member of a pressure cell and the second ends of the capillary tubes communicate with an internal cavity of the pressure cell. The pressure cell is also formed with an inlet and an outlet. Gels, buffer solutions, cleaning agents, and the like may be introduced into the internal cavity through the inlet, and each of these can simultaneously enter the second ends of the capillaries.


[0092] In another preferred embodiment, the automated electrophoretic system can comprise a chip system consisting of complex designs of interconnected channels that perform and analyze enzyme reactions using part of a channel design as a tiny, continuously operating electrophoresis material, where reactions with one sample are going on in one area of the chip while electrophoretic separation of the products of another sample is taking place in a different part of the chip. Such a system is disclosed in U.S. Pat. Nos. 5,699,157; 5,842,787; 5,869,004; 5,876,675; 5,942,443; 5,948,227; 6,042,709; 6,042,710; 6,046,056; 6,048,498; 6,086,740; 6,132,685; 6,150,119; 6,150,180; 6,153,073; 6,167,910; 6,171,850; and 6,186,660, the disclosures of which are incorporated by reference in their entireties.


[0093] The system disclosed in U.S. Pat. No. 5,699,157, which is hereby incorporated by reference in its entirety, provides for a microfluidic system for high-speed electrophoretic analysis of subject materials for applications in the fields of chemistry, biochemistry, biotechnology, molecular biology and numerous other areas. The system has a channel in a substrate, a light source and a photoreceptor. The channel holds subject materials in solution in an electric field so that the materials move through the channel and separate into bands according to species. The light source excites fluorescent light in the species bands and the photoreceptor is arranged to receive the fluorescent light from the bands. The system further has a means for masking the channel so that the photoreceptor can receive the fluorescent light only at periodically spaced regions along the channel. The system also has a unit connected to analyze the modulation frequencies of light intensity received by the photoreceptor so that velocities of the bands along the channel are determined, which allows the materials to be analyzed.


[0094] The system disclosed in U.S. Pat. No. 5,699,157 also provides for a method of performing high-speed electrophoretic analysis of subject materials, which comprises the steps of holding the subject materials in solution in a channel of a microfluidic system; subjecting the materials to an electric field so that the subject materials move through the channel and separate into species bands; directing light toward the channel; receiving light from periodically spaced regions along the channel simultaneously; and analyzing the frequencies of light intensity of the received light so that velocities of the bands along the channel can be determined for analysis of said materials. The determination of the velocity of a species band determines the electrophoretic mobility of the species and its identification.
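The relationship between modulation frequency, band velocity and electrophoretic mobility can be illustrated with a short calculation. This is a sketch of the underlying arithmetic only, not the signal processing of the cited patent: a band moving at velocity v past detection regions spaced a distance d apart modulates the detected light at frequency f = v/d, and the mobility is the velocity divided by the field strength. The mask period, field strength, sampling rate and synthetic signal below are hypothetical.

```python
import cmath
import math


def dominant_frequency(samples, sample_rate_hz):
    """Return the strongest non-zero frequency (Hz) in a signal using a naive DFT."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove the constant background
    best_bin, best_power = 0, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_bin, best_power = k, power
    return best_bin * sample_rate_hz / n


def band_velocity(modulation_hz, mask_period_m):
    """Velocity (m/s) of a band crossing one detection region per modulation cycle."""
    return modulation_hz * mask_period_m


def electrophoretic_mobility(velocity_m_per_s, field_v_per_m):
    """Mobility in m^2/(V*s): velocity per unit electric field strength."""
    return velocity_m_per_s / field_v_per_m


MASK_PERIOD_M = 100e-6   # assumed spacing of the detection regions (100 um)
FIELD_V_PER_M = 2.0e4    # assumed field strength (200 V/cm)
SAMPLE_RATE_HZ = 100.0   # assumed detector sampling rate

# Synthetic detector trace for one band moving at 0.5 mm/s (2 s of data).
true_velocity = 5.0e-4
signal = [1.0 + math.cos(2 * math.pi * (true_velocity / MASK_PERIOD_M) * (i / SAMPLE_RATE_HZ))
          for i in range(200)]

frequency = dominant_frequency(signal, SAMPLE_RATE_HZ)
velocity = band_velocity(frequency, MASK_PERIOD_M)
mobility = electrophoretic_mobility(velocity, FIELD_V_PER_M)
```

A mixture of species would give several modulation frequencies, one per band velocity; the naive transform above reports only the strongest one and would need to be extended to list all peaks.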


[0095] U.S. Pat. No. 5,842,787, which is hereby incorporated by reference in its entirety, is generally directed to devices and systems that employ channels having, at least in part, depths that are varied over those which have been previously described (such as the device disclosed in U.S. Pat. No. 5,699,157), wherein said channel depths provide numerous beneficial and unexpected results such as, but not limited to, a reduction in sample perturbation, reduced non-specific sample mixing by diffusion, and increased resolution.


[0096] In another embodiment, the electrophoretic method of separation comprises polyacrylamide gel electrophoresis. In a preferred embodiment, the polyacrylamide gel electrophoresis is non-denaturing, so as to differentiate the mobilities of the target RNA bound to a test compound from free target RNA. If the polyacrylamide gel electrophoresis is denaturing, then the target RNA:test compound complex must be cross-linked prior to electrophoresis to prevent the disassociation of the target RNA from the test compound during electrophoresis. Such techniques are well known to one of skill in the art.


[0097] In one embodiment of the method, the binding of test compounds to target nucleic acid can be detected, preferably in an automated fashion, by gel electrophoretic analysis of interference footprinting. RNA can be degraded at specific base sites by enzymatic methods such as ribonucleases A, U2, CL3, T1, Phy M, and B. cereus or chemical methods such as diethylpyrocarbonate, sodium hydroxide, hydrazine, piperidine formate, dimethyl sulfate, [2,12-dimethyl-3,7,11,17-tetraazacyclo[11.3.1]heptadeca-1(17),2,11,13,15-pentaenato] nickel(II) (NiCR), cobalt(II)chloride, or iron(II) ethylenediaminetetraacetate (Fe-EDTA) as described for example in Zheng et al., 1999, Biochem. 37:2207-2214; Latham & Cech, 1989, Science 245:276-282; and Sambrook et al., 2001, in Molecular Cloning: A Laboratory Manual, pp 12.61-12.73, Cold Spring Harbor Laboratory Press, and the references cited therein, which are hereby incorporated by reference in their entireties. The specific pattern of cleavage sites is determined by the accessibility of particular bases to the reagent employed to initiate cleavage and, as such, is determined by the three-dimensional structure of the RNA.


[0098] The interaction of small molecules with a target nucleic acid can change the accessibility of bases to these cleavage reagents either by causing conformational changes in the target nucleic acid or by covering a base at the binding interface. When a test compound binds to the nucleic acid and changes the accessibility of bases to cleavage reagents, the observed cleavage pattern will change. This method can be used to identify and characterize the binding of small molecules to RNA as described, for example, by Prudent et al., 1995, J. Am. Chem. Soc. 117:10145-10146 and Mei et al., 1998, Biochem. 37:14204-14212.


[0099] In the preferred embodiment of this technique, the detectably labeled target nucleic acid is incubated with an individual test compound and then subjected to treatment with a cleavage reagent, either enzymatic or chemical. The reaction mixture can preferably be examined directly, or treated further to isolate and concentrate the nucleic acid. The fragments produced are separated by electrophoresis and the pattern of cleavage can be compared to a cleavage reaction performed in the absence of test compound. A change in the cleavage pattern directly indicates that the test compound binds to the target nucleic acid. Multiple test compounds can be examined both in parallel and serially.
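The comparison of cleavage patterns with and without test compound can be sketched as a per-position ratio of band intensities. The following is an illustrative sketch, not a validated analysis method: it assumes band intensities have already been quantified per nucleotide position, normalizes each lane to its median band to correct for loading differences, and flags positions that change by at least a chosen fold. The intensities, positions and 2-fold threshold are hypothetical.

```python
from statistics import median


def footprint_changes(control, treated, fold=2.0, pseudocount=1e-9):
    """Flag positions whose normalized cleavage intensity changes by >= `fold`.

    `control` and `treated` map nucleotide position -> band intensity for the
    lanes without and with test compound, respectively.
    """
    control_ref = median(control.values())
    treated_ref = median(treated.values())
    calls = {}
    for position, control_raw in control.items():
        c = control_raw / control_ref
        t = treated.get(position, 0.0) / treated_ref
        ratio = (t + pseudocount) / (c + pseudocount)
        if ratio <= 1.0 / fold:
            calls[position] = "protected"   # cleavage reduced by the compound
        elif ratio >= fold:
            calls[position] = "enhanced"    # cleavage increased by the compound
    return calls


# Hypothetical band intensities (arbitrary units) for seven adjacent positions.
control_lane = {20: 100, 21: 110, 22: 95, 23: 120, 24: 115, 25: 105, 26: 100}
treated_lane = {20: 98, 21: 105, 22: 90, 23: 20, 24: 25, 25: 100, 26: 97}

calls = footprint_changes(control_lane, treated_lane)
```

Median normalization is used here because normalizing to the lane total would be distorted when a compound protects several strong bands at once; other normalization schemes are equally possible.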


[0100] Other embodiments of electrophoretic separation include, but are not limited to, urea gel electrophoresis, gel filtration, pulsed field gel electrophoresis, two dimensional gel electrophoresis, continuous flow electrophoresis, zone electrophoresis, and agarose gel electrophoresis.



5.5.2. Fluorescence Spectroscopy

[0101] In a preferred embodiment, fluorescence polarization spectroscopy, an optical detection method that can differentiate the proportion of a fluorescent molecule that is either bound or unbound in solution (e.g., the labeled target nucleic acid of the present invention), can be used to read reaction results without electrophoretic separation of the samples. Fluorescence polarization spectroscopy can be used to read the reaction results in the chip system disclosed in U.S. Pat. Nos. 5,699,157; 5,842,787; 5,869,004; 5,876,675; 5,942,443; 5,948,227; 6,042,709; 6,042,710; 6,046,056; 6,048,498; 6,086,740; 6,132,685; 6,150,119; 6,150,180; 6,153,073; 6,167,910; 6,171,850; and 6,186,660, the disclosures of which are incorporated by reference in their entireties. The application of fluorescence polarization spectroscopy to the chip system disclosed in the U.S. Patents listed supra is fast, efficient, and well-adapted for high-throughput screening.
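How a polarization measurement reports the bound proportion of a labeled molecule can be shown with a short calculation. The sketch below uses fluorescence anisotropy, a closely related quantity that is additive across species, and makes assumptions not stated in this disclosure: the label's brightness does not change on binding, and the anisotropy values of fully free and fully bound labeled nucleic acid have been measured separately. All intensities and reference values are hypothetical.

```python
def anisotropy(i_parallel, i_perpendicular, g_factor=1.0):
    """Fluorescence anisotropy from parallel and perpendicular emission intensities."""
    corrected_perp = g_factor * i_perpendicular  # instrument correction factor
    return (i_parallel - corrected_perp) / (i_parallel + 2.0 * corrected_perp)


def fraction_bound(r_observed, r_free, r_bound):
    """Fraction of labeled molecule bound, assuming equal brightness of both states."""
    return (r_observed - r_free) / (r_bound - r_free)


# Hypothetical reference values and one hypothetical well reading.
R_FREE = 0.05
R_BOUND = 0.25
r_well = anisotropy(i_parallel=1300.0, i_perpendicular=850.0)
bound = fraction_bound(r_well, R_FREE, R_BOUND)
```

In this example the well reads an anisotropy midway between the two reference values, which corresponds to about half of the labeled molecule being bound.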


[0102] In another embodiment, a compound that has an affinity for the target nucleic acid of interest can be labeled with a fluorophore to screen for test compounds that bind to the target nucleic acid. For example, a pyrene-containing aminoglycoside analog was used to accurately monitor antagonist binding to a prokaryotic 16S rRNA A site (which comprises the natural target for aminoglycoside antibiotics) in a screen using a fluorescence quenching technique in a 96-well plate format (Hamasaki & Rando, 1998, Anal. Biochem. 261(2):183-90).


[0103] In another embodiment, fluorescence resonance energy transfer (FRET) can be used to screen for test compounds that bind to the target nucleic acid. FRET, a characteristic change in fluorescence, occurs when two fluorophores with overlapping emission and excitation wavelength bands are held together in close proximity, such as by a binding event. In the preferred embodiment, the fluorophore on the target nucleic acid and the fluorophore on the test compound will have overlapping excitation and emission spectra such that one fluorophore (the donor) transfers its emission energy to excite the other fluorophore (the acceptor). The acceptor preferably emits light of a different wavelength upon relaxing to the ground state, or relaxes non-radiatively to quench fluorescence. FRET is very sensitive to the distance between the two fluorophores, and allows measurement of molecular distances less than 10 nm. For example, U.S. Pat. No. 6,337,183 to Arenas et al., which is incorporated by reference in its entirety, describes a screen for compounds that bind RNA that uses FRET to measure the effect of test compounds on the stability of a target RNA molecule where the target RNA is labeled with both fluorescent acceptor and donor molecules and the distance between the two fluorophores as determined by FRET provides a measure of the folded structure of the RNA. Matsumoto et al. (2000, Bioorg. Med. Chem. Lett. 10:1857-1861) describe a system where a peptide that binds to HIV-1 TAR RNA is labeled on one end with a fluorescein fluorophore and a tetramethylrhodamine on the other end. The conformational change of the peptide upon binding to the RNA provided a FRET signal to screen for compounds that bound to the TAR RNA.
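The distance sensitivity of FRET noted above can be made concrete with the standard Förster relation, E = 1 / (1 + (r/R0)^6), where r is the donor-acceptor distance and R0 is the distance at which transfer is 50% efficient. The sketch below is illustrative only; the function names are hypothetical, and any R0 value supplied is an assumption that depends on the particular donor-acceptor pair.

```python
def fret_efficiency(distance_nm, forster_radius_nm):
    """Transfer efficiency E = 1 / (1 + (r / R0)**6) for one donor-acceptor pair."""
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and R0 must be > 0")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


def distance_from_efficiency(efficiency, forster_radius_nm):
    """Invert the Forster relation to estimate r from a measured efficiency."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return forster_radius_nm * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)
```

Because of the sixth-power dependence, the efficiency falls from 50% at r = R0 to under 2% at r = 2 R0, which is why the signal reports on binding events that bring the two fluorophores within a few nanometers of each other.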


[0104] In the preferred embodiment, both the target nucleic acid and a compound that has an affinity for the target nucleic acid of interest are labeled with fluorophores with overlapping emission and excitation spectra (donor and acceptor), including but not limited to fluorescein and derivatives, rhodamine and derivatives, cyanine dyes and derivatives, bora-3a,4a-diaza-s-indacene (BODIPY®) and derivatives, pyrene, nanoparticles, or non-fluorescent quenching molecules. Binding of a labeled test compound to the target nucleic acid can be identified by the change in observable fluorescence as a result of FRET.


[0105] If the target nucleic acid is labeled with the donor fluorophore, then the test compound is labeled with the acceptor fluorophore. Conversely, if the target nucleic acid is labeled with the acceptor fluorophore, then the test compound is labeled with the donor fluorophore. A wide variety of labels may be used, with the choice of label depending on sensitivity required, ease of conjugation with the compound, stability requirements, available instrumentation, and disposal provisions. The fluorophore on the target nucleic acid must be in close proximity to the binding site of the test compounds, but should not be incorporated into a target nucleic acid at the specific binding site at which test compounds are likely to bind, since the presence of a covalently attached label might interfere sterically or chemically with the binding of the test compounds at this site.


[0106] In yet another embodiment, homogeneous time-resolved fluorescence (“HTRF”) techniques based on time-resolved energy transfer from lanthanide ion complexes to a suitable acceptor species can be adapted for high-throughput screening for inhibitors of RNA-protein complexes (Hemmilä, 1999, J. Biomol. Screening 4:303-107; Mathis, 1999, J. Biomol. Screening 4:309-313). HTRF is similar to fluorescence resonance energy transfer using conventional organic dye pairs, but has several advantages, such as increased sensitivity and efficiency, and background elimination (Xavier et al., 2000, Trends Biotechnol. 18(8):349-356).


[0107] Fluorescence spectroscopy has traditionally been used to characterize DNA-protein and protein-protein interactions, but fluorescence spectroscopy has not been widely used to characterize RNA-protein interactions because of an interfering absorption of RNA nucleotides with the intrinsic tryptophan fluorescence of proteins (Xavier et al., 2000, Trends Biotechnol. 18(8):349-356). However, fluorescence spectroscopy has been used in studying the single tryptophan residue within the arginine-rich RNA-binding domain of Rev protein and its interaction with the RRE in a time-resolved fluorescence study (Kwon & Carson, 1998, Anal. Biochem. 264:133-140). Thus, in this invention, fluorescence spectroscopy is less preferred if the test compounds or peptides or proteins possess intrinsic tryptophan fluorescence. However, fluorescence spectroscopy can be used for test compounds that do not possess intrinsic fluorescence.



5.5.3. Surface Plasmon Resonance (“SPR”)

[0108] Surface plasmon resonance (SPR) can be used for determining kinetic rate constants and equilibrium constants for macromolecular interactions by following the association process in “real time” (Schuck, 1997, Annu. Rev. Biophys. Biomol. Struct. 26:541-566).


[0109] The principle of SPR is summarized by Xavier et al. (Trends Biotechnol., 2000, 18(8):349-356) as follows. Total internal reflection occurs at the boundary between two substances of different refractive index. The incident light's electromagnetic field penetrates beyond the interface as an evanescent wave, which extends a few hundred nanometers beyond the surface into the medium. Insertion of a thin gold foil at the interface produces SPR owing to the absorption of the energy from the evanescent wave by free electron clouds of the metal (plasmons). As a result of this absorbance, there is a drop in the intensity of the reflected light at a particular angle of incidence. The evanescent wave profile depends exquisitely on the refractive index of the medium it probes. Thus, the angle at which absorption occurs is very sensitive to the refractive changes in the external medium. All proteins and nucleic acids are known to change the refractive index of water by a similar amount per unit mass, irrespective of their amino acid or nucleotide composition (the refractive index change is different for proteins and nucleic acids). When the protein or nucleic acid content of the layer at the sensor changes, the refractive index also changes. Typically, one member of a complex is immobilized in a dextran layer and then the other member is introduced into the solution, either in a flow cell (Biacore AB, Uppsala, Sweden) or a stirred cuvette (Affinity Sensors, Santa Fe, N. Mex.). It has been determined that there is a linear correlation between the surface concentration of protein or nucleic acid and the shift in resonance angle, which can be used to quantitate kinetic rate constants and/or the equilibrium constants.
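Because the resonance signal is linear in surface concentration, a sensorgram can be related to rate constants through a binding model. The sketch below assumes the simplest case, a 1:1 (Langmuir) interaction between the immobilized member and an analyte held at constant concentration; this model, the function names, and the parameter names (`k_on`, `k_off`, `r_max`) are illustrative assumptions and are not taken from the cited review.

```python
import math


def dissociation_constant(k_on, k_off):
    """Equilibrium dissociation constant K_D = k_off / k_on (molar)."""
    return k_off / k_on


def equilibrium_response(conc, r_max, k_on, k_off):
    """Steady-state response R_eq = R_max * C / (C + K_D) for 1:1 binding."""
    return r_max * conc / (conc + dissociation_constant(k_on, k_off))


def association_response(t, conc, r_max, k_on, k_off):
    """Response during the association phase of a 1:1 interaction.

    R(t) = R_eq * (1 - exp(-(k_on * C + k_off) * t)), with t in seconds,
    conc in molar, k_on in 1/(M*s) and k_off in 1/s.
    """
    k_obs = k_on * conc + k_off
    r_eq = equilibrium_response(conc, r_max, k_on, k_off)
    return r_eq * (1.0 - math.exp(-k_obs * t))
```

Under this model, an analyte injected at a concentration equal to K_D drives the response to half of `r_max` at equilibrium; in practice the rate constants are obtained by fitting curves of this form to measured sensorgrams at several analyte concentrations.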


[0110] In the present invention, the target RNA may be immobilized to the sensor surface through a streptavidin-biotin linkage, the linkage of which is disclosed by Crouch et al. (Methods Mol. Biol., 1999, 118:143-160). The RNA is biotinylated either during synthesis or post-synthetically via the conversion of the 3′ terminal ribonucleoside of the RNA into a reactive free amino group or using a T7 polymerase incorporated guanosine monophosphorothioate at the 5′ end. SPR has been used to determine the stoichiometry and affinity of the interaction between the HIV Rev protein and the RRE (Van Ryk & Venkatesan, 1999, J. Biol. Chem. 274:17452-17463) and the aminoglycoside antibiotics with RRE and a model RNA derived from the 16S ribosomal A site, respectively (Hendrix et al., 1997, J. Am. Chem. Soc. 119:3641-3648; Wong et al., 1998, Chem. Biol. 5:397-406).


[0111] In one embodiment of the present invention, the target nucleic acid can be immobilized to a sensor surface (e.g., by a streptavidin-biotin linkage) and SPR can be used to (a) determine whether the target RNA binds a test compound and (b) further characterize the binding of the target nucleic acids of the present invention to a test compound.



5.5.4. Mass Spectrometry

[0112] An automated method for analyzing mass spectrometer data which can analyze complex mixtures containing many thousands of components and can correct for background noise, multiply charged peaks and atomic isotope peaks is described in U.S. Pat. No. 6,147,344, which is hereby incorporated by reference in its entirety. The system disclosed in U.S. Pat. No. 6,147,344 is a method for analyzing mass spectrometer data in which a control sample measurement is performed providing a background noise check. The peak height and width values at each m/z ratio as a function of time are stored in a memory. A mass spectrometer operation on a material to be analyzed is performed and the peak height and width values at each m/z ratio versus time are stored in a second memory location. The mass spectrometer operation on the material to be analyzed is repeated a fixed number of times and the stored control sample values at each m/z ratio level at each time increment are subtracted from each corresponding one from the operational runs thus producing a difference value at each mass ratio for each of the multiple runs at each time increment. If the MS value minus the background noise does not exceed a preset value, the m/z ratio data point is not recorded, thus eliminating background noise, chemical noise and false positive peaks from the mass spectrometer data. The stored data for each of the multiple runs is then compared to a predetermined value at each m/z ratio and the resultant series of peaks, which are now determined to be above the background, is stored in the m/z points in which the peaks are of significance.
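The background-subtraction logic summarized above can be sketched as follows. This is a simplified illustration and not the method of U.S. Pat. No. 6,147,344: it tracks only peak height (not width), keys each data point by a hypothetical (time increment, m/z) pair, and, as an added assumption, treats a peak as significant only if it exceeds the preset value in every repeated run.

```python
def subtract_background(run, control, preset_value):
    """Return the points of one run that exceed background by more than preset_value.

    run and control map (time_index, mz) -> peak height. Points whose
    background-corrected height does not exceed preset_value are not recorded.
    """
    kept = {}
    for key, height in run.items():
        difference = height - control.get(key, 0.0)
        if difference > preset_value:
            kept[key] = difference
    return kept


def significant_peaks(runs, control, preset_value):
    """Keep points that pass the threshold in every run (assumed criterion).

    Returns (time_index, mz) -> mean background-corrected height across runs.
    """
    if not runs:
        return {}
    corrected = [subtract_background(run, control, preset_value) for run in runs]
    shared_keys = set(corrected[0])
    for result in corrected[1:]:
        shared_keys &= set(result)
    return {
        key: sum(result[key] for result in corrected) / len(corrected)
        for key in sorted(shared_keys)
    }
```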


[0113] One possibility for the utilization of mass spectrometry in high throughput screening is the integration of SPR with mass spectrometry. Approaches that have been tried are direct analysis of the analyte retained on the sensor chip and mass spectrometry with the eluted analyte (Sonksen et al., 1998, Anal. Chem. 70:2731-2736; Nelson & Krone, 1999, J. Mol. Recog. 12:77-93). Further developments, especially in the interfacing of the sensor chip with the mass spectrometer and in reusing the sensor chip, are required to make SPR combined with mass spectroscopy a high-throughput method for biomolecular interaction analysis and the screening of targets for small molecule inhibitors (Xavier et al., 2000, Trends Biotechnol. 18(8):349-356).


[0114] In one embodiment of the present invention, the target nucleic acid complexed to a test compound can be determined by any of the mass spectrometry processes described supra. Furthermore, mass spectrometry can also be used to elucidate the structure of the test compound.



5.5.5. Scintillation Proximity Assay (“SPA”)

[0115] Scintillation Proximity Assay (“SPA”) is a method that can be used for screening small molecules that bind to the target RNAs. SPA would involve radiolabeling either the target RNA or the test compound and then quantitating its binding to the other member, which is attached to a bead or a surface impregnated with a scintillant (Cook, 1996, Drug Discov. Today 1:287-294). Currently, fluorescence-based techniques are preferred for high-throughput screening (Pope et al., 1999, Drug Discov. Today 4:350-362).


[0116] Screening for small molecules that inhibit Tat peptide:TAR RNA interaction has been performed with SPA, and inhibitors of the interaction were isolated and characterized (Mei et al., 1997, Bioorg. Med. Chem. 5:1173-1184; Mei et al., 1998, Biochemistry 37:14204-14212). A similar approach can be used to identify small molecules that directly bind to a preselected target RNA element in accordance with the invention.


[0117] SPA can be adapted to high throughput screening by the availability of microplates, wherein the scintillant is directly incorporated into the plastic of the microtiter wells (Nakayama et al., 1998, J. Biomol. Screening 3:43-48). Thus, one embodiment of the present invention comprises (a) labeling the target nucleic acid with a radioactive or fluorescent label; (b) contacting the labeled nucleic acid with test compounds, wherein each test compound is in a microtiter well coated with scintillant and is tethered to the microtiter well; and (c) identifying and quantifying the test compounds bound to the target nucleic acid with SPA, wherein the test compound is identified by virtue of its location in the microplate.



5.5.6. Structure-Activity Relationships (“SAR”) by NMR Spectroscopy

[0118] NMR spectroscopy is a valuable technique for identifying complexed target nucleic acids by qualitatively determining changes in chemical shift, specifically from distances measured using relaxation effects, and NMR-based approaches have been used in the identification of small molecule binders of protein drug targets (Xavier et al., 2000, Trends Biotechnol. 18(8):349-356). The determination of structure-activity relationships (“SAR”) by NMR is the first method for NMR described in which small molecules that bind adjacent subsites are identified by two-dimensional 1H-15N spectra of the target protein (Shuker et al., 1996, Science 274:1531-1534). The signal from the bound molecule is monitored by employing line broadening, transferred NOEs and pulsed field gradient diffusion measurements (Moore, 1999, Curr. Opin. Biotechnol. 10:54-58). A strategy for lead generation by NMR using a library of small molecules has been recently described (Fejzo et al., 1999, Chem. Biol. 6:755-769).


[0119] In one embodiment of the present invention, the target nucleic acid complexed to a test compound can be determined by SAR by NMR. Furthermore, SAR by NMR can also be used to elucidate the structure of the test compound.



5.5.7. Size Exclusion Chromatography

[0120] In another embodiment of the present invention, size-exclusion chromatography is used to purify test compounds that are bound to a target nucleic acid from a complex mixture of compounds. Size-exclusion chromatography separates molecules based on their size and uses gel-based media comprised of beads with specific size distributions. When applied to a column, this media settles into a tightly packed matrix and forms a complex array of pores. Separation is accomplished by the inclusion or exclusion of molecules by these pores based on molecular size. Small molecules are included into the pores and, consequently, their migration through the matrix is retarded due to the added distance they must travel before elution. Large molecules are excluded from the pores and migrate with the void volume when applied to the matrix. In the present invention, a target nucleic acid is incubated with a mixture of test compounds while free in solution and allowed to reach equilibrium. When applied to a size exclusion column, test compounds free in solution are retained by the column, and test compounds bound to the target nucleic acid are passed through the column. In a preferred embodiment, spin columns commonly used for “desalting” of nucleic acids will be employed to separate bound from unbound test compounds (e.g., Bio-Spin columns manufactured by BIO-RAD). In another embodiment, the size exclusion matrix is packed into multiwell plates to allow high throughput separation of mixtures (e.g., PLASMID 96-well SEC plates manufactured by Millipore).



5.5.8. Affinity Chromatography

[0121] In one embodiment of the present invention, affinity capture is used to purify test compounds that are bound to a target nucleic acid labeled with an affinity tag from a complex mixture of compounds. To accomplish this, a target nucleic acid labeled with an affinity tag is incubated with a mixture of test compounds while free in solution and then captured to a solid support once equilibrium has been established; alternatively, target nucleic acids labeled with an affinity tag can be captured to a solid support first and then allowed to reach equilibrium with a mixture of test compounds.


[0122] The solid support is typically comprised of, but not limited to, cross-linked agarose beads that are coupled with a ligand for the affinity tag. Alternatively, the solid support may be a glass, silicon, metal, or carbon, plastic (polystyrene, polypropylene) surface with or without a self-assembled monolayer (SAM) either with a covalently attached ligand for the affinity tag, or with inherent affinity for the tag on the target nucleic acid.


[0123] Once the complex between the target nucleic acid and test compound has reached equilibrium and has been captured, one skilled in the art will appreciate that the retention of bound compounds and removal of unbound compounds is facilitated by washing the solid support with large excesses of binding reaction buffer. Furthermore, retention of high affinity compounds and removal of low affinity compounds can be accomplished by a number of means that increase the stringency of washing; these means include, but are not limited to, increasing the number and duration of washes, raising the salt concentration of the wash buffer, addition of detergent or surfactant to the wash buffer, and addition of non-specific competitor to the wash buffer.


[0124] In one embodiment, the test compounds themselves are detectably labeled with fluorescent dyes, radioactive isotopes, or nanoparticles. When the test compounds are applied to the captured target nucleic acid in a spatially addressed fashion (e.g., in separate wells of a 96-well microplate), binding between the test compounds and the target nucleic acid can be determined by the presence of the detectable label on the test compound using fluorescence.


[0125] Following the removal of unbound compounds, bound compounds with high affinity for the target nucleic acid can be eluted from the immobilized target nucleic acids and analyzed. The elution of test compounds can be accomplished by any means that break the non-covalent interactions between the target nucleic acid and compound. Means for elution include, but are not limited to, changing the pH, changing the salt concentration, the application of organic solvents, and the application of molecules that compete with the bound ligand. In a preferred embodiment, the means employed for elution will release the compound from the target RNA, but will not affect the interaction between the affinity tag and the solid support, thereby achieving selective elution of test compound. Moreover, a preferred embodiment will employ an elution buffer that is volatile to allow for subsequent concentration by lyophilization of the eluted compound (e.g., 0 M to 5 M ammonium acetate).



5.5.9. Nanoparticle Aggregation

[0126] In one embodiment of the present invention, both the target nucleic acid and the test compounds are labeled with nanoparticles. A nanoparticle is a cluster of ions with controlled size from 0.1 to 1000 nm comprised of metals, metal oxides, or semiconductors including, but not limited to, Ag2S, ZnS, CdS, CdTe, Au, or TiO2. Methods for the attachment of nucleic acids and small molecules to nanoparticles are well known to one of skill in the art (reviewed in Niemeyer, 2001, Angew. Chem. Int. Ed. 40:4129-4158. The references cited therein are hereby incorporated by reference in their entireties). In particular, if multiple copies of the target nucleic acid are attached to a single nanoparticle and multiple copies of a test compound are attached to another nanoparticle, then interaction between the test compound and target nucleic acid will induce aggregation of nanoparticles as described, for example, by Mitchel et al. 1999, J. Am. Chem. Soc. 121:8122-8123. The aggregate can be detected by changes in absorbance or fluorescence spectra and physically separated from the unbound components through filtration or centrifugation.



5.6. Methods for Identifying or Characterizing the Test Compounds Bound to the Target Nucleic Acids

[0127] If the library comprises arrays or microarrays of test compounds, wherein each test compound has an address or identifier, the test compound can be deconvoluted, e.g., by cross-referencing the positive sample to the original compound list that was applied to the individual test assays.
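Deconvolution by address amounts to a lookup of each positive position in the record of what was dispensed there. A minimal sketch follows; the well labels, compound identifiers, and function name are hypothetical placeholders rather than part of any disclosed library.

```python
def deconvolute(positive_addresses, compound_list):
    """Map each positive address to the compound recorded at that address.

    compound_list maps an address (e.g. a well label) to a compound identifier.
    An address with no record raises KeyError rather than being silently skipped.
    """
    hits = {}
    for address in positive_addresses:
        if address not in compound_list:
            raise KeyError(f"no compound recorded at address {address!r}")
        hits[address] = compound_list[address]
    return hits
```

For example, `deconvolute(["B7"], {"A1": "cmpd-0001", "B7": "cmpd-0019"})` returns `{"B7": "cmpd-0019"}`.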


[0128] If the library is a peptide or nucleic acid library, the sequence of the test compound can be determined by direct sequencing of the peptide or nucleic acid. Such methods are well known to one of skill in the art.


[0129] A number of physico-chemical techniques can be used for the de novo characterization of test compounds bound to the target.



5.6.1. Mass Spectrometry

[0130] Mass spectrometry (e.g., electrospray ionization (“ESI”), matrix-assisted laser desorption-ionization (“MALDI”), and Fourier-transform ion cyclotron resonance (“FT-ICR”)) can be used both for high-throughput screening of test compounds that bind to a target RNA and elucidating the structure of the test compound. Thus, one advantage of mass spectroscopy is that separation of a bound and unbound complex and test compound structure elucidation can be carried out in a single step.


[0131] MALDI uses a pulsed laser for desorption of the ions and a time-of-flight analyzer, and has been used for the detection of noncovalent tRNA:amino-acyl-tRNA synthetase complexes (Gruic-Sovulj et al., 1997, J. Biol. Chem. 272:32084-32091). However, covalent cross-linking between the target nucleic acid and the test compound is required for detection, since a non-covalently bound complex may dissociate during the MALDI process.


[0132] ESI mass spectrometry (“ESI-MS”) has been of greater utility for studying non-covalent molecular interactions because, unlike the MALDI process, ESI-MS generates molecular ions with little to no fragmentation (Xavier et al., 2000, Trends Biotechnol. 18(8):349-356). ESI-MS has been used to study the complexes formed by HIV Tat peptide and protein with the TAR RNA (Sannes-Lowery et al., 1997, Anal. Chem. 69:5130-5135).


[0133] Fourier-transform ion cyclotron resonance (“FT-ICR”) mass spectrometry provides high-resolution spectra, isotope-resolved precursor ion selection, and accurate mass assignments (Xavier et al., 2000, Trends Biotechnol. 18(8):349-356). FT-ICR has been used to study the interaction of aminoglycoside antibiotics with cognate and non-cognate RNAs (Hofstadler et al., 1999, Anal. Chem. 71:3436-3440; Griffey et al., 1999, Proc. Natl. Acad. Sci. USA 96:10129-10133). As true for all of the mass spectrometry methods discussed herein, FT-ICR does not require labeling of the target RNA or a test compound.


[0134] An advantage of mass spectroscopy is not only the elucidation of the structure of the test compound, but also the determination of the structure of the test compound bound to the preselected target RNA. Such information can enable the discovery of a consensus structure of a test compound that specifically binds to a preselected target RNA.



5.6.2. NMR Spectroscopy

[0135] As described above, NMR spectroscopy is a technique for identifying binding sites in target nucleic acids by qualitatively determining changes in chemical shift, specifically from distances measured using relaxation effects. Examples of NMR that can be used for the invention include, but are not limited to, one-dimensional NMR, two-dimensional NMR, correlation spectroscopy (“COSY”), and nuclear Overhauser effect (“NOE”) spectroscopy. Such methods of structure determination of test compounds are well known to one of skill in the art.


[0136] Similar to mass spectroscopy, an advantage of NMR is not only the elucidation of the structure of the test compound, but also the determination of the structure of the test compound bound to the preselected target RNA. Such information can enable the discovery of a consensus structure of a test compound that specifically binds to a preselected target RNA.



5.6.3. Vibrational Spectroscopy

[0137] Vibrational spectroscopy (e.g. infrared (IR) spectroscopy or Raman spectroscopy) can be used for elucidating the structure of the test compound on the isolated bead.


[0138] Infrared spectroscopy measures the frequencies of infrared light (wavelengths from 100 to 10,000 nm) absorbed by the test compound as a result of excitation of vibrational modes according to quantum mechanical selection rules which require that absorption of light cause a change in the electric dipole moment of the molecule. The infrared spectrum of any molecule is a unique pattern of absorption wavelengths of varying intensity that can be considered as a molecular fingerprint to identify any compound.


[0139] Infrared spectra can be measured in a scanning mode by measuring the absorption of individual frequencies of light, produced by a grating which separates frequencies from a mixed-frequency infrared light source, by the test compound relative to a standard intensity (double-beam instrument) or pre-measured (‘blank’) intensity (single-beam instrument). In a preferred embodiment, infrared spectra are measured in a pulsed mode (FT-IR) where a mixed beam, produced by an interferometer, of all infrared light frequencies is passed through or reflected off the test compound. The resulting interferogram, which may or may not be added with the resulting interferograms from subsequent pulses to increase the signal strength while averaging random noise in the electronic signal, is mathematically transformed into a spectrum using Fourier Transform or Fast Fourier Transform algorithms.
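The co-adding and transform steps described above can be sketched as follows. This is an illustration under simplifying assumptions: a plain discrete Fourier transform is used in place of a Fast Fourier Transform for clarity, no apodization or phase correction is applied, and the function names are hypothetical.

```python
import cmath
import math


def coadd(interferograms):
    """Average several equal-length interferograms point by point.

    Averaging repeated scans preserves the signal while random noise
    tends to cancel.
    """
    n_scans = len(interferograms)
    return [sum(points) / n_scans for points in zip(*interferograms)]


def magnitude_spectrum(interferogram):
    """Discrete Fourier transform magnitudes for bins 0 .. N/2.

    Each bin corresponds to one frequency component of the mixed beam.
    """
    n = len(interferogram)
    spectrum = []
    for k in range(n // 2 + 1):
        total = sum(
            value * cmath.exp(-2j * math.pi * k * i / n)
            for i, value in enumerate(interferogram)
        )
        spectrum.append(abs(total))
    return spectrum
```

As a check, a synthetic interferogram consisting of a single cosine that completes five cycles over the sampled window transforms to a spectrum whose largest magnitude is in bin 5.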


[0140] Raman spectroscopy measures the difference in frequency due to absorption of infrared frequencies of scattered visible or ultraviolet light relative to the incident beam. The incident monochromatic light beam, usually a single laser frequency, is not truly absorbed by the test compound but interacts with the electric field transiently. Most of the light scattered off the sample will be unchanged (Rayleigh scattering) but a portion of the scattered light will have frequencies that are the sum or difference of the incident and molecular vibrational frequencies. The selection rules for Raman (inelastic) scattering require a change in polarizability of the molecule. While some vibrational transitions are observable in both infrared and Raman spectrometry, most are observable only with one or the other technique. The Raman spectrum of any molecule is a unique pattern of absorption wavelengths of varying intensity that can be considered as a molecular fingerprint to identify any compound.
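The sum and difference frequencies mentioned above are conveniently expressed in wavenumbers (cm^-1), where the wavenumber of light with wavelength λ in nanometers is 10^7/λ. The sketch below computes the Stokes (difference) and anti-Stokes (sum) wavelengths for a given incident laser line and vibrational wavenumber; the function names and the example laser wavelength are illustrative assumptions.

```python
def wavenumber_cm1(wavelength_nm):
    """Convert a wavelength in nm to a wavenumber in cm^-1."""
    return 1.0e7 / wavelength_nm


def raman_shift_cm1(incident_nm, scattered_nm):
    """Raman shift: incident wavenumber minus scattered wavenumber (cm^-1)."""
    return wavenumber_cm1(incident_nm) - wavenumber_cm1(scattered_nm)


def stokes_wavelength_nm(incident_nm, vibration_cm1):
    """Scattered wavelength for the difference frequency (lower energy)."""
    return 1.0e7 / (wavenumber_cm1(incident_nm) - vibration_cm1)


def anti_stokes_wavelength_nm(incident_nm, vibration_cm1):
    """Scattered wavelength for the sum frequency (higher energy)."""
    return 1.0e7 / (wavenumber_cm1(incident_nm) + vibration_cm1)
```

For a hypothetical 532 nm laser and a 1000 cm^-1 vibration, the Stokes line lies at a longer wavelength than the laser and the anti-Stokes line at a shorter one, each displaced by the same wavenumber.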


[0141] Raman spectra are measured by submitting monochromatic light to the sample, either passed through or preferably reflected off, filtering the Rayleigh scattered light, and detecting the frequency of the Raman scattered light. An improved Raman spectrometer is described in U.S. Pat. No. 5,786,893 to Fink et al., which is hereby incorporated by reference.


[0142] Vibrational microscopy can be measured in a spatially resolved fashion to address single beads by integration of a visible microscope and spectrometer. A microscopic infrared spectrometer is described in U.S. Pat. No. 5,581,085 to Reffner et al., which is hereby incorporated by reference in its entirety. An instrument that simultaneously performs a microscopic infrared and microscopic Raman analysis on a sample is described in U.S. Pat. No. 5,841,139 to Sostek et al., which is hereby incorporated by reference in its entirety.


[0143] In the preferred embodiment, test compounds can be identified by matching the IR or Raman spectra of a test compound to a dataset of vibrational (IR or Raman) spectra previously acquired for each compound in the combinatorial library. By this method, the spectra of compounds with known structure are recorded so that comparison with these spectra can identify compounds again when isolated from RNA binding experiments.
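Matching a measured spectrum against a previously acquired dataset can be sketched as a nearest-neighbor search. The example below assumes that all spectra have been resampled onto a common wavenumber grid and uses cosine similarity as the comparison score; both the scoring choice and the compound names are illustrative assumptions, not a disclosed procedure.

```python
import math


def cosine_similarity(spectrum_a, spectrum_b):
    """Cosine of the angle between two intensity vectors on the same grid."""
    dot = sum(x * y for x, y in zip(spectrum_a, spectrum_b))
    norm_a = math.sqrt(sum(x * x for x in spectrum_a))
    norm_b = math.sqrt(sum(y * y for y in spectrum_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(query, reference_spectra):
    """Return (compound_name, score) for the most similar reference spectrum.

    reference_spectra maps compound name -> intensity list recorded for
    that library member.
    """
    scored = (
        (name, cosine_similarity(query, reference))
        for name, reference in reference_spectra.items()
    )
    return max(scored, key=lambda pair: pair[1])
```

In practice a minimum score would typically be required before a match is accepted, so that a compound absent from the dataset is not assigned to its nearest neighbor.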



5.7. Secondary Biological Screens

[0144] The test compounds identified in the binding assay (for convenience referred to herein as a “lead” compound) can be tested for biological activity using host cells containing or engineered to contain the target RNA element coupled to a functional readout system. For example, the lead compound can be tested in a host cell engineered to contain the target RNA element controlling the expression of a reporter gene. In this example, the lead compounds are assayed in the presence or absence of the target RNA. Alternatively, a phenotypic or physiological readout can be used to assess activity of the target RNA in the presence and absence of the lead compound.


[0145] In one embodiment, the lead compound can be tested in a host cell engineered to contain the target RNA element controlling the expression of a reporter gene, such as, but not limited to, β-galactosidase, green fluorescent protein, red fluorescent protein, luciferase, chloramphenicol acetyltransferase, alkaline phosphatase, and β-lactamase. In a preferred embodiment, a cDNA encoding the target element is fused upstream to a reporter gene wherein translation of the reporter gene is repressed upon binding of the lead compound to the target RNA. In other words, the steric hindrance caused by the binding of the lead compound to the target RNA represses the translation of the reporter gene. This method, termed the translational repression assay procedure (“TRAP”), has been demonstrated in E. coli and S. cerevisiae (Jain & Belasco, 1996, Cell 87(1):115-25; Huang & Schreiber, 1997, Proc. Natl. Acad. Sci. USA 94:13396-13401).


[0146] In another embodiment, a phenotypic or physiological readout can be used to assess activity of the target RNA in the presence and absence of the lead compound. For example, the target RNA may be overexpressed in a cell in which the target RNA is endogenously expressed. Where the target RNA controls expression of a gene product involved in cell growth or viability, the in vivo effect of the lead compound can be assayed by measuring the cell growth or viability of the target cell. Alternatively, a reporter gene can also be fused downstream of the target RNA sequence and the effect of the lead compound on reporter gene expression can be assayed.


[0147] Alternatively, the lead compounds identified in the binding assay can be tested for biological activity using animal models for a disease, condition, or syndrome of interest. These include animals engineered to contain the target RNA element coupled to a functional readout system, such as a transgenic mouse. Animal model systems can also be used to demonstrate safety and efficacy.


[0148] Compounds displaying the desired biological activity can be considered to be lead compounds, and will be used in the design of congeners or analogs possessing useful pharmacological activity and physiological profiles. Following the identification of a lead compound, molecular modeling techniques can be employed, which have proven to be useful in conjunction with synthetic efforts, to design variants of the lead that can be more effective. These applications may include, but are not limited to, Pharmacophore Modeling (cf. Lamothe, et al. 1997, J. Med. Chem. 40: 3542; Mottola et al. 1996, J. Med. Chem. 39: 285; Beusen et al. 1995, Biopolymers 36: 181; P. Fossa et al. 1998, Comput. Aided Mol. Des. 12: 361), QSAR development (cf. Siddiqui et al. 1999, J. Med. Chem. 42: 4122; Barreca et al. 1999 Bioorg. Med. Chem. 7: 2283; Kroemer et al. 1995, J. Med. Chem. 4917; Schaal et al. 2001, J. Med. Chem. 44: 155; Buolamwini & Assefa 2002, J. Mol. Chem. 45: 84), Virtual docking and screening/scoring (cf. Anzini et al. 2001, J. Med. Chem. 44: 1134; Faaland et al. 2000, Biochem. Cell. Biol. 78: 415; Silvestri et al 2000, Bioorg. Med. Chem. 8: 2305; J. Lee et al 2001, Bioorg. Med. Chem. 9: 19), and Structure Prediction using RNA structural programs including, but not limited to, mFold (as described by Zuker et al. Algorithms and Thermodynamics for RNA Secondary Structure Prediction: A Practical Guide in RNA Biochemistry and Biotechnology pp. 11-43, J. Barciszewski & B. F. C. Clark, eds. (NATO ASI Series, Kluwer Academic Publishers, 1999) and Mathews et al. 1999 J. Mol. Biol. 288: 911-940); RNAmotif (Macke et al. 2001, Nucleic Acids Res. 29: 4724-4735); and the Vienna RNA package (Hofacker et al. 1994, Monatsh. Chem. 125: 167-188).


[0149] Further examples of the application of such techniques can be found in several review articles, such as Rotivinen et al., 1988, Acta Pharmaceutica Fennica 97:159-166; Ripka, 1998, New Scientist 54-57; McKinlay & Rossmann, 1989, Annu. Rev. Pharmacol. Toxicol. 29:111-122; Perry & Davies, QSAR: Quantitative Structure-Activity Relationships in Drug Design pp. 189-193 (Alan R. Liss, Inc. 1989); Lewis & Dean, 1989, Proc. R. Soc. Lond. 236:125-140 and 141-162; Askew et al., 1989, J. Am. Chem. Soc. 111: 1082-1090. Molecular modeling tools employed may include those from Tripos, Inc., St. Louis, Mo. (e.g., Sybyl/UNITY, CONCORD, DiverseSolutions), Accelrys, San Diego, Calif. (e.g., Catalyst, Wisconsin Package {BLAST, etc.}), Schrodinger, Portland, Oreg. (e.g., QikProp, QikFit, Jaguar) or other such vendors as BioDesign, Inc. (Pasadena, Calif.), Allelix, Inc. (Mississauga, Ontario, Canada), and Hypercube, Inc. (Cambridge, Ontario, Canada), and may include privately designed and/or “academic” software (e.g., RNAMotif, MFOLD). These application suites and programs include tools for the atomistic construction and analysis of structural models for drug-like molecules, proteins, and DNA or RNA and their potential interactions. They also provide for the calculation of important physical properties, such as solubility estimates, permeability metrics, and empirical measures of molecular “druggability” (e.g., Lipinski “Rule of 5” as described by Lipinski et al. 1997, Adv. Drug Delivery Rev. 23: 3-25). Most importantly, they provide appropriate metrics and statistical modeling power (such as the patented CoMFA technology in Sybyl as described in U.S. Pat. Nos. 6,240,374 and 6,185,506) to develop Quantitative Structure-Activity Relationships (QSARs), which are used to guide the synthesis of more efficacious clinical development candidates while improving desirable physical properties, as determined by results from the aforementioned secondary screening protocols.
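By way of illustration only, the Lipinski “Rule of 5” mentioned above can be expressed as a count of threshold violations. The following is a minimal sketch in Python, assuming that the four descriptors (molecular weight, calculated logP, hydrogen-bond donors, and hydrogen-bond acceptors) have already been computed by a modeling package. The class and function names, the example descriptor values, and the default tolerance of one violation are assumptions made for this sketch and are not taken from any of the software packages named above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed descriptors for one compound (supplied by the caller)."""
    mol_weight: float        # molecular weight in daltons
    log_p: float             # calculated octanol-water partition coefficient
    h_bond_donors: int       # number of hydrogen-bond donors
    h_bond_acceptors: int    # number of hydrogen-bond acceptors


def rule_of_five_violations(d: Descriptors) -> int:
    """Count how many of the four Rule of 5 thresholds are exceeded."""
    checks = (
        d.mol_weight > 500,
        d.log_p > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    )
    return sum(checks)


def passes_rule_of_five(d: Descriptors, max_violations: int = 1) -> bool:
    """Accept a compound with at most `max_violations` violations.

    The default of 1 is an assumed convention for this sketch.
    """
    return rule_of_five_violations(d) <= max_violations


if __name__ == "__main__":
    # Hypothetical descriptor values, for illustration only.
    small = Descriptors(mol_weight=350.4, log_p=2.1, h_bond_donors=2, h_bond_acceptors=5)
    large = Descriptors(mol_weight=720.0, log_p=6.3, h_bond_donors=6, h_bond_acceptors=12)
    print(rule_of_five_violations(small), passes_rule_of_five(small))
    print(rule_of_five_violations(large), passes_rule_of_five(large))
```

Such a filter is only a coarse triage step; it does not replace the QSAR modeling or the secondary screening protocols described above.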



5.8. Use of Identified Compounds That Bind RNA to Treat/Prevent Disease

[0150] Biologically active compounds identified using the methods of the invention, or pharmaceutically acceptable salts thereof, can be administered to a patient, preferably a mammal, more preferably a human, suffering from a disease whose progression is associated with a target RNA:host cell factor interaction in vivo. In certain embodiments, such compounds or pharmaceutically acceptable salts thereof are administered to a patient, preferably a mammal, more preferably a human, as a preventative measure against a disease associated with an RNA:host cell factor interaction in vivo.


[0151] In one embodiment, “treatment” or “treating” refers to an amelioration of a disease, or at least one discernible symptom thereof. In another embodiment, “treatment” or “treating” refers to an amelioration of at least one measurable physical parameter, not necessarily discernible by the patient. In yet another embodiment, “treatment” or “treating” refers to inhibiting the progression of a disease, either physically, e.g., stabilization of a discernible symptom, physiologically, e.g., stabilization of a physical parameter, or both. In yet another embodiment, “treatment” or “treating” refers to delaying the onset of a disease.


[0152] In certain embodiments, the compound or a pharmaceutically acceptable salt thereof is administered to a patient, preferably a mammal, more preferably a human, as a preventative measure against a disease associated with an RNA:host cell factor interaction in vivo. As used herein, “prevention” or “preventing” refers to a reduction of the risk of acquiring a disease. In one embodiment, the compound or a pharmaceutically acceptable salt thereof is administered as a preventative measure to a patient. According to this embodiment, the patient can have a genetic predisposition to a disease, such as a family history of the disease, or a non-genetic predisposition to the disease. Accordingly, the compound and pharmaceutically acceptable salts thereof can be used for the treatment of one manifestation of a disease and prevention of another.


[0153] When administered to a patient, the compound or a pharmaceutically acceptable salt thereof is preferably administered as a component of a composition that optionally comprises a pharmaceutically acceptable vehicle. The composition can be administered orally, or by any other convenient route, for example, by infusion or bolus injection, or by absorption through epithelial or mucocutaneous linings (e.g., oral mucosa, rectal and intestinal mucosa, etc.), and may be administered together with another biologically active agent. Administration can be systemic or local. Various delivery systems are known, e.g., encapsulation in liposomes, microparticles, microcapsules, capsules, etc., and can be used to administer the compound and pharmaceutically acceptable salts thereof.


[0154] Methods of administration include but are not limited to intradermal, intramuscular, intraperitoneal, intravenous, subcutaneous, intranasal, epidural, oral, sublingual, intracerebral, intravaginal, transdermal, rectal, by inhalation, or topical, particularly to the ears, nose, eyes, or skin. The mode of administration is left to the discretion of the practitioner. In most instances, administration will result in the release of the compound or a pharmaceutically acceptable salt thereof into the bloodstream.


[0155] In specific embodiments, it may be desirable to administer the compound or a pharmaceutically acceptable salt thereof locally. This may be achieved, for example, and not by way of limitation, by local infusion during surgery, topical application, e.g., in conjunction with a wound dressing after surgery, by injection, by means of a catheter, by means of a suppository, or by means of an implant, said implant being of a porous, non-porous, or gelatinous material, including membranes, such as silastic membranes, or fibers.


[0156] In certain embodiments, it may be desirable to introduce the compound or a pharmaceutically acceptable salt thereof into the central nervous system by any suitable route, including intraventricular, intrathecal and epidural injection. Intraventricular injection may be facilitated by an intraventricular catheter, for example, attached to a reservoir, such as an Ommaya reservoir.


[0157] Pulmonary administration can also be employed, e.g., by use of an inhaler or nebulizer, and formulation with an aerosolizing agent, or via perfusion in a fluorocarbon or synthetic pulmonary surfactant. In certain embodiments, the compound and pharmaceutically acceptable salts thereof can be formulated as a suppository, with traditional binders and vehicles such as triglycerides.


[0158] In another embodiment, the compound and pharmaceutically acceptable salts thereof can be delivered in a vesicle, in particular a liposome (see Langer, 1990, Science 249:1527-1533; Treat et al., in Liposomes in the Therapy of Infectious Disease and Cancer, Lopez-Berestein and Fidler (eds.), Liss, New York, pp. 353-365 (1989); Lopez-Berestein, ibid., pp. 317-327; see generally ibid.).


[0159] In yet another embodiment, the compound and pharmaceutically acceptable salts thereof can be delivered in a controlled release system (see, e.g., Goodson, in Medical Applications of Controlled Release, supra, vol. 2, pp. 115-138 (1984)). Other controlled-release systems discussed in the review by Langer (1990, Science 249:1527-1533) may be used. In one embodiment, a pump may be used (see Langer, supra; Sefton, 1987, CRC Crit. Ref. Biomed. Eng. 14:201; Buchwald et al., 1980, Surgery 88:507; Saudek et al., 1989, N. Engl. J. Med. 321:574). In another embodiment, polymeric materials can be used (see Medical Applications of Controlled Release, Langer and Wise (eds.), CRC Press, Boca Raton, Fla. (1974); Controlled Drug Bioavailability, Drug Product Design and Performance, Smolen and Ball (eds.), Wiley, New York (1984); Langer and Peppas, 1983, J. Macromol. Sci. Rev. Macromol. Chem. 23:61; see also Levy et al., 1985, Science 228:190; During et al., 1989, Ann. Neurol. 25:351; Howard et al., 1989, J. Neurosurg. 71:105). In yet another embodiment, a controlled-release system can be placed in proximity of a target RNA of the compound or a pharmaceutically acceptable salt thereof, thus requiring only a fraction of the systemic dose.


[0160] Compositions comprising the compound or a pharmaceutically acceptable salt thereof (“compound compositions”) can additionally comprise a suitable amount of a pharmaceutically acceptable vehicle so as to provide the form for proper administration to the patient.


[0161] In a specific embodiment, the term “pharmaceutically acceptable” means approved by a regulatory agency of the Federal or a state government or listed in the U.S. Pharmacopeia or other generally recognized pharmacopeia for use in animals, mammals, and more particularly in humans. The term “vehicle” refers to a diluent, adjuvant, excipient, or carrier with which a compound of the invention is administered. Such pharmaceutical vehicles can be liquids, such as water and oils, including those of petroleum, animal, vegetable or synthetic origin, such as peanut oil, soybean oil, mineral oil, sesame oil and the like. The pharmaceutical vehicles can be saline, gum acacia, gelatin, starch paste, talc, keratin, colloidal silica, urea, and the like. In addition, auxiliary, stabilizing, thickening, lubricating and coloring agents may be used. When administered to a patient, the pharmaceutically acceptable vehicles are preferably sterile. Water is a preferred vehicle when the compound of the invention is administered intravenously. Saline solutions and aqueous dextrose and glycerol solutions can also be employed as liquid vehicles, particularly for injectable solutions. Suitable pharmaceutical vehicles also include excipients such as starch, glucose, lactose, sucrose, gelatin, malt, rice, flour, chalk, silica gel, sodium stearate, glycerol monostearate, talc, sodium chloride, dried skim milk, glycerol, propylene glycol, water, ethanol and the like. Compound compositions, if desired, can also contain minor amounts of wetting or emulsifying agents, or pH buffering agents.


[0162] Compound compositions can take the form of solutions, suspensions, emulsions, tablets, pills, pellets, capsules, capsules containing liquids, powders, sustained-release formulations, suppositories, aerosols, sprays, or any other form suitable for use. In one embodiment, the pharmaceutically acceptable vehicle is a capsule (see e.g., U.S. Pat. No. 5,698,155). Other examples of suitable pharmaceutical vehicles are described in Remington's Pharmaceutical Sciences, Alfonso R. Gennaro, ed., Mack Publishing Co. Easton, Pa., 19th ed., 1995, pp. 1447 to 1676, incorporated herein by reference.


[0163] In a preferred embodiment, the compound or a pharmaceutically acceptable salt thereof is formulated in accordance with routine procedures as a pharmaceutical composition adapted for oral administration to human beings. Compositions for oral delivery may be in the form of tablets, lozenges, aqueous or oily suspensions, granules, powders, emulsions, capsules, syrups, or elixirs, for example. Orally administered compositions may contain one or more agents, for example, sweetening agents such as fructose, aspartame or saccharin; flavoring agents such as peppermint, oil of wintergreen, or cherry; coloring agents; and preserving agents, to provide a pharmaceutically palatable preparation. Moreover, where in tablet or pill form, the compositions can be coated to delay disintegration and absorption in the gastrointestinal tract, thereby providing a sustained action over an extended period of time. Selectively permeable membranes surrounding an osmotically active driving compound are also suitable for orally administered compositions. In these latter platforms, fluid from the environment surrounding the capsule is imbibed by the driving compound, which swells to displace the agent or agent composition through an aperture. These delivery platforms can provide an essentially zero order delivery profile as opposed to the spiked profiles of immediate release formulations. A time delay material such as glycerol monostearate or glycerol stearate may also be used. Oral compositions can include standard vehicles such as mannitol, lactose, starch, magnesium stearate, sodium saccharin, cellulose, magnesium carbonate, and the like. Such vehicles are preferably of pharmaceutical grade. Typically, compositions for intravenous administration comprise sterile isotonic aqueous buffer. Where necessary, the compositions may also include a solubilizing agent.


[0164] In another embodiment, the compound or a pharmaceutically acceptable salt thereof can be formulated for intravenous administration. Compositions for intravenous administration may optionally include a local anesthetic such as lignocaine to lessen pain at the site of the injection. Generally, the ingredients are supplied either separately or mixed together in unit dosage form, for example, as a dry lyophilized powder or water-free concentrate in a hermetically sealed container such as an ampoule or sachette indicating the quantity of active agent. Where the compound or a pharmaceutically acceptable salt thereof is to be administered by infusion, it can be dispensed, for example, with an infusion bottle containing sterile pharmaceutical grade water or saline. Where the compound or a pharmaceutically acceptable salt thereof is administered by injection, an ampoule of sterile water for injection or saline can be provided so that the ingredients may be mixed prior to administration.


[0165] The amount of a compound or a pharmaceutically acceptable salt thereof that will be effective in the treatment of a particular disease will depend on the nature of the disease, and can be determined by standard clinical techniques. In addition, in vitro or in vivo assays may optionally be employed to help identify optimal dosage ranges. The precise dose to be employed will also depend on the route of administration, and the seriousness of the disease, and should be decided according to the judgment of the practitioner and each patient's circumstances. However, suitable dosage ranges for oral administration are generally about 0.001 milligram to about 200 milligrams of a compound or a pharmaceutically acceptable salt thereof per kilogram body weight per day. In specific preferred embodiments of the invention, the oral dose is about 0.01 milligram to about 100 milligrams per kilogram body weight per day, more preferably about 0.1 milligram to about 75 milligrams per kilogram body weight per day, more preferably about 0.5 milligram to 5 milligrams per kilogram body weight per day. The dosage amounts described herein refer to total amounts administered; that is, if more than one compound is administered, or if a compound is administered with a therapeutic agent, then the preferred dosages correspond to the total amount administered. Oral compositions preferably contain about 10% to about 95% active ingredient by weight.
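The per-kilogram ranges recited above translate into total daily amounts by simple multiplication by body weight. The following Python sketch shows only that arithmetic and is not dosing guidance. The dictionary keys, the function name, and the 70 kg example weight are assumptions introduced for the sketch; the numeric bounds are the “about” values recited in this paragraph for oral administration.

```python
# Oral dosage ranges recited in the paragraph above, in milligrams of compound
# per kilogram body weight per day. The key names are hypothetical labels.
ORAL_RANGES_MG_PER_KG_DAY = {
    "general": (0.001, 200.0),
    "preferred": (0.01, 100.0),
    "more_preferred": (0.1, 75.0),
    "narrowest": (0.5, 5.0),
}


def daily_dose_range_mg(body_weight_kg, range_mg_per_kg_day):
    """Scale a (low, high) mg/kg/day range to total mg/day for one body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = range_mg_per_kg_day
    return (low * body_weight_kg, high * body_weight_kg)


if __name__ == "__main__":
    # Illustrative body weight only.
    for label, bounds in ORAL_RANGES_MG_PER_KG_DAY.items():
        low_mg, high_mg = daily_dose_range_mg(70.0, bounds)
        print(f"{label}: {low_mg:g} mg to {high_mg:g} mg per day")
```

Because the recited dosages refer to total amounts administered, the same multiplication applies when more than one compound is given; the result is the combined amount rather than the amount of each compound.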


[0166] Suitable dosage ranges for intravenous (i.v.) administration are about 0.01 milligram to about 100 milligrams per kilogram body weight per day, about 0.1 milligram to about 35 milligrams per kilogram body weight per day, and about 1 milligram to about 10 milligrams per kilogram body weight per day. Suitable dosage ranges for intranasal administration are generally about 0.01 pg/kg body weight per day to about 1 mg/kg body weight per day. Suppositories generally contain about 0.01 milligram to about 50 milligrams of a compound of the invention per kilogram body weight per day and comprise active ingredient in the range of about 0.5% to about 10% by weight.


[0167] Recommended dosages for intradermal, intramuscular, intraperitoneal, subcutaneous, epidural, sublingual, intracerebral, intravaginal, transdermal administration or administration by inhalation are in the range of about 0.001 milligram to about 200 milligrams per kilogram of body weight per day. Suitable doses for topical administration are in the range of about 0.001 milligram to about 1 milligram, depending on the area of administration. Effective doses may be extrapolated from dose-response curves derived from in vitro or animal model test systems. Such animal models and systems are well known in the art.


[0168] The compound and pharmaceutically acceptable salts thereof are preferably assayed in vitro and in vivo, for the desired therapeutic or prophylactic activity, prior to use in humans. For example, in vitro assays can be used to determine whether it is preferable to administer the compound, a pharmaceutically acceptable salt thereof, and/or another therapeutic agent. Animal model systems can be used to demonstrate safety and efficacy.


[0169] A variety of compounds can be used for treating or preventing diseases in mammals. Types of compounds include, but are not limited to, peptides, peptide analogs including peptides comprising non-natural amino acids, e.g., D-amino acids, phosphorous analogs of amino acids, such as α-amino phosphonic acids and α-amino phosphinic acids, or amino acids having non-peptide linkages, nucleic acids, nucleic acid analogs such as phosphorothioates or peptide nucleic acids (“PNAs”), hormones, antigens, synthetic or naturally occurring drugs, opiates, dopamine, serotonin, catecholamines, thrombin, acetylcholine, prostaglandins, organic molecules, pheromones, adenosine, sucrose, glucose, lactose and galactose.



6. EXAMPLE: THERAPEUTIC TARGETS

[0170] The therapeutic targets presented herein are by way of example, and the present invention is not to be limited by the targets described herein. One of skill in the art will understand that the therapeutic targets presented herein as DNA sequences can be converted to RNA sequences.
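The conversion from a listed DNA sequence to the corresponding RNA sequence amounts to replacing each thymine (t) with uracil (u). A minimal Python sketch follows, assuming the listing uses the lowercase letters a, c, g, and t (with n for an unspecified base) and may contain spaces between ten-nucleotide blocks; the function name is hypothetical.

```python
def dna_to_rna(dna: str) -> str:
    """Convert a DNA listing (sense strand) to an uppercase RNA string.

    Whitespace between blocks is removed and every t is replaced by u.
    """
    cleaned = "".join(dna.split()).lower()
    invalid = set(cleaned) - set("acgtn")
    if invalid:
        raise ValueError(f"unexpected characters in sequence: {sorted(invalid)}")
    return cleaned.replace("t", "u").upper()


if __name__ == "__main__":
    # DNA form of the Group I AU-rich element motif (SEQ ID NO: 1).
    print(dna_to_rna("atttatttat ttatttattt a"))
```

Applied to the DNA form shown in the example, the function returns the 21-nucleotide motif written in RNA letters.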



6.1. Tumor Necrosis Factor Alpha (“TNF-α”)

[0171] GenBank Accession # X01394:
(SEQ ID NO: 6)
   1 gcagaggacc agctaagagg gagagaagca actacagacc ccccctgaaa acaaccctca
  61 gacgccacat cccctgacaa gctgccaggc aggttctctt cctctcacat actgacccac
 121 ggctccaccc tctctcccct ggaaaggaca ccatgagcac tgaaagcatg atccgggacg
 181 tggagctggc cgaggaggcg ctccccaaga agacaggggg gccccagggc tccaggcggt
 241 gcttgttcct cagcctcttc tccttcctga tcgtggcagg cgccaccacg ctcttctgcc
 301 tgctgcactt tggagtgatc ggcccccaga gggaagagtt ccccagggac ctctctctaa
 361 tcagccctct ggcccaggca gtcagatcat cttctcgaac cccgagtgac aagcctgtag
 421 cccatgttgt agcaaaccct caagctgagg ggcagctcca gtggctgaac cgccgggcca
 481 atgccctcct ggccaatggc gtggagctga gagataacca gctggtggtg ccatcagagg
 541 gcctgtacct catctactcc caggtcctct tcaagggcca aggctgcccc tccacccatg
 601 tgctcctcac ccacaccatc agccgcatcg ccgtctccta ccagaccaag gtcaacctcc
 661 tctctgccat caagagcccc tgccagaggg agaccccaga gggggctgag gccaagccct
 721 ggtatgagcc catctatctg ggaggggtct tccagctgga gaagggtgac cgactcagcg
 781 ctgagatcaa tcggcccgac tatctcgact ttgccgagtc tgggcaggtc tactttggga
 841 tcattgccct gtgaggagga cgaacatcca accttcccaa acgcctcccc tgccccaatc
 901 cctttattac cccctccttc agacaccctc aacctcttct ggctcaaaaa gagaattggg
 961 ggcttagggt cggaacccaa gcttagaact ttaagcaaca agaccaccac ttcgaaacct
1021 gggattcagg aatgtgtggc ctgcacagtg aattgctggc aaccactaag aattcaaact
1081 ggggcctcca gaactcactg gggcctacag ctttgatccc tgacatctgg aatctggaga
1141 ccagggagcc tttggttctg gccagaatgc tgcaggactt gagaagacct cacctagaaa
1201 ttgacacaag tggaccttag gccttcctct ctccagatgt ttccagactt ccttgagaca
1261 cggagcccag ccctccccat ggagccagct ccctctattt atgtttgcac ttgtgattat
1321 ttattattta tttattattt atttatttac agatgaatgt atttatttgg gagaccgggg
1381 tatcctgggg gacccaatgt aggagctgcc ttggctcaga catgttttcc gtgaaaacgg
1441 agctgaacaa taggctgttc ccatgtagcc ccctggcctc tgtgccttct tttgattatg
1501 ttttttaaaa tatttatctg attaagttgt ctaaacaatg ctgatttggt gaccaactgt
1561 cactcattgc tgagcctctg ctccccaggg gagttgtgtc tgtaatcgcc ctactattca
1621 gtggcgagaa ataaagtttg ctt


[0172] General Target Regions:


[0173] (1) 5′ Untranslated Region—nts 1-152


[0174] (2) 3′ Untranslated Region—nts 852-1643
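The “nts” coordinates above are read here as 1-based and inclusive, which is consistent with the listing: nucleotides 153-155 of SEQ ID NO: 6 are “atg”, immediately after the 5′ untranslated region ending at nucleotide 152. The following Python sketch extracts a region by such coordinates. The function name and the `first_position` parameter (which allows an excerpt of the listing to be used instead of the full sequence) are assumptions made for this illustration.

```python
def extract_region(sequence: str, start: int, end: int, first_position: int = 1) -> str:
    """Return nucleotides start..end (1-based, inclusive) from a sequence listing.

    `first_position` is the coordinate of the first nucleotide in `sequence`,
    so that an excerpt of a longer listing can be addressed by its original
    coordinates. Whitespace between blocks is ignored.
    """
    seq = "".join(sequence.split())
    last_position = first_position + len(seq) - 1
    if not (first_position <= start <= end <= last_position):
        raise ValueError("requested region lies outside the supplied sequence")
    return seq[start - first_position : end - first_position + 1]


if __name__ == "__main__":
    # Excerpt: nucleotides 121-160 of the TNF-alpha listing (SEQ ID NO: 6).
    excerpt = "ggctccaccc tctctcccct ggaaaggaca ccatgagcac"
    print(extract_region(excerpt, 121, 152, first_position=121))  # end of the 5' UTR
    print(extract_region(excerpt, 153, 155, first_position=121))  # next three nucleotides
```

With the full listing supplied and `first_position` left at its default, the same call with the coordinates of paragraphs [0173] and [0174] would return the complete 5′ and 3′ untranslated regions.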


[0175] Initial Specific Target Motif:
Group I AU-Rich Element (ARE) Cluster in 3′ untranslated region: 5′ AUUUAUUUAUUUAUUUAUUUA 3′ (SEQ ID NO: 1)
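The Group I motif consists of overlapping AUUUA pentamers, each sharing its terminal A with the next. As a hedged illustration, the Python sketch below locates overlapping pentamers and the full cluster in an RNA string. The function names and the short flanking sequence in the example are hypothetical; only the motif itself (SEQ ID NO: 1) is taken from the text.

```python
import re

ARE_GROUP_I = "AUUUAUUUAUUUAUUUAUUUA"  # SEQ ID NO: 1


def find_overlapping(rna: str, pattern: str) -> list:
    """Return 1-based start positions of every (possibly overlapping) match.

    A zero-width lookahead is used so that matches sharing nucleotides,
    such as adjacent AUUUA pentamers, are all reported.
    """
    lookahead = re.compile(f"(?={re.escape(pattern)})")
    return [m.start() + 1 for m in lookahead.finditer(rna.upper())]


if __name__ == "__main__":
    # Pentamers within the motif itself.
    print(find_overlapping(ARE_GROUP_I, "AUUUA"))
    # Hypothetical test RNA: the motif with three arbitrary flanking bases each side.
    toy_rna = "GGC" + ARE_GROUP_I + "CCA"
    print(find_overlapping(toy_rna, ARE_GROUP_I))
```

Reporting 1-based positions keeps the output directly comparable with the “nts” coordinates used for the target regions.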



6.2. Granulocyte-macrophage Colony Stimulating Factor (“GM-CSF”)

[0176] GenBank Accession #NM000758:
(SEQ ID NO: 7)
   1 gctggaggat gtggctgcag agcctgctgc tcttgggcac tgtggcctgc agcatctctg
  61 cacccgcccg ctcgcccagc cccagcacgc agccctggga gcatgtgaat gccatccagg
 121 aggcccggcg tctcctgaac ctgagtagag acactgctgc tgagatgaat gaaacagtag
 181 aagtcatctc agaaatgttt gacctccagg agccgacctg cctacagacc cgcctggagc
 241 tgtacaagca gggcctgcgg ggcagcctca ccaagctcaa gggccccttg accatgatgg
 301 ccagccacta caagcagcac tgccctccaa ccccggaaac ttcctgtgca acccagacta
 361 tcacctttga aagtttcaaa gagaacctga aggactttct gcttgtcatc ccctttgact
 421 gctgggagcc agtccaggag tgagaccggc cagatgaggc tggccaagcc ggggagctgc
 481 tctctcatga aacaagagct agaaactcag gatggtcatc ttggagggac caaggggtgg
 541 gccacagcca tggtgggagt ggcctggacc tgccctgggc cacactgacc ctgatacagg
 601 catggcagaa gaatgggaat attttatact gacagaaatc agtaatattt atatatttat
 661 atttttaaaa tatttattta tttatttatt taagttcata ttccatattt attcaagatg
 721 ttttaccgta ataattatta ttaaaaatat gcttct


[0177] GenBank Accession #XM003751:
(SEQ ID NO: 8)
   1 tctggaggat gtggctgcag agcctgctgc tcttgggcac tgtggcctgc agcatctctg
  61 cacccgcccg ctcgcccagc cccagcacgc agccctggga gcatgtgaat gccatccagg
 121 aggcccggcg tctcctgaac ctgagtagag acactgctgc tgagatgaat gaaacagtag
 181 aagtcatctc agaaatgttt gacctccagg agccgacctg cctacagacc cgcctggagc
 241 tgtacaagca gggcctgcgg ggcagcctca ccaagctcaa gggccccttg accatgatgg
 301 ccagccacta caagcagcac tgccctccaa ccccggaaac ttcctgtgca acccagacta
 361 tcacctttga aagtttcaaa gagaacctga aggactttct gcttgtcatc ccctttgact
 421 gctgggagcc agtccaggag tgagaccggc cagatgaggc tggccaagcc ggggagctgc
 481 tctctcatga aacaagagct agaaactcag gatggtcatc ttggagggac caaggggtgg
 541 gccacagcca tggtgggagt ggcctggacc tgccctgggc cacactgacc ctgatacagg
 601 catggcagaa gaatgggaat attttatact gacagaaatc agtaatattt atatatttat
 661 atttttaaaa tatttattta tttatttatt taagttcata ttccatattt attcaagatg
 721 ttttaccgta ataattatta ttaaaaatat gcttct


[0178] General Target Regions:


[0179] (1) 5′ Untranslated Region—nts 1-32


[0180] (2) 3′ Untranslated Region—nts 468-789


[0181] Initial Specific Target Motif:


[0182] Group I AU-Rich Element (ARE) Cluster in 3′ untranslated region 5′ AUUUAUUUAUUUAUUUAUUUA 3′ (SEQ ID NO: 1)



6.3. Interleukin 2 (“IL-2”)

[0183] GenBank Accession #U25676:
(SEQ ID NO: 9)
   1 atcactctct ttaatcacta ctcacattaa cctcaactcc tgccacaatg tacaggatgc
  61 aactcctgtc ttgcattgca ctaattcttg cacttgtcac aaacagtgca cctacttcaa
 121 gttcgacaaa gaaaacaaag aaaacacagc tacaactgga gcatttactg ctggatttac
 181 agatgatttt gaatggaatt aataattaca agaatcccaa actcaccagg atgctcacat
 241 ttaagtttta catgcccaag aaggccacag aactgaaaca gcttcagtgt ctagaagaag
 301 aactcaaacc tctggaggaa gtgctgaatt tagctcaaag caaaaacttt cacttaagac
 361 ccagggactt aatcagcaat atcaacgtaa tagttctgga actaaaggga tctgaaacaa
 421 cattcatgtg tgaatatgca gatgagacag caaccattgt agaatttctg aacagatgga
 481 ttaccttttg tcaaagcatc atctcaacac taacttgata attaagtgct tcccacttaa
 541 aacatatcag gccttctatt tatttattta aatatttaaa ttttatattt attgttgaat
 601 gtatggttgc tacctattgt aactattatt cttaatctta aaactataaa tatggatctt
 661 ttatgattct ttttgtaagc cctaggggct ctaaaatggt ttaccttatt tatcccaaaa
 721 atatttatta ttatgttgaa tgttaaatat agtatctatg tagattggtt agtaaaacta
 781 tttaataaat ttgataaata taaaaaaaaa aaacaaaaaa aaaaa


[0184] General Target Regions:


[0185] (1) 5′ Untranslated Region—nts 1-47


[0186] (2) 3′ Untranslated Region—nts 519-825


[0187] Initial Specific Target Motifs:
Group III AU-Rich Element (ARE) Cluster in 3′ untranslated region: 5′ NAUUUAUUUAUUUAN 3′ (SEQ ID NO: 10)
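In the Group III motif, N denotes any nucleotide, so a search for it must treat the flanking positions as wildcards. The Python sketch below, offered only as an illustration, translates the motif into a regular expression and applies it to nucleotides 541-580 of the IL-2 listing (SEQ ID NO: 9) after conversion to RNA letters. The function names and the restriction of the symbol table to A, C, G, U, and N are assumptions made for this sketch.

```python
import re

ARE_GROUP_III = "NAUUUAUUUAUUUAN"  # SEQ ID NO: 10

# Minimal symbol table for this sketch: N matches any of the four RNA bases.
SYMBOLS = {"A": "A", "C": "C", "G": "G", "U": "U", "N": "[ACGU]"}


def motif_to_regex(motif: str):
    """Compile a motif containing N wildcards into a lookahead regex."""
    body = "".join(SYMBOLS[base] for base in motif.upper())
    return re.compile(f"(?={body})")


def find_motif(dna_listing: str, motif: str, first_position: int = 1) -> list:
    """Return start coordinates of the motif in a DNA listing read as RNA.

    `first_position` is the coordinate of the first nucleotide supplied,
    so that results are reported in the coordinates of the full listing.
    """
    rna = "".join(dna_listing.split()).upper().replace("T", "U")
    return [m.start() + first_position for m in motif_to_regex(motif).finditer(rna)]


if __name__ == "__main__":
    # Nucleotides 541-580 of the IL-2 listing above.
    excerpt = "aacatatcag gccttctatt tatttattta aatatttaaa"
    print(find_motif(excerpt, ARE_GROUP_III, first_position=541))
```

For this excerpt the search reports a single occurrence beginning at nucleotide 557, which lies within the 3′ untranslated region given for IL-2 (nts 519-825).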



6.4. Interleukin 6 (“IL-6”)

[0188] GenBank Accession #NM000600:
(SEQ ID NO: 11)
   1 ttctgccctc gagcccaccg ggaacgaaag agaagctcta tctcgcctcc aggagcccag
  61 ctatgaactc cttctccaca agcgccttcg gtccagttgc cttctccctg gggctgctcc
 121 tggtgttgcc tgctgccttc cctgccccag tacccccagg agaagattcc aaagatgtag
 181 ccgccccaca cagacagcca ctcacctctt cagaacgaat tgacaaacaa attcggtaca
 241 tcctcgacgg catctcagcc ctgagaaagg agacatgtaa caagagtaac atgtgtgaaa
 301 gcagcaaaga ggcactggca gaaaacaacc tgaaccttcc aaagatggct gaaaaagatg
 361 gatgcttcca atctggattc aatgaggaga cttgcctggt gaaaatcatc actggtcttt
 421 tggagtttga ggtataccta gagtacctcc agaacagatt tgagagtagt gaggaacaag
 481 ccagagctgt gcagatgagt acaaaagtcc tgatccagtt cctgcagaaa aaggcaaaga
 541 atctagatgc aataaccacc cctgacccaa ccacaaatgc cagcctgctg acgaagctgc
 601 aggcacagaa ccagtggctg caggacatga caactcatct cattctgcgc agctttaagg
 661 agttcctgca gtccagcctg agggctcttc ggcaaatgta gcatgggcac ctcagattgt
 721 tgttgttaat gggcattcct tcttctggtc agaaacctgt ccactgggca cagaacttat
 781 gttgttctct atggagaact aaaagtatga gcgttaggac actattttaa ttatttttaa
 841 tttattaata tttaaatatg tgaagctgag ttaatttatg taagtcatat ttatattttt
 901 aagaagtacc acttgaaaca ttttatgtat tagttttgaa ataataatgg aaagtggcta
 961 tgcagtttga atatcctttg tttcagagcc agatcatttc ttggaaagtg taggcttacc
1021 tcaaataaat ggctaactta tacatatttt taaagaaata tttatattgt atttatataa
1081 tgtataaatg gtttttatac caataaatgg cattttaaaa aattc


[0189] General Target Regions:


[0190] (1) 5′ Untranslated Region—nts 1-62


[0191] (2) 3′ Untranslated Region—nts 699-1125


[0192] Initial Specific Target Motifs:
Group III AU-Rich Element (ARE) Cluster in 3′ untranslated region: 5′ NAUUUAUUUAUUUAN 3′ (SEQ ID NO: 10)



6.5. Vascular Endothelial Growth Factor (“VEGF”)

[0193] GenBank Accession #AF022375:
(SEQ ID NO: 12)
   1 aagagctcca gagagaagtc gaggaagaga gagacggggt cagagagagc gcgcgggcgt
  61 gcgagcagcg aaagcgacag gggcaaagtg agtgacctgc ttttgggggt gaccgccgga
 121 gcgcggcgtg agccctcccc cttgggatcc cgcagctgac cagtcgcgct gacggacaga
 181 cagacagaca ccgcccccag ccccagttac cacctcctcc ccggccggcg gcggacagtg
 241 gacgcggcgg cgagccgcgg gcaggggccg gagcccgccc ccggaggcgg ggtggagggg
 301 gtcggagctc gcggcgtcgc actgaaactt ttcgtccaac ttctgggctg ttctcgcttc
 361 ggaggagccg tggtccgcgc gggggaagcc gagccgagcg gagccgcgag aagtgctagc
 421 tcgggccggg aggagccgca gccggaggag ggggaggagg aagaagagaa ggaagaggag
 481 agggggccgc agtggcgact cggcgctcgg aagccgggct catggacggg tgaggcggcg
 541 gtgtgcgcag acagtgctcc agcgcgcgcg ctccccagcc ctggcccggc ctcgggccgg
 601 gaggaagagt agctcgccga ggcgccgagg agagcgggcc gccccacagc ccgagccgga
 661 gagggacgcg agccgcgcgc cccggtcggg cctccgaaac catgaacttt ctgctgtctt
 721 gggtgcattg gagccttgcc ttgctgctct acctccacca tgccaagtgg tcccaggctg
 781 cacccatggc agaaggagga gggcagaatc atcacgaagt ggtgaagttc atggatgtct
 841 atcagcgcag ctactgccat ccaatcgaga ccctggtgga catcttccag gagtaccctg
 901 atgagatcga gtacatcttc aagccatcct gtgtgcccct gatgcgatgc gggggctgct
 961 ccaatgacga gggcctggag tgtgtgccca ctgaggagtc caacatcacc atgcagatta
1021 tgcggatcaa acctcaccaa ggccagcaca taggagagat gagcttccta cagcacaaca
1081 aatgtgaatg cagaccaaag aaagatagag caagacaaga aaatccctgt gggccttgct
1141 cagagcggag aaagcatttg tttgtacaag atccgcagac gtgtaaatgt tcctgcaaaa
1201 acacacactc gcgttgcaag gcgaggcagc ttgagttaaa cgaacgtact tgcagatgtg
1261 acaagccgag gcggtgagcc gggcaggagg aaggagcctc cctcagggtt tcgggaacca
1321 gatctctctc caggaaagac tgatacagaa cgatcgatac agaaaccacg ctgccgccac
1381 cacaccatca ccatcgacag aacagtcctt aatccagaaa cctgaaatga aggaagagga
1441 gactctgcgc agagcacttt gggtccggag ggcgagactc cggcggaagc attcccgggc
1501 gggtgaccca gcacggtccc tcttggaatt ggattcgcca ttttattttt cttgctgcta
1561 aatcaccgag cccggaagat tagagagttt tatttctggg attcctgtag acacacccac
1621 ccacatacat acatttatat atatatatat tatatatata taaaaataaa tatctctatt
1681 ttatatatat aaaatatata tattcttttt ttaaattaac agtgctaatg ttattggtgt
1741 cttcactgga tgtatttgac tgctgtggac ttgagttggg aggggaatgt tcccactcag
1801 atcctgacag ggaagaggag gagatgagag actctggcat gatctttttt ttgtcccact
1861 tggtggggcc agggtcctct cccctgccca agaatgtgca aggccagggc atgggggcaa
1921 atatgaccca gttttgggaa caccgacaaa cccagccctg gcgctgagcc tctctacccc
1981 aggtcagacg gacagaaaga caaatcacag gttccgggat gaggacaccg gctctgacca
2041 ggagtttggg gagcttcagg acattgctgt gctttgggga ttccctccac atgctgcacg
2101 cgcatctcgc ccccaggggc actgcctgga agattcagga gcctgggcgg ccttcgctta
2161 ctctcacctg cttctgagtt gcccaggagg ccactggcag atgtcccggc gaagagaaga
2221 gacacattgt tggaagaagc agcccatgac agcgcccctt cctgggactc gccctcatcc
2281 tcttcctgct ccccttcctg gggtgcagcc taaaaggacc tatgtcctca caccattgaa
2341 accactagtt ctgtcccccc aggaaacctg gttgtgtgtg tgtgagtggt tgaccttcct
2401 ccatcccctg gtccttccct tcccttcccg aggcacagag agacagggca ggatccacgt
2461 gcccattgtg gaggcagaga aaagagaaag tgttttatat acggtactta tttaatatcc
2521 ctttttaatt agaaattaga acagttaatt taattaaaga gtagggtttt ttttcagtat
2581 tcttggttaa tatttaattt caactattta tgagatgtat cttttgctct ctcttgctct
2641 cttatttgta ccggtttttg tatataaaat tcatgtttcc aatctctctc tccctgatcg
2701 gtgacagtca ctagcttatc ttgaacagat atttaatttt gctaacactc agctctgccc
2761 tccccgatcc cctggctccc cagcacacat tcctttgaaa gagggtttca atatacatct
2821 acatactata tatatattgg gcaacttgta tttgtgtgta tatatatata tatatgttta
2881 tgtatatatg tgatcctgaa aaaataaaca tcgctattct gttttttata tgttcaaacc
2941 aaacaagaaa aaatagagaa ttctacatac taaatctctc tcctttttta attttaatat
3001 ttgttatcat ttatttattg gtgctactgt ttatccgtaa taattgtggg gaaaagatat
3061 taacatcacg tctttgtctc tagtgcagtt tttcgagata ttccgtagta catatttatt
3121 tttaaacaac gacaaagaaa tacagatata tcttaaaaaa aaaaaa


[0194] General Target Regions:


[0195] (1) 5′ Untranslated Region—nts 1-701


[0196] (2) 3′ Untranslated Region—nts 1275-3166


[0197] Initial Specific Target Motifs:
(1) Internal Ribosome Entry Site (IRES) in 5′ untranslated region, nts 513-704: 5′ CCGGGCUCAUGGACGGGUGAGGCGGCGGUGUGCGCAGACAGUGCUCCAGCGCGCGCGCUCCCCAGCCCUGGCCCGGCCUCGGGCCGGGAGGAAGAGUAGCUCGCCGAGGCGCCGAGGAGAGCGGGCCGCCCCACAGCCCGAGCCGGAGAGGGACGCGAGCCGCGCGCCCCGGUCGGGCCUCCGAAACCAUGAACUUUCUGCUGUCUUGGGUGCAUUGGAGCCUUGCCUUGCUGCUCUACCUCCACCAUG 3′ (SEQ ID NO: 13)

(2) Group III AU-Rich Element (ARE) Cluster in 3′ untranslated region: 5′ NAUUUAUUUAUUUAN 3′ (SEQ ID NO: 10)



6.6. Human Immunodeficiency Virus I (“HIV-1”)

[0198] GenBank Accession #NC001802:
12   1ggtctctctg gttagaccag atctgagcct gggagctctc tggctaacta gggaacccac(SEQ ID NO: 14)  61tgcttaagcc tcaataaagc ttgccttgag tgcttcaagt agtgtgtgcc cgtctgttgt 121gtgactctgg taactagaga tccctcagac ccttttagtc agtgtggaaa atctctagca 181gtggcgcccg aacagggacc tgaaagcgaa agggaaacca gaggagctct ctcgacgcag 241gactcggctt gctgaagcgc gcacggcaag aggcgagggg cggcgactgg tgagtacgcc 301aaaaattttg actagcggag gctagaagga gagagatggg tgcgagagcg tcagtattaa 361gcgggggaga attagatcga tgggaaaaaa ttcggttaag gccaggggga aagaaaaaat 421ataaattaaa acatatagta tgggcaagca gggagctaga acgattcgca gttaatcctg 481gcctgttaga aacatcagaa ggctgtagac aaatactggg acagctacaa ccatcccttc 541agacaggatc agaagaactt agatcattat ataatacagt agcaaccctc tattgtgtgc 601atcaaaggat agagataaaa gacaccaagg aagctttaga caagatagag gaagagcaaa 661acaaaagtaa gaaaaaagca cagcaagcag cagctgacac aggacacagc aatcaggtca 721gccaaaatta ccctatagtg cagaacatcc aggggcaaat ggtacatcag gccatatcac 781ctagaacttt aaatgcatgg gtaaaagtag tagaagagaa ggctttaagc ccagaagtga 841tacccatgtt ttcagcatta tcagaaggag ccaccccaca agatttaaac accatgctaa 901acacagtggg gggacatcaa gcagccatgc aaatgttaaa agagaccatc aatgaggaag 961ctgcagaatg ggatagagtg catccagtgc atgcagggcc tattgcacca ggccagatga1021gagaaccaag gggaagtgac atagcaggaa ctactagtac ccttcaggaa caaataggat1081ggatgacaaa taatccacct atcccagtag gagaaattta taaaagatgg ataatcctgg1141gattaaataa aatagtaaga atgtatagcc ctaccagcat tctggacata agacaaggac1201caaaggaacc ctttagagac tatgtagacc ggttctataa aactctaaga gccgagcaag1261cttcacagga ggtaaaaaat tggatgacag aaaccttgtt ggtccaaaat gcgaacccag1321attgtaagac tattttaaaa gcattgggac cagcggctac actagaagaa atgatgacag1381catgtcaggg agtaggagga cccggccata aggcaagagt tttggctgaa gcaatgagcc1441aagtaacaaa ttcagctacc ataatgatgc agagaggcaa ttttaggaac caaagaaaga1501ttgttaagtg tttcaattgt ggcaaagaag ggcacacagc cagaaattgc agggccccta1561ggaaaaaggg ctgttggaaa tgtggaaagg aaggacacca aatgaaagat tgtactgaga1621gacaggctaa ttttttaggg aagatctggc cttcctacaa gggaaggcca gggaattttc1681ttcagagcag accagagcca acagccccac cagaagagag 
cttcaggtct ggggtagaga1741caacaactcc ccctcagaag caggagccga tagacaagga actgtatcct ttaacttccc1801tcaggtcact ctttggcaac gacccctcgt cacaataaag ataggggggc aactaaagga1861agctctatta gatacaggag cagatgatac agtattagaa gaaatgagtt tgccaggaag1921atggaaacca aaaatgatag ggggaattgg aggttttatc aaagtaagac agtatgatca1981gatactcata gaaatctgtg gacataaagc tataggtaca gtattagtag gacctacacc2041tgtcaacata attggaagaa atctgttgac tcagattggt tgcactttaa attttcccat2101tagccctatt gagactgtac cagtaaaatt aaagccagga atggatggcc caaaagttaa2161acaatggcca ttgacagaag aaaaaataaa agcattagta gaaatttgta cagagatgga2221aaaggaaggg aaaatttcaa aaattgggcc tgaaaatcca tacaatactc cagtatttgc2281cataaagaaa aaagacagta ctaaatggag aaaattagta gatttcagag aacttaataa2341gagaactcaa gacttctggg aagttcaatt aggaatacca catcccgcag ggttaaaaaa2401gaaaaaatca gtaacagtac tggatgtggg tgatgcatat ttttcagttc ccttagatga2461agacttcagg aagtatactg catttaccat acctagtata aacaatgaga caccagggat2521tagatatcag tacaatgtgc ttccacaggg atggaaagga tcaccagcaa tattccaaag2581tagcatgaca aaaatcttag agccttttag aaaacaaaat ccagacatag ttatctatca2641atacatggat gatttgtatg taggatctga cttagaaata gggcagcata gaacaaaaat2701agaggagctg agacaacatc tgttgaggtg gggacttacc acaccagaca aaaaacatca2761gaaagaacct ccattccttt ggatgggtta tgaactccat cctgataaat ggacagtaca2821gcctatagtg ctgccagaaa aagacagctg gactgtcaat gacatacaga agttagtggg2881gaaattgaat tgggcaagtc agatttaccc agggattaaa gtaaggcaat tatgtaaact2941ccttagagga accaaagcac taacagaagt aataccacta acagaagaag cagagctaga3001actggcagaa aacagagaga ttctaaaaga accagtacat ggagtgtatt atgacccatc3061aaaagactta atagcagaaa tacagaagca ggggcaaggc caatggacat atcaaattta3121tcaagagcca tttaaaaatc tgaaaacagg aaaatatgca agaatgaggg gtgcccacac3181taatgatgta aaacaattaa cagaggcagt gcaaaaaata accacagaaa gcatagtaat3241atggggaaag actcctaaat ttaaactgcc catacaaaag gaaacatggg aaacatggtg3301gacagagtat tggcaagcca cctggattcc tgagtgggag tttgttaata cccctccctt3361agtgaaatta tggtaccagt tagagaaaga acccatagta ggagcagaaa ccttctatgt3421agatggggca gctaacaggg agactaaatt 
aggaaaagca ggatatgtta ctaatagagg3481aagacaaaaa gttgtcaccc taactgacac aacaaatcag aagactgagt tacaagcaat3541ttatctagct ttgcaggatt cgggattaga agtaaacata gtaacagact cacaatatgc3601attaggaatc attcaagcac aaccagatca aagtgaatca gagttagtca atcaaataat3661agagcagtta ataaaaaagg aaaaggtcta tctggcatgg gtaccagcac acaaaggaat3721tggaggaaat gaacaagtag ataaattagt cagtgctgga atcaggaaag tactattttt3781agatggaata gataaggccc aagatgaaca tgagaaatat cacagtaatt ggagagcaat3841ggctagtgat tttaacctgc cacctgtagt agcaaaagaa atagtagcca gctgtgataa3901atgtcagcta aaaggagaag ccatgcatgg acaagtagac tgtagtccag gaatatggca3961actagattgt acacatttag aaggaaaagt tatcctggta gcagttcatg tagccagtgg4021atatatagaa gcagaagtta ttccagcaga aacagggcag gaaacagcat attttctttt4081aaaattagca ggaagatggc cagtaaaaac aatacatact gacaatggca gcaatttcac4141cggtgctacg gttagggccg cctgttggtg ggcgggaatc aagcaggaat ttggaattcc4201ctacaatccc caaagtcaag gagtagtaga atctatgaat aaagaattaa agaaaattat4261aggacaggta agagatcagg ctgaacatct taagacagca gtacaaatgg cagtattcat4321ccacaatttt aaaagaaaag gggggattgg ggggtacagt gcaggggaaa gaatagtaga4381cataatagca acagacatac aaactaaaga attacaaaaa caaattacaa aaattcaaaa4441ttttcgggtt tattacaggg acagcagaaa tccactttgg aaaggaccag caaagctcct4501ctggaaaggt gaaggggcag tagtaataca agataatagt gacataaaag tagtgccaag4561aagaaaagca aagatcatta gggattatgg aaaacagatg gcaggtgatg attgtgtggc4621aagtagacag gatgaggatt agaacatgga aaagtttagt aaaacaccat atgtatgttt4681cagggaaagc taggggatgg ttttatagac atcactatga aagccctcat ccaagaataa4741gttcagaagt acacatccca ctaggggatg ctagattggt aataacaaca tattggggtc4801tgcatacagg agaaagagac tggcatttgg gtcagggagt ctccatagaa tggaggaaaa4861agagatatag cacacaagta gaccctgaac tagcagacca actaattcat ctgtattact4921ttgactgttt ttcagactct gctataagaa aggccttatt aggacacata gttagcccta4981ggtgtgaata tcaagcagga cataacaagg taggatctct acaatacttg gcactagcag5041cattaataac accaaaaaag ataaagccac ctttgcctag tgttacgaaa ctgacagagg5101atagatggaa caagccccag aagaccaagg gccacagagg gagccacaca atgaatggac5161actagagctt ttagaggagc 
ttaagaatga agctgttaga cattttccta ggatttggct5221ccatggctta gggcaacata tctatgaaac ttatggggat acttgggcag gagtggaagc5281cataataaga attctgcaac aactgctgtt tatccatttt cagaattggg tgtcgacata5341gcagaatagg cgttactcga cagaggagag caagaaatgg agccagtaga tcctagacta5401gagccctgga agcatccagg aagtcagcct aaaactgctt gtaccaattg ctattgtaaa5461aagtgttgct ttcattgcca agtttgtttc ataacaaaag ccttaggcat ctcctatggc5521aggaagaagc ggagacagcg acgaagagct catcagaaca gtcagactca tcaagcttct5581ctatcaaagc agtaagtagt acatgtaatg caacctatac caatagtagc aatagtagca5641ttagtagtag caataataat agcaatagtt gtgtggtcca tagtaatcat agaatatagg5701aaaatattaa gacaaagaaa aatagacagg ttaattgata gactaataga aagagcagaa5761gacagtggca atgagagtga aggagaaata tcagcacttg tggagatggg ggtggagatg5821gggcaccatg ctccttggga tgttgatgat ctgtagtgct acagaaaaat tgtgggtcac5881agtctattat ggggtacctg tgtggaagga agcaaccacc actctatttt gtgcatcaga5941tgctaaagca tatgatacag aggtacataa tgtttgggcc acacatgcct gtgtacccac6001agaccccaac ccacaagaag tagtattggt aaatgtgaca gaaaatttta acatgtggaa6061aaatgacatg gtagaacaga tgcatgagga tataatcagt ttatgggatc aaagcctaaa6121gccatgtgta aaattaaccc cactctgtgt tagtttaaag tgcactgatt tgaagaatga6181tactaatacc aatagtagta gcgggagaat gataatggag aaaggagaga taaaaaactg6241ctctttcaat atcagcacaa gcataagagg taaggtgcag aaagaatatg cattttttta6301taaacttgat ataataccaa tagataatga tactaccagc tataagttga caagttgtaa6361cacctcagtc attacacagg cctgtccaaa ggtatccttt gagccaattc ccatacatta6421ttgtgccccg gctggttttg cgattctaaa atgtaataat aagacgttca atggaacagg6481accatgtaca aatgtcagca cagtacaatg tacacatgga attaggccag tagtatcaac6541tcaactgctg ttaaatggca gtctagcaga agaagaggta gtaattagat ctgtcaattt6601cacggacaat gctaaaacca taatagtaca gctgaacaca tctgtagaaa ttaattgtac6661aagacccaac aacaatacaa gaaaaagaat ccgtatccag agaggaccag ggagagcatt6721tgttacaata ggaaaaatag gaaatatgag acaagcacat tgtaacatta gtagagcaaa6781atggaataac actttaaaac agatagctag caaattaaga gaacaatttg gaaataataa6841aacaataatc tttaagcaat cctcaggagg ggacccagaa attgtaacgc acagttttaa6901ttgtggaggg 
gaatttttct actgtaattc aacacaactg tttaatagta cttggtttaa6961tagtacttgg agtactgaag ggtcaaataa cactgaagga agtgacacaa tcaccctccc7021atgcagaata aaacaaatta taaacatgtg gcagaaagta ggaaaagcaa tgtatgcccc7081tcccatcagt ggacaaatta gatgttcatc aaatattaca gggctgctat taacaagaga7141tggtggtaat agcaacaatg agtccgagat cttcagacct ggaggaggag atatgaggga7201caattggaga agtgaattat ataaatataa agtagtaaaa attgaaccat taggagtagc7261acccaccaag gcaaagagaa gagtggtgca gagagaaaaa agagcagtgg gaataggagc7321tttgttcctt gggttcttgg gagcagcagg aagcactatg ggcgcagcct caatgacgct7381gacggtacag gccagacaat tattgtctgg tatagtgcag cagcagaaca atttgctgag7441ggctattgag gcgcaacagc atctgttgca actcacagtc tggggcatca agcagctcca7501ggcaagaatc ctggctgtgg aaagatacct aaaggatcaa cagctcctgg ggatttgggg7561ttgctctgga aaactcattt gcaccactgc tgtgccttgg aatgctagtt ggagtaataa7621atctctggaa cagatttgga atcacacgac ctggatggag tgggacagag aaattaacaa7681ttacacaagc ttaatacact ccttaattga agaatcgcaa aaccagcaag aaaagaatga7741acaagaatta ttggaattag ataaatgggc aagtttgtgg aattggttta acataacaaa7801ttggctgtgg tatataaaat tattcataat gatagtagga ggcttggtag gtttaagaat7861agtttttgct gtactttcta tagtgaatag agttaggcag ggatattcac cattatcgtt7921tcagacccac ctcccaaccc cgaggggacc cgacaggccc gaaggaatag aagaagaagg7981tggagagaga gacagagaca gatccattcg attagtgaac ggatccttgg cacttatctg8041ggacgatctg cggagcctgt gcctcttcag ctaccaccgc ttgagagact tactcttgat8101tgtaacgagg attgtggaac ttctgggacg cagggggtgg gaagccctca aatattggtg8161gaatctccta cagtattgga gtcaggaact aaagaatagt gctgttagct tgctcaatgc8221cacagccata gcagtagctg aggggacaga tagggttata gaagtagtac aaggagcttg8281tagagctatt cgccacatac ctagaagaat aagacagggc ttggaaagga ttttgctata8341agatgggtgg caagtggtca aaaagtagtg tgattggatg gcctactgta agggaaagaa8401tgagacgagc tgagccagca gcagataggg tgggagcagc atctcgagac ctggaaaaac8461atggagcaat cacaagtagc aatacagcag ctaccaatgc tgcttgtgcc tggctagaag8521cacaagagga ggaggaggtg ggttttccag tcacacctca ggtaccttta agaccaatga8581cttacaaggc agctgtagat cttagccact ttttaaaaga aaagggggga 
ctggaagggc8641taattcactc ccaaagaaga caagatatcc ttgatctgtg gatctaccac acacaaggct8701acttccctga ttagcagaac tacacaccag ggccaggggt cagatatcca ctgacctttg8761gatggtgcta caagctagta ccagttgagc cagataagat agaagaggcc aataaaggag8821agaacaccag cttgttacac cctgtgagcc tgcatgggat ggatgacccg gagagagaag8881tgttagagtg gaggtttgac agccgcctag catttcatca cgtggcccga gagctgcatc8941cggagtactt caagaactgc tgacatcgag cttgctacaa gggactttcc gctggggact9001ttccagggag gcgtggcctg ggcgggactg gggagtggcg agccctcaga tcctgcatat9061aagcagctgc tttttgcctg tactgggtct ctctggttag accagatctg agcctgggag9121ctctctggct aactagggaa cccactgctt aagcctcaat aaagcttgcc ttgagtgctt9181c


[0199] Initial Specific Target Motifs:


[0200] (1) Trans-activation response region/Tat protein binding site—TAR RNA—nts 1-60


[0201] “Minimal” TAR RNA element
5′ GGCAGAUCUGAGCCUGGGAGCUCUCUGCC 3′ (SEQ ID NO: 15)


[0202] (2) Gag/Pol Frameshifting Site—“Minimal” frameshifting element
5′ UUUUUUAGGGAAGAUCUGGCCUUCCUACAAGGGAAGGCCAGGGAAUUUUCUU 3′ (SEQ ID NO: 16)



6.7. Hepatitis C Virus (“HCV”—Genotypes 1a & 1b)

[0203] GenBank Accession #NC001433:
15   1ttgggggcga cactccacca tagatcactc ccctgtgagg aactactgtc ttcacgcaga(SEQ ID NO: 17)  61aagcgtctag ccatggcgtt agtatgagtg ttgtgcagcc tccaggaccc cccctcccgg 121gagagccata gtggtctgcg gaaccggtga gtacaccgga attgccagga cgaccgggtc 181ctttcttgga tcaacccgct caatgcctgg agatttgggc gtgcccccgc gagactgcta 241gccgagtagt gttgggtcgc gaaaggcctt gtggtactgc ctgatagggt gcttgcgagt 301gccccgggag gtctcgtaga ccgtgcatca tgagcacaaa tcctaaacct caaagaaaaa 361ccaaacgtaa caccaaccgc cgcccacagg acgttaagtt cccgggcggt ggtcagatcg 421ttggtggagt ttacctgttg ccgcgcaggg gccccaggtt gggtgtgcgc gcgactagga 481agacttccga gcggtcgcaa cctcgtggaa ggcgacaacc tatccccaag gctcgccggc 541ccgagggtag gacctgggct cagcccgggt acccttggcc cctctatggc aacgagggta 601tggggtgggc aggatggctc ctgtcacccc gtggctctcg gcctagttgg ggccccacag 661acccccggcg taggtcgcgt aatttgggta aggtcatcga tacccttaca tgcggcttcg 721ccgacctcat ggggtacatt ccgcttgtcg gcgcccccct agggggcgct gccagggccc 781tggcacatgg tgtccgggtt ctggaggacg gcgtgaacta tgcaacaggg aatctgcccg 841gttgctcttt ctctatcttc ctcttagctt tgctgtcttg tttgaccatc ccagcttccg 901cttacgaggt gcgcaacgtg tccgggatat accatgtcac gaacgactgc tccaactcaa 961gtattgtgta tgaggcagcg gacatgatca tgcacacccc cgggtgcgtg ccctgcgtcc1021gggagagtaa tttctcccgt tgctgggtag cgctcactcc cacgctcgcg gccaggaaca1081gcagcatccc caccacgaca atacgacgcc acgtcgattt gctcgttggg gcggctgctc1141tctgttccgc tatgtacgtt ggggatctct gcggatccgt ttttctcgtc tcccagctgt1201tcaccttctc acctcgccgg tatgagacgg tacaagattg caattgctca atctatcccg1261gccacgtatc aggtcaccgc atggcttggg atatgatgat gaactggtca cctacaacgg1321ccctagtggt atcgcagcta ctccggatcc cacaagccgt cgtggacatg gtggcggggg1381cccactgggg tgtcctagcg ggccttgcct actattccat ggtggggaac tgggctaagg1441tcttgattgt gatgctactc tttgctggcg ttgacgggca cacccacgtg acagggggaa1501gggtagcctc cagcacccag agcctcgtgt cctggctctc acaaggccca tctcagaaaa1561tccaactcgt gaacaccaac ggcagctggc acatcaacag gaccgctctg aattgcaatg1621actccctcca aactgggttc attgctgcgc tgttctacgc acacaggttc aacgcgtccg1681ggtgcccaga gcgcatggct agctgccgcc ccatcgatga 
gttcgctcag gggtggggtc1741ccatcactca tgatatgcct gagagctcgg accagaggcc atattgctgg cactacgcgc1801ctcgaccgtg cgggatcgtg cctgcgtcgc aggtgtgtgg tccagtgtat tgcttcactc1861cgagccctgt tgtagtgggg acgaccgatc gtttcggcgc tcctacgtat agctgggggg1921agaatgagac agacgtgctg ctacttagca acacgcggcc gcctcaaggc aactggtttg1981ggtgcacgtg gatgaacagc actgggttca ccaagacgtg cgggggccct ccgtgcaaca2041tcgggggggt cggcaacaac accttggtct gccccacgga ttgcttccgg aagcaccccg2101aggccactta cacaaagtgt ggctcggggc cctggttgac acccaggtgc atggttgact2161acccatacag gctctggcac tacccctgca ctgttaactt taccgtcttt aaggtcagga2221tgtatgtggg gggcgtggag cacaggctca atgctgcatg caattggact cgaggagagc2281gctgtgactt ggaggacagg gataggtcag aactcagccc gctgctgctg tctacaacag2341agtggcagat actgccctgt tccttcacca ccctaccggc cctgtccact ggcttgatcc2401atcttcaccg gaacatcgtg gacgtgcaat acctgtacgg tatagggtcg gcagttgtct2461cctttgcaat caaatgggag tatatcctgt tgcttttcct tcttctggcg gacgcgcgcg2521tctgtgcctg cttgtggatg atgctgctga tagcccaggc tgaggccacc ttagagaacc2581tggtggtcct caatgcggcg tctgtggccg gagcgcatgg ccttctctcc ttcctcgtgt2641tcttctgcgc cgcctggtac atcaaaggca ggctggtccc tggggcggca tatgctctct2701atggcgtatg gccgttgctc ctgctcttgc tggccttacc accacgagct tatgccatgg2761accgagagat ggctgcatcg tgcggaggcg cggtttttgt aggtctggta ctcttgacct2821tgtcaccata ctataaggtg ttcctcgcta ggctcatatg gtggttacaa tattttatca2881ccagagccga ggcgcacttg caagtgtggg tcccccctct caatgttcgg ggaggccgcg2941atgccatcat cctccttaca tgcgcggtcc atccagagct aatctttgac atcaccaaac3001tcctgctcgc catactcggt ccgctcatgg tgctccaggc tggcataact agagtgccgt3061actttgtacg cgctcagggg ctcatccgtg catgcatgtt agtgcggaag gtcgctggag3121gccactatgt ccaaatggcc ttcatgaagc tggccgcgct gacaggtacg tacgtatatg3181accatcttac tccactgcgg gattgggccc acgcgggcct acgagacctt gcggtggcag3241tagagcccgt cgtcttctct gacatggaga ctaaactcat cacctggggg gcagacaccg3301cggcgtgtgg ggacatcatc tcgggtctac cagtctccgc ccgaaggggg aaggagatac3361ttctaggacc ggccgatagt tttggagagc aggggtggcg gctccttgcg cctatcacgg3421cctattccca acaaacgcgg ggcctgcttg 
gctgtatcat cactagcctc acaggtcggg3481acaagaacca ggtcgatggg gaggttcagg tgctctccac cgcaacgcaa tctttcctgg3541cgacctgcgt caatggcgtg tgttggaccg tctaccatgg tgccggctcg aagaccctgg3601ccggcccgaa gggtccaatc acccaaatgt acaccaatgt agaccaggac ctcgtcggct3661ggccggcgcc ccccggggcg cgctccatga caccgtgcac ctgcggcagc tcggaccttt3721acttggtcac gaggcatgct gatgtcgttc cggtgcgccg gcggggcgac agcaggggga3781gcctgctttc ccccaggccc atctcctacc tgaagggctc ctcgggtgga ccactgcttt3841gcccttcggg gcacgttgta ggcatcttcc gggctgctgt gtgcacccgg ggggttgcga3901aggcggtgga cttcataccc gttgagtcta tggaaactac catgcggtct ccggtcttca3961cagacaactc atcccctccg gccgtaccgc aaacattcca agtggcacat ttacacgctc4021ccactggcag cggcaagagc accaaagtgc cggctgcata tgcagcccaa gggtacaagg4081tgctcgtcct aaacccgtcc gttgccgcca cattgggctt tggagcgtat atgtccaagg4141cacatggcat cgagcctaac atcagaactg gggtaaggac catcaccacg ggcggcccca4201tcacgtactc cacctattgc aagttccttg ccgacggtgg atgctccggg ggcgcctatg4261acatcataat atgtgatgaa tgccactcaa ctgactcgac taccatcttg ggcatcggca4321cagtcctgga tcaggcagag acggctggag cgcggctcgt cgtgctcgcc accgccacgc4381ctccgggatc gatcaccgtg ccacacccca acatcgagga agtggccctg tccaacactg4441gagagattcc cttctatggc aaagccatcc ccattgaggc catcaagggg ggaaggcatc4501tcatcttctg ccattccaag aagaagtgtg acgagctcgc cgcaaagctg acaggcctcg4561gactcaatgc tgtagcgtat taccggggtc tcgatgtgtc cgtcataccg actagcggag4621acgtcgttgt cgtggcaaca gacgctctaa tgacgggttt taccggcgac tttgactcag4681tgatcgactg caacacatgt gtcacccaga cagtcgattt cagcttggat cccaccttca4741ccattgagac gacaacgctg ccccaagacg cggtgtcgcg tgcgcagcgg cgaggtagga4801ctggcagggg caggagtggc atctacaggt ttgtgactcc aggagaacgg ccctcaggca4861tgttcgactc ctcggtcctg tgtgagtgct atgacgcagg ctgcgcttgg tatgagctca4921cgcccgctga gacctcggtt aggttgcggg cttacctaaa tacaccaggg ttgcccgtct4981gccaggacca cctagagttc tgggagagcg tcttcacagg cctcacccac atagatgccc5041acttcttgtc ccagaccaaa caggcaggag acaacctccc ctacctggta gcataccaag5101ccacagtgtg cgccagggct caggctccac ctccatcgtg ggaccaaatg tggaagtgtc5161tcatacggct aaagcccaca 
ctgcatgggc caacgcccct gctgtacagg ctaggagccg5221ttcaaaatga ggtcactctc acacacccca taaccaaata catcatggca tgcatgtcgg5281ctgacctgga ggtcgtcact agcacctggg tgctagtagg cggagtcctt gcggctctgg5341ccgcgtactg cctgacgaca ggcagcgtgg tcattgtggg caggatcatc ttgtccggga5401ggccagctgt tattcccgac agggaagtcc tctaccagga gttcgatgag atggaagagt5461gtgcttcaca cctcccttac atcgagcaag gaatgcagct cgccgagcaa ttcaaacaga5521aggcgctcgg attgctgcaa acagccacca agcaagcgga ggctgctgct cccgtggtgg5581agtccaagtg gcgagccctt gaggtcttct gggcgaaaca catgtggaac ttcatcagcg5641ggatacagta cttggcaggc ctatccactc tgcctggaaa ccccgcgata gcatcattga5701tggcttttac agcctctatc accagcccgc tcaccaccca aaataccctc ctgtttaaca5761tcttgggggg atgggtggct gcccaactcg ctccccccag cgctgcttcg gctttcgtgg5821gcgccggcat tgccggtgcg gccgttggca gcataggtct cgggaaggta cttgtggaca5881ttctggcggg ctatggggcg ggggtggctg gcgcactcgt ggcctttaag gtcatgagcg5941gcgagatgcc ctccactgag gatctggtta atttactccc tgccatcctt tctcctggcg6001ccctggttgt cggggtcgtg tgcgcagcaa tactgcgtcg gcacgtgggc ccgggagagg6061gggctgtgca gtggatgaac cggctgatag cgttcgcttc gcggggtaac cacgtctccc6121ccacgcacta tgtgcccgag agcgacgccg cggcgcgtgt tactcagatc ctctccagcc6181ttaccatcac tcagttgctg aagaggcttc atcagtggat taatgaggac tgctccacgc6241cttgttccgg ctcgtggcta aaggatgttt gggactggat atgcacggtg ttgagtgact6301tcaagacttg gctccagtcc aagctcctgc cgcggttacc gggactccct ttcctgtcat6361gccaacgcgg gtacaaggga gtctggcggg gggatggcat catgcaaacc acctgcccat6421gtggagcaca gatcaccgga catgtcaaaa atggctccat gaggattgtt gggccaaaaa6481cctgcagcaa cacgtggcat ggaacattcc ccatcaacgc atacaccacg ggcccctgca6541cgccctcccc agcgccgaac tattccaggg cgctgtggcg ggtggctgct gaggagtacg6601tggaggttac gcgggtgggg gatttccact acgtgacggg catgaccact gacaacgtga6661aatgcccatg ccaggttcca gcccctgaat ttttcacgga ggtggatgga gtacggttgc6721acaggtatgc tccagtgtgc aaacctctcc tacgagagga ggtcgtattc caggtcgggc6781tcaaccagta cctggtcggg tcacagctcc catgtgagcc cgaaccggat gtggcagtgc6841tcacttccat gctcaccgac ccctctcata ttacagcaga gacggccaag cgtaggctgg6901ccagggggtc 
tcccccctcc ttggccagct cttcagctag ccagttgtct gcgccttctt6961tgaaggcgac atgtactacc catcatgact ccccggacgc tgacctcatc gaggccaacc7021tcctgtggcg gcaggagatg ggcgggaaca tcacccgtgt ggagtcagaa aataaggtgg7081taatcctgga ctctttcgat ccgattcggg cggtggagga tgagagggaa atatccgtcc7141cggcggagat cctgcgaaaa cccaggaagt tccccccagc gttgcccata tgggcacgcc7201cggattacaa ccctccactg ctagagtcct ggaaggaccc ggactacgtc cccccggtgg7261tacacgggtg ccctttgcca tctaccaagg cccccccaat accacctcca cggaggaaga7321ggacggttgt cctgacagag tccaccgtgt cttctgcctt ggcggagctc gctactaaga7381cctttggcag ctccgggtcg tcggccgttg acagcggcac ggcgactggc cctcccgatc7441aggcctccga cgacggcgac aaaggatccg acgttgagtc gtactcctcc atgccccccc7501tcgagggaga gccaggggac cccgacctca gcgacgggtc ttggtctacc gtgagcgggg7561aagctggtga ggacgtcgtc tgctgctcaa tgtcctatac atggacaggt gccttgatca7621cgccatgcgc tgcggaggag agcaagttgc ccatcaatcc gttgagcaac tctttgctgc7681gtcaccacag tatggtctac tccacaacat ctcgcagcgc aagtctgcgg cagaagaagg7741tcacctttga cagactgcaa gtcctggacg accactaccg ggacgtgctc aaggagatga7801aggcgaaggc gtccacagtt aaggctaggc ttctatctat agaggaggcc tgcaaactga7861cgcccccaca ttcggccaaa tccaaatttg gctacggggc gaaggacgtc cggagcctat7921ccagcagggc cgtcaaccac atccgctccg tgtgggagga cttgctggaa gacactgaaa7981caccaattga taccaccatc atggcaaaaa atgaggtttt ctgcgtccaa ccagagaaag8041gaggccgcaa gccagctcgc cttatcgtat tcccagacct gggggtacgt gtatgcgaga8101agatggccct ttacgacgtg gtctccaccc ttcctcaggc cgtgatgggc ccctcatacg8161gattccagta ctctcctggg cagcgggtcg agttcctggt gaatacctgg aaatcaaaga8221aatgccctat gggcttctca tatgacaccc gctgctttga ctcaacggtc actgagaatg8281acatccgtac tgaggaatca atttaccaat gttgtgactt ggcccccgaa gccaggcagg8341ccataaggtc gctcacagag cggctttatg tcgggggtcc cctgactaat tcgaaggggc8401agaactgcgg ttatcgccgg tgccgcgcaa gtggcgtgct gacgactagc tgcggcaaca8461ccctcacatg ttacttgaag gccactgcgg cctgtcgagc tgcaaagctc caggactgca8521cgatgctcgt gaacggagac gaccttgtcg ttatctgtga gagtgcggga acccaggagg8581atgcggcggc cctacgagcc ttcacggagg ctatgactag gtattccgcc 
ccccccgggg8641acccgcccca accagaatac gacttggagc tgataacgtc atgctcctcc aatgtgtcgg8701tcgcgcacga tgcatccggc aaaagggtgt actacctcac ccgtgacccc accacccccc8761tcgcacgggc tgcgtgggag acagttagac acactccagt caactcctgg ctaggcaata8821tcatcatgta tgcgcccacc ctatgggcga ggatgattct gatgactcat ttcttctcta8881tccttctagc tcaggagcaa cttgaaaaag ccctggattg tcagatctac ggggcctgtt8941actccattga gccacttgac ctacctcaga tcattgaacg actccatggt cttagcgcat9001tttcactcca cagttactct ccaggtgaga tcaatagggt ggcttcatgc ctcaggaaac9061ttggggtacc gcctttgcga gtctggagac atcgggccag aagtgtccgc gctaagctac9121tgtcccaggg ggggagggct gccacttgcg gcaagtacct cttcaactgg gcagtaaaga9181ccaagcttaa actcactcca atcccggctg cgtcccagct agacttgtcc ggctggttcg9241ttgctggtta caacggggga gacatatatc acagcctgtc tcgtgcccga ccccgttggt9301tcatgttgtg cctactccta ctttctgtag gggtaggcat ctacctgctc cccaaccggt9361gaacggggag ctaaccactc caggccaata ggccattccc tttttttttt ttc


[0204] General Target Region:


[0205] 5′ Untranslated Region—nts 1-328 - Internal Ribosome Entry Site (IRES):
5′ UUGGGGGCGACACUCCACCAUAGAUCACUCCCCUGUGAGGAACUACUGUCUUCACGCAGAAAGCGUCUAGCCAUGGCGUUAGUAUGAGUGUUGUGCAGCCUCCAGGACCCCCCCUCCCGGGAGAGCCAUAGUGGUCUGCGGAACCGGUGAGUACACCGGAAUUGCCAGGACGACCGGGUCCUUUCUUGGAUCAACCCGCUCAAUGCCUGGAGAUUUGGGCGUGCCCCCGCGAGACUGCUAGCCGAGUAGUGUUGGGUCGCGAAAGGCCUUGUGGUACUGCCUGAUAGGGUGCUUGCGAGUGCCCCGGGAGGUCUCGUAGACCGUGCAU 3′ (SEQ ID NO: 18)


[0206] Initial Specific Target Motifs:


[0207] (1) Subdomain IIIc within HCV IRES—nts 213-226
5′ AUUUGGGCGUGCCC 3′ (SEQ ID NO: 19)


[0208] (2) Subdomain IIId within HCV IRES—nts 241-267
5′ GCCGAGUAGUGUUGGGUCGCGAAAGGC 3′ (SEQ ID NO: 20)
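The nucleotide ranges given for the specific target motifs read as 1-based, inclusive positions in the genomic listing, so each motif can be recovered by slicing the listing and writing U in place of T. The following is a minimal sketch in Python under that assumption. The names `extract_motif` and `HCV_181_270` are hypothetical and introduced only for illustration; the fragment is nts 181-270 copied from the NC001433 listing above.

```python
def extract_motif(dna_fragment: str, fragment_start: int, first: int, last: int) -> str:
    """Return the RNA (5'->3') for 1-based inclusive genome positions first..last.

    dna_fragment   -- DNA text from the listing; spaces between blocks are allowed
    fragment_start -- genome position of the first base in dna_fragment
    """
    seq = "".join(dna_fragment.split()).upper()
    lo = first - fragment_start          # 0-based index of the first base
    hi = last - fragment_start + 1       # exclusive end index
    if first > last or lo < 0 or hi > len(seq):
        raise ValueError("requested range lies outside the supplied fragment")
    return seq[lo:hi].replace("T", "U")  # transcribe DNA letters to RNA letters


# Illustrative fragment: nts 181-270 of the HCV listing (assumption: copied verbatim).
HCV_181_270 = (
    "ctttcttgga tcaacccgct caatgcctgg agatttgggc gtgcccccgc gagactgcta "
    "gccgagtagt gttgggtcgc gaaaggcctt"
)

subdomain_iiic = extract_motif(HCV_181_270, 181, 213, 226)
subdomain_iiid = extract_motif(HCV_181_270, 181, 241, 267)
print(subdomain_iiic)  # AUUUGGGCGUGCCC
print(subdomain_iiid)  # GCCGAGUAGUGUUGGGUCGCGAAAGGC
```

The two printed strings agree with SEQ ID NO: 19 and SEQ ID NO: 20, which is a quick consistency check between the quoted coordinates and the listing.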



6.8. Ribonuclease P RNA (“RNaseP”)

[0209] GenBank Accession #s


[0210] X15624 Homo sapiens RNaseP H1 RNA:
(SEQ ID NO: 21)
  1 atgggcggag ggaagctcat cagtggggcc acgagctgag tgcgtcctgt cactccactc
 61 ccatgtccct tgggaaggtc tgagactagg gccagaggcg gccctaacag ggctctccct
121 gagcttcagg gaggtgagtt cccagagaac ggggctccgc gcgaggtcag actgggcagg
181 agatgccgtg gaccccgccc ttcggggagg ggcccggcgg atgcctcctt tgccggagct
241 tggaacagac tcacggccag cgaagtgagt tcaatggctg aggtgaggta ccccgcaggg
301 gacctcataa cccaattcag accactctcc tccgcccatt
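Listings of this kind interleave position numbers and a SEQ ID label with the bases, so the contiguous sequence has to be recovered before any motif search. The sketch below, in Python, shows one way to do this, assuming the listing contains only bases, digits, whitespace, and a label of the form "(SEQ ID NO: n)". The function name `listing_to_sequence` is hypothetical, and the sample is the first 80 bases of the X15624 listing.

```python
import re


def listing_to_sequence(listing: str) -> str:
    """Strip the SEQ ID label, position numbers, and whitespace from a listing."""
    no_label = re.sub(r"\(SEQ ID NO:\s*\d+\)", "", listing)
    # Every remaining digit is a position or table number, so it is safe to drop.
    return re.sub(r"[\d\s]+", "", no_label).lower()


# Illustrative sample (assumption: copied from the start of the X15624 listing).
sample = (
    "  1atgggcggag ggaagctcat cagtggggcc acgagctgag tgcgtcctgt cactccactc"
    "(SEQ ID NO: 21) 61ccatgtccct tgggaaggtc"
)
h1_start = listing_to_sequence(sample)
print(len(h1_start))  # 80
```

Checking the recovered length against the last position number in a listing is a simple guard against bases lost in extraction.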


[0211] U64885 Staphylococcus aureus RNaseP (rrnB) RNA:
(SEQ ID NO: 22)
  1 gaggaaagtc cgggctcaca cagtctgaga tgattgtagt gttcgtgctt gatgaaacaa
 61 taaatcaagg cattaatttg acggcaatga aatatcctaa gtctttcgat atggatagag
121 taatttgaaa gtgccacagt gacgtagctt ttatagaaat ataaaaggtg gaacgcggta
181 aacccctcga gtgagcaatc caaatttggt aggagcactt gtttaacgga attcaacgta
241 taaacgagac acacttcgcg aaatgaagtg gtgtagacag atggttatca cctgagtacc
301 agtgtgacta gtgcacgtga tgagtacgat ggaacagaac gcggcttat


[0212] M17569 Escherichia coli RNA component (M1 RNA) of ribonuclease P (rnpB) gene:
(SEQ ID NO: 23)
  1 gaagctgacc agacagtcgc cgcttcgtcg tcgtcctctt cgggggagac gggcggaggg
 61 gaggaaagtc cgggctccat agggcagggt gccaggtaac gcctgggggg gaaacccacg
121 accagtgcaa cagagagcaa accgccgatg gcccgcgcaa gcgggatcag gtaagggtga
181 aagggtgcgg taagagcgca ccgcgcggct ggtaacagtc cgtggcacgg taaactccac
241 ccggagcaag gccaaatagg ggttcataag gtacggcccg tactgaaccc gggtaggctg
301 cttgagccag tgagcgattg ctggcctaga tgaatgactg tccacgacag aacccggctt
361 atcggtcagt ttcacct


[0213] Z70692 Mycobacterium tuberculosis RNaseP (rnpB) RNA:
22    1ccaccggtta cgatcttgcc gaccatggcc ccacaatagg gccggggaga cccggcgtca(SEQ ID NO: 24)   61gtggtgggcg gcacggtcag taacgtctgc gcaacacggg gttgactgac gggcaatatc  121ggctccatag cgtcggccgc ggatacagta aaggagcatt ctgtgacgga aaagacgccc  181gacgacgtct tcaaacttgc caaggacgag aaggtcgaat atgtcgacgt ccggttctgt  241gacctgcctg gcatcatgca gcacttcacg attccggctt cggcctttga caagagcgtg  301tttgacgacg gcttggcctt tgacggctcg tcgattcgcg ggttccagtc gatccacgaa  361tccgacatgt tgcttcttcc cgatcccgag acggcgcgca tcgacccgtt ccgcgcggcc  421aagacgctga atatcaactt ctttgtgcac gacccgttca ccctggagcc gtactcccgc  481gacccgcgca acatcgcccg caaggccgag aactacctga tcagcactgg catcgccgac  541accgcatact tcggcgccga ggccgagttc tacattttcg attcggtgag cttcgactcg  601cgcgccaacg gctccttcta cgaggtggac gccatctcgg ggtggtggaa caccggcgcg  661gcgaccgagg ccgacggcag tcccaaccgg ggctacaagg tccgccacaa gggcgggtat  721ttcccagtgg cccccaacga ccaatacgtc gacctgcgcg acaagatgct gaccaacctg  781atcaactccg gcttcatcct ggagaagggc caccacgagg tgggcagcgg cggacaggcc  841gagatcaact accagttcaa ttcgctgctg cacgccgccg acgacatgca gttgtacaag  901tacatcatca agaacaccgc ctggcagaac ggcaaaacgg tcacgttcat gcccaagccg  961ctgttcggcg acaacgggtc cggcatgcac tgtcatcagt cgctgtggaa ggacggggcc 1021ccgctgatgt acgacgagac gggttatgcc ggtctgtcgg acacggcccg tcattacatc 1081ggcggcctgt tacaccacgc gccgtcgctg ctggccttca ccaacccgac ggtgaactcc 1141tacaagcggc tggttcccgg ttacgaggcc ccgatcaacc tggtctatag ccagcgcaac 1201cggtcggcat gcgtgcgcat cccgatcacc ggcagcaacc cgaaggccaa gcggctggag 1261ttccgaagcc ccgactcgtc gggcaacccg tatctggcgt tctcggccat gctgatggca 1321ggcctggacg gtatcaagaa caagatcgag ccgcaggcgc ccgtcgacaa ggatctctac 1381gagctgccgc cggaagaggc cgcgagtatc ccgcagactc cgacccagct gtcagatgtg 1441atcgaccgtc tcgaggccga ccacgaatac ctcaccgaag gaggggtgtt cacaaacgac 1501ctgatcgaga cgtggatcag tttcaagcgc gaaaacgaga tcgagccggt caacatccgg 1561ccgcatccct acgaattcgc gctgtactac gacgtttaag gactcttcgc agtccgggtg 1621tagagggagc ggcgtgtcgt tgccagggcg ggcgtcgagg tttttcgatg ggtgacggtg 1681gccggcaacg 
gcgcgccgac caccgctgcg aagagcccgt ttaagaacgt tcaaggacgt 1741ttcagccggg tgccacaacc cgcttggcaa tcatctcccg accgccgagc gggttgtctt 1801tcacatgcgc cgaaactcaa gccacgtcgt cgcccaggcg tgtcgtcgcg gccggttcag 1861gttaagtgtc ggggattcgt cgtgcgggcg ggcgtccacg ctgaccaacg gggcagtcaa 1921ctcccgaaca ctttgcgcac taccgccttt gcccgccgcg tcacccgtag gtagttgtcc 1981aggaattccc caccgtcgtc gtttcgccag ccggccgcga ccgcgaccgc attgagctgg 2041cgcccgggtc ccggcagctg gtcggtgggc ttgccgcgca ccaacaccag cgcgttgcgg 2101gcccgggtgg cggtcagcca ggcctgacgg agcagctcca cgtcggctgc gggaaccaga 2161tcggcggccg cgatgacatc cagggattgc agcgtcgagg tgttgtgcag ggcgggaacc 2221tggtgcgcat gctgtagctg cagcaactgc acggtccatt cgatgtcggc cagtccgccg 2281cggcccagtt tggtgtgtgt gttggggtcg gcaccgcgcg gcaaccgctc ggactcgata 2341cgggccttga tgcggcgaat ctcgcgcacc gagtcagcgg acacaccgtc gggcggatac 2401cgcgttttgt cgaccatccg taggaatcgc tgacccaact cggcatcgcc ggcaaccgcg 2461tgtgcgcgta gcagggcctg gatctcccat ggctgtgccc actgctcgta gtatgcggcg 2521taggacccca gggtgcggac cagcggaccg ttgcggccct cgggtcgcaa attggcgtcg 2581agctccagcg gcggatcgac gctgggtgtc cccagcagcg cccgaacccg ctcggcgatc 2641gatgtcgacc atttcaccgc ccgtgcatcg tcgacgccgg tggccggctc acagacgaac 2701atcacgtcgg catccgaccc gtagcccaac tcggcaccac ccagccgacc catgccgatg 2761accgcgatgg ccgccggggc gcgatcgtcg tcgggaaggc tggcccggat catgacgtcc 2821agcgcggcct gcagcaccgc cacccacacc gacgtcaacg cccggcacac ctcggtgacc 2881tcgagcaggc cgagcaggtc cgccgaaccg atgcgggcca gctctcgacg acgcagcgtg 2941cgcgcgccgg cgatggcccg ctccgggtcg gggtagcggc tcgccgaggc gatcagcgcc 3001cgagccacgg cggcgggctc ggtctcgagc agcttcgggc ccgcaggccc gtcctcgtac 3061tgctggatga cccgcggcgc gcgcatcaac agatccggca catacgccga ggtacccaag 3121acatgcatga gccgcttggc caccgcgggc ttgtcccgca gcgtggccag gtaccagctt 3181tcggtggcca gcgcctcact gagccgccgg taggccagca gtccgccgtc gggatcgggg 3241gcatacgaca tccagtccag cagcctgggc agcagcaccg actgcacccg tccgcgccgg 3301ccgctttgat tgaccaacgc cgacatgtgt ttcaacgcgg tctgcggtcc ctcgtagccc 3361agcgcggcca gccggcgccc cgcggcctcc aacgtcatgc 
cgtgggcgat ctccaacccg 3421gtcgggccga tcgattccag cagcggttga tagaagagtt tggtgtgtaa cttcgacacc 3481cgcacgttct gcttcttgag ttcctcccgc agcaccccgg ccgcatcgtt tcggccatcg 3541ggccggatgt gggccgcgcg cgccagccag cgcactgcct cctcgtcttc gggatcggga 3601agcaggtggg tgcgcttgag ccgctgcaac tgcagtcggt gctcgagcag cctgaggaac 3661tcatacgacg cggtcatgtt cgccgcgtcc tcacgcccga tgtagccgcc ttcgcccaac 3721gccgccaatg cgtccaccgt ggacgccacc cgtaacgact cgtcgctacg ggcatgaacc 3781agctgcagta gctgtacggc gaactccacg tcgcgcaatc cgccgctgcc gagtttgagc 3841tcgcggccgc ggacatcggc gggcaccagc tgctccaccc gccgccgcat ggcctgcacc 3901tcgaccacaa agtcttcgcg ctcgcaggct cgccacacca tcggcatcaa ggcggtcagg 3961taacgctcgc caagttccgc gtcgccaacg actggccgtg ctttcagcaa cgcctgaaac 4021tcccaggtct tggcccagcg ctggtagtag gcgatgtgcg actcgagcgt acggaccagc 4081tccccgttgc gcccctccgg acgcagggcg gcgtccacct cgaaaaaggc cgccgaggcc 4141acccgcatca tctcgctggc cacgcgcgcg ttgcgcgggt cggagcgctc ggcaacgaat 4201atgacatcga cgtcgctgac gtagttcagt tcgcgcgcac cgcacttgcc catcgcgatg 4261accgccaggc gcggtggcgg gtgctcgccg cacacgctcg cctcggccac gcgcagcgcc 4321gccgccagag cggcgtccgc ggcgtccgcc aggcgtgcgg ccaccacggt gaatggcagc 4381accggttcgt cctcgaccgt cgcggccagg tcgagagcgg ccagcattag cacgtagtcg 4441cggtactggg ttcgcaatcg gtgcacgagc gagcccggca taccctccga ttcctcgacg 4501cactcgacga acgaccgctg cagctggtca tgggacggca gtgtgacctt gccccgcagc 4561aatttccagg actgcggatg ggcgaccagg tgatcgccca acgccagcga cgagcccagc 4621accgagaaca gccgcccgcg cagactgcgt tcgcgcagca gagccgcgtt gagctcgtcc 4681catccggtgt ctggattctc cgacagccgg atcaaggcgc gcagcgcggc atcggcgtcc 4741ggagcgcgtg acagcgacca cagcaggtcg acgtgcgcct gatcctcgtg ccgatcccac 4801cccagctgag ccagacgctc accagcaggg gggtcaacta atccgagccg gccaacgctg 4861ggcaacttcg gccgctgcgt ggcgagtttg gtcacgacca cgacggtagc gcaaagcgcg 4921tcggcgtcgg atcaaccggt agatctgggc tacagcgaca ggtaggtgcg cagctcgtat 4981ggcgtgacgt ggctgcggta gttcgcccac tccgtgcgct tgttgcgcaa gaaaaagtca 5041aaaacgtgct cccccaaggc ctccgcgacg agttcggagg cctccatggc gcgcagcgca 5101ctatccaaac 
tggacggcaa ttctcggtac cccatcgctc ggcgttcctc gggtgtgagg 5161tcccatacgt tgtcctcggc ctgcgggccc agcacgtaac ccttctctac accccgcaat 5221cccgcggcca gcagcacggc gaatgtcaga tagggattgc acgccgaatc agggctgcgt 5281acttcgaccc gccgcgacga ggtcttgtgc ggcgtgtaca tcggcacccg cactagggcg 5341gatcggttgg cggcccccca cgacgcggcc gtgggcgctt cgccgccctg caccagccgc 5401ttgtaagagt tgacccactg atttgtgacc gcgctgatct cgcaagcgtg ctccaggatc 5461ccggcgatga acgatttacc cacttccgac agctgcagcg gatcatcagc gctgtggaac 5521gcgttgacat caccctcgaa caggctcatg tgggtgtgca tcgccgagcc cgggtgctgg 5581ccgaatggct tgggcatgaa cgacgcccgg gcgccctctt ccagcgcgac ttctttgatg 5641acgtagcgga aggtcatcac gttgtcagcc atcgacagag cgtcggcaaa ccgcaggtcg 5701atctcctgct ggccgggtgc gccttcgtga tggctgaact ccaccgagat gcccatgaat 5761tccagggcat cgatcgcgtg gcggcgaaag ttcaaggcgg agtcgtgcac cgcttggtcg 5821aaatagccgg cgttgtcgac cgggacgggc accgacccgt cctcgggtcc gggcttgagc 5881aggaagaact cgatttcggg atgcacgtag caggagaagc cgagttcgcc ggccttcgtc 5941agctgccgcc gcaacacgtg ccgcgggtcc gcccacgacg gcgagccgtc cggcatggtg 6001atgtcgcaaa acatccgcgc tgagtggtgg tggccggaac tggtggccca gggcagcacc 6061tggaaggtcg acgggtccgg gtgcgccacc gtatcggatt ccgagacccg cgcaaagccc 6121tcgatcgagg atccgtcgaa gccgatgcct tcctcgaagg cgccctcgag ttcggctggg 6181gcgatggcga ccgacttgag gaaaccgagc acgtctgtga accacagccg gacgaagcgg 6241atgtcgcgtt cttccagggt acgaagaacg aattccttct gtcggtccat acctcgaaca 6301gtatgcactg tctgttaaaa ccgtgttacc gatgcccggc cagaagcgtt gcggggcggc 6361ccgcaagggg agtgcgcggt gagttcaggg cgcgcaccgc agactcgtcg gcggcaaggt 6421cccgtcgaga aaatagtgca tcaccgcaga gtccacacac tggttgccat cgaacaccgc 6481agtgtgttgg gtgccgtcga aggtgatcag cggtgcgccc agctggcggg ccaggtctac 6541cccggactga tacggagtgg ccgggtcgtg ggtggtggac accacgacga ccttgccagc 6601cccggccggc gccgcggggt gcggcgtcga cgttgccggc accggccaca gcgcgcacag 6661atcgcggggg gcggatccgg tgaactgccc gtagctaagg aacggggcga cctgacggat 6721ccgttggtcg gcggccaccc aggccgctgg atcggccggt gtgggcgcat cgacgcaccg 6781gaccgcgttg aacgcgtcct ggtcgttgct gtagtgcccg 
tctgcatccc ggccgtcata 6841gtcgtcggca agcaccagca agtcgccggc gtcgctgccg cgctgcagcc ccagcagacc 6901actggtcagg tacttccagc gctgagggct gtacagcgcg ttgatggtgc ccgtcgtcgc 6961gtcggcgtag ctcaggccac gtggatccga cgtcttaccc ggcttctgca ccagcgggtc 7021aaccagggcg tggtagcggt tgacccactg ggccgagtcg gtgcccagag ggcaggccgg 7081cgagcgggcg cagtcggcgg cgtagtcatt gaaagcggtc tgaaatcccg ccatttggct 7141gatgctttcc tcgattgggc taacggctgg atcgatagcg ccgtcgagga ccatcgcccg 7201cacatgagta ccgaaccgtt ccaggtaagc ggtgcccaac tcggtgccgt agctgtatcc 7261gaggtagttg atctgatcgt cacctaacgc ttggcgaacc atgtccatgt cccgtgcgac 7321ggacgcggta ccgatattgg ccaagaagct gaagcccatc cggtcaacac agtcctgggc 7381caactgccgg tagacctgtt cgacgtgggt gacaccggcc ggactgtagt cggccatcgg 7441atcgcgccgg tacgcgtcga actcggcgtc ggtgcgacac cgcaacgcag gggtcgagtg 7501gccgacccct ctcgggtcga agcccaccag gtcgaagtgg cggagaatgt cggtgtcggc 7561gatcgcgggt gccatagcgg cgaccatgtc gaccgccgac gccccgggtc ccccaggatt 7621gaccagcagt gctccgaatc gctgtcccgt cgcggggacg cggatcaccg ccaacttcgc 7681ttgtgtccca ccgggttggt cgtagtcgac ggggacggac accgtcgcgc agcgtgcagt 7741gcgaatttcg ctggtgtcgg cgatgaactc gcggcagctg ttccaactct gttgcggcgc 7801cacgaccggc gcacccgggg tttggccggc gccgggttct tcagtcgcgc cggccaacgg 7861gggcgctgct aggggcagtc cgccgagcag caacccgaag gacagcagcg ccgagctcaa 7921cggtctgcgg cgccacatgg ccgccatcgt ctcaccggcg aatacctgtg acggcgcgaa 7981atgatcacac cttcgtttct tcgccccgct agcacttggc gccgctgggc ggcgtggtgc 8041cgccgattaa atacgccgtc acgtactcgt caatgcagct gtcgccctgg aataccaccg 8101tgtgctgggt tccgtcgaag gtcagcaacg aaccgcgaag ctggttcgcc aggtcgaccc 8161cggccttgta cggcgtcgcc gggtcatggg tggtggatac caccaccgtc ggcactaggc 8221cgggcgccga gacggcatgg ggctgacttg tgggtggcac cggccagaac gcgcaggtgc 8281ccagcggcgc atcaccggtg aacttcccgt agctcatgaa cggtgcgatc tcccgggcgc 8341ggcggtcttc gtcgatgacc ttgtcgcgat cggtaaccgg gggctgatcg acgcaattga 8401tcgccacccg cgcgtcaccg gaattgttgt agcggccgtg cgagtcccga cgcatgtaca 8461tgtcggccag agccagcagg gtgtctccgc gattgtcgac cagctccgac agcccgtcgg 8521tcaagtgttg 
ccacagattc ggtgagtaca gcgccataat ggtgcccacg atggcgtcgc 8581tataactcag cccgcgcgga tccttcgtgc gcgccggcct gctgatcctc gggttgtccg 8641ggtcgaccaa cggatcgacc aggctgtggt agacctcgac ggctttggcc gggtcggcgc 8701ccagcgggca gcccgcgttc ttggcgcagt cggcggcata gttgttgaac gcgtcctgga 8761agcccttggc ctggcgcagc tccgcctcga tgggatcggc attggggtcg acggcaccgt 8821cgagaatcat tgcccgcacc cgctgcggaa attcctcggc atacgcggag ccgatccggg 8881tgccgtacga gtagcccagg taggtcagct tgtcgtcgcc caacgccgcg cgaatggcat 8941ccaggtcctt ggcgacgttg accgtcccga catgggccag aaagttcttg cccatcttgt 9001ccacacagcg accgacgaat tgcttggtct cgttctcgat gtgcgccaca ccctcccggc 9061tgtagtcaac ctgcggctcg gcccgcagcc ggtcgttgtc ggcatcggag ttgcaccaga 9121tcgccggccg ggacgacgcc accccgcggg ggtcgaaccc aaccaggtcg aacctttcgt 9181gcacccgctt cggcaatgtc tggaagacgc ccaaggcggc ctcgataccg gattcgccgg 9241gtccaccggg atttatgacc agcgaaccga tcttgtctcc cgtcgccgga aagcgaatca 9301gcgccagcgc cgccacgtca ccatcggggc ggtcgtagtc gaccggtaca gcgagcttgc 9361cgcataacgc gccgccgggg atctttactt gcgggtttga cgaccggcac ggtgtccact 9421ccaccggctg gcccagcttc ggctccgcca tacgagcgcg tcccccgacc acgcggatgc 9481agcccacaag aaccaacgcc acggcggcga gcgcggccca gatcaacagc atgcgcgcga 9541tcttgtcgcg gcgagacagc ctcatgccca caatgctgcc agagcagacc cgagatcctg 9601gccagcggcc accgtcggcc gactaaccgg ccgctgccag cagtcctgcc atcgccgatg 9661gcgaactcgt cggccatccc ccatacgtcc ggtaacagat ccgggcaaga caccgacccg 9721tcgaccggat ccggcacggg cgcgtcggcc tcggcggtgc acaactgcga catcaggttg 9781gcgctggcac cccgtccacg ccggcatggt gcaccttggc catcgcccga gggcgatccc 9841cgatgccgtc caccccttcg acgaacccat ctcccacggc ggtcgccggc agcgacgcga 9901tgtggccgca gatctccgag agttcggccc gcccgcccgg cgacggcaac ccgatgccgt 9961gcaagtgacg atcgatgtga ggttcaaggt tcagcgcact gctggcaagc tttttccgaa10021accgcggcct cgccttgatc tggagtcaga acgcgtcacg cagccggtca aaggcgtaac10031ccatgctcga gcaaacatgc atgggctgag tggacgtttc cagacacagc aactggcgtc10141caggccactg agccgctgca tgcgcgatgg tatgccgatg ggggccccgg gcgcgtctga10201ggggaagaag tggcagactg tcagggtccg acgaacccgg 
ggaccctaac gggccacgag10261gatcgacccg accaccatta gggacagtga tgtctgagca gactatctat ggggccaata10321cccccggagg ctccgggccg cggaccaaga tccgcaccca ccacctacag agatggaagg10381ccgacggcca caagtgggcc atgctgacgg cctacgacta ttcgacggcc cggatcttcg10441acgaggccgg catcccggtg ctgctggtcg gtgattcggc ggccaacgtc gtgtacggct10501acgacaccac cgtgccgatc tccatcgacg agctgatccc gctggtccgt ggcgtggtgc10561ggggtgcccc gcacgcactg gtcgtcgccg acctgccgtt cggcagctac gaggcggggc10621ccaccgccgc gttggccgcc gccacccggt tcctcaagga cggcggcgca catgcggtca10681agctcgaggg cggtgagcgg gtggccgagc aaatcgcctg tctgaccgcg gcgggcatcc10741cggtgatggc acacatcggc ttcaccccgc aaagcgtcaa caccttgggc ggcttccggg10801tgcagggccg cggcgacgcc gccgaacaaa ccatcgccga cgcgatcgcc gtcgccgaag10861ccggagcgtt tgccgtcgtg atggagatgg tgcccgccga gttggccacc cagatcaccg10921gcaagcttac cattccgacg gtcgggatcg gcgctgggcc caactgcgac ggccaggtcc10981tggtatggca ggacatggcc gggttcagcg gcgccaagac cgcccgcttc gtcaaacggt11041atgccgatgt cggtggtgaa ctacgccgtg ctgcaatgca atacgcccaa gaggtggccg11101gcggggtatt ccccgctgac gaacacagtt tctgaccaag ccgaatcagc ccgatgcgcg11161ggcattgcgg tggcgccctg gatgccgtcg acgccggatt gccggcgcgg acgcgccagc11221gggacccatc ggcgtcgcgt tcgccggttg agcccggggt gagcccagac attcgatgtg11281cccaacacca tccgccacag cccaattgat gtggcactct atgcatgcct atccccgacc11341aaccaccacc gcggcgacgc atcatgaccg gaggcgaaga tgccagtaga ggcgcccaga11401ccagcgcgcc atctggaggt cgagcgcaag ttcgacgtga tcgagtcgac ggtgtcgccg11461tcgttcgagg gcatcgccgc ggtggttcgc gtcgagcagt cgccgaccca gcagctcgac11521gcggtgtact tcgacacacc gtcgcacgac ctggcgcgca accagatcac cttgcggcgc11581cgcaccggcg gcgccgacgc cggctggcat ctgaagctgc cggccggacc cgacaagcgc11641accgagatgc gagcaccgct gtccgcatca ggcgacgctg tgccggccga gttgttggat11701gtggtgctgg cgatcgtccg cgaccagccg gttcagccgg tcgcgcggat cagcactcac11761cgcgaaagcc agatcctgta cggcgccggg ggcgacgcgc tggcggaatt ctgcaacgac11821gacgtcaccg catggtcggc cggggcattc cacgccgctg gtgcagcgga caacggccct11881gccgaacagc agtggcgcga atgggaactg gaactggtca ccacggatgg gaccgccgat11941accaagctac 
tggaccggct agccaaccgg ctgctcgatg ccggtgccgc acctgccggc12001cacggctcca aactggcgcg ggtgctcggt gcgacctctc ccggtgagct gcccaacggc12061ccgcagccgc cggcggatcc agtacaccgc gcggtgtccg agcaagtcga gcagctgctg12121ctgtgggatc gggccgtgcg ggccgacgcc tatgacgccg tgcaccagat gcgagtgacg12181acccgcaaga tccgcagctt gctgacggat tcccaggagt cgtttggcct gaaggaaagt12241gcgtgggtca tcgatgaact gcgtgagctg gccgatgtcc tgggcgtagc ccgggacgcc12301gaggtactcg gtgaccgcta ccagcgcgaa ctggacgcgc tggcgccgga gctggtacgc12361ggccgggtgc gcgagcgcct ggtagacggg gcgcggcggc gataccagac cgggctgcgg12421cgatcactga tcgcattgcg gtcgcagcgg tacttccgtc tgctcgacgc tctagacgcg12481cttgtgtccg aacgcgccca tgccacttct ggggaggaat cggcaccggt aaccatcgat12541gcggcctacc ggcgagtccg caaagccgca aaagccgcaa agaccgccgg cgaccaggcg12601ggcgaccacc accgcgacga ggcattgcac ctgatccgca agcgcgcgaa gcgattacgc12661tacaccgcgg cggctactgg ggcggacaat gtgtcacaag aagccaaggt catccagacg12721ttgctaggcg atcatcaaga cagcgtggtc agccgggaac atctgatcca gcaggccata12781gccgcgaaca ccgccggcga ggacaccttc acctacggtc tgctctacca acaggaagcc12841gacttggccg agcgctgccg ggagcagctt gaagccgcgc tgcgcaaact cgacaaggcg12901gtccgcaaag cacgggattg agcccgccag gggcggacga gttggcctgt aagccggatt12961ctgttccgcg ccgccacagc caagctaacg gcggcacggc ggcgaccatc catctggaca13021caccgttacc gggtgcctcg agcggcctac ccgcaggctc gggcgagcaa ccctcaagcg13081cctgcgcggc cgcactttcg gtgcggcctt cttggccttg cttcgggtgg ggtttgccta13141gccaccccgg tcacccggaa tgctggtgcg ctcttaccgc accgtttcac ccttgccacc13201acgaggatgg cggtctgttt tctgtggcac tttcccgcga gtcacctcgg attgccgtta13261gcaatcaccc tgctctgtga agtccggact ttcctcgact cgacgctgaa cctcgtgaat13321ccacacaagc cctacgcgag ccgcggccgc ccagccaact catccgcgac gaccacgcta13381ccccgctggg cggtgtcgcg gccagtgtga ccgctggacg acacggctag tcggacagcc13441gatccggcgg gcagtcctta tcgtggactg gtgacacggt gggacaaacg cgtcgactcc13501ggcgactggg acgccatcgc tgccgaggtc agcgagtacg gtggcgcact gctacctcgg13561ctgatcaccc ccggcgaggc cgcccggctg cgcaagctgt acgccgacga cggcctgttt13621cgctcgacgg tcgatatggc atccaagcgg tacggcgccg 
ggcagtatcg atatttccat13681gccccctatc ccgagtgatc gagcgtctca agcaggcgct gtatcccaaa ctgctgccga13741tagcgcgcaa ctggtgggcc aaactgggcc gggaggcgcc ctggccagac agccttgatg13801actggttggc gagctgtcat gccgccggcc aaacccgatc cacagcgctg atgttgaagt13861acggcaccaa cgactggaac gccctacacc aggatctcta cggcgagttg gtgtttccgc13921tgcaggtggt gatcaacctg agcgatccgg aaaccgacta caccggcggc gagttcctgc13981ttgtcgaaca gcggcctcgc gcccaatccc ggggtaccgc aatgcaactt ccgcagggac14041atggttatgt gttcacgacc cgtgatcggc cggtgcggac tagccgtggc tggtcggcat14101ctccagtgcg ccatgggctt tcgactattc gttccggcga acgctatgcc atggggctga14161tctttcacga cgcagcctga ttgcacgcca tctatagata gcctgtctga ttcaccaatc14221gcaccgacga tgccccatcg gcgtagaact cggcgatgct cagcgatgcc agatcaagat14281gcaaccgata taggacgccc gacccggcat ccaacgccag ccgcaacaac attttgatcg14341gcgtgacatg tgacaccacc agcaccgtcg cgccttcgta gccaacgatg atccgatcac14401gtccccgccg aacccgccgc agcacgtcgt cgaagctttc cccacccggg ggcgtgatgc14461tggtgtcctg cagccagcga cggtgcagct cgggatcgcg ttctgcggcc tccgcgaacg14521tcagcccctc ccaggcgccg aagtcggtct cgaccaggtc gtcatcgacg accacgtcca14581gggccagggc tctggcggcg gtcaccgcgg tgtcgtaagc ccgctgtagc ggcgaggaga14641ccaccgcagc gatcccgccg cgccgcgcca gatacccggc cgccgcacca acctggcgcc14701accccacctc gttcaacccc gggttgccgc gccccgaata gcggcgttgc tccgacagct14761ccgtctgccc gtggcgcaac aaaagtagtc gggtgggtgt accgcgggcg ccggtccagc14821cgggagatgt cggtgactcg gtcgcaacga ttttggcagg atccgcatcc gccgcagccg14881attgcgcggc ggcgtccatc gcgtcattgg ccaaccggtc tgcatacgtg ttccgggcac14941gcggaaccca ctcgtagttg atcctgcgaa actgggacgc caacgcctga gcctggacat15001agagcttcag cagatccggg tgcttgacct tccaccgccc ggacatctgc tccaccacca15061gcttggagtc catcagcacc gcggcctcgg tggcacctag tttcacggcg tcgtccaaac15121cggctatcag gccgcggtat tcggcgacgt tgttcgtcgc ccggccgatc gcctgcttgg15181actcggccag cacggtggag tgatcggcgg tccacaccac cgcgccgtat ccggccggtc15241cgggattgcc ccgcgatccg ccgtcggctt cgatgacaac tttcactcct caaatccttc15301gagccgcaac aagatcgctc cgcattccgg gcagcgcacc acttcatcct cggcggccgc15361cgagatctgg 
gccagctcgc cgcggccgat ctcgatccgg caggcaccac atcgatgacc15421ttgcaaccgc ccggcccctg gcccgcctcc ggcccgctgt ctttcgtaga gccccgcaag15481ctcgggatca agtgtcgccg tcagcatgtc gcgttgcgat gaatgttggt gccgggcttg15541gtcgatttcg gcaagtgcct cgtccaaagc ctgctgggcg gcggccaggt cggcccgcaa15601cgcttggagc gcccgcgact cggcggtctg ttgagcctgc agctcctcgc ggcgttccag15661cacctccagc agggcatctt ccaaactggc ttgacggcgt tgcaagctgt cgagctcgtg15721ctgcagatca gccaattgct tggcgtccgt tgcacccgaa gtgagcaacg accggtcccg15781gtcgccacgc ttacgcaccg catcgatctc cgactcaaaa cgcgacacct ggccgtccaa15841gtcctccgcc gcgattcgca gggccgccat cctgtcgttg gcggcgttgt gctcggcctg15901cacctgctgg taagccgccc gctgcggcag atgggtagcc cgatgcgcga tccgggtcag15961ctcagcatcc agcttcgcca attccagtag cgaccgttgc tgtgccactc cggctttcat16021gcctgatctc tcccagtttc gtgatcgagg ttccacgggt cggtgcagat ggtgcacaca16081cgcaccggca gcgacgcgcc gaaatgagac cgcaacactt cggcggcctg gccgcaccac16141gggaattcgc ttgcccaatg cgcgacgtcg atcagggcca cttgcgaagc tcggcaatgc16201tcgtcggctg gatgatgtcg cagatcggcc gtaacgtacg cttgcacgtc cgcggcggcc16261acggtggcaa gcaacgagtc cccggcgccg ccgcagaccg cgacccgcga caccagcagg16321tcgggatccc cggcggcgcg cacaccggtc gcagtcggcg gcaacgcggc ctccagacgg16381gcaacaaagg tgcgcagcgg ttcgggtttt ggcagtctgc caatccggcc taacccgctg16441ccgaccggcg gtggtaccag cgcgaagatg tcgaatgccg gctcctcgta agggtgcgcg16501gcgcgcatcg ccgccaacac ctcggcgcgc gctcgtgcgg gtgcgacgac ctcgacccgg16561tcctcggcca cccgttcgac ggtaccgacg ctgcctatgg cgggcgacgc cccgtcgtgc16621gccaggaact gcccggtacc cgcgacactc cagctgcagt gcgagtagtc gccgatatgg16681ccggcaccgg cctcaaagac cgctgcccgc accgcctctg agttctcgcg cggcacatag16741atgacccact tgtcgagatc ggccgctccg ggcaccgggt cgagaacggc gtcgacggtc16801agaccaacag cgtgtgccag cgcgtcggac acacccggcg acgccgagtc ggcgttggtg16861tgcgcggtaa acaacgagcg accggtccgg atcaggcggt gcaccagcac accctttggc16921gtgttggccg cgaccgtatc gaccccacgc agtaacaacg ggtggtgcac caatagcagt16981ccggcctggg gaacctggtc caccaccgcc ggcgtcgcgt ccaccgcaac ggtcaccgaa17041tccaccacgt cgtcggggtc gccgcacacc agacccaccg 
aatcccacga ctgggcaagc17101cgcggcgggt aggcctggtc cagcacgtcg atgacatcgg ccagccgcac actcatcggc17161gtcctccacg ctttgcccac tcggcgatcg ccgccaccag cacgggccac tccgggcgca17221ccgccgcccg caggtaccgc gcgtccaggc cgacgaaggt gtcaccgcgg cgcaccgcaa17281ttcctttgct ctgcaaatag tttcgtaatc cgtcagcatc ggcgatgttg aacagtacga17341aaggggccgc accatcgacc acctcggcac ccaccgatct cagtccggcc accatctccg17401cgcgcagcgc cgtcaaccgc accgcatcgg ctgcggcagc ggcgaccgcc cggggggcgc17461agcaagcagc gatggccgtc agttgcaatg ttcccaacgg ccagtgcgct cgctgcacgg17521tcaaccgagc cagcacgtct ggcgagccga gcgcgtagcc cacccgcaat ccggccagcg17581accacgtttt cgtcaagcta cggagcacca gcacatcggg cagcgagtca tcggccaacg17641attgcggctc gccgggaacc caatcagcga acgcctcgtc gaccaccagg atgcgtcccg17701gccggcgtaa ctcgagcagc tgctcgcgga ggtgcagcac cgaggtgggg ttggtcggat17761tacccacgac gacaaggtcg gcgtcgtcag gcacgtgcgc ggtgtccagc acgaacggcg17821gctttaggac aacatggtgc gccgtgattc cggcagcgct caaggctatg gccggctcgg17881tgaacgcggg cacgacgatt gctgcccgca ccggacttag gttgtgcagc aatgcgaatc17941cctccgccgc cccgacgagc gggagcactt cgtcacgggt tctgccatga cgttcagcga18001ccgcgtcttg cgcccggtgc acatcgtcgg tgctcggata gcgggccagc tccggcagca18061gcgcggcgag ctgccggacc aaccattccg ggggccggtc atggcggacg ttgacggcga18121agtccagcac gccgggcgcg acatcctgat caccgtggta gcgcgccgcg gcaagcgggc18181tagtgtctag actcgccaca gcgtcaaaca gtagtgggcc ggtgtgcggg ccaagaatcc18241agagcaccgc cgacgcgttg tctacgcggc gacaaccgcg acatcacagg cagctaacag18301ggcgtcggcg gtgatgatcg tcaggccaag cagctgtgcc tgggcgatga gcacacggtc18361gaatggatgt cgatggtgat ccggaagctc tgcggtgcgc agtgtgtgcg tggtcaactg18421acagcggcga cgtgccgcag cggcgcattc gatcgggcac gtaagaagcc gatggctcgg18481gcggcgggag cttgccgagg cggtagttga tcgcgatctc ccaggcactg gcggccgaca18541agagaatgct gttgcggacg tcctgaacaa tcgcccgtgt ttcgttgacg gcatccgcag18601ccaaacgtgg gtgtcgatga ggtagcgctt caccggtgaa agcgttcgag cacgtcgtct18661gacaacggag cgtccaaatc gtcgggcacg cggtacacgc catggtcaat gcctaaccgc18721cgagtctcat gaggatgcag cggcacaagc tttgctaccg gctcgccgcg gcgggcaatc18781tcaacctctg 
cccgccgtag acgagccgca gcagctcgga caggcgtgtc ttcgcctcgt18841gaacgccgac ccgcttcgca ggcgcccaga ctttcgcgtc gaccacctgc tcaccaaact18901tcgcgatcat cgcctgatac cacagcgcca acgggtagcg gtttgtccaa ccgcttcgtc18961aacgacaatg ggatcgtgac cgacacgacc gcgagcggga ccaattgccc gcctcctcca19021cgcgccgccg cacggcgcgc atcgtcgccg ggtgaatcgc cgcagctggt gatcttcgat19081ctggacggca cgctgaccga ctcggcgcgc ggaatcgtat ccagcttccg acacgcgctc19141aaccacatcg gtgccccagt acccgaaggc gacctggcca ctcacatcgt cggcccgccc19201atgcatgaga cgctgcgcgc catggggctc ggcgaatccg ccgaggaggc gatcgtagcc19261taccgggccg actacagcgc ccgcggttgg gcgatgaaca gcttgttcga cgggatcggg19321ccgctgctgg ccgacctgcg caccgccggt gtccggctgg ccgtcgccac ctccaaggca19381gagccgaccg cacggcgaat cctgcgccac ttcggaattg agcagcactt cgaggtcatc19441gcgggcgcga gcaccgatgg ctcgcgaggc agcaaggtcg acgtgctggc ccacgcgctc19501gcgcagctgc ggccgctacc cgagcggttg gtgatggtcg gcgaccgcag ccacgacgtc19561gacggggcgg ccgcgcacgg catcgacacg gtggtggtcg gctggggcta cgggcgcgcc19621gactttatcg acaagacctc caccaccgtc gtgacgcatg ccgccacgat tgacgagctg19681agggaggcgc taggtgtctg atccgctgca cgtcacattc gtttgtacgg gcaacatctg19741ccggtcgcca atggccgaga agatgttcgc ccaacagctt cgccaccgtg gcctgggtga19801cgcggtgcga gtgaccagtg cgggcaccgg gaactggcat gtaggcagtt gcgccgacga19861gcgggcggcc ggggtgttgc gagcccacgg ctaccctacc gaccaccggg ccgcacaagt19921cggcaccgaa cacctggcgg cagacctgtt ggtggccttg gaccgcaacc acgctcggct19981gttgcggcag ctcggcgtcg aagccgcccg ggtacggatg ctgcggtcat tcgacccacg20041ctcgggaacc catgcgctcg atgtcgagga tccctactat ggcgatcact ccgacttcga20101ggaggtcttc gccgtcatcg aatccgccct gcccggcctg cacgactggg tcgacgaacg20161tctcgcgcgg aacggaccga gttgatgccc cgcctagcgt tcctgctgcg gcccggctgg20221ctggcgttgg ccctggtcgt ggtcgcgttc acctacctgt gctttacggt gctcgcgccg20281tggcagctgg gcaagaatgc caaaacgtca cgagagaacc agcagatcag gtattccctc20341gacaccccgc cggttccgct gaaaaccctt ctaccacagc aggattcgtc ggcgccggac20401gcgcagtggc gccgggtgac ggcaaccgga cagtaccttc cggacgtgca ggtgctggcc20461cgactgcgcg tggtggaggg ggaccaggcg tttgaggtgt 
tggccccatt cgtggtcgac20521ggcggaccaa ccgtcctggt cgaccgtgga tacgtgcggc cccaggtggg ctcgcacgta20581ccaccgatcc cccgcctgcc ggtgcagacg gtgaccatca ccgcgcggct gcgtgactcc20641gaaccgagcg tggcgggcaa agacccattc gtcagagacg gcttccagca ggtgtattcg20701atcaataccg gacaggtcgc cgcgctgacc ggagtccagc tggctgggtc ctatctgcag20761ttgatcgaag accaacccgg cgggctcggc gtgctcggcg ttccgcatct agatcccggg20821ccgttcctgt cctatggcat ccaatggatc tcgttcggca ttctggcacc gatcggcttg20881ggctatttcg cctacgccga gatccgggcg cgccgccggg aaaaagcggg gtcgccacca20941ccggacaagc caatgacggt cgagcagaaa ctcgctgacc gctacggccg ccggcggtaa21001accaacatca cggccaatac cgcagccccc gcctggacca cccgcgacag caccacggcg21061cggcgcagat cggccacctt gggcgaccgg ccgtcgccca aggtgggccg gatctgcaac21121tcatggtggt accgggtggg cccacccagc cgcacgtcaa gcgccccagc aaacgccgcc21181tcgacgacac cggcgttggg gctgggatgg cgggcggcgt cgcgccgcca ggcccgtacc21241gcaccgcggg gcgacccacc gaccaccggc gcgcagatca ccaccagcac cgccgtcgcc21301cgtgcgccaa catagttggc ccagtcatcc aatcgtgctg cagcccaacc gaatcggaga21361taacgcggcg agcggtagcc gatcatcgag tccagggtgt tgatggcacg atatcccagc21421accgcaggca cgccgctcga agccgcccac agcagcggca ccacctgggc gtcggcggtg21481ttttcggcca ccgactccag cgcggcacgc gtcaggcccg ggccgcccag ctgggccggg21541tcacgcccgc acagcgacgg cagcagccgt cgcgccgcct cgacatcgtc gcgctccaac21601aggtccgata tctggcggcc ggtgcgcgcc agcgaagttc cgcccagcgc tgcccaggtg21661gccgtcgcgg tggccgccac gggccaggac ctgccgggta gccgctgcag tgccgcgccg21721agcaagccca ccgcgccgac cagcaggccg acgtgtaccg caccggcgac ccggccgtca21781cggtaggtga tctgctccag cttggcggcc gcccgaccga acagggccac cggatgacct21841cgtttggggt cgccgaacac gacgtcgagc aggcagccga tcagcacgcc gacggccctg21901gtctgccagg tcgatgcaaa cactccggca gcgtcgcaca cgtggtctac gctcagctat21961ttatgacctc atacggcagc tatccacgat gaagcggcca gctacccggg ttgccgacct22021gttgaacccg gcggcaatgt tgttgccggc agcgaatgtc atcatgcagc tggcagtgcc22081gggtgtcggg tatggcgtgc tggaaagccc ggtggacagc ggcaacgtct acaagcatcc22141gttcaagcgg gcccggacca ccggcaccta cctggcggtg gcgaccatcg ggacggaatc22201cgaccgagcg 
ctgatccggg gtgccgtgga cgtcgcgcac cggcaggttc ggtcgacggc22261ctcgagccca gtgtcctata acgccttcga cccgaagttg cagctgtggg tggcggcgtg22321tctgtaccgc tacttcgtgg accagcacga gtttctgtac ggcccactcg aagatgccac22381cgccgacgcc gtctaccaag acgccaaacg gttagggacc acgctgcagg tgccggaggg22441gatgtggccg ccggaccggg tcgcgttcga cgagtactgg aagcgctcgc ttgatgggct22501gcagatcgac gcgccggtgc gcgagcatct tcgcggggtg gcctcggtag cgtttctccc22561gtggccgttg cgcgcggtgg ccgggccgtt caacctgttt gcgacgacgg gattcttggc22621accggagttc cgcgcgatga tgcagctgga gtggtcacag gcccagcagc gtcgcttcga22681gtggttactt tccgtgctac ggttagccga ccggctgatt ccgcatcggg cctggatctt22741cgtttaccag ctttacttgt gggacatgcg gtttcgcgcc cgacacggcc gccgaatcgt22801ctgatagagc ccggccgagt gtgagcctga cagcccgaca ccggcggcgt gtgtcgcgtc22861gccaggttca cgctcggcga tctagagccg ccgaaaacct acttctgggt tgcctcccga22921atcaacgtgc tgatctgctc gagcagctca cgcatatcgg cgcgcatcgc atccaccgcg22981gcatacaggt cggccttggt cgccggcagc tggtccgacg tcattggccg caccggcggt23041gctgtctgtc gcgccgcgct gtcgctttga aacccaggtc gctcacccac gaccacgaca23101ctgccatatc cggcgccccg ccgacaacga agcacagcta gccggtgggc gcggacggga23161tcgaaccgcc gaccgctggt gtgtaaaacc agagctctac cgctgagcta cgcgcccatg23221accgccgcag gctacacgcc ttgcggccaa gcacccaaaa ccttaggccg taagcgccgc23281cagagcgtcg gtccacagcc gctgatcgcg aacttcaccc ggctgcttca tctcggcgaa23341ccgaatgatc cctgaccgat cgaccacaaa ggtgccccgg ttagcgatgc cggcctgctc23401gttgaagacg ccgtaggcct gactgaccgc gccgtgtggc cagaagtccg acaacagcgg23461aaacgtgaat ccgctctgcg tcgcccagat cttgtgagtg ggtggcgggc ccaccgaaat23521cgctagcgcg gcgctgtcgt cgttctcaaa ctcgggcagg tgatcacgca actggtccag23581ctcgccctgg cagatgcccg tgaacgccaa cggaaagaac accaacagca cgttctttgc23641accccggtag ccgcgcaggg tgacaagctg ctgattctgg tcgcgcaacg tgaagtcagg23701ggcggtggct ccgacgttca gcatcagcgc ttgccagccc gcgatttcgg ctgtaccaat23761ctgctggcgc tccagttgcc cagattgacc gacgaggtcg gcatcagccc agctgtgggc23821gccgcctcgg caatctcggc gggcaataca tggccgggct ggccggtctt gggcgtcacc23881acccaaatca caccgtcctc ggcgagcggg ccgatcgcat 
ccatcagggt gtccaccaaa23941tcgccgtcgc catcacgcca ccacaacagg acgacatcga tgacctcgtc ggtgtcttca24001tcgagcaact ctcccccgca cgcttcttcg atggccgcgc ggatgtcgtc gtcggtgtct24061tcgtcccagc cccattcctg gataagttgg tctcgttgga tgcccaattt gcgggcgtag24121ttcgaggcgt gatccgccgc gaccaccgtg gaacctcctt cagtctccgc gggccatgtg24181cacaccgtcg cgatgggcat tatcgtcgca cagccagaac cggtccaccc gcccgcctca24241gaaggcggcc acgcacattg tcaatgcctt tgtcttggtg tcgttgagcc gatcaacccg24301ccggttgaat tccgctgtcg acgcgtgcgc accgatggca tttgccaccg cgcgggccgc24361gtcgacatat gcgttgagcg catcccccag ttgcgcggac agcgcggcgc tcagactgcc24421tgagaccgtc gaggcactgt tgttgagcgc gtcgatggcc ggaccttcgg tcggcccggt24481gttgcggccc tgattgaacg cggccacgta ggcgttcacc ttgtcgatgg cgtccttgct24541ggtggccgcc agcgcgtcac acgaggtgcg aatcgccttg gtcgtcagcg attgttggcg24601ctgcgactcc cggatgctcg acgtcgccgc cgaagccgac accgacgcgg acaccgacga24661gcggtaggcc ggtgcgacgt tggtgtcggg catggccgta ccgtcggtga cagtggtaca24721tccgacgatc cccatcagca gcagcgcgat gcagccgagc gccagggcgc ctcgcctggg24781gagctccccc ccgtgcctgc gaggcacggc gcgccatccg atgagcacgg catgtgaggt24841tacctggtcg cagcgcgacc gcgctggccg tggtgtgtcg cgcatccgca gaaccgagcg24901gagtgcggct atccgccgcc gacgccggtg cggcacgata gggggacgac catctaaaca24961gcacgcaagc ggaagcccgc cacctacagg agtagtgcgt tgaccaccga tttcgcccgc25021cacgatctgg cccaaaactc aaacagcgca agcgaacccg accgagttcg ggtgatccgc25081gagggtgtgg cgtcgtattt gcccgacatt gatcccgagg agacctcgga gtggctggag25141tcctttgaca cgctgctgca acgctgcggc ccgtcgcggg cccgctacct gatgttgcgg25201ctgctagagc gggccggcga gcagcgggtg gccatcccgg cattgacgtc taccgactat25261gtcaacacca tcccgaccga gctggagccg tggttccccg gcgacgaaga cgtcgaacgt25321cgttatcgag cgtggatcag atggaatgcg gccatcatgg tgcaccgtgc gcaacgaccg25381ggtgtgggcg tgggtggcca tatctcgacc tacgcgtcgt ccgcggcgct ctatgaggtc25441ggtttcaacc acttcttccg cggcaagtcg cacccgggcg gcggcgatca ggtgttcatc25501cagggccacg cttccccggg aatctacgcg cgcgccttcc tcgaagggcg gttgaccgcc25561gagcaactcg acggattccg ccaggaacac agccatgtcg gcggcgggtt gccgtcctat25621ccgcacccgc 
ggctcatgcc cgacttctgg gaattcccca ccgtgtcgat gggtttgggc25681ccgctcaacg ccatctacca ggcacggttc aaccactatc tgcatgaccg cggtatcaaa25741gacacctccg atcaacacgt gtggtgtttt ttgggcgacg gcgagatgga cgaacccgag25801agccgtgggc tggcccacgt cggcgcgctg gaaggcttgg acaacttgac cttcgtgatc25861aactgcaatc tgcagcgact cgacggcccg gtgcgcggca acggcaagat catccaggag25921ctggagtcgt tcttccgcgg tgccggctgg aacgtcatca aggtggtgtg gggccgcgaa25981tgggatgccc tgctgcacgc cgaccgcgac ggtgcgctgg tgaatttaat gaatacaaca26041cccgatggcg attaccagac ctataaggcc aacgacggcg gctacgtgcg tgaccacttc26101ttcggccgcg acccacgcac caaggcgctg gtggagaaca tgagcgacca ggatatctgg26161aacctcaaac ggggcggcca cgattaccgc aaggtttacg ccgcctaccg cgccgccgtc26221gaccacaagg gacagccgac ggtgatcctg gccaagacca tcaaaggcta cgcgctgggc26281aagcatttcg aaggacgcaa tgccacccac cagatgaaaa aactgaccct ggaagacctt26341aaggagtttc gtgacacgca gcggattccg gtcagcgacg cccagcttga agagaatccg26401tacctgccgc cctactacca ccccggcctc aacgccccgg agattcgtta catgctcgac26461cggcgccggg ccctcggggg ctttgttccc gagcgcagga ccaagtccaa agcgctgacc26521ctgccgggtc gcgacatcta cgcgccgctg aaaaagggct ctgggcacca ggaggtggcc26581accaccatgg cgacggtgcg cacgttcaaa gaagtgttgc gcgacaagca gatcgggccg26641aggatagtcc cgatcattcc cgacgaggcc cgcaccttcg ggatggactc ctggttcccg26701tcgctaaaga tctataaccg caatggccag ctgtataccg cggttgacgc cgacctgatg26761ctggcctaca aggagagcga agtcgggcag atcctgcacg agggcatcaa cgaagccggg26821tcggtgggct cgttcatcgc ggccggcacc tcgtatgcga cgcacaacga accgatgatc26881cccatttaca tcttctactc gatgttcggc ttccagcgca ccggcgatag cttctgggcc26941gcggccgacc agatggctcg agggttcgtg ctcggggcca ccgccgggcg caccaccctg27001accggtgagg gcctgcaaca cgccgacggt cactcgttgc tgctggccgc caccaacccg27061gcggtggttg cctacgaccc ggccttcgcc tacgaaatcg cctacatcgt ggaaagcgga27121ctggccagga tgtgcgggga gaacccggag aacatcttct tctacatcac cgtctacaac27181gagccgtacg tgcagccgcc ggagccggag aacttcgatc ccgagggcgt gctgcggggt27241atctaccgct atcacgcggc caccgagcaa cgcaccaaca aggcgcagat cctggcctcc27301ggggtagcga tgcccgcggc gctgcgggca gcacagatgc 
tggccgccga gtgggatgtc27361gccgccgacg tgtggtcggt gaccagttgg ggcgagctaa accgcgacgg ggtggccatc27421gagaccgaga agctccgcca ccccgatcgg ccggcgggcg tgccctacgt gacgagagcg27481ctggagaatg ctcggggccc ggtgatcgcg gtgtcggact ggatgcgcgc ggtccccgag27541cagatccgac cgtgggtgcc gggcacatac ctcacgttgg gcaccgacgg gttcggcttt27601tccgacactc ggcccgccgc tcgccgctac ttcaacaccg acgccgaatc ccaggtggtc27661gcggttttgg aggcgttggc gggcgacggc gagatcgacc catcggtgcc ggtcgcggcc27721gcccgccagt accggatcga cgacgtggcg gctgcgcccg agcagaccac ggatcccggt27781cccggggcct aacgccggcg agccgaccgc ctttggccga atcttccaga aatctggcgt27841agcttttagg agtgaacgac aatcagttgg ctccagttgc ccgcccgagg tcgccgctcg27901aactgctgga cactgtgccc gattcgctgc tgcggcggtt gaagcagtac tcgggccggc27961tggccaccga ggcagtttcg gccatgcaag aacggttgcc gttcttcgcc gacctagaag28021cgtcccagcg cgccagcgtg gcgctggtgg tgcagacggc cgtggtcaac ttcgtcgaat28081ggatgcacga cccgcacagt gacgtcggct ataccgcgca ggcattcgag ctggtgcccc28141aggatctgac gcgacggatc gcgctgcgcc agaccgtgga catggtgcgg gtcaccatgg28201agttcttcga agaagtcgtg cccctgctcg cccgttccga agagcagttg accgccctca28261cggtgggcat tttgaaatac agccgcgacc tggcattcac cgccgccacg gcctacgccg28321atgcggccga ggcacgaggc acctgggaca gccggatgga ggccagcgtg gtggacgcgg28381tggtacgcgg cgacaccggt cccgagctgc tgtcccgggc ggccgcgctg aattgggaca28441ccaccgcgcc ggcgaccgta ctggtgggaa ctccggcgcc cggtccaaat ggctccaaca28501gcgacggcga cagcgagcgg gccagccagg atgtccgcga caccgcggct cgccacggcc28561gcgctgcgct gaccgacgtg cacggcacct ggctggtggc gatcgtctcc ggccagctgt28621cgccaaccga gaagttcctc aaagacctgc tggcagcatt cgccgacgcc ccggtggtca28681tcggccccac ggcgcccatg ctgaccgcgg cgcaccgcag cgctagcgag gcgatctccg28741ggatgaacgc cgtcgccggc tggcgcggag cgccgcggcc cgtgctggct agggaacttt28801tgcccgaacg cgccctgatg ggcgacgcct cggcgatcgt ggccctgcat accgacgtga28861tgcggcccct agccgatgcc ggaccgacgc tcatcgagac gctagacgca tatctggatt28921gtggcggcgc gattgaagct tgtgccagaa agttgttcgt tcatccaaac acagtgcggt28981accggctcaa gcggatcacc gacttcaccg ggcgcgatcc cacccagcca cgcgatgcct29041atgtccttcg 
ggtggcggcc accgtgggtc aactcaacta tccgacgccg cactgaagca29101tcgacagcaa tgccgtgtca tagattccct cgccggtcag agggggtcca gcaggggccc29161cggaaagata ccaggggcgc cgtcggacgg aaagtgatcc agacaacagg tcgcgggacg29221atctcaaaaa catagcttac aggcccgttt tgttggttat atacaaaaac ctaagacgag29281gttcataatc tgttacaccg cgcaaaaccg tcttcacagt gttctcttag acacgtgatt29341gcgttgctcg cacccggaca gggttcgcaa accgagggaa tgttgtcgcc gtggcttcag29401ctgcccggcg cagcggacca gatcgcggcg tggtcgaaag ccgctgatct agatcttgcc29461cggctgggca ccaccgcctc gaccgaggag atcaccgaca ccgcggtcgc ccagccattg29521atcgtcgccg cgactctgct ggcccaccag gaactggcgc gccgatgcgt gctcgccggc29581aaggacgtca tcgtggccgg ccactccgtc ggcgaaatcg cggcctacgc aatcgccggt29641gtgatagccg ccgacgacgc cgtcgcgctg gccgccaccc gcggcgccga gatggccaag29701gcctgcgcca ccgagccgac cggcatgtct gcggtgctcg gcggcgacga gaccgaggtg29761ctgagtcgcc tcgagcagct cgacttggtc ccggcaaacc gcaacgccgc cggccagatc29821gtcgctgccg gccggctgac cgcgttggag aagctcgccg aagacccgcc ggccaaggcg29881cgggtgcgtg cactgggtgt cgccggagcg ttccacaccg agttcatggc gcccgcactt29941gacggctttg cggcggccgc ggccaacatc gcaaccgccg accccaccgc cacgctgctg30001tccaaccgcg acgggaagcc ggtgacatcc gcggccgcgg cgatggacac cctggtctcc30061cagctcaccc aaccggtgcg atgggacctg tgcaccgcga cgctgcgcga acacacagtc30121acggcgatcg tggagttccc ccccgcgggc acgcttagcg gtatcgccaa acgcgaactt30181cggggggttc cggcacgcgc cgtcaagtca cccgcagacc tggacgagct ggcaaaccta30241taaccgcgga ctcggccaga acaaccacat acccgtcagt tcgatttgta cacaacatat30301tacgaaggga agcatgctgt gcctgtcact caggaagaaa tcattgccgg tatcgccgag30361atcatcgaag aggtaaccgg tatcgagccg tccgagatca ccccggagaa gtcgttcgtc30421gacgacctgg acatcgactc gctgtcgatg gtcgagatcg ccgtgcagac cgaggacaag30481tacggcgtca agatccccga cgaggacctc gccggtctgc gtaccgtcgg tgacgttgtc30541gcctacatcc agaagctcga ggaagaaaac ccggaggcgg ctcaggcgtt gcgcgcgaag30601attgagtcgg agaaccccga tgccgttgcc aacgttcagg cgaggcttga ggccgagtcc30661aagtgagtca gccttccacc gctaatggcg gtttccccag cgttgtggtg accgccgtca30721cagcgacgac gtcgatctcg ccggacatcg agagcacgtg 
gaagggtctg ttggccggcg30781agagcggcat ccacgcactc gaagacgagt tcgtcaccaa gtgggatcta gcggtcaaga30841tcggcggtca cctcaaggat ccggtcgaca gccacatggg ccgactcgac atgcgacgca30901tgtcgtacgt ccagcggatg ggcaagttgc tgggcggaca gctatgggag tccgccggca30961gcccggaggt cgatccagac cggttcgccg ttgttgtcgg caccggtcta ggtggagccg31021agaggattgt cgagagctac gacctgatga atgcgggcgg cccccggaag gtgtccccgc31081tggccgttca gatgatcatg cccaacggtg ccgcggcggt gatcggtctg cagcttgggg31141cccgcgccgg ggtgatgacc ccggtgtcgg cctgttcgtc gggctcggaa gcgatcgccc31201acgcgtggcg tcagatcgtg atgggcgacg ccgacgtcgc cgtctgcggc ggtgtcgaag31261gacccatcga ggcgctgccc atcgcggcgt tctccatgat gcgggccatg tcgacccgca31321acgacgagcc tgagcgggcc tcccggccgt tcgacaagga ccgcgacggc tttgtgttcg31381gcgaggccgg tgcgctgatg ctcatcgaga cggaggagca cgccaaagcc cgtggcgcca31441agccgttggc ccgattgctg ggtgccggta tcacctcgga cgcctttcat atggtggcgc31501ccgcggccga tggtgttcgt gccggtaggg cgatgactcg ctcgctggag ctggccgggt31561tgtcgccggc ggacatcgac cacgtcaacg cgcacggcac ggcgacgcct atcggcgacg31621ccgcggaggc caacgccatc cgcgtcgccg gttgtgatca ggccgcggtg tacgcgccga31681agtctgcgct gggccactcg atcggcgcgg tcggtgcgct cgagtcggtg ctcacggtgc31741tgacgctgcg cgacggcgtc atcccgccga ccctgaacta cgagacaccc gatcccgaga31801tcgaccttga cgtcgtcgcc ggcgaaccgc gctatggcga ttaccgctac gcagtcaaca31861actcgttcgg gttcggcggc cacaatgtgg cgcttgcctt cgggcgttac tgaagcacga31921catcgcgggt cgcgaggccc gaggtggggg tccccccgct tgcgggggcg agtcggaccg31981atatggaagg aacgttcgca agaccaatga cggagctggt taccgggaaa gcctttccct32041acgtagtcgt caccggcatc gccatgacga ccgcgctcgc gaccgacgcg gagactacgt32101ggaagttgtt gctggaccgc caaagcggga tccgtacgct cgatgaccca ttcgtcgagg32161agttcgacct gccagttcgc atcggcggac atctgcttga ggaattcgac caccagctga32221cgcggatcga actgcgccgg atgggatacc tgcagcggat gtccaccgtg ctgagccggc32281gcctgtggga aaatgccggc tcacccgagg tggacaccaa tcgattgatg gtgtccatcg32341gcaccggcct gggttcggcc gaggaactgg tcttcagtta cgacgatatg cgcgctcgcg32401gaatgaaggc ggtctcgccg ctgaccgtgc agaagtacat gcccaacggg gccgccgcgg32461cggtcgggtt 
ggaacggcac gccaaggccg gggtgatgac gccggtatcg gcgtgcgcat32521ccggcgccga ggccatcgcc cgtgcgtggc agcagattgt gctgggagag gccgatgccg32581ccatctgcgg cggcgtggag accaggatcg aagcggtgcc catcgccggg ttcgctcaga32641tgcgcatcgt gatgtccacc aacaacgacg accccgccgg tgcatgccgc ccattcgaca32701gggaccgcga cggctttgtg ttcggcgagg gcggcgccct tctgttgatc gagaccgagg32761agcacgccaa ggcacgtggc gccaacatcc tggcccggat catgggcgcc agcatcacct32821ccgatggctt ccacatggtg gccccggacc ccaacgggga acgcgccggg catgcgatta32881cgcgggcgat tcagctggcg ggcctcgccc ccggcgacat cgaccacgtc aatgcgcacg32941ccaccggcac ccaggtcggc gacctggccg aaggcagggc catcaacaac gccttgggcg33001gcaaccgacc ggcggtgtac gcccccaagt ctgccctcgg ccactcggtg ggcgcggtcg33061gcgcggtcga atcgatcttg acggtgctcg cgttgcgcga tcaggtgatc ccgccgacac33121tgaatctggt aaacctcgat cccgagatcg atttggacgt ggtggcgggt gaaccgcgac33181cgggcaatta ccggtatgcg atcaataact cgttcggatt cggcggccac aacgtggcaa33241tcgccttcgg acggtactaa accccagcgt tacgcgacag gagacctgcg atgacaatca33301tggcccccga ggcggttggc gagtcgctcg acccccgcga tccgctgttg cggctgagca33361acttcttcga cgacggcagc gtggaattgc tgcacgagcg tgaccgctcc ggagtgctgg33421ccgcggcggg caccgtcaac ggtgtgcgca ccatcgcgtt ctgcaccgac ggcaccgtga33481tgggcggcgc catgggcgtc gaggggtgca cgcacatcgt caacgcctac gacactgcca33541tcgaagacca gagtcccatc gtgggcatct ggcattcggg tggtgcccgg ctggctgaag33601gtgtgcgggc gctgcacgcg gtaggccagg tgttcgaagc catgatccgc gcgtccggct33661acatcccgca gatctcggtg gtcgtcggtt tcgccgccgg cggcgccgcc tacggaccgg33721cgttgaccga cgtcgtcgtc atggcgccgg aaagccgggt gttcgtcacc gggcccgacg33781tggtgcgcag cgtcaccggc gaggacgtcg acatggcctc gctcggtggg ccggagaccc33841accacaagaa gtccggggtg tgccacatcg tcgccgacga cgaactcgat gcctacgacc33901gtgggcgccg gttggtcgga ttgttctgcc agcaggggca tttcgatcgc agcaaggccg33961aggccggtga caccgacatc cacgcgctgc tgccggaatc ctcgcgacgt gcctacgacg34021tgcgtccgat cgtgacggcg atcctcgatg cggacacacc gttcgacgag ttccaggcca34081attgggcgcc gtcgatggtg gtcgggctgg gtcggctgtc gggtcgcacg gtgggtgtac34141tggccaacaa cccgctacgc ctgggcggct gcctgaactc 
cgaaagcgca gagaaggcag34201cgcgtttcgt gcggctgtgc gacgcgttcg ggattccgct ggtggtggtg gtcgatgtgc34261cgggctatct gcccggtgtc gaccaggagt ggggtggcgt ggtgcgccgt ggcgccaagt34321tgctgcacgc gttcggcgag tgcaccgttc cgcgggtcac gctggtcacc cgaaagacct34381acggcggggc atacattgcg atgaactccc ggtcgttgaa cgcgaccaag gtgttcgcct34441ggccggacgc cgaggtcgcg gtgatgggcg ctaaggcggc cgtcggcatc ctgcacaaga34501agaagttggc cgccgctccg gagcacgaac gcgaagcgct gcacgaccag ttggccgccg34561agcatgagcg catcgccggc ggggtcgaca gtgcgctgga catcggtgtg gtcgacgaga34621agatcgaccc ggcgcatact cgcagcaagc tcaccgaggc gctggcgcag gctccggcac34681ggcgcggccg ccacaagaac atcccgctgt agttctgacc gcgagcagac gcagaatcgc34741acgcgcgagg tccgcgccgt gcgattctgc gtctgctcgc cagttatccc cagcggtggc34801tggtcaacgc gaggcgctcc tcgcatgctc ggacggtgcc taccgacgcg ctaacaattc34861tcgagaaggc cggcgggttc gccaccaccg cgcaattgct cacggtcatg acccgccaac34921agctcgacgt ccaagtgaaa aacggcggcc tcgttcgcgt ttggtacggg gtctacgcgg34981cacaagagcc ggacctgttg ggccgcttgg cggctctcga tgtgttcatg ggggggcacg35041ccgtcgcgtg tctgggcacc gccgccgcgt tgtatggatt cgacacggaa aacaccgtcg35101ctatccatat gctcgatccc ggagtaagga tgcggcccac ggtcggtctg atggtccacc35161aacgcgtcgg tgcccggctc caacgggtgt caggtcgtct cgcgaccgcg cccgcatgga35221ctgccgtgga ggtcgcacga cagttgcgcc gcccgcgggc gctggccacc ctcgacgccg35281cactacggtc aatgcgctgc gctcgcagtg aaattgaaaa cgccgttgct gagcagcgag35341gccgccgagg catcgtcgcg gcgcgcgaac tcttaccctt cgccgacgga cgcgcggaat35401cggccatgga gagcgaggct cggctcgtca tgatcgacca cgggctgccg ttgcccgaac35461ttcaataccc gatacacggc cacggtggtg aaatgtggcg agtcgacttc gcctggcccg35521acatgcgtct cgcggccgaa tacgaaagca tcgagtggca cgcgggaccg gcggagatgc35581tgcgcgacaa gacacgctgg gccaagctcc aagagctcgg gtggacgatt gtcccgattg35641tcgtcgacga tgtcagacgc gaacccggcc gcctggcggc ccgcatcgcc cgccacctcg35701accgcgcgcg tatggccggc tgaccgctgg tgagcagacg cagagtcgca ctgcggccgg35761cgcagtgcga ctctgcgtct gctcgcgctc aacggctgag gaactcctta gccacggcga35821ctacgcgctc gcgatcccgt ggcaccagac cgatccgggt ccggcggtcg aggatatcgt35881ccacatccag 
cgccccctca tgggtcaccg cgtattcgaa ctccgcccgg gtcacgtcga35941tgccgtcggc gaccggctcg gtgggccgct cacatgtggc ggcggcagcg acgttggccg36001cctcggcccc gtaccgcgcc accagcgact cgggcaatcc ggcgcccgat ccgggggccg36061gcccagggtt cgccggtgcg ccgatcagcg gcaggttgcg agtgcggcac ttcgcggctc36121gcaggtgtcg cagcgtgatg gcgcgattca gcacatcctc tgccatgtag cggtattccg36181tcagcttgcc gccgaccaca ctgatcacgc ccgacggcga ttcaaaaaca gcgtggtcac36241gcgaaacgtc ggcggtgcgg ccctggacac cagcaccgcc ggtgtcgatt agcggccgca36301atcccgcata ggcaccgatg acatccttgg tgccgaccgc cgtccccaat gcggtgttca36361ccgtatccag caggaacgtg atctcttccg aagacggttg tggcacatcg ggaatcgggc36421cgggtgcgtc ttcgtcggtc agcccgagat agatccggcc cagctgctcg ggcatggcga36481acacgaagcg gttcagctca ccggggatcg gaatggtcag cgcggcagtc ggattggcaa36541acgacttcgc gtcgaagacc agatgtgtgc cgcggctggg gcgtagcctc agggacgggt36601cgatctcacc cgcccacacg cccgccgcgt tgatgacggc acgcgccgac agcgcgaacg36661actgccgggt gcgccggtcg gtcaactcca ccgaagtgcc ggtgacattc gacgcgccca36721cgtaagtgag gatgcgggcg ccgtgctggg ccgcggtgcg cgcgacggcc atgaccagcc36781gggcgtcgtc gatcaattgc ccgtcgtacg cgagcagacc accgtcgagg ccgtcccgcc36841gaacggtggg agcaatctcc accacccgtg acgccgggat tcggcgcgat cggggcaacg36901tcgccgccgg cgtacccgct agcacccgca aagcgtcgcc ggccaggaaa ccggcacgca36961ccaacgcccg cttggtgtga cccatcgacg gcaacaacgg gaccagttgc ggcatggcat37021gcacgagatg aggagcgttg cgtgtcatca ggattccgcg ttcgacggcg ctgcgccggg37081cgatgcccac gttgccgctg gccagatagc gcagaccgcc gtgcaccaac ttcgagctcc37141agcggctggt gccgaacgcc agatcatgct tttccaccaa ggccaccgtc agaccgcggg37201tggcagcatc taaggcaatg ccaacaccgg taatgccgcc gcctatcacg atgacgtcga37261gtgcgccacc gtcggccagt gcggtcaggt cggcggagcg acgcgccgcg ttgagtgcag37321ccgagtgggg catcagcaca aatatccgtt cagtgcgtgg gtaagttcgg tggccagcgc37381ggcggaatcg aggatcgaat cgacgatgtc cgcggactgg atggtcgact gggcgatcag37441caacaccatg gtcgccagtc gacgagcgtc gccggagcgc acactgcccg accgctgcgc37501cactgtcagc cgggcggcca acccctcgat caggacctgc tggctggtgc cgaggcgctc37561ggtgatgtac accctggcca gctccgagtg catgaccgac 
atgatcagat cgtcaccccg37621caaccggtcg gccaccgcga caatctgctt taccaacgct tcccggtcgt ccccgtcgag37681gggcacctcc cgcagcacgt cggcgatatg gctggtcagc atggacgcca tgatcgaccg37741ggtgtccggc cagcgacggt atacggtcgg gcggctcacg cccgcgcgcc gggcgatctc37801ggcaagtgtc acccggtcca cgccgtaatc gacgacgcag ctcgccgctg cccgcaggat37861acgaccaccg gtatccgcgc ggtcattact cattgacagc atgtgtaata ctgtaacgcg37921tgactcaccg cgaggaactc cttccaccga tgaaatggga cgcgtgggga gatcccgccg37981cggccaagcc actttctgat ggcgtccggt cgttgctgaa gcaggttgtg ggcctagcgg38041actcggagca gcccgaactc gaccccgcgc aggtgcagct gcgcccgtcc gccctgtcgg38101gggcagacca



6.9. X-Linked Inhibitor of Apoptosis Protein (“XIAP”)

[0214] GenBank Accession #U45880:
23   1gaaaaggtgg acaagtccta ttttcaagag aagatgactt ttaacagttt tgaaggatct(SEQ ID NO: 25)  61aaaacttgtg tacctgcaga catcaataag gaagaagaat ttgtagaaga gtttaataga 121ttaaaaactt ttgctaattt tccaagtggt agtcctgttt cagcatcaac actggcacga 181gcagggtttc tttatactgg tgaaggagat accgtgcggt gctttagttg tcatgcagct 241gtagatagat ggcaatatgg agactcagca gttggaagac acaggaaagt atccccaaat 301tgcagattta tcaacggctt ttatcttgaa aatagtgcca cgcagtctac aaattctggt 361atccagaatg gtcagtacaa agttgaaaac tatctgggaa gcagagatca ttttgcctta 421gacaggccat ctgagacaca tgcagactat cttttgagaa ctgggcaggt tgtagatata 481tcagacacca tatacccgag gaaccctgcc atgtattgtg aagaagctag attaaagtcc 541tttcagaact ggccagacta tgctcaccta accccaagag agttagcaag tgctggactc 601tactacacag gtattggtga ccaagtgcag tgcttttgtt gtggtggaaa actgaaaaat 661tgggaacctt gtgatcgtgc ctggtcagaa cacaggcgac actttcctaa ttgcttcttt 721gttttgggcc ggaatcttaa tattcgaagt gaatctgatg ctgtgagttc tgataggaat 781ttcccaaatt caacaaatct tccaagaaat ccatccatgg cagattatga agcacggatc 841tttacttttg ggacatggat atactcagtt aacaaggagc agcttgcaag agctggattt 901tatgctttag gtgaaggtga taaagtaaag tgctttcact gtggaggagg gctaactgat 961tggaagccca gtgaagaccc ttgggaacaa catgctaaat ggtatccagg gtgcaaatat1021ctgttagaac agaagggaca agaatatata aacaatattc atttaactca ttcacttgag1081gagtgtctgg taagaactac tgagaaaaca ccatcactaa ctagaagaat tgatgatacc1141atcttccaaa atcctatggt acaagaagct atacgaatgg ggttcagttt caaggacatt1201aagaaaataa tggaggaaaa aattcagata tctgggagca actataaatc acttgaggtt1261ctggttgcag atctagtgaa tgctcagaaa gacagtatgc aagatgagtc aagtcagact1321tcattacaga aagagattag tactgaagag cagctaaggc gcctgcaaga ggagaagctt1381tgcaaaatct gtatggatag aaatattgct atcgtttttg ttccttgtgg acatctagtc1441acttgtaaac aatgtgctga agcagttgac aagtgtccca tgtgctacac agtcattact1501ttcaagcaaa aaatttttat gtcttaatct aactctatag taggcatgtt atgttgttct1561tattaccctg attgaatgtg tgatgtgaac tgactttaag taatcaggat tgaattccat1621tagcatttgc taccaagtag gaaaaaaaat gtacatggca gtgttttagt tggcaatata1681atctttgaat ttcttgattt ttcagggtat tagctgtatt 
atccattttt tttactgtta1741tttaattgaa accatagact aagaataaga agcatcatac tataactgaa cacaatgtgt1801attcatagta tactgattta atttctaagt gtaagtgaat taatcatctg gattttttat1861tcttttcaga taggcttaac aaatggagct ttctgtatat aaatgtggag attagagtta1921atctccccaa tcacataatt tgttttgtgt gaaaaaggaa taaattgttc catgctggtg1981gaaagataga gattgttttt agaggttggt tgttgtgttt taggattctg tccattttct2041tgtaaaggga taaacacgga cgtgtgcgaa atatgtttgt aaagtgattt gccattgttg2101aaagcgtatt taatgataga atactatcga gccaacatgt actgacatgg aaagatgtca2161gagatatgtt aagtgtaaaa tgcaagtggc gggacactat gtatagtctg agccagatca2221aagtatgtat gttgttaata tgcatagaac gagagatttg gaaagatata caccaaactg2281ttaaatgtgg tttctcttcg gggagggggg gattggggga ggggccccag aggggtttta2341gaggggcctt ttcactttcg acttttttca ttttgttctg ttcggatttt ttataagtat2401gtagaccccg aagggtttta tgggaactaa catcagtaac ctaacccccg tgactatcct2461gtgctcttcc tagggagctg tgttgtttcc cacccaccac ccttccctct gaacaaatgc2521ctgagtgctg gggcactttg


[0215] General Target Region:


[0216] Internal Ribosome Entry Site (IRES) in 5′ untranslated region:
5′AGCUCCUAUAACAAAAGUCUGUUGCUUGUGUUUCACAUUUUGGAUUUCCUAAUAUAAUGUUCUCUUUUUAGAAAAGGUGGACAAGUCCUAUUUUCAAGAGAAG3′ (SEQ ID NO: 26)


[0217] Initial Specific Target Motif:


[0218] RNP core binding site within XIAP IRES
5′GGAUUUCCUAAUAUAAUGUUCUCUUUUU3′ (SEQ ID NO: 27)
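The relationship between the general target region and the initial specific target motif can be checked with a simple substring search. The following is a minimal sketch, not part of the original disclosure; the function name `locate_motif` is hypothetical, and the two sequences are transcribed from SEQ ID NO: 26 and SEQ ID NO: 27 above.

```python
# Sequences transcribed from the listing above (5' to 3').
XIAP_IRES = (
    "AGCUCCUAUAACAAAAGUCUGUUGCUUGUGUUUCACAUUUUGGAUU"
    "UCCUAAUAUAAUGUUCUCUUUUUAGAAAAGGUGGACAAGUCCUAUUUUCAAGAGAAG"
)  # SEQ ID NO: 26
RNP_CORE_SITE = "GGAUUUCCUAAUAUAAUGUUCUCUUUUU"  # SEQ ID NO: 27


def locate_motif(motif, sequence):
    """Return the 1-based (start, end) of the first exact match, or None.

    Hypothetical helper for illustration; it performs exact string
    matching only and ignores RNA secondary structure.
    """
    start = sequence.find(motif)
    if start < 0:
        return None
    return start + 1, start + len(motif)


span = locate_motif(RNP_CORE_SITE, XIAP_IRES)
print(span)
```

An exact match indicates that the motif is a contiguous subsequence of the IRES region; a result of `None` would indicate a transcription discrepancy between the two listings.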



6.10. Survivin

[0219] GenBank Accession #NM001168:
26   1ccgccagatt tgaatcgcgg gacccgttgg cagaggtggc ggcggcggca tgggtgcccc(SEQ ID NO: 28)  61gacgttgccc cctgcctggc agccctttct caaggaccac cgcatctcta cattcaagaa 121ctggcccttc ttggagggct gcgcctgcac cccggagcgg atggccgagg ctggcttcat 181ccactgcccc actgagaacg agccagactt ggcccagtgt ttcttctgct tcaaggagct 241ggaaggctgg gagccagatg acgaccccat agaggaacat aaaaagcatt cgtccggttg 301cgctttcctt tctgtcaaga agcagtttga agaattaacc cttggtgaat ttttgaaact 361ggacagagaa agagccaaga acaaaattgc aaaggaaacc aacaataaga agaaagaatt 421tgaggaaact gcgaagaaag tgcgccgtgc catcgagcag ctggctgcca tggattgagg 481cctctggccg gagctgcctg gtcccagagt ggctgcacca cttccagggt ttattccctg 541gtgccaccag ccttcctgtg ggccccttag caatgtctta ggaaaggaga tcaacatttt 601caaattagat gtttcaactg tgctcctgtt ttgtcttgaa agtggcacca gaggtgcttc 661tgcctgtgca gcgggtgctg ctggtaacag tggctgcttc tctctctctc tctctttttt 721gggggctcat ttttgctgtt ttgattcccg ggcttaccag gtgagaagtg agggaggaag 781aaggcagtgt cccttttgct agagctgaca gctttgttcg cgtgggcaga gccttccaca 841gtgaatgtgt ctggacctca tgttgttgag gctgtcacag tcctgagtgt ggacttggca 901ggtgcctgtt gaatctgagc tgcaggttcc ttatctgtca cacctgtgcc tcctcagagg 961acagtttttt tgttgttgtg tttttttgtt tttttttttt ggtagatgca tgacttgtgt1021gtgatgagag aatggagaca gagtccctgg ctcctctact gtttaacaac atggctttct1081tattttgttt gaattgttaa ttcacagaat agcacaaact acaattaaaa ctaagcacaa1141agccattcta agtcattggg gaaacggggt gaacttcagg tggatgagga gacagaatag1201agtgatagga agcgtctggc agatactcct tttgccactg ctgtgtgatt agacaggccc1261agtgagccgc ggggcacatg ctggccgctc ctccctcaga aaaaggcagt ggcctaaatc1321ctttttaaat gacttggctc gatgctgtgg gggactggct gggctgctgc aggccgtgtg1381tctgtcagcc caaccttcac atctgtcacg ttctccacac gggggagaga cgcagtccgc1441ccaggtcccc gctttctttg gaggcagcag ctcccgcagg gctgaagtct ggcgtaagat1501gatggatttg attcgccctc ctccctgtca tagagctgca gggtggattg ttacagcttc1561gctggaaacc tctggaggtc atctcggctg ttcctgagaa ataaaaagcc tgtcatttc



7. EXAMPLE: IDENTIFICATION OF A DYE-LABELED TARGET RNA BOUND TO SMALL MOLECULAR WEIGHT COMPOUNDS

[0220] The results presented in this Example indicate that gel mobility shift assays can be used to detect the binding of small molecules, such as the Tat peptide and gentamicin, to their respective target RNAs.



7.1. Materials and Methods


7.1.1. Buffers

[0221] Tris-potassium chloride (TK) buffer is composed of 50 mM Tris-HCl pH 7.4, 20 mM KCl, 0.1% Triton X-100, and 0.5 mM MgCl2. Tris-borate-EDTA (TBE) buffer is composed of 45 mM Tris-borate pH 8.0, and 1 mM EDTA. Tris-potassium chloride-magnesium (TKM) buffer is composed of 50 mM Tris-HCl pH 7.4, 20 mM KCl, 0.1% Triton X-100, and 5 mM MgCl2.
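By way of illustration only, the TK and TKM compositions above can be converted into pipetting volumes using the dilution relation C1·V1 = C2·V2. In the sketch below the final concentrations are taken from the text, whereas the stock concentrations (1 M Tris-HCl, 1 M KCl, 10% Triton X-100, 100 mM MgCl2) and the function name `recipe_ul` are assumptions chosen for the example.

```python
# Each entry: name -> (final concentration, ASSUMED stock concentration).
# Final and stock values share the same unit within an entry (mM or %).
TK_BUFFER = {
    "Tris-HCl pH 7.4": (50.0, 1000.0),  # mM
    "KCl": (20.0, 1000.0),              # mM
    "Triton X-100": (0.1, 10.0),        # percent
    "MgCl2": (0.5, 100.0),              # mM
}
# TKM differs from TK only in its MgCl2 concentration (5 mM).
TKM_BUFFER = dict(TK_BUFFER, MgCl2=(5.0, 100.0))


def recipe_ul(components, final_volume_ul):
    """Return the stock volume (in microliters) of each component plus water."""
    volumes = {
        name: final * final_volume_ul / stock
        for name, (final, stock) in components.items()
    }
    volumes["water"] = final_volume_ul - sum(volumes.values())
    return volumes


for label, buf in (("TK", TK_BUFFER), ("TKM", TKM_BUFFER)):
    print(label, recipe_ul(buf, 1000.0))
```

With different stock solutions, only the second value of each tuple needs to change.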



7.1.2. Gel Retardation Analysis

[0222] RNA oligonucleotides were purchased from Dharmacon, Inc. (Lafayette, Colo.). 500 pmole of either a 5′ fluorescein-labeled oligonucleotide corresponding to the 16S rRNA A site (5′-GGCGUCACACCUUCGGGUGAAGUCGCC-3′ (SEQ ID NO: 29); Moazed & Noller, 1987, Nature 327:389-394; Woodcock et al., 1991, EMBO J. 10:3099-3103; Yoshizawa et al., 1998, EMBO J. 17:6437-6448) or a 5′ fluorescein-labeled oligonucleotide corresponding to the HIV-1 TAR element TAR RNA (5′-GGCGUCACACCUUCGGGUGAAGUCGCC-3′ (SEQ ID NO: 30); Huq et al., 1999, Nucleic Acids Research. 27:1084-1093; Hwang et al., 1999, Proc. Natl. Acad. Sci. USA 96:12997-13002) was 3′ labeled with 5′-32P cytidine 3′,5′-bis(phosphate) (NEN) and T4 RNA ligase (NEBiolabs) in 10% DMSO as per the manufacturer's instructions. The labeled oligonucleotides were purified using G-25 Sephadex columns (Boehringer Mannheim). For Tat-TAR gel retardation reactions, the method of Huq et al. (Nucleic Acids Research, 1999, 27:1084-1093) was utilized with TK buffer containing 0.5 mM MgCl2 and a 12-mer Tat peptide (YGRKKRRQRRRP (SEQ ID NO: 31); single letter amino acid code). For 16S rRNA-gentamicin reactions, the method of Huq et al. was used with TKM buffer. In 20 μl reaction volumes, 50 pmoles of 32P cytidine-labeled oligonucleotide and either gentamicin sulfate (Sigma) or the short Tat peptide (Tat47-58) in TK or TKM buffer were heated at 90° C. for 2 minutes and allowed to cool to room temperature (approximately 24° C.) over 2 hours. Then 10 μl of 30% glycerol was added to each reaction tube and the entire sample was loaded onto a TBE non-denaturing polyacrylamide gel and electrophoresed at 1200-1600 volt-hours at 4° C. The gel was exposed to an intensifying screen and radioactivity was quantitated using a Typhoon phosphorimager (Molecular Dynamics).
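The quantities stated above imply the oligonucleotide concentration in the binding reaction and the glycerol concentration of the loaded sample. The following sketch works through that arithmetic; the function names are hypothetical and the only inputs are the values given in the text (50 pmoles in 20 μl, followed by 10 μl of 30% glycerol).

```python
def micromolar(amount_pmol, volume_ul):
    """Concentration in micromolar; pmol per microliter equals micromolar."""
    return amount_pmol / volume_ul


def after_addition(conc, volume_ul, added_volume_ul):
    """Concentration after diluting volume_ul with added_volume_ul of diluent."""
    return conc * volume_ul / (volume_ul + added_volume_ul)


rna_binding_uM = micromolar(50.0, 20.0)                     # during binding
rna_loaded_uM = after_addition(rna_binding_uM, 20.0, 10.0)  # after glycerol
# Glycerol is diluted by the 20 ul reaction it is added to.
glycerol_loaded_pct = after_addition(30.0, 10.0, 20.0)

print(rna_binding_uM, rna_loaded_uM, glycerol_loaded_pct)
```

Under these values the labeled RNA is at 2.5 μM during binding, and the loaded sample contains 10% glycerol in a total volume of 30 μl.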



7.2. Background

[0223] One method used to demonstrate small molecule interactions with naturally occurring RNA structures such as ribosomes is chemical footprinting or toe printing (Moazed & Noller, 1987, Nature 327:389-394; Woodcock et al., 1991, EMBO J. 10:3099-3103; Yoshizawa et al., 1998, EMBO J. 17:6437-6448). Here, the use of gel mobility shift assays to monitor RNA-small molecule interactions is described. This approach allows for rapid visualization of small molecule-RNA interactions based on the difference between the mobility of RNA alone and that of RNA in a complex with a small molecule. To validate this approach, an RNA oligonucleotide corresponding to the well-characterized gentamicin binding site on the 16S rRNA (Moazed & Noller, 1987, Nature 327:389-394) and the equally well-characterized HIV-1 TAT protein binding site on the HIV-1 TAR element (Huq et al., 1999, Nucleic Acids Res. 27: 1084-1093) were chosen. The purpose of these experiments is to lay the groundwork for the use of chromatographic techniques in a high throughput fashion, such as microcapillary electrophoresis, for drug discovery.



7.3. Results

[0224] A gel retardation assay was performed using the Tat47-58 peptide and the TAR RNA oligonucleotide. As shown in FIG. 1, in the presence of the Tat peptide, a clear shift is visible when the products are separated on a 12% non-denaturing polyacrylamide gel. In the reaction that lacks peptide, only the free RNA is visible. These observations confirm previous reports made using other Tat peptides (Hamy et al., 1997, Proc. Natl. Acad. Sci. USA 94:3548-3553; Huq et al., 1999, Nucleic Acids Res. 27: 1084-1093).


[0225] Based on the results of FIG. 1, it was hypothesized that RNA interactions with small organic molecules could also be visualized using this method. As shown in FIG. 2, the addition of varying concentrations of gentamicin to an RNA oligonucleotide corresponding to the 16S rRNA A site produces a mobility shift. These results demonstrate that the binding of the small molecule gentamicin to an RNA oligonucleotide having a defined structure in solution can be monitored using this approach. In addition, as shown in FIG. 2, a concentration as low as 10 ng/ml gentamicin produces the mobility shift.


[0226] To determine whether lower concentrations of gentamicin would be sufficient to produce a gel shift, an experiment similar to that shown in FIG. 2 was performed, except that the concentrations of gentamicin ranged from 100 ng/ml to 10 pg/ml. As shown in FIG. 3, gel mobility shifts are produced when the gentamicin concentration is as low as 10 pg/ml. Further, the results shown in FIG. 3 demonstrate that the shift is specific to the 16S rRNA oligonucleotide, as the use of an unrelated oligonucleotide, corresponding to the HIV TAR RNA element, does not result in a gel mobility shift when incubated with 10 μg/ml gentamicin. In addition, if a concentration as low as 10 pg/ml gentamicin produces a gel mobility shift, then it should be possible to detect changes to RNA structural motifs when small amounts of compounds from a library of diverse compounds are screened in this fashion.
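The concentration range described above can be laid out as a dilution series. The sketch below assumes ten-fold steps between 100 ng/ml and 10 pg/ml; the text states only the endpoints of the range, so the step size and the function name `tenfold_series` are assumptions made for illustration.

```python
def tenfold_series(start_pg_per_ml, stop_pg_per_ml):
    """Ten-fold dilution series from start down to stop (inclusive), in pg/ml.

    Integer picogram-per-milliliter values are used to avoid
    floating-point rounding. The ten-fold step is an assumption.
    """
    series = []
    conc = start_pg_per_ml
    while conc >= stop_pg_per_ml:
        series.append(conc)
        conc //= 10
    return series


# 100 ng/ml equals 100,000 pg/ml; the lowest point is 10 pg/ml.
gentamicin_pg_per_ml = tenfold_series(100_000, 10)
print(gentamicin_pg_per_ml)
```

Under the ten-fold assumption the series spans five concentrations, covering four orders of magnitude.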


[0227] Further analysis of the gentamicin-RNA interaction indicates that the interaction is magnesium- and temperature-dependent. As shown in FIG. 4, when MgCl2 is not present (TK buffer), 1 mg/ml of gentamicin must be added to the reaction to produce a gel shift.


[0228] Similarly, the temperature of the reaction when gentamicin is added is also important. When gentamicin is present in the reaction during the entire denaturation/renaturation cycle, that is, when gentamicin is added at 90° C. or 85° C., a gel shift is visualized (data not shown). In contrast, when gentamicin is added after the renaturation step has proceeded to 75° C., a mobility shift is not produced. These results are consistent with the notion that gentamicin may recognize and interact with an RNA structure formed early in the renaturation process.



8. EXAMPLE: IDENTIFICATION OF A DYE-LABELED TARGET RNA BOUND TO SMALL MOLECULAR WEIGHT COMPOUNDS BY CAPILLARY ELECTROPHORESIS

[0229] The results presented in this Example indicate that interactions between a peptide and its target RNA, such as the Tat peptide and TAR RNA, can be monitored by gel retardation assays in an automated capillary electrophoresis system.



8.1. Materials and Methods


8.1.1. Buffers

[0230] Tris-potassium chloride (TK) buffer is composed of 50 mM Tris-HCl pH 7.4, 20 mM KCl, 0.1% Triton X-100, and 0.5 mM MgCl2. Tris-borate-EDTA (TBE) buffer is composed of 45 mM Tris-borate pH 8.0, and 1 mM EDTA. Tris-Potassium chloride-magnesium (TKM) buffer is composed of 50 mM Tris-HCl pH 7.4, 20 mM KCl, 0.1% Triton X-100 and 5 mM MgCl2.



8.1.2. Gel Retardation Analysis Using Capillary Electrophoresis

[0231] RNA oligonucleotides were purchased from Dharmacon, Inc. (Lafayette, Colo.). 500 pmole of a 5′ fluorescein-labeled oligonucleotide corresponding to the HIV-1 TAR element TAR RNA (5′-GGCGUCACACCUUCGGGUGAAGUCGCC-3′ (SEQ ID NO: 30); Huq et al., 1999, Nucleic Acids Research. 27:1084-1093; Hwang et al., 1999, Proc. Natl. Acad. Sci. USA 96:12997-13002) was used. For Tat-TAR gel retardation reactions, the method of Huq et al. (Nucleic Acids Research, 1999, 27:1084-1093) was utilized with TK buffer containing 0.5 mM MgCl2 and a 12-mer Tat peptide (YGRKKRRQRRRP (SEQ ID NO: 31); single letter amino acid code). In 20 μl reaction volumes, 50 pmoles of labeled oligonucleotide and the short Tat peptide (Tat47-58) in TK or TKM buffer were heated at 90° C. for 2 minutes and allowed to cool to room temperature (approximately 24° C.) over 2 hours. The reactions were loaded onto a SCE9610 automated capillary electrophoresis apparatus (SpectruMedix; State College, Pa.).



8.2. Results

[0232] As presented in the previous Example in Section 7, interactions between a peptide and RNA can be monitored by gel retardation assays. It was hypothesized that such interactions could also be monitored by gel retardation assays in an automated capillary electrophoresis system. To test this hypothesis, a gel retardation assay was performed in an automated capillary electrophoresis system using the Tat47-58 peptide and the TAR RNA oligonucleotide. As shown in FIG. 5, using the capillary electrophoresis system, a clear shift is visible upon the addition of increasing concentrations of Tat peptide. In the reaction that lacks peptide, only a peak corresponding to the free RNA is observed. These observations confirm previous reports made using other Tat peptides (Hamy et al., 1997, Proc. Natl. Acad. Sci. USA 94:3548-3553; Huq et al., 1999, Nucleic Acids Res. 27: 1084-1093).


[0233] The present invention is not to be limited in scope by the specific embodiments described herein. Indeed, various modifications of the invention in addition to those described will become apparent to those skilled in the art from the foregoing description and accompanying figures. Such modifications are intended to fall within the scope of the appended claims.


[0234] Various publications are cited herein, the disclosures of which are incorporated by reference in their entireties.


[0235] The invention can be illustrated by the following embodiments enumerated in the numbered paragraphs that follow:


[0236] 1. A method for identifying a test compound that binds to a target RNA molecule, comprising the steps of (a) contacting a detectably labeled target RNA molecule with a library of test compounds under conditions that permit direct binding of the labeled target RNA to a member of the library of test compounds so that a detectably labeled target RNA:test compound complex is formed; (b) separating the detectably labeled target RNA:test compound complex formed in step (a) from uncomplexed target RNA molecules and test compounds; and (c) determining a structure of the test compound bound to the RNA in the RNA:test compound complex.


[0237] 2. The method of paragraph 1 in which the target RNA molecule contains an HIV TAR element, internal ribosome entry site, “slippery site”, instability element, or adenylate uridylate-rich element.


[0238] 3. The method of paragraph 1 in which the RNA molecule is an element derived from the mRNA for tumor necrosis factor alpha (“TNF-α”), granulocyte-macrophage colony stimulating factor (“GM-CSF”), interleukin 2 (“IL-2”), interleukin 6 (“IL-6”), vascular endothelial growth factor (“VEGF”), human immunodeficiency virus I (“HIV-1”), hepatitis C virus (“HCV”—genotypes 1a & 1b), ribonuclease P RNA (“RNaseP”), X-linked inhibitor of apoptosis protein (“XIAP”), or survivin.


[0239] 4. The method of paragraph 1 in which the detectably labeled RNA is labeled with a fluorescent dye, phosphorescent dye, ultraviolet dye, infrared dye, visible dye, radiolabel, enzyme, spectroscopic colorimetric label, affinity tag, or nanoparticle.


[0240] 5. The method of paragraph 1 in which the test compound is selected from a combinatorial library comprising peptoids; random bio-oligomers; diversomers such as hydantoins, benzodiazepines and dipeptides; vinylogous polypeptides; nonpeptidal peptidomimetics; oligocarbamates; peptidyl phosphonates; peptide nucleic acid libraries; antibody libraries; carbohydrate libraries; and small organic molecule libraries, including but not limited to, libraries of benzodiazepines, isoprenoids, thiazolidinones, metathiazanones, pyrrolidines, morpholino compounds, or diazepindiones.


[0241] 6. The method of paragraph 1 in which screening a library of test compounds comprises contacting the test compound with the target nucleic acid in the presence of an aqueous solution, the aqueous solution comprising a buffer and a combination of salts, preferably approximating or mimicking physiologic conditions.


[0242] 7. The method of paragraph 6 in which the aqueous solution optionally further comprises non-specific nucleic acids comprising DNA, yeast tRNA, salmon sperm DNA, homoribopolymers, and nonspecific RNAs.


[0243] 8. The method of paragraph 6 in which the aqueous solution further comprises a buffer, a combination of salts, and optionally, a detergent or a surfactant. In another embodiment, the aqueous solution further comprises a combination of salts, from about 0 mM to about 100 mM KCl, from about 0 mM to about 1 M NaCl, and from about 0 mM to about 200 mM MgCl2. In a preferred embodiment, the combination of salts is about 100 mM KCl, 500 mM NaCl, and 10 mM MgCl2. In another embodiment, the solution optionally comprises from about 0.01% to about 0.5% (w/v) of a detergent or a surfactant.


[0244] 9. Any method that detects an altered physical property of a target nucleic acid complexed to a test compound from the unbound target nucleic acid may be used for separation of the complexed and non-complexed target nucleic acids in the method of paragraph 1. In a preferred embodiment, electrophoresis is used for separation of the complexed and non-complexed target nucleic acids. In a preferred embodiment, the electrophoresis is capillary electrophoresis. In other embodiments, fluorescence spectroscopy, surface plasmon resonance, mass spectrometry, scintillation, proximity assay, structure-activity relationships (“SAR”) by NMR spectroscopy, size exclusion chromatography, affinity chromatography, and nanoparticle aggregation are used for the separation of the complexed and non-complexed target nucleic acids.


[0245] 10. The structure of the test compound of the RNA:test compound complex of paragraph 1 is determined, in part, by the type of library of test compounds. In a preferred embodiment wherein the combinatorial libraries are small organic molecule libraries, mass spectroscopy, NMR, or vibration spectroscopy are used to determine the structure of the test compounds.


Claims
  • 1. A method for identifying a test compound that binds to a target RNA molecule, comprising the steps of: (a) contacting a detectably labeled target RNA molecule with a library of test compounds under conditions that permit direct binding of the labeled target RNA to a member of the library of test compounds so that a detectably labeled target RNA:test compound complex is formed; (b) separating the detectably labeled target RNA:test compound complex formed in step (a) from uncomplexed target RNA molecules and test compounds; and (c) determining a structure of the test compound bound to the RNA in the RNA:test compound complex.
  • 2. The method of claim 1 in which the target RNA molecule contains an HIV TAR element, internal ribosome entry site, “slippery site”, instability element, or adenylate uridylate-rich element.
  • 3. The method of claim 1 in which the RNA molecule is an element derived from the mRNA for tumor necrosis factor alpha (“TNF-α”), granulocyte-macrophage colony stimulating factor (“GM-CSF”), interleukin 2 (“IL-2”), interleukin 6 (“IL-6”), vascular endothelial growth factor (“VEGF”), human immunodeficiency virus I (“HIV-1”), hepatitis C virus (“HCV”—genotypes 1a & 1b), ribonuclease P RNA (“RNaseP”), X-linked inhibitor of apoptosis protein (“XIAP”), or survivin.
  • 4. The method of claim 1 in which the detectably labeled RNA is labeled with a fluorescent dye, phosphorescent dye, ultraviolet dye, infrared dye, visible dye, radiolabel, enzyme, spectroscopic colorimetric label, affinity tag, or nanoparticle.
  • 5. The method of claim 1 in which the test compound is selected from a combinatorial library comprising peptoids; random bio-oligomers; diversomers such as hydantoins, benzodiazepines and dipeptides; vinylogous polypeptides; nonpeptidal peptidomimetics; oligocarbamates; peptidyl phosphonates; peptide nucleic acid libraries; antibody libraries; carbohydrate libraries; or small organic molecule libraries.
  • 6. The method of claim 5 in which the small organic molecule libraries are libraries of benzodiazepines, isoprenoids, thiazolidinones, metathiazanones, pyrrolidines, morpholino compounds, or diazepindiones.
  • 7. The method of claim 1 in which screening a library of test compounds comprises contacting the test compound with the target nucleic acid in the presence of an aqueous solution wherein the aqueous solution comprises a buffer and a combination of salts.
  • 8. The method of claim 7 wherein the aqueous solution approximates or mimics physiologic conditions.
  • 9. The method of claim 7 in which the aqueous solution optionally further comprises non-specific nucleic acids comprising DNA, yeast tRNA, salmon sperm DNA, homoribopolymers, and nonspecific RNAs.
  • 10. The method of claim 7 in which the aqueous solution further comprises a buffer, a combination of salts, and optionally, a detergent or a surfactant.
  • 11. The method of claim 10 in which the aqueous solution further comprises a combination of salts, from about 0 mM to about 100 mM KCl, from about 0 mM to about 1 M NaCl, and from about 0 mM to about 200 mM MgCl2.
  • 13. The method of claim 11 wherein the combination of salts is about 100 mM KCl, 500 mM NaCl, and 10 mM MgCl2.
  • 14. The method of claim 10 wherein the solution optionally comprises from about 0.01% to about 0.5% (w/v) of a detergent or a surfactant.
  • 15. The method of claim 1 in which separating the detectably labeled target RNA:test compound complex formed in step (a) from uncomplexed target RNA and test compounds is by electrophoresis.
  • 16. The method of claim 15 in which the electrophoresis is capillary electrophoresis.
  • 17. The method of claim 1 in which separating the detectably labeled target RNA:test compound complex formed in step (a) from uncomplexed target RNA and test compounds is by fluorescence spectroscopy, surface plasmon resonance, mass spectrometry, scintillation, proximity assay, structure-activity relationships (“SAR”) by NMR spectroscopy, size exclusion chromatography, affinity chromatography, or nanoparticle aggregation.
  • 18. The method of claim 1 in which the library of test compounds are small organic molecule libraries.
  • 19. The method of claim 18 in which the structure of the test compound is determined by mass spectroscopy, NMR, or vibration spectroscopy.
Parent Case Info

[0001] This application claims the benefit of U.S. Provisional Application No. 60/282,965, filed Apr. 11, 2001, which is incorporated herein by reference in its entirety.

PCT Information
Filing Document Filing Date Country Kind
PCT/US02/11757 4/11/2002 WO
Provisional Applications (1)
Number Date Country
60283869 Apr 2001 US