The present inventions are in the field of methods and constructs useful in targeting viruses with disabling virus-like particles. The particles are designed to include disabled essential enzymes so that they can replicate only in the presence of rescuing enzymes of a target virus. The particles can include sequences encoding an miRNA against a highly conserved sequence in an essential gene, and include a copy of the essential gene modified to avoid binding by the miRNA. The particles can be used in methods to inhibit an infectious target virus from replicating.
Wikipedia, the free encyclopedia (http://en.wikipedia.org/wiki/), gives a brief description of “Discovery and development of HIV-protease inhibitors”. Presently, there are several treatment options for HIV, such as nucleoside/nucleotide-based reverse transcriptase inhibitors (Freeman et al., (2004)), non-nucleoside reverse transcriptase inhibitors (Hopkins et al., (2004)), protease inhibitors (Craig, et al, (1991), Kempf et al, (1995)), fusion inhibitors (Wyatt, et al, (1994)), integrase inhibitors, maturation inhibitors, uncoating inhibitors, transcription inhibitors, translation inhibitors, and also combination therapies. These options consist mainly of small-molecule enzyme inhibitors. They may require continuous intake of the medication for as long as the infection persists. Entry or fusion inhibitors disrupt the fusion of the virus and the target cell.
The compound enfuvirtide (Lalezari, et al. (2003)), a polypeptide and not a small molecule, has to be administered by injection. There are efforts using gene therapy to knock out the CCR5 receptor (Hütter, et al, (2009)). A curative approach will likely involve removing and treating stem cells with knockout genes for CCR5 and possibly CXCR4, administering high-dose chemotherapy to wipe out the existing HIV-susceptible immune system, and finally transplanting the modified stem cells to rebuild an immune system that is resistant to the virus. It has been observed that HIV replication persists at anatomical sites where drug penetration is limited.
Antisense oligonucleotides were first used by Stephenson and Zamecnik (1978) in cell cultures to inhibit RSV replication. Antisense mediated gene suppression against the HIV-1 envelope gene has been reported in the literature (Lu, et al, (2004)). RNA interference (RNAi) provides post-transcriptional gene silencing (PTGS) and has been well described. It has been shown (Naito, et al, (2004), Zeng, et al, (2005)) that short (less than ˜22-26-mer) double stranded RNAs (dsRNA) are degraded without invoking an interferon response. The interfering RNA binds to a protein to form a complex (RISC, the RNA-induced silencing complex), which binds to mRNA transcripts having nucleotide sequences complementary to the interfering RNA (antisense) and degrades them before the mRNA is translated. Thus, RNAi-mediated post-transcriptional silencing offers a potentially powerful tool to inhibit viral replication. Guo and Kemphues (1995) and others (Zamore (2001), Novina, et al, (2002), Tuschl (2002), Zeng and Cullen (2002), Hemann, et al, (2003), Das, et al, (2004), Ge, et al, (2010), Saayman, et al, (2010), Liu, et al, (2011)) have described attempts to silence viral genes using small interfering RNAs (siRNA). Similar to the function of the synthetic siRNAs, many groups (Lagos-Quintana, et al, (2001), Lau, et al, (2001), Zeng and Cullen (2002), Zeng, et al, (2002), Zeng and Cullen (2003), Lim et al, (2003), Zeng, et al, (2003), Zeng and Cullen (2004), Lim, et al, (2005), Creighton, et al, (2010)) have identified and worked with naturally occurring micro RNAs (miRNAs), which follow a different pathway compared to siRNA (Thompson (2002), Schwarz, et al, (2003)).
Many have identified and worked with small hairpin RNA (shRNA) (Paddison (2002), Boden, et al, (2004), Li, et al, (2007), Liu, et al, (2011)) to induce sequence-specific silencing in mammalian cells. The shRNA basically has a stem(up)-loop-stem(down) construct. The stem(up) and stem(down) are primarily reverse complementary to each other and will be paired. The antisense could be either the stem(up) or the stem(down) sequence. The difference between siRNAs and miRNAs is in their structures, where the siRNAs are double stranded, synthetic molecules needing perfect complementarity to function and miRNAs are single stranded, natural molecules that function even against partially complementary sequences. As a further development, the shRNAs have been reported to be part of a larger precursor sequence called pre-miRNAs from where the miRNAs are released (Lee, et al, (2003), Boden, et al, (2004), and Zeng et al, (2005²)).
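As an illustration of the stem(up)-loop-stem(down) arrangement described above, the following minimal Python sketch assembles a hairpin from an antisense sequence. The function names, the short example antisense, and the loop sequence are hypothetical placeholders chosen only for illustration; they are not taken from the cited references.

```python
# Sketch only: builds a stem(up)-loop-stem(down) RNA string.
# The loop and example antisense below are illustrative placeholders.

RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence (5'->3')."""
    return rna.upper().translate(RNA_COMPLEMENT)[::-1]


def build_shrna(antisense: str, loop: str, antisense_on: str = "up") -> str:
    """Place the antisense on stem(up) or stem(down); the other stem pairs with it."""
    antisense = antisense.upper()
    sense = reverse_complement(antisense)
    if antisense_on == "up":
        return antisense + loop + sense
    if antisense_on == "down":
        return sense + loop + antisense
    raise ValueError("antisense_on must be 'up' or 'down'")


example_antisense = "AUGGCUAC"   # hypothetical 8-nt antisense
example_loop = "UUCAAGAGA"       # hypothetical loop sequence
hairpin_up = build_shrna(example_antisense, example_loop, "up")
hairpin_down = build_shrna(example_antisense, example_loop, "down")
```

In either orientation the two stems are reverse complementary to each other, so only the position of the antisense relative to the loop changes.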
There are two distinct steps involved in the production of the mature miRNA from the pre-miRNA. First, the enzyme “drosha” cuts the paired “stem” and loop of the stem(up)-loop-stem(down) from the pre-miRNA. Approximately 22 nts are cut from the start of the loop. Then, the enzyme “dicer” cuts the loop and liberates the mature RNA. Also, “drosha” cuts the RNA sequence to create antisense sequences with an efficiency that depends on the length of the flanking sequences on either side of the shRNA; the flanking sequences should be longer than ˜60 nucleotides (Lee, et al, (2003), Zeng, et al, (2005¹,²), Zeng and Cullen (2005), Feng, et al, (2011), Boden, et al, (2004)). Either the stem(up) or stem(down) or both could be mature RNAs. The stem(up) mature RNA has been designated as shorter stem hairpin RNA (L-sshRNA) (Khvorova, et al, (2003), Ge, et al, (2010)) where the antisense is placed at the 5′ end of the loop. Antisense sequences placed at the 5′ side of the loop tend to be more potent than those placed at the 3′ end of the loop.
In the quest to understand RNA and the protein partially involved in packaging, Aldovini and Young (1990) found that virus particles packaged with mutant sequences were unable to productively infect the cells even though the protein contents were similar to that of the infectious virus. There are many articles describing the achievement of making virus-like particles (VLP) (Karacostas, et al, (1989), Haffar, et al, (1990), Carroll, et al, (1994), Haddrick, et al, (1996), Haselhorst, et al, (1998), Beckett and Miller (2007), Pal, et al, (2007), V Peremyslov and V Dolja (2007), Cornetta, et al, (2008)). Georgens, et al, (2005) have proposed to use VLP as drug delivery systems by inserting single chain Fv's or immunoglobulin binding domains or by covalently linking the active agents to VP2.
In light of these advances, there remain problems in treatment of viral diseases. For example, small molecule drugs are prone to resistance and may fail to penetrate all sanctuaries. Live virus vaccines may become pathogenic or cause mutations in normal cells as they incorporate into a genome. It would be desirable to have a focused and self limiting construct that destroys a virus's ability to replicate and spread, without the dangers of control loss or hazards to normal cells. The present invention provides these and other features that will be apparent upon review of the following.
The present inventions include therapeutic viruses (TVs) and methods of their use in inhibiting infectious viruses in their host cells. The TVs can be incapable of infection due to inactivating mutations in essential genes, but rendered viable in the presence of the target infectious virus. The TVs can further encode a pre-miRNA, e.g., capable of antisense targeting of one or more essential target virus genes, e.g., while harboring a modified form of the gene not subject to inhibition by the miRNA. The methods include provision of the TVs and contacting them with cells infected by the target virus.
A TV capable of inhibiting propagation of a target virus can include an inactive form of a gene essential for propagation of the TV in the host of the target virus (or the gene essential for propagation of the TV can be absent or deleted), so that the TV cannot propagate alone in the normal host of the target virus, but can propagate in the presence of the target virus, which provides an active form of the essential gene. The TV can also include a sequence encoding a pre-miRNA, wherein an miRNA product from the pre-miRNA has a first affinity for a highly conserved sequence of a gene in the infectious target virus, e.g., so that the miRNA inhibits translation of the highly conserved sequence. Further, the TV can have a modified version of the highly conserved target sequence, configured to transcribe an mRNA with a second affinity for the miRNA sequence that is lower than the first affinity, and which modified version of the highly conserved target sequence encodes an active form of the target sequence peptide product. Such a TV will fail to propagate in host cells without the presence of the target virus, yet will also inhibit propagation of the target virus in the host cell when the TV is present. In this way, an infected cell will accept and replicate the TV, but after replication, the TV will prevent propagation of the infectious virus.
In many embodiments, the TV is a modified version of the infectious target virus. For example, the TV can reflect the target virus, but be modified to inactivate a first essential gene product essential for propagation, and be modified at a second essential gene to increase mismatches to the miRNA, while retaining function of the second essential gene translation product peptide. The first essential gene and second essential gene can optionally be the same gene or different genes. In many cases, the first essential gene is a gene associated with replication of the virus, e.g., a reverse transcriptase (RT), integrase, and/or an RNA dependent RNA polymerase (RDRP). In an embodiment, the target host cell encodes a functional reverse transcriptase, RNA dependent RNA polymerase (RDRP), and/or integrase enzyme, and the TV does not. Optionally the second essential gene can include genes associated with replication of the virus and/or structural genes.
In certain preferred embodiments, the TV is deficient in at least two enzymes necessary for replication in a host cell for the target virus. In this way, a mutation reactivating one enzyme gene will fail to rescue independent viability of the TV.
In many embodiments, the pre-miRNA is adapted to provide the miRNA product by the presence of Dicer or Drosha cutting sites in the pre-miRNA.
In certain embodiments, the miRNA, or pre-miRNA has at least 85% identity to UAUUGCUGGUGAUCCUUUCCA (SEQ ID NO: 1), CUGUCCAUUUAUCAGGAUGGAG (SEQ ID NO: 2), CCAAUCCCCCCUUUUCUUUUAAA (SEQ ID NO: 3), AUACUGCCAUUUGUACUGCUGU (SEQ ID NO: 4), and/or a complementary sequence thereof.
In certain particular embodiments for TVs against human immunodeficiency virus (HIV) the TV can have at least 85% identity to the sequence of
In another particular embodiment, the TV has at least 85% identity to the sequence of
Also included in the present invention are methods of inhibiting propagation of infectious viruses. A method of inhibiting replication of a target infectious virus can include providing a TV of the invention and contacting the TV with a cell infected with the target infectious virus. For example, the method can include providing a therapeutic virus (TV) comprising an inactive form of a gene essential for propagation in the normal host of the target virus (or having an inactivated or missing version of the essential gene). In this way, the TV cannot replicate alone in the normal host of the target virus, but the TV can propagate in the presence of the target virus, which provides a copy of an active form of the essential gene. The TV further includes a sequence encoding a pre-miRNA, e.g., cleavable to provide an miRNA product with an antisense affinity for a highly conserved sequence of the target virus in a gene of interest (typically a gene essential for replication or propagation of the target virus) so that the miRNA can inhibit translation of the highly conserved sequence. In addition, the TV can include a modified version of the highly conserved target sequence adapted to transcribe into an mRNA with less affinity (e.g., antisense binding) for the miRNA sequence. This feature can provide for continued TV propagation with an active form of the target sequence peptide product while the target virus is disabled by the TV. On contact with the provided TV, a cell infected with the target virus can be inhibited from taking part in further infectious progression of the target virus.
In certain aspects of the methods, the TV is adapted to be deficient in at least two enzymes necessary for replication in a host cell for the target virus. The TV can be a modified version of the target virus, e.g., with an inactivated or deleted first essential gene product, and a second essential gene modified to increase antisense mismatches to the miRNA while retaining function of the second essential gene translation product peptide. In many cases, the first essential gene and second essential gene are different genes, although they can also be the same gene. For example, in the methods, first essential genes are directed to initial gene replication and/or integration functions, e.g., reverse transcriptases (RT), integrases, and/or RNA dependent RNA polymerases (RDRP).
In the methods, the pre-miRNA can be converted into the miRNA by cutting with Dicer or Drosha.
The methods can include inhibiting translation of the highly conserved sequence by hybridization of the miRNA to an mRNA of the target virus, which mRNA encodes the highly conserved sequence. The methods can further comprise modifying the highly conserved target sequence to provide the modified version in the TV by changing codon triplet codes to alternate codons encoding the same amino acid. The methods can further comprise adapting the miRNA to not bind to the modified TV highly conserved target sequence under intracellular host cell conditions.
The present inventions include identification of highly conserved stretches of nucleotides using the multiple sequence alignments to select sites for creating antisense sequences by calculating the highest percentage nucleotide frequency at each location of the entire length of the genome of interest. A moving window of desired length is used to find the maximum scoring stretches.
In designing an shRNA to produce the designed (antisense) miRNA, the orientation of the miRNA can be, e.g., stem(up), wherein at least 3 nucleotides (nts) at the 3′ end of the designed miRNA, the loop sequences, and at least 3 nts following the loop sequence are identical to the corresponding sites in the designed shRNA. In designing an shRNA to produce the designed (antisense) miRNA, the orientation of the miRNA can be, e.g., stem(down), wherein at least 3 nts at the 5′ end of the designed miRNA, the loop sequences, and at least 3 nts before the 5′ end of the loop sequence are identical to the corresponding sites from an observed shRNA, including the ones in a public database. For example, see
In designing pre-miRNAs, the flanking sequences can be, e.g., of length at least 6 nts adjoining 5′ end of the miRNA of the designed pre-miRNA, the orientation of the designed miRNA being stem(up), at least 3 nucleotides at the 5′ end of the designed miRNA, at least 3 nucleotide sequence opposing this 5′ end sequences, and at least 6 nts following the opposing sequence are identical to the corresponding sites from an observed pre-miRNA including a pre-miRNA, e.g., from a public database.
In designing pre-miRNAs, the flanking sequences can be, e.g., of length at least 6 nts adjoining 3′ end of the miRNA of the designed pre-miRNA, the orientation of the designed miRNA being stem (down), at least 3 nucleotides at the 3′ end of the designed miRNA, at least 3 nucleotide sequence opposing this 3′ end sequences, and at least 6 nts followed by the opposing sequence are identical to the corresponding sites from an observed pre-miRNA including a pre-miRNA, e.g., from a public database.
The present inventions include a pharmaceutical composition comprising, or containing within it, a therapeutically effective amount of a product described above.
The inventive method can include design of a mutant non-replicative virus or pathogen to deliver a therapeutic composition comprising at least one product of claims 2-8 to infected cells. The catalytic site or substrate binding sites of at least one of the enzymes either maintained or coded for by the therapeutic virus or pathogen are sufficiently modified with mutations, insertions, or deletions, either by site specific changes or by natural selection, so that the enzyme is inactive and the chance of regaining activity is very low.
Another aspect of this invention is to protect the therapeutic non-replicative virus or pathogen. The nucleotides corresponding to the selected conserved sites identified above can be modified by insertions, deletions, and/or mutations so that there is a high dissimilarity at nucleic acid level and high similarity at the amino acid level between the modified site of the therapeutic virus or pathogen and the antisense miRNA designed for that site.
Another aspect of this invention is to completely or partly remove the nucleotides corresponding to the gene of the enzyme from the therapeutic non-replicative virus or pathogen that the antisense miRNAs were developed against.
One or more of the miRNA, shRNA, or pre-miRNA described above can be introduced into the untranslated region of the virus' or pathogen's genome.
The inventions contemplate using a modified form of a noninfectious virus or pathogen, such as in a combination of embodiments described above, to deliver the pre-miRNAs and in turn the miRNAs to host cells.
The modified forms of a virus or pathogen can be allowed to produce the designed miRNAs and propagate the designed miRNA automatically from within the cell either by integration into the host genome or by forming a separate plasmid.
Products above can optionally be in the form of RNA or DNA, as appropriate, e.g., to the particular infectious virus target.
Before describing the present invention in detail, it is to be understood that this invention is not limited to particular devices or biological systems, which can, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting. As used in this specification and the appended claims, the singular forms “a”, “an” and “the” include plural referents unless the content clearly dictates otherwise. Thus, for example, reference to “a surface” includes a combination of two or more surfaces; reference to “bacteria” includes mixtures of bacteria, and the like.
Keywords used in this description of the inventions, and well known to those skilled in the art include: antisense, dicer, Drosha, dsRNA, mirbase, miRNA, MSA, pre-miRNA, PTGS, RNAi, shRNA, TV, UTR, and VLP.
CCR5—chemokine receptor 5
CXCR4—CXC chemokine receptor 4
dsRNA—double stranded RNA
HIV-1—human immunodeficiency virus type-1
miRNA—micro RNA
PTGS—post-transcriptional gene silencing
RISC—RNA-induced silencing complex
RNAi—RNA interference
shRNA—short hairpin RNA
sshRNA—shorter stem hairpin RNA
UTR—Untranslated region
The terms “identical” or percent “identity,” in the context of two or more nucleic acid or polypeptide sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same, when compared and aligned for maximum correspondence, as measured using one of the sequence comparison algorithms described below (or other algorithms available to persons of skill), or by visual inspection.
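As a concrete illustration of a percent identity calculation, the following minimal Python sketch scores two sequences that have already been aligned. The function name and the denominator convention (alignment columns, skipping columns that are gaps in both sequences) are assumptions made for this example; the sequence comparison algorithms referred to above may use other conventions. The example compares SEQ ID NO: 1 with hypothetical variants to show where an 85% threshold falls for a 21-nt sequence.

```python
# Sketch only: percent identity over a precomputed pairwise alignment.
# Denominator convention is an assumption (see the accompanying text).

def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of aligned columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue  # column carries no information
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


SEQ_ID_NO_1 = "UAUUGCUGGUGAUCCUUUCCA"            # 21 nts, from the text
variant_3_mismatches = "AUAUGCUGGUGAUCCUUUCCA"   # hypothetical variant
variant_4_mismatches = "AUAAGCUGGUGAUCCUUUCCA"   # hypothetical variant

identity_3 = percent_identity(SEQ_ID_NO_1, variant_3_mismatches)  # 18/21
identity_4 = percent_identity(SEQ_ID_NO_1, variant_4_mismatches)  # 17/21
```

Under this convention, a 21-nt sequence with three substitutions still exceeds 85% identity, while one with four substitutions does not.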
Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson & Lipman, Proc. Natl. Acad. Sci. USA 85:2444 (1988), by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by visual inspection (see generally, Ausubel et al., infra).
One example of an algorithm that is suitable for determining percent sequence identity and sequence similarity is the BLAST algorithm, which is described in Altschul et al., J. Mol. Biol. 215:403-410 (1990). Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores for nucleotide sequences are calculated using the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, a cutoff of 100, M=5, N=−4, and a comparison of both strands.
For amino acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see, Henikoff & Henikoff (1992) Proc. Natl. Acad. Sci. USA 89:10915).
In addition to calculating percent sequence identity, the BLAST algorithm can also perform a statistical analysis of the similarity between two sequences (see, e.g., Karlin & Altschul, Proc. Natl. Acad. Sci. USA 90:5873-5877 (1993)). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.1, more preferably less than about 0.01, and most preferably less than about 0.001. Nucleic acids are considered similar to, and within the purview of the present invention, if they are similar to unique nucleic acids of the invention with a smallest sum probability of less than about 0.1, preferably less than about 0.01, and more preferably less than about 0.001.
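The word-hit extension step of the BLAST description above can be sketched in a few lines of Python. This is a simplified, ungapped, one-direction illustration and not the BLAST implementation itself. M=5 and N=−4 follow the nucleotide defaults quoted above, while the function name, the X value of 20, and the short toy sequences and 4-nt seed are assumptions chosen for the example.

```python
# Sketch only: ungapped rightward extension of an exact-match seed,
# halting on the three conditions listed in the text.

def extend_right(query, subject, q_start, s_start, word_len,
                 match=5, mismatch=-4, x_drop=20):
    """Return (best_score, best_length) for extending a seed to the right."""
    # Score of the seed itself (assumed here to be an exact word match).
    score = sum(
        match if query[q_start + i] == subject[s_start + i] else mismatch
        for i in range(word_len)
    )
    best_score, best_len = score, word_len
    offset = word_len
    # Condition 3: stop when the end of either sequence is reached.
    while q_start + offset < len(query) and s_start + offset < len(subject):
        same = query[q_start + offset] == subject[s_start + offset]
        score += match if same else mismatch
        if score > best_score:
            best_score, best_len = score, offset + 1
        # Condition 1: score fell off by X from its maximum.
        # Condition 2: cumulative score dropped to zero or below.
        if best_score - score >= x_drop or score <= 0:
            break
        offset += 1
    return best_score, best_len


toy_query = "ACGTACGTACGTTTTT"     # hypothetical sequences
toy_subject = "ACGTACGTACGTAAAA"
hsp = extend_right(toy_query, toy_subject, 0, 0, word_len=4)
```

In the toy case the first twelve positions match, so the best score is 12 × 5 = 60 over a length of 12; the trailing mismatches lower the running score without improving on that maximum.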
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the invention pertains. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, the preferred materials and methods are described herein. In describing and claiming the present invention, the following terminology will be used in accordance with the definitions set out below.
Per our design, the therapeutic virus (TV) has to be non-infectious to normal cells not infected by the pathogenic virus. This condition is achieved by selectively mutating, deleting, and/or inserting residues lining the active site of one or more of the enzymes essential for the propagation of the infectious virus. The TV should be able to permeate (access) all normal host cells as does the infectious virus to maintain selectivity and specificity. Therefore, any modifications to the enzymes of the virus should not impart significant changes to the three dimensional structure of the enzymes and in turn packing within the capsid. Because of this, the shape and characteristics of the surface of the TV should closely resemble that of the infectious virus. As the TV is missing enzymes essential for the propagation, e.g., no functional reverse transcriptase (RT) and/or integrase enzymes, it depends on existence of RT and integrase in that cell, possible only in those cells that have already been infected by a target virus, e.g., such as HIV-1. Once the essential enzymes are available in a cell, the genome of the TV can be propagated, e.g., by incorporation into the host cell's genome, in a fashion similar to that of the infectious target virus.
The TV can have pre-miRNAs inserted into its UTR region. In the case of a TV for HIV-1, the miRNA can be inserted after the stop codon of the Nef domain and become part of the host genome. When the TV genome is replicated, mRNAs would be produced for the pre-miRNA portions and processed by the dicer and drosha enzymes. The resulting miRNAs bind to the transcripts of the infectious virus and interfere with the production of those key viral enzymes leading to the propagation of the infectious virus.
The miRNA binding sites can also be modified in the TV, so that the miRNAs would not interfere with the production of the TV. During this process, the TV should be packaged and secreted. The secreted TV would affect more cells, propagating its beneficial function. The production of the TV continues until the infected cells die. This aspect of the method would have long term benefits. On the other hand, the TV would not affect healthy cells. We believe that production of the TV would stop after all of the infectious virus has been cleared and functional RT and integrase have been depleted in the host cells. Therefore, patients should not need to take the treatment continuously over the long term, but rather only a few rounds of the therapy.
The next aspect of our design for, e.g., retrovirus therapy is that the miRNAs will not interfere with the production of TV. By design, we have selected to mutate only the active site residues to render RT and integrase inactive. The miRNAs, targeting a particular stretch of the gene of those enzymes from the infectious virus, would also bind to the genes of the enzymes from the TV and stop their production. To avoid this outcome, we have modified those binding sites within the gene of the enzymes of the TV not to bind the designed miRNAs.
Like any virus, the TV would undergo random mutations rendering the process ineffective after a few cell cycles. To address this issue, we have targeted more than one conserved site and created the corresponding miRNAs to reduce the potential for such inactivation to occur.
Another design aspect is the crossover between the infectious and therapeutic virus. Even if the TV acquires a copy of the gene of an active enzyme, it would be targeted by the miRNAs and destroyed.
Another aspect of the method is that the TV would face an immune response just like the one the infectious virus would face. Although there will be an immune response, the total number of infectious virus particles would start decreasing once the therapy starts working.
Another important issue is the insertion of the TV at wrong locations within the host genome. However, this could equally well happen with the infectious virus already within the patient, so there is no additional risk beyond what already exists within the infected cells.
Steps involved in the method of designing of therapeutic virus (TV):
1) Inactivate the virus by inactivating one or more of its essential enzymes.
2) Identify the linear sequence within the gene coding for the enzyme to create an antisense miRNA that would bind to this sequence.
3) Modify that stretch within the genome of the TV to protect the TV from the designed antisense miRNA.
4) Design the pre-miRNA for the antisense miRNA.
One of the features of the TV is that it should not be active in normal uninfected cells. To this end, we have decided to inactivate one or more of the enzymes that are essential for the activity of the infectious virus. Our strategy is to identify and mutate those residues that line the active site, so that the enzyme is rendered inactive, while not affecting the folding of the enzyme. Crystal structures of these enzymes, when available, provide information to design these inactive enzymes. One may also get an idea of possible mutations from analyzing the sequence alignments of the enzyme with highly homologous enzymes. Mutations to the active site could be designed through directed evolution. By employing one or more of these techniques, one should be able to modify the enzyme to have desired properties.
Identification of an miRNA Target.
We would like to use the miRNA to disrupt the propagation of the infectious virus. Whereas other groups (Lagos-Quintana, et al, (2001), Lau, et al, (2001), Zeng and Cullen (2002), Zeng, et al, (2002), Zeng and Cullen (2003), Lim et al, (2003), Zeng, et al, (2003), Zeng and Cullen (2004), Lim, et al, (2005), Creighton, et al, (2010)) have used existing miRNAs to achieve results, we describe herein a method to create an miRNA that would be directed against the enzyme of interest. First, we want to select sites within the gene of the enzyme that would provide a good target for the miRNA to bind, with the stipulation that the linear stretch would be maximally conserved within the gene of the enzyme.
Next, we find a sequentially conserved stretch of nucleotides of the enzyme of interest. For this purpose, we identify a conserved stretch (22-24 nts) within the gene of the enzyme that could lead to inactivating the enzyme. One of the sequences for the enzyme was obtained from one of the public databases. This sequence was used to pull all the nucleotide sequences from the National Center for Biotechnology Information (NCBI) (www.ncbi.nlm.nih.gov) using “blastn” (Altschul, et al, (1997), Stephen et al., (1997)) against “Genbank (Benson et al., (2003)), EMBL (Stoesser et al., (2003)), DDBJ (Tateno et al., (1998)), PDB (Berman et al., (2000))” sequence databases. A separate program was written to collect all the sequences from the blast output with high sequence identity (>80%) to the query sequence and a length greater than 90% of the query sequence to get a multiple sequence alignment (MSA), maintaining the alignment from the blast run. Another script was written to analyze the multiple sequence alignment file to calculate the frequency of occurrence as a percentage for each nucleotide at each location. This analysis provided stretches that are maximally conserved for a given length of the stretch.
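The per-position frequency calculation described above can be sketched as follows. This is not the script referred to in the text; it is a minimal illustration that assumes the MSA is already available as a list of equal-length aligned strings, that gaps are written as "-" and counted like any other symbol, and that the function name and toy alignment are hypothetical.

```python
# Sketch only: percentage frequency of each symbol at each MSA column.
from collections import Counter


def column_frequencies(msa):
    """Return one dict per alignment column mapping symbol -> percentage."""
    if not msa:
        raise ValueError("empty alignment")
    length = len(msa[0])
    if any(len(seq) != length for seq in msa):
        raise ValueError("all aligned sequences must have the same length")
    total = len(msa)
    frequencies = []
    for column in range(length):
        counts = Counter(seq[column].upper() for seq in msa)
        frequencies.append({nt: 100.0 * n / total for nt, n in counts.items()})
    return frequencies


toy_msa = ["ACGU", "ACGA", "ACCU", "AUGU"]   # hypothetical 4-sequence alignment
toy_frequencies = column_frequencies(toy_msa)
```

For the toy alignment, the first column is fully conserved (A at 100%), while each remaining column has a majority nucleotide at 75%.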
A new nucleotide sequence for the enzyme is created with the nucleotides that have the highest frequency at each location. For each position along the length of the sequence, the nucleotide having the maximum frequency, as well as its frequency, is stored. Using this, we calculate the cumulative frequency for a given window of nucleotides (22 to 24) at every location, and the windows are sorted in descending order of their cumulative frequency scores. Next, the enzyme protein structure is analyzed visually using the “PYMOL” graphics program (PyMOL™ Evaluation Product, Copyright (C) 2008, DeLano Scientific LLC). The amino acids corresponding to those stretches are viewed to determine whether they constitute the active site or support the structure of the active site. A reverse complementary sequence (antisense) to this stretch of ˜22 nucleotides would define the miRNA.
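The window-scoring step can be illustrated with a short sketch. It assumes that the per-position maxima are already available as a list of `(nucleotide, percent_frequency)` pairs; the function name `rank_windows` and the toy input are hypothetical and are not taken from the original scripts.

```python
def rank_windows(column_max, window=22):
    """Score every window by the sum of its per-position maximum frequencies.

    column_max: list of (nucleotide, percent_frequency), one entry per
    position of the consensus sequence.
    Returns (score, start, stretch) tuples, highest score first.
    """
    freqs = [f for _, f in column_max]
    consensus = "".join(nt for nt, _ in column_max)
    ranked = []
    for start in range(len(freqs) - window + 1):
        score = sum(freqs[start:start + window])
        ranked.append((score, start, consensus[start:start + window]))
    ranked.sort(key=lambda r: r[0], reverse=True)
    return ranked


# Toy input: six positions, scored with a window of three.
toy = [("A", 90), ("T", 100), ("G", 100), ("C", 100), ("A", 50), ("T", 100)]
for score, start, stretch in rank_windows(toy, window=3):
    print(start, stretch, score)
```

In practice the window would be 22 to 24 nucleotides, and the top-ranked stretches would then be inspected against the protein structure as described.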
Protection of the TV from the Designed miRNA.
Because the genome of the TV is modeled on the infectious virus, the miRNA designed against the infectious virus may be expected to disrupt the production of the TV as well. To avoid this, the binding stretch within the genome of the TV is modified to have a different nucleic acid sequence. As described earlier, the sequence modifications could be guided by the MSA, or they could be made manually so that the changes are large at the nucleic acid level and small at the amino acid level. Modification at the nucleic acid level, without significant disruption of protein folding, ensures that the miRNA designed against a stretch in the infectious virus will not disrupt the post-transcription processing of the TV.
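One way to make the changes large at the nucleic acid level while leaving the amino acids untouched is synonymous recoding. The sketch below is an assumption-laden illustration of that idea only: it uses the standard genetic code, picks for each codon the synonymous codon with the most nucleotide differences, and does not model the amino acid substitutions or deletions described elsewhere in this document. The function names are hypothetical.

```python
from itertools import product

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def translate(seq):
    """Translate an in-frame DNA sequence; trailing partial codons are ignored."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - len(seq) % 3, 3))


def recode(seq):
    """Replace each codon by the synonymous codon most different from it."""
    out = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[i:i + 3]
        synonyms = [c for c, a in CODON_TABLE.items() if a == CODON_TABLE[codon]]
        # Ties are broken alphabetically so the result is deterministic.
        out.append(max(synonyms, key=lambda c: (hamming(c, codon), c)))
    return "".join(out)


original = "TCACTGGGA"  # toy in-frame stretch: Ser-Leu-Gly
recoded = recode(original)
print(recoded, translate(recoded), hamming(original, recoded))
```

A recoded stretch with more mismatches to the antisense is less likely to be bound by the designed miRNA, which is the purpose stated in the text; how many mismatches suffice is not specified here.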
Design of Pre-miRNA for the Antisense miRNA.
The rationale for this design follows. We have seen that the enzymes dicer and drosha play their roles in the release of the miRNA from the pre-miRNA. In this design we provide a natural environment for enzyme cleavage to take place at the expected site. We search the miRNA database to find miRNAs that have the maximum number of identical nucleotides at the 3′ end to the designed miRNA. These hits would be used to design the loop (segment 5) and segment 6 of the pre-miRNA (
The miRNA could be in either stem(up) or stem(down) orientation. The following description corresponds to designing miRNA in the stem(up) (L-sshRNA) orientation. From the mirbase database, we identify miRNAs that have a relatively high number of identical nucleotides (at least three nucleotides) at their 3′ end matching the nucleotides at the 3′ end of the designed miRNA (antisense). If the corresponding shRNA of the database miRNA has its loop at the 3′ end (L-sshRNA orientation), then we select this miRNA and its corresponding shRNA. For the design of pre-miRNA, we use the loop sequences (for segment 5) and the part that is opposed to those matching nucleotides at the 3′ end (segment 6). This analysis can include inspection of the secondary structure of the shRNA using any program that predicts the possible secondary structures of nucleotide sequences; we have used the program RNAstructure (see, e.g., Reuter and Mathews (2010)). This design would give an environment that is observed in the database for the dicer to cleave at the expected site.
We search the miRNA database (mirbase) (Ambros, et al, (2003), Griffiths-Jones (2004), Griffiths-Jones, et al, (2006), Griffiths-Jones, et al, (2008), Saini, et al, (2008), Meyers, et al, (2008), Kozomara A and Griffiths-Jones S (2011)), to find mature RNA sequences that have maximum number of nucleotides (>3 nucleotides) at their 5′ ends matching nucleotides at the 5′ end of the designed miRNA. We find the corresponding shRNAs from the mirbase related to these miRNAs, and select the one (designated mir—1) with the most matching sequences and the mature RNA as stem(up) as opposed to stem(down) in the shRNA. Next we use this shRNA (designated hairpin—1) as a query and search the nucleotides database (using a program like BLAST) to get the primary transcript (mir—1_primary) of hairpin—1. Now, we identify the position of the shRNA within the mir—1_primary and extend ˜65 nts on either side to get the flanking sequences.
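The end-matching searches can be sketched as a simple ranking by shared prefix (5′ end) or shared suffix (3′ end). The helper names and the two-entry dictionary below are hypothetical stand-ins for a full mirbase download; the three sequences are the 21-nucleotide antisense and the two mature sequences reported in this description for the first RT site. The cutoff of at least three matching nucleotides is a parameter.

```python
def shared_prefix_len(a, b):
    """Number of identical nucleotides counted from the 5' end."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def shared_suffix_len(a, b):
    """Number of identical nucleotides counted from the 3' end."""
    return shared_prefix_len(a[::-1], b[::-1])


def best_matches(antisense, mature_db, end="5p", min_match=3):
    """Rank database entries by identical nucleotides at the chosen end."""
    measure = shared_prefix_len if end == "5p" else shared_suffix_len
    scored = [(measure(antisense, seq), name) for name, seq in mature_db.items()]
    return sorted((s for s in scored if s[0] >= min_match), reverse=True)


antisense = "TATTGCTGGTGATCCTTTCCA"
toy_db = {  # toy subset, not the full database
    "MIMAT0000090": "TATTGCACATTACTAAGTTGCA",
    "MIMAT0016859": "CTGGAGTCTAGGATTCCA",
}
print(best_matches(antisense, toy_db, end="5p"))
print(best_matches(antisense, toy_db, end="3p"))
```

The orientation check (whether the mature sequence sits in the stem(up) or stem(down) arm of its hairpin) is not covered by this sketch and would still need the hairpin annotation.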
The sequence at the 5′ end of the designed antisense that matches the selected miRNA from the database is considered to be the second of the nine segments. The flanking sequences (˜60 nts) from the 5′ end of the antisense miRNA primary transcript mir—1_primary will be the first segment. The flanking sequences (˜60 nts) from the 3′ end of the antisense miRNA primary transcript mir—1_primary will be the ninth segment.
Similarly, we find a mature RNA (mir—2) from the mir database which has maximum number of nucleotides (>3 nucleotides) at their 3′ end matching with the 3′ end of our designed antisense. This part of the 3′ end nucleotides of the designed antisense is the fourth segment of the nine segments. We find the shRNA that contains this mir—2 (named hairpin—2) and predict the secondary structure using a program such as RNA structure (Reuter and Mathews (2010)). We can identify the loop sequences, which form the fifth segment of the nine segments.
Now, from the secondary structure prediction, we identify those sequences that correspond to the 3′ end matched sequences in the hairpin—2 (fourth segment) and this will be the sixth segment. Sometimes, segment 4 and segment 6 may not be 100% complementary.
Those nucleotides in the designed antisense that are not part of either segment 2 or segment 4 will be segment 3. Now we develop a perfect complementary sequence for this segment 3 in reverse order and name it segment 7.
Similarly, we also identify those sequences in hairpin—1 that correspond to the matched sequences at the 5′ end (segment 2) and name it segment 8.
Combining segments 1 through 9 will give us the pre-miRNA that contains the designed antisense as a stem(up) sequence.
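The stem(up) assembly can be summarized in a short sketch. It assumes that segments 1, 2, 4, 5, 6, 8 and 9 have already been chosen from the database as described, and it derives segment 7 as the reverse complement of segment 3. The function names and all placeholder sequences other than the antisense pieces are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA-alphabet sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def assemble_stem_up(seg1, seg2, seg3, seg4, seg5, seg6, seg8, seg9):
    """Join the nine segments; segment 7 is derived from segment 3.

    Returns the full construct and the antisense (segments 2+3+4).
    """
    seg7 = reverse_complement(seg3)
    construct = "".join([seg1, seg2, seg3, seg4, seg5, seg6, seg7, seg8, seg9])
    return construct, seg2 + seg3 + seg4


# Antisense pieces follow the 21-nt antisense for the first RT site;
# flanks, loop, and segments 6 and 8 are placeholders for illustration.
construct, antisense = assemble_stem_up(
    seg1="AAAA", seg2="TATTGC", seg3="TGGTGATCCT", seg4="TTCCA",
    seg5="GGGG", seg6="TGGAA", seg8="GCAATA", seg9="CCCC",
)
print(antisense)
print(construct)
```

Segments 6 and 8 are taken from the database hairpins rather than computed, which is why segment 4/6 pairing need not be perfectly complementary.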
The following description is for the case where we want the designed antisense to be the stem(down) part of the shRNA or in the R-sshRNA according to the definition of (Khvorova, et al, (2003), Ge, et al, (2010)). We search the mirbase to find mature RNA sequences with a maximum number of nucleotides (>3 nucleotides) at their 5′ ends matching with nucleotides at the 5′ end of the designed miRNA. The miRNA (named mir—3) is found from the mir-database. We find the corresponding shRNA for selection if the mature RNA is stem(down) in the shRNA. We find the shRNA (named hairpin—3) that contains this mir—3 and predict the secondary structure to identify the loop sequences that form the fifth segment of the nine segments.
The matched sequence at the 5′ end of the designed antisense with the selected miRNA from the database is considered the sixth of the nine segments. From the secondary structure prediction, we identify that part that is opposed to the sixth segment as the fourth segment. Sometimes, segment 4 and segment 6 may not have 100% complementarity.
Similarly, we find a mature RNA (mir—4) from the mirbase which has maximum number of nucleotides (>3 nucleotides) at their 3′ end matching with the 3′ end of our designed antisense. This part of the 3′ end nucleotides of the designed antisense is the eighth of the nine segments.
We find the shRNA (named hairpin—4) that contains this mir—4 and predict the secondary structure. From this prediction, we identify those sequences that correspond to the 3′ end of the matched sequences (eighth segment) in the hairpin—4 to be the second segment.
Those nucleotides in the designed antisense that are not part of either segment 6 or segment 8 will be the segment 7.
Now we develop a perfect complementary sequence for this segment 7 in the reverse order as segment 3.
We use the hairpin—4 to search the nucleotides database (using programs like BLAST) for the primary transcript (mir—4_primary) of the hairpin—4. From this analysis we can extract the flanking sequences of desired length (˜60 nts long) at the 5′ as well as the 3′ end of the shRNA (hairpin—4). Then we extract flanking sequences of desired length from the 5′ end of the hairpin—4 within mir—4_primary to be the first segment.
Similarly, we extract flanking sequences of desired length from the 3′ end of hairpin—4 within the mir—4_primary to be the ninth segment.
Now, combining all the segments one through nine will give us the pre-miRNA that would contain the designed antisense as a stem(down) sequence.
We have applied the generic methodology above to design a specific TV against HIV-1 infection.
The schematic view of the HIV-1 genome (
The HIV-genome codes for a few enzymes that are essential for its activity. The HIV-1 genome can be modified (
Next, we deliver the RNAs of the antisense from within the infected cells. We design pre-miRNA sequences and insert them within a UTR region of the designed virus, making them part of the TV. As explained earlier, dicer and drosha enzymes within the host cells would release the miRNAs from the hairpin structures that are part of the pre-miRNAs. Thus, the miRNAs are produced within the cell from its genome once it is incorporated into the host genome. More importantly, we would expect this to inhibit translation of partially complementary mRNA, which is described as a characteristic of miRNAs.
Another aspect of this design is to allow the genome of the TV to be incorporated into the host genome already infected by infectious virus. We expect that enzymes such as reverse transcriptase and integrase from the infectious virus would be available to enable the initial steps of incorporating the TV into the host genome. Accordingly, the antisense sequences would be generated with every cell cycle. We preserve the other sequences of the virus genome (e.g., LTRs, capsids, nucleocapsids, proteins, glycoproteins, REV-responsive element, etc) so that the packing of this TV would resemble the infectious virus externally, maintaining the specificity and selectivity towards those cells targeted by the infectious virus. The TV will permeate into any cell type that could be infected by the infectious virus, as well as become incorporated into the host genome of those cells already infected by HIV-1, continuously producing the antisense sequences to silence the essential enzymes of the infectious virus.
The resulting TV is envisaged to help fight the HIV virus from within, once the therapy begins. This strategy specifically blocks the process of making essential enzymes of the infectious virus by the host cell. In principle, the idea can be used to deliver antisense sequences against any of the viral enzymes, such as the protease, integrase, RNaseH or RT, even though we describe antisense only against the reverse transcriptase of the HIV-1.
For the TV to be reverse transcribed, RT must be present in the cell. Therefore, reverse transcription of the TV is possible only in cells that have already been infected by the infectious HIV-1 or another retrovirus. Once the TV is reverse transcribed by the reverse transcriptase and integrated into the host genome, RT is no longer needed. This issue is addressed by the development of miRNA against the RT gene. Once the TV is incorporated into the host genome, it would produce more TV as well as miRNAs that will intervene in the translation process of the RT of the infectious virus. Thus, this process allows the TV to incorporate only into the genomes of host cells that are already infected by HIV and have copies of RT available. Preferably, we do not inhibit the protease since this enzyme is required to package the TV within the cell.
Most of the existing HIV therapies fail due to development of resistance as the HIV mutates. With small molecule inhibitors of enzymes, a mutation of one residue close to the active site may change the active site and prevent the inhibitor from binding. With the present invention, the intervention is at the pre-translation step, and miRNA function would not be expected to be impacted adversely by one or two mutations. Thus the development of viral resistance to this therapy is expected to be minimal. Additionally, with the possibility of more than one antisense sequence directed against the same enzyme or against multiple enzyme targets of the viral genome, resistance would require mutations in all the segments targeted by the antisense sequences. The probability of this occurring is very low.
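As a rough illustration of why simultaneous escape is unlikely, suppose (a simplifying assumption that is not made in the text) that escape mutations at each of n targeted segments arise independently, with probability p_i per genome for segment i. The probability that one genome escapes all n antisense sequences at once is then the product of the individual probabilities. The numerical values below are hypothetical and chosen only to show the scale of the effect.

```latex
\[
  P_{\text{escape}} \;=\; \prod_{i=1}^{n} p_i ,
  \qquad\text{e.g. } n = 4,\; p_i = 10^{-3}
  \;\Rightarrow\; P_{\text{escape}} = \left(10^{-3}\right)^{4} = 10^{-12}.
\]
```

If the escape events are not independent, the product is no longer exact, but adding target segments would still be expected to lower the joint probability.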
The technology described here allows the host cell to produce copies of the modified virus by itself, so that there may be no need for continuous administration of the therapy. Once we coinfect an HIV-infected cell with this TV, the process of making the antisense sequences will continue until all production of infectious virus is stopped. The replication process of the virus is error prone by nature, and introduction of random mutations is common. This method could fail if mutations arise either in the RNA sequence targeted by the antisense sequences or in the antisense sequences themselves in the TV. However, this concern is minimized by having more than one antisense sequence to bind to different sites of the RNA of the same enzyme.
Recombination is another aspect to be considered. We believe that the complete removal of the RT gene from the therapeutic virus may reduce the chances of converting the TV to an infectious agent. Rather than coming up with an infectious therapeutic virus by acquiring the RT or the integrase gene from the infectious virus genome in the process of recombination, the RT or the integrase would be destroyed by the TV encoded antisense. Complete removal of either the RT or integrase gene from the TV genome could also change the packing of these enzymes within the therapeutic virus and cause other problems.
The following examples are offered to illustrate, but not to limit the claimed invention.
It is understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims.
Though many sequences are available in the Genbank database, e.g., starting from (U26942), Adachi, et al, (1986), Salminen, et al, (1995), we use a genomic sequence (Fang, et al, (2001) (gb: U69584.1) of HIV-1 (
The active site of 2zd1.pdb with some of the residues suggested for mutations to inactivate the enzyme highlighted is shown in
Similarly, another structure 1rtd.pdb (Huang, et al, (1998)) was used to identify residues in the active site involving nucleoside reverse transcriptase inhibitors (NRTI), for directed mutagenesis to inactivate the enzyme.
Selection of the First Conserved Site within RT for Creating Antisense Sequences.
When selecting sites for creating antisense sequences, we wanted to know if the loss or breakage at those sites would render the enzyme inactive in the infectious virus and how modifications to that site would affect the folding of the structure in the TV. These two aspects are critically important in our design. When we select a site for creating the antisense, the antisense sequence would bind to that site in the mRNA rendering the produced protein incomplete. We want to know if this incomplete structure is inactive so that the infectious virus would be innocuous. Further, we need to ensure that modification to a particular site will not interfere with incorporation into the TV. These modifications serve two functions. One, the antisense developed against that site will not bind and affect the TV. Second, for targeting the cells concerned, we want the TV to resemble the infectious virus externally by having similar viral packing.
The next step was to find a sequentially conserved stretch of nucleotides within the gene for RT. For this purpose, we sought a stretch (of length 22-24) within RT that is relatively more conserved than other stretches and structurally meaningful in inactivating the enzyme. One of the sequences for the RT was obtained from a public database. It was used to gather all nucleotide sequences from the National Center for Biotechnology Information (NCBI) (www.ncbi.nlm.nih.gov) using “blastn” (Altschul, et al, (1997), Stephen et al., (1997)) against the Genbank (Benson et al., (2003)), EMBL (Stoesser et al., (2003)), DDBJ (Tateno et al., (1998)), and PDB (Berman et al, (2000)) sequence databases.
As an example, the nucleotide sequence for HIV RT was used as a “blastn” query against all the nucleotide sequences at the NCBI site. We use the word blast as a verb to denote running the BLAST program. To get a multiple sequence alignment that maintains the alignment from the blast run, a separate program was written to collect all the sequences from the blast output with high sequence identity (>80%) to the query sequence and with a hit length of more than 90% of the query sequence. In this way, 4944 sequences were identified. Another script was written to analyze the multiple sequence alignment file and calculate the frequency of occurrence, as a percentage, of each nucleotide at each location. That percentage is given a null value if the location is not covered by a significant number of the total sequences. These numbers are used to identify stretches that are maximally conserved for a given stretch length.
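The per-position frequency calculation, including the null value for poorly covered positions, might look like the sketch below. The text does not give the coverage cutoff or state the denominator, so both are assumptions here: a position is nulled when fewer than half of the sequences carry a nucleotide, and percentages are taken over the total number of sequences. The function name and toy alignment are hypothetical.

```python
from collections import Counter


def column_frequencies(aligned, min_coverage=0.5):
    """Percent frequency of each nucleotide at each alignment column.

    aligned: equal-length strings, with '-' marking a gap.
    Columns covered by fewer than min_coverage of the sequences get None
    (the 'null value'); min_coverage is an assumed cutoff.
    """
    n = len(aligned)
    result = []
    for column in zip(*aligned):
        counts = Counter(c for c in column if c in "ACGT")
        covered = sum(counts.values())
        if covered < min_coverage * n:
            result.append(None)
        else:
            result.append({nt: 100.0 * k / n for nt, k in counts.items()})
    return result


toy_alignment = ["ACG-", "ACT-", "AAG-", "A-GT"]
for position, freqs in enumerate(column_frequencies(toy_alignment)):
    print(position, freqs)
```

The most frequent nucleotide at each non-null position, together with its percentage, is what the subsequent window scoring would consume.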
A new nucleotide sequence for the RT is created with the nucleotides that have the highest frequency at each location. For each position along the length of the sequence, the nucleotide having the maximum frequency, as well as its frequency, is stored. Using this, we calculate the cumulative frequency for a given window of nucleotides (22 to 24) at every location, and these values are used to find the stretches of nucleotides having the higher cumulative frequencies. Next, the RT enzyme structure is visually analyzed using the “PYMOL” program (PyMOL™ Evaluation Product, Copyright (C) 2008, DeLano Scientific LLC). The amino acids corresponding to key stretches are analyzed regarding their impact on the active site and, in turn, on enzyme structure and function. Once we have identified the site, with a length of ˜22 nucleotides, an antisense sequence is obtained for it.
In
If antisense were to bind to WKGSPAI, then the part that forms the active site will be lost and the enzyme will be inactive. Note that this segment is derived from the infectious HIV virus. If we use the antisense in our noninfectious TV, the antisense sequence would bind to them also and disrupt the propagation of the TV. We envision at least two ways to avoid this occurrence. One, we remove the gene encoding the RT from its genomic sequence. Two, we mutate those amino acids of the conserved segment in such a way that the mutated residues have maximum dissimilarity in terms of the triplet codons to the selected segment without greatly perturbing the 3D structure. In this way, we avoid binding of the antisense to our TV.
For the selected segment of RT, we have tried to shorten the loop by deleting G152 and K154. Certain other mutations, e.g., G155T and P157N, are based on visual analysis of the structure. We wanted to change G155 because, among the preceding residues, W153 is encoded by TGG. If G155 (GGA) were not modified, the two guanosines would give rise to a guanosine repeat (a G-quartet, which has been implicated in recombination of RNAs) (Shen, et al, (2009)). There are examples of the Pro to Ser mutation (from the structure AY358072), so we believe P157N would be plausible. The Ala to Ser mutation is from AY779556.
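The guanosine-repeat concern can be checked mechanically. The sketch below reports the longest run of G in a sequence. It assumes that, once K154 is deleted, the W153 codon (TGG) sits directly next to the G155 codon (GGA); the threonine codon ACC used for comparison is a hypothetical choice, since the text names the G155T substitution but not its codon.

```python
import re


def longest_g_run(seq):
    """Length of the longest uninterrupted run of G in the sequence."""
    runs = re.findall(r"G+", seq.upper())
    return max((len(r) for r in runs), default=0)


w153 = "TGG"
print(longest_g_run(w153 + "GGA"))  # W153 followed by unmodified G155
print(longest_g_run(w153 + "ACC"))  # W153 followed by an assumed Thr codon
```

Replacing the glycine codon breaks the run of four guanosines that would otherwise form at the codon junction.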
For the amino acid I159, which is surrounded by hydrophobic residues, there are a few sequences with either a threonine (gb|AY560444.1, gb|AY275555.1, gb|AY560412.1) or a valine (gb|AY331286.1, gb|AY331285.1, gb|JF689897.1). We have chosen to use GTG, which codes for a valine. Next, the mutation from Phe to Leu is from EU489663. Along with the above mentioned changes to the selected conserved segment of RT, we also suggest deleting G152 (GGA) and K154 (AAA) and using a different codon for the next serine. The alanine residue has been changed to a serine (GCA to AGT). Thus, the very segment that is part of a loop in the RT structure is shortened, making it very difficult for the designed miRNA to bind to this site. Modeling the conserved segment after the deletion of the two residues G152 and K154 did not show any great change in the structure of the loop, so we believe that the changes may not affect the structure significantly. The sequence of RT after incorporation of these changes is termed modified sequence RT′. The gene equivalent to the RT of the infectious virus in the TV will have the modified sequence RT′ at the corresponding site.
Several miRNAs, their shRNAs, and their precursor pre-miRNAs have been documented, and a database has been created (mirbase) (Ambros, et al, (2003), Griffiths-Jones (2004), Griffiths-Jones, et al, (2006), Griffiths-Jones, et al, (2008), Saini, et al, (2008), Meyers, et al, (2008), Kozomara A and Griffiths-Jones S (2011)). An analysis of the miRNA database gives an idea on the location of the matured miRNA within the hairpin sequence. The antisense miRNA has been found to be either of the two stem sequences or in some cases both the stem(up) and stem(down) sequences. We design our antisense to be in stem (up) orientation (L-sshRNA). The following are the steps in the construction of the pre-miRNA for the antisense sequences.
To start our construct, we selected those mature RNAs (species Homo sapiens) from the database (mirbase) that have maximum identity at the 3′ end to our antisense sequence. For example, the antisense sequence for the first segment in RT (ATATTGCTGGTGATCCTTTCCATC (SEQ ID NO: 11) given in
When we used a 22 nt antisense against the first conserved segment for RT, many of the hits from the miRNA database had the matured RNA (miRNA) in the stem(down) orientation; that is, they were on the 3′ side of the loops. So we had to settle for a 21 nucleotide stretch that would find matured miRNAs on the 5′ side of the loop. The final antisense chosen is “TATTGCTGGTGATCCTTTCCA (SEQ ID NO: 14)”. For this sequence, we had the best hit “CTGGAGTCTAGGATTCCA (SEQ ID NO: 15)” for the 3′ end from MIMAT0016859, which is from the hairpin sequence hsa-miR-4309.
Similarly, for the 5′ end, we chose the miRNA MIMAT0000090, which has the maximum number of matches with our antisense at the 5′ end, “TATTGCACATTACTAAGTTGCA (SEQ ID NO: 16)”. It lies in hsa-miR-32 and, further, this miRNA is stem(up) within the hsa-mir-32 as shown in
Selection of the Second Site within RT for Creating Antisense Sequences.
The next high scoring conserved segment within the RT gene was evaluated for designing another miRNA as described previously.
The mutation V106E was chosen in the context of the mutation L234K to inactivate the enzyme. This pair of mutations is expected to establish a salt bridge between V106E and L234K. The H235R mutation is observed in AY560487; P236S in AF331209; D237T in GU328920; K238S in AB253431; T240S in GU345227; and V241L in AF289548. We use a different triplet for the different amino acids observed in other proteins to maximize the difference between this new set of nucleic acids and the antisense created against the second site. The segment has a sequence CTCCATCCTGATAAATGGACAG (SEQ ID NO: 17), with its antisense being CTGTCCATTTATCAGGATGGAG (SEQ ID NO: 18). For this antisense, the sequence that matches at the 3′ end is MIMAT0019728 (TGCAGCTCTGGTGGAAAATGGAG (SEQ ID NO: 19)) and its shRNA is hsa-mir-4660. Similarly, the mature RNA given by MIMAT0009979 (CTGTAATATAAATTTAATTTATT (SEQ ID NO: 20)) (the shRNA being hsa-miR-2054) has matching 5′ end sequences, which are used here for designing the flanking sequences.
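The relationship between the selected segment and its antisense can be confirmed with a reverse-complement check. The sketch below uses the two sequences given above for the second RT site (SEQ ID NO: 17 and SEQ ID NO: 18); the function name is illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA-alphabet sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


segment = "CTCCATCCTGATAAATGGACAG"    # SEQ ID NO: 17
antisense = "CTGTCCATTTATCAGGATGGAG"  # SEQ ID NO: 18
print(reverse_complement(segment) == antisense)
```

The same check applies to any other conserved stretch chosen as an miRNA target.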
Design of Antisense miRNA Against Integrase.
As described above for developing antisense sequences for RT, we have also created two antisense sequences against two sites for the enzyme integrase. For structural details, we have chosen 3nf6.pdb (Rhodes, et al, (2011)), 1ex4.pdb (Chen, et al, (2000)), and 1qs4.pdb (Goldgur, et al, (1999)) from the Protein Data Bank (Berman et al, (2000)). Métifiot, et al, (2010) have discussed the antisense resistance from mutations in a loop region of integrase. As we are interested primarily in inactivating this enzyme, we consider the integrase 3D structure around the catalytic residues. Robert Craigie and others (Jenkins, et al, (1997), Alian, et al, (2009)) have indicated that residues 64D, 116D, and 152E are key among those important for the catalysis. The active site of HIV-1 integrase is shown in
First, residue D64 is considered. From the alignment, the protein AF407656 has a glycine in that place. Thus, we change GAT to GGC. At 116D the protein AY314053 has a glycine, so we could change GAC to GGT. Residue 152E has an arginine in protein GU216873, and we suggest GAA to CGT. Residue 66T has a valine, hence ACA to GTG. Residue 159K has an arginine in protein AF040274, hence AAG to CGT. These mutations are expected to make the enzyme inactive.
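The codon changes listed above can be checked against the standard genetic code. The sketch below translates each original and replacement codon and counts the nucleotide differences; the codon pairs are those given in the text, while the function name and dictionary keys are illustrative labels.

```python
from itertools import product

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def describe_change(old, new):
    """Return (old amino acid, new amino acid, number of changed bases)."""
    changed = sum(x != y for x, y in zip(old, new))
    return CODON_TABLE[old], CODON_TABLE[new], changed


codon_changes = {
    "position 64": ("GAT", "GGC"),
    "position 116": ("GAC", "GGT"),
    "position 152": ("GAA", "CGT"),
    "position 66": ("ACA", "GTG"),
    "position 159": ("AAG", "CGT"),
}
for label, (old, new) in codon_changes.items():
    print(label, old, "->", new, describe_change(old, new))
```

Choosing replacement codons that differ at two or three positions is in line with the stated aim of keeping the probability of reversion low.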
In
The next relatively highly conserved stretch happens to be adjacent to the first conserved site and is also at the dimer interface with a symmetry related molecule.
Several articles (Garcia, et al, (1993), Aiken, et al, (1994), Bentham, et al, (2003), and Hanna, et al, (2006)) implicate the Nef domain of HIV-1 in the down regulation of CD4 cell expression. Hanna, et al, (2006) identified that the mutations RD35/36AA and D174K in the Nef domain abrogated the down regulation of the CD4. Bentham, et al, (2003) reported that the di-leucines at 413/414 of CD4 are essential for the down regulation. Further, the di-leucines are needed for endocytosis and not for binding to the Nef domain of HIV-1. We do not want our TV to down regulate CD4 cells, and we have tried to incorporate this goal into our design model. However, the mutations were slightly modified to have maximum dissimilarity from the original triplets for these residues, so that the probability of reverting to the original sequences is low. We have changed the residue R35, which has the codon “cga”, to alanine (gcc), and made the changes D36S (gac to tca) and D174R (gac to aga). The modified codons for these residues are shown in italics in the
The total length of the RNA in the infectious virus is 9217:
Sequence extending from 1 to 2100 are designated the A section.
Sequence extending from 2001 to 4650, or the RT part, are designated the B section.
Sequence extending from 4651 to 8400 are designated the C section.
Sequence extending from 8401 to 9000, or the integrase part, are the D section.
The insertion of pre-miRNAs after 9000 is designated the E section.
Sequence extending from 9001 to 9217 is designated the F section.
The B section has variations at two sites when compared to wild type. One variant (B1) contains mutations to the first site against which antisense was created, referred to as RT1 in the text. Another (B2) contains mutations to the second site against which antisense was created, referred to as RT2 in the text. A third (B3) contains mutations to both the first and second sites and is referred to as RT3 in the text. The complete absence of RT could be called (B4).
The wild type integrase would be (D0). The D section has variations at two sites when compared to wild type. One variant (D1) contains mutations to the first site against which antisense was created, referred to as IN1 in the text. Another (D2) contains mutations to the second site against which antisense was created, referred to as IN2 in the text. A third (D3) contains mutations to both the first and second sites and is referred to as IN3 in the text. The complete absence of integrase could be called (D4).
There are 4 pre-miRNAs, two for RT and two for integrase. These are individually called E1, E2, E3, and E4.
One can in principle make all possible and meaningful combinations. The one that includes all four pre-miRNAs would be preferable, as it should be more effective and less prone to spontaneous mutations that make it ineffective. At the same time, when we include 4 pre-miRNAs, the volume of the matter within the capsid is increased and packaging may be less efficient. Optionally, we can consider the constructs that have fewer pre-miRNAs.
The following are the designs for various therapeutic viruses. A therapeutic virus genome comprises, e.g., A+B+C+D+E+F, where B could have 4 possibilities, D could have 4 possibilities, and E could include the individual pre-miRNAs E1, E2, E3, and E4. We can also have constructs without any modifications to integrase (wild type); in this case no E3 or E4 will be used, as E3 and E4 would target the TV and destroy it unless the TV does not contain the integrase at all. The one that contains mutations to both miRNA binding sites of both RT and integrase and includes all four miRNAs is given below:
One can optimize the TV by going through combinations of the B's, D's and E's by constructing many possibilities to verify which are most efficient at producing virus particles and selecting the best. Importance is given to maximum number of pre-miRNAs included in the TV and also producing virus particles. The following list contains a few combinations where at least one of the miRNA binding sites on the reverse transcriptase is included. It is envisioned, and well within the skill in the art, for one to create a similar listing where at least one site for the integrase is maintained, based on the teachings herein.
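The space of candidate constructs can be enumerated programmatically. The sketch below applies one rule, extrapolated from the integrase case described in the text: a pre-miRNA is included only if its target site in the TV is either mutated or absent. The mapping of E1/E2 to the first and second RT sites and of E3/E4 to the first and second integrase sites is an assumption made for illustration, as are the function and variable names.

```python
from itertools import combinations, product

# Target sites left intact (wild type) in each variant of sections B and D.
B_INTACT = {"B1": {"RT1"} ^ {"RT1", "RT2"}, "B2": {"RT1"}, "B3": set(), "B4": set()}
D_INTACT = {"D0": {"IN1", "IN2"}, "D1": {"IN2"}, "D2": {"IN1"}, "D3": set(), "D4": set()}
# Assumed mapping of each pre-miRNA to the site it targets.
E_TARGET = {"E1": "RT1", "E2": "RT2", "E3": "IN1", "E4": "IN2"}


def safe_constructs():
    """List A+B+C+D+E+F layouts whose pre-miRNAs do not target the TV itself."""
    constructs = []
    for b, d in product(B_INTACT, D_INTACT):
        intact = B_INTACT[b] | D_INTACT[d]
        allowed = [e for e in E_TARGET if E_TARGET[e] not in intact]
        for k in range(1, len(allowed) + 1):
            for chosen in combinations(allowed, k):
                constructs.append(("A", b, "C", d) + chosen + ("F",))
    return constructs


all_constructs = safe_constructs()
print(len(all_constructs))
# Constructs carrying all four pre-miRNAs:
print([c for c in all_constructs if len(c) == 9])
```

Each enumerated layout would still need to be built and tested for its efficiency at producing virus particles, which this enumeration does not predict.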
Based on the teachings above, one may prepare any number of functional therapeutic viruses against HIV. Further, using the general concepts provided, one may design any number of additional TV constructs against any number of different viruses. For example, see HCV construct, below.
As a second example, we have chosen to design a therapeutic virus against hepatitis C virus (HCV). HCV is a positive-sense single stranded RNA virus. A description of the same can be found in http://en.wikipedia.org/wiki/Hepatitis_C_virus. The genome organization of the Hepatitis C virus is given in
We extracted the nucleic acid sequence for the NS5B enzyme, an RDRP, from one of the nucleotide sequences for the complete genome of the HCV virus in the NCBI database http://www.ncbi.nlm.nih.gov/, with the accession number AB520610 (Weng et al., (2010), Arai et al., (2009)). We used this sequence to do a “nucleotide blast” run against the nucleotide database with options as described earlier. We used the alignments of 1053 sequences, as given in the “blast” output, to find highly conserved stretches for a window 22 nucleotides long. The best relatively conserved stretch of nucleotides is shown in
The matching elements are “GAGTA”. The MIMAT0003386 happens to be in the stem(up) orientation within the shRNA. Similarly, the matured miRNA that matched the 5′ end of the antisense is MIMAT0000090 and the shRNA is hsa-mir-32. The matching sequence is “TATTG”.
The predicted secondary structure of hsa-mir-376a is given in
The final pre-miRNA for the antisense was designed by combining various segments described above which is given in
In
It is expected that such a therapeutic virus, once administered, should enter the hepatocytes of the liver, as the structural proteins are kept intact. Once inside the cell, because it lacks the RDRP enzyme, it will need the RDRP produced by the infectious virus (if the cell has already been infected with HCV) to get through the initial stages of infection. During the progression of the cell cycle, miRNA will be produced by the host enzymes “drosha” and “dicer” from the pre-miRNA introduced in the NTR region. The miRNA would enter the RISC (RNA-induced silencing complex) to interfere with the translation of the RDRP, and the production of infectious virus would be stopped. During this time, the therapeutic virus will continue to be produced and released to enter other hepatocytes, and this process should continue until the infectious virus is completely destroyed.
We give here genome sequences for two versions for the therapeutic virus. In
A. A pre-miRNA is adapted to have cleavage sites for efficient and precise processing by enzymes dicer and drosha within the host cells to provide miRNA within an L-sshRNA or R-sshRNA. The pre-miRNA can include the following.
A sense sequence miRNA, a loop sequence at the 3′ end of the sense sequence, an antisense to sense miRNA at the 3′ end of the loop sequence, and 5′ flanking (F5) at the 5′ end of the sense sequence and 3′ flanking (F3) sequences at the 3′ end of the antisense sequence;
at least 6 nucleotides, at least three from the 3′ end of sense miRNA and at least three from the 5′ end of the loop sequence and 6 nucleotides, at least three from 3′ end of the loop sequence and at least three from the 5′ end of the antisense matches a similar construct from any shRNA from any miRNA database available at that time;
at least 6 nucleotides, that is at least three from the 5′ end of the sense miRNA and at least three from the 3′ end of F5 and at least 6 nucleotides, that is at least three from the 3′ end of the antisense sequence and at least three from the 5′ end of the F3 matches a similar construct from any pre-miRNA from any miRNA database available at that time;
the F5 has at least 95% sequence identity and F3 has at least 95% sequence identity to the corresponding flanking sequences of the same pre-miRNA from the miRNA database; and,
a sense sequence miRNA, a loop sequence at the 5′ end of the sense sequence, an antisense to sense miRNA at the 5′ end of the loop sequence and 5′ flanking (F5) at the 5′ end of the antisense sequence and 3′ flanking (F3) sequences at the 3′ end of the sense sequence;
at least 6 nucleotides, at least three from the 3′ end of the loop sequence and at least three from the 5′ end of sense miRNA and 6 nucleotides, at least three from the 3′ end of the antisense and at least three from the 5′ end of the loop sequence matches a similar construct from any shRNA from any miRNA database available at that time;
at least 6 nucleotides, that is at least three from the 3′ end of the sense miRNA and at least three from the 5′ end of F3 and at least 6 nucleotides, that is at least three from the 3′ end of the F5 and at least three from the 5′ end of the antisense sequence matches a similar construct from any pre-miRNA from any miRNA database available at that time;
the F5 has at least 95% sequence identity and F3 has at least 95% sequence identity to the corresponding flanking sequences of the same pre-miRNA from the miRNA database.
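The junction and flank criteria of definition A can be illustrated with a minimal sketch. It covers only the first arrangement (F5, sense, loop, antisense, F3, read 5′ to 3′); the second arrangement would swap the sense and antisense segments. The function names, the use of a single reference pre-miRNA string in place of a database lookup, the DNA alphabet, and the ungapped identity measure are all assumptions made for illustration and are not part of the original description.

```python
def percent_identity(a, b):
    """Ungapped percent identity between two equal-length sequences (assumed measure)."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def junction(left, right, k=3):
    """Last k nucleotides of `left` joined to the first k nucleotides of `right`."""
    return left[-k:] + right[:k]


def junctions_in_reference(f5, sense, loop, antisense, f3, reference, k=3):
    """True if every 2k-nucleotide junction between adjacent segments occurs in
    the reference pre-miRNA (a stand-in for a miRNA database entry)."""
    segments = [f5, sense, loop, antisense, f3]
    return all(junction(a, b, k) in reference
               for a, b in zip(segments, segments[1:]))


def flanks_ok(f5, f3, ref_f5, ref_f3, min_identity=95.0):
    """True if both flanks reach the minimum identity to the reference flanks."""
    return (percent_identity(f5, ref_f5) >= min_identity
            and percent_identity(f3, ref_f3) >= min_identity)
```

With k=3 each junction is the 6-nucleotide minimum named in the definition; a larger k would apply a stricter match.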
B. A highly conserved stretch of given length within a gene can include the following.
A high-scoring stretch of a given length, where the score of the stretch is the sum of the scores of each of its positions; where the score of a position is the frequency of occurrence of the nucleotide at that position or column whose frequency is the maximum compared to the frequencies of the other possible nucleotides occurring at that position or column, calculated from a multiple sequence alignment of sequences homologous (at least 80% identity) to the gene; and where the stretch is used to design an antisense or miRNA that would hybridize with the conserved stretch.
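The scoring in definition B can be sketched as a sliding-window search over a multiple sequence alignment. This is a minimal illustration, not the procedure actually used: the function names are hypothetical, the alignment is assumed to be a list of equal-length strings over A, C, G, T with "-" for gaps, gaps are assumed to count toward the denominator but never as the most frequent symbol, and ties are resolved in favor of the leftmost window.

```python
from collections import Counter


def column_max_counts(alignment):
    """For each alignment column, the count of its most frequent nucleotide."""
    counts = []
    for column in zip(*alignment):
        tally = Counter(base for base in column if base in "ACGT")
        counts.append(max(tally.values()) if tally else 0)
    return counts


def best_window(alignment, length=22):
    """Return (start, score) of the highest-scoring window.

    The score is the sum over the window of the per-column maximum
    frequency, so a perfectly conserved window scores `length`.
    """
    counts = column_max_counts(alignment)
    if len(counts) < length:
        raise ValueError("alignment is shorter than the window")
    window = sum(counts[:length])
    best_start, best_sum = 0, window
    for start in range(1, len(counts) - length + 1):
        # Slide the window by one column: add the new column, drop the old one.
        window += counts[start + length - 1] - counts[start - 1]
        if window > best_sum:
            best_start, best_sum = start, window
    return best_start, best_sum / len(alignment)


def window_consensus(alignment, start, length):
    """Most frequent nucleotide at each column of the window ('N' if all gaps)."""
    consensus = []
    for column in list(zip(*alignment))[start:start + length]:
        tally = Counter(base for base in column if base in "ACGT")
        consensus.append(tally.most_common(1)[0][0] if tally else "N")
    return "".join(consensus)
```

For the 22-nucleotide window used in the HCV example, `best_window(alignment, 22)` would give the start column of the candidate stretch, and `window_consensus` the sequence against which an antisense could be designed.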
C. A designed L-sshRNA (stem(up) shRNA) can include the following.
An miRNA (sense sequence) at the 5′ end of the shRNA; a loop sequence at the 3′ end of the sense sequence and a sequence (antisense) complementary to the miRNA or the sense sequence at the 3′ end of the loop sequence; at least three nucleotides from the 3′ end of the miRNA, the whole loop sequence, and at least three nucleotides at the 5′ end of the antisense sequence match an shRNA from any miRNA database; whereby the shRNA is adapted to have natural cleavage sites for Dicer.
D. A designed R-sshRNA (stem(down) shRNA) can include the following.
An miRNA (sense sequence) at the 3′ end of the shRNA; a loop sequence at the 5′ end of the sense sequence and a sequence (antisense) complementary to the miRNA or the sense sequence at the 5′ end of the loop sequence; at least three nucleotides from the 5′ end of the miRNA, the whole loop sequence, and at least three nucleotides at the 3′ end of the antisense sequence match an shRNA from any miRNA database; whereby the shRNA is adapted to have natural cleavage sites for Dicer.
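The two orientations in definitions C and D can be sketched as simple string assembly. The sketch below is illustrative only: the function names are hypothetical, sequences are written in the DNA alphabet, the antisense is assumed to be the perfect reverse complement of the sense sequence (natural hairpins often carry mismatches), and a single reference hairpin string stands in for a miRNA database entry.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA-alphabet sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def build_l_sshrna(sense, loop):
    """L-sshRNA (stem(up)): 5'-sense-loop-antisense-3'."""
    return sense + loop + reverse_complement(sense)


def build_r_sshrna(sense, loop):
    """R-sshRNA (stem(down)): 5'-antisense-loop-sense-3'."""
    return reverse_complement(sense) + loop + sense


def loop_junction(sense, loop, orientation="L", k=3):
    """k stem nucleotides on each side of the whole loop, read 5' to 3'."""
    antisense = reverse_complement(sense)
    if orientation == "L":
        return sense[-k:] + loop + antisense[:k]
    return antisense[-k:] + loop + sense[:k]


def junction_matches(sense, loop, reference_hairpin, orientation="L", k=3):
    """True if the loop junction occurs in the reference hairpin sequence."""
    return loop_junction(sense, loop, orientation, k) in reference_hairpin
```

A design would pass the check in C or D only if `junction_matches` holds for some hairpin in the database; here a single candidate hairpin is tested at a time.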
While the foregoing invention has been described in some detail for purposes of clarity and understanding, it will be clear to one skilled in the art from a reading of this disclosure that various changes in form and detail can be made without departing from the true scope of the invention. For example, all the techniques and apparatus described above can be used in various combinations. All publications, patents, patent applications, and/or other documents cited in this application are incorporated by reference in their entirety for all purposes to the same extent as if each individual publication, patent, patent application, and/or other document were individually indicated to be incorporated by reference for all purposes.
This application claims priority to, and benefit of, U.S. provisional patent application U.S. Ser. No. 61/674,439, by Radhakrishnan Rathnachalam, filed Jul. 23, 2012.
Number | Date | Country
---|---|---
61674439 | Jul 2012 | US